********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/196/196.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17259                    1.0000    500  53969                    1.0000    500
21122                    1.0000    500  21275                    1.0000    500
47369                    1.0000    500  2087                     1.0000    500
47678                    1.0000    500  21929                    1.0000    500
29196                    1.0000    500  38413                    1.0000    500
48059                    1.0000    500  29283                    1.0000    500
48436                    1.0000    500  39173                    1.0000    500
48707                    1.0000    500  39374                    1.0000    500
40462                    1.0000    500  49771                    1.0000    500
33206                    1.0000    500  51597                    1.0000    500
44773                    1.0000    500  45331                    1.0000    500
54417                    1.0000    500  45679                    1.0000    500
36175                    1.0000    500  31620                    1.0000    500
42612                    1.0000    500  48368                    1.0000    500
44667                    1.0000    500  43775                    1.0000    500
49942                    1.0000    500  45659                    1.0000    500
48420                    1.0000    500  45062                    1.0000    500
44662                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/196/196.seqs.fa -oc motifs/196 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       35    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           17500    N=              35

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.253 C 0.251 G 0.237 T 0.258
Background letter frequencies (from dataset with add-one prior applied):
A 0.253 C 0.251 G 0.237 T 0.258
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 15 sites = 13 llr = 164 E-value = 2.6e-002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3:1a1:5::4129::
pos.-specific     C  6a6:72:9::9::61
probability       G  ::2::41156:8149
matrix            T  1:2:244:5::::::

         bits    2.1  * *
                 1.9  * *
                 1.7  * *   *  * * *
                 1.5  * *   *  *** *
Relative         1.2  * *   *  *** *
Entropy          1.0  * *   ********
(18.3 bits)      0.8 ** **  ********
                 0.6 ** ** *********
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CCCACGACGGCGACG
consensus            A   TTT TA   G
sequence                  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
29196                       398  7.18e-09 CTTTTTGGCA CCCACGACGACGACG ACGGTAGAAG
49771                       298  1.31e-08 CCGGCGAGAC CCCACTACTGCGAGG ACTTAAGAAA
53969                       123  6.25e-08 GCGTCGTCGT CCCACCTCGACGACG GCGCGGTGTT
42612                       333  1.16e-07 ATCACTCGCT CCGACTACGACGACG ACGCTACCGC
48436                       315  1.80e-07 ACTGTGAATG ACCATGACTGCGAGG GGGAGCATGG
44773                       469  4.83e-07 TTGGAAATCA ACCATTTCTACGACG ACAAGAGCAT
48059                       117  7.49e-07 GATTCCGCCA CCTACTACTGCAACG TGATGATGGA
45659                       176  1.19e-06 GAAAAACTCG ACCATGGCGGCGAGG AACGAGGCCT
17259                       233  1.19e-06 TGGCTAGTGT CCCACCTCGAAGACG CGTGATGAAC
51597                       458  1.60e-06 TTGGCACTCA CCGACTACTGCGGGG TCGTTTCTTA
54417                       200  3.17e-06 TGTGTACTGA CCAACGTGGGCGAGG TTAGCCCGGG
47369                       282  3.37e-06 CCGGTTCCCA ACCAAGTCTGCAACG GTCCCCGTCG
2087                        357  7.51e-06 CTGGCAGACT TCTACCACGGCGACC AACAAGGAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
29196                             7.2e-09  397_[+1]_88
49771                             1.3e-08  297_[+1]_188
53969                             6.2e-08  122_[+1]_363
42612                             1.2e-07  332_[+1]_153
48436                             1.8e-07  314_[+1]_171
44773                             4.8e-07  468_[+1]_17
48059                             7.5e-07  116_[+1]_369
45659                             1.2e-06  175_[+1]_310
17259                             1.2e-06  232_[+1]_253
51597                             1.6e-06  457_[+1]_28
54417                             3.2e-06  199_[+1]_286
47369                             3.4e-06  281_[+1]_204
2087                              7.5e-06  356_[+1]_129
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=13
29196                    (  398) CCCACGACGACGACG  1
49771                    (  298) CCCACTACTGCGAGG  1
53969                    (  123) CCCACCTCGACGACG  1
42612                    (  333) CCGACTACGACGACG  1
48436                    (  315) ACCATGACTGCGAGG  1
44773                    (  469) ACCATTTCTACGACG  1
48059                    (  117) CCTACTACTGCAACG  1
45659                    (  176) ACCATGGCGGCGAGG  1
17259                    (  233) CCCACCTCGAAGACG  1
51597                    (  458) CCGACTACTGCGGGG  1
54417                    (  200) CCAACGTGGGCGAGG  1
47369                    (  282) ACCAAGTCTGCAACG  1
2087                     (  357) TCTACCACGGCGACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 17010 bayes= 10.8834 E= 2.6e-002
    28    129  -1035   -175
 -1035    199  -1035  -1035
  -172    129    -62    -75
   198  -1035  -1035  -1035
  -172    146  -1035    -16
 -1035    -12     70     57
   109  -1035   -162     57
 -1035    188   -162  -1035
 -1035  -1035    118     84
    60  -1035    138  -1035
  -172    188  -1035  -1035
   -72  -1035    184  -1035
   186  -1035   -162  -1035
 -1035    129     70  -1035
 -1035   -171    196  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 13 E= 2.6e-002
 0.307692  0.615385  0.000000  0.076923
 0.000000  1.000000  0.000000  0.000000
 0.076923  0.615385  0.153846  0.153846
 1.000000  0.000000  0.000000  0.000000
 0.076923  0.692308  0.000000  0.230769
 0.000000  0.230769  0.384615  0.384615
 0.538462  0.000000  0.076923  0.384615
 0.000000  0.923077  0.076923  0.000000
 0.000000  0.000000  0.538462  0.461538
 0.384615  0.000000  0.615385  0.000000
 0.076923  0.923077  0.000000  0.000000
 0.153846  0.000000  0.846154  0.000000
 0.923077  0.000000  0.076923  0.000000
 0.000000  0.615385  0.384615  0.000000
 0.000000  0.076923  0.923077  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[CA]CCA[CT][GTC][AT]C[GT][GA]CGA[CG]G
--------------------------------------------------------------------------------

Time 10.16 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 18 sites = 6 llr = 105 E-value = 2.8e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::7333::::2::::::
pos.-specific     C  7::233:372:3a3a:aa
probability       G  :aa:33::38a5:7:a::
matrix            T  3::2::77::::::::::

         bits    2.1  **       * * ****
                 1.9  **       * * ****
                 1.7  **       * * ****
                 1.5  **      ** * ****
Relative         1.2  **      ** * ****
Entropy          1.0 ***   ***** ******
(25.2 bits)      0.8 ***   ***** ******
                 0.6 ****  ************
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           CGGAAATTCGGGCGCGCC
consensus            T   CCACG  C C
sequence                 GG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ------------------
33206                       295  2.09e-10 GTCCGAGCGC CGGACCTTCGGCCGCGCC GGCATTGGGA
54417                       356  4.58e-10 ATAGAGTATA CGGACGTCCGGGCGCGCC AGGATCCGAA
49771                       239  1.01e-09 ATTCCGCGCG CGGAGATTCGGACGCGCC ATTCCCGGAC
45679                       332  1.26e-08 GATCCGGTAT TGGCGATTCGGGCCCGCC ATCACTTTTT
42612                       238  2.75e-08 GCGCACACGG CGGTAGATGGGCCCCGCC GAGTAGTCAG
29196                       170  3.56e-08 CTTTATCAAT TGGAACACGCGGCGCGCC TTTTGCCATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33206                             2.1e-10  294_[+2]_188
54417                             4.6e-10  355_[+2]_127
49771                               1e-09  238_[+2]_244
45679                             1.3e-08  331_[+2]_151
42612                             2.8e-08  237_[+2]_245
29196                             3.6e-08  169_[+2]_313
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=6
33206                    (  295) CGGACCTTCGGCCGCGCC  1
54417                    (  356) CGGACGTCCGGGCGCGCC  1
49771                    (  239) CGGAGATTCGGACGCGCC  1
45679                    (  332) TGGCGATTCGGGCCCGCC  1
42612                    (  238) CGGTAGATGGGCCCCGCC  1
29196                    (  170) TGGAACACGCGGCGCGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 16905 bayes= 12.5595 E= 2.8e+001
  -923    141   -923     37
  -923   -923    208   -923
  -923   -923    208   -923
   139    -59   -923    -63
    40     41     49   -923
    40     41     49   -923
    40   -923   -923    137
  -923     41   -923    137
  -923    141     49   -923
  -923    -59    181   -923
  -923   -923    208   -923
   -60     41    108   -923
  -923    199   -923   -923
  -923     41    149   -923
  -923    199   -923   -923
  -923   -923    208   -923
  -923    199   -923   -923
  -923    199   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 6 E= 2.8e+001
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.166667  0.000000  0.166667
 0.333333  0.333333  0.333333  0.000000
 0.333333  0.333333  0.333333  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.333333  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[CT]GGA[ACG][ACG][TA][TC][CG]GG[GC]C[GC]CGCC
--------------------------------------------------------------------------------

Time 20.49 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 14 llr = 152 E-value = 7.2e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :6:42::a1:17
pos.-specific     C  a1a249::67::
probability       G  :::4:1a::393
matrix            T  :2::4:::3:::

         bits    2.1 * *   **
                 1.9 * *   **
                 1.7 * *   **
                 1.5 * *  ***  *
Relative         1.2 * *  *** ***
Entropy          1.0 * *  *** ***
(15.6 bits)      0.8 * *  *******
                 0.6 ***  *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CACGTCGACCGA
consensus             T AC   TG G
sequence                CA

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
40462                       236  2.26e-07 ATGAGCTGCA CACACCGACCGA GTCAATGCGT
53969                       305  3.39e-07 TGACCCAATG CACCTCGACCGA TTGGGCGGGA
47678                       409  7.72e-07 GAATTTTTGC CACGCCGACCGG TTTCTTGGGG
45331                       475  1.34e-06 CCATCGAAGC CACCACGACCGA GAACCTATCT
38413                       352  1.72e-06 CTGCACGCGC CACGACGACCGG TGAAAATAAC
45062                       398  3.54e-06 ATCATGATCC CTCCCCGACCGA CTGGACAGTC
43775                       157  4.39e-06 TCAAGATTCC CTCATCGATCGA ACGAGAATGC
21122                       282  4.39e-06 CGCAACGATA CTCATCGATCGA GATACGGTGT
47369                       300  4.93e-06 TGCAACGGTC CCCGTCGATCGA TACGGGGAAG
48368                        55  7.77e-06 TGTCGACAGT CACGCGGACGGA GACCATACCA
48420                        19  1.42e-05 TGTTCATCCA CACGTCGACGAG ACACCCAGCC
39173                       204  1.72e-05 CCCAACAAAG CCCGTCGAACGA GGGGGCAATC
2087                        411  1.83e-05 CGGAGCGTGG CACACGGACGGG CCTTCGGGGA
44662                       188  2.64e-05 TGTTGGAAAC CACAACGATGAA AGGTGGAGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40462                             2.3e-07  235_[+3]_253
53969                             3.4e-07  304_[+3]_184
47678                             7.7e-07  408_[+3]_80
45331                             1.3e-06  474_[+3]_14
38413                             1.7e-06  351_[+3]_137
45062                             3.5e-06  397_[+3]_91
43775                             4.4e-06  156_[+3]_332
21122                             4.4e-06  281_[+3]_207
47369                             4.9e-06  299_[+3]_189
48368                             7.8e-06  54_[+3]_434
48420                             1.4e-05  18_[+3]_470
39173                             1.7e-05  203_[+3]_285
2087                              1.8e-05  410_[+3]_78
44662                             2.6e-05  187_[+3]_301
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=14
40462                    (  236) CACACCGACCGA  1
53969                    (  305) CACCTCGACCGA  1
47678                    (  409) CACGCCGACCGG  1
45331                    (  475) CACCACGACCGA  1
38413                    (  352) CACGACGACCGG  1
45062                    (  398) CTCCCCGACCGA  1
43775                    (  157) CTCATCGATCGA  1
21122                    (  282) CTCATCGATCGA  1
47369                    (  300) CCCGTCGATCGA  1
48368                    (   55) CACGCGGACGGA  1
48420                    (   19) CACGTCGACGAG  1
39173                    (  204) CCCGTCGAACGA  1
2087                     (  411) CACACGGACGGG  1
44662                    (  188) CACAACGATGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 17115 bayes= 10.8606 E= 7.2e+001
 -1045    199  -1045  -1045
   134    -81  -1045    -27
 -1045    199  -1045  -1045
    49    -23     85  -1045
   -24     51  -1045     73
 -1045    177    -73  -1045
 -1045  -1045    208  -1045
   198  -1045  -1045  -1045
  -182    135  -1045     14
 -1045    151     27  -1045
   -83  -1045    185  -1045
   149  -1045     27  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 7.2e+001
 0.000000  1.000000  0.000000  0.000000
 0.642857  0.142857  0.000000  0.214286
 0.000000  1.000000  0.000000  0.000000
 0.357143  0.214286  0.428571  0.000000
 0.214286  0.357143  0.000000  0.428571
 0.000000  0.857143  0.142857  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.071429  0.642857  0.000000  0.285714
 0.000000  0.714286  0.285714  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.714286  0.000000  0.285714  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
C[AT]C[GAC][TCA]CGA[CT][CG]G[AG]
--------------------------------------------------------------------------------

Time 30.52 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17259                            4.82e-03  232_[+1(1.19e-06)]_253
53969                            1.99e-07  122_[+1(6.25e-08)]_167_\
                                           [+3(3.39e-07)]_20_[+1(5.56e-05)]_149
21122                            1.20e-03  281_[+3(4.39e-06)]_207
21275                            7.52e-01  500
47369                            1.58e-06  191_[+2(4.29e-06)]_72_\
                                           [+1(3.37e-06)]_3_[+3(4.93e-06)]_189
2087                             1.57e-03  356_[+1(7.51e-06)]_39_\
                                           [+3(1.83e-05)]_78
47678                            2.17e-03  408_[+3(7.72e-07)]_80
21929                            9.72e-02  500
29196                            2.86e-09  169_[+2(3.56e-08)]_21_\
                                           [+2(5.79e-05)]_171_[+1(7.18e-09)]_88
38413                            9.63e-03  351_[+3(1.72e-06)]_137
48059                            3.97e-03  116_[+1(7.49e-07)]_369
29283                            7.44e-01  500
48436                            3.65e-03  314_[+1(1.80e-07)]_171
39173                            7.14e-02  203_[+3(1.72e-05)]_285
48707                            4.19e-01  500
39374                            2.20e-01  500
40462                            2.41e-03  235_[+3(2.26e-07)]_105_\
                                           [+3(3.14e-05)]_136
49771                            6.29e-10  238_[+2(1.01e-09)]_41_\
                                           [+1(1.31e-08)]_188
33206                            4.19e-06  294_[+2(2.09e-10)]_188
51597                            1.19e-02  457_[+1(1.60e-06)]_28
44773                            8.34e-03  468_[+1(4.83e-07)]_17
45331                            2.67e-03  474_[+3(1.34e-06)]_14
54417                            1.28e-08  199_[+1(3.17e-06)]_51_\
                                           [+2(3.12e-05)]_72_[+2(4.58e-10)]_127
45679                            2.19e-04  331_[+2(1.26e-08)]_151
36175                            6.61e-01  500
31620                            5.84e-01  500
42612                            9.67e-08  237_[+2(2.75e-08)]_77_\
                                           [+1(1.16e-07)]_153
48368                            2.71e-03  54_[+3(7.77e-06)]_434
44667                            5.82e-01  500
43775                            5.18e-02  156_[+3(4.39e-06)]_332
49942                            7.75e-01  500
45659                            1.06e-03  21_[+3(5.82e-05)]_142_\
                                           [+1(1.19e-06)]_310
48420                            6.04e-03  18_[+3(1.42e-05)]_470
45062                            3.08e-03  397_[+3(3.54e-06)]_91
44662                            1.58e-01  187_[+3(2.64e-05)]_301
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************