********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/51/51.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
19465                    1.0000    500  2184                     1.0000    500
21888                    1.0000    500  22710                    1.0000    500
23402                    1.0000    500  25508                    1.0000    500
25708                    1.0000    500  262046                   1.0000    500
270352                   1.0000    500  32181                    1.0000    500
33906                    1.0000    500  35728                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/51/51.seqs.fa -oc motifs/51 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.257 C 0.249 G 0.234 T 0.259
Background letter frequencies (from dataset with add-one prior applied):
A 0.257 C 0.250 G 0.234 T 0.259
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 16 sites = 12 llr = 143 E-value = 1.2e-002
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:1::1:11211::::
pos.-specific     C  :411:33:::22113:
probability       G  44:2a:29:28::383
matrix            T  5288:75:97:897:7

         bits    2.1     *
                 1.9     *
                 1.7     *  *
                 1.5     *  **   *
Relative         1.3     *  **   * *
Entropy          1.0   * *  ** * * **
(17.3 bits)      0.8   **** ** ******
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCTTGTTGTTGTTTGT
consensus            GG   CC      GCG
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
25508                       335  7.34e-10 AGGTTGCCAT GGTTGTTGTTGTTTGT ATTCGTCGTG
262046                       39  5.67e-08 TCTCAGCTCA GCTTGCCGTTGTTGGT TGTGCCGCCG
25708                       240  7.36e-08 TCGTTGTTGT TGTTGTCGTTGATTGT GACGACGATG
32181                       123  1.82e-07 CTTCATACAA TCTTGTTGTAGTTTCG CCAAGTTGGG
270352                       51  5.71e-07 CTTCTGCGCT GCTGGTTGTTGTTGCG ACGATTCACC
33906                        16  8.32e-07 AGTCTATATC TTTTGTTGTTATTTCT GTGATTCTTC
21888                       282  1.96e-06 CTGCTTTGTC GCTTGTCGTGGCTCGT GGCCGATGAG
22710                       220  2.49e-06 ACGAACTTGG TGTCGTCATTGTTTGG GGAGGATACG
2184                        180  2.69e-06 CCGTCTTCTG TGTTGCGGTGCTTGGT TGTGCGTTGA
23402                       167  3.91e-06 TCATCAGGGC GGTTGATGTTCTCTGT GCGAGAGATA
19465                       271  9.97e-06 CTAGATCTTC TTCTGCGGTTGCTTGG CGATACGCCC
35728                       147  3.44e-05 GTTCCACAAC ACAGGTTGAAGTTTGT AAAGTCCAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25508                             7.3e-10  334_[+1]_150
262046                            5.7e-08  38_[+1]_446
25708                             7.4e-08  239_[+1]_245
32181                             1.8e-07  122_[+1]_362
270352                            5.7e-07  50_[+1]_434
33906                             8.3e-07  15_[+1]_469
21888                               2e-06  281_[+1]_203
22710                             2.5e-06  219_[+1]_265
2184                              2.7e-06  179_[+1]_305
23402                             3.9e-06  166_[+1]_318
19465                               1e-05  270_[+1]_214
35728                             3.4e-05  146_[+1]_338
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=12
25508                    (  335) GGTTGTTGTTGTTTGT  1
262046                   (   39) GCTTGCCGTTGTTGGT  1
25708                    (  240) TGTTGTCGTTGATTGT  1
32181                    (  123) TCTTGTTGTAGTTTCG  1
270352                   (   51) GCTGGTTGTTGTTGCG  1
33906                    (   16) TTTTGTTGTTATTTCT  1
21888                    (  282) GCTTGTCGTGGCTCGT  1
22710                    (  220) TGTCGTCATTGTTTGG  1
2184                     (  180) TGTTGCGGTGCTTGGT  1
23402                    (  167) GGTTGATGTTCTCTGT  1
19465                    (  271) TTCTGCGGTTGCTTGG  1
35728                    (  147) ACAGGTTGAAGTTTGT  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 8.91886 E= 1.2e-002
  -162  -1023     83     95
 -1023     74     83    -64
  -162   -158  -1023    168
 -1023   -158    -49    153
 -1023  -1023    209  -1023
  -162      0  -1023    136
 -1023     42    -49     95
  -162  -1023    197  -1023
  -162  -1023  -1023    182
   -62  -1023    -49    136
  -162    -58    168  -1023
  -162    -58  -1023    153
 -1023   -158  -1023    182
 -1023   -158      9    136
 -1023      0    168  -1023
 -1023  -1023     51    136
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 1.2e-002
 0.083333  0.000000
 0.416667  0.500000
 0.000000  0.416667  0.416667  0.166667
 0.083333  0.083333  0.000000  0.833333
 0.000000  0.083333  0.166667  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.083333  0.250000  0.000000  0.666667
 0.000000  0.333333  0.166667  0.500000
 0.083333  0.000000  0.916667  0.000000
 0.083333  0.000000  0.000000  0.916667
 0.166667  0.000000  0.166667  0.666667
 0.083333  0.166667  0.750000  0.000000
 0.083333  0.166667  0.000000  0.750000
 0.000000  0.083333  0.000000  0.916667
 0.000000  0.083333  0.250000  0.666667
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.333333  0.666667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[TG][CG]TTG[TC][TC]GTTGTT[TG][GC][TG]
--------------------------------------------------------------------------------

Time 1.24 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 14 sites = 10 llr = 121 E-value = 3.3e-002
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::1:91::::88:2
pos.-specific     C  ::99:63:::2176
probability       G  21::136:93::3:
matrix            T  89:1::1a17:1:2

         bits    2.1
                 1.9        *
                 1.7        **
                 1.5  ****  **
Relative         1.3 *****  ** *
Entropy          1.0 *****  ******
(17.5 bits)      0.8 ***** *******
                 0.6 **************
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           TTCCACGTGTAACC
consensus            G    GC  GC GA
sequence                          T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by
position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
32181                       360  1.48e-07 ACGCCGTACA TTCCACGTGTCAGC TCCGAGCTCG
23402                       426  3.55e-07 GAACCCGAAG TTACAGGTGTAACC CCCCTCGCGC
21888                       357  4.85e-07 TCGCCCCAAC TTCCACGTTGAACC ACTCACTGAT
19465                       423  5.51e-07 GAGAGTTGAC TTCCACGTGTATCA TCATTATGAC
262046                      104  7.03e-07 CTTTTTGTTG TTCCACTTGGAAGC CGCGCTTGAC
35728                       166  9.10e-07 AGTTTGTAAA GTCCAGCTGTAACT CGCTGTTGAA
270352                      406  9.10e-07 AGTTTGTAAA GTCCAGCTGTAACT CGCTGTTGAA
33906                       105  1.54e-06 AAACGGACGC TTCCGCGTGTACCC TATTATCCAT
25508                       397  4.49e-06 AAGTGACTTT TTCTACCTGTCACA ATAGAGCTAC
22710                       389  4.76e-06 CACGTGGAAG TGCCAAGTGGAAGC CAGCAAAACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32181                             1.5e-07  359_[+2]_127
23402                             3.5e-07  425_[+2]_61
21888                             4.8e-07  356_[+2]_130
19465                             5.5e-07  422_[+2]_64
262046                              7e-07  103_[+2]_383
35728                             9.1e-07  165_[+2]_321
270352                            9.1e-07  405_[+2]_81
33906                             1.5e-06  104_[+2]_382
25508                             4.5e-06  396_[+2]_90
22710                             4.8e-06  388_[+2]_98
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=10
32181                    (  360) TTCCACGTGTCAGC  1
23402                    (  426) TTACAGGTGTAACC  1
21888                    (  357) TTCCACGTTGAACC  1
19465                    (  423) TTCCACGTGTATCA  1
262046                   (  104) TTCCACTTGGAAGC  1
35728                    (  166) GTCCAGCTGTAACT  1
270352                   (  406) GTCCAGCTGTAACT  1
33906                    (  105) TTCCGCGTGTACCC  1
25508                    (  397) TTCTACCTGTCACA  1
22710                    (  389) TGCCAAGTGGAAGC  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 5844 bayes= 9.44028 E= 3.3e-002
  -997   -997    -23    163
  -997   -997   -123    180
  -136    185   -997   -997
  -997    185   -997   -137
   181   -997   -123   -997
  -136    127     36   -997
  -997     27    136   -137
  -997   -997   -997    195
  -997   -997    194   -137
  -997   -997     36    143
   164    -32   -997   -997
   164   -132   -997   -137
  -997    149     36   -997
   -36    127   -997    -37
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 10 E= 3.3e-002
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  0.100000  0.900000
 0.100000  0.900000  0.000000  0.000000
 0.000000  0.900000  0.000000  0.100000
 0.900000  0.000000  0.100000  0.000000
 0.100000  0.600000  0.300000  0.000000
 0.000000  0.300000  0.600000  0.100000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.900000  0.100000
 0.000000  0.000000  0.300000  0.700000
 0.800000  0.200000  0.000000  0.000000
 0.800000  0.100000  0.000000  0.100000
 0.000000  0.700000  0.300000  0.000000
 0.200000  0.600000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[TG]TCCA[CG][GC]TG[TG][AC]A[CG][CAT]
--------------------------------------------------------------------------------

Time 2.43 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 9 llr = 106 E-value = 4.1e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  263a8:a:28a1
pos.-specific     C  8:7:27:98::9
probability       G  :4:::::1:2::
matrix            T  :::::3::::::

         bits    2.1
                 1.9    *  *   *
                 1.7    *  *   *
                 1.5    *  **  **
Relative         1.3 *  ** ******
Entropy          1.0 ************
(16.9 bits)      0.8 ************
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CACAACACCAAC
consensus            AGA CT  AG
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
35728                       249  1.34e-07 CACGCCAAAA CGCAACACCAAC GTTCACAACG
270352                      489  1.34e-07 CACGCCAAAA CGCAACACCAAC
19465                       447  1.44e-06 TCATTATGAC AAAAACACCAAC ACCCTTCTTT
2184                        451  2.32e-06 CAGCTTTACA CACAACACCAAA CTCATCTCCA
25508                       191  2.64e-06 AGCTCTCCTT CACACCACCGAC ATTGAGCTTC
262046                      419  3.09e-06 TATCATCTCT CGCACCACCGAC GTCTCAACGT
23402                       455  3.52e-06 CGCGCACCCA CAAAATACAAAC TTGCCATCCA
33906                        87  4.00e-06 TGTACACTAG CGAAATACAAAC GGACGCTTCC
32181                       423  9.56e-06 ATCACAACAA AACAATAGCAAC AACTCAACCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------
----------------  -------------
35728                             1.3e-07  248_[+3]_240
270352                            1.3e-07  488_[+3]
19465                             1.4e-06  446_[+3]_42
2184                              2.3e-06  450_[+3]_38
25508                             2.6e-06  190_[+3]_298
262046                            3.1e-06  418_[+3]_70
23402                             3.5e-06  454_[+3]_34
33906                               4e-06  86_[+3]_402
32181                             9.6e-06  422_[+3]_66
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
35728                    (  249) CGCAACACCAAC  1
270352                   (  489) CGCAACACCAAC  1
19465                    (  447) AAAAACACCAAC  1
2184                     (  451) CACAACACCAAA  1
25508                    (  191) CACACCACCGAC  1
262046                   (  419) CGCACCACCGAC  1
23402                    (  455) CAAAATACAAAC  1
33906                    (   87) CGAAATACAAAC  1
32181                    (  423) AACAATAGCAAC  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.48101 E= 4.1e-001
   -21    164   -982   -982
   111   -982     92   -982
    37    142   -982   -982
   196   -982   -982   -982
   160    -17   -982   -982
  -982    142   -982     36
   196   -982   -982   -982
  -982    183   -107   -982
   -21    164   -982   -982
   160   -982     -8   -982
   196   -982   -982   -982
  -121    183   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 4.1e-001
 0.222222  0.777778  0.000000  0.000000
 0.555556  0.000000  0.444444  0.000000
 0.333333  0.666667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 1.000000  0.000000  0.000000  0.000000
 0.000000
 0.888889  0.111111  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.111111  0.888889  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[CA][AG][CA]A[AC][CT]AC[CA][AG]AC
--------------------------------------------------------------------------------

Time 3.56 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************
--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
19465                            2.17e-07  270_[+1(9.97e-06)]_136_\
    [+2(5.51e-07)]_10_[+3(1.44e-06)]_42
2184                             1.50e-04  179_[+1(2.69e-06)]_255_\
    [+3(2.32e-06)]_38
21888                            2.92e-05  281_[+1(1.96e-06)]_59_\
    [+2(4.85e-07)]_130
22710                            1.20e-04  219_[+1(2.49e-06)]_153_\
    [+2(4.76e-06)]_98
23402                            1.40e-07  166_[+1(3.91e-06)]_243_\
    [+2(3.55e-07)]_15_[+3(3.52e-06)]_34
25508                            4.10e-10  190_[+3(2.64e-06)]_132_\
    [+1(7.34e-10)]_46_[+2(4.49e-06)]_90
25708                            3.75e-04  239_[+1(7.36e-08)]_245
262046                           4.81e-09  38_[+1(5.67e-08)]_49_[+2(7.03e-07)]_\
    178_[+3(4.81e-05)]_111_[+3(3.09e-06)]_70
270352                           2.85e-09  50_[+1(5.71e-07)]_320_\
    [+1(3.44e-05)]_3_[+2(9.10e-07)]_69_[+3(1.34e-07)]
32181                            9.48e-09  122_[+1(1.82e-07)]_221_\
    [+2(1.48e-07)]_49_[+3(9.56e-06)]_66
33906                            1.47e-07  15_[+1(8.32e-07)]_55_[+3(4.00e-06)]_\
    6_[+2(1.54e-06)]_382
35728                            1.22e-07  146_[+1(3.44e-05)]_3_[+2(9.10e-07)]_\
    69_[+3(1.34e-07)]_240
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************