********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/258/258.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1598                     1.0000    500  47985                    1.0000    500
4604                     1.0000    500  48260                    1.0000    500
38794                    1.0000    500  38958                    1.0000    500
43586                    1.0000    500  2542                     1.0000    500
49743                    1.0000    500  49932                    1.0000    500
30823                    1.0000    500  50257                    1.0000    500
50343                    1.0000    500  41518                    1.0000    500
42611                    1.0000    500  42884                    1.0000    500
43098                    1.0000    500  34514                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/258/258.seqs.fa -oc motifs/258 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.260 C 0.241 G 0.238 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.260 C 0.241 G 0.238 T 0.260
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 10 llr = 114 E-value = 5.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :5:29a:4:76:
pos.-specific     C  9:a8::91:::6
probability       G  15::1:15a1::
matrix            T  :::::::::244

         bits    2.1   *     *
                 1.9   *  *  *
                 1.7 * *  ** *
                 1.4 * * *** *
Relative         1.2 * ***** *
Entropy          1.0 ******* * **
(16.5 bits)      0.8 ******* ****
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CACCAACGGAAC
consensus             G A   A TTT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
42611                       198  5.01e-08 AACGGTCCGG CGCCAACGGAAC GCTTTTGATT
50343                        87  7.88e-07 GGTACACCCC CACCAACGGATT TTTTCGATAG
49932                       289  1.34e-06 GCCGTAGAAC CGCCAACCGAAC CCCGTGCCGT
38794                       277  1.34e-06 GACGAAGAGA CACCAACAGTAC TTGAATACGC
41518                         3  1.91e-06         TG CACAAACGGATC ATGGTGTCGA
30823                       257  3.03e-06 CCAAAGAATT CGCAAACAGAAT ACTACTCAAA
2542                        288  3.03e-06 CCGGAGGATC CGCCAAGGGAAC AGTTACGGAA
4604                         29  4.02e-06 CACTCGGGTT CACCAACAGTTT CAACTTTCGA
38958                        70  8.04e-06 ATTTCTAGCG CACCGACAGATT CGTATTCGGC
48260                       359  1.26e-05 ACGGTACGTA GGCCAACGGGAC CGACGACATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42611                               5e-08  197_[+1]_291
50343                             7.9e-07  86_[+1]_402
49932                             1.3e-06  288_[+1]_200
38794                             1.3e-06  276_[+1]_212
41518                             1.9e-06  2_[+1]_486
30823                               3e-06  256_[+1]_232
2542                                3e-06  287_[+1]_201
4604                                4e-06  28_[+1]_460
38958                               8e-06  69_[+1]_419
48260                             1.3e-05  358_[+1]_130
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=10
42611                    (  198) CGCCAACGGAAC  1
50343                    (   87) CACCAACGGATT  1
49932                    (  289) CGCCAACCGAAC  1
38794                    (  277) CACCAACAGTAC  1
41518                    (    3) CACAAACGGATC  1
30823                    (  257) CGCAAACAGAAT  1
2542                     (  288) CGCCAAGGGAAC  1
4604                     (   29) CACCAACAGTTT  1
38958                    (   70) CACCGACAGATT  1
48260                    (  359) GGCCAACGGGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 10.7243 E= 5.3e+001
  -997    190   -125   -997
    94   -997    107   -997
  -997    205   -997   -997
   -38    173   -997   -997
   179   -997   -125   -997
   194   -997   -997   -997
  -997    190   -125   -997
    62   -127    107   -997
  -997   -997    207   -997
   143   -997   -125    -38
   120   -997   -997     62
  -997    132   -997     62
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 5.3e+001
 0.000000  0.900000  0.100000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.900000  0.000000  0.100000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.900000  0.100000  0.000000
 0.400000  0.100000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.700000  0.000000  0.100000  0.200000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.600000  0.000000  0.400000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[AG]C[CA]AAC[GA]G[AT][AT][CT]
--------------------------------------------------------------------------------




Time 3.23 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 5 llr = 96 E-value = 5.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  26a82::2:::22:::844::
pos.-specific     C  82:::2:4a::2:::::::a2
probability       G  :2::2:8::aa4::8a:46:8
matrix            T  :::26824:::28a2:22:::

         bits    2.1         ***    *   *
                 1.9   *     ***  * *   *
                 1.7   *     ***  * *   *
                 1.4   *     ***  * *   *
Relative         1.2 * ** ** *** *****  **
Entropy          1.0 * ** ** *** ***** ***
(27.7 bits)      0.8 * ** ** *** ***** ***
                 0.6 ******* *** ***** ***
                 0.4 *********** *********
                 0.2 *********** *********
                 0.0 ---------------------

Multilevel           CAAATTGCCGGGTTGGAAGCG
consensus            AC TACTT   AA T TGA C
sequence              G  G  A   C     T
                                T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
38794                        18  3.87e-11 CCAACCCCGT CCAATTGTCGGTTTGGAAGCG GAGATAGTGA
50343                       182  9.05e-10 GAATTGCTTT CAAATTTCCGGGTTGGATACC ATACCGAAAA
4604                        291  1.25e-09 AGCCAACATG AGAAATGCCGGATTGGAAGCG CCCATTTGGA
30823                        79  1.87e-09 CTTGTGTAGC CAATTTGACGGCTTTGAGACG AAGAGGTTGT
41518                        35  2.66e-09 TTCGTAACGC CAAAGCGTCGGGATGGTGGCG CACTAGCGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38794                             3.9e-11  17_[+2]_462
50343                               9e-10  181_[+2]_298
4604                              1.2e-09  290_[+2]_189
30823                             1.9e-09  78_[+2]_401
41518                             2.7e-09  34_[+2]_445
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=5
38794                    (   18) CCAATTGTCGGTTTGGAAGCG  1
50343                    (  182) CAAATTTCCGGGTTGGATACC  1
4604                     (  291) AGAAATGCCGGATTGGAAGCG  1
30823                    (   79) CAATTTGACGGCTTTGAGACG  1
41518                    (   35) CAAAGCGTCGGGATGGTGGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 11.0057 E= 5.2e+002
   -38    173   -897   -897
   120    -27    -25   -897
   194   -897   -897   -897
   162   -897   -897    -38
   -38   -897    -25    120
  -897    -27   -897    162
  -897   -897    174    -38
   -38     73   -897     62
  -897    205   -897   -897
  -897   -897    207   -897
  -897   -897    207   -897
   -38    -27     75    -38
   -38   -897   -897    162
  -897   -897   -897    194
  -897   -897    174    -38
  -897   -897    207   -897
   162   -897   -897    -38
    62   -897     75    -38
    62   -897    133   -897
  -897    205   -897   -897
  -897    -27    174   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 5.2e+002
 0.200000  0.800000  0.000000  0.000000
 0.600000  0.200000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  0.800000  0.200000
 0.200000  0.400000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.200000  0.400000  0.200000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.400000  0.000000  0.400000  0.200000
 0.400000  0.000000  0.600000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CA][ACG]A[AT][TAG][TC][GT][CTA]CGG[GACT][TA]T[GT]G[AT][AGT][GA]C[GC]
--------------------------------------------------------------------------------




Time 6.23 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 11 llr = 118 E-value = 5.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a93:8:2:552:
pos.-specific     C  :17::9115581
probability       G  :::921761::9
matrix            T  :::1:::3::::

         bits    2.1
                 1.9 *
                 1.7 *  * *     *
                 1.4 ** * *    **
Relative         1.2 ******    **
Entropy          1.0 *******  ***
(15.5 bits)      0.8 ******** ***
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AACGACGGAACG
consensus              A    TCC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
43586                       210  9.27e-07 GTCATTTGCC AACGACAGCCCG AAGGAAAATT
49932                       137  1.24e-06 TGGCAGTTCC AACGACGGAAAG GACGAAAAAG
41518                       354  1.38e-06 TTTGGCTGCC AACGACGCCACG GACCTTTCCA
38958                       210  1.86e-06 CATACTCACA AAAGACGTAACG TACACTTAGA
43098                        27  2.61e-06 GGTTTGCGGA AACGAGGGAACG ACGGATGCTG
38794                       135  2.61e-06 GTGAGGGAGG ACCGACGGACCG AGTCTGATTG
42611                        93  4.19e-06 GAAGGACACG AACGACGTGCCG TGTGCCGTAG
2542                        111  5.81e-06 CTTGACTGTG AAAGACCGCCCG ACTTCGACTT
1598                        245  6.18e-06 CAACTGTAGC AACGACGTCACC CATTGCCTAC
30823                        46  1.72e-05 GCTATCGCAA AACGGCAGCCAG CTTCACGTCA
4604                         86  2.60e-05 TCAATTCGAA AAATGCGGAACG ACAAACGAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43586                             9.3e-07  209_[+3]_279
49932                             1.2e-06  136_[+3]_352
41518                             1.4e-06  353_[+3]_135
38958                             1.9e-06  209_[+3]_279
43098                             2.6e-06  26_[+3]_462
38794                             2.6e-06  134_[+3]_354
42611                             4.2e-06  92_[+3]_396
2542                              5.8e-06  110_[+3]_378
1598                              6.2e-06  244_[+3]_244
30823                             1.7e-05  45_[+3]_443
4604                              2.6e-05  85_[+3]_403
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=11
43586                    (  210) AACGACAGCCCG  1
49932                    (  137) AACGACGGAAAG  1
41518                    (  354) AACGACGCCACG  1
38958                    (  210) AAAGACGTAACG  1
43098                    (   27) AACGAGGGAACG  1
38794                    (  135) ACCGACGGACCG  1
42611                    (   93) AACGACGTGCCG  1
2542                     (  111) AAAGACCGCCCG  1
1598                     (  245) AACGACGTCACC  1
30823                    (   46) AACGGCAGCCAG  1
4604                     (   86) AAATGCGGAACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 9.99787 E= 5.4e+002
   194  -1010  -1010  -1010
   180   -140  -1010  -1010
     7    159  -1010  -1010
 -1010  -1010    193   -152
   165  -1010    -39  -1010
 -1010    192   -139  -1010
   -52   -140    161  -1010
 -1010   -140    142      7
    80     92   -139  -1010
   107     92  -1010  -1010
   -52    176  -1010  -1010
 -1010   -140    193  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 5.4e+002
 1.000000  0.000000  0.000000  0.000000
 0.909091  0.090909  0.000000  0.000000
 0.272727  0.727273  0.000000  0.000000
 0.000000  0.000000  0.909091  0.090909
 0.818182  0.000000  0.181818  0.000000
 0.000000  0.909091  0.090909  0.000000
 0.181818  0.090909  0.727273  0.000000
 0.000000  0.090909  0.636364  0.272727
 0.454545  0.454545  0.090909  0.000000
 0.545455  0.454545  0.000000  0.000000
 0.181818  0.818182  0.000000  0.000000
 0.000000  0.090909  0.909091  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AA[CA]GACG[GT][AC][AC]CG
--------------------------------------------------------------------------------




Time 9.32 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1598                             1.59e-02  244_[+3(6.18e-06)]_244
47985                            3.87e-01  500
4604                             5.01e-09  28_[+1(4.02e-06)]_45_[+3(2.60e-05)]_\
    193_[+2(1.25e-09)]_189
48260                            6.23e-02  358_[+1(1.26e-05)]_130
38794                            8.39e-12  17_[+2(3.87e-11)]_96_[+3(2.61e-06)]_\
    130_[+1(1.34e-06)]_212
38958                            1.28e-04  69_[+1(8.04e-06)]_128_\
    [+3(1.86e-06)]_279
43586                            2.94e-03  209_[+3(9.27e-07)]_279
2542                             8.76e-05  110_[+3(5.81e-06)]_165_\
    [+1(3.03e-06)]_201
49743                            4.64e-01  500
49932                            3.10e-05  136_[+3(1.24e-06)]_140_\
    [+1(1.34e-06)]_200
30823                            3.83e-09  45_[+3(1.72e-05)]_21_[+2(1.87e-09)]_\
    157_[+1(3.03e-06)]_232
50257                            2.82e-01  500
50343                            3.01e-08  86_[+1(7.88e-07)]_83_[+2(9.05e-10)]_\
    298
41518                            3.36e-10  2_[+1(1.91e-06)]_20_[+2(2.66e-09)]_\
    298_[+3(1.38e-06)]_135
42611                            2.61e-06  92_[+3(4.19e-06)]_79_[+1(7.04e-05)]_\
    2_[+1(5.01e-08)]_291
42884                            7.87e-01  500
43098                            2.10e-02  26_[+3(2.61e-06)]_462
34514                            6.10e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************