********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/70/70.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31973                    1.0000    500  47505                    1.0000    500
14478                    1.0000    500  48078                    1.0000    500
22659                    1.0000    500  43695                    1.0000    500
43802                    1.0000    500  10398                    1.0000    500
44329                    1.0000    500  45994                    1.0000    500
49896                    1.0000    500  50158                    1.0000    500
50295                    1.0000    500  34138                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/70/70.seqs.fa -oc motifs/70 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.243 C 0.255 G 0.235 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.243 C 0.255 G 0.235 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 13 llr = 143 E-value = 5.3e-006
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::a:4:1:85:
pos.-specific     C  2:9:81::8::3
probability       G  721:::a:1:5:
matrix            T  28::25:922:7

         bits    2.1    *  *
                 1.9    *  *
                 1.7   **  *
                 1.5   **  ** *
Relative         1.3  **** ** *
Entropy          1.0  **** ******
(15.9 bits)      0.8 ***** ******
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GTCACTGTCAGT
consensus                 A    AC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
31973                       280  6.46e-08 CTTTCTCACA GTCACTGTCAGT CAAGTTTTTC
43695                       408  2.51e-07 AATTTCCTTC GTCACAGTCAAT CATCAACGAA
22659                       426  2.51e-07 AATCTCCCAA GTCACAGTCAAT CAAACGTATT
10398                       346  3.13e-07 AAAATTCGTT GTCACTGTCAGC CACTAACACC
45994                       317  4.32e-07 TCAGCGGGAT GTCACTGTCAAC AAAGCATGTT
50158                       178  7.58e-07 TACGATTACT GTCACTGTTAGT TGACAGTGAA
44329                       128  1.42e-06 GCAGCGAATA CTCACTGTCAAT ATCGATGCAG
14478                       228  2.00e-06 AATCTAACCA TTCACAGTCAAT TCTCAGGTTG
43802                       336  8.57e-06 CGTCGTCGTT GTCATCGTCAGT CTTCTATCGC
50295                       162  1.71e-05 CTGTCACAGT GTCACAGTTTGC GGTTCTGCAA
47505                       285  2.01e-05 ACACACGGAT GTGACAGTGAGT CTCGGACATA
48078                       327  3.98e-05 TCTCGGCGCT CGCATTGTCAAC AATTGGTACG
34138                       162  7.44e-05 GACTGTGACC TGCACTGACTGT TTTATAGAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31973                             6.5e-08  279_[+1]_209
43695                             2.5e-07  407_[+1]_81
22659                             2.5e-07  425_[+1]_63
10398                             3.1e-07  345_[+1]_143
45994                             4.3e-07  316_[+1]_172
50158                             7.6e-07  177_[+1]_311
44329                             1.4e-06  127_[+1]_361
14478                               2e-06  227_[+1]_261
43802                             8.6e-06  335_[+1]_153
50295                             1.7e-05  161_[+1]_327
47505                               2e-05  284_[+1]_204
48078                               4e-05  326_[+1]_162
34138                             7.4e-05  161_[+1]_327
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=13
31973                    (  280) GTCACTGTCAGT  1
43695                    (  408) GTCACAGTCAAT  1
22659                    (  426) GTCACAGTCAAT  1
10398                    (  346) GTCACTGTCAGC  1
45994                    (  317) GTCACTGTCAAC  1
50158                    (  178) GTCACTGTTAGT  1
44329                    (  128) CTCACTGTCAAT  1
14478                    (  228) TTCACAGTCAAT  1
43802                    (  336) GTCATCGTCAGT  1
50295                    (  162) GTCACAGTTTGC  1
47505                    (  285) GTGACAGTGAGT  1
48078                    (  327) CGCATTGTCAAC  1
34138                    (  162) TGCACTGACTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.56922 E= 5.3e-006
 -1035    -73    156    -79
 -1035  -1035    -61    166
 -1035    186   -161  -1035
   204  -1035  -1035  -1035
 -1035    173  -1035    -79
    66   -172  -1035    101
 -1035  -1035    209  -1035
  -166  -1035  -1035    179
 -1035    159   -161    -79
   180  -1035  -1035    -79
    92  -1035    119  -1035
 -1035     27  -1035    138
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 5.3e-006
 0.000000  0.153846  0.692308  0.153846
 0.000000  0.000000  0.153846  0.846154
 0.000000  0.923077  0.076923  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.846154  0.000000  0.153846
 0.384615  0.076923  0.000000  0.538462
 0.000000  0.000000  1.000000  0.000000
 0.076923  0.000000  0.000000  0.923077
 0.000000  0.769231  0.076923  0.153846
 0.846154  0.000000  0.000000  0.153846
 0.461538  0.000000  0.538462  0.000000
 0.000000  0.307692  0.000000  0.692308
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GTCAC[TA]GTCA[GA][TC]
--------------------------------------------------------------------------------

Time 1.85 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 9 llr = 103 E-value = 4.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::9:43::8643
pos.-specific     C  ::1a:::::::4
probability       G  1a:::7:a21:2
matrix            T  9:::6:a::36:

         bits    2.1  *     *
                 1.9  * *  **
                 1.7  * *  **
                 1.5 ****  **
Relative         1.3 **** ****
Entropy          1.0 ********* *
(16.5 bits)      0.8 ********* *
                 0.6 ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGACTGTGAATC
consensus                AA  GTAA
sequence                        G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
50158                       190  1.74e-07 CACTGTTAGT TGACAGTGAATC AATGTTGTTG
22659                       146  3.90e-07 CTTCCTTCTT TGACTGTGAAAA GGAGTTTGGC
44329                       373  5.63e-07 TGATGCATTT TGACTGTGATTC TCTTCTTCCA
10398                       323  5.63e-07 TGACTGACAG TGACTGTGAATG AAAAATTCGT
31973                       169  7.89e-07 ACTGACTGAC TGACTGTGATAC CTACCTAGTC
43695                        66  5.50e-06 AAAATATTTT TGACAATGGATA TTTCGGTCTA
50295                       247  7.45e-06 TAACGTGTTG TGACAATGGTAC GTCCATCACA
45994                        60  1.23e-05 GATTCAAACA TGCCAATGAATG TTTGGAAAAA
34138                       210  1.43e-05 AGATTCGCCT GGACTGTGAGAA CTCCGTGCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50158                             1.7e-07  189_[+2]_299
22659                             3.9e-07  145_[+2]_343
44329                             5.6e-07  372_[+2]_116
10398                             5.6e-07  322_[+2]_166
31973                             7.9e-07  168_[+2]_320
43695                             5.5e-06  65_[+2]_423
50295                             7.5e-06  246_[+2]_242
45994                             1.2e-05  59_[+2]_429
34138                             1.4e-05  209_[+2]_279
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=9
50158                    (  190) TGACAGTGAATC  1
22659                    (  146) TGACTGTGAAAA  1
44329                    (  373) TGACTGTGATTC  1
10398                    (  323) TGACTGTGAATG  1
31973                    (  169) TGACTGTGATAC  1
43695                    (   66) TGACAATGGATA  1
50295                    (  247) TGACAATGGTAC  1
45994                    (   60) TGCCAATGAATG  1
34138                    (  210) GGACTGTGAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.70369 E= 4.7e+001
  -982   -982   -108    174
  -982   -982    209   -982
   187   -119   -982   -982
  -982    197   -982   -982
    87   -982   -982    106
    45   -982    150   -982
  -982   -982   -982    191
  -982   -982    209   -982
   168   -982     -8   -982
   119   -982   -108     32
    87   -982   -982    106
    45     80     -8   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 4.7e+001
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.000000  1.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.444444  0.000000  0.000000  0.555556
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.555556  0.000000  0.111111  0.333333
 0.444444  0.000000  0.000000  0.555556
 0.333333  0.444444  0.222222  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TGAC[TA][GA]TG[AG][AT][TA][CAG]
--------------------------------------------------------------------------------

Time 3.44 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 7 llr = 85 E-value = 4.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :19:9aa:31::
pos.-specific     C  a:::1::::946
probability       G  :91a:::74::4
matrix            T  :::::::33:6:

         bits    2.1    * **
                 1.9 *  * **
                 1.7 *  * **
                 1.5 *******  *
Relative         1.3 ******** *
Entropy          1.0 ******** ***
(17.5 bits)      0.8 ******** ***
                 0.6 ******** ***
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGAGAAAGGCTC
consensus                   TA CG
sequence                     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44329                        66  3.70e-07 GACAGCAAGG CGAGAAAGACTG CGCCGGCTTG
47505                       379  3.70e-07 TGTGGGAGAT CGAGAAAGACCC GTTTTGCAGT
45994                       423  5.61e-07 GGTTTCGGTA CGAGAAAGTCCG TCCAATCGGC
34138                       299  1.19e-06 ATTCATTATT CGAGAAAGGATC GGTCGGACCG
48078                       309  2.31e-06 CAACAATACT CAAGAAAGTCTC GGCGCTCGCA
49896                        71  3.85e-06 TCGGTGGGGG CGGGAAATGCCC TCAACGGTGT
14478                       101  4.06e-06 AAGTTTGCCG CGAGCAATGCTG TTTCATCATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44329                             3.7e-07  65_[+3]_423
47505                             3.7e-07  378_[+3]_110
45994                             5.6e-07  422_[+3]_66
34138                             1.2e-06  298_[+3]_190
48078                             2.3e-06  308_[+3]_180
49896                             3.8e-06  70_[+3]_418
14478                             4.1e-06  100_[+3]_388
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=7
44329                    (   66) CGAGAAAGACTG  1
47505                    (  379) CGAGAAAGACCC  1
45994                    (  423) CGAGAAAGTCCG  1
34138                    (  299) CGAGAAAGGATC  1
48078                    (  309) CAAGAAAGTCTC  1
49896                    (   71) CGGGAAATGCCC  1
14478                    (  101) CGAGCAATGCTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.77593 E= 4.5e+002
  -945    197   -945   -945
   -77   -945    186   -945
   182   -945    -72   -945
  -945   -945    209   -945
   182    -83   -945   -945
   204   -945   -945   -945
   204   -945   -945   -945
  -945   -945    160     10
    23   -945     86     10
   -77    175   -945   -945
  -945     75   -945    110
  -945    116     86   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 4.5e+002
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.857143  0.142857  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.714286  0.285714
 0.285714  0.000000  0.428571  0.285714
 0.142857  0.857143  0.000000  0.000000
 0.000000  0.428571  0.000000  0.571429
 0.000000  0.571429  0.428571  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CGAGAAA[GT][GAT]C[TC][CG]
--------------------------------------------------------------------------------

Time 5.31 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31973                            2.24e-06  168_[+2(7.89e-07)]_99_\
    [+1(6.46e-08)]_209
47505                            5.72e-07  285_[+2(3.06e-06)]_81_\
    [+3(3.70e-07)]_110
14478                            1.53e-04  100_[+3(4.06e-06)]_115_\
    [+1(2.00e-06)]_261
48078                            1.36e-03  308_[+3(2.31e-06)]_6_[+1(3.98e-05)]_\
    162
22659                            1.89e-06  145_[+2(3.90e-07)]_268_\
    [+1(2.51e-07)]_63
43695                            3.62e-05  65_[+2(5.50e-06)]_254_\
    [+1(7.61e-06)]_64_[+1(2.51e-07)]_81
43802                            4.20e-02  335_[+1(8.57e-06)]_153
10398                            7.87e-07  322_[+2(5.63e-07)]_11_\
    [+1(3.13e-07)]_143
44329                            1.09e-08  65_[+3(3.70e-07)]_50_[+1(1.42e-06)]_\
    233_[+2(5.63e-07)]_116
45994                            9.04e-08  59_[+2(1.23e-05)]_245_\
    [+1(4.32e-07)]_94_[+3(5.61e-07)]_66
49896                            2.16e-02  70_[+3(3.85e-06)]_418
50158                            2.54e-06  177_[+1(7.58e-07)]_[+2(1.74e-07)]_\
    26_[+1(4.71e-05)]_261
50295                            1.05e-03  161_[+1(1.71e-05)]_73_\
    [+2(7.45e-06)]_242
34138                            2.03e-05  161_[+1(7.44e-05)]_36_\
    [+2(1.43e-05)]_77_[+3(1.19e-06)]_190
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************