********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/386/386.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31556                    1.0000    500  6915                     1.0000    500
43641                    1.0000    500  49745                    1.0000    500
30898                    1.0000    500  50255                    1.0000    500
44818                    1.0000    500  45029                    1.0000    500
42686                    1.0000    500  42912                    1.0000    500
39719                    1.0000    500  44293                    1.0000    500
49190                    1.0000    500  38174                    1.0000    500
44666                    1.0000    500  50105                    1.0000    500
39471                    1.0000    500  37958                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/386/386.seqs.fa -oc motifs/386 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.236 G 0.221 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.236 G 0.221 T 0.271
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 20 sites = 16 llr = 191 E-value = 5.7e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  686493678:41169821:6
pos.-specific     C  1:121:23:93221:161a:
probability       G  1::1:2::::1743:1:8::
matrix            T  2233152:31313:1:31:4

         bits    2.2                   *
                 2.0                   *
                 1.7          *        *
                 1.5          *    *   *
Relative         1.3     *    *    *   *
Entropy          1.1  *  *  ***    *   *
(17.2 bits)      0.9  *  *  *** *  ** ***
                 0.7  ** * **** * *******
                 0.4 *** ****** * *******
                 0.2 *** ****************
                 0.0 --------------------

Multilevel           AAAAATAAACAGGAAACGCA
consensus              TT A CT T TG  T  T
sequence                       C

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
37958                       391  2.36e-09 TGAACCACAC AAAAAAAAACTGGCAACGCT ATTCGTCAAG
42686                       173  5.53e-09 TTTGGTAGCG AAAGAAAAACAGGGAATGCA CAAAATCGCG
30898                       312  6.06e-08 CTGGAGCAAA AAACAATAACTGTAAATGCT TTGGAAGCAC
43641                        15  4.49e-07 CAATTGGGTC AAAAATTAACTGTAAAATCA AACTTACTGT
38174                       323  4.96e-07 CTCGGAGACA AAATCAAATCACGAAACGCA GAACTACCTA
45029                       306  8.79e-07 ACTTTGGTAC AAACATACACACCAACAGCA CAAAGACTCC
44818                        55  1.26e-06 CGTAGAACCC CATAATCCTCAGTGAACGCT AATATGCGAG
39719                       115  1.50e-06 CCCGACCTGT AATTTTAAACGCGAAACGCA GTTTGTGTGT
50255                        29  1.50e-06 GACTCACTAC TTAAATAAATCGGAAAAGCA TTTTTTTCAT
50105                       399  1.64e-06 ACCACAGGCC ATTAATCATCAGGGAACACT AGGACCAACG
44293                       404  1.78e-06 GGTAGTACAA TAATAGACACCACAAACGCT CCGCATCAAC
42912                       318  2.29e-06 GACATCCCAA AAAGAGACACAGCCTACGCA ATTTTCGAAG
31556                       356  3.16e-06 CTCATCTAAT GAACATAAACTGTAAGTCCA AGTCTAGATT
49745                       321  3.41e-06 GGCATATCGA GATTAGTAACCGTGACCGCT ATTGCTTTTT
49190                       167  8.19e-06 GTCAGGCGAC AACAAACAACTTGGAGTGCT GGTGAAGTAT
44666                       255  2.12e-05 GTGCTGTCTT TTCAATACTCCGAAAACACA AAGAGCCCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37958                             2.4e-09  390_[+1]_90
42686                             5.5e-09  172_[+1]_308
30898                             6.1e-08  311_[+1]_169
43641                             4.5e-07  14_[+1]_466
38174                               5e-07  322_[+1]_158
45029                             8.8e-07  305_[+1]_175
44818                             1.3e-06  54_[+1]_426
39719                             1.5e-06  114_[+1]_366
50255                             1.5e-06  28_[+1]_452
50105                             1.6e-06  398_[+1]_82
44293                             1.8e-06  403_[+1]_77
42912                             2.3e-06  317_[+1]_163
31556                             3.2e-06  355_[+1]_125
49745                             3.4e-06  320_[+1]_160
49190                             8.2e-06  166_[+1]_314
44666                             2.1e-05  254_[+1]_226
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=16
37958                    ( 391) AAAAAAAAACTGGCAACGCT  1
42686                    ( 173) AAAGAAAAACAGGGAATGCA  1
30898                    ( 312) AAACAATAACTGTAAATGCT  1
43641                    (  15) AAAAATTAACTGTAAAATCA  1
38174                    ( 323) AAATCAAATCACGAAACGCA  1
45029                    ( 306) AAACATACACACCAACAGCA  1
44818                    (  55) CATAATCCTCAGTGAACGCT  1
39719                    ( 115) AATTTTAAACGCGAAACGCA  1
50255                    (  29) TTAAATAAATCGGAAAAGCA  1
50105                    ( 399) ATTAATCATCAGGGAACACT  1
44293                    ( 404) TAATAGACACCACAAACGCT  1
42912                    ( 318) AAAGAGACACAGCCTACGCA  1
31556                    ( 356) GAACATAAACTGTAAGTCCA  1
49745                    ( 321) GATTAGTAACCGTGACCGCT  1
49190                    ( 167) AACAAACAACTTGGAGTGCT  1
44666                    ( 255) TTCAATACTCCGAAAACACA  1
//

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 8658 bayes= 9.07715 E= 5.7e-002
   120   -192    -82    -53
   158  -1064  -1064    -53
   120    -92  -1064    -12
    68    -33    -82    -12
   168   -192  -1064   -211
    20  -1064    -23     88
   120    -33  -1064    -53
   134     40  -1064  -1064
   146  -1064  -1064    -12
 -1064    199  -1064   -211
    46      8   -182     21
  -212    -33    164   -211
  -212    -33     99     21
   105    -92     50  -1064
   178  -1064  -1064   -211
   146    -92    -82  -1064
   -54    125  -1064    -12
  -112   -192    177   -211
 -1064    208  -1064  -1064
   105  -1064  -1064     69
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 16 E= 5.7e-002
 0.625000  0.062500  0.125000  0.187500
 0.812500  0.000000  0.000000  0.187500
 0.625000  0.125000  0.000000  0.250000
 0.437500  0.187500  0.125000  0.250000
 0.875000  0.062500  0.000000  0.062500
 0.312500  0.000000  0.187500  0.500000
 0.625000  0.187500  0.000000  0.187500
 0.687500  0.312500  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.937500  0.000000  0.062500
 0.375000  0.250000  0.062500  0.312500
 0.062500  0.187500  0.687500  0.062500
 0.062500  0.187500  0.437500  0.312500
 0.562500  0.125000  0.312500  0.000000
 0.937500  0.000000  0.000000  0.062500
 0.750000  0.125000  0.125000  0.000000
 0.187500  0.562500  0.000000  0.250000
 0.125000  0.062500  0.750000  0.062500
 0.000000  1.000000  0.000000  0.000000
 0.562500  0.000000  0.000000  0.437500
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
AA[AT][AT]A[TA]A[AC][AT]C[ATC]G[GT][AG]AA[CT]GC[AT]
--------------------------------------------------------------------------------

Time 2.75 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 16 sites = 6 llr = 92 E-value = 5.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::2:233::::::::
pos.-specific     C  :::::2:27::a:7::
probability       G  :2:8a:23:a::2358
matrix            T  a8a::7523:a:8:52

         bits    2.2     *    * *
                 2.0 * * *    ***
                 1.7 * * *    ***
                 1.5 * ***    ***   *
Relative         1.3 *****    ****  *
Entropy          1.1 *****   ********
(22.2 bits)      0.9 *****   ********
                 0.7 ******  ********
                 0.4 ******* ********
                 0.2 ******* ********
                 0.0 ----------------

Multilevel           TTTGGTTACGTCTCGG
consensus                  AGT    GT
sequence

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
44666                       375  1.21e-08 GGCGCTTCAC TTTGGTGCCGTCTCGG ACACGAGCTC
44818                       193  1.21e-08 CATTTGTTTC TTTGGTTTCGTCTGTG CCGCTTCTCG
30898                       151  2.63e-08 AAGTGCGTAG TTTGGAAACGTCTCTG GTATCGTCGG
39471                        44  7.11e-08 GGTTCCAAAA TGTGGTAATGTCTCTG TCGCAACACG
39719                       171  1.36e-07 GACTTCTATG TTTAGCTGTGTCTCGG GAGAGAGGCT
37958                       134  1.44e-07 TACATAGGGT TTTGGTTGCGTCGGGT CAATGAGTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44666                             1.2e-08  374_[+2]_110
44818                             1.2e-08  192_[+2]_292
30898                             2.6e-08  150_[+2]_334
39471                             7.1e-08  43_[+2]_441
39719                             1.4e-07  170_[+2]_314
37958                             1.4e-07  133_[+2]_351
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=6
44666                    ( 375) TTTGGTGCCGTCTCGG  1
44818                    ( 193) TTTGGTTTCGTCTGTG  1
30898                    ( 151) TTTGGAAACGTCTCTG  1
39471                    (  44) TGTGGTAATGTCTCTG  1
39719                    ( 171) TTTAGCTGTGTCTCGG  1
37958                    ( 134) TTTGGTTGCGTCGGGT  1
//

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 10.1645 E= 5.9e+001
  -923   -923   -923    188
  -923   -923    -40    162
  -923   -923   -923    188
   -71   -923    192   -923
  -923   -923    218   -923
   -71    -50   -923    130
    29   -923    -40     88
    29    -50     59    -70
  -923    149   -923     30
  -923   -923    218   -923
  -923   -923   -923    188
  -923    208   -923   -923
  -923   -923    -40    162
  -923    149     59   -923
  -923   -923    118     88
  -923   -923    192    -70
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 5.9e+001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.166667  0.000000  0.666667
 0.333333  0.000000  0.166667  0.500000
 0.333333  0.166667  0.333333  0.166667
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.833333  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TTTGGT[TA][AG][CT]GTCT[CG][GT]G
--------------------------------------------------------------------------------

Time 5.63 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 10 llr = 110 E-value = 6.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  67:1428:::::
pos.-specific     C  42::3:1:::a:
probability       G  :::41815:::a
matrix            T  :1a52::5aa::

         bits    2.2           **
                 2.0   *     ****
                 1.7   *     ****
                 1.5   *     ****
Relative         1.3   *  *  ****
Entropy          1.1 * *  *******
(15.9 bits)      0.9 * *  *******
                 0.7 **** *******
                 0.4 **** *******
                 0.2 ************
                 0.0 ------------

Multilevel           AATTAGAGTTCG
consensus            CC GCA T
sequence                 T

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
43641                        65  2.55e-07 GAGAGCTGCA AATGCGAGTTCG ATTTCGAATC
39471                        17  1.60e-06 GAAACTGGAA ACTGAGAGTTCG CGTTTGGTTC
49745                       115  1.71e-06 TAAGAAATGG AATTGGAGTTCG TTGCTTGTAT
50255                        62  1.96e-06 TTTTCATTAT CATTTGATTTCG CTGGTTAATT
39719                        28  4.20e-06 AAATATTTCA AATGAAATTTCG ATCCGAATTA
31556                       133  4.20e-06 GCTGACATCG AATAAGATTTCG TTAGTTCACA
44818                       163  5.95e-06 TTTCTTACTC ATTTCGAGTTCG CCCTCATGCA
37958                       296  6.49e-06 TTTTACATTA CATTAGCGTTCG CAAAATCGCG
42912                       264  1.33e-05 GAGGATCGCG CATGTGGTTTCG TGGTCTTGCC
30898                       468  1.57e-05 CTTTGTTTCC CCTTCAATTTCG CTCCCTGTCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43641                             2.5e-07  64_[+3]_424
39471                             1.6e-06  16_[+3]_472
49745                             1.7e-06  114_[+3]_374
50255                               2e-06  61_[+3]_427
39719                             4.2e-06  27_[+3]_461
31556                             4.2e-06  132_[+3]_356
44818                               6e-06  162_[+3]_326
37958                             6.5e-06  295_[+3]_193
42912                             1.3e-05  263_[+3]_225
30898                             1.6e-05  467_[+3]_21
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
43641                    (  65) AATGCGAGTTCG  1
39471                    (  17) ACTGAGAGTTCG  1
49745                    ( 115) AATTGGAGTTCG  1
50255                    (  62) CATTTGATTTCG  1
39719                    (  28) AATGAAATTTCG  1
31556                    ( 133) AATAAGATTTCG  1
44818                    ( 163) ATTTCGAGTTCG  1
37958                    ( 296) CATTAGCGTTCG  1
42912                    ( 264) CATGTGGTTTCG  1
30898                    ( 468) CCTTCAATTTCG  1
//

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 10.0318 E= 6.5e+002
   114     76   -997   -997
   136    -24   -997   -143
  -997   -997   -997    188
  -144   -997     86     88
    55     34   -114    -44
   -44   -997    186   -997
   155   -124   -114   -997
  -997   -997    118     88
  -997   -997   -997    188
  -997   -997   -997    188
  -997    208   -997   -997
  -997   -997    218   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 6.5e+002
 0.600000  0.400000  0.000000  0.000000
 0.700000  0.200000  0.000000  0.100000
 0.000000  0.000000  0.000000  1.000000
 0.100000  0.000000  0.400000  0.500000
 0.400000  0.300000  0.100000  0.200000
 0.200000  0.000000  0.800000  0.000000
 0.800000  0.100000  0.100000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AC][AC]T[TG][ACT][GA]A[GT]TTCG
--------------------------------------------------------------------------------

Time 8.51 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31556                            3.91e-05  132_[+3(4.20e-06)]_211_\
    [+1(3.16e-06)]_125
6915                             1.63e-01  213_[+2(2.77e-05)]_210_\
    [+2(2.39e-05)]_45
43641                            4.21e-06  14_[+1(4.49e-07)]_30_[+3(2.55e-07)]_\
    424
49745                            7.37e-05  114_[+3(1.71e-06)]_194_\
    [+1(3.41e-06)]_160
30898                            1.08e-09  150_[+2(2.63e-08)]_145_\
    [+1(6.06e-08)]_136_[+3(1.57e-05)]_21
50255                            4.25e-05  28_[+1(1.50e-06)]_13_[+3(1.96e-06)]_\
    427
44818                            3.58e-09  54_[+1(1.26e-06)]_88_[+3(5.95e-06)]_\
    18_[+2(1.21e-08)]_292
45029                            5.96e-03  305_[+1(8.79e-07)]_175
42686                            2.01e-04  172_[+1(5.53e-09)]_308
42912                            2.85e-04  263_[+3(1.33e-05)]_42_\
    [+1(2.29e-06)]_163
39719                            2.84e-08  27_[+3(4.20e-06)]_75_[+1(1.50e-06)]_\
    36_[+2(1.36e-07)]_314
44293                            1.96e-02  403_[+1(1.78e-06)]_77
49190                            1.31e-02  166_[+1(8.19e-06)]_314
38174                            6.92e-03  322_[+1(4.96e-07)]_158
44666                            7.18e-06  254_[+1(2.12e-05)]_100_\
    [+2(1.21e-08)]_110
50105                            5.17e-03  398_[+1(1.64e-06)]_82
39471                            4.24e-06  16_[+3(1.60e-06)]_15_[+2(7.11e-08)]_\
    441
37958                            1.13e-10  133_[+2(1.44e-07)]_146_\
    [+3(6.49e-06)]_83_[+1(2.36e-09)]_90
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************