********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/240/240.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42592                    1.0000    500  54699                    1.0000    500
37459                    1.0000    500  47410                    1.0000    500
48866                    1.0000    500  43377                    1.0000    500
9806                     1.0000    500  7763                     1.0000    500
25974                    1.0000    500  779                      1.0000    500
44782                    1.0000    500  11477                    1.0000    500
45758                    1.0000    500  48598                    1.0000    500
42931                    1.0000    500  43155                    1.0000    500
44497                    1.0000    500  49424                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/240/240.seqs.fa -oc motifs/240 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.234 G 0.228 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.234 G 0.228 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 14 sites = 11 llr = 131 E-value = 1.1e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  61:2a3a:1:75::
pos.-specific     C  12:5:7:::2:::7
probability       G  :5:2:::a:2:513
matrix            T  32a2::::963:9:

         bits    2.1        *
                 1.9   * * **
                 1.7   * * **
                 1.5   * * ***   *
Relative         1.3   * *****   **
Entropy          1.1   * ***** ****
(17.2 bits)      0.9   * ***** ****
                 0.6 * * **********
                 0.4 *** **********
                 0.2 **************
                 0.0 --------------

Multilevel           AGTCACAGTTAGTC
consensus            T    A    TA G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
9806                        219  8.69e-08 ACGCTTTTAT AGTTACAGTTAATC TGTCGTCGGC
25974                       473  2.05e-07 CTTCGGGTTG TCTCACAGTTAGTC AGCAAAGCAC
43377                       168  2.05e-07 CCTTGATGCG AGTGACAGTGAGTC TGTTTTGCAG
47410                       108  2.41e-07 TTCCACACGA AGTCACAGTCAGTG CTTTTTGTGC
49424                       438  1.42e-06 TGTTCTCGTA ATTTACAGTTAGTG CGGGGATAGA
42592                       199  1.80e-06 GTCAGGTCGC AGTCAAAGATAGTC AACGTCGATG
44782                       246  2.55e-06 ACGCTCACCA TCTAACAGTTTGTC ATTCGATGTT
7763                        435  3.81e-06 TTTTGGATTG TGTAAAAGTTTATC TCCTGCCTTG
43155                       193  4.21e-06 GTTTCTGTAA AGTGAAAGTGAATG TTTAAATTTT
42931                       402  5.81e-06 ATAAGCCAAA ATTCACAGTCAAGC TTATGGCCCC
45758                        11  6.89e-06 GGAAAACAAG CATCACAGTTTATC AAACTCTGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9806                              8.7e-08  218_[+1]_268
25974                               2e-07  472_[+1]_14
43377                               2e-07  167_[+1]_319
47410                             2.4e-07  107_[+1]_379
49424                             1.4e-06  437_[+1]_49
42592                             1.8e-06  198_[+1]_288
44782                             2.5e-06  245_[+1]_241
7763                              3.8e-06  434_[+1]_52
43155                             4.2e-06  192_[+1]_294
42931                             5.8e-06  401_[+1]_85
45758                             6.9e-06  10_[+1]_476
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=14 seqs=11
9806                     (  219) AGTTACAGTTAATC  1
25974                    (  473) TCTCACAGTTAGTC  1
43377                    (  168) AGTGACAGTGAGTC  1
47410                    (  108) AGTCACAGTCAGTG  1
49424                    (  438) ATTTACAGTTAGTG  1
42592                    (  199) AGTCAAAGATAGTC  1
44782                    (  246) TCTAACAGTTTGTC  1
7763                     (  435) TGTAAAAGTTTATC  1
43155                    (  193) AGTGAAAGTGAATG  1
42931                    (  402) ATTCACAGTCAAGC  1
45758                    (   11) CATCACAGTTTATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 8766 bayes= 9.99195 E= 1.1e+000
   123   -136  -1010      3
  -157    -36    126    -55
 -1010  -1010  -1010    190
   -58     96    -33    -55
   188  -1010  -1010  -1010
     1    163  -1010  -1010
   188  -1010  -1010  -1010
 -1010  -1010    213  -1010
  -157  -1010  -1010    177
 -1010    -36    -33    125
   142  -1010  -1010      3
    75  -1010    126  -1010
 -1010  -1010   -132    177
 -1010    163     26  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 11 E= 1.1e+000
 0.636364  0.090909  0.000000  0.272727
 0.090909  0.181818  0.545455  0.181818
 0.000000  0.000000  0.000000  1.000000
 0.181818  0.454545  0.181818  0.181818
 1.000000  0.000000  0.000000  0.000000
 0.272727  0.727273  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.090909  0.000000  0.000000  0.909091
 0.000000  0.181818  0.181818  0.636364
 0.727273  0.000000  0.000000  0.272727
 0.454545  0.000000  0.545455  0.000000
 0.000000  0.000000  0.090909  0.909091
 0.000000  0.727273  0.272727  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[AT]GTCA[CA]AGTT[AT][GA]T[CG]
--------------------------------------------------------------------------------

Time 3.04 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 18 llr = 161 E-value = 1.0e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a4411496489
pos.-specific     C  ::41:51::61:
probability       G  9:2:924:4111
matrix            T  1::6:211::::

         bits    2.1
                 1.9  *
                 1.7 **  *
                 1.5 **  *      *
Relative         1.3 **  *  *   *
Entropy          1.1 **  *  ** **
(12.9 bits)      0.9 **  *  *****
                 0.6 ** **  *****
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GAATGCGAACAA
consensus              CA GA GA
sequence                  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
779                         374  6.45e-07 TTAGCGACGA GACTGCAAGCAA GCGATTGGAC
42931                        54  4.26e-06 ACAGATTTTC GAATGGGAAAAA TGGGAAATTA
11477                       178  6.78e-06 TACCAGCGAA GAATGGAAGAAA AGGATGCCGA
54699                       232  6.78e-06 CTAGAGTAGT GAGTGGGAACAA CTCTTCGCTG
48866                        44  7.75e-06 TCCAACCTAC GACAGTGAGAAA CACACCTTAC
48598                       232  8.45e-06 GTGTGCGCCA GAATGCCAAAAA TTGGCCAGCA
43155                        43  1.02e-05 TCAACTCTAG GAATGCGTGCAA TCATCTTTCC
37459                        70  2.28e-05 CACCAAACGG GACTGTGAACCA AAATCCAGTC
47410                       444  3.04e-05 GATCCACCCT GACAGAAAGCAA TTCAAACTTC
49424                       244  3.46e-05 ATTATGTTAT TACTGTGAACAA AATCCTCTCG
42592                       362  3.46e-05 CTCCTAACTG GAAAGCTAGAAA AGAGGGTACA
7763                        458  4.01e-05 CTCCTGCCTT GAAAGCGAAAGA CACAGGTGGA
25974                        86  4.48e-05 GTCGAGCATG TAATGTAAACAA TACAATGATG
9806                         18  4.48e-05 GACTGGTCTC GACAAGGAACAA CGGCAAAGCT
44782                        74  5.82e-05 AGGACTCAGT GAACGCCAACAA CTAAATAGTC
43377                        16  5.82e-05 GGGATCAGCG GAGAGCAAAGAA TTGTGAGACA
44497                       324  2.19e-04 GAGGTCGTAA GACTGCATGAAG CTTCCTGACG
45758                       336  2.34e-04 TACATCACAG GAGAACAAGCCA ATCCACACTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
779                               6.5e-07  373_[+2]_115
42931                             4.3e-06  53_[+2]_435
11477                             6.8e-06  177_[+2]_311
54699                             6.8e-06  231_[+2]_257
48866                             7.8e-06  43_[+2]_445
48598                             8.4e-06  231_[+2]_257
43155                               1e-05  42_[+2]_446
37459                             2.3e-05  69_[+2]_419
47410                               3e-05  443_[+2]_45
49424                             3.5e-05  243_[+2]_245
42592                             3.5e-05  361_[+2]_127
7763                                4e-05  457_[+2]_31
25974                             4.5e-05  85_[+2]_403
9806                              4.5e-05  17_[+2]_471
44782                             5.8e-05  73_[+2]_415
43377                             5.8e-05  15_[+2]_473
44497                             0.00022  323_[+2]_165
45758                             0.00023  335_[+2]_153
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=18
779                      (  374) GACTGCAAGCAA  1
42931                    (   54) GAATGGGAAAAA  1
11477                    (  178) GAATGGAAGAAA  1
54699                    (  232) GAGTGGGAACAA  1
48866                    (   44) GACAGTGAGAAA  1
48598                    (  232) GAATGCCAAAAA  1
43155                    (   43) GAATGCGTGCAA  1
37459                    (   70) GACTGTGAACCA  1
47410                    (  444) GACAGAAAGCAA  1
49424                    (  244) TACTGTGAACAA  1
42592                    (  362) GAAAGCTAGAAA  1
7763                     (  458) GAAAGCGAAAGA  1
25974                    (   86) TAATGTAAACAA  1
9806                     (   18) GACAAGGAACAA  1
44782                    (   74) GAACGCCAACAA  1
43377                    (   16) GAGAGCAAAGAA  1
44497                    (  324) GACTGCATGAAG  1
45758                    (  336) GAGAACAAGCCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 8.93074 E= 1.0e+001
 -1081  -1081    196   -126
   188  -1081  -1081  -1081
    71     73    -45  -1081
    52   -207  -1081    106
  -129  -1081    196  -1081
  -228    109     -4    -26
    52   -107     96   -226
   171  -1081  -1081   -126
   104  -1081     96  -1081
    52    125   -203  -1081
   162   -107   -203  -1081
   180  -1081   -203  -1081
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 18 E= 1.0e+001
 0.000000  0.000000  0.888889  0.111111
 1.000000  0.000000  0.000000  0.000000
 0.444444  0.388889  0.166667  0.000000
 0.388889  0.055556  0.000000  0.555556
 0.111111  0.000000  0.888889  0.000000
 0.055556  0.500000  0.222222  0.222222
 0.388889  0.111111  0.444444  0.055556
 0.888889  0.000000  0.000000  0.111111
 0.555556  0.000000  0.444444  0.000000
 0.388889  0.555556  0.055556  0.000000
 0.833333  0.111111  0.055556  0.000000
 0.944444  0.000000  0.055556  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
GA[AC][TA]G[CGT][GA]A[AG][CA]AA
--------------------------------------------------------------------------------

Time 6.05 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 18 llr = 160 E-value = 1.4e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::421:
pos.-specific     C  321743:a:2:9
probability       G  :5:::54:3:11
matrix            T  7393626:378:

         bits    2.1        *
                 1.9        *
                 1.7        *
                 1.5   *    *   *
Relative         1.3   **   *   *
Entropy          1.1 * *** **  **
(12.8 bits)      0.9 * *** **  **
                 0.6 ******** ***
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGTCTGTCATTC
consensus            CT TCCG G
sequence                  T  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47410                       388  1.75e-06 AGGGAGCCAT TTTCTGTCTTTC TTTGGTCATG
49424                       124  3.72e-06 TGTGAGACGT CGTCCGGCGTTC CAAAAAACGT
54699                       478  3.72e-06 CACACTTAGT CGTCTCGCATTC ACACATACGA
48598                       408  6.84e-06 TTCACACTCC TCTCTCTCATTC TCCTTTGGAA
42931                       345  7.94e-06 GAAATGCTTC TGTCCGTCGCTC GATAGAATTG
42592                       111  1.49e-05 GACGGTGATG TGTCTCTCTCTC TTGGTGAGTG
779                         112  1.67e-05 ACGTCACCTT CGTCTGTCATGC GTTTAGAGTA
48866                       415  1.67e-05 CCTGAACCTT TTTCCTTCTTTC CACGGTTGCC
45758                       356  2.58e-05 CAATCCACAC TTTCCGGCGCTC CCAATCCGCA
11477                        45  4.96e-05 GCTTCCATTC TTTCTCGCTATC ATCCCACAAG
25974                       328  5.41e-05 TCGCATCCAG TCTCTGGCATTG TACCAAAGGG
37459                        47  5.41e-05 GAATTGCCGT TTCCTTTCATTC CCACCAAACG
9806                        393  5.86e-05 ACTGGATGCT TGTTTGTCTTTG AAGCGTATTG
7763                         31  8.56e-05 CTAACGTTAC CCTTTTGCATTC CAGTCAAAAT
44497                        26  9.17e-05 ACTCGACATC TTTTTTTCAATC GACGTCATAC
43377                       362  9.17e-05 GATCTCTTTC CGTTCGTCGTGC ACGAAGATTC
43155                       434  1.44e-04 ATCGTCCAAG CGCTCCGCATTC GGGCTACGAA
44782                       184  1.53e-04 CGACTTTCAA TGTCCGTCGAAC TGCCACTGGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47410                             1.8e-06  387_[+3]_101
49424                             3.7e-06  123_[+3]_365
54699                             3.7e-06  477_[+3]_11
48598                             6.8e-06  407_[+3]_81
42931                             7.9e-06  344_[+3]_144
42592                             1.5e-05  110_[+3]_378
779                               1.7e-05  111_[+3]_377
48866                             1.7e-05  414_[+3]_74
45758                             2.6e-05  355_[+3]_133
11477                               5e-05  44_[+3]_444
25974                             5.4e-05  327_[+3]_161
37459                             5.4e-05  46_[+3]_442
9806                              5.9e-05  392_[+3]_96
7763                              8.6e-05  30_[+3]_458
44497                             9.2e-05  25_[+3]_463
43377                             9.2e-05  361_[+3]_127
43155                             0.00014  433_[+3]_55
44782                             0.00015  183_[+3]_305
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=18
47410                    (  388) TTTCTGTCTTTC  1
49424                    (  124) CGTCCGGCGTTC  1
54699                    (  478) CGTCTCGCATTC  1
48598                    (  408) TCTCTCTCATTC  1
42931                    (  345) TGTCCGTCGCTC  1
42592                    (  111) TGTCTCTCTCTC  1
779                      (  112) CGTCTGTCATGC  1
48866                    (  415) TTTCCTTCTTTC  1
45758                    (  356) TTTCCGGCGCTC  1
11477                    (   45) TTTCTCGCTATC  1
25974                    (  328) TCTCTGGCATTG  1
37459                    (   47) TTCCTTTCATTC  1
9806                     (  393) TGTTTGTCTTTG  1
7763                     (   31) CCTTTTGCATTC  1
44497                    (   26) TTTTTTTCAATC  1
43377                    (  362) CGTTCGTCGTGC  1
43155                    (  434) CGCTCCGCATTC  1
44782                    (  184) TGTCCGTCGAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 8.93074 E= 1.4e+002
 -1081     51  -1081    132
 -1081    -49    113     32
 -1081   -107  -1081    173
 -1081    162  -1081      6
 -1081     73  -1081    119
 -1081     25    113    -26
 -1081  -1081     77    119
 -1081    209  -1081  -1081
    71  -1081     29      6
   -70    -49  -1081    132
  -228  -1081   -104    164
 -1081    192   -104  -1081
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 18 E= 1.4e+002
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.166667  0.500000  0.333333
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.722222  0.000000  0.277778
 0.000000  0.388889  0.000000  0.611111
 0.000000  0.277778  0.500000  0.222222
 0.000000  0.000000  0.388889  0.611111
 0.000000  1.000000  0.000000  0.000000
 0.444444  0.000000  0.277778  0.277778
 0.166667  0.166667  0.000000  0.666667
 0.055556  0.000000  0.111111  0.833333
 0.000000  0.888889  0.111111  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[TC][GT]T[CT][TC][GCT][TG]C[AGT]TTC
--------------------------------------------------------------------------------

Time 9.04 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42592                            1.57e-05  110_[+3(1.49e-05)]_76_\
    [+1(1.80e-06)]_149_[+2(3.46e-05)]_127
54699                            7.95e-05  231_[+2(6.78e-06)]_234_\
    [+3(3.72e-06)]_11
37459                            2.46e-03  46_[+3(5.41e-05)]_11_[+2(2.28e-05)]_\
    419
47410                            3.39e-07  107_[+1(2.41e-07)]_266_\
    [+3(1.75e-06)]_44_[+2(3.04e-05)]_45
48866                            8.52e-04  43_[+2(7.75e-06)]_359_\
    [+3(1.67e-05)]_74
43377                            1.76e-05  15_[+2(5.82e-05)]_140_\
    [+1(2.05e-07)]_180_[+3(9.17e-05)]_127
9806                             4.44e-06  17_[+2(4.48e-05)]_189_\
    [+1(8.69e-08)]_160_[+3(5.86e-05)]_96
7763                             1.54e-04  30_[+3(8.56e-05)]_392_\
    [+1(3.81e-06)]_9_[+2(4.01e-05)]_31
25974                            8.86e-06  85_[+2(4.48e-05)]_230_\
    [+3(5.41e-05)]_133_[+1(2.05e-07)]_14
779                              1.20e-04  111_[+3(1.67e-05)]_250_\
    [+2(6.45e-07)]_115
44782                            2.43e-04  73_[+2(5.82e-05)]_160_\
    [+1(2.55e-06)]_241
11477                            3.45e-03  44_[+3(4.96e-05)]_121_\
    [+2(6.78e-06)]_311
45758                            4.03e-04  10_[+1(6.89e-06)]_331_\
    [+3(2.58e-05)]_133
48598                            4.38e-04  231_[+2(8.45e-06)]_164_\
    [+3(6.84e-06)]_81
42931                            3.95e-06  53_[+2(4.26e-06)]_279_\
    [+3(7.94e-06)]_45_[+1(5.81e-06)]_85
43155                            8.05e-05  42_[+2(1.02e-05)]_138_\
    [+1(4.21e-06)]_294
44497                            9.31e-02  25_[+3(9.17e-05)]_463
49424                            3.68e-06  123_[+3(3.72e-06)]_108_\
    [+2(3.46e-05)]_182_[+1(1.42e-06)]_49
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************