********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/72/72.seqs.fa
ALPHABET= ACGT
Sequence name   Weight  Length    Sequence name   Weight  Length
-------------   ------  ------    -------------   ------  ------
47425           1.0000     500    38192           1.0000     500
48008           1.0000     500    5904            1.0000     500
48863           1.0000     500    39529           1.0000     500
40111           1.0000     500    49434           1.0000     500
49464           1.0000     500    49627           1.0000     500
40988           1.0000     500    6974            1.0000     500
44178           1.0000     500    10566           1.0000     500
44232           1.0000     500    47325           1.0000     500
37443           1.0000     500    37578           1.0000     500
47713           1.0000     500    40832           1.0000     500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/72/72.seqs.fa -oc motifs/72 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10000    N=              20
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.270 C 0.237 G 0.224 T 0.269
Background letter frequencies (from dataset with add-one prior applied):
A 0.270 C 0.237 G 0.224 T 0.269
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 13 sites = 13 llr = 140 E-value = 3.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:3a:5::4782:
pos.-specific     C  5:5:a1::31:3:
probability       G  ::2:::a:12218
matrix            T  4a1::4:a2::52

         bits    2.2     * *
                 1.9  * ** **
                 1.7  * ** **
                 1.5  * ** **    *
Relative         1.3  * ** **  * *
Entropy          1.1  * ** **  * *
(15.5 bits)      0.9  * ** ** ** *
                 0.6 ** ***** ** *
                 0.4 ** ***** ** *
                 0.2 *************
                 0.0 -------------

Multilevel           CTCACAGTAAATG
consensus            T A  T  CG C
sequence                     T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start  P-value                 Site
-------------  ----- ---------            -------------
44232            371  1.80e-07 GAAGAACACC TTCACAGTCAATG TAAAGCTCAT
10566            373  3.21e-07 CGACATCGAG CTAACAGTAAACG TCTCATACGT
47325            173  6.53e-07 GAAGATTTTG CTAACTGTAAACG CTTCCATTGC
49434            223  8.64e-07 GGCAATCATT TTCACTGTTAATG TTAGGGTTGA
47425            162  2.88e-06 TTTGCGCGCC CTCACAGTCCATG ACAATCAGCA
5904             389  3.50e-06 AGAGGCGGCA CTAACCGTAAATG TAAACGGACA
40832            200  4.90e-06 GAGAAATACG TTAACAGTAAAGG GGATGTTCGA
48863            400  8.28e-06 AAGGACGTTG CTGACTGTGAACG AGCTCGCAGT
49627            115  1.19e-05 TCGCTTTCTT CTCACAGTCAGTT TATGTCAATG
37443            258  1.44e-05 TTGCGCAATG TTTACAGTTGATG TAAAATGTCC
49464            351  1.44e-05 GTAGAGTGCT CTCACAGTCAGCT GCTTGACTGA
47713            373  1.53e-05 GTCTTCGGCA TTGACTGTAGAAG ACCGGTGCGG
38192            318  3.21e-05 CGTTATCAAT ATCACTGTTGAAG ATGAGAATAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
44232                   1.8e-07  370_[+1]_117
10566                   3.2e-07  372_[+1]_115
47325                   6.5e-07  172_[+1]_315
49434                   8.6e-07  222_[+1]_265
47425                   2.9e-06  161_[+1]_326
5904                    3.5e-06  388_[+1]_99
40832                   4.9e-06  199_[+1]_288
48863                   8.3e-06  399_[+1]_88
49627                   1.2e-05  114_[+1]_373
37443                   1.4e-05  257_[+1]_230
49464                   1.4e-05  350_[+1]_137
47713                   1.5e-05  372_[+1]_115
38192                   3.2e-05  317_[+1]_170
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=13 seqs=13
44232          (  371) TTCACAGTCAATG  1
10566          (  373) CTAACAGTAAACG  1
47325          (  173) CTAACTGTAAACG  1
49434          (  223) TTCACTGTTAATG  1
47425          (  162) CTCACAGTCCATG  1
5904           (  389) CTAACCGTAAATG  1
40832          (  200) TTAACAGTAAAGG  1
48863          (  400) CTGACTGTGAACG  1
49627          (  115) CTCACAGTCAGTT  1
37443          (  258) TTTACAGTTGATG  1
49464          (  351) CTCACAGTCAGCT  1
47713          (  373) TTGACTGTAGAAG  1
38192          (  318) ATCACTGTTGAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 9760 bayes= 10.0814 E= 3.6e+000
  -181   118 -1035    51
 -1035 -1035 -1035   189
    19    96   -54  -180
   189 -1035 -1035 -1035
 -1035   207 -1035 -1035
    99  -162 -1035    51
 -1035 -1035   216 -1035
 -1035 -1035 -1035   189
    51    37  -154   -22
   136  -162     5 -1035
   165 -1035   -54 -1035
   -81    37  -154    78
 -1035 -1035   192   -81
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 13 E= 3.6e+000
 0.076923  0.538462  0.000000  0.384615
 0.000000  0.000000  0.000000  1.000000
 0.307692  0.461538  0.153846  0.076923
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.538462  0.076923  0.000000  0.384615
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.384615  0.307692  0.076923  0.230769
 0.692308  0.076923  0.230769  0.000000
 0.846154  0.000000  0.153846  0.000000
 0.153846  0.307692  0.076923  0.461538
 0.000000  0.000000  0.846154  0.153846
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CT]T[CA]AC[AT]GT[ACT][AG]A[TC]G
--------------------------------------------------------------------------------

Time  3.55 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 16 sites = 11 llr = 135 E-value = 1.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  5a::a94352a98582
pos.-specific     C  3:19:131:1:11:2:
probability       G  ::91::2456::13:2
matrix            T  2:::::23:1:::3:6

         bits    2.2
                 1.9  *  *     *
                 1.7  ****     *
                 1.5  *****    **
Relative         1.3  *****    **  *
Entropy          1.1  *****  * *** *
(17.6 bits)      0.9  *****  * *** *
                 0.6  *****  ***** **
                 0.4 ******  ********
                 0.2 ****** *********
                 0.0 ----------------

Multilevel           AAGCAAAGGGAAAAAT
consensus            C     CAA    G
sequence                    T     T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start  P-value                   Site
-------------  ----- ---------            ----------------
40832            374  9.69e-08 GGAAGAGCTA CAGCAAGGGGAAAAAG TCGTTCGCTA
37443            375  1.23e-07 CTTTTGGAGA AAGCAAAGAAAAATAT ACATCGTTTA
37578            456  3.19e-07 GATGAATACA AAGCAAAAAGACAAAT AGAGCTTCAA
44178             42  4.27e-07 CGACAGTGCT CAGCAATGGAAAATAT GATACCTAAC
47713            425  4.78e-07 CGGCGCAGTC CAGCAATGAGAAAGCT CGAAAGAGCC
49434            122  9.21e-07 TGACATGTGT AAGCAACAGGAAATCA GAGGACAGAA
49627            447  1.10e-06 GCACTCCAAG TAGCAAGTGGAAGAAT ACTTCGGCGA
47325             29  1.29e-06 GGGACCTGTG AACCAAATGGAAAGAG TGAAGGCAGT
48863             45  2.72e-06 GAGCTACCGG AAGCAACAGCAACGAT GATAAAGAAA
10566            261  3.41e-06 CAATAAGTAC AAGCACCTATAAAAAT GTGTATAAAA
5904             184  6.71e-06 GCGTGGTATA TAGGAAACAGAAAAAA ATTCCAACAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
40832                   9.7e-08  373_[+2]_111
37443                   1.2e-07  374_[+2]_110
37578                   3.2e-07  455_[+2]_29
44178                   4.3e-07  41_[+2]_443
47713                   4.8e-07  424_[+2]_60
49434                   9.2e-07  121_[+2]_363
49627                   1.1e-06  446_[+2]_38
47325                   1.3e-06  28_[+2]_456
48863                   2.7e-06  44_[+2]_440
10566                   3.4e-06  260_[+2]_224
5904                    6.7e-06  183_[+2]_301
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=11
40832          (  374) CAGCAAGGGGAAAAAG  1
37443          (  375) AAGCAAAGAAAAATAT  1
37578          (  456) AAGCAAAAAGACAAAT  1
44178          (   42) CAGCAATGGAAAATAT  1
47713          (  425) CAGCAATGAGAAAGCT  1
49434          (  122) AAGCAACAGGAAATCA  1
49627          (  447) TAGCAAGTGGAAGAAT  1
47325          (   29) AACCAAATGGAAAGAG  1
48863          (   45) AAGCAACAGCAACGAT  1
10566          (  261) AAGCACCTATAAAAAT  1
5904           (  184) TAGGAAACAGAAAAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 10.1382 E= 1.3e+002
   101    20 -1010   -56
   189 -1010 -1010 -1010
 -1010  -138   202 -1010
 -1010   194  -130 -1010
   189 -1010 -1010 -1010
   175  -138 -1010 -1010
    43    20   -30   -56
     1  -138    70     2
    75 -1010   129 -1010
   -57  -138   151  -156
   189 -1010 -1010 -1010
   175  -138 -1010 -1010
   160  -138  -130 -1010
    75 -1010    29     2
   160   -38 -1010 -1010
   -57 -1010   -30   124
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 1.3e+002
 0.545455  0.272727  0.000000  0.181818
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.090909  0.909091  0.000000
 0.000000  0.909091  0.090909  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.909091  0.090909  0.000000  0.000000
 0.363636  0.272727  0.181818  0.181818
 0.272727  0.090909  0.363636  0.272727
 0.454545  0.000000  0.545455  0.000000
 0.181818  0.090909  0.636364  0.090909
 1.000000  0.000000  0.000000  0.000000
 0.909091  0.090909  0.000000  0.000000
 0.818182  0.090909  0.090909  0.000000
 0.454545  0.000000  0.272727  0.272727
 0.818182  0.181818  0.000000  0.000000
 0.181818  0.000000  0.181818  0.636364
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AC]AGCAA[AC][GAT][GA]GAAA[AGT]AT
--------------------------------------------------------------------------------

Time  7.00 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 14 sites = 10 llr = 120 E-value = 3.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  694:9821:2:9:1
pos.-specific     C  4:5a::7:::8::3
probability       G  :1::1:14a6219:
matrix            T  ::1::2:5:2::16

         bits    2.2    *    *
                 1.9    *    *
                 1.7    *    *   *
                 1.5  * **   *  **
Relative         1.3  * **   * ***
Entropy          1.1 ** ***  * ***
(17.3 bits)      0.9 ** **** * ***
                 0.6 **************
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           AACCAACTGGCAGT
consensus            C A  TAG AG  C
sequence                      T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start  P-value                  Site
-------------  ----- ---------            --------------
48008            439  6.75e-09 TGCACTGGAA AACCAACGGGCAGT CTCGTTGGGA
47713            182  2.64e-07 CGCTATGCGA AGCCAACTGGCAGT GCTATAGGGA
49434            454  9.64e-07 TTCCACGAAT AACCAAATGTCAGC GACTGGCAAG
37578            318  1.22e-06 TCTTGGCATG CACCAAAAGGCAGT ATTGCCCAAC
47425             93  1.32e-06 AAATAACCTG CATCAACTGGGAGT GAAACAACGA
37443             47  2.09e-06 TTACAGTTAT CAACAACTGTGAGC AAAGAACCGT
49464            448  2.28e-06 CTCCTGATCT AACCGACGGGCAGA CTGACTGTGA
6974             125  2.61e-06 CGCATATGTG AAACATGTGGCAGC GAATTTCCCA
10566            128  3.40e-06 TGCATAACGT CAACAACGGACATT GTGGACTGAT
44232            235  5.48e-06 ATAAAGAAGA AAACATCGGACGGT TATCAGATGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
48008                   6.7e-09  438_[+3]_48
47713                   2.6e-07  181_[+3]_305
49434                   9.6e-07  453_[+3]_33
37578                   1.2e-06  317_[+3]_169
47425                   1.3e-06  92_[+3]_394
37443                   2.1e-06  46_[+3]_440
49464                   2.3e-06  447_[+3]_39
6974                    2.6e-06  124_[+3]_362
10566                   3.4e-06  127_[+3]_359
44232                   5.5e-06  234_[+3]_252
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=14 seqs=10
48008          (  439) AACCAACGGGCAGT  1
47713          (  182) AGCCAACTGGCAGT  1
49434          (  454) AACCAAATGTCAGC  1
37578          (  318) CACCAAAAGGCAGT  1
47425          (   93) CATCAACTGGGAGT  1
37443          (   47) CAACAACTGTGAGC  1
49464          (  448) AACCGACGGGCAGA  1
6974           (  125) AAACATGTGGCAGC  1
10566          (  128) CAACAACGGACATT  1
44232          (  235) AAACATCGGACGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 9740 bayes= 10.1781 E= 3.1e+002
   115    75  -997  -997
   174  -997  -116  -997
    57   107  -997  -143
  -997   207  -997  -997
   174  -997  -116  -997
   157  -997  -997   -43
   -43   156  -116  -997
  -143  -997    84    89
  -997  -997   216  -997
   -43  -997   142   -43
  -997   175   -16  -997
   174  -997  -116  -997
  -997  -997   201  -143
  -143    34  -997   116
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 10 E= 3.1e+002
 0.600000  0.400000  0.000000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.400000  0.500000  0.000000  0.100000
 0.000000  1.000000  0.000000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.200000  0.700000  0.100000  0.000000
 0.100000  0.000000  0.400000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.600000  0.200000
 0.000000  0.800000  0.200000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.000000  0.000000  0.900000  0.100000
 0.100000  0.300000  0.000000  0.600000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AC]A[CA]CA[AT][CA][TG]G[GAT][CG]AG[TC]
--------------------------------------------------------------------------------

Time 10.15 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME  COMBINED P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
47425                  7.62e-05  92_[+3(1.32e-06)]_55_[+1(2.88e-06)]_\
    326
38192                  1.87e-01  317_[+1(3.21e-05)]_170
48008                  2.47e-05  438_[+3(6.75e-09)]_48
5904                   9.58e-05  183_[+2(6.71e-06)]_189_\
    [+1(3.50e-06)]_15_[+1(4.76e-05)]_71
48863                  1.43e-04  26_[+1(6.96e-05)]_5_[+2(2.72e-06)]_\
    339_[+1(8.28e-06)]_88
39529                  1.54e-01  500
40111                  2.34e-01  500
49434                  2.58e-08  121_[+2(9.21e-07)]_85_\
    [+1(8.64e-07)]_218_[+3(9.64e-07)]_33
49464                  3.37e-04  350_[+1(1.44e-05)]_84_\
    [+3(2.28e-06)]_[+1(5.79e-05)]_26
49627                  2.66e-04  44_[+1(3.95e-05)]_57_[+1(1.19e-05)]_\
    319_[+2(1.10e-06)]_38
40988                  7.18e-01  500
6974                   1.33e-02  124_[+3(2.61e-06)]_362
44178                  5.07e-04  41_[+2(4.27e-07)]_443
10566                  1.09e-07  127_[+3(3.40e-06)]_119_\
    [+2(3.41e-06)]_96_[+1(3.21e-07)]_115
44232                  4.66e-06  234_[+3(5.48e-06)]_122_\
    [+1(1.80e-07)]_117
47325                  2.40e-05  28_[+2(1.29e-06)]_128_\
    [+1(6.53e-07)]_315
37443                  1.08e-07  46_[+3(2.09e-06)]_99_[+1(3.95e-05)]_\
    85_[+1(1.44e-05)]_104_[+2(1.23e-07)]_110
37578                  1.02e-05  317_[+3(1.22e-06)]_124_\
    [+2(3.19e-07)]_29
47713                  5.99e-08  181_[+3(2.64e-07)]_177_\
    [+1(1.53e-05)]_39_[+2(4.78e-07)]_60
40832                  9.46e-06  199_[+1(4.90e-06)]_161_\
    [+2(9.69e-08)]_111
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************