********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/34/34.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42815                    1.0000    500  8970                     1.0000    500
43063                    1.0000    500  43079                    1.0000    500
43176                    1.0000    500  47504                    1.0000    500
48395                    1.0000    500  48806                    1.0000    500
43549                    1.0000    500  49153                    1.0000    500
49435                    1.0000    500  49965                    1.0000    500
18599                    1.0000    500  50442                    1.0000    500
44664                    1.0000    500  44930                    1.0000    500
35763                    1.0000    500  46146                    1.0000    500
46184                    1.0000    500  32597                    1.0000    500
47023                    1.0000    500  45629                    1.0000    500
12662                    1.0000    500  44065                    1.0000    500
47907                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/34/34.seqs.fa -oc motifs/34 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       25    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           12500    N=              25
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.288 C 0.221 G 0.217 T 0.274
Background letter frequencies (from dataset with add-one prior applied):
A 0.288 C 0.221 G 0.217 T 0.274
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 12 llr = 148 E-value = 2.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :9721424:4::1::5
pos.-specific     C  9:::8::4::12128:
probability       G  113::38:861::32:
matrix            T  :::823:22:8885:5

         bits    2.2
                 2.0
                 1.8 *
                 1.5 *     * *     *
Relative         1.3 **    * *  *  *
Entropy          1.1 ***** * ***** *
(17.8 bits)      0.9 ***** * ***** **
                 0.7 ***** * ********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CAATCAGAGGTTTTCA
consensus              G  G C A   G T
sequence                  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
42815                       144  5.27e-08 TTTTGATTCA CAGTCTGAGATTTGCA ATTCGATGAA
44065                       360  8.21e-08 ACGCCTTCCG CAATCAGCGGCTTGCT TTACATTGTT
48395                       421  1.27e-07 ATCTCATTGT CAATCGGAGGGTTGCT GACTCTGAAA
44930                       142  2.29e-07 AGATGATAGT CAAACTGCGATTTGCT TTGTGTTCGG
48806                       462  5.21e-07 TAGAGTATAG CAATCAGAGGTCCTCA AAAATTTAGA
12662                       157  8.77e-07 GAAAGTCGTT CAATTTGCTGTTTTCT ATAGATCTGT
49153                       394  9.67e-07 CAACCATTCA CAGTCAACGATCTTCA ACGGCTTCTG
43079                       120  1.65e-06 ACAGCTCGTT CGATTGGAGATTTTCA GCCGGTGTTT
46146                       160  1.92e-06 TGCTCTTTTC CAATCGGTGATTACCA ACCAGAAGTG
45629                       403  2.75e-06 TGGGTCCGGA CAAACAAAGGTTTTGA CAATGGTGCG
47023                        60  3.15e-06 CAAGGGAAAG GAGTCGGCTGTTTCCT TCTTCATCTC
43063                        29  3.60e-06 GAGCTACTAG CAGTAAGTGGTTTTGT GTGTCGTTAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42815                             5.3e-08  143_[+1]_341
44065                             8.2e-08  359_[+1]_125
48395                             1.3e-07  420_[+1]_64
44930                             2.3e-07  141_[+1]_343
48806                             5.2e-07  461_[+1]_23
12662                             8.8e-07  156_[+1]_328
49153                             9.7e-07  393_[+1]_91
43079                             1.6e-06  119_[+1]_365
46146                             1.9e-06  159_[+1]_325
45629                             2.7e-06  402_[+1]_82
47023                             3.1e-06  59_[+1]_425
43063                             3.6e-06  28_[+1]_456
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=12
42815                    (  144) CAGTCTGAGATTTGCA  1
44065                    (  360) CAATCAGCGGCTTGCT  1
48395                    (  421) CAATCGGAGGGTTGCT  1
44930                    (  142) CAAACTGCGATTTGCT  1
48806                    (  462) CAATCAGAGGTCCTCA  1
12662                    (  157) CAATTTGCTGTTTTCT  1
49153                    (  394) CAGTCAACGATCTTCA  1
43079                    (  120) CGATTGGAGATTTTCA  1
46146                    (  160) CAATCGGTGATTACCA  1
45629                    (  403) CAAACAAAGGTTTTGA  1
47023                    (   60) GAGTCGGCTGTTTCCT  1
43063                    (   29) CAGTAAGTGGTTTTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 12125 bayes= 9.63789 E= 2.1e+002
 -1023    205   -138  -1023
   167  -1023   -138  -1023
   121  -1023     62  -1023
   -79  -1023  -1023    160
  -178    176  -1023    -72
    53  -1023     62    -13
   -79  -1023    194  -1023
    53     92  -1023    -72
 -1023  -1023    194    -72
    53  -1023    142  -1023
 -1023   -140   -138    160
 -1023    -40  -1023    160
  -178   -140  -1023    160
 -1023    -40     62     86
 -1023    192    -38  -1023
    80  -1023  -1023     86
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 2.1e+002
 0.000000  0.916667  0.083333  0.000000
 0.916667  0.000000  0.083333  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.083333  0.750000  0.000000  0.166667
 0.416667  0.000000  0.333333  0.250000
 0.166667  0.000000  0.833333  0.000000
 0.416667  0.416667  0.000000  0.166667
 0.000000  0.000000  0.833333  0.166667
 0.416667  0.000000  0.583333  0.000000
 0.000000  0.083333  0.083333  0.833333
 0.000000  0.166667  0.000000  0.833333
 0.083333  0.083333  0.000000  0.833333
 0.000000  0.166667  0.333333  0.500000
 0.000000  0.833333  0.166667  0.000000
 0.500000  0.000000  0.000000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CA[AG]TC[AGT]G[AC]G[GA]TTT[TG]C[AT]
--------------------------------------------------------------------------------

Time  5.02 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 16 llr = 162 E-value = 1.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::1:::::11
pos.-specific     C  :331:1:98::3
probability       G  3:7::16:218:
matrix            T  78:998411917

         bits    2.2
                 2.0
                 1.8
                 1.5     *  *
Relative         1.3   ***  * **
Entropy          1.1 ***** *****
(14.6 bits)      0.9 ************
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTGTTTGCCTGT
consensus            GCC   T    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
43079                       140  1.04e-07 TTTTCAGCCG GTGTTTGCCTGT GATTGGGACC
45629                       244  1.77e-07 AGGAAATTCT TTGTTTTCCTGT TAGTACAAAT
43176                        83  1.12e-06 GCGTGATTTA GTGTTTGCGTGT CGATAATGCT
49435                       113  2.46e-06 AAGGCGTTTG TTCTTCGCCTGT GTACAAAGTT
46146                       123  2.80e-06 CCTAGAACAA TCGTTGGCCTGT TAGTGCCCTA
44065                       396  5.94e-06 TCCTTGTACC TTCCTTTCCTGT GCAGGGCGCG
42815                       437  5.94e-06 AGCGCGACCC TCGTTGTCCTGT GAGGATCGGA
47907                       272  8.88e-06 AAAATCAACT GCCTTTGCGTGT ACATTGTCGT
43063                         1  8.88e-06          . TTGTTTTCCTTC ACGCTCGAGC
44930                        42  9.96e-06 CGAAAGAGGG TTGTTTTTCTGC TTGAAAGCTC
48395                       465  1.55e-05 AATACCTTTT TTGTTTTCTTGC CGACACTTTC
12662                       418  1.89e-05 AACGAAATAA GTGCTTGCGTGC GAGTTTTCAT
35763                       201  2.07e-05 TGCCAAACTC TTGTTTGCCGAT TTCAAGTAGA
48806                       391  2.25e-05 GTCAAGCTCC TTGTTTGCCTTA AAGTTTTCTG
49153                        48  3.34e-05 CGGCCCCCCG TTCTACGCCTGT CGCGTCACTG
32597                       370  7.15e-05 AACGAGAATA GCCTTTGTCGGT GAAATCTCAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43079                               1e-07  139_[+2]_349
45629                             1.8e-07  243_[+2]_245
43176                             1.1e-06  82_[+2]_406
49435                             2.5e-06  112_[+2]_376
46146                             2.8e-06  122_[+2]_366
44065                             5.9e-06  395_[+2]_93
42815                             5.9e-06  436_[+2]_52
47907                             8.9e-06  271_[+2]_217
43063                             8.9e-06  [+2]_488
44930                               1e-05  41_[+2]_447
48395                             1.6e-05  464_[+2]_24
12662                             1.9e-05  417_[+2]_71
35763                             2.1e-05  200_[+2]_288
48806                             2.3e-05  390_[+2]_98
49153                             3.3e-05  47_[+2]_441
32597                             7.1e-05  369_[+2]_119
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=16
43079                    (  140) GTGTTTGCCTGT  1
45629                    (  244) TTGTTTTCCTGT  1
43176                    (   83) GTGTTTGCGTGT  1
49435                    (  113) TTCTTCGCCTGT  1
46146                    (  123) TCGTTGGCCTGT  1
44065                    (  396) TTCCTTTCCTGT  1
42815                    (  437) TCGTTGTCCTGT  1
47907                    (  272) GCCTTTGCGTGT  1
43063                    (    1) TTGTTTTCCTTC  1
44930                    (   42) TTGTTTTTCTGC  1
48395                    (  465) TTGTTTTCTTGC  1
12662                    (  418) GTGCTTGCGTGC  1
35763                    (  201) TTGTTTGCCGAT  1
48806                    (  391) TTGTTTGCCTTA  1
49153                    (   48) TTCTACGCCTGT  1
32597                    (  370) GCCTTTGTCGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 12225 bayes= 10.3134 E= 1.7e+001
 -1064  -1064     52    132
 -1064     18  -1064    145
 -1064     50    166  -1064
 -1064    -82  -1064    167
  -220  -1064  -1064    177
 -1064    -82    -80    145
 -1064  -1064    152     45
 -1064    199  -1064   -113
 -1064    176    -21   -213
 -1064  -1064    -80    167
  -220  -1064    190   -113
  -220     18  -1064    132
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 1.7e+001
 0.000000  0.000000  0.312500  0.687500
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.312500  0.687500  0.000000
 0.000000  0.125000  0.000000  0.875000
 0.062500  0.000000  0.000000  0.937500
 0.000000  0.125000  0.125000  0.750000
 0.000000  0.000000  0.625000  0.375000
 0.000000  0.875000  0.000000  0.125000
 0.000000  0.750000  0.187500  0.062500
 0.000000  0.000000  0.125000  0.875000
 0.062500  0.000000  0.812500  0.125000
 0.062500  0.250000  0.000000  0.687500
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TG][TC][GC]TTT[GT]CCTG[TC]
--------------------------------------------------------------------------------

Time  9.99 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 7 llr = 88 E-value = 1.4e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :6:6:1:::a:a
pos.-specific     C  :::331::::a:
probability       G  a3716:a:a:::
matrix            T  :13:17:a::::

         bits    2.2 *     * * *
                 2.0 *     * * *
                 1.8 *     ******
                 1.5 *     ******
Relative         1.3 * *   ******
Entropy          1.1 * *   ******
(18.1 bits)      0.9 * *   ******
                 0.7 * **********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GAGAGTGTGACA
consensus             GTCC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
48395                       381  1.38e-07 TATGAAATCA GAGCGTGTGACA CTAACCCCTT
44065                       279  3.37e-07 ATGTATGAAT GATAGTGTGACA CAACGCCCAC
32597                       254  1.03e-06 TACCAGAAGG GAGCGCGTGACA ATTACAGAAG
18599                       108  1.28e-06 GTCAGTGACT GTGACTGTGACA ACAGAAATGC
44664                        59  1.37e-06 GAAAAAAGAA GGGAGAGTGACA TTTCGGCCTT
48806                       142  1.58e-06 TTAAATTATA GGGGCTGTGACA TACTACCTAT
43176                        43  2.24e-06 ACCATGGTGG GATATTGTGACA TTTTCGGGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48395                             1.4e-07  380_[+3]_108
44065                             3.4e-07  278_[+3]_210
32597                               1e-06  253_[+3]_235
18599                             1.3e-06  107_[+3]_381
44664                             1.4e-06  58_[+3]_430
48806                             1.6e-06  141_[+3]_347
43176                             2.2e-06  42_[+3]_446
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=7
48395                    (  381) GAGCGTGTGACA  1
44065                    (  279) GATAGTGTGACA  1
32597                    (  254) GAGCGCGTGACA  1
18599                    (  108) GTGACTGTGACA  1
44664                    (   59) GGGAGAGTGACA  1
48806                    (  142) GGGGCTGTGACA  1
43176                    (   43) GATATTGTGACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 12225 bayes= 10.6132 E= 1.4e+003
  -945   -945    220   -945
    99   -945     39    -94
  -945   -945    172      6
    99     37    -60   -945
  -945     37    139    -94
  -101    -63   -945    138
  -945   -945    220   -945
  -945   -945   -945    186
  -945   -945    220   -945
   180   -945   -945   -945
  -945    218   -945   -945
   180   -945   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 1.4e+003
 0.000000  0.000000  1.000000  0.000000
 0.571429  0.000000  0.285714  0.142857
 0.000000  0.000000  0.714286  0.285714
 0.571429  0.285714  0.142857  0.000000
 0.000000  0.285714  0.571429  0.142857
 0.142857  0.142857  0.000000  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[AG][GT][AC][GC]TGTGACA
--------------------------------------------------------------------------------

Time 15.10 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42815                            1.05e-05  143_[+1(5.27e-08)]_29_\
    [+2(3.34e-05)]_236_[+2(5.94e-06)]_52
8970                             6.37e-01  500
43063                            4.30e-04  [+2(8.88e-06)]_16_[+1(3.60e-06)]_\
    122_[+2(8.76e-05)]_322
43079                            2.62e-06  119_[+1(1.65e-06)]_4_[+2(1.04e-07)]_\
    349
43176                            3.69e-05  42_[+3(2.24e-06)]_28_[+2(1.12e-06)]_\
    406
47504                            6.63e-01  500
48395                            1.00e-08  380_[+3(1.38e-07)]_28_\
    [+1(1.27e-07)]_28_[+2(1.55e-05)]_24
48806                            4.72e-07  141_[+3(1.58e-06)]_237_\
    [+2(2.25e-05)]_59_[+1(5.21e-07)]_23
43549                            8.18e-01  500
49153                            2.19e-04  47_[+2(3.34e-05)]_334_\
    [+1(9.67e-07)]_91
49435                            8.23e-03  112_[+2(2.46e-06)]_376
49965                            6.38e-01  500
18599                            1.70e-03  107_[+3(1.28e-06)]_381
50442                            3.92e-01  500
44664                            1.09e-02  58_[+3(1.37e-06)]_430
44930                            3.44e-05  41_[+2(9.96e-06)]_88_[+1(2.29e-07)]_\
    343
35763                            3.98e-02  200_[+2(2.07e-05)]_288
46146                            5.79e-05  122_[+2(2.80e-06)]_25_\
    [+1(1.92e-06)]_325
46184                            8.23e-01  500
32597                            8.31e-04  253_[+3(1.03e-06)]_104_\
    [+2(7.15e-05)]_119
47023                            1.54e-02  59_[+1(3.15e-06)]_425
45629                            7.32e-06  243_[+2(1.77e-07)]_147_\
    [+1(2.75e-06)]_82
12662                            1.33e-04  156_[+1(8.77e-07)]_41_\
    [+1(2.08e-05)]_188_[+2(1.89e-05)]_71
44065                            6.29e-09  278_[+3(3.37e-07)]_69_\
    [+1(8.21e-08)]_20_[+2(5.94e-06)]_93
47907                            2.55e-03  271_[+2(8.88e-06)]_59_\
    [+1(3.22e-05)]_142
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************