********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/312/312.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10045                    1.0000    500  11868                    1.0000    500
1753                     1.0000    500  21457                    1.0000    500
22796                    1.0000    500  25037                    1.0000    500
25353                    1.0000    500  25358                    1.0000    500
25815                    1.0000    500  260862                   1.0000    500
261832                   1.0000    500  268300                   1.0000    500
268410                   1.0000    500  3980                     1.0000    500
5989                     1.0000    500  7334                     1.0000    500
7335                     1.0000    500  7336                     1.0000    500
7337                     1.0000    500  7999                     1.0000    500
9099                     1.0000    500  9247                     1.0000    500
934                      1.0000    500  9381                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/312/312.seqs.fa -oc motifs/312 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 24 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 12000 N= 24 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.265 C 0.236 G 0.230 T 0.268 Background letter frequencies (from dataset with add-one prior applied): A 0.265 C 0.236 G 0.230 T 0.268 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 21 sites = 9 llr = 190 E-value = 1.6e-019 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :9::9:1811:4a::a:1:97 pos.-specific C 11:41::2:89::9::8:31: probability G 9:a6:a9:11:6:1a::77:: matrix T ::::::::8:1:::::22::3 bits 2.1 * * * 1.9 * * * ** 1.7 * * ** * ** 1.5 *** *** * **** * Relative 1.3 *** **** * ***** ** Entropy 1.1 ******** ******** *** (30.5 bits) 0.8 ********************* 0.6 ********************* 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel GAGGAGGATCCGACGACGGAA consensus C C A TTC T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- 
                                                     ---------------------
7337                        297  7.38e-13 GAGGCTTTTT GAGCAGGATCCAACGACGGAA AAACATTTCC
7336                        297  7.38e-13 GAGGCTTTTT GAGCAGGATCCAACGACGGAA AAACATTTCC
7335                        297  7.38e-13 GAGGCTTTTT GAGCAGGATCCAACGACGGAA AAACATTTCC
7334                        297  7.38e-13 GAGGCTTTTT GAGCAGGATCCAACGACGGAA AAACATTTCC
9247                        457  1.65e-11 ACGATCCCAC GAGGAGGATCCGACGACTCAT CATCATCGTC
11868                       457  1.65e-11 ACGATCCCAC GAGGAGGATCCGACGACTCAT CATCATCGTC
25358                         6  1.48e-09      TTAAA GAGGAGGAAGTGACGATGGAA ATGGGAGACC
10045                        97  1.16e-08 GATGTGACGA GAGGAGACGACGACGATACAA TGAACAGGAG
260862                       69  1.55e-08 GCCGATGCCT CCGGCGGCTCCGAGGACGGCT ACTTTGGCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7337                              7.4e-13  296_[+1]_183
7336                              7.4e-13  296_[+1]_183
7335                              7.4e-13  296_[+1]_183
7334                              7.4e-13  296_[+1]_183
9247                              1.6e-11  456_[+1]_23
11868                             1.6e-11  456_[+1]_23
25358                             1.5e-09  5_[+1]_474
10045                             1.2e-08  96_[+1]_383
260862                            1.6e-08  68_[+1]_411
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
7337                     (  297) GAGCAGGATCCAACGACGGAA  1
7336                     (  297) GAGCAGGATCCAACGACGGAA  1
7335                     (  297) GAGCAGGATCCAACGACGGAA  1
7334                     (  297) GAGCAGGATCCAACGACGGAA  1
9247                     (  457) GAGGAGGATCCGACGACTCAT  1
11868                    (  457) GAGGAGGATCCGACGACTCAT  1
25358                    (    6) GAGGAGGAAGTGACGATGGAA  1
10045                    (   97) GAGGAGACGACGACGATACAA  1
260862                   (   69) CCGGCGGCTCCGAGGACGGCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 11520 bayes= 11.1693 E= 1.6e-019
  -982   -109    195   -982
   174   -109   -982   -982
  -982   -982    212   -982
  -982     91    127   -982
   174   -109   -982   -982
  -982   -982    212   -982
  -125   -982    195   -982
   155     -9   -982   -982
  -125   -982   -105    153
  -125    172   -105   -982
  -982    191   -982   -127
    74   -982    127   -982
   191   -982   -982   -982
  -982    191   -105   -982
  -982   -982    212   -982
   191   -982   -982   -982
  -982    172   -982    -27
  -125   -982    153    -27
  -982     50    153   -982
   174   -109   -982   -982
   133   -982   -982     31
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 1.6e-019
 0.000000  0.111111  0.888889  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.444444  0.555556  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.111111  0.000000  0.111111  0.777778
 0.111111  0.777778  0.111111  0.000000
 0.000000  0.888889  0.000000  0.111111
 0.444444  0.000000  0.555556  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.777778  0.000000  0.222222
 0.111111  0.000000  0.666667  0.222222
 0.000000  0.333333  0.666667  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
GAG[GC]AGG[AC]TCC[GA]ACGA[CT][GT][GC]A[AT]
-------------------------------------------------------------------------------- Time 4.93 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 18 sites = 13 llr = 203 E-value = 9.9e-016 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 57a:26:a81619:258: pos.-specific C 51:a22a:16291a132a probability G :1::51::112:::2::: matrix T :2:::1:::2::::62:: bits 2.1 * * * * 1.9 ** ** * * 1.7 ** ** * * * 1.5 ** ** *** * Relative 1.3 ** *** *** ** Entropy 1.1 * ** *** *** ** (22.6 bits) 0.8 * ** *** *** ** 0.6 ***** ******** ** 0.4 ****************** 0.2 ****************** 0.0 ------------------ Multilevel AAACGACAACACACTAAC consensus C AC TG C sequence C T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------ 7337 429 3.44e-11 TAATTCTCTG AAACGACAACACACTAAC ACTACAGTAT 7336 429 3.44e-11 TAATTCTCTG AAACGACAACACACTAAC ACTACAGTAT 7335 429 3.44e-11 TAATTCTCTG AAACGACAACACACTAAC ACTACAGTAT 7334 429 3.44e-11 TAATTCTCTG AAACGACAACACACTAAC ACTACAGTAT 268410 438 2.19e-08 ACCCGACCCC CAACGCCAACGCACCCAC AAAGCACAAC 5989 475 4.13e-08 CCTCACAAAC CAACGCCAACACCCAAAC GTCACAAC 9247 407 7.88e-08 TATCCAAAGC AAACAACAATCCACTCCC CAAATCACAC 11868 407 7.88e-08 TATCCAAAGC AAACAACAATCCACTCCC CAAATCACAC 3980 401 1.70e-07 ACATCAATTA CTACCTCAATACACTTAC TAAAGGCAAA 25353 454 2.48e-07 ACAGAACACA CGACCACAGCACACACAC AATAGTTCCT 9381 174 3.30e-07 
TATACCCCGA CAACAACACGACACGTAC CTCGTTCGTG 260862 325 3.30e-07 AGCGTCAAAG ATACCGCAAAGCACTAAC GAGGTCGCAA 25358 470 6.31e-07 AGCTCTGCGA CCACGCCAACGAACGTAC ACCCTCTTGC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 7337 3.4e-11 428_[+2]_54 7336 3.4e-11 428_[+2]_54 7335 3.4e-11 428_[+2]_54 7334 3.4e-11 428_[+2]_54 268410 2.2e-08 437_[+2]_45 5989 4.1e-08 474_[+2]_8 9247 7.9e-08 406_[+2]_76 11868 7.9e-08 406_[+2]_76 3980 1.7e-07 400_[+2]_82 25353 2.5e-07 453_[+2]_29 9381 3.3e-07 173_[+2]_309 260862 3.3e-07 324_[+2]_158 25358 6.3e-07 469_[+2]_13 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=18 seqs=13 7337 ( 429) AAACGACAACACACTAAC 1 7336 ( 429) AAACGACAACACACTAAC 1 7335 ( 429) AAACGACAACACACTAAC 1 7334 ( 429) AAACGACAACACACTAAC 1 268410 ( 438) CAACGCCAACGCACCCAC 1 5989 ( 475) CAACGCCAACACCCAAAC 1 9247 ( 407) AAACAACAATCCACTCCC 1 11868 ( 407) AAACAACAATCCACTCCC 1 3980 ( 401) CTACCTCAATACACTTAC 1 25353 ( 454) CGACCACAGCACACACAC 1 9381 ( 174) CAACAACACGACACGTAC 1 260862 ( 325) ATACCGCAAAGCACTAAC 1 25358 ( 470) CCACGCCAACGAACGTAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 18 n= 11592 bayes= 9.55407 E= 9.9e-016 102 97 -1035 -1035 138 -162 -158 -80 191 -1035 -1035 -1035 
 -1035    208  -1035  -1035
   -20     -3    122  -1035
   121     -3   -158   -180
 -1035    208  -1035  -1035
   191  -1035  -1035  -1035
   167   -162   -158  -1035
  -178    138   -158    -22
   121    -62      0  -1035
  -178    197  -1035  -1035
   180   -162  -1035  -1035
 -1035    208  -1035  -1035
   -79   -162    -58    120
    80     38  -1035    -22
   167    -62  -1035  -1035
 -1035    208  -1035  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 13 E= 9.9e-016
 0.538462  0.461538  0.000000  0.000000
 0.692308  0.076923  0.076923  0.153846
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.230769  0.230769  0.538462  0.000000
 0.615385  0.230769  0.076923  0.076923
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.846154  0.076923  0.076923  0.000000
 0.076923  0.615385  0.076923  0.230769
 0.615385  0.153846  0.230769  0.000000
 0.076923  0.923077  0.000000  0.000000
 0.923077  0.076923  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.153846  0.076923  0.153846  0.615385
 0.461538  0.307692  0.000000  0.230769
 0.846154  0.153846  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[AC]AAC[GAC][AC]CAA[CT][AG]CACT[ACT]AC
--------------------------------------------------------------------------------

Time  9.82 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 8 llr = 163 E-value = 3.3e-011 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A a:9::1:::3::1a1998::a pos.-specific C :::136:983:a8:911395: probability G :a:8:::13:4:1:::::15: matrix T ::1183a::56:::::::::: bits 2.1 * * 1.9 ** * * * * 1.7 ** * * * * 1.5 ** ** * **** * * Relative 1.3 *** *** * **** * * Entropy 1.1 ***** *** *********** (29.5 bits) 0.8 ***** *** *********** 0.6 ********* *********** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel AGAGTCTCCTTCCACAAACCA consensus CT GAG C G sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 7337 329 2.35e-13 AACATTTCCA AGAGTCTCCTTCCACAAACGA ACTGCCTGTC 7336 329 2.35e-13 AACATTTCCA AGAGTCTCCTTCCACAAACGA ACTGCCTGTC 7335 329 2.35e-13 AACATTTCCA AGAGTCTCCTTCCACAAACGA ACTGCCTGTC 7334 329 2.35e-13 AACATTTCCA AGAGTCTCCTTCCACAAACGA ACTGCCTGTC 10045 343 2.21e-10 TGACATAAAG AGAGCTTCCCGCCACAACCCA CCAAATAGTA 268410 227 9.93e-09 GTCACTTGGA AGACTCTCGCGCAACCCACCA CGACCGCTCA 9099 78 1.04e-08 CACGACTGTG AGAGCTTGGATCGAAAAACCA TTCATTCCGC 22796 477 1.04e-08 CCTACACCAA AGTTTATCCAGCCACAACGCA ACC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 
block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7337                              2.3e-13  328_[+3]_151
7336                              2.3e-13  328_[+3]_151
7335                              2.3e-13  328_[+3]_151
7334                              2.3e-13  328_[+3]_151
10045                             2.2e-10  342_[+3]_137
268410                            9.9e-09  226_[+3]_253
9099                                1e-08  77_[+3]_402
22796                               1e-08  476_[+3]_3
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=8
7337                     (  329) AGAGTCTCCTTCCACAAACGA  1
7336                     (  329) AGAGTCTCCTTCCACAAACGA  1
7335                     (  329) AGAGTCTCCTTCCACAAACGA  1
7334                     (  329) AGAGTCTCCTTCCACAAACGA  1
10045                    (  343) AGAGCTTCCCGCCACAACCCA  1
268410                   (  227) AGACTCTCGCGCAACCCACCA  1
9099                     (   78) AGAGCTTGGATCGAAAAACCA  1
22796                    (  477) AGTTTATCCAGCCACAACGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 11520 bayes= 10.4909 E= 3.3e-011
   191   -965   -965   -965
  -965   -965    212   -965
   172   -965   -965   -110
  -965    -92    170   -110
  -965      8   -965    148
  -108    140   -965    -10
  -965   -965   -965    190
  -965    189    -88   -965
  -965    167     12   -965
    -9      8   -965     90
  -965   -965     70    122
  -965    208   -965   -965
  -108    167    -88   -965
   191   -965   -965   -965
  -108    189   -965   -965
   172    -92   -965   -965
   172    -92   -965   -965
   150      8   -965   -965
  -965    189    -88   -965
  -965    108    112   -965
   191   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 3.3e-011
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.000000  0.125000  0.750000  0.125000
 0.000000  0.250000  0.000000  0.750000
 0.125000  0.625000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.250000  0.000000  0.500000
 0.000000  0.000000  0.375000  0.625000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.750000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.500000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
AGAG[TC][CT]TC[CG][TAC][TG]CCACAA[AC]C[CG]A
--------------------------------------------------------------------------------

Time 14.62 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10045                            2.12e-11  96_[+1(1.16e-08)]_225_\
                                           [+3(2.21e-10)]_137
11868                            9.12e-11  54_[+1(1.64e-05)]_331_\
                                           [+2(7.88e-08)]_32_[+1(1.65e-11)]_23
1753                             7.36e-02  500
21457                            3.64e-02  33_[+2(6.21e-05)]_449
22796                            1.27e-05  476_[+3(1.04e-08)]_3
25037                            7.02e-01  500
25353                            1.46e-04  130_[+1(5.58e-05)]_302_\
                                           [+2(2.48e-07)]_29
25358                            5.79e-08  5_[+1(1.48e-09)]_443_[+2(6.31e-07)]_\
                                           13
25815                            9.92e-02  500
260862                           9.44e-08  68_[+1(1.55e-08)]_235_\
                                           [+2(3.30e-07)]_158
261832                           3.17e-01  500
268300                           8.59e-01  500
268410                           5.34e-09  226_[+3(9.93e-09)]_190_\
                                           [+2(2.19e-08)]_45
3980                             1.28e-03  400_[+2(1.70e-07)]_18_\
                                           [+2(6.18e-06)]_46
5989                             6.70e-05  416_[+1(6.52e-05)]_37_\
                                           [+2(4.13e-08)]_8
7334                             1.34e-24  [+2(7.46e-05)]_278_[+1(7.38e-13)]_\
                                           11_[+3(2.35e-13)]_79_[+2(3.44e-11)]_54
7335                             1.34e-24  [+2(7.46e-05)]_278_[+1(7.38e-13)]_\
                                           11_[+3(2.35e-13)]_79_[+2(3.44e-11)]_54
7336                             1.34e-24  [+2(7.46e-05)]_278_[+1(7.38e-13)]_\
                                           11_[+3(2.35e-13)]_79_[+2(3.44e-11)]_54
7337                             1.34e-24  296_[+1(7.38e-13)]_11_\
                                           [+3(2.35e-13)]_79_[+2(3.44e-11)]_54
7999                             8.11e-01  500
9099                             7.07e-06  77_[+3(1.04e-08)]_44_[+1(4.97e-05)]_\
                                           156_[+1(6.25e-05)]_160
9247                             5.98e-11  54_[+1(2.91e-05)]_331_\
                                           [+2(7.88e-08)]_32_[+1(1.65e-11)]_23
934                              2.38e-01  190_[+1(9.89e-05)]_289
9381                             1.71e-03  173_[+2(3.30e-07)]_309
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************