********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/43/43.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31878                    1.0000    500  42850                    1.0000    500
48235                    1.0000    500  9777                     1.0000    500
48924                    1.0000    500  49171                    1.0000    500
49620                    1.0000    500  49669                    1.0000    500
49672                    1.0000    500  50049                    1.0000    500
16493                    1.0000    500  41004                    1.0000    500
43953                    1.0000    500  43978                    1.0000    500
10391                    1.0000    500  44305                    1.0000    500
45239                    1.0000    500  45312                    1.0000    500
45835                    1.0000    500  46033                    1.0000    500
36014                    1.0000    500  46275                    1.0000    500
43020                    1.0000    500  46647                    1.0000    500
46882                    1.0000    500  44098                    1.0000    500
45514                    1.0000    500  49683                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/43/43.seqs.fa -oc motifs/43 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       28    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           14000    N=              28
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.248 G 0.221 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.248 G 0.221 T 0.263
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 9 llr = 130 E-value = 2.0e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  32:::a:::::9a:1:
pos.-specific     C  :74:3:91::31::74
probability       G  61:27:::a:6::4:6
matrix            T  1:68::19:a1::62:

         bits    2.2         *
                 2.0      *  **  *
                 1.7      *  **  *
                 1.5      *****  *
Relative         1.3    ******* **
Entropy          1.1    ******* *** *
(20.8 bits)      0.9  ********* *** *
                 0.7 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GCTTGACTGTGAATCG
consensus            AACGC     C  GTC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
45239                       322  3.96e-09 AGACGGCGCA ACCTGACTGTGAAGCG ATCTCAGAAA
49672                       409  9.11e-09 GGATTGTCGT GCTGGACTGTGAATCC GGTGGATAGA
45312                       371  1.29e-08 GTGACATTCT GACTGACTGTGAATCC TGAATCGAAA
16493                        87  1.95e-08 CTACGTACAA ACTTGACTGTGAATTG AAAAGAAAAT
49669                       173  9.48e-08 TGATAGAAGT GATTCACTGTCAAGCC TGTACCTCCT
36014                       390  1.24e-07 GATTGCAGTT GCTGGATTGTGAAGCG CTGGTTATTG
10391                       343  6.97e-07 ACGATTGAGT TCCTCACTGTCAATAG GGTATATTTC
31878                       190  7.64e-07 TTGCCATTGC GGTTGACTGTTCAGCG TGATAAGCTT
45514                        12  9.34e-07 AAAGACGCAC ACCTCACCGTCAATTC ATCAAATTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45239                               4e-09  321_[+1]_163
49672                             9.1e-09  408_[+1]_76
45312                             1.3e-08  370_[+1]_114
16493                               2e-08  86_[+1]_398
49669                             9.5e-08  172_[+1]_312
36014                             1.2e-07  389_[+1]_95
10391                               7e-07  342_[+1]_142
31878                             7.6e-07  189_[+1]_295
45514                             9.3e-07  11_[+1]_473
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=9
45239                    (  322) ACCTGACTGTGAAGCG  1
49672                    (  409) GCTGGACTGTGAATCC  1
45312                    (  371) GACTGACTGTGAATCC  1
16493                    (   87) ACTTGACTGTGAATTG  1
49669                    (  173) GATTCACTGTCAAGCC  1
36014                    (  390) GCTGGATTGTGAAGCG  1
10391                    (  343) TCCTCACTGTCAATAG  1
31878                    (  190) GGTTGACTGTTCAGCG  1
45514                    (   12) ACCTCACCGTCAATTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 13580 bayes= 10.6927 E= 2.0e+000
     32   -982    133   -124
    -27    143    -99   -982
   -982     84   -982    108
   -982   -982      1    156
   -982     43    159   -982
    190   -982   -982   -982
   -982    184   -982   -124
   -982   -115   -982    175
   -982   -982    218   -982
   -982   -982   -982    192
   -982     43    133   -124
    173   -115   -982   -982
    190   -982   -982   -982
   -982   -982    101    108
   -127    143   -982    -25
   -982     84    133   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 2.0e+000
 0.333333  0.000000  0.555556  0.111111
 0.222222  0.666667  0.111111  0.000000
 0.000000  0.444444  0.000000  0.555556
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.333333  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.888889  0.000000  0.111111
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.555556  0.111111
 0.888889  0.111111  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.444444  0.555556
 0.111111  0.666667  0.000000  0.222222
 0.000000  0.444444  0.555556  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA][CA][TC][TG][GC]ACTGT[GC]AA[TG][CT][GC]
--------------------------------------------------------------------------------

Time 6.44 secs.

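The "(20.8 bits)" figure in the Motif 1 description can be cross-checked against the letter-probability matrix. The sketch below assumes the figure is the sum, over columns, of the relative entropy of each column against the background letter frequencies, with no pseudocounts applied; that reading is an assumption, not something this report states. The names BACKGROUND, MOTIF1_PSPM and column_relative_entropy are illustrative only.

```python
import math

# Background frequencies (A, C, G, T) and the Motif 1 letter-probability
# matrix, both copied from this report.
BACKGROUND = (0.268, 0.248, 0.221, 0.263)
MOTIF1_PSPM = [
    (0.333333, 0.000000, 0.555556, 0.111111),
    (0.222222, 0.666667, 0.111111, 0.000000),
    (0.000000, 0.444444, 0.000000, 0.555556),
    (0.000000, 0.000000, 0.222222, 0.777778),
    (0.000000, 0.333333, 0.666667, 0.000000),
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.000000, 0.888889, 0.000000, 0.111111),
    (0.000000, 0.111111, 0.000000, 0.888889),
    (0.000000, 0.000000, 1.000000, 0.000000),
    (0.000000, 0.000000, 0.000000, 1.000000),
    (0.000000, 0.333333, 0.555556, 0.111111),
    (0.888889, 0.111111, 0.000000, 0.000000),
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.000000, 0.000000, 0.444444, 0.555556),
    (0.111111, 0.666667, 0.000000, 0.222222),
    (0.000000, 0.444444, 0.555556, 0.000000),
]


def column_relative_entropy(row, background):
    """Relative entropy (bits) of one motif column against the background.

    Letters with probability zero contribute nothing to the sum.
    """
    return sum(p * math.log2(p / b) for p, b in zip(row, background) if p > 0)


per_column = [column_relative_entropy(row, BACKGROUND) for row in MOTIF1_PSPM]
total_bits = sum(per_column)

for position, bits in enumerate(per_column, start=1):
    print(f"position {position:2d}: {bits:.2f} bits")
print(f"total: {total_bits:.1f} bits")
```

Under that assumption the total should land close to the 20.8 bits reported above, and the tallest column should be position 9, the invariant G.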
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 21 llr = 235 E-value = 7.1e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  41365593142572125a729
pos.-specific     C  43:22::7116:36251:15:
probability       G  1672:5::8224:1633:121
matrix            T  ::::2:1::2:::1::1:11:

         bits    2.2
                 2.0                  *
                 1.7                  *
                 1.5                  *
Relative         1.3   *   * *        *  *
Entropy          1.1   *   ***   *    *  *
(16.1 bits)      0.9  **  ****   * *  *  *
                 0.7  *** **** *** *  ** *
                 0.4  *** **** ****** ** *
                 0.2 ********* ***********
                 0.0 ---------------------

Multilevel           CGGAAAACGACAACGCAAACA
consensus            ACACCG A T GC CGG  G
sequence                            A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
46275                       207  3.55e-10 CCATTTCTAT AGGGAAACGACACCGCAAACA AGATTCGTCG
49171                       316  1.67e-08 GAAGTTGACA CGGATGACGTCACAGCGAACA GTGTTCGCGT
49620                       294  3.06e-08 CTCTACCGTC CCGACAAAGACGACGCAAACG CGTCTCTGTC
43020                       285  5.12e-07 TTGAAGATCG TAGACAAAGACAACGCGAACA ACGAATCAGA
46033                        94  8.80e-07 GATTTTTCCA AGAAAAACGTGACGGGAAAGA GCTCACCTTG
45514                       333  9.77e-07 CTTATTGACA CGGAAGAACACGAAAGGAACA AGCAGATCTT
49672                       480  1.33e-06 GTGCACTGCA ACGGCGACGTCAACGGCAGCA
42850                       364  1.47e-06 TTCTTTGACA CGAACGAAGTCGATCCGAAAA TCCGAACCCA
44098                       309  1.79e-06 CTATTGCAGT CGAGAGAAGAAAACACAAATA GGCGCGTCTG
46647                       212  1.79e-06 AGATGAAAAT AGGAAAACGAGGATCGTAACG TTTCTGCACC
16493                       430  3.13e-06 AGGTACGTCC AGAAAAAAGCGGACGAAATGA AGGCTGTCTC
49683                        50  3.43e-06 TTCGTTCGTT GGGCCGACGCAGACGGAACGA ATTTCGCGAA
41004                       442  3.43e-06 AATCCAATTG CGGCAATCGCAAACGATAAAA CGAATGCCCC
10391                       444  5.30e-06 ACGGGAACGA GCGGAGTCGGCGAAGCAATCA GTCGCGAACG
50049                       418  6.26e-06 GAAGCGAAGA AAGAGATCGACACCGAAAAGA CCACAGTTTC
43953                        80  7.38e-06 CAAGATAGCG GGGCAAAAGGCAAAACAAAAG CGGTTCCCCC
46882                       160  9.36e-06 CGTGAACCGA CCGAACACGACCACCACAACA ACGGGATCTG
9777                        432  1.27e-05 CCCGCCGCCA CCACTGACAGCGATGGAAAGA GAGCGCCGCG
43978                       252  1.82e-05 AGGATTCGAA AGGCTGACGTCACGCATAATA TATATAAAGA
44305                       365  2.24e-05 CATGCTTCGT CGAATGACAAGTACGCGACCA CGAATGCTGT
31878                       352  2.74e-05 TTGACACCTG ACGAAAACCGAACCCCGAGAA ATCCCAACCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46275                             3.5e-10  206_[+2]_273
49171                             1.7e-08  315_[+2]_164
49620                             3.1e-08  293_[+2]_186
43020                             5.1e-07  284_[+2]_195
46033                             8.8e-07  93_[+2]_386
45514                             9.8e-07  332_[+2]_147
49672                             1.3e-06  479_[+2]
42850                             1.5e-06  363_[+2]_116
44098                             1.8e-06  308_[+2]_171
46647                             1.8e-06  211_[+2]_268
16493                             3.1e-06  429_[+2]_50
49683                             3.4e-06  49_[+2]_430
41004                             3.4e-06  441_[+2]_38
10391                             5.3e-06  443_[+2]_36
50049                             6.3e-06  417_[+2]_62
43953                             7.4e-06  79_[+2]_400
46882                             9.4e-06  159_[+2]_320
9777                              1.3e-05  431_[+2]_48
43978                             1.8e-05  251_[+2]_228
44305                             2.2e-05  364_[+2]_115
31878                             2.7e-05  351_[+2]_128
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=21
46275                    (  207) AGGGAAACGACACCGCAAACA  1
49171                    (  316) CGGATGACGTCACAGCGAACA  1
49620                    (  294) CCGACAAAGACGACGCAAACG  1
43020                    (  285) TAGACAAAGACAACGCGAACA  1
46033                    (   94) AGAAAAACGTGACGGGAAAGA  1
45514                    (  333) CGGAAGAACACGAAAGGAACA  1
49672                    (  480) ACGGCGACGTCAACGGCAGCA  1
42850                    (  364) CGAACGAAGTCGATCCGAAAA  1
44098                    (  309) CGAGAGAAGAAAACACAAATA  1
46647                    (  212) AGGAAAACGAGGATCGTAACG  1
16493                    (  430) AGAAAAAAGCGGACGAAATGA  1
49683                    (   50) GGGCCGACGCAGACGGAACGA  1
41004                    (  442) CGGCAATCGCAAACGATAAAA  1
10391                    (  444) GCGGAGTCGGCGAAGCAATCA  1
50049                    (  418) AAGAGATCGACACCGAAAAGA  1
43953                    (   80) GGGCAAAAGGCAAAACAAAAG  1
46882                    (  160) CCGAACACGACCACCACAACA  1
9777                     (  432) CCACTGACAGCGATGGAAAGA  1
43978                    (  252) AGGCTGACGTCACGCATAATA  1
44305                    (  365) CGAATGACAAGTACGCGACCA  1
31878                    (  352) ACGAAAACCGAACCCCGAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 13440 bayes= 10.3071 E= 7.1e+000
     51     79    -63   -246
   -149     21    149  -1104
      9  -1104    169  -1104
    109     -6    -21  -1104
     97     -6   -221    -47
     83   -238    111  -1104
    168  -1104  -1104    -88
     32    143  -1104  -1104
   -149   -138    187  -1104
     68    -79    -21    -15
    -49    132    -21  -1104
     97   -238     78   -246
    142     21  -1104  -1104
    -49    121   -121    -88
    -91     -6    149  -1104
    -17     94     37  -1104
     83   -138     37    -88
    190  -1104  -1104  -1104
    142   -138   -121   -147
    -49     94     11   -147
    168  -1104    -63  -1104
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 21 E= 7.1e+000
 0.380952  0.428571  0.142857  0.047619
 0.095238  0.285714  0.619048  0.000000
 0.285714  0.000000  0.714286  0.000000
 0.571429  0.238095  0.190476  0.000000
 0.523810  0.238095  0.047619  0.190476
 0.476190  0.047619  0.476190  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.333333  0.666667  0.000000  0.000000
 0.095238  0.095238  0.809524  0.000000
 0.428571  0.142857  0.190476  0.238095
 0.190476  0.619048  0.190476  0.000000
 0.523810  0.047619  0.380952  0.047619
 0.714286  0.285714  0.000000  0.000000
 0.190476  0.571429  0.095238  0.142857
 0.142857  0.238095  0.619048  0.000000
 0.238095  0.476190  0.285714  0.000000
 0.476190  0.095238  0.285714  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.714286  0.095238  0.095238  0.095238
 0.190476  0.476190  0.238095  0.095238
 0.857143  0.000000  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CA][GC][GA][AC][AC][AG]A[CA]G[AT]C[AG][AC]C[GC][CGA][AG]AA[CG]A
--------------------------------------------------------------------------------

Time 12.68 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 6 llr = 94 E-value = 2.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :7::7:8:2::85:32
pos.-specific     C  3:a2:a:28a::3:3:
probability       G  73:83:::::a22a28
matrix            T  ::::::28::::::2:

         bits    2.2           *  *
                 2.0   *  *   **  *
                 1.7   *  *   **  *
                 1.5   ** *   **  * *
Relative         1.3 * ** ******* * *
Entropy          1.1 ************ * *
(22.5 bits)      0.9 ************ * *
                 0.7 ************ * *
                 0.4 ************** *
                 0.2 ************** *
                 0.0 ----------------

Multilevel           GACGACATCCGAAGAG
consensus            CG  G       C C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
42850                       121  1.80e-10 GACGCGGCTC GACGACATCCGAAGCG GATCTGATAT
46647                       405  2.94e-08 GCAACGTAAC CGCGGCATCCGAAGTG AAATTTACAG
36014                        26  4.51e-08 CCCAATGTCT GACGGCACCCGACGGG CGTCGTCGTC
46033                       392  4.91e-08 AATGAAAATT CACGACATCCGACGAA TCACGACCTA
50049                       398  6.22e-08 AACTCGCTTC GACGACTTACGAAGCG AAGAAAGAGA
49683                       116  1.93e-07 TTTTCTCGGC GGCCACATCCGGGGAG GTATCCGCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42850                             1.8e-10  120_[+3]_364
46647                             2.9e-08  404_[+3]_80
36014                             4.5e-08  25_[+3]_459
46033                             4.9e-08  391_[+3]_93
50049                             6.2e-08  397_[+3]_87
49683                             1.9e-07  115_[+3]_369
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=6
42850                    (  121) GACGACATCCGAAGCG  1
46647                    (  405) CGCGGCATCCGAAGTG  1
36014                    (   26) GACGGCACCCGACGGG  1
46033                    (  392) CACGACATCCGACGAA  1
50049                    (  398) GACGACTTACGAAGCG  1
49683                    (  116) GGCCACATCCGGGGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 13580 bayes= 11.5912 E= 2.1e+002
   -923     43    159   -923
    131   -923     59   -923
   -923    201   -923   -923
   -923    -57    191   -923
    131   -923     59   -923
   -923    201   -923   -923
    164   -923   -923    -66
   -923    -57   -923    166
    -68    175   -923   -923
   -923    201   -923   -923
   -923   -923    218   -923
    164   -923    -41   -923
     90     43    -41   -923
   -923   -923    218   -923
     32     43    -41    -66
    -68   -923    191   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 2.1e+002
 0.000000  0.333333  0.666667  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.833333  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.500000  0.333333  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.333333  0.166667  0.166667
 0.166667  0.000000  0.833333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC][AG]CG[AG]CATCCGA[AC]G[AC]G
--------------------------------------------------------------------------------

Time 18.94 secs.

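The regular expression printed for each motif is a coarse summary of the probability matrix, so it need not match every site the motif was built from. The sketch below, which is illustrative and not part of MEME, tests the Motif 3 expression against the six Motif 3 sites listed above using Python's standard re module. MOTIF3_REGEX, MOTIF3_SITES and matching_sites are names introduced here for the example.

```python
import re

# Regular expression and sites copied from the Motif 3 section of this report.
MOTIF3_REGEX = "[GC][AG]CG[AG]CATCCGA[AC]G[AC]G"
MOTIF3_SITES = {
    "42850": "GACGACATCCGAAGCG",
    "46647": "CGCGGCATCCGAAGTG",
    "36014": "GACGGCACCCGACGGG",
    "46033": "CACGACATCCGACGAA",
    "50049": "GACGACTTACGAAGCG",
    "49683": "GGCCACATCCGGGGAG",
}

pattern = re.compile(MOTIF3_REGEX)

# A site "matches" only if the whole 16-letter string fits the expression.
matching_sites = [
    name for name, site in MOTIF3_SITES.items() if pattern.fullmatch(site)
]

print(f"{len(matching_sites)} of {len(MOTIF3_SITES)} sites match the regular expression")
print("matching sequences:", ", ".join(matching_sites))
```

For this motif only the top-scoring site (sequence 42850) satisfies the expression; each of the other five differs from it at one or more positions. Scanning with the scoring matrix instead of the regular expression avoids losing such sites.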
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31878                            2.55e-04  189_[+1(7.64e-07)]_146_\
    [+2(2.74e-05)]_128
42850                            1.73e-08  120_[+3(1.80e-10)]_227_\
    [+2(1.47e-06)]_116
48235                            8.71e-01  500
9777                             4.70e-02  431_[+2(1.27e-05)]_48
48924                            6.12e-01  500
49171                            4.85e-05  315_[+2(1.67e-08)]_164
49620                            6.48e-04  293_[+2(3.06e-08)]_186
49669                            6.81e-04  172_[+1(9.48e-08)]_312
49672                            5.46e-09  359_[+3(1.20e-05)]_33_\
    [+1(9.11e-09)]_55_[+2(1.33e-06)]
50049                            4.08e-06  397_[+3(6.22e-08)]_4_[+2(6.26e-06)]_\
    62
16493                            2.38e-06  86_[+1(1.95e-08)]_327_\
    [+2(3.13e-06)]_50
41004                            4.94e-03  441_[+2(3.43e-06)]_38
43953                            5.57e-02  79_[+2(7.38e-06)]_400
43978                            8.00e-02  251_[+2(1.82e-05)]_228
10391                            9.47e-05  342_[+1(6.97e-07)]_85_\
    [+2(5.30e-06)]_36
44305                            7.51e-02  364_[+2(2.24e-05)]_115
45239                            6.52e-05  321_[+1(3.96e-09)]_163
45312                            3.22e-04  370_[+1(1.29e-08)]_114
45835                            4.94e-01  500
46033                            2.34e-07  93_[+2(8.80e-07)]_277_\
    [+3(4.91e-08)]_93
36014                            2.12e-07  25_[+3(4.51e-08)]_348_\
    [+1(1.24e-07)]_95
46275                            2.09e-05  206_[+2(3.55e-10)]_273
43020                            6.41e-03  284_[+2(5.12e-07)]_195
46647                            9.78e-07  211_[+2(1.79e-06)]_172_\
    [+3(2.94e-08)]_80
46882                            1.97e-02  78_[+2(5.68e-05)]_60_[+2(9.36e-06)]_\
    320
44098                            6.72e-03  308_[+2(1.79e-06)]_171
45514                            2.16e-05  11_[+1(9.34e-07)]_305_\
    [+2(9.77e-07)]_147
49683                            8.17e-06  49_[+2(3.43e-06)]_45_[+3(1.93e-07)]_\
    369
--------------------------------------------------------------------------------

********************************************************************************

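Each combined block diagram encodes spacer lengths and motif occurrences, so site start positions can be recovered from it. The sketch below is an illustrative parser, not part of MEME: it assumes plain numbers are spacer lengths, that "[+N(p)]" marks an occurrence of motif N with position p-value p, and that a trailing backslash continues the diagram on the next line. MOTIF_WIDTHS and parse_block_diagram are hypothetical names; the widths come from the three MOTIF header lines in this report.

```python
import re

# Motif widths as reported in the MOTIF 1, 2 and 3 header lines.
MOTIF_WIDTHS = {1: 16, 2: 21, 3: 16}

SITE_TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_block_diagram(diagram, widths):
    """Return ([(motif, start, p_value), ...], total_length) for one diagram.

    Start positions are 1-based, matching the "Start" column of the
    per-motif site tables.
    """
    # Join continuation lines: drop each backslash and the whitespace after it.
    diagram = re.sub(r"\\\s*", "", diagram.strip())
    sites = []
    position = 0  # number of sequence letters consumed so far
    for token in diagram.split("_"):
        if not token:
            continue
        match = SITE_TOKEN.fullmatch(token)
        if match:
            motif = int(match.group(2))
            p_value = float(match.group(3)) if match.group(3) else None
            sites.append((motif, position + 1, p_value))
            position += widths[motif]
        else:
            position += int(token)  # a spacer of this many letters
    return sites, position


# Diagram for sequence 42850, copied from the summary table (with continuation).
example = "120_[+3(1.80e-10)]_227_\\\n    [+2(1.47e-06)]_116"
sites, length = parse_block_diagram(example, MOTIF_WIDTHS)
for motif, start, p_value in sites:
    print(f"motif {motif} starts at {start} (p = {p_value:g})")
print("sequence length:", length)
```

For sequence 42850 this recovers starts 121 (motif 3) and 364 (motif 2), which agree with the per-motif site tables, and a total length of 500, which agrees with the training set listing.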
********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************