********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/322/322.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43232                    1.0000    500  49618                    1.0000    500
50353                    1.0000    500  34003                    1.0000    500
45656                    1.0000    500  54477                    1.0000    500
48195                    1.0000    500  48527                    1.0000    500
43123                    1.0000    500  43237                    1.0000    500
49286                    1.0000    500  47994                    1.0000    500
48027                    1.0000    500  48192                    1.0000    500
45693                    1.0000    500  47528                    1.0000    500
45591                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/322/322.seqs.fa -oc motifs/322 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 17 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8500 N= 17 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.271 C 0.220 G 0.216 T 0.292 Background letter frequencies (from dataset with add-one prior applied): A 0.271 C 0.220 G 0.216 T 0.292 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 12 sites = 17 llr = 159 E-value = 1.1e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 7:1:7:422:97 pos.-specific C 14:8382::a1: probability G :::2:215:::2 matrix T 269:::448::1 bits 2.2 * 2.0 * 1.8 * 1.5 * * ** Relative 1.3 ** * ** Entropy 1.1 **** *** (13.5 bits) 0.9 ****** *** 0.7 ****** **** 0.4 ****** ***** 0.2 ************ 0.0 ------------ Multilevel ATTCACTGTCAA consensus TC C AT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 54477 362 1.74e-07 AATTCGTCTC ATTCACAGTCAA GATTTGTTCT 47994 413 2.42e-07 CGGATCTTGG ACTCACTGTCAA CTGTTAGAGA 49618 322 2.42e-07 CGTGCCAACG ACTCACTGTCAA 
AGAGCATTCG 43123 98 1.67e-06 TTTATTGCAG ATTCACAGTCAG GAAATGGCTT 48027 207 7.11e-06 TTACTTAAGG ATTCCGTGTCAA TCCCAGTGTC 45693 43 7.98e-06 GATAGATCTA TTTCACTGTCAG TATCGAGGAA 50353 156 7.98e-06 TGAATTATGC ACTCACAGTCAT GACTCTGACG 45656 441 1.51e-05 CTTCTTCCTT TCTCACAATCAA AAATTCAGAT 34003 346 1.51e-05 TACGTTTGGT ATTGCCTTTCAA CATACGAGGT 49286 152 3.09e-05 TGGTAACGTA ACTCACCGACAG TTGGGGCTTT 48527 6 3.09e-05 GTAAT TCTCCCAATCAA CAACGATGAG 43232 348 3.66e-05 ACTGATGTGT ATACCCCTTCAA CCTGTTCTAT 47528 439 6.16e-05 CCGCCTTCTC ATTCACCATCCA GTGTTCTGAC 48195 486 6.58e-05 AGTCCAATTA CTTGACTTTCAA ATC 48192 465 1.15e-04 TGGATGCGAG ATTCAGGTACAA AAACGACTTT 45591 121 1.42e-04 ACTCTTTGTT ATACAGATACAA ATCAACTTGC 43237 163 1.65e-04 ATAAGTAAAA TTTGCCTTTCAT AGAAGTCGGC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 54477 1.7e-07 361_[+1]_127 47994 2.4e-07 412_[+1]_76 49618 2.4e-07 321_[+1]_167 43123 1.7e-06 97_[+1]_391 48027 7.1e-06 206_[+1]_282 45693 8e-06 42_[+1]_446 50353 8e-06 155_[+1]_333 45656 1.5e-05 440_[+1]_48 34003 1.5e-05 345_[+1]_143 49286 3.1e-05 151_[+1]_337 48527 3.1e-05 5_[+1]_483 43232 3.7e-05 347_[+1]_141 47528 6.2e-05 438_[+1]_50 48195 6.6e-05 485_[+1]_3 48192 0.00012 464_[+1]_24 45591 0.00014 120_[+1]_368 43237 0.00016 162_[+1]_326 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=12 seqs=17 54477 ( 362) ATTCACAGTCAA 1 47994 ( 413) ACTCACTGTCAA 1 49618 ( 322) ACTCACTGTCAA 1 43123 ( 98) ATTCACAGTCAG 1 48027 ( 207) ATTCCGTGTCAA 1 45693 ( 43) 
TTTCACTGTCAG 1 50353 ( 156) ACTCACAGTCAT 1 45656 ( 441) TCTCACAATCAA 1 34003 ( 346) ATTGCCTTTCAA 1 49286 ( 152) ACTCACCGACAG 1 48527 ( 6) TCTCCCAATCAA 1 43232 ( 348) ATACCCCTTCAA 1 47528 ( 439) ATTCACCATCCA 1 48195 ( 486) CTTGACTTTCAA 1 48192 ( 465) ATTCAGGTACAA 1 45591 ( 121) ATACAGATACAA 1 43237 ( 163) TTTGCCTTTCAT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.00042 E= 1.1e-001 138 -190 -1073 -31 -1073 68 -1073 115 -120 -1073 -1073 159 -1073 190 -29 -1073 138 42 -1073 -1073 -1073 190 -29 -1073 38 -32 -188 49 -62 -1073 112 27 -62 -1073 -1073 149 -1073 218 -1073 -1073 180 -190 -1073 -1073 138 -1073 -29 -131 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 17 E= 1.1e-001 0.705882 0.058824 0.000000 0.235294 0.000000 0.352941 0.000000 0.647059 0.117647 0.000000 0.000000 0.882353 0.000000 0.823529 0.176471 0.000000 0.705882 0.294118 0.000000 0.000000 0.000000 0.823529 0.176471 0.000000 0.352941 0.176471 0.058824 0.411765 0.176471 0.000000 0.470588 0.352941 0.176471 0.000000 0.000000 0.823529 0.000000 1.000000 0.000000 0.000000 0.941176 0.058824 0.000000 0.000000 0.705882 0.000000 0.176471 0.117647 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [AT][TC]TC[AC]C[TA][GT]TCAA 
-------------------------------------------------------------------------------- Time 2.63 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 15 llr = 145 E-value = 2.0e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :875:653a977 pos.-specific C :::3143::1:2 probability G 92319:17::3: matrix T 1::1::1::::1 bits 2.2 2.0 * 1.8 * * * 1.5 * * * Relative 1.3 ** * *** Entropy 1.1 *** ** **** (13.9 bits) 0.9 *** ** ***** 0.7 *** ** ***** 0.4 *** ** ***** 0.2 ************ 0.0 ------------ Multilevel GAAAGAAGAAAA consensus GGC CCA GC sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 34003 107 1.44e-07 CTGGTCTTGG GAAAGCAGAAAA GGACTGCATC 49286 268 2.73e-07 ATACTGATCC GAAAGACGAAAA AGTGACCAAA 54477 238 2.73e-07 GGCTCGATTG GAACGAAGAAAA AGCAAGAACG 50353 71 3.88e-06 AGATTGATTT GAAAGCCAAAAA TGTTGATGTT 45693 444 4.41e-06 ATTAGTATCA GAGTGAAGAAAA AAGCCTTCTA 48027 274 4.81e-06 AAATAGTTAG GAAAGAAGAAGC CAGCTTGACG 48192 81 1.26e-05 CAAACGTGGT GAAAGCTAAAAA ATTACAATGC 43232 235 1.81e-05 AGAGTGAAAC GAACGCTAAAAA TCTCTCCTAT 48527 272 3.07e-05 CCAATAAATA GAAAGAAAACGA ACAATCAAGT 48195 157 3.57e-05 AATCGAATGA GGGTGAAGAAGA TCTATTCAGT 45591 488 3.82e-05 AAGTTCTTTC GAACGCGGAAAT A 43123 149 5.34e-05 TCTCTAGAGA GGGCGACGAAGC CCGAACTGAA 45656 465 5.34e-05 ATTCAGATTC TAGGGAAGAAAA ATTCGTTCTG 47994 469 6.73e-05 
TCCACAACCT GAAGCCGGAAAA ATCAGAAAGC 47528 139 7.45e-05 TGCCGAGCAT GGGAGACGACAC GACACACATT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 34003 1.4e-07 106_[+2]_382 49286 2.7e-07 267_[+2]_221 54477 2.7e-07 237_[+2]_251 50353 3.9e-06 70_[+2]_418 45693 4.4e-06 443_[+2]_45 48027 4.8e-06 273_[+2]_215 48192 1.3e-05 80_[+2]_408 43232 1.8e-05 234_[+2]_254 48527 3.1e-05 271_[+2]_217 48195 3.6e-05 156_[+2]_332 45591 3.8e-05 487_[+2]_1 43123 5.3e-05 148_[+2]_340 45656 5.3e-05 464_[+2]_24 47994 6.7e-05 468_[+2]_20 47528 7.5e-05 138_[+2]_350 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=15 34003 ( 107) GAAAGCAGAAAA 1 49286 ( 268) GAAAGACGAAAA 1 54477 ( 238) GAACGAAGAAAA 1 50353 ( 71) GAAAGCCAAAAA 1 45693 ( 444) GAGTGAAGAAAA 1 48027 ( 274) GAAAGAAGAAGC 1 48192 ( 81) GAAAGCTAAAAA 1 43232 ( 235) GAACGCTAAAAA 1 48527 ( 272) GAAAGAAAACGA 1 48195 ( 157) GGGTGAAGAAGA 1 45591 ( 488) GAACGCGGAAAT 1 43123 ( 149) GGGCGACGAAGC 1 45656 ( 465) TAGGGAAGAAAA 1 47994 ( 469) GAAGCCGGAAAA 1 47528 ( 139) GGGAGACGACAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.78686 E= 2.0e+000 -1055 -1055 211 -213 156 -1055 -11 -1055 130 -1055 62 -1055 78 27 -70 -113 -1055 -172 
211 -1055 115 86 -1055 -1055 78 27 -70 -113 -2 -1055 176 -1055 188 -1055 -1055 -1055 168 -73 -1055 -1055 144 -1055 30 -1055 144 -14 -1055 -213 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 15 E= 2.0e+000 0.000000 0.000000 0.933333 0.066667 0.800000 0.000000 0.200000 0.000000 0.666667 0.000000 0.333333 0.000000 0.466667 0.266667 0.133333 0.133333 0.000000 0.066667 0.933333 0.000000 0.600000 0.400000 0.000000 0.000000 0.466667 0.266667 0.133333 0.133333 0.266667 0.000000 0.733333 0.000000 1.000000 0.000000 0.000000 0.000000 0.866667 0.133333 0.000000 0.000000 0.733333 0.000000 0.266667 0.000000 0.733333 0.200000 0.000000 0.066667 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- G[AG][AG][AC]G[AC][AC][GA]AA[AG][AC] -------------------------------------------------------------------------------- Time 5.02 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 15 sites = 13 llr = 144 E-value = 1.9e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 8:5882:a3225:85 pos.-specific C 2::::21:37:::22 probability G :a51229:32715:3 matrix T 1::1:4::1:145:: bits 2.2 * 2.0 * * 1.8 * ** 1.5 * ** Relative 1.3 * * ** * Entropy 1.1 **** ** * (16.0 bits) 0.9 ***** ** ** ** 0.7 ***** ** ****** 0.4 ***** ** ****** 0.2 ***** ********* 0.0 --------------- Multilevel AGAAATGAACGATAA consensus G C C ATG G sequence G G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 34003 445 1.17e-07 CTTCATCAGG AGAAAAGAGCGATAA CTTATCAACT 48195 214 3.31e-07 TACCTTATTC AGGAATGAGCGGTAA TTCATCCAGG 49286 20 6.26e-07 ACCGAAAACG AGAAAGGAACAAGAG GTAGGGCGAA 43232 218 7.96e-07 ACGATTTACA AGGAACGAGAGTGAA ACGAACGCTA 49618 352 1.28e-06 CGAAATTACC AGAAGCGACCGAGAC GGGTTCGTGT 54477 276 2.40e-06 CAAAGAAATA CGAAATGACGGTGAA TCCAAATCGA 50353 12 3.22e-06 CTGTATGAGT CGGAATGATCGTTAA AGGGCTAGTG 47994 484 6.66e-06 CGGAAAAATC AGAAAGCAACAAGAA TC 43123 171 7.25e-06 CCGAACTGAA AGGAAGGAGGTATAA AAAATGCAAT 45656 240 7.25e-06 GCCGAGTCCC AGGAATGACAGATCC ATTGACGAAA 45591 220 9.95e-06 CACGGACACG AGATGTGAACGTGAG GGTTCCTAAT 45693 424 1.07e-05 CAGCTTCTCC TGAAACGACCATTAG TATCAGAGTG 48027 178 1.30e-05 AGAGTCGAGG AGGGAAGAACGATCG TTCTTTACTT 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34003                             1.2e-07  444_[+3]_41
48195                             3.3e-07  213_[+3]_272
49286                             6.3e-07  19_[+3]_466
43232                               8e-07  217_[+3]_268
49618                             1.3e-06  351_[+3]_134
54477                             2.4e-06  275_[+3]_210
50353                             3.2e-06  11_[+3]_474
47994                             6.7e-06  483_[+3]_2
43123                             7.2e-06  170_[+3]_315
45656                             7.2e-06  239_[+3]_246
45591                             9.9e-06  219_[+3]_266
45693                             1.1e-05  423_[+3]_62
48027                             1.3e-05  177_[+3]_308
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=13
34003                    (  445) AGAAAAGAGCGATAA  1
48195                    (  214) AGGAATGAGCGGTAA  1
49286                    (   20) AGAAAGGAACAAGAG  1
43232                    (  218) AGGAACGAGAGTGAA  1
49618                    (  352) AGAAGCGACCGAGAC  1
54477                    (  276) CGAAATGACGGTGAA  1
50353                    (   12) CGGAATGATCGTTAA  1
47994                    (  484) AGAAAGCAACAAGAA  1
43123                    (  171) AGGAAGGAGGTATAA  1
45656                    (  240) AGGAATGACAGATCC  1
45591                    (  220) AGATGTGAACGTGAG  1
45693                    (  424) TGAAACGACCATTAG  1
48027                    (  178) AGGGAAGAACGATCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8262 bayes= 9.84078 E= 1.9e+001
   151    -52  -1035   -192
 -1035  -1035    221  -1035
    99  -1035    109  -1035
   164  -1035   -149   -192
   164  -1035    -49  -1035
   -81      7      9     40
 -1035   -152    209  -1035
   188  -1035  -1035  -1035
    18     48     51   -192
   -81    165    -49  -1035
   -23  -1035    168   -192
    99  -1035   -149     40
 -1035  -1035    109     88
   164    -52  -1035  -1035
    99    -52     51  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 13 E= 1.9e+001
 0.769231  0.153846  0.000000  0.076923
 0.000000  0.000000  1.000000  0.000000
 0.538462  0.000000  0.461538  0.000000
 0.846154  0.000000  0.076923  0.076923
 0.846154  0.000000  0.153846  0.000000
 0.153846  0.230769  0.230769  0.384615
 0.000000  0.076923  0.923077  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.307692  0.307692  0.307692  0.076923
 0.153846  0.692308  0.153846  0.000000
 0.230769  0.000000  0.692308  0.076923
 0.538462  0.000000  0.076923  0.384615
 0.000000  0.000000  0.461538  0.538462
 0.846154  0.153846  0.000000  0.000000
 0.538462  0.153846  0.307692  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AG[AG]AA[TCG]GA[ACG]C[GA][AT][TG]A[AG]
--------------------------------------------------------------------------------

Time  7.70 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43232                            9.42e-06  217_[+3(7.96e-07)]_2_[+2(1.81e-05)]_\
    101_[+1(3.66e-05)]_141
49618                            2.92e-06  321_[+1(2.42e-07)]_18_\
    [+3(1.28e-06)]_134
50353                            2.15e-06  11_[+3(3.22e-06)]_44_[+2(3.88e-06)]_\
    73_[+1(7.98e-06)]_333
34003                            9.34e-09  106_[+2(1.44e-07)]_227_\
    [+1(1.51e-05)]_87_[+3(1.17e-07)]_41
45656                            7.73e-05  239_[+3(7.25e-06)]_186_\
    [+1(1.51e-05)]_12_[+2(5.34e-05)]_24
54477                            4.50e-09  237_[+2(2.73e-07)]_26_\
    [+3(2.40e-06)]_38_[+3(4.30e-05)]_18_[+1(1.74e-07)]_127
48195                            1.32e-05  156_[+2(3.57e-05)]_45_\
    [+3(3.31e-07)]_257_[+1(6.58e-05)]_3
48527                            2.96e-03  5_[+1(3.09e-05)]_254_[+2(3.07e-05)]_\
    217
43123                            1.13e-05  97_[+1(1.67e-06)]_39_[+2(5.34e-05)]_\
    10_[+3(7.25e-06)]_315
43237                            4.79e-01  500
49286                            1.51e-07  19_[+3(6.26e-07)]_29_[+3(3.04e-05)]_\
    73_[+1(3.09e-05)]_104_[+2(2.73e-07)]_221
47994                            2.29e-06  412_[+1(2.42e-07)]_44_\
    [+2(6.73e-05)]_3_[+3(6.66e-06)]_2
48027                            8.17e-06  177_[+3(1.30e-05)]_14_\
    [+1(7.11e-06)]_55_[+2(4.81e-06)]_215
48192                            1.09e-02  80_[+2(1.26e-05)]_408
45693                            7.03e-06  42_[+1(7.98e-06)]_369_\
    [+3(1.07e-05)]_5_[+2(4.41e-06)]_45
47528                            2.11e-02  138_[+2(7.45e-05)]_288_\
    [+1(6.16e-05)]_50
45591                            5.13e-04  219_[+3(9.95e-06)]_253_\
    [+2(3.82e-05)]_1
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************