********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/212/212.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42963                    1.0000    500  46896                    1.0000    500
15625                    1.0000    500  40433                    1.0000    500
33584                    1.0000    500  46671                    1.0000    500
48974                    1.0000    500  47215                    1.0000    500
49797                    1.0000    500  45142                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/212/212.seqs.fa -oc motifs/212 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.276 C 0.237 G 0.217 T 0.270
Background letter frequencies (from dataset with add-one prior applied):
A 0.276 C 0.237 G 0.217 T 0.270
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 19 sites = 7 llr = 111 E-value = 4.1e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :131::3:1:::a1::::9
pos.-specific     C  9:1:3311:::6:93::4:
probability       G  19:1:7119914:::a331
matrix            T  ::677:47:19:::7:73:

         bits    2.2                *
                 2.0                *
                 1.8             *  *
                 1.5 **      **  ** *
Relative         1.3 **   *  *** ** *  *
Entropy          1.1 **  **  ********* *
(22.8 bits)      0.9 ** *** ********** *
                 0.7 ** *** ********** *
                 0.4 ****** ************
                 0.2 ****** ************
                 0.0 -------------------

Multilevel           CGTTTGTTGGTCACTGTCA
consensus              A CCA    G  C GG
sequence                              T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
42963                       291  3.45e-11 TAAGGTCGAC CGATTGTTGGTCACTGTCA AATCAAGTAC
48974                       327  2.71e-09 ACAAGGTTTC CGATTCATGGTCACCGTCA GTGCAACGCA
47215                         1  2.28e-08          . CGTACGGTGGTGACTGGTA TATCTGGACT
46671                       405  3.47e-08 ACTCACTCTC CATTTGTTGGTCAACGTGA CGGATACACA
46896                       159  4.75e-08 CGCCAGACGA CGTTTCCTGTTGACTGGTA ATACGACGGG
33584                       199  6.41e-08 TTTTTGTCAG CGTGTGACAGTGACTGTGA ATGAAACGTG
40433                       147  3.99e-07 CGTATATGTG GGCTCGTGGGGCACTGTCG ATTGAAGATA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42963                             3.5e-11  290_[+1]_191
48974                             2.7e-09  326_[+1]_155
47215                             2.3e-08  [+1]_481
46671                             3.5e-08  404_[+1]_77
46896                             4.8e-08  158_[+1]_323
33584                             6.4e-08  198_[+1]_283
40433                               4e-07  146_[+1]_335
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=7
42963                    (  291) CGATTGTTGGTCACTGTCA  1
48974                    (  327) CGATTCATGGTCACCGTCA  1
47215                    (    1) CGTACGGTGGTGACTGGTA  1
46671                    (  405) CATTTGTTGGTCAACGTGA  1
46896                    (  159) CGTTTCCTGTTGACTGGTA  1
33584                    (  199) CGTGTGACAGTGACTGTGA  1
40433                    (  147) GGCTCGTGGGGCACTGTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 4820 bayes= 9.26901 E= 4.1e-001
  -945    185    -60   -945
   -95   -945    198   -945
     5    -73   -945    108
   -95   -945    -60    140
  -945     27   -945    140
  -945     27    172   -945
     5    -73    -60     66
  -945    -73    -60    140
   -95   -945    198   -945
  -945   -945    198    -92
  -945   -945    -60    166
  -945    127     98   -945
   186   -945   -945   -945
   -95    185   -945   -945
  -945     27   -945    140
  -945   -945    220   -945
  -945   -945     40    140
  -945     85     40      8
   163   -945    -60   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 7 E= 4.1e-001
 0.000000  0.857143  0.142857  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.285714  0.142857  0.000000  0.571429
 0.142857  0.000000  0.142857  0.714286
 0.000000  0.285714  0.000000  0.714286
 0.000000  0.285714  0.714286  0.000000
 0.285714  0.142857  0.142857  0.428571
 0.000000  0.142857  0.142857  0.714286
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.571429  0.428571  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.000000  0.285714  0.000000  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.428571  0.285714  0.285714
 0.857143  0.000000  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CG[TA]T[TC][GC][TA]TGGT[CG]AC[TC]G[TG][CGT]A
--------------------------------------------------------------------------------


Time 1.00 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 14 sites = 10 llr = 107 E-value = 3.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a22::6::265a9
pos.-specific     C  4:2:33::771:::
probability       G  6:1:7::1::32:1
matrix            T  ::58:74931:3::

         bits    2.2
                 2.0
                 1.8  *          *
                 1.5  *     *    *
Relative         1.3  *  *  *    **
Entropy          1.1 ** *** **   **
(15.5 bits)      0.9 ** *******  **
                 0.7 ** ******** **
                 0.4 ** ***********
                 0.2 **************
                 0.0 --------------

Multilevel           GATTGTATCCAAAA
consensus            C AACCT TAGT
sequence               C        G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
15625                       188  6.21e-09 GTTTGTCGAG GATTGTATCCAAAA GGGGACAGTC
46896                       390  1.00e-07 CGGTACTCAC GAATGTATCCAAAA CATCACACGA
33584                        34  1.72e-06 CGCAATTTTG GAGTGCATCCATAA CGTCGAGGAC
45142                        13  4.74e-06 ACCCATCCTT GACTCTTTCCGGAA CGAATCTCGT
47215                        26  5.20e-06 GGTATATCTG GACTCCATCCGTAA GTCTCTACCA
48974                       100  6.76e-06 TAATTTTTTA CAATCTTTTCAAAA ACCTTTCGGA
46671                        76  7.90e-06 AAAGAGCGAG CATTGCTTTAAAAA AGTATTCCAA
42963                        62  1.24e-05 TAGCAAAAAT CATAGTATTCCAAA ACATTTCGAT
49797                       375  3.00e-05 ATGTGCCAGG GATTGTTGCTGGAA TGATATCGTC
40433                       348  3.15e-05 TATGTTCCTT CATAGTATCAATAG TGATGTCTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
15625                             6.2e-09  187_[+2]_299
46896                               1e-07  389_[+2]_97
33584                             1.7e-06  33_[+2]_453
45142                             4.7e-06  12_[+2]_474
47215                             5.2e-06  25_[+2]_461
48974                             6.8e-06  99_[+2]_387
46671                             7.9e-06  75_[+2]_411
42963                             1.2e-05  61_[+2]_425
49797                               3e-05  374_[+2]_112
40433                             3.2e-05  347_[+2]_139
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=10
15625                    (  188) GATTGTATCCAAAA  1
46896                    (  390) GAATGTATCCAAAA  1
33584                    (   34) GAGTGCATCCATAA  1
45142                    (   13) GACTCTTTCCGGAA  1
47215                    (   26) GACTCCATCCGTAA  1
48974                    (  100) CAATCTTTTCAAAA  1
46671                    (   76) CATTGCTTTAAAAA  1
42963                    (   62) CATAGTATTCCAAA  1
49797                    (  375) GATTGTTGCTGGAA  1
40433                    (  348) CATAGTATCAATAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 4870 bayes= 8.92481 E= 3.2e+002
  -997     76    147   -997
   186   -997   -997   -997
   -46    -24   -112     89
   -46   -997   -997    156
  -997     34    169   -997
  -997     34   -997    137
   112   -997   -997     56
  -997   -997   -112    173
  -997    156   -997     15
   -46    156   -997   -143
   112   -124     47   -997
    86   -997    -12     15
   186   -997   -997   -997
   171   -997   -112   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 10 E= 3.2e+002
 0.000000  0.400000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.200000  0.100000  0.500000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.300000  0.700000  0.000000
 0.000000  0.300000  0.000000  0.700000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.700000  0.000000  0.300000
 0.200000  0.700000  0.000000  0.100000
 0.600000  0.100000  0.300000  0.000000
 0.500000  0.000000  0.200000  0.300000
 1.000000  0.000000  0.000000  0.000000
 0.900000  0.000000  0.100000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC]A[TAC][TA][GC][TC][AT]T[CT][CA][AG][ATG]AA
--------------------------------------------------------------------------------


Time 1.85 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 4 llr = 64 E-value = 6.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::5:53::::::::
pos.-specific     C  a:a3:5:53:::::8
probability       G  :a:3a:838::3::3
matrix            T  :::::::3:aa8aa:

         bits    2.2  *  *
                 2.0 *** *    ** **
                 1.8 *** *    ** **
                 1.5 *** *    ** **
Relative         1.3 *** * * *** ***
Entropy          1.1 *** * * *******
(23.3 bits)      0.9 *** *** *******
                 0.7 *** ***********
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CGCAGAGCGTTTTTC
consensus               C CAGC  G  G
sequence                G   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
46671                       141  8.42e-09 TTTGCTTCTC CGCAGAGTGTTTTTC GAGACGAGAG
15625                       152  2.35e-08 GAGCCGCAAT CGCGGCGCCTTTTTC TGAAACCAAA
33584                       375  3.46e-08 ATCTCGTCCA CGCCGCACGTTTTTC CAGTCCAAGG
40433                       100  6.28e-08 AAGACCAGTA CGCAGAGGGTTGTTG ACATCCCTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46671                             8.4e-09  140_[+3]_345
15625                             2.3e-08  151_[+3]_334
33584                             3.5e-08  374_[+3]_111
40433                             6.3e-08  99_[+3]_386
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=4
46671                    (  141) CGCAGAGTGTTTTTC  1
15625                    (  152) CGCGGCGCCTTTTTC  1
33584                    (  375) CGCCGCACGTTTTTC  1
40433                    (  100) CGCAGAGGGTTGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4860 bayes= 10.2456 E= 6.6e+002
  -865    208   -865   -865
  -865   -865    220   -865
  -865    208   -865   -865
    86      8     20   -865
  -865   -865    220   -865
    86    108   -865   -865
   -14   -865    179   -865
  -865    108     20    -11
  -865      8    179   -865
  -865   -865   -865    188
  -865   -865   -865    188
  -865   -865     20    147
  -865   -865   -865    188
  -865   -865   -865    188
  -865    166     20   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 4 E= 6.6e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CGC[ACG]G[AC][GA][CGT][GC]TT[TG]TT[CG]
--------------------------------------------------------------------------------


Time 2.69 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42963                            2.19e-08  61_[+2(1.24e-05)]_143_\
    [+1(4.18e-07)]_53_[+1(3.45e-11)]_191
46896                            2.28e-07  158_[+1(4.75e-08)]_212_\
    [+2(1.00e-07)]_97
15625                            5.00e-09  151_[+3(2.35e-08)]_21_\
    [+2(6.21e-09)]_299
40433                            2.61e-08  99_[+3(6.28e-08)]_32_[+1(3.99e-07)]_\
    182_[+2(3.15e-05)]_139
33584                            1.89e-10  33_[+2(1.72e-06)]_151_\
    [+1(6.41e-08)]_157_[+3(3.46e-08)]_111
46671                            1.18e-10  75_[+2(7.90e-06)]_51_[+3(8.42e-09)]_\
    249_[+1(3.47e-08)]_77
48974                            4.52e-07  99_[+2(6.76e-06)]_213_\
    [+1(2.71e-09)]_155
47215                            3.56e-06  [+1(2.28e-08)]_6_[+2(5.20e-06)]_461
49797                            2.55e-03  374_[+2(3.00e-05)]_112
45142                            2.73e-02  12_[+2(4.74e-06)]_474
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************