********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/477/477.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
5465                     1.0000    500  39480                    1.0000    500
49576                    1.0000    500  40325                    1.0000    500
40510                    1.0000    500  40744                    1.0000    500
49918                    1.0000    500  40749                    1.0000    500
34199                    1.0000    500  35702                    1.0000    500
46171                    1.0000    500  31657                    1.0000    500
46717                    1.0000    500  46508                    1.0000    500
40539                    1.0000    500  37989                    1.0000    500
49625                    1.0000    500  40760                    1.0000    500
34549                    1.0000    500  35098                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/477/477.seqs.fa -oc motifs/477 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 20 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 10000 N= 20 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.275 C 0.228 G 0.219 T 0.278 Background letter frequencies (from dataset with add-one prior applied): A 0.274 C 0.228 G 0.219 T 0.278 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 12 sites = 15 llr = 150 E-value = 9.8e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 3::71::::757 pos.-specific C 1:83:::4:13: probability G 5:2::::1a1:1 matrix T 2a:19aa5::23 bits 2.2 * 2.0 * 1.8 * ** * 1.5 * *** * Relative 1.3 ** *** * Entropy 1.1 ** *** * (14.4 bits) 0.9 ****** ** 0.7 ********* * 0.4 *********** 0.2 ************ 0.0 ------------ Multilevel GTCATTTTGAAA consensus A GC C CT sequence T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 46171 211 1.02e-07 TTGGACAAAA GTCATTTTGAAA ATTCCCGAGT 40744 306 3.41e-07 ATTGAACATG GTCATTTCGACA GAATGACTAT 37989 349 6.24e-07 CATTTTGGCG 
GTCCTTTCGAAA GACAACAATC
40749                       334  3.45e-06 AATTTCCCGG GTCATTTTGCCA TTTACCGAGG
46508                       102  4.50e-06 GGACAGATCA GTCATTTGGAAA TGACCCCTTC
40760                       221  6.19e-06 TTTTCGGGTC ATGATTTTGAAA TGTATGCTAG
46717                        96  1.19e-05 TTTGTCGTCT ATCCTTTTGACT AGGAGCAAAA
31657                        90  1.76e-05 ATCTTATACC CTCATTTTGACT GTGAGTTGAT
35098                       278  2.08e-05 CGATTGCCAC ATCTTTTTGACA GAGTTCGTTG
34199                        89  2.28e-05 TAGAATCAAA TTGATTTCGATA CTCGCATCTG
49918                       250  2.46e-05 TTTAGGTAAA GTCATTTCGGTT GCTGGCACAG
35702                        71  2.87e-05 GACACATCTT TTCATTTCGGAT ATTCTATTTA
40539                       137  3.45e-05 GCCGGTCGTG ATCCTTTTGCTA CCGACAAGGG
34549                       356  3.64e-05 GATATTGGCT GTGAATTTGAAA GTGTGCAAAC
49625                        30  3.64e-05 ACACGTTTCT TTCCTTTCGAAG GAAGAGTCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46171                               1e-07  210_[+1]_278
40744                             3.4e-07  305_[+1]_183
37989                             6.2e-07  348_[+1]_140
40749                             3.4e-06  333_[+1]_155
46508                             4.5e-06  101_[+1]_387
40760                             6.2e-06  220_[+1]_268
46717                             1.2e-05  95_[+1]_393
31657                             1.8e-05  89_[+1]_399
35098                             2.1e-05  277_[+1]_211
34199                             2.3e-05  88_[+1]_400
49918                             2.5e-05  249_[+1]_239
35702                             2.9e-05  70_[+1]_418
40539                             3.5e-05  136_[+1]_352
34549                             3.6e-05  355_[+1]_133
49625                             3.6e-05  29_[+1]_459
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=15
46171                    (  211) GTCATTTTGAAA  1
40744                    (  306) GTCATTTCGACA  1
37989                    (  349) GTCCTTTCGAAA  1
40749                    (  334) GTCATTTTGCCA  1
46508                    (  102) GTCATTTGGAAA  1
40760                    (  221) ATGATTTTGAAA  1
46717                    (   96) ATCCTTTTGACT  1
31657                    (   90) CTCATTTTGACT  1
35098                    (  278) ATCTTTTTGACA  1
34199                    (   89) TTGATTTCGATA  1
49918
(  250) GTCATTTCGGTT  1
35702                    (   71) TTCATTTCGGAT  1
40539                    (  137) ATCCTTTTGCTA  1
34549                    (  356) GTGAATTTGAAA  1
49625                    (   30) TTCCTTTCGAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9780 bayes= 10.0216 E= 9.8e+000
    -4   -177    109    -48
 -1055  -1055  -1055    184
 -1055    181    -13  -1055
   128     23  -1055   -206
  -204  -1055  -1055    175
 -1055  -1055  -1055    184
 -1055  -1055  -1055    184
 -1055     81   -172     94
 -1055  -1055    219  -1055
   142    -77    -72  -1055
    77     55  -1055    -48
   128  -1055   -172     -6
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 15 E= 9.8e+000
 0.266667  0.066667  0.466667  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.800000  0.200000  0.000000
 0.666667  0.266667  0.000000  0.066667
 0.066667  0.000000  0.000000  0.933333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.400000  0.066667  0.533333
 0.000000  0.000000  1.000000  0.000000
 0.733333  0.133333  0.133333  0.000000
 0.466667  0.333333  0.000000  0.200000
 0.666667  0.000000  0.066667  0.266667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[GAT]T[CG][AC]TTT[TC]GA[ACT][AT]
--------------------------------------------------------------------------------




Time  4.62 secs.

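The Motif 1 regular expression above can be used for a quick scan of new sequences. The following is a minimal Python sketch, not part of the MEME output: the names `MOTIF1_PATTERN` and `find_sites` are hypothetical, and only the given strand is searched, in line with the `strands: +` setting reported in the command line summary. The regular expression is a simplified summary of the motif, so some sites listed in the report (for example GTCATTTTGCCA from sequence 40749) do not match it.

```python
import re

# Regular expression copied verbatim from the "Motif 1 regular expression" block.
MOTIF1_PATTERN = "[GAT]T[CG][AC]TTT[TC]GA[ACT][AT]"

# The lookahead wrapper lets overlapping candidate sites be reported too.
_SCANNER = re.compile("(?=(" + MOTIF1_PATTERN + "))")

def find_sites(sequence):
    """Return (1-based start, matched site) for every regex hit on the given strand."""
    return [(m.start() + 1, m.group(1)) for m in _SCANNER.finditer(sequence.upper())]

# Flanks and site of the top-ranked Motif 1 occurrence (sequence 46171), joined together.
example = "TTGGACAAAA" + "GTCATTTTGAAA" + "ATTCCCGAGT"
print(find_sites(example))

# A site listed in the report that the simplified regex does not cover.
print(find_sites("GTCATTTTGCCA"))
```

Start positions are relative to the string passed in, so they only agree with the Start column of the report when the full 500-letter sequence is scanned.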
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 10 llr = 129 E-value = 2.1e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 3326142::a5a:::a pos.-specific C 6:7:6:1:a:5::39: probability G 12:3:421::::9:1: matrix T :5113259::::17:: bits 2.2 * 2.0 ** * * 1.8 ** ** ** 1.5 ** ** ** Relative 1.3 *** ** ** Entropy 1.1 ********* (18.7 bits) 0.9 * ********* 0.7 * *** ********* 0.4 ****** ********* 0.2 **************** 0.0 ---------------- Multilevel CTCACATTCAAAGTCA consensus AAAGTGA C C sequence G TG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 40744 421 1.12e-09 TTGTTGCATA CTCACATTCAAAGTCA ACCTGCTAAC 34549 131 1.51e-07 TATGTGCCGA CTCGTGCTCACAGTCA AGCTGCCTCG 49625 285 2.33e-07 ATCATTAGAT CTCGCGTTCAAATTCA GCCTTTGCAT 46508 204 3.14e-07 TTCAAAATAA ATCACTATCACAGCCA ATCCATACCT 31657 308 4.54e-07 CCTGATTTTG AGCTCATTCACAGTCA GTATTTGAAT 40510 131 4.98e-07 TTTCCTATGA CACATGTTCAAAGTGA ATTGAGTGTT 39480 35 6.99e-07 TAGACCACGA GGAACATTCACAGTCA CAAGCTTCGG 40539 179 1.28e-06 TCTTTTTCTC AACATTATCACAGCCA GCTGTTTCAA 46171 463 1.58e-06 CGCAAGCCCA CTTAAGGTCAAAGTCA TATCCTCATT 35098 219 3.25e-06 GGTCCATTGT CAAGCAGGCAAAGCCA TTGCCCTTGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block 
diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40744                             1.1e-09  420_[+2]_64
34549                             1.5e-07  130_[+2]_354
49625                             2.3e-07  284_[+2]_200
46508                             3.1e-07  203_[+2]_281
31657                             4.5e-07  307_[+2]_177
40510                               5e-07  130_[+2]_354
39480                               7e-07  34_[+2]_450
40539                             1.3e-06  178_[+2]_306
46171                             1.6e-06  462_[+2]_22
35098                             3.2e-06  218_[+2]_266
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=10
40744                    (  421) CTCACATTCAAAGTCA  1
34549                    (  131) CTCGTGCTCACAGTCA  1
49625                    (  285) CTCGCGTTCAAATTCA  1
46508                    (  204) ATCACTATCACAGCCA  1
31657                    (  308) AGCTCATTCACAGTCA  1
40510                    (  131) CACATGTTCAAAGTGA  1
39480                    (   35) GGAACATTCACAGTCA  1
40539                    (  179) AACATTATCACAGCCA  1
46171                    (  463) CTTAAGGTCAAAGTCA  1
35098                    (  219) CAAGCAGGCAAAGCCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 9.35404 E= 2.1e+001
    13    140   -113   -997
    13   -997    -13     84
   -46    162   -997   -147
   113   -997     45   -147
  -145    140   -997     11
    54   -997     87    -48
   -46   -119    -13     84
  -997   -997   -113    169
  -997    213   -997   -997
   186   -997   -997   -997
    86    113   -997   -997
   186   -997   -997   -997
  -997   -997    204   -147
  -997     40   -997    133
  -997    198   -113   -997
   186   -997   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 2.1e+001
 0.300000  0.600000  0.100000  0.000000
 0.300000  0.000000  0.200000  0.500000
 0.200000  0.700000  0.000000  0.100000
 0.600000  0.000000  0.300000  0.100000
 0.100000  0.600000  0.000000  0.300000
 0.400000  0.000000  0.400000  0.200000
 0.200000  0.100000  0.200000  0.500000
 0.000000  0.000000  0.100000  0.900000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.900000  0.100000
 0.000000  0.300000  0.000000  0.700000
 0.000000  0.900000  0.100000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[CA][TAG][CA][AG][CT][AGT][TAG]TCA[AC]AG[TC]CA
--------------------------------------------------------------------------------




Time  8.32 secs.

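The entries of the Motif 2 log-odds matrix appear consistent with 100 * log2(p / b), where p is the corresponding letter-probability entry and b is the background frequency of that letter. The following Python sketch illustrates this under stated assumptions: `BACKGROUND` and `log_odds_row` are hypothetical names, the background values are the three-decimal figures printed in the report, and no pseudocounts are applied. Zero-probability letters are returned as `None`, whereas the report prints a large negative value (-997 for this motif) in their place.

```python
import math

# Background frequencies as printed in the report (rounded there to three decimals).
BACKGROUND = {"A": 0.274, "C": 0.228, "G": 0.219, "T": 0.278}

def log_odds_row(probs, background=BACKGROUND, scale=100):
    """Scaled log2 odds for one A, C, G, T row; None marks zero-probability letters."""
    row = {}
    for letter, p in zip("ACGT", probs):
        if p > 0:
            row[letter] = round(scale * math.log2(p / background[letter]))
        else:
            row[letter] = None  # the report shows a large negative score here
    return row

# First row of the Motif 2 letter-probability matrix.
first_row = log_odds_row([0.3, 0.6, 0.1, 0.0])
print(first_row)
```

For this row the sketch gives 13, 140 and -113 for A, C and G, which agrees with the first row of the reported log-odds matrix. Other entries may differ from the report by a unit, since the printed background frequencies are rounded and the sketch ignores any prior applied by MEME.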
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 4 llr = 83 E-value = 6.7e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A a:::3a::3a::8:a:38::: pos.-specific C :3333:::3:3a::::5:::a probability G :3:83:5a5::::a:a33a3: matrix T :58:3:5:::8:3::::::8: bits 2.2 * * * * * * 2.0 * * * * * *** * * 1.8 * * * * * *** * * 1.5 * * * * * *** * * Relative 1.3 * * * * * * *** * * Entropy 1.1 * ** *** ******* **** (30.1 bits) 0.9 * ** *** ******* **** 0.7 * ** **************** 0.4 **** **************** 0.2 **** **************** 0.0 --------------------- Multilevel ATTGAAGGGATCAGAGCAGTC consensus CCCC T A C T AG G sequence G G C G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 35702 160 3.98e-11 TGTTCACGAA ACTGCATGGATCAGAGAAGTC GGCAAAACTT 31657 36 1.65e-10 TGACAACGAC AGTGAATGCATCAGAGCGGTC GTCGCGATCA 40539 88 4.07e-10 TGCTTGCGAT ATTCTAGGAACCAGAGCAGTC CAAGCAGTTC 40760 438 5.12e-10 TTCGCCTTCG ATCGGAGGGATCTGAGGAGGC TTCATCGGTA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 35702 4e-11 
159_[+3]_320 31657 1.7e-10 35_[+3]_444 40539 4.1e-10 87_[+3]_392 40760 5.1e-10 437_[+3]_42 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=4 35702 ( 160) ACTGCATGGATCAGAGAAGTC 1 31657 ( 36) AGTGAATGCATCAGAGCGGTC 1 40539 ( 88) ATTCTAGGAACCAGAGCAGTC 1 40760 ( 438) ATCGGAGGGATCTGAGGAGGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 9600 bayes= 11.2282 E= 6.7e+002 186 -865 -865 -865 -865 13 19 84 -865 13 -865 143 -865 13 177 -865 -13 13 19 -15 186 -865 -865 -865 -865 -865 119 84 -865 -865 219 -865 -13 13 119 -865 186 -865 -865 -865 -865 13 -865 143 -865 213 -865 -865 145 -865 -865 -15 -865 -865 219 -865 186 -865 -865 -865 -865 -865 219 -865 -13 113 19 -865 145 -865 19 -865 -865 -865 219 -865 -865 -865 19 143 -865 213 -865 -865 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 6.7e+002 1.000000 0.000000 0.000000 0.000000 0.000000 0.250000 0.250000 0.500000 0.000000 0.250000 0.000000 0.750000 0.000000 0.250000 0.750000 0.000000 0.250000 0.250000 0.250000 0.250000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.500000 0.500000 0.000000 0.000000 1.000000 0.000000 0.250000 0.250000 0.500000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.250000 0.000000 
0.750000 0.000000 1.000000 0.000000 0.000000 0.750000 0.000000 0.000000 0.250000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.250000 0.500000 0.250000 0.000000 0.750000 0.000000 0.250000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.250000 0.750000 0.000000 1.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- A[TCG][TC][GC][ACGT]A[GT]G[GAC]A[TC]C[AT]GAG[CAG][AG]G[TG]C -------------------------------------------------------------------------------- Time 11.89 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 5465 7.74e-01 500 39480 8.77e-03 34_[+2(6.99e-07)]_450 49576 2.22e-01 500 40325 4.97e-01 500 40510 6.63e-03 130_[+2(4.98e-07)]_354 40744 3.93e-09 305_[+1(3.41e-07)]_103_\ [+2(1.12e-09)]_64 49918 2.58e-02 249_[+1(2.46e-05)]_239 40749 1.98e-02 333_[+1(3.45e-06)]_155 34199 5.80e-02 88_[+1(2.28e-05)]_400 35702 3.41e-08 70_[+1(2.87e-05)]_77_[+3(3.98e-11)]_\ 320 46171 8.94e-07 210_[+1(1.02e-07)]_240_\ [+2(1.58e-06)]_4_[+1(2.46e-05)]_6 31657 6.96e-11 35_[+3(1.65e-10)]_33_[+1(1.76e-05)]_\ 206_[+2(4.54e-07)]_177 46717 1.12e-01 95_[+1(1.19e-05)]_393 46508 2.17e-05 101_[+1(4.50e-06)]_90_\ [+2(3.14e-07)]_281 40539 7.94e-10 
87_[+3(4.07e-10)]_28_[+1(3.45e-05)]_\
    30_[+2(1.28e-06)]_306
37989                            3.74e-03  348_[+1(6.24e-07)]_140
49625                            3.40e-05  29_[+1(3.64e-05)]_243_\
    [+2(2.33e-07)]_147_[+2(3.97e-05)]_37
40760                            1.34e-07  220_[+1(6.19e-06)]_190_\
    [+1(3.84e-05)]_3_[+3(5.12e-10)]_42
34549                            1.01e-04  130_[+2(1.51e-07)]_209_\
    [+1(3.64e-05)]_133
35098                            5.34e-04  218_[+2(3.25e-06)]_43_\
    [+1(2.08e-05)]_211
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
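The combined block diagrams in the summary encode, for each sequence, alternating spacer lengths and motif occurrences of the form [+motif(p-value)]. The following Python sketch shows one way to turn a diagram into 1-based start positions. It is an illustration only: `WIDTHS`, `TOKEN` and `parse_diagram` are hypothetical names, the motif widths are taken from the MOTIF headers of this report, and the line-wrap backslashes are assumed to carry no meaning beyond continuation.

```python
import re

# Motif widths taken from the MOTIF 1, 2 and 3 headers of this report.
WIDTHS = {1: 12, 2: 16, 3: 21}

# Either a motif occurrence such as [+2(1.12e-09)] or a plain spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")

def parse_diagram(diagram, widths=WIDTHS):
    """Return (sites, total_length); each site is (motif, strand, start, p_value)."""
    text = re.sub(r"[\\\s]", "", diagram)  # drop wrap backslashes and whitespace
    position = 1  # 1-based coordinate of the next unread letter
    sites = []
    for m in TOKEN.finditer(text):
        if m.group(4) is not None:
            position += int(m.group(4))  # spacer: skip that many letters
        else:
            motif = int(m.group(2))
            sites.append((motif, m.group(1), position, float(m.group(3))))
            position += widths[motif]  # motif occurrence: skip its width
    return sites, position - 1

# Diagram of sequence 40744, including the wrap marker as it appears in the report.
sites, total = parse_diagram(r"305_[+1(3.41e-07)]_103_\ [+2(1.12e-09)]_64")
print(sites, total)
```

For sequence 40744 the recovered starts are 306 for Motif 1 and 421 for Motif 2, matching the Start columns of the per-motif site tables, and the total length comes out at 500, the sequence length listed in the training set.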