********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/4/4.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17172                    1.0000    500  5889                     1.0000    500
32038                    1.0000    500  43010                    1.0000    500
43105                    1.0000    500  28181                    1.0000    500
28412                    1.0000    500  47287                    1.0000    500
10722                    1.0000    500  19427                    1.0000    500
11785                    1.0000    500  45557                    1.0000    500
46038                    1.0000    500  44246                    1.0000    500
47796                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/4/4.seqs.fa -oc motifs/4 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.276 C 0.235 G 0.226 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.276 C 0.235 G 0.226 T 0.263
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   7  llr = 118  E-value = 3.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1::34:::::::413:6361
pos.-specific     C  :13:7:::a:1:316:a3:31
probability       G  a476::76:a::6434::7:7
matrix            T  :3:4:634::9a1::3:1:1:

         bits    2.1 *       **      *
                 1.9 *       ** *    *
                 1.7 *       ** *    *
                 1.5 *       ** *    *
Relative         1.3 * *   * ****    * *
Entropy          1.1 * *** ******    * *
(24.3 bits)      0.9 * **********    * * *
                 0.6 * ************* *****
                 0.4 * *******************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GGGGCTGGCGTTGACGCAGAG
consensus             TCTAATT    CGGA CAC
sequence                            T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
11785                       201  2.86e-10 AATCGATATG GGGGAAGTCGTTCACGCAGAG AACAATAGTC
43010                       119  2.66e-09 TGCTTGTGTT GGCGCTGGCGTTGCCGCTGCG ATACGGATAT
45557                       187  1.07e-08 AATGTTGAAA GCGTCTGGCGTTCACTCCGAC TACGAGTTTT
17172                       452  1.16e-08 GGGGGCGTGA GTGTCTGTCGTTTGCACAGAA CAATTTCACC
43105                       190  1.50e-08 TGTTAACATG GAGGCAGGCGCTGGAGCAGCG GAAAAGTGCC
19427                        75  1.76e-08 TCTTGTTCCG GTGTAATGCGTTGGGTCAAAG AAGAGCGGCT
32038                       270  5.23e-08 TTTTCATCCG GGCGCTTTCGTTGAGACCATG GTGTCTTTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11785                             2.9e-10  200_[+1]_279
43010                             2.7e-09  118_[+1]_361
45557                             1.1e-08  186_[+1]_293
17172                             1.2e-08  451_[+1]_28
43105                             1.5e-08  189_[+1]_290
19427                             1.8e-08  74_[+1]_405
32038                             5.2e-08  269_[+1]_210
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=7
11785                    (  201) GGGGAAGTCGTTCACGCAGAG  1
43010                    (  119) GGCGCTGGCGTTGCCGCTGCG  1
45557                    (  187) GCGTCTGGCGTTCACTCCGAC  1
17172                    (  452) GTGTCTGTCGTTTGCACAGAA  1
43105                    (  190) GAGGCAGGCGCTGGAGCAGCG  1
19427                    (   75) GTGTAATGCGTTGGGTCAAAG  1
32038                    (  270) GGCGCTTTCGTTGAGACCATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 11.2282 E= 3.3e+001
  -945   -945    214   -945
   -95    -72     92     12
  -945     28    166   -945
  -945   -945    134     71
     5    160   -945   -945
    63   -945   -945    112
  -945   -945    166     12
  -945   -945    134     71
  -945    208   -945   -945
  -945   -945    214   -945
  -945    -72   -945    171
  -945   -945   -945    193
  -945     28    134    -88
    63    -72     92   -945
   -95    128     34   -945
     5   -945     92     12
  -945    208   -945   -945
   105     28   -945    -88
     5   -945    166   -945
   105     28   -945    -88
   -95    -72    166   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 3.3e+001
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.142857  0.428571  0.285714
 0.000000  0.285714  0.714286  0.000000
 0.000000  0.000000  0.571429  0.428571
 0.285714  0.714286  0.000000  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  0.000000  0.714286  0.285714
 0.000000  0.000000  0.571429  0.428571
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.285714  0.571429  0.142857
 0.428571  0.142857  0.428571  0.000000
 0.142857  0.571429  0.285714  0.000000
 0.285714  0.000000  0.428571  0.285714
 0.000000  1.000000  0.000000  0.000000
 0.571429  0.285714  0.000000  0.142857
 0.285714  0.000000  0.714286  0.000000
 0.571429  0.285714  0.000000  0.142857
 0.142857  0.142857  0.714286  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[GT][GC][GT][CA][TA][GT][GT]CGTT[GC][AG][CG][GAT]C[AC][GA][AC]G
--------------------------------------------------------------------------------

Time  2.18 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  14  llr = 136  E-value = 3.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1:31::::1::1
pos.-specific     C  :8::a2::5661
probability       G  :111:1:::12:
matrix            T  9168:7aa4418

         bits    2.1     *
                 1.9     * **
                 1.7     * **
                 1.5     * **
Relative         1.3 *   * **
Entropy          1.1 ** ** **   *
(14.0 bits)      0.9 ** ***** ***
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCTTCTTTCCCT
consensus              A  C  TTG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
17172                       196  1.84e-06 CAAGAGACAC TCTTCTTTTCCC TTTTCCCGTT
19427                       171  3.02e-06 AAGGACCGCT TCGTCTTTCCGT ATGCTTTGCA
44246                        90  4.39e-06 ACTTCTATAT TCTGCTTTTTCT CGGCTGAAAT
46038                       243  5.47e-06 ACTGGTCTGG TCGTCTTTCCCC ACAAGACTGA
43010                       479  6.96e-06 GGAGTCGTCA TTTTCCTTCCCT ATTGGCAAAA
28412                       350  8.22e-06 ACAGCGGACT TCAACTTTCCCT TGCATTGACA
43105                       456  1.01e-05 AGTATCGCCC ACTTCTTTTCGT CGCAAAACGG
10722                       216  1.11e-05 ATCATCGGGC TCTTCCTTCTTT TTTCGTAGCG
47287                       271  1.47e-05 TCGCTGGTTG TCTTCTTTATTT GATTTGAAGG
5889                        403  1.47e-05 ATGTATCAAG TGATCTTTCTCT GTCTGTGCTT
45557                       317  1.91e-05 CGTCGCGACA ACTTCTTTCGCT GCGCTCACAC
28181                       424  2.43e-05 TCGCAGTCAA TCATCTTTTTCA CAAAATTCAT
32038                       477  3.05e-05 ACAACTTGCT TTTTCCTTTCGT TCCCGGCATT
11785                       170  8.95e-05 TGGAACGAGG TCAGCGTTACCT GGAGATTTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17172                             1.8e-06  195_[+2]_293
19427                               3e-06  170_[+2]_318
44246                             4.4e-06  89_[+2]_399
46038                             5.5e-06  242_[+2]_246
43010                               7e-06  478_[+2]_10
28412                             8.2e-06  349_[+2]_139
43105                               1e-05  455_[+2]_33
10722                             1.1e-05  215_[+2]_273
47287                             1.5e-05  270_[+2]_218
5889                              1.5e-05  402_[+2]_86
45557                             1.9e-05  316_[+2]_172
28181                             2.4e-05  423_[+2]_65
32038                             3.1e-05  476_[+2]_12
11785                             8.9e-05  169_[+2]_319
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=14
17172                    (  196) TCTTCTTTTCCC  1
19427                    (  171) TCGTCTTTCCGT  1
44246                    (   90) TCTGCTTTTTCT  1
46038                    (  243) TCGTCTTTCCCC  1
43010                    (  479) TTTTCCTTCCCT  1
28412                    (  350) TCAACTTTCCCT  1
43105                    (  456) ACTTCTTTTCGT  1
10722                    (  216) TCTTCCTTCTTT  1
47287                    (  271) TCTTCTTTATTT  1
5889                     (  403) TGATCTTTCTCT  1
45557                    (  317) ACTTCTTTCGCT  1
28181                    (  424) TCATCTTTTTCA  1
32038                    (  477) TTTTCCTTTCGT  1
11785                    (  170) TCAGCGTTACCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 8.95014 E= 3.7e+001
   -95  -1045  -1045    171
 -1045    174   -166    -88
     5  -1045    -66    112
  -195  -1045    -66    158
 -1045    209  -1045  -1045
 -1045    -14   -166    144
 -1045  -1045  -1045    193
 -1045  -1045  -1045    193
   -95    109  -1045     44
 -1045    128   -166     44
 -1045    145     -8    -88
  -195    -72  -1045    158
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 3.7e+001
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.785714  0.071429  0.142857
 0.285714  0.000000  0.142857  0.571429
 0.071429  0.000000  0.142857  0.785714
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.214286  0.071429  0.714286
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.500000  0.000000  0.357143
 0.000000  0.571429  0.071429  0.357143
 0.000000  0.642857  0.214286  0.142857
 0.071429  0.142857  0.000000  0.785714
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TC[TA]TC[TC]TT[CT][CT][CG]T
--------------------------------------------------------------------------------

Time  4.08 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   8  llr = 110  E-value = 1.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :9:1:a9::13::4a3
pos.-specific     C  :::19:1111:186:1
probability       G  51:81:::83:13::6
matrix            T  5:a::::91588::::

         bits    2.1
                 1.9   *  *        *
                 1.7   *  *        *
                 1.5   * ** *      *
Relative         1.3  ** ****    * *
Entropy          1.1 ********* * ***
(19.8 bits)      0.9 ********* ******
                 0.6 ********* ******
                 0.4 ********* ******
                 0.2 ****************
                 0.0 ----------------

Multilevel           GATGCAATGTTTCCAG
consensus            T        GA GA A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
43010                        50  3.82e-09 AGTACAGTAG TATGCAATGTATCCAG CAGGACTTGA
28412                        18  2.05e-08 GGGTGTCGGG GATGCAACGGTTCCAG TACTCAATGC
5889                        387  1.21e-07 TCGTGTTCAC GATCCAATGTATCAAG TGATCTTTCT
10722                       441  2.42e-07 CAAAGACTCG GATGCAATTTTTCAAC CCTACGAAAC
47287                       349  4.09e-07 CGACCGACGA TATGCAATCCTGCCAG AAATCCTGAT
19427                       147  6.89e-07 GACCAAAGAA TATGCAATGATCGCAA GGACCGCTTC
17172                        23  6.89e-07 GACATAAACA GATACACTGGTTCCAA AGATGTTGGT
43105                       311  7.84e-07 TTCTTAAGAA TGTGGAATGTTTGAAG AAGACAAGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43010                             3.8e-09  49_[+3]_435
28412                               2e-08  17_[+3]_467
5889                              1.2e-07  386_[+3]_98
10722                             2.4e-07  440_[+3]_44
47287                             4.1e-07  348_[+3]_136
19427                             6.9e-07  146_[+3]_338
17172                             6.9e-07  22_[+3]_462
43105                             7.8e-07  310_[+3]_174
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
43010                    (   50) TATGCAATGTATCCAG  1
28412                    (   18) GATGCAACGGTTCCAG  1
5889                     (  387) GATCCAATGTATCAAG  1
10722                    (  441) GATGCAATTTTTCAAC  1
47287                    (  349) TATGCAATCCTGCCAG  1
19427                    (  147) TATGCAATGATCGCAA  1
17172                    (   23) GATACACTGGTTCCAA  1
43105                    (  311) TGTGGAATGTTTGAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 9.82714 E= 1.8e+002
  -965   -965    114     93
   166   -965    -85   -965
  -965   -965   -965    193
  -114    -91    173   -965
  -965    189    -85   -965
   186   -965   -965   -965
   166    -91   -965   -965
  -965    -91   -965    174
  -965    -91    173   -107
  -114    -91     14     93
   -14   -965   -965    151
  -965    -91    -85    151
  -965    167     14   -965
    44    141   -965   -965
   186   -965   -965   -965
   -14    -91    147   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 1.8e+002
 0.000000  0.000000  0.500000  0.500000
 0.875000  0.000000  0.125000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.125000  0.750000  0.000000
 0.000000  0.875000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.125000  0.750000  0.125000
 0.125000  0.125000  0.250000  0.500000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.125000  0.125000  0.750000
 0.000000  0.750000  0.250000  0.000000
 0.375000  0.625000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.125000  0.625000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GT]ATGCAATG[TG][TA]T[CG][CA]A[GA]
--------------------------------------------------------------------------------

Time  5.99 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17172                            6.64e-10  22_[+3(6.89e-07)]_157_\
    [+2(1.84e-06)]_244_[+1(1.16e-08)]_28
5889                             4.28e-05  386_[+3(1.21e-07)]_[+2(1.47e-05)]_\
    86
32038                            3.33e-05  269_[+1(5.23e-08)]_186_\
    [+2(3.05e-05)]_12
43010                            4.51e-12  49_[+3(3.82e-09)]_53_[+1(2.66e-09)]_\
    339_[+2(6.96e-06)]_10
43105                            4.56e-09  189_[+1(1.50e-08)]_100_\
    [+3(7.84e-07)]_129_[+2(1.01e-05)]_33
28181                            7.80e-02  423_[+2(2.43e-05)]_65
28412                            2.69e-06  17_[+3(2.05e-08)]_316_\
    [+2(8.22e-06)]_139
47287                            1.19e-04  270_[+2(1.47e-05)]_66_\
    [+3(4.09e-07)]_136
10722                            3.24e-05  215_[+2(1.11e-05)]_213_\
    [+3(2.42e-07)]_44
19427                            1.54e-09  74_[+1(1.76e-08)]_51_[+3(6.89e-07)]_\
    8_[+2(3.02e-06)]_318
11785                            5.87e-07  169_[+2(8.95e-05)]_19_\
    [+1(2.86e-10)]_279
45557                            4.43e-06  186_[+1(1.07e-08)]_109_\
    [+2(1.91e-05)]_172
46038                            2.63e-02  242_[+2(5.47e-06)]_246
44246                            1.34e-02  89_[+2(4.39e-06)]_399
47796                            3.39e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************