********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/110/110.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
24547                    1.0000    500  43022                    1.0000    500
46545                    1.0000    500  48643                    1.0000    500
48977                    1.0000    500  43962                    1.0000    500
51727                    1.0000    500  44829                    1.0000    500
45307                    1.0000    500  35150                    1.0000    500
48463                    1.0000    500  42904                    1.0000    500
46661                    1.0000    500  50037                    1.0000    500
47076                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/110/110.seqs.fa -oc motifs/110 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.223 G 0.227 T 0.279
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.223 G 0.227 T 0.279
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   4  llr = 83  E-value = 4.9e+002
********************************************************************************
--------------------------------------------------------------------------------
    Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :a:5::::::::a:3aa:35:
pos.-specific     C  ::83:8333::a:::::33:a
probability       G  ::::a3555a:::a8::::3:
matrix            T  a:33::333:a::::::853:

         bits    2.2     *    * * *      *
                 2.0  *  *    * *** **   *
                 1.7 **  *    ***** **   *
                 1.5 **  *    ***** **   *
Relative         1.3 *** **   ********   *
Entropy          1.1 *** **   *********  *
(30.0 bits)      0.9 *** **   *********  *
                 0.7 *** **************  *
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TACAGCGGGGTCAGGAATTAC
consensus              TC GCCC     A  CAG
sequence                T  TTT         CT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
48977                       332  9.47e-11 TACCAGAAAT TATAGCGGGGTCAGGAATAGC ACAACAGCCA
43962                       450  1.49e-10 CTTTGGAAAC TACAGCTCCGTCAGGAATTTC TTTGGGCTAA
46661                       426  2.02e-10 AGTTCAGATT TACCGCGTTGTCAGGAACTAC TCAAGAGAAA
24547                        50  6.57e-10 GTCAAAATCA TACTGGCGGGTCAGAAATCAC CGACTCTTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48977                             9.5e-11  331_[+1]_148
43962                             1.5e-10  449_[+1]_30
46661                               2e-10  425_[+1]_54
24547                             6.6e-10  49_[+1]_430
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=4
48977                    (  332) TATAGCGGGGTCAGGAATAGC  1
43962                    (  450) TACAGCTCCGTCAGGAATTTC  1
46661                    (  426) TACCGCGTTGTCAGGAACTAC  1
24547                    (   50) TACTGGCGGGTCAGAAATCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 10.813 E= 4.9e+002
  -865   -865   -865    184
   188   -865   -865   -865
  -865    175   -865    -16
    88     17   -865    -16
  -865   -865    214   -865
  -865    175     14   -865
  -865     17    114    -16
  -865     17    114    -16
  -865     17    114    -16
  -865   -865    214   -865
  -865   -865   -865    184
  -865    217   -865   -865
   188   -865   -865   -865
  -865   -865    214   -865
   -12   -865    172   -865
   188   -865   -865   -865
   188   -865   -865   -865
  -865     17   -865    142
   -12     17   -865     84
    88   -865     14    -16
  -865    217   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 4.9e+002
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.500000  0.250000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.250000  0.250000  0.000000  0.500000
 0.500000  0.000000  0.250000  0.250000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 regular expression
--------------------------------------------------------------------------------
TA[CT][ACT]G[CG][GCT][GCT][GCT]GTCAG[GA]AA[TC][TAC][AGT]C
--------------------------------------------------------------------------------




Time  2.40 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   3  llr = 68  E-value = 5.0e+002
********************************************************************************
--------------------------------------------------------------------------------
    Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::3::33::::3:::3a::
pos.-specific     C  :::77:37:aa7:a:a::aa
probability       G  :3a:3a::a:::7:3:7:::
matrix            T  a7::::3::::3::7:::::

         bits    2.2   *  *  ***  * *  **
                 2.0   *  *  ***  * * ***
                 1.7 * *  *  ***  * * ***
                 1.5 * *  *  ***  * * ***
Relative         1.3 * * **  ***  * * ***
Entropy          1.1 ****** *************
(32.5 bits)      0.9 ****** *************
                 0.7 ****** *************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           TTGCCGACGCCCGCTCGACC
consensus             G AG CA   TA G A
sequence                   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
42904                       390  1.99e-11 TCCGTCCAAG TTGCCGTCGCCCACGCGACC TACGTAGACT
51727                       193  5.38e-11 ACCCTTTGTT TGGACGCCGCCCGCTCAACC CGAAAAAACT
48463                       243  9.06e-11 GAATAGCGCG TTGCGGAAGCCTGCTCGACC GATTACGAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42904                               2e-11  389_[+2]_91
51727                             5.4e-11  192_[+2]_288
48463                             9.1e-11  242_[+2]_238
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=3
42904                    (  390) TTGCCGTCGCCCACGCGACC  1
51727                    (  193) TGGACGCCGCCCGCTCAACC  1
48463                    (  243) TTGCGGAAGCCTGCTCGACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7215 bayes= 11.6788 E= 5.0e+002
  -823   -823   -823    184
  -823   -823     55    125
  -823   -823    213   -823
    30    158   -823   -823
  -823    158     55   -823
  -823   -823    213   -823
    30     58   -823     25
    30    158   -823   -823
  -823   -823    213   -823
  -823    216   -823   -823
  -823    216   -823   -823
  -823    158   -823     25
    30   -823    155   -823
  -823    216   -823   -823
  -823   -823     55    125
  -823    216   -823   -823
    30   -823    155   -823
   188   -823   -823   -823
  -823    216   -823   -823
  -823    216   -823   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 3 E= 5.0e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.333333  0.000000  0.666667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 regular expression
--------------------------------------------------------------------------------
T[TG]G[CA][CG]G[ACT][CA]GCC[CT][GA]C[TG]C[GA]ACC
--------------------------------------------------------------------------------




Time  4.60 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  19  sites =   4  llr = 78  E-value = 5.7e+002
********************************************************************************
--------------------------------------------------------------------------------
    Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :a3::::::::::::85a:
pos.-specific     C  :::8:::a3a8:338:3:a
probability       G  8:5:::a:::3::3333::
matrix            T  3:33aa::8::a85:::::

         bits    2.2       ** *        *
                 2.0  *    ** *       **
                 1.7  *  **** * *     **
                 1.5  *  **** * *     **
Relative         1.3 ** ***** ***  *  **
Entropy          1.1 ** ********** ** **
(28.0 bits)      0.9 ** ********** ** **
                 0.7 ** ********** ** **
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           GAGCTTGCTCCTTTCAAAC
consensus            T AT    C G CCGGC
sequence               T          G  G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
43022                       112  3.66e-12 CTTGAGATGA GAGCTTGCTCCTTTCAAAC CTCTTCATGC
51727                       166  1.18e-09 GCCTGATGGT GAGCTTGCCCCTCCCGAAC CCTTTGTTTG
24547                       124  2.01e-09 GTCCCTCGAG GAACTTGCTCGTTTGAGAC GAAGGCCTGG
48643                        94  4.16e-09 GCGATTCCTA TATTTTGCTCCTTGCACAC TAATGTAGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43022                             3.7e-12  111_[+3]_370
51727                             1.2e-09  165_[+3]_316
24547                               2e-09  123_[+3]_358
48643                             4.2e-09  93_[+3]_388
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=4
43022                    (  112) GAGCTTGCTCCTTTCAAAC  1
51727                    (  166) GAGCTTGCCCCTCCCGAAC  1
24547                    (  124) GAACTTGCTCGTTTGAGAC  1
48643                    (   94) TATTTTGCTCCTTGCACAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 7230 bayes= 10.819 E= 5.7e+002
  -865   -865    172    -16
   188   -865   -865   -865
   -12   -865    114    -16
  -865    175   -865    -16
  -865   -865   -865    184
  -865   -865   -865    184
  -865   -865    214   -865
  -865    217   -865   -865
  -865     17   -865    142
  -865    217   -865   -865
  -865    175     14   -865
  -865   -865   -865    184
  -865     17   -865    142
  -865     17     14     84
  -865    175     14   -865
   147   -865     14   -865
    88     17     14   -865
   188   -865   -865   -865
  -865    217   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 5.7e+002
 0.000000  0.000000  0.750000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.250000  0.250000  0.500000
 0.000000  0.750000  0.250000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.500000  0.250000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 regular expression
--------------------------------------------------------------------------------
[GT]A[GAT][CT]TTGC[TC]C[CG]T[TC][TCG][CG][AG][ACG]AC
--------------------------------------------------------------------------------




Time  6.74 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
    Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24547                            5.75e-11  49_[+1(6.57e-10)]_53_[+3(2.01e-09)]_\
    358
43022                            6.41e-08  111_[+3(3.66e-12)]_370
46545                            5.06e-01  500
48643                            3.99e-05  93_[+3(4.16e-09)]_388
48977                            7.88e-07  331_[+1(9.47e-11)]_148
43962                            2.78e-06  449_[+1(1.49e-10)]_30
51727                            7.11e-12  165_[+3(1.18e-09)]_8_[+2(5.38e-11)]_\
    288
44829                            2.96e-01  500
45307                            1.49e-01  500
35150                            2.65e-02  470_[+1(8.66e-05)]_9
48463                            1.90e-06  242_[+2(9.06e-11)]_238
42904                            1.45e-06  389_[+2(1.99e-11)]_91
46661                            2.08e-06  425_[+1(2.02e-10)]_54
50037                            3.99e-02  500
47076                            6.99e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************