********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/455/455.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42537                    1.0000    500  50179                    1.0000    500
41620                    1.0000    500  44611                    1.0000    500
37228                    1.0000    500  35488                    1.0000    500
34469                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/455/455.seqs.fa -oc motifs/455 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.217 G 0.238 T 0.273
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.217 G 0.238 T 0.273
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 7 llr = 76 E-value = 6.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1a1::94::1:
pos.-specific     C  9::9:111::13
probability       G  1:::79:1a:47
matrix            T  :9::3::3:a3:

         bits    2.2
                 2.0   *     *
                 1.8   *     **
                 1.5 * ** *  **
Relative         1.3 **** ** ** *
Entropy          1.1 ******* ** *
(15.8 bits)      0.9 ******* ** *
                 0.7 ******* ** *
                 0.4 ******* ** *
                 0.2 ******* ****
                 0.0 ------------

Multilevel           CTACGGAAGTGG
consensus                T  T  TC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
35488                       380  1.97e-06 TCCTGAGGCA CTAAGGAAGTGG ATACATGATG
42537                       348  2.33e-06 AGGACACTAC CTACGGAGGTAG AAGTGAGTGT
41620                       303  2.79e-06 ATTTCCTGAG CTACTGAAGTTC TTTGGTGTTA
50179                        58  2.79e-06 TCGAGTAGAT CTACGGCAGTGC CGATTTGAAC
44611                        60  3.92e-06 GCTGTGTAAA CAACGGATGTTG TACAAGACAA
34469                       446  5.27e-06 ATAAGATGGA CTACTGACGTCG TCTCCTTACT
37228                        81  1.34e-05 TTAGGTCCCC GTACGCATGTGG AATTGGACGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35488                               2e-06  379_[+1]_109
42537                             2.3e-06  347_[+1]_141
41620                             2.8e-06  302_[+1]_186
50179                             2.8e-06  57_[+1]_431
44611                             3.9e-06  59_[+1]_429
34469                             5.3e-06  445_[+1]_43
37228                             1.3e-05  80_[+1]_408
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=7
35488                    (  380) CTAAGGAAGTGG  1
42537                    (  348) CTACGGAGGTAG  1
41620                    (  303) CTACTGAAGTTC  1
50179                    (   58) CTACGGCAGTGC  1
44611                    (   60) CAACGGATGTTG  1
34469                    (  446) CTACTGACGTCG  1
37228                    (   81) GTACGCATGTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 8.93074 E= 6.7e+001
  -945    198    -74   -945
   -93   -945   -945    165
   188   -945   -945   -945
   -93    198   -945   -945
  -945   -945    158      6
  -945    -60    185   -945
   165    -60   -945   -945
    66    -60    -74      6
  -945   -945    207   -945
  -945   -945   -945    187
   -93    -60     85      6
  -945     40    158   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 6.7e+001
 0.000000  0.857143  0.142857  0.000000
 0.142857  0.000000  0.000000  0.857143
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.000000  0.000000  0.714286  0.285714
 0.000000  0.142857  0.857143  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.428571  0.142857  0.142857  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.142857  0.428571  0.285714
 0.000000  0.285714  0.714286  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CTAC[GT]GA[AT]GT[GT][GC]
--------------------------------------------------------------------------------

Time 0.48 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 20 sites = 4 llr = 78 E-value = 4.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :88::::535:aa:3a883:
pos.-specific     C  a3::::3::3a::88:33:a
probability       G  ::38:a:333::::::::::
matrix            T  :::3a:835::::3::::8:

         bits    2.2 *         *        *
                 2.0 *    *    ***  *   *
                 1.8 *   **    ***  *   *
                 1.5 *   **    ***  *   *
Relative         1.3 *   **    ******   *
Entropy          1.1 *******   **********
(28.2 bits)      0.9 *******   **********
                 0.7 *******   **********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CAAGTGTATACAACCAAATC
consensus             CGT  CGAC   TA CCA
sequence                    TGG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            --------------------
44611                       196  1.67e-10 TTTGGAGTTA CAAGTGCATCCAACCACATC AATGATGGGT
34469                        58  3.28e-10 ATCTTACATG CAAGTGTAAGCAACAAAATC AGAAAAGGTA
35488                       308  1.16e-09 CAAGTGTGCA CAGGTGTTTACAACCAACAC GGCCTATTTG
50179                       136  2.84e-09 TTTTTCTTTG CCATTGTGGACAATCAAATC TTTGAACTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44611                             1.7e-10  195_[+2]_285
34469                             3.3e-10  57_[+2]_423
35488                             1.2e-09  307_[+2]_173
50179                             2.8e-09  135_[+2]_345
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=4
44611                    (  196) CAAGTGCATCCAACCACATC  1
34469                    (   58) CAAGTGTAAGCAACAAAATC  1
35488                    (  308) CAGGTGTTTACAACCAACAC  1
50179                    (  136) CCATTGTGGACAATCAAATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3367 bayes= 9.71553 E= 4.3e+002
  -865    220   -865   -865
   146     20   -865   -865
   146   -865      7   -865
  -865   -865    165    -13
  -865   -865   -865    187
  -865   -865    207   -865
  -865     20   -865    146
    88   -865      7    -13
   -12   -865      7     87
    88     20      7   -865
  -865    220   -865   -865
   188   -865   -865   -865
   188   -865   -865   -865
  -865    179   -865    -13
   -12    179   -865   -865
   188   -865   -865   -865
   146     20   -865   -865
   146     20   -865   -865
   -12   -865   -865    146
  -865    220   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 4.3e+002
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.500000  0.000000  0.250000  0.250000
 0.250000  0.000000  0.250000  0.500000
 0.500000  0.250000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.750000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[AC][AG][GT]TG[TC][AGT][TAG][ACG]CAA[CT][CA]A[AC][AC][TA]C
--------------------------------------------------------------------------------

Time 0.92 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 3 llr = 70 E-value = 7.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::3:::::7::a3:::::
pos.-specific     C  a7::a:a:3a3:3a::::::3
probability       G  :3:a::::::33:::7::3:7
matrix            T  ::a::7:a7:3:7:::aa7a:

         bits    2.2 *   * *  *   *
                 2.0 *  ** *  *   **
                 1.8 * *** ** *   ** ** *
                 1.5 * *** ** *   ** ** *
Relative         1.3 ***** ** *   ** ** *
Entropy          1.1 ***** **** **********
(33.5 bits)      0.9 ********** **********
                 0.7 ********** **********
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCTGCTCTTCCATCAGTTTTG
consensus             G   A  C GGC  A  G C
sequence                       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
37228                        22  1.48e-11 TTGCCATCGC CGTGCTCTTCGGTCAGTTTTG ATTTCCTCGT
41620                         6  2.14e-11      TAGCT CCTGCTCTTCCACCAGTTGTC AACTCTCCAG
50179                       472  4.90e-11 TGCGTTTTCA CCTGCACTCCTATCAATTTTG TCACAACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37228                             1.5e-11  21_[+3]_458
41620                             2.1e-11  5_[+3]_474
50179                             4.9e-11  471_[+3]_8
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=3
37228                    (   22) CGTGCTCTTCGGTCAGTTTTG  1
41620                    (    6) CCTGCTCTTCCACCAGTTGTC  1
50179                    (  472) CCTGCACTCCTATCAATTTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 9.78661 E= 7.1e+002
  -823    220   -823   -823
  -823    162     48   -823
  -823   -823   -823    187
  -823   -823    207   -823
  -823    220   -823   -823
    29   -823   -823    128
  -823    220   -823   -823
  -823   -823   -823    187
  -823     62   -823    128
  -823    220   -823   -823
  -823     62     48     29
   129   -823     48   -823
  -823     62   -823    128
  -823    220   -823   -823
   187   -823   -823   -823
    29   -823    148   -823
  -823   -823   -823    187
  -823   -823   -823    187
  -823   -823     48    128
  -823   -823   -823    187
  -823     62    148   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 3 E= 7.1e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.333333  0.333333  0.333333
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[CG]TGC[TA]CT[TC]C[CGT][AG][TC]CA[GA]TT[TG]T[GC]
--------------------------------------------------------------------------------

Time 1.46 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42537                            2.08e-02  347_[+1(2.33e-06)]_141
50179                            3.28e-14  57_[+1(2.79e-06)]_66_[+2(2.84e-09)]_\
                                           316_[+3(4.90e-11)]_8
41620                            3.88e-09  5_[+3(2.14e-11)]_276_[+1(2.79e-06)]_\
                                           186
44611                            2.41e-08  59_[+1(3.92e-06)]_124_\
                                           [+2(1.67e-10)]_285
37228                            7.22e-09  21_[+3(1.48e-11)]_38_[+1(1.34e-05)]_\
                                           408
35488                            1.05e-07  307_[+2(1.16e-09)]_52_\
                                           [+1(1.97e-06)]_109
34469                            9.24e-08  57_[+2(3.28e-10)]_368_\
                                           [+1(5.27e-06)]_43
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************