********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/429/429.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
7913                     1.0000    500  7617                     1.0000    500
3120                     1.0000    500  12984                    1.0000    500
14054                    1.0000    500  7867                     1.0000    500
7910                     1.0000    500  16243                    1.0000    500
44729                    1.0000    500  45034                    1.0000    500
7181                     1.0000    500  49775                    1.0000    500
38020                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/429/429.seqs.fa -oc motifs/429 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.252 C 0.263 G 0.250 T 0.234
Background letter frequencies (from dataset with add-one prior applied):
A 0.252 C 0.263 G 0.250 T 0.234
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 12 sites = 11 llr = 115 E-value = 1.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::2::36:1aa5
pos.-specific     C  :::6:2496::3
probability       G  1:::63::3::2
matrix            T  9a8443:1::::

         bits    2.1  *
                 1.9  *       **
                 1.7 **       **
                 1.5 ***    * **
Relative         1.3 ***    * **
Entropy          1.0 ***** ** **
(15.0 bits)      0.8 ***** ** **
                 0.6 ***** ******
                 0.4 ***** ******
                 0.2 ***** ******
                 0.0 ------------

Multilevel           TTTCGAACCAAA
consensus               TTGC G  C
sequence                  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
38020                       487  1.73e-07 GAGAGACGGT TTTCGAACCAAA CC
44729                       171  4.40e-07 TACTTACAGG TTTCTTACCAAA CCTGTACCAG
7913                        161  7.31e-07 ATTGCGCGGT TTTCGACCCAAA AGCGGCACGA
7910                        258  2.77e-06 GAAGCGTGGC TTTCGTCCGAAA GATAACAAGT
12984                       160  6.95e-06 ATTCCAGAAT TTTTTGACCAAG GCGAAGTTCC
7867                        169  9.20e-06 CTCTTGGGAT TTACGTACCAAC ATATGCTACT
7181                        400  1.00e-05 TCAGAATTGT TTTCTGACGAAG ACGATTTCGA
16243                       106  1.42e-05 ATGCGCCGAC TTATGGACGAAA TTGAAGAAGA
14054                       176  2.68e-05 GCCATGCCGC TTTCGCCCAAAC ATCGACGGGA
49775                        10  2.82e-05  AGAGTGGAG TTTTTACTCAAA TGTCAGTATT
45034                       482  3.33e-05 CGAACGCTTG GTTTGCACCAAC GGACATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38020                             1.7e-07  486_[+1]_2
44729                             4.4e-07  170_[+1]_318
7913                              7.3e-07  160_[+1]_328
7910                              2.8e-06  257_[+1]_231
12984                             6.9e-06  159_[+1]_329
7867                              9.2e-06  168_[+1]_320
7181                                1e-05  399_[+1]_89
16243                             1.4e-05  105_[+1]_383
14054                             2.7e-05  175_[+1]_313
49775                             2.8e-05  9_[+1]_479
45034                             3.3e-05  481_[+1]_7
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=11
38020                    (  487) TTTCGAACCAAA  1
44729                    (  171) TTTCTTACCAAA  1
7913                     (  161) TTTCGACCCAAA  1
7910                     (  258) TTTCGTCCGAAA  1
12984                    (  160) TTTTTGACCAAG  1
7867                     (  169) TTACGTACCAAC  1
7181                     (  400) TTTCTGACGAAG  1
16243                    (  106) TTATGGACGAAA  1
14054                    (  176) TTTCGCCCAAAC  1
49775                    (   10) TTTTTACTCAAA  1
45034                    (  482) GTTTGCACCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.52784 E= 1.3e+001
 -1010  -1010   -146    195
 -1010  -1010  -1010    209
   -47  -1010  -1010    180
 -1010    127  -1010     63
 -1010  -1010    134     63
    11    -53     12     22
   133     47  -1010  -1010
 -1010    179  -1010   -136
  -147    127     12  -1010
   199  -1010  -1010  -1010
   199  -1010  -1010  -1010
   111      5    -46  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 1.3e+001
 0.000000  0.000000  0.090909  0.909091
 0.000000  0.000000  0.000000  1.000000
 0.181818  0.000000  0.000000  0.818182
 0.000000  0.636364  0.000000  0.363636
 0.000000  0.000000  0.636364  0.363636
 0.272727  0.181818  0.272727  0.272727
 0.636364  0.363636  0.000000  0.000000
 0.000000  0.909091  0.000000  0.090909
 0.090909  0.636364  0.272727  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.545455  0.272727  0.181818  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TTT[CT][GT][AGT][AC]C[CG]AA[AC]
--------------------------------------------------------------------------------

Time  1.57 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 16 sites = 5 llr = 79 E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::::::2::::
pos.-specific     C  :64::::2::::::::
probability       G  :4:8a:224:64:4:8
matrix            T  a:62:a866a44a6a2

         bits    2.1 *   **   *  * *
                 1.9 *   **   *  * *
                 1.7 *   **   *  * *
                 1.5 *   **   *  * *
Relative         1.3 *  ****  *  * **
Entropy          1.0 ******* *** ****
(22.9 bits)      0.8 ******* *** ****
                 0.6 *********** ****
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCTGGTTTTTGGTTTG
consensus             GCT  GCG TT G T
sequence                    G   A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
3120                        302  2.21e-09 ATTCTAACAC TCCGGTTTTTTTTTTG GCACAGCAAA
7913                        389  2.96e-09 AAACCTGTTG TCTGGTTTGTTGTTTG CAGATTACAG
12984                       179  3.76e-08 AAGGCGAAGT TCCGGTTCTTGATTTG GTGTTCAAGC
38020                       343  6.09e-08 CTTGGTATCG TGTGGTGTGTGGTGTG TGATTGGTAC
45034                         6  1.97e-07      GCTGC TGTTGTTGTTGTTGTT GTTGTTGTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3120                              2.2e-09  301_[+2]_183
7913                                3e-09  388_[+2]_96
12984                             3.8e-08  178_[+2]_306
38020                             6.1e-08  342_[+2]_142
45034                               2e-07  5_[+2]_479
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=5
3120                     (  302) TCCGGTTTTTTTTTTG  1
7913                     (  389) TCTGGTTTGTTGTTTG  1
12984                    (  179) TCCGGTTCTTGATTTG  1
38020                    (  343) TGTGGTGTGTGGTGTG  1
45034                    (    6) TGTTGTTGTTGTTGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 11.2432 E= 1.5e+002
  -897   -897   -897    209
  -897    119     67   -897
  -897     60   -897    135
  -897   -897    167    -23
  -897   -897    200   -897
  -897   -897   -897    209
  -897   -897    -32    177
  -897    -39    -32    135
  -897   -897     67    135
  -897   -897   -897    209
  -897   -897    126     77
   -33   -897     67     77
  -897   -897   -897    209
  -897   -897     67    135
  -897   -897   -897    209
  -897   -897    167    -23
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 1.5e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.200000  0.200000  0.600000
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.600000  0.400000
 0.200000  0.000000  0.400000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.800000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[CG][TC][GT]GT[TG][TCG][TG]T[GT][GTA]T[TG]T[GT]
--------------------------------------------------------------------------------

Time  3.16 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 21 sites = 5 llr = 94 E-value = 5.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  8:4:2242:::4:::22::::
pos.-specific     C  ::6:::4::4::a4a:22a8a
probability       G  22:2:62::6a2:::6:::::
matrix            T  :8:882:8a::4:6:268:2:

         bits    2.1         * *
                 1.9         * * * *   * *
                 1.7         * * * *   * *
                 1.5         * * * *   * *
Relative         1.3 ** **  ** * * *  ****
Entropy          1.0 *****  **** ***  ****
(27.0 bits)      0.8 *****  **** ***  ****
                 0.6 ****** **** *********
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           ATCTTGATTGGACTCGTTCCC
consensus            GGAGAACA C T C AAC T
sequence                  TG    G   TC

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
44729                       359  3.00e-11 TCGTAACACT ATATTGATTGGTCTCGCTCCC TCACGAACGG
38020                        86  2.19e-10 AGAGCGTTGA ATCTTGCTTGGGCCCGTCCCC GACAGTAGGA
45034                       404  5.34e-10 TCGTCGTCGT GGATTGCTTGGACTCGTTCCC AATTGGGCGA
12984                       477  1.05e-08 ACCGGAAGCT ATCTTAGATCGACCCAATCCC CCG
7617                        327  1.22e-08 TCGCTTCGCC ATCGATATTCGTCTCTTTCTC GCAGCTGTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44729                               3e-11  358_[+3]_121
38020                             2.2e-10  85_[+3]_394
45034                             5.3e-10  403_[+3]_76
12984                               1e-08  476_[+3]_3
7617                              1.2e-08  326_[+3]_153
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
44729                    (  359) ATATTGATTGGTCTCGCTCCC  1
38020                    (   86) ATCTTGCTTGGGCCCGTCCCC  1
45034                    (  404) GGATTGCTTGGACTCGTTCCC  1
12984                    (  477) ATCTTAGATCGACCCAATCCC  1
7617                     (  327) ATCGATATTCGTCTCTTTCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6240 bayes= 10.536 E= 5.4e+002
   166   -897    -32   -897
  -897   -897    -32    177
    66    119   -897   -897
  -897   -897    -32    177
   -33   -897   -897    177
   -33   -897    126    -23
    66     60    -32   -897
   -33   -897   -897    177
  -897   -897   -897    209
  -897     60    126   -897
  -897   -897    200   -897
    66   -897    -32     77
  -897    192   -897   -897
  -897     60   -897    135
  -897    192   -897   -897
   -33   -897    126    -23
   -33    -39   -897    135
  -897    -39   -897    177
  -897    192   -897   -897
  -897    160   -897    -23
  -897    192   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 5.4e+002
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.000000  0.000000  0.800000
 0.200000  0.000000  0.600000  0.200000
 0.400000  0.400000  0.200000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.200000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.600000  0.200000
 0.200000  0.200000  0.000000  0.600000
 0.000000  0.200000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AG][TG][CA][TG][TA][GAT][ACG][TA]T[GC]G[ATG]C[TC]C[GAT][TAC][TC]C[CT]C
--------------------------------------------------------------------------------

Time  4.53 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7913                             7.72e-09  160_[+1(7.31e-07)]_216_\
    [+2(2.96e-09)]_43_[+3(9.88e-05)]_32
7617                             1.15e-04  326_[+3(1.22e-08)]_153
3120                             7.87e-05  301_[+2(2.21e-09)]_183
12984                            1.38e-10  159_[+1(6.95e-06)]_7_[+2(3.76e-08)]_\
    282_[+3(1.05e-08)]_3
14054                            1.44e-01  175_[+1(2.68e-05)]_313
7867                             1.76e-02  168_[+1(9.20e-06)]_320
7910                             3.20e-03  197_[+3(9.62e-05)]_39_\
    [+1(2.77e-06)]_231
16243                            7.69e-02  105_[+1(1.42e-05)]_383
44729                            3.50e-10  170_[+1(4.40e-07)]_176_\
    [+3(3.00e-11)]_121
45034                            1.73e-10  5_[+2(1.97e-07)]_382_[+3(5.34e-10)]_\
    57_[+1(3.33e-05)]_7
7181                             3.30e-02  399_[+1(1.00e-05)]_89
49775                            9.61e-02  9_[+1(2.82e-05)]_479
38020                            1.78e-13  85_[+3(2.19e-10)]_236_\
    [+2(6.09e-08)]_128_[+1(1.73e-07)]_2
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************