********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/19/19.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43132                    1.0000    500  48471                    1.0000    500
36162                    1.0000    500  36184                    1.0000    500
39253                    1.0000    500  35520                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/19/19.seqs.fa -oc motifs/19 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        6    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3000    N=               6
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.279 C 0.220 G 0.209 T 0.292
Background letter frequencies (from dataset with add-one prior applied):
A 0.279 C 0.220 G 0.209 T 0.292
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   6  llr = 72  E-value = 7.1e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :8838a8a82::
pos.-specific     C  a::22:2:2:::
probability       G  :2:2:::::::a
matrix            T  ::23:::::8a:

         bits    2.3 *          *
                 2.0 *          *
                 1.8 *    * *  **
                 1.6 *    * *  **
Relative         1.4 **  ***** **
Entropy          1.1 *** ********
(17.3 bits)      0.9 *** ********
                 0.7 *** ********
                 0.5 *** ********
                 0.2 *** ********
                 0.0 ------------

Multilevel           CAAAAAAAATTG
consensus               T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
35520                        89  1.44e-07 AGAGTCCTGC CAAAAAAAATTG CTTCCCGCTA
36162                         3  2.95e-07         TG CAATAAAAATTG AGACTTTCAT
43132                       341  2.44e-06 GTACTTCCAA CAAGCAAAATTG GAAGATAGAA
39253                       366  2.71e-06 GACTTTTTCG CAACAAAACTTG CCCTTGCCAG
36184                       370  5.22e-06 TGCCGACATC CGTAAAAAATTG GTCGCCACAT
48471                        91  5.22e-06 AGTTACCAGG CAATAACAAATG CAGTCCAGGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35520                             1.4e-07  88_[+1]_400
36162                             2.9e-07  2_[+1]_486
43132                             2.4e-06  340_[+1]_148
39253                             2.7e-06  365_[+1]_123
36184                             5.2e-06  369_[+1]_119
48471                             5.2e-06  90_[+1]_398
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=6
35520                    (   89) CAAAAAAAATTG  1
36162                    (    3) CAATAAAAATTG  1
43132                    (  341) CAAGCAAAATTG  1
39253                    (  366) CAACAAAACTTG  1
36184                    (  370) CGTAAAAAATTG  1
48471                    (   91) CAATAACAAATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 2934 bayes= 9.37898 E= 7.1e+000
  -923    218   -923   -923
   158   -923    -32   -923
   158   -923   -923    -81
    26    -40    -32     19
   158    -40   -923   -923
   184   -923   -923   -923
   158    -40   -923   -923
   184   -923   -923   -923
   158    -40   -923   -923
   -74   -923   -923    151
  -923   -923   -923    177
  -923   -923    226   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 7.1e+000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.333333  0.166667  0.166667  0.333333
 0.833333  0.166667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CAA[AT]AAAAATTG
--------------------------------------------------------------------------------




Time  0.34 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   6  llr = 97  E-value = 2.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3:::2::27::22::52a::
pos.-specific     C  :2:::7:8:8:22::22::3
probability       G  75aa32::::8::7235:7:
matrix            T  :3::52a:3227738:2:37

         bits    2.3   **
                 2.0   **
                 1.8   **  *          *
                 1.6   **  **  *      *
Relative         1.4   **  ** **      *
Entropy          1.1 * **  ** **  **  **
(23.4 bits)      0.9 * ** ******  **  ***
                 0.7 **** *********** ***
                 0.5 **************** ***
                 0.2 ********************
                 0.0 --------------------

Multilevel           GGGGTCTCACGTTGTAGAGT
consensus            AT  G   T    T G  TC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            --------------------
39253                        25  8.43e-11 AAATGGGAAC GTGGACTCACGTTGTGGAGT TAGACACATT
43132                       308  8.43e-11 TCCAGAGAGA GGGGTCTCATGTTGTAGAGT GGTGTACTTC
36184                       457  4.97e-08 GCTGTTGCGA ATGGGCTCACTATGTAAAGT CCAAATTGTA
36162                       405  5.34e-08 AAGCCCAATT GGGGTGTCTCGTCGTCCATC AGGCCAAGAC
35520                       425  7.99e-08 GCTTTTACGA GCGGGCTCTCGCATTGGATC TCCATTTATC
48471                       418  1.81e-07 TTTCCGGTGG AGGGTTTAACGTTTGATAGT ATTGAAAAGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39253                             8.4e-11  24_[+2]_456
43132                             8.4e-11  307_[+2]_173
36184                               5e-08  456_[+2]_24
36162                             5.3e-08  404_[+2]_76
35520                               8e-08  424_[+2]_56
48471                             1.8e-07  417_[+2]_63
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=6
39253                    (   25) GTGGACTCACGTTGTGGAGT  1
43132                    (  308) GGGGTCTCATGTTGTAGAGT  1
36184                    (  457) ATGGGCTCACTATGTAAAGT  1
36162                    (  405) GGGGTGTCTCGTCGTCCATC  1
35520                    (  425) GCGGGCTCTCGCATTGGATC  1
48471                    (  418) AGGGTTTAACGTTTGATAGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 2886 bayes= 10.008 E= 2.1e+001
    26   -923    167   -923
  -923    -40    126     19
  -923   -923    226   -923
  -923   -923    226   -923
   -74   -923     67     77
  -923    160    -32    -81
  -923   -923   -923    177
   -74    192   -923   -923
   126   -923   -923     19
  -923    192   -923    -81
  -923   -923    200    -81
   -74    -40   -923    119
   -74    -40   -923    119
  -923   -923    167     19
  -923   -923    -32    151
    84    -40     67   -923
   -74    -40    126    -81
   184   -923   -923   -923
  -923   -923    167     19
  -923     60   -923    119
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 2.1e+001
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.166667  0.500000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.000000  0.333333  0.500000
 0.000000  0.666667  0.166667  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.833333  0.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.000000  0.833333  0.166667
 0.166667  0.166667  0.000000  0.666667
 0.166667  0.166667  0.000000  0.666667
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  0.166667  0.833333
 0.500000  0.166667  0.333333  0.000000
 0.166667  0.166667  0.500000  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.333333  0.000000  0.666667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GA][GT]GG[TG]CTC[AT]CGTT[GT]T[AG]GA[GT][TC]
--------------------------------------------------------------------------------




Time  0.71 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   6  llr = 68  E-value = 5.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::3:2:2:8:a3
pos.-specific     C  :::8:3:a:3::
probability       G  :a2:8:2::7::
matrix            T  a:52:77:2::7

         bits    2.3  *     *
                 2.0  *     *
                 1.8 **     *  *
                 1.6 **  *  *  *
Relative         1.4 ** **  * **
Entropy          1.1 ** **  ****
(16.5 bits)      0.9 ** *** *****
                 0.7 ** *********
                 0.5 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGTCGTTCAGAT
consensus              A  C   C A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
36184                       328  7.91e-07 AGGAAATATT TGTCGCTCACAT AAGCGAAAAC
35520                       380  1.50e-06 CCCCTGCTCT TGTCGTTCTGAT AGACGAGGCA
43132                        22  1.50e-06 ACTTTTTCAA TGACGTTCACAA AGCAATTCAC
36162                       117  1.77e-06 AAAAGATGGC TGGCGTGCAGAT TTTTTATGAT
48471                       170  3.16e-06 TATATTTTCA TGTTGCTCAGAT AGATTAGTTC
39253                       104  1.40e-05 TAGGATCGTG TGACATACAGAA GGATATTTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36184                             7.9e-07  327_[+3]_161
35520                             1.5e-06  379_[+3]_109
43132                             1.5e-06  21_[+3]_467
36162                             1.8e-06  116_[+3]_372
48471                             3.2e-06  169_[+3]_319
39253                             1.4e-05  103_[+3]_385
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=6
36184                    (  328) TGTCGCTCACAT  1
35520                    (  380) TGTCGTTCTGAT  1
43132                    (   22) TGACGTTCACAA  1
36162                    (  117) TGGCGTGCAGAT  1
48471                    (  170) TGTTGCTCAGAT  1
39253                    (  104) TGACATACAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 2934 bayes= 8.93074 E= 5.9e+002
  -923   -923   -923    177
  -923   -923    226   -923
    26   -923    -32     77
  -923    192   -923    -81
   -74   -923    200   -923
  -923     60   -923    119
   -74   -923    -32    119
  -923    218   -923   -923
   158   -923   -923    -81
  -923     60    167   -923
   184   -923   -923   -923
    26   -923   -923    119
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 5.9e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.000000  0.166667  0.500000
 0.000000  0.833333  0.000000  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.166667  0.000000  0.166667  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.333333  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TG[TA]CG[TC]TCA[GC]A[TA]
--------------------------------------------------------------------------------




Time  1.14 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43132                            1.81e-11  21_[+3(1.50e-06)]_22_[+1(5.49e-05)]_\
    240_[+2(8.43e-11)]_13_[+1(2.44e-06)]_148
48471                            8.91e-08  90_[+1(5.22e-06)]_67_[+3(3.16e-06)]_\
    236_[+2(1.81e-07)]_63
36162                            1.21e-09  2_[+1(2.95e-07)]_102_[+3(1.77e-06)]_\
    99_[+3(4.57e-05)]_114_[+3(2.14e-05)]_39_[+2(5.34e-08)]_76
36184                            7.66e-09  327_[+3(7.91e-07)]_30_\
    [+1(5.22e-06)]_75_[+2(4.97e-08)]_24
39253                            1.61e-10  24_[+2(8.43e-11)]_59_[+3(1.40e-05)]_\
    250_[+1(2.71e-06)]_123
35520                            7.75e-10  88_[+1(1.44e-07)]_279_\
    [+3(1.50e-06)]_33_[+2(7.99e-08)]_56
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
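
Note on reading the matrices: each motif is reported twice, once as a log-odds matrix and once as a letter-probability matrix, both in A C G T column order. The sketch below is not part of the MEME output. It assumes that a log-odds entry is 100 times the base-2 logarithm of the letter probability divided by the background frequency, rounded to an integer, and checks that assumption against Motif 1. The helper name `log_odds` is hypothetical. The background values are the rounded ones printed in this file, so some entries may differ from the reported scores by one unit. Zero probabilities are printed as -923 in this file; the sketch does not attempt to reproduce that value and returns None instead.

```python
import math

# Background frequencies copied from the "Background letter frequencies" line.
# These are rounded to three decimals, which limits the precision of the result.
BACKGROUND = {"A": 0.279, "C": 0.220, "G": 0.209, "T": 0.292}


def log_odds(prob, letter, background=BACKGROUND):
    """Assumed relation: round(100 * log2(prob / background frequency)).

    Returns None for a zero probability; the file prints -923 for such
    entries, which this sketch makes no attempt to derive.
    """
    if prob <= 0.0:
        return None
    return round(100 * math.log2(prob / background[letter]))


# Row 10 of the Motif 1 letter-probability matrix (A C G T order).
row = {"A": 0.166667, "C": 0.0, "G": 0.0, "T": 0.833333}
scores = {letter: log_odds(p, letter) for letter, p in row.items()}
print(scores)
```

For this row the sketch gives -74 for A and 151 for T, which agrees with row 10 of the Motif 1 log-odds matrix (-74 -923 -923 151).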