********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/415/415.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37497                    1.0000    500  5811                     1.0000    500
38075                    1.0000    339  38100                    1.0000    500
14771                    1.0000    500  39832                    1.0000    500
49772                    1.0000    500  40896                    1.0000    500
41294                    1.0000    500  33106                    1.0000    500
39167                    1.0000    500  48715                    1.0000    500
38463                    1.0000    500  40592                    1.0000    500
41403                    1.0000    500  36987                    1.0000    500
37060                    1.0000    500  38389                    1.0000    500
38995                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/415/415.seqs.fa -oc motifs/415 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9339    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.282 C 0.234 G 0.205 T 0.280
Background letter frequencies (from dataset with add-one prior applied):
A 0.282 C 0.234 G 0.205 T 0.280
********************************************************************************

********************************************************************************
MOTIF  1 MEME    width =  21  sites =   6  llr = 123  E-value = 7.5e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::8aa732::::3:::27aa2
pos.-specific     C  a:2::33:3a:::23::3:::
probability       G  :::::::52:3:787a2:::8
matrix            T  :a::::335:7a::::7::::

         bits    2.3                *
                 2.1 *        *     *
                 1.8 ** **    * *   *  **
                 1.6 ** **    * * * *  ***
Relative         1.4 ** **    * * ***  ***
Entropy          1.1 *****    *******  ***
(29.6 bits)      0.9 ******   ******* ****
                 0.7 ****** * ************
                 0.5 ****** **************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CTAAAAAGTCTTGGGGTAAAG
consensus                 CCTC G A C  C
sequence                   T

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
36987                       304  2.32e-13 TCAGCATGCT CTAAAACGTCTTGGGGTAAAG TAGCAAGCTT
38463                       306  2.32e-13 TCAGCATGCT CTAAAACGTCTTGGGGTAAAG TAGCAAGCTT
37497                       160  1.73e-10 TTGCCGTGCT CTCAAATGTCTTGGGGGAAAG TGGCATCCTT
38389                       459  8.92e-10 ACATAGCACT CTAAACATCCGTAGCGTCAAG GCATTTGTAC
40592                       459  8.92e-10 ACATAGCACT CTAAACATCCGTAGCGTCAAG GCATTTGTAC
33106                        91  4.06e-09 ATTTTCGAGG CTAAAATAGCTTGCGGAAAAA GTTTTTAATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36987                             2.3e-13  303_[+1]_176
38463                             2.3e-13  305_[+1]_174
37497                             1.7e-10  159_[+1]_320
38389                             8.9e-10  458_[+1]_21
40592                             8.9e-10  458_[+1]_21
33106                             4.1e-09  90_[+1]_389
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
36987                    (  304) CTAAAACGTCTTGGGGTAAAG  1
38463                    (  306) CTAAAACGTCTTGGGGTAAAG  1
37497                    (  160) CTCAAATGTCTTGGGGGAAAG  1
38389                    (  459) CTAAACATCCGTAGCGTCAAG  1
40592                    (  459) CTAAACATCCGTAGCGTCAAG  1
33106                    (   91) CTAAAATAGCTTGCGGAAAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8959 bayes= 10.9909 E= 7.5e-004
  -923   210  -923  -923
  -923  -923  -923   184
   156   -49  -923  -923
   182  -923  -923  -923
   182  -923  -923  -923
   124    51  -923  -923
    24    51  -923    25
   -76  -923   129    25
  -923    51   -29    84
  -923   210  -923  -923
  -923  -923    70   125
  -923  -923  -923   184
    24  -923   170  -923
  -923   -49   202  -923
  -923    51   170  -923
  -923  -923   229  -923
   -76  -923   -29   125
   124    51  -923  -923
   182  -923  -923  -923
   182  -923  -923  -923
   -76  -923   202  -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 7.5e-004
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.833333  0.166667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.166667  0.000000  0.500000  0.333333
 0.000000  0.333333  0.166667  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.000000  0.166667  0.666667
 0.666667  0.333333  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CTAAA[AC][ACT][GT][TC]C[TG]T[GA]G[GC]GT[AC]AAG
--------------------------------------------------------------------------------

Time  3.07 secs.
********************************************************************************

********************************************************************************
MOTIF  2 MEME    width =  19  sites =   7  llr = 125  E-value = 1.1e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::3:3:::9::1:a41
pos.-specific     C  ::::41134a:::::::6:
probability       G  31::169:::1::79::::
matrix            T  79aa4::46:91a3:a::9

         bits    2.3
                 2.1          *
                 1.8   **     *  *  **
                 1.6   **  *  *  * ***
Relative         1.4  ***  *  ** *****
Entropy          1.1 ****  *  ******** *
(25.7 bits)      0.9 ****  * ***********
                 0.7 **** ** ***********
                 0.5 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           TTTTCGGTTCTATGGTACT
consensus            G   TA AC    T   A
sequence                    C

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            -------------------
38389                       289  1.62e-11 AGACACAATT TTTTTGGTTCTATGGTACT CGGGAAAACG
40592                       289  1.62e-11 AGACACAATT TTTTTGGTTCTATGGTACT CGGGAAAACG
36987                       230  9.22e-10 GTGAGGCCAC TTTTCAGACCTATGGTAAT AAACAAAGAA
38463                       232  9.22e-10 GTGAGGCCAC TTTTCAGACCTATGGTAAT AAACAAAGAA
39832                       460  7.38e-09 AAACGGTCCG GTTTCGGCTCTATTGTACA GATAGGTAGG
48715                       168  4.96e-08 ATGTGGCAGT GTTTTCCCCCTATGATACT TTCTTGTTTT
33106                       174  6.75e-08 ACTAAATCTT TGTTGGGTTCGTTTGTAAT CTGTGAATGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38389                             1.6e-11  288_[+2]_193
40592                             1.6e-11  288_[+2]_193
36987                             9.2e-10  229_[+2]_252
38463                             9.2e-10  231_[+2]_250
39832                             7.4e-09  459_[+2]_22
48715                               5e-08  167_[+2]_314
33106                             6.8e-08  173_[+2]_308
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=7
38389                    (  289) TTTTTGGTTCTATGGTACT  1
40592                    (  289) TTTTTGGTTCTATGGTACT  1
36987                    (  230) TTTTCAGACCTATGGTAAT  1
38463                    (  232) TTTTCAGACCTATGGTAAT  1
39832                    (  460) GTTTCGGCTCTATTGTACA  1
48715                    (  168) GTTTTCCCCCTATGATACT  1
33106                    (  174) TGTTGGGTTCGTTTGTAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 8997 bayes= 10.1705 E= 1.1e-003
  -945  -945    48   135
  -945  -945   -52   161
  -945  -945  -945   184
  -945  -945  -945   184
  -945    87   -52    62
     2   -71   148  -945
  -945   -71   207  -945
     2    29  -945    62
  -945    87  -945   103
  -945   210  -945  -945
  -945  -945   -52   161
   160  -945  -945   -97
  -945  -945  -945   184
  -945  -945   180     3
   -98  -945   207  -945
  -945  -945  -945   184
   182  -945  -945  -945
    60   129  -945  -945
   -98  -945  -945   161
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 7 E= 1.1e-003
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.428571  0.142857  0.428571
 0.285714  0.142857  0.571429  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.285714  0.285714  0.000000  0.428571
 0.000000  0.428571  0.000000  0.571429
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.714286  0.285714
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.142857  0.000000  0.000000  0.857143
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TG]TTT[CT][GA]G[TAC][TC]CTAT[GT]GTA[CA]T
--------------------------------------------------------------------------------

Time  6.15 secs.

********************************************************************************

********************************************************************************
MOTIF  3 MEME    width =  21  sites =   5  llr = 109  E-value = 7.3e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  228:::a::::a842:::::8
pos.-specific     C  8::4:::a:2a:::828:6::
probability       G  :8264a:::8::26:8242a:
matrix            T  ::::6:::a::::::::62:2

         bits    2.3      *             *
                 2.1      * *  *        *
                 1.8      **** **       *
                 1.6      *******   *   *
Relative         1.4 **   *******  ***  *
Entropy          1.1 ****************** **
(31.5 bits)      0.9 ****************** **
                 0.7 *********************
                 0.5 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CGAGTGACTGCAAGCGCTCGA
consensus            AAGCG    C  GAACGGG T
sequence                               T

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
38389                        91  5.68e-13 GTAGCACGAT CGACGGACTGCAAGCGCTCGA CATCATACCG
40592                        91  5.68e-13 GTAGCACGAT CGACGGACTGCAAGCGCTCGA CATCATACCG
14771                        35  1.39e-10 AGACTGTTGA CGAGTGACTCCAAGCCGGCGA GCGCCTCTCC
48715                       453  1.85e-10 TGACGGTAAG CAAGTGACTGCAGACGCGGGA TCTCCGGTTT
37497                       384  1.20e-09 AATTGGGGGA AGGGTGACTGCAAAAGCTTGT CACGTTGTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38389                             5.7e-13  90_[+3]_389
40592                             5.7e-13  90_[+3]_389
14771                             1.4e-10  34_[+3]_445
48715                             1.8e-10  452_[+3]_27
37497                             1.2e-09  383_[+3]_96
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
38389                    (   91) CGACGGACTGCAAGCGCTCGA  1
40592                    (   91) CGACGGACTGCAAGCGCTCGA  1
14771                    (   35) CGAGTGACTCCAAGCCGGCGA  1
48715                    (  453) CAAGTGACTGCAGACGCGGGA  1
37497                    (  384) AGGGTGACTGCAAAAGCTTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8959 bayes= 11.0581 E= 7.3e-003
   -50   177  -897  -897
   -50  -897   197  -897
   150  -897    -3  -897
  -897    77   155  -897
  -897  -897    97   110
  -897  -897   229  -897
   182  -897  -897  -897
  -897   210  -897  -897
  -897  -897  -897   184
  -897   -22   197  -897
  -897   210  -897  -897
   182  -897  -897  -897
   150  -897    -3  -897
    50  -897   155  -897
   -50   177  -897  -897
  -897   -22   197  -897
  -897   177    -3  -897
  -897  -897    97   110
  -897   136    -3   -48
  -897  -897   229  -897
   150  -897  -897   -48
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 7.3e-003
 0.200000  0.800000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.800000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.600000  0.200000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CA][GA][AG][GC][TG]GACT[GC]CA[AG][GA][CA][GC][CG][TG][CGT]G[AT]
--------------------------------------------------------------------------------

Time  8.94 secs.
********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37497                            1.72e-11  159_[+1(1.73e-10)]_203_\
    [+3(1.20e-09)]_96
5811                             7.48e-01  500
38075                            9.93e-01  339
38100                            3.36e-01  500
14771                            7.64e-06  34_[+3(1.39e-10)]_445
39832                            1.12e-04  459_[+2(7.38e-09)]_22
49772                            4.24e-01  500
40896                            3.26e-01  500
41294                            7.03e-01  500
33106                            4.74e-09  90_[+1(4.06e-09)]_62_[+2(6.75e-08)]_\
    308
39167                            3.20e-01  500
48715                            6.90e-10  167_[+2(4.96e-08)]_266_\
    [+3(1.85e-10)]_27
38463                            1.76e-14  231_[+2(9.22e-10)]_55_\
    [+1(2.32e-13)]_174
40592                            1.45e-21  90_[+3(5.68e-13)]_177_\
    [+2(1.62e-11)]_151_[+1(8.92e-10)]_21
41403                            3.36e-01  500
36987                            7.77e-15  229_[+2(9.22e-10)]_55_\
    [+1(2.32e-13)]_176
37060                            4.76e-01  500
38389                            1.45e-21  90_[+3(5.68e-13)]_177_\
    [+2(1.62e-11)]_151_[+1(8.92e-10)]_21
38995                            3.94e-01  500
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************