********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/11/11.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9890                     1.0000    500  43466                    1.0000    500
49221                    1.0000    500  50098                    1.0000    500
33873                    1.0000    500  36006                    1.0000    500
49454                    1.0000    500  46822                    1.0000    500
46898                    1.0000    500  34611                    1.0000    500
47298                    1.0000    500  33383                    1.0000    500
50062                    1.0000    500  39484                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/11/11.seqs.fa -oc motifs/11 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.288 C 0.222 G 0.209 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.288 C 0.222 G 0.209 T 0.281
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   7  llr = 84  E-value = 4.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :aaa4:71::::
pos.-specific     C  a:::11::::::
probability       G  ::::161::a6:
matrix            T  ::::3319a:4a

         bits    2.3 *        *
                 2.0 *        *
                 1.8 ****    ** *
                 1.6 ****    ** *
Relative         1.4 ****    ** *
Entropy          1.1 ****   *****
(17.4 bits)      0.9 ****   *****
                 0.7 **** *******
                 0.5 **** *******
                 0.2 **** *******
                 0.0 ------------

Multilevel           CAAAAGATTGGT
consensus                TT    T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
50098                       355  2.96e-07 TGCAAGCGTT CAAAAGATTGTT TAGCTGTCCT
33873                       227  9.34e-07 TCGTTCGACG CAAATTATTGGT TTCTCTGAAA
49454                       282  1.12e-06 GACACATCTC CAAAACATTGGT ATCCATTCCT
39484                       260  1.88e-06 TTTATTTAGC CAAACTATTGGT GAGTTACATT
49221                       448  1.88e-06 CCCCTGTCTG CAAAAGGTTGTT ATACCAAGCG
47298                       409  3.78e-06 AAGCGAGGTT CAAATGTTTGTT TCCAAAAAAT
9890                         22  3.78e-06 AACCTGGCGT CAAAGGAATGGT AAGTTTCGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50098                               3e-07  354_[+1]_134
33873                             9.3e-07  226_[+1]_262
49454                             1.1e-06  281_[+1]_207
39484                             1.9e-06  259_[+1]_229
49221                             1.9e-06  447_[+1]_41
47298                             3.8e-06  408_[+1]_80
9890                              3.8e-06  21_[+1]_467
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=7
50098                    (  355) CAAAAGATTGTT  1
33873                    (  227) CAAATTATTGGT  1
49454                    (  282) CAAAACATTGGT  1
39484                    (  260) CAAACTATTGGT  1
49221                    (  448) CAAAAGGTTGTT  1
47298                    (  409) CAAATGTTTGTT  1
9890                     (   22) CAAAGGAATGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.77593 E= 4.6e+002
  -945    217   -945   -945
   180   -945   -945   -945
   180   -945   -945   -945
   180   -945   -945   -945
    58    -64    -55      2
  -945    -64    145      2
   131   -945    -55    -97
  -101   -945   -945    161
  -945   -945   -945    183
  -945   -945    226   -945
  -945   -945    145     61
  -945   -945   -945    183
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 4.6e+002
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.428571  0.142857  0.142857  0.285714
 0.000000  0.142857  0.571429  0.285714
 0.714286  0.000000  0.142857  0.142857
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.571429  0.428571
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CAAA[AT][GT]ATTG[GT]T
--------------------------------------------------------------------------------




Time  1.97 secs.
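
The regular expression printed for Motif 1 keeps only the more frequent letters in each column, so it is narrower than the probability matrix it summarizes. As a rough illustration (plain Python `re`, not part of MEME; the helper name `sites_matching` is hypothetical, and the site strings are copied from the Motif 1 BLOCKS listing), this sketch checks which of the seven reported sites the expression matches on its own:

```python
import re

# Regular expression reported by MEME for Motif 1.
MOTIF1_REGEX = re.compile(r"CAAA[AT][GT]ATTG[GT]T")

# The seven sites from the Motif 1 BLOCKS listing (sequence name -> site).
MOTIF1_SITES = {
    "50098": "CAAAAGATTGTT",
    "33873": "CAAATTATTGGT",
    "49454": "CAAAACATTGGT",
    "39484": "CAAACTATTGGT",
    "49221": "CAAAAGGTTGTT",
    "47298": "CAAATGTTTGTT",
    "9890": "CAAAGGAATGGT",
}


def sites_matching(pattern, sites):
    """Return the names of the sites that the pattern matches in full."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


matched = sites_matching(MOTIF1_REGEX, MOTIF1_SITES)
print(matched)  # ['50098', '33873']
```

Only two of the seven sites pass, because letters seen in a single site (for example the C at position 6 of 49454) are left out of the expression. A regex scan is therefore likely to miss sites that the scoring matrix would still rank highly.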

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   4  llr = 80  E-value = 5.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::5:3::::3a::::3
pos.-specific     C  ::338:3::::3a3::5:a:
probability       G  a35:3a:5:::5:5::5a:8
matrix            T  :838::358aa3:::a::::

         bits    2.3 *    *      *    **
                 2.0 *    *      *    **
                 1.8 *    *   ** * ** **
                 1.6 *    *   ** * ** **
Relative         1.4 *   **   ** * ** ***
Entropy          1.1 ** *** * ** * ******
(28.9 bits)      0.9 ** *** **** * ******
                 0.7 ****** *************
                 0.5 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GTGTCGAGTTTGCGATCGCG
consensus             GCCG CTA  C A  G  A
sequence               T   T    T C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
36006                         2  1.74e-10          T GTTTCGATTTTGCAATCGCG TGTTTGTGCA
33873                       431  4.01e-10 CTCCCGAAAT GGGTCGAGATTTCGATCGCG ACCAGAAAAC
39484                       412  6.87e-10 CCTGCGGTCC GTGCGGCTTTTCCGATGGCG TTTTACTGCT
46822                       321  8.89e-10 AAGTAGTCCG GTCTCGTGTTTGCCATGGCA TAGGTCACCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36006                             1.7e-10  1_[+2]_479
33873                               4e-10  430_[+2]_50
39484                             6.9e-10  411_[+2]_69
46822                             8.9e-10  320_[+2]_160
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=4
36006                    (    2) GTTTCGATTTTGCAATCGCG  1
33873                    (  431) GGGTCGAGATTTCGATCGCG  1
39484                    (  412) GTGCGGCTTTTCCGATGGCG  1
46822                    (  321) GTCTCGTGTTTGCCATGGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6734 bayes= 10.7164 E= 5.2e+002
  -865   -865    225   -865
  -865   -865     26    141
  -865     17    126    -17
  -865     17   -865    141
  -865    175     26   -865
  -865   -865    225   -865
    80     17   -865    -17
  -865   -865    126     83
   -20   -865   -865    141
  -865   -865   -865    183
  -865   -865   -865    183
  -865     17    126    -17
  -865    217   -865   -865
   -20     17    126   -865
   180   -865   -865   -865
  -865   -865   -865    183
  -865    117    126   -865
  -865   -865    225   -865
  -865    217   -865   -865
   -20   -865    184   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 5.2e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.250000  0.000000  0.250000
 0.000000  0.000000  0.500000  0.500000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.250000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[TG][GCT][TC][CG]G[ACT][GT][TA]TT[GCT]C[GAC]AT[CG]GC[GA]
--------------------------------------------------------------------------------




Time  3.99 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   9  llr = 98  E-value = 1.1e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  3:4:3::::::2
pos.-specific     C  7467:::3::a1
probability       G  :6:1::::a1:3
matrix            T  :::27aa7:9:3

         bits    2.3         * *
                 2.0         * *
                 1.8      ** * *
                 1.6      ** * *
Relative         1.4      ** ***
Entropy          1.1 **   ******
(15.8 bits)      0.9 ***********
                 0.7 ***********
                 0.5 ***********
                 0.2 ***********
                 0.0 ------------

Multilevel           CGCCTTTTGTCG
consensus            ACATA  C   T
sequence                        A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
36006                       395  2.13e-07 CCACTCGACC CGCCTTTCGTCG AATATGCAAC
33873                       401  2.69e-07 AGAATTTACT CCCCTTTTGTCT TCCCCAGCCT
9890                        283  5.57e-07 TACTCATGAG CGCCTTTCGTCT CATATTTTAA
50098                       483  5.25e-06 TTGATTCATT CCACATTTGTCA GTTAGA
50062                        59  5.62e-06 CAAATCCTGA CGATTTTCGTCG AAAAACAATA
43466                       148  6.25e-06 GTATTTCCGT ACCCTTTTGTCC AGATCTTCTG
39484                       119  7.66e-06 CTGGTAATTT CGATATTTGTCG TTGGGTATGA
46822                       159  1.67e-05 ATGAGTATAC AGCCTTTTGGCA TCGCCTGTTC
33383                       332  2.67e-05 GCGATCTTAT ACAGATTTGTCT GTCTAAGAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36006                             2.1e-07  394_[+3]_94
33873                             2.7e-07  400_[+3]_88
9890                              5.6e-07  282_[+3]_206
50098                             5.3e-06  482_[+3]_6
50062                             5.6e-06  58_[+3]_430
43466                             6.3e-06  147_[+3]_341
39484                             7.7e-06  118_[+3]_370
46822                             1.7e-05  158_[+3]_330
33383                             2.7e-05  331_[+3]_157
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
36006                    (  395) CGCCTTTCGTCG  1
33873                    (  401) CCCCTTTTGTCT  1
9890                     (  283) CGCCTTTCGTCT  1
50098                    (  483) CCACATTTGTCA  1
50062                    (   59) CGATTTTCGTCG  1
43466                    (  148) ACCCTTTTGTCC  1
39484                    (  119) CGATATTTGTCG  1
46822                    (  159) AGCCTTTTGGCA  1
33383                    (  332) ACAGATTTGTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 10.4181 E= 1.1e+003
    21    158   -982   -982
  -982    100    141   -982
    63    132   -982   -982
  -982    158    -91    -34
    21   -982   -982    124
  -982   -982   -982    183
  -982   -982   -982    183
  -982     58   -982    124
  -982   -982    226   -982
  -982   -982    -91    166
  -982    217   -982   -982
   -37   -100     67     25
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 1.1e+003
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.444444  0.555556  0.000000
 0.444444  0.555556  0.000000  0.000000
 0.000000  0.666667  0.111111  0.222222
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.111111  0.888889
 0.000000  1.000000  0.000000  0.000000
 0.222222  0.111111  0.333333  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CA][GC][CA][CT][TA]TT[TC]GTC[GTA]
--------------------------------------------------------------------------------




Time  5.93 secs.
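
The non-zero entries of the log-odds matrices look consistent with 100 × log2(p / background), rounded to an integer, where p is the matching entry of the letter-probability matrix. The sketch below checks that reading against two rows of Motif 3. It is an approximation and not MEME's own code: the background values are the three-decimal figures printed in the command line summary, the function name `approx_log_odds` is hypothetical, and the large negative value reported for zero-probability entries (-982 here) is not reproduced because it depends on MEME's internal handling of zero counts.

```python
import math

# Background letter frequencies as printed in the command line summary
# (rounded to three decimals, so results may be off by a point).
BACKGROUND = {"A": 0.288, "C": 0.222, "G": 0.209, "T": 0.281}


def approx_log_odds(prob_row, background=BACKGROUND):
    """Approximate one log-odds row from one letter-probability row.

    Returns None for zero-probability letters, since the floor value MEME
    prints for those is not derived here.
    """
    scores = []
    for letter, p in zip("ACGT", prob_row):
        if p == 0.0:
            scores.append(None)
        else:
            scores.append(round(100 * math.log2(p / background[letter])))
    return scores


# First and last rows of the Motif 3 letter-probability matrix.
first_row = approx_log_odds([0.333333, 0.666667, 0.0, 0.0])
last_row = approx_log_odds([0.222222, 0.111111, 0.333333, 0.333333])
print(first_row)  # reported row: 21 158 -982 -982
print(last_row)   # reported row: -37 -100 67 25
```

With the rounded background the computed scores land within about one point of the reported ones, so the formula seems a reasonable reading aid, though not an exact reimplementation.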

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9890                             4.58e-05  21_[+1(3.78e-06)]_249_\
    [+3(5.57e-07)]_206
43466                            3.61e-02  147_[+3(6.25e-06)]_341
49221                            1.02e-02  447_[+1(1.88e-06)]_41
50098                            2.70e-05  354_[+1(2.96e-07)]_116_\
    [+3(5.25e-06)]_6
33873                            6.35e-12  226_[+1(9.34e-07)]_162_\
    [+3(2.69e-07)]_18_[+2(4.01e-10)]_50
36006                            2.38e-09  1_[+2(1.74e-10)]_373_[+3(2.13e-07)]_\
    94
49454                            1.68e-03  281_[+1(1.12e-06)]_207
46822                            2.82e-07  158_[+3(1.67e-05)]_150_\
    [+2(8.89e-10)]_160
46898                            5.92e-01  500
34611                            6.85e-01  500
47298                            2.43e-02  25_[+1(9.95e-05)]_371_\
    [+1(3.78e-06)]_80
33383                            9.92e-02  331_[+3(2.67e-05)]_157
50062                            2.60e-02  58_[+3(5.62e-06)]_430
39484                            4.63e-10  118_[+3(7.66e-06)]_129_\
    [+1(1.88e-06)]_140_[+2(6.87e-10)]_69
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
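
Each motif diagram alternates spacer lengths with bracketed motif occurrences, so the 1-based start of every site can be recovered by walking the diagram from left to right. The sketch below does this for one summary entry. It assumes the diagram has already been joined into a single string (backslash continuation and line break removed) and takes the motif widths 12, 20 and 12 from the MOTIF header lines; `parse_diagram` is a hypothetical helper, not part of MEME.

```python
import re

# Motif widths taken from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 12, 2: 20, 3: 12}

# A token is either a bracketed motif occurrence, such as [+1(3.78e-06)]
# or [+1], or a bare spacer length such as 226.
TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif_id, start), ...], total_length) for one diagram."""
    position = 0  # number of bases consumed so far
    sites = []
    for match in TOKEN.finditer(diagram):
        _strand, motif, _pvalue, spacer = match.groups()
        if spacer is not None:
            position += int(spacer)
        else:
            motif_id = int(motif)
            sites.append((motif_id, position + 1))  # starts are 1-based
            position += widths[motif_id]
    return sites, position


# Summary entry for sequence 33873, joined onto one line.
sites, total = parse_diagram(
    "226_[+1(9.34e-07)]_162_[+3(2.69e-07)]_18_[+2(4.01e-10)]_50"
)
print(sites, total)  # [(1, 227), (3, 401), (2, 431)] 500
```

The recovered starts 227, 401 and 431 agree with the per-motif site tables for 33873, and the total of 500 matches the sequence length listed in the training set.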