********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/198/198.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
32120                    1.0000    500  37723                    1.0000    500
48930                    1.0000    500  54992                    1.0000    500
50274                    1.0000    500  16922                    1.0000    500
26190                    1.0000    500  4216                     1.0000    500
45928                    1.0000    500  35151                    1.0000    500
43742                    1.0000    500  45014                    1.0000    500
38545                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/198/198.seqs.fa -oc motifs/198 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.245 C 0.255 G 0.235 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.245 C 0.255 G 0.235 T 0.265
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 13 llr = 150 E-value = 3.2e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  4:92::8:52::6312
pos.-specific     C  :9::221a1121:133
probability       G  61:::82::329:2:5
matrix            T  ::1881::546:456:

         bits    2.1
                 1.9        *
                 1.7  **    *   *
                 1.5  **    *   *
Relative         1.3  ****  *   *
Entropy          1.0 ********   **
(16.6 bits)      0.8 ********   **
                 0.6 ********* *** **
                 0.4 ********* *** **
                 0.2 ****************
                 0.0 ----------------

Multilevel           GCATTGACATTGATTG
consensus            A  A    TGG TACC
sequence                      A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
54992                       383  5.15e-10 CCTTCGTGGC GCATTGACATTGATTG CGATTCGATT
35151                       447  1.45e-09 TCGGAGAGGG GCATTGACTTTGAATG AGAAGAAAGA
16922                       175  3.01e-07 CAGAGTTGTG ACATTGACTTTGAAAG GGAACTGTAA
37723                       141  8.03e-07 CAGGCGCATC GCATCGGCTATGATTG GCTCTACTGG
43742                       357  1.07e-06 TTTTCCTCAG ACAATGACAAGGTATG CTCGATTGTG
48930                       230  1.56e-06 GCCCCTGTTG GCATTGCCTTTGTTCC AGTGACACTA
45928                       403  3.32e-06 CCAATCAACT GCATTCACTGTCATTC ATGTGACCGT
38545                       453  4.17e-06 CTCTTCCTCG GCATTGACACTGTCCC CGTTGCACGT
26190                       137  6.04e-06 GTCGTTTGTT GCTTCGACAGCGATTG TACAACTACG
45014                       306  7.44e-06 TCCTAGTAAA AGATTGACAGCGTGTG TGTGTGCGCG
4216                        250  7.44e-06 TAAAGCACAG ACAATGACAAGGAGCA AATCAAACAC
32120                       312  8.51e-06 ATATTGATAT GCATTTGCTTTGTATA TAACTCCGTC
50274                       476  2.21e-05 GCACGGACAC ACAATCACCGGGATCC TCGACGACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54992                             5.2e-10  382_[+1]_102
35151                             1.4e-09  446_[+1]_38
16922                               3e-07  174_[+1]_310
37723                               8e-07  140_[+1]_344
43742                             1.1e-06  356_[+1]_128
48930                             1.6e-06  229_[+1]_255
45928                             3.3e-06  402_[+1]_82
38545                             4.2e-06  452_[+1]_32
26190                               6e-06  136_[+1]_348
45014                             7.4e-06  305_[+1]_179
4216                              7.4e-06  249_[+1]_235
32120                             8.5e-06  311_[+1]_173
50274                             2.2e-05  475_[+1]_9
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=13
54992                    (  383) GCATTGACATTGATTG  1
35151                    (  447) GCATTGACTTTGAATG  1
16922                    (  175) ACATTGACTTTGAAAG  1
37723                    (  141) GCATCGGCTATGATTG  1
43742                    (  357) ACAATGACAAGGTATG  1
48930                    (  230) GCATTGCCTTTGTTCC  1
45928                    (  403) GCATTCACTGTCATTC  1
38545                    (  453) GCATTGACACTGTCCC  1
26190                    (  137) GCTTCGACAGCGATTG  1
45014                    (  306) AGATTGACAGCGTGTG  1
4216                     (  250) ACAATGACAAGGAGCA  1
32120                    (  312) GCATTTGCTTTGTATA  1
50274                    (  476) ACAATCACCGGGATCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.45029 E= 3.2e-003
    65  -1035    139  -1035
 -1035    186   -161  -1035
   191  -1035  -1035   -178
    -9  -1035  -1035    154
 -1035    -73  -1035    167
 -1035    -73    171   -178
   165   -172    -61  -1035
 -1035    197  -1035  -1035
    91   -172  -1035     80
    -9   -172     39     54
 -1035    -73     -3    121
 -1035   -172    197  -1035
   133  -1035  -1035     54
    33   -172    -61     80
  -167     27  -1035    121
   -67     27    120  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 3.2e-003
 0.384615  0.000000  0.615385  0.000000
 0.000000  0.923077  0.076923  0.000000
 0.923077  0.000000  0.000000  0.076923
 0.230769  0.000000  0.000000  0.769231
 0.000000  0.153846  0.000000  0.846154
 0.000000  0.153846  0.769231  0.076923
 0.769231  0.076923  0.153846  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.461538  0.076923  0.000000  0.461538
 0.230769  0.076923  0.307692  0.384615
 0.000000  0.153846  0.230769  0.615385
 0.000000  0.076923  0.923077  0.000000
 0.615385  0.000000  0.000000  0.384615
 0.307692  0.076923  0.153846  0.461538
 0.076923  0.307692  0.000000  0.615385
 0.153846  0.307692  0.538462  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA]CA[TA]TGAC[AT][TGA][TG]G[AT][TA][TC][GC]
--------------------------------------------------------------------------------

Time 1.44 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 19 sites = 8 llr = 122 E-value = 7.2e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a355::5:6:13:146:::
pos.-specific     C  :64:9558:a::8:41a::
probability       G  ::1515:11:953911:a5
matrix            T  :1:::::13::3::11::5

         bits    2.1 *                *
                 1.9 *        *      **
                 1.7 *        *      **
                 1.5 *   *    **  *  **
Relative         1.3 *   *    ** **  **
Entropy          1.0 *  ****  ** **  ***
(22.0 bits)      0.8 *  ***** ** **  ***
                 0.6 **************  ***
                 0.4 ************** ****
                 0.2 *******************
                 0.0 -------------------

Multilevel           ACAACCACACGGCGAACGG
consensus             ACG GC T  AG C   T
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
48930                       111  1.07e-09 CCAGCCGGCG AACGCCCCACGGCGCACGG CACCCGAAGC
45928                        69  8.30e-09 CTGCCTTTTA ATCGCGACACGACGCACGG CAACGAAGTA
37723                       333  1.92e-08 CGGTGCACCC ACCGCGACACGGGAAACGG AACGCACCCC
4216                         72  2.70e-08 GTATATTAGG AAAACCCCACGTCGAGCGT CTGTCATATG
26190                       400  2.70e-08 GATGCCTTTT ACAACGACGCGTCGGACGT CTCGAAGTCC
35151                       422  7.90e-08 ACCAGCCTGA ACAACGATTCGGCGCTCGG AGAGGGGCAT
16922                        11  3.11e-07 TGACCTGTTT ACAAGCCCACAGGGTACGT AGCTGGCCGA
38545                        35  4.56e-07 GAACGCGGAT ACGGCCCGTCGACGACCGT TACAGTTACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48930                             1.1e-09  110_[+2]_371
45928                             8.3e-09  68_[+2]_413
37723                             1.9e-08  332_[+2]_149
4216                              2.7e-08  71_[+2]_410
26190                             2.7e-08  399_[+2]_82
35151                             7.9e-08  421_[+2]_60
16922                             3.1e-07  10_[+2]_471
38545                             4.6e-07  34_[+2]_447
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=8
48930                    (  111) AACGCCCCACGGCGCACGG  1
45928                    (   69) ATCGCGACACGACGCACGG  1
37723                    (  333) ACCGCGACACGGGAAACGG  1
4216                     (   72) AAAACCCCACGTCGAGCGT  1
26190                    (  400) ACAACGACGCGTCGGACGT  1
35151                    (  422) ACAACGATTCGGCGCTCGG  1
16922                    (   11) ACAAGCCCACAGGGTACGT  1
38545                    (   35) ACGGCCCGTCGACGACCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 6266 bayes= 10.3492 E= 7.2e-001
   203   -965   -965   -965
     3    129   -965   -108
   103     56    -91   -965
   103   -965    109   -965
  -965    178    -91   -965
  -965     97    109   -965
   103     97   -965   -965
  -965    156    -91   -108
   135   -965    -91     -9
  -965    197   -965   -965
   -97   -965    190   -965
     3   -965    109     -9
  -965    156      9   -965
   -97   -965    190   -965
    61     56    -91   -108
   135   -102    -91   -108
  -965    197   -965   -965
  -965   -965    209   -965
  -965   -965    109     91
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 8 E= 7.2e-001
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.625000  0.000000  0.125000
 0.500000  0.375000  0.125000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.750000  0.125000  0.125000
 0.625000  0.000000  0.125000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.000000  0.875000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.125000  0.000000  0.875000  0.000000
 0.375000  0.375000  0.125000  0.125000
 0.625000  0.125000  0.125000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
A[CA][AC][AG]C[CG][AC]C[AT]CG[GAT][CG]G[AC]ACG[GT]
--------------------------------------------------------------------------------

Time 2.84 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 6 llr = 105 E-value = 9.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  3:87::a7287322:2:88::
pos.-specific     C  7a2::2::8:2:2::88::::
probability       G  :::2:3:3:22255::22:23
matrix            T  :::2a5:::::523a:::287

         bits    2.1       *
                 1.9  *  * *       *
                 1.7  *  * *       *
                 1.5  ** * *  *    *  **
Relative         1.3  ** * * **    ******
Entropy          1.0 *** * ****    *******
(25.3 bits)      0.8 ***** *****   *******
                 0.6 ************ ********
                 0.4 ************ ********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCAATTAACAATGGTCCAATT
consensus            A    G G   A T      G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
32120                       125  1.88e-11 TGGGGTTTTT CCAATGAGCAATGGTCCAATG CCTTGTCCTC
38545                       272  7.31e-11 TACGACTTTG CCAATTAACAAAGTTCGAATT TCCGACATGA
35151                       211  7.18e-09 GACGGATTTG ACAGTTAGAAATGGTCCAAGT TTTCCTCGTC
45928                       310  1.59e-08 ACAGGTGGAA CCAATGAACGGTTTTACAATT CCCGGTTCTA
48930                       432  1.93e-08 CGCGGAAGAC CCAATTAACACGAGTCCGTTT CTCACTACAC
26190                         2  4.82e-08          G ACCTTCAACAAACATCCAATG CACTATCGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32120                             1.9e-11  124_[+3]_355
38545                             7.3e-11  271_[+3]_208
35151                             7.2e-09  210_[+3]_269
45928                             1.6e-08  309_[+3]_170
48930                             1.9e-08  431_[+3]_48
26190                             4.8e-08  1_[+3]_478
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
32120                    (  125) CCAATGAGCAATGGTCCAATG  1
38545                    (  272) CCAATTAACAAAGTTCGAATT  1
35151                    (  211) ACAGTTAGAAATGGTCCAAGT  1
45928                    (  310) CCAATGAACGGTTTTACAATT  1
48930                    (  432) CCAATTAACACGAGTCCGTTT  1
26190                    (    2) ACCTTCAACAAACATCCAATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6240 bayes= 9.67957 E= 9.2e+002
    44    139   -923   -923
  -923    197   -923   -923
   176    -61   -923   -923
   144   -923    -49    -67
  -923   -923   -923    191
  -923    -61     50     91
   203   -923   -923   -923
   144   -923     50   -923
   -56    171   -923   -923
   176   -923    -49   -923
   144    -61    -49   -923
    44   -923    -49     91
   -56    -61    109    -67
   -56   -923    109     33
  -923   -923   -923    191
   -56    171   -923   -923
  -923    171    -49   -923
   176   -923    -49   -923
   176   -923   -923    -67
  -923   -923    -49    165
  -923   -923     50    133
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 9.2e+002
 0.333333  0.666667  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.666667  0.000000  0.166667  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.333333  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.666667  0.166667  0.166667  0.000000
 0.333333  0.000000  0.166667  0.500000
 0.166667  0.166667  0.500000  0.166667
 0.166667  0.000000  0.500000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.333333  0.666667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CA]CAAT[TG]A[AG]CAA[TA]G[GT]TCCAAT[TG]
--------------------------------------------------------------------------------

Time 4.16 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32120                            1.02e-08  124_[+3(1.88e-11)]_166_\
    [+1(8.51e-06)]_173
37723                            2.08e-07  140_[+1(8.03e-07)]_176_\
    [+2(1.92e-08)]_149
48930                            2.12e-12  65_[+3(4.53e-05)]_24_[+2(1.07e-09)]_\
    100_[+1(1.56e-06)]_186_[+3(1.93e-08)]_48
54992                            1.48e-06  153_[+3(8.56e-05)]_208_\
    [+1(5.15e-10)]_102
50274                            2.75e-02  475_[+1(2.21e-05)]_9
16922                            2.62e-06  10_[+2(3.11e-07)]_145_\
    [+1(3.01e-07)]_310
26190                            3.65e-10  1_[+3(4.82e-08)]_114_[+1(6.04e-06)]_\
    247_[+2(2.70e-08)]_82
4216                             6.39e-06  71_[+2(2.70e-08)]_159_\
    [+1(7.44e-06)]_235
45928                            2.46e-11  68_[+2(8.30e-09)]_222_\
    [+3(1.59e-08)]_72_[+1(3.32e-06)]_82
35151                            6.63e-14  210_[+3(7.18e-09)]_94_\
    [+1(5.63e-06)]_80_[+2(7.90e-08)]_6_[+1(1.45e-09)]_38
43742                            1.36e-02  356_[+1(1.07e-06)]_128
45014                            4.62e-02  305_[+1(7.44e-06)]_179
38545                            8.38e-12  34_[+2(4.56e-07)]_218_\
    [+3(7.31e-11)]_160_[+1(4.17e-06)]_32
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************