********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/185/185.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1044                     1.0000    500  11490                    1.0000    500
23259                    1.0000    500  24686                    1.0000    500
3107                     1.0000    500  32093                    1.0000    500
33806                    1.0000    500  3454                     1.0000    500
34996                    1.0000    500  36775                    1.0000    500
4302                     1.0000    500  6299                     1.0000    500
7523                     1.0000    500  7810                     1.0000    500
8866                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/185/185.seqs.fa -oc motifs/185 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.254 C 0.234 G 0.253 T 0.259
Background letter frequencies (from dataset with add-one prior applied):
A 0.254 C 0.234 G 0.253 T 0.259
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  21  sites =  10  llr = 158  E-value = 3.9e-005
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1:73:8:2:51::93:313:
pos.-specific     C  6891:82854182912a47:9
probability       G  :::1:2:::6::4::1:::::
matrix            T  41117::23:4141:4:3271

         bits    2.1                 *
                 1.9                 *
                 1.7   *          *  *   *
                 1.5   *          ** *   *
Relative         1.3   *  ***     ** *   *
Entropy          1.0 *** **** * * ** *  **
(22.8 bits)      0.8 *** **** * * ** * ***
                 0.6 ************ ** * ***
                 0.4 *************** *****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCCATCACCGACGCATCCCTC
consensus            T   AGCTTCT T  A ATA
sequence                     A   C  C T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
33806                       121  1.90e-09 CGCCTCCACA CCCATCACCGTCGCAGCCATC ACCATGACCC
7523                        441  2.67e-09 AAGCTACACT TCCATCACCGACTCATCTCAT CACAAGCCCA
36775                        24  3.32e-09 CCGATACAAA TCCAACACCGCCTCATCCTTC ACCTTGGACG
34996                       313  5.04e-09 CGAACCCCTT CCCAACACTGAAGCACCACTC CACTGACCAT
3107                        243  8.29e-09 AGAGCAGCCG CCTATCACTCACTCAACCTTC GCTAGTACCA
7810                        464  1.21e-08 ACCGCTTCAA CCCGTCACACACCCAACACAC ATCTACCACC
1044                        412  4.06e-08 TGCGACGCTC CACAAGATCGTCGCATCTCTC ACGTGCGAAC
3454                        387  9.99e-08 ACTCTTCCGT TCCATGCCCCACCTACCTCTC TCTCTGAGAC
32093                       289  2.66e-07 AATCGTCCAT TTCTTCACAGTTGCAACACTC ACCAGTACCC
4302                        425  2.83e-07 CTGCCGACCT CCCCTCCTTCTCTCCTCCCAC GAGACAGACG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33806                             1.9e-09  120_[+1]_359
7523                              2.7e-09  440_[+1]_39
36775                             3.3e-09  23_[+1]_456
34996                               5e-09  312_[+1]_167
3107                              8.3e-09  242_[+1]_237
7810                              1.2e-08  463_[+1]_16
1044                              4.1e-08  411_[+1]_68
3454                                1e-07  386_[+1]_93
32093                             2.7e-07  288_[+1]_191
4302                              2.8e-07  424_[+1]_55
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=10
33806                    (  121) CCCATCACCGTCGCAGCCATC  1
7523                     (  441) TCCATCACCGACTCATCTCAT  1
36775                    (   24) TCCAACACCGCCTCATCCTTC  1
34996                    (  313) CCCAACACTGAAGCACCACTC  1
3107                     (  243) CCTATCACTCACTCAACCTTC  1
7810                     (  464) CCCGTCACACACCCAACACAC  1
1044                     (  412) CACAAGATCGTCGCATCTCTC  1
3454                     (  387) TCCATGCCCCACCTACCTCTC  1
32093                    (  289) TTCTTCACAGTTGCAACACTC  1
4302                     (  425) CCCCTCCTTCTCTCCTCCCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 9.74171 E= 3.9e-005
  -997    136   -997     63
  -134    177   -997   -137
  -997    194   -997   -137
   146   -122   -134   -137
    24   -997   -997    143
  -997    177    -34   -997
   165    -22   -997   -997
  -997    177   -997    -37
   -34    110   -997     21
  -997     77    124   -997
    98   -122   -997     63
  -134    177   -997   -137
  -997    -22     66     63
  -997    194   -997   -137
   182   -122   -997   -997
    24    -22   -134     63
  -997    210   -997   -997
    24     77   -997     21
  -134    158   -997    -37
    24   -997   -997    143
  -997    194   -997   -137
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 3.9e-005
 0.000000  0.600000  0.000000  0.400000
 0.100000  0.800000  0.000000  0.100000
 0.000000  0.900000  0.000000  0.100000
 0.700000  0.100000  0.100000  0.100000
 0.300000  0.000000  0.000000  0.700000
 0.000000  0.800000  0.200000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.500000  0.000000  0.300000
 0.000000  0.400000  0.600000  0.000000
 0.500000  0.100000  0.000000  0.400000
 0.100000  0.800000  0.000000  0.100000
 0.000000  0.200000  0.400000  0.400000
 0.000000  0.900000  0.000000  0.100000
 0.900000  0.100000  0.000000  0.000000
 0.300000  0.200000  0.100000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.300000  0.400000  0.000000  0.300000
 0.100000  0.700000  0.000000  0.200000
 0.300000  0.000000  0.000000  0.700000
 0.000000  0.900000  0.000000  0.100000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[CT]CCA[TA][CG][AC][CT][CTA][GC][AT]C[GTC]CA[TAC]C[CAT][CT][TA]C
--------------------------------------------------------------------------------




Time  2.19 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  15  sites =  11  llr = 133  E-value = 1.8e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  2:12:::1:5:12::
pos.-specific     C  ::::1::::22:4:1
probability       G  187:97:61189:59
matrix            T  7228:3a392::55:

         bits    2.1
                 1.9       *
                 1.7       *
                 1.5     * * *  *  *
Relative         1.3  * ** * * **  *
Entropy          1.0  * **** * ** **
(17.4 bits)      0.8 ******* * ** **
                 0.6 ********* ** **
                 0.4 ********* *****
                 0.2 ***************
                 0.0 ---------------

Multilevel           TGGTGGTGTAGGTGG
consensus                 T T    CT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
6299                        171  1.95e-08 CAGCCGTTGC TGGTGGTGTCGGCGG CATTGTCGTT
23259                       197  6.16e-08 GGCTCGCTCA TGGTGTTTTAGGTGG TGTGAGCGAG
36775                       391  1.24e-07 GGGGAGGACA TGGTGGTGTACGAGG TGTGGTGGAG
4302                        103  3.69e-07 GAGCCTCCGA TGTAGGTGTAGGCTG TGTTGCTGCT
34996                        25  5.34e-07 GGTGGGGAAT TGGTGGTGGCGGCGG GGGTGGAGAG
8866                        197  9.33e-07 TCGTTCAAGA ATGTGGTGTTGGTGG CTTTGGATTG
24686                       153  2.00e-06 TCGGTCGTCG TGAAGTTGTAGGTTG TGTAGCGTGG
7810                        100  2.74e-06 TTCGCTTGTC GTTTGGTGTAGGTTG GAAGTTCTAG
7523                        163  3.13e-06 TCCTATAGAA TGGTCTTTTTGGTGG TTGTAGAAGC
32093                       141  8.94e-06 TGATGGGAAG AGGTGGTTTGGGCTC AGGGTTGCGG
11490                       210  1.00e-05 CTTTGAAGGA TGGTGGTATACAATG TAGAGAGGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6299                                2e-08  170_[+2]_315
23259                             6.2e-08  196_[+2]_289
36775                             1.2e-07  390_[+2]_95
4302                              3.7e-07  102_[+2]_383
34996                             5.3e-07  24_[+2]_461
8866                              9.3e-07  196_[+2]_289
24686                               2e-06  152_[+2]_333
7810                              2.7e-06  99_[+2]_386
7523                              3.1e-06  162_[+2]_323
32093                             8.9e-06  140_[+2]_345
11490                               1e-05  209_[+2]_276
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=11
6299                     (  171) TGGTGGTGTCGGCGG  1
23259                    (  197) TGGTGTTTTAGGTGG  1
36775                    (  391) TGGTGGTGTACGAGG  1
4302                     (  103) TGTAGGTGTAGGCTG  1
34996                    (   25) TGGTGGTGGCGGCGG  1
8866                     (  197) ATGTGGTGTTGGTGG  1
24686                    (  153) TGAAGTTGTAGGTTG  1
7810                     (  100) GTTTGGTGTAGGTTG  1
7523                     (  163) TGGTCTTTTTGGTGG  1
32093                    (  141) AGGTGGTTTGGGCTC  1
11490                    (  210) TGGTGGTATACAATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7290 bayes= 9.72566 E= 1.8e+000
   -48  -1010   -148    149
 -1010  -1010    169    -51
  -148  -1010    152    -51
   -48  -1010  -1010    166
 -1010   -136    184  -1010
 -1010  -1010    152      8
 -1010  -1010  -1010    195
  -148  -1010    133      8
 -1010  -1010   -148    181
   110    -36   -148    -51
 -1010    -36    169  -1010
  -148  -1010    184  -1010
   -48     64  -1010     81
 -1010  -1010    110     81
 -1010   -136    184  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 11 E= 1.8e+000
 0.181818  0.000000  0.090909  0.727273
 0.000000  0.000000  0.818182  0.181818
 0.090909  0.000000  0.727273  0.181818
 0.181818  0.000000  0.000000  0.818182
 0.000000  0.090909  0.909091  0.000000
 0.000000  0.000000  0.727273  0.272727
 0.000000  0.000000  0.000000  1.000000
 0.090909  0.000000  0.636364  0.272727
 0.000000  0.000000  0.090909  0.909091
 0.545455  0.181818  0.090909  0.181818
 0.000000  0.181818  0.818182  0.000000
 0.090909  0.000000  0.909091  0.000000
 0.181818  0.363636  0.000000  0.454545
 0.000000  0.000000  0.545455  0.454545
 0.000000  0.090909  0.909091  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
TGGTG[GT]T[GT]TAGG[TC][GT]G
--------------------------------------------------------------------------------




Time  4.14 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  16  sites =   8  llr = 113  E-value = 5.8e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  4:19:945a635::3:
pos.-specific     C  :59:a155:38:a:3a
probability       G  :::::::::::1:4::
matrix            T  65:1::1::1:4:65:

         bits    2.1     *       *  *
                 1.9     *   *   *  *
                 1.7     *   *   *  *
                 1.5   ****  *   *  *
Relative         1.3   ****  * * *  *
Entropy          1.0 ****** ** * ** *
(20.4 bits)      0.8 ****** ** * ** *
                 0.6 ************** *
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCCACACAAACACTTC
consensus            AT    AC CAT GA
sequence                           C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
34996                       383  2.04e-09 TCCCTCTGAG TTCACACAAACTCTTC CAGCAACACA
33806                       340  2.21e-08 CCCCCCTCTC ACCACACCACCACTTC AATCGCTCAA
7523                         50  3.81e-08 TCCAATCGTC ATCACAACAACTCGTC GGCAGACTCA
23259                       485  1.15e-07 TCTTCACTTC TTCACACAAAAACGAC
1044                        361  2.20e-07 AGACGGAGCA ACCACAACATCACTCC TGGCGTTTGT
3454                        248  3.24e-07 GAAGGAGTGT TCAACAACAACACGCC ACTGATTGAG
36775                       457  6.58e-07 CGATGCGGGT TTCTCACAACCGCTTC CACTTCACCC
32093                       478  1.29e-06 GATACGCCCC TCCACCTAAAATCTAC AATCCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34996                               2e-09  382_[+3]_102
33806                             2.2e-08  339_[+3]_145
7523                              3.8e-08  49_[+3]_435
23259                             1.2e-07  484_[+3]
1044                              2.2e-07  360_[+3]_124
3454                              3.2e-07  247_[+3]_237
36775                             6.6e-07  456_[+3]_28
32093                             1.3e-06  477_[+3]_7
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
34996                    (  383) TTCACACAAACTCTTC  1
33806                    (  340) ACCACACCACCACTTC  1
7523                     (   50) ATCACAACAACTCGTC  1
23259                    (  485) TTCACACAAAAACGAC  1
1044                     (  361) ACCACAACATCACTCC  1
3454                     (  248) TCAACAACAACACGCC  1
36775                    (  457) TTCTCACAACCGCTTC  1
32093                    (  478) TCCACCTAAAATCTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 9.82714 E= 5.8e+000
    56   -965   -965    127
  -965    110   -965     95
  -102    190   -965   -965
   178   -965   -965   -105
  -965    210   -965   -965
   178    -90   -965   -965
    56    110   -965   -105
    98    110   -965   -965
   198   -965   -965   -965
   130     10   -965   -105
    -2    168   -965   -965
    98   -965   -102     53
  -965    210   -965   -965
  -965   -965     56    127
    -2     10   -965     95
  -965    210   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 5.8e+000
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.500000  0.000000  0.500000
 0.125000  0.875000  0.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.375000  0.500000  0.000000  0.125000
 0.500000  0.500000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.625000  0.250000  0.000000  0.125000
 0.250000  0.750000  0.000000  0.000000
 0.500000  0.000000  0.125000  0.375000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.375000  0.625000
 0.250000  0.250000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[TA][CT]CACA[CA][AC]A[AC][CA][AT]C[TG][TAC]C
--------------------------------------------------------------------------------




Time  6.00 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1044                             4.33e-07  360_[+3(2.20e-07)]_35_\
                                           [+1(4.06e-08)]_68
11490                            1.23e-02  209_[+2(1.00e-05)]_276
23259                            1.78e-07  196_[+2(6.16e-08)]_273_\
                                           [+3(1.15e-07)]
24686                            3.08e-03  152_[+2(2.00e-06)]_333
3107                             9.77e-05  242_[+1(8.29e-09)]_237
32093                            9.02e-08  140_[+2(8.94e-06)]_133_\
                                           [+1(2.66e-07)]_168_[+3(1.29e-06)]_7
33806                            3.25e-09  120_[+1(1.90e-09)]_198_\
                                           [+3(2.21e-08)]_145
3454                             9.59e-07  247_[+3(3.24e-07)]_123_\
                                           [+1(9.99e-08)]_93
34996                            4.03e-13  24_[+2(5.34e-07)]_273_\
                                           [+1(5.04e-09)]_49_[+3(2.04e-09)]_102
36775                            1.59e-11  23_[+1(3.32e-09)]_346_\
                                           [+2(1.24e-07)]_51_[+3(6.58e-07)]_28
4302                             2.16e-06  102_[+2(3.69e-07)]_307_\
                                           [+1(2.83e-07)]_55
6299                             2.73e-04  170_[+2(1.95e-08)]_30_\
                                           [+2(5.70e-05)]_270
7523                             1.84e-11  49_[+3(3.81e-08)]_97_[+2(3.13e-06)]_\
                                           263_[+1(2.67e-09)]_39
7810                             2.96e-07  99_[+2(2.74e-06)]_95_[+2(8.07e-05)]_\
                                           239_[+1(1.21e-08)]_16
8866                             6.09e-03  196_[+2(9.33e-07)]_289
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************