********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/492/492.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42836                    1.0000    500  51406                    1.0000    500
3832                     1.0000    500  36575                    1.0000    500
5405                     1.0000    500  43655                    1.0000    500
49400                    1.0000    500  50318                    1.0000    500
33610                    1.0000    500  45427                    1.0000    500
19979                    1.0000    500  46169                    1.0000    500
48305                    1.0000    500  46951                    1.0000    500
43741                    1.0000    500  43651                    1.0000    500
36598                    1.0000    500  30909                    1.0000    500
49267                    1.0000    500  44781                    1.0000    500
41008                    1.0000    500  45426                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/492/492.seqs.fa -oc motifs/492 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       22    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           11000    N=              22

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.264 C 0.249 G 0.230 T 0.257
Background letter frequencies (from dataset with add-one prior applied):
A 0.264 C 0.249 G 0.230 T 0.257
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  13  llr = 155  E-value = 7.1e-007
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :18:6:2:92::
pos.-specific     C  :9:a:::a::55
probability       G  ::2:1a::1824
matrix            T  a:::3:8:::42

         bits    2.1      *
                 1.9 *  * * *
                 1.7 ** * * *
                 1.5 ** * * ***
Relative         1.3 **** *****
Entropy          1.1 **** *****
(17.2 bits)      0.8 **** *****
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCACAGTCAGCC
consensus                T     TG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
49267                       318  1.82e-07 TACTCTCACG TCACAGTCAGTC AGCTAGAGTC
30909                       293  1.82e-07 GATAGTGACG TCACAGTCAGTC ACGAACATCC
43741                       288  1.82e-07 CCCTATTCCC TCACAGTCAGTC AACGATCGCG
46951                        92  1.82e-07 TGTTCGTTTT TCACAGTCAGTC CGACCCATCG
19979                       241  9.90e-07 CTCCAAAGTG TCGCAGTCAGTC GGATTTGGCA
43651                       187  1.28e-06 AGAGACTCAC TCACTGTCAGGG TGAGTTTACA
33610                       247  1.28e-06 CCATTCACAC TCACAGTCAACC CAGCAACTTA
50318                       335  1.28e-06 TATAACTGTA TCACTGTCAGCT TCCTACATTA
49400                       270  1.41e-06 TCACATTACA TCACAGTCAACG AGAACTCAAA
41008                        49  4.11e-06 GTTGGTTGCT TCACGGTCAGCT CTCGGACGGG
51406                       291  5.73e-06 GGAGTCGGTG TCACAGTCGGGG AGCGATGACG
44781                       191  7.60e-06 CGAGGGTTTT TCGCTGACAGCG AATCAGCTTT
42836                       356  1.30e-05 GCTCTCATCG TAACTGACAGCG AGGACAGCTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49267                             1.8e-07  317_[+1]_171
30909                             1.8e-07  292_[+1]_196
43741                             1.8e-07  287_[+1]_201
46951                             1.8e-07  91_[+1]_397
19979                             9.9e-07  240_[+1]_248
43651                             1.3e-06  186_[+1]_302
33610                             1.3e-06  246_[+1]_242
50318                             1.3e-06  334_[+1]_154
49400                             1.4e-06  269_[+1]_219
41008                             4.1e-06  48_[+1]_440
51406                             5.7e-06  290_[+1]_198
44781                             7.6e-06  190_[+1]_298
42836                             1.3e-05  355_[+1]_133
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=13
49267                    (  318) TCACAGTCAGTC  1
30909                    (  293) TCACAGTCAGTC  1
43741                    (  288) TCACAGTCAGTC  1
46951                    (   92) TCACAGTCAGTC  1
19979                    (  241) TCGCAGTCAGTC  1
43651                    (  187) TCACTGTCAGGG  1
33610                    (  247) TCACAGTCAACC  1
50318                    (  335) TCACTGTCAGCT  1
49400                    (  270) TCACAGTCAACG  1
41008                    (   49) TCACGGTCAGCT  1
51406                    (  291) TCACAGTCGGGG  1
44781                    (  191) TCGCTGACAGCG  1
42836                    (  356) TAACTGACAGCG  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10758 bayes= 10.222 E= 7.1e-007
 -1035  -1035  -1035    196
  -178    189  -1035  -1035
   168  -1035    -58  -1035
 -1035    201  -1035  -1035
   122  -1035   -158     26
 -1035  -1035    212  -1035
   -78  -1035  -1035    172
 -1035    201  -1035  -1035
   181  -1035   -158  -1035
   -78  -1035    188  -1035
 -1035     89    -58     58
 -1035     89     74    -74
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 7.1e-007
 0.000000  0.000000  0.000000  1.000000
 0.076923  0.923077  0.000000  0.000000
 0.846154  0.000000  0.153846  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.615385  0.000000  0.076923  0.307692
 0.000000  0.000000  1.000000  0.000000
 0.153846  0.000000  0.000000  0.846154
 0.000000  1.000000  0.000000  0.000000
 0.923077  0.000000  0.076923  0.000000
 0.153846  0.000000  0.846154  0.000000
 0.000000  0.461538  0.153846  0.384615
 0.000000  0.461538  0.384615  0.153846
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TCAC[AT]GTCAG[CT][CG]
--------------------------------------------------------------------------------

Time  4.05 secs.
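
The non-floor entries of the Motif 1 log-odds matrix appear to equal 100 * log2(p / background), rounded to the nearest integer, where p comes from the letter-probability matrix and the background from the "Background letter frequencies" line. The sketch below checks that relation on the first three matrix rows. It is an illustration under that assumption, not MEME's own code; the function name `log_odds_row` is hypothetical, and the background values are the rounded three-decimal figures printed in this file.

```python
import math

# Background frequencies as printed in this file (rounded to three decimals).
BACKGROUND = {"A": 0.264, "C": 0.249, "G": 0.230, "T": 0.257}

# First three rows of the Motif 1 letter-probability matrix (columns A, C, G, T).
MOTIF1_ROWS = [
    [0.000000, 0.000000, 0.000000, 1.000000],
    [0.076923, 0.923077, 0.000000, 0.000000],
    [0.846154, 0.000000, 0.153846, 0.000000],
]


def log_odds_row(probs, background=BACKGROUND):
    """Return round(100 * log2(p / bg)) per letter; None where p is zero.

    Assumption: zero probabilities correspond to the large negative floor
    (-1035 for Motif 1) printed in the file, which is not recomputed here.
    """
    scores = []
    for letter, p in zip("ACGT", probs):
        if p == 0.0:
            scores.append(None)
        else:
            scores.append(round(100 * math.log2(p / background[letter])))
    return scores


for row in MOTIF1_ROWS:
    print(log_odds_row(row))
# [None, None, None, 196]
# [-178, 189, None, None]
# [168, None, -58, None]
```

The printed values agree with the corresponding entries of the log-odds matrix (196; -178 and 189; 168 and -58), which supports the assumed relation for this file.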
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  12  llr = 134  E-value = 6.3e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::a:4:2:a32
pos.-specific     C  :3::a1:12:44
probability       G  ::9:::a28:3:
matrix            T  a81::5:6:::4

         bits    2.1       *
                 1.9 *  ** *  *
                 1.7 * *** *  *
                 1.5 * *** * **
Relative         1.3 * *** * **
Entropy          1.1 ***** * **
(16.2 bits)      0.8 ***** * **
                 0.6 ******* **
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTGACTGTGACC
consensus             C   A    AT
sequence                       G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
43651                       362  3.58e-07 CATGAAACGT TTGACTGTGAAC CAACGTTTTG
46169                        32  4.66e-07 TAGGAACGAT TTGACTGTGAGC GATACATTTC
19979                        68  4.66e-07 GTCGATTGGC TTGACTGTGAGT GGCTCAGTGG
49267                       163  7.02e-07 ACATCAAATA TTGACAGTGAGC TTCCTGTCAA
36598                       181  9.79e-07 GCCTGATGCA TTGACTGGGACT GCTTTTGCAA
30909                       324  1.46e-06 CGTCAGAGAA TTGACTGTGAAA GAACTGTAAA
45427                       413  6.64e-06 ATTGTATCTC TCGACAGGGAAT CTCTTGCTGT
41008                       104  7.94e-06 TTTGCCCGAG TTGACCGTGACA ATTGGAGTTT
33610                       149  1.05e-05 CGCAATCCCA TCGACAGCGACC AAGACGACAT
45426                       284  1.12e-05 CAACTAGCAG TTGACAGACACC AGTATAAAGG
48305                        46  1.12e-05 CGAGTGCGGG TTGACTGACAAT ATGCTCATAT
49400                        79  1.47e-05 GATTTACTGT TCTACAGTGACT GACGTGAGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43651                             3.6e-07  361_[+2]_127
46169                             4.7e-07  31_[+2]_457
19979                             4.7e-07  67_[+2]_421
49267                               7e-07  162_[+2]_326
36598                             9.8e-07  180_[+2]_308
30909                             1.5e-06  323_[+2]_165
45427                             6.6e-06  412_[+2]_76
41008                             7.9e-06  103_[+2]_385
33610                               1e-05  148_[+2]_340
45426                             1.1e-05  283_[+2]_205
48305                             1.1e-05  45_[+2]_443
49400                             1.5e-05  78_[+2]_410
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=12
43651                    (  362) TTGACTGTGAAC  1
46169                    (   32) TTGACTGTGAGC  1
19979                    (   68) TTGACTGTGAGT  1
49267                    (  163) TTGACAGTGAGC  1
36598                    (  181) TTGACTGGGACT  1
30909                    (  324) TTGACTGTGAAA  1
45427                    (  413) TCGACAGGGAAT  1
41008                    (  104) TTGACCGTGACA  1
33610                    (  149) TCGACAGCGACC  1
45426                    (  284) TTGACAGACACC  1
48305                    (   46) TTGACTGACAAT  1
49400                    (   79) TCTACAGTGACT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10758 bayes= 10.9069 E= 6.3e-001
 -1023  -1023  -1023    196
 -1023      1  -1023    154
 -1023  -1023    199   -162
   192  -1023  -1023  -1023
 -1023    201  -1023  -1023
    66   -158  -1023     96
 -1023  -1023    212  -1023
   -66   -158    -47    118
 -1023    -58    186  -1023
   192  -1023  -1023  -1023
    34     74     12  -1023
   -66     74  -1023     70
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 6.3e-001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  0.916667  0.083333
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.416667  0.083333  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.083333  0.166667  0.583333
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.416667  0.250000  0.000000
 0.166667  0.416667  0.000000  0.416667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[TC]GAC[TA]GTGA[CAG][CT]
--------------------------------------------------------------------------------

Time  8.10 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =  20  llr = 179  E-value = 2.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::252:1:a4:
pos.-specific     C  217339:19:55
probability       G  :::6::821:1:
matrix            T  8a3:2:271:26

         bits    2.1
                 1.9          *
                 1.7  *       *
                 1.5  *   *   *
Relative         1.3 **   ** **
Entropy          1.1 ***  ** ** *
(12.9 bits)      0.8 ***  ** ** *
                 0.6 **** ** ** *
                 0.4 ********** *
                 0.2 ************
                 0.0 ------------

Multilevel           TTCGACGTCACT
consensus            C TCC T   AC
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
3832                        432  6.18e-08 GCCTTGTGAG TTCGACGTCACT CCGTATTTGA
42836                       298  1.31e-06 CGTCTGTCTA TTCCACGTCAAT CAAACTGTCA
43741                       335  2.06e-06 GGGTTCAGAA TTCGACGGCACC TCACCTGAGA
30909                        46  3.70e-06 AACAGTTTAT TTCACCGTCACT CGTTGCAAAA
45426                        30  6.23e-06 TTTGTAGCAA TTCGCCGCCACT TACACGGTAC
49400                       410  7.06e-06 TCTGGGTAAT TTCATCGTCACT CTCTGCAATC
36598                       212  1.36e-05 AGGGCACTTC TTTGTCGTCATT GTCCACGTTT
46169                       224  1.36e-05 GCGACACATC CTCCCCGTCACC CGTGAGGCAA
43655                       106  1.36e-05 GACGGAAAAC TTCGACGTGACC ATAAGAAGGA
45427                       274  2.68e-05 GCGAGCCGTT CTTCACGTCAAT CCCCTTTGTA
48305                       424  2.96e-05 TTTTTGGACC CTCGACGACAAT CGCAATTTAC
46951                       212  3.85e-05 TTGGTAGGGA TTCGACTCCAAC GTGAGTTGAC
49267                       380  5.81e-05 ATGAAAAAAT TTCGTCTGCAAC CAAATCATCA
19979                       425  7.31e-05 TTTTGCATCC TTTCCCGTTACC TCTGTCTCCC
50318                       176  8.44e-05 ACACGCTCGA TTCCAATTCAAC AGTGTTACCG
5405                        387  8.44e-05 ACCAGCAGAA TTTGAAGTTACT CATCTGACGC
44781                        39  9.04e-05 GAGTAGCGAG TTTACCTTCAAC AACCCCAACA
33610                        58  1.19e-04 TCACTTACCC TTTGCAGTCAGT ACATACCTGG
36575                       137  1.27e-04 CGTTGTCGTT CTCGTCGACATT GTCTCGCGTG
51406                       428  1.63e-04 ACGACTCCGC TCCGACGGCATC GACGAATGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3832                              6.2e-08  431_[+3]_57
42836                             1.3e-06  297_[+3]_191
43741                             2.1e-06  334_[+3]_154
30909                             3.7e-06  45_[+3]_443
45426                             6.2e-06  29_[+3]_459
49400                             7.1e-06  409_[+3]_79
36598                             1.4e-05  211_[+3]_277
46169                             1.4e-05  223_[+3]_265
43655                             1.4e-05  105_[+3]_383
45427                             2.7e-05  273_[+3]_215
48305                               3e-05  423_[+3]_65
46951                             3.9e-05  211_[+3]_277
49267                             5.8e-05  379_[+3]_109
19979                             7.3e-05  424_[+3]_64
50318                             8.4e-05  175_[+3]_313
5405                              8.4e-05  386_[+3]_102
44781                               9e-05  38_[+3]_450
33610                             0.00012  57_[+3]_431
36575                             0.00013  136_[+3]_352
51406                             0.00016  427_[+3]_61
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=20
3832                     (  432) TTCGACGTCACT  1
42836                    (  298) TTCCACGTCAAT  1
43741                    (  335) TTCGACGGCACC  1
30909                    (   46) TTCACCGTCACT  1
45426                    (   30) TTCGCCGCCACT  1
49400                    (  410) TTCATCGTCACT  1
36598                    (  212) TTTGTCGTCATT  1
46169                    (  224) CTCCCCGTCACC  1
43655                    (  106) TTCGACGTGACC  1
45427                    (  274) CTTCACGTCAAT  1
48305                    (  424) CTCGACGACAAT  1
46951                    (  212) TTCGACTCCAAC  1
49267                    (  380) TTCGTCTGCAAC  1
19979                    (  425) TTTCCCGTTACC  1
50318                    (  176) TTCCAATTCAAC  1
5405                     (  387) TTTGAAGTTACT  1
44781                    (   39) TTTACCTTCAAC  1
33610                    (   58) TTTGCAGTCAGT  1
36575                    (  137) CTCGTCGACATT  1
51406                    (  428) TCCGACGGCATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10758 bayes= 9.32048 E= 2.4e+001
 -1097    -31  -1097    164
 -1097   -231  -1097    189
 -1097    149  -1097     22
   -81      1    138  -1097
    92     27  -1097    -36
   -81    177  -1097  -1097
 -1097  -1097    180    -36
  -140   -131    -62    134
 -1097    177   -220   -136
   192  -1097  -1097  -1097
    41     85   -220    -78
 -1097     85  -1097    110
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 20 E= 2.4e+001
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.050000  0.000000  0.950000
 0.000000  0.700000  0.000000  0.300000
 0.150000  0.250000  0.600000  0.000000
 0.500000  0.300000  0.000000  0.200000
 0.150000  0.850000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.100000  0.100000  0.150000  0.650000
 0.000000  0.850000  0.050000  0.100000
 1.000000  0.000000  0.000000  0.000000
 0.350000  0.450000  0.050000  0.150000
 0.000000  0.450000  0.000000  0.550000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TC]T[CT][GC][ACT]C[GT]TCA[CA][TC]
--------------------------------------------------------------------------------

Time 12.24 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42836                            1.97e-04  297_[+3(1.31e-06)]_46_\
    [+1(1.30e-05)]_133
51406                            2.73e-03  290_[+1(5.73e-06)]_198
3832                             7.07e-04  431_[+3(6.18e-08)]_57
36575                            2.88e-01  500
5405                             2.60e-01  386_[+3(8.44e-05)]_102
43655                            6.60e-02  105_[+3(1.36e-05)]_383
49400                            3.05e-06  78_[+2(1.47e-05)]_179_\
    [+1(1.41e-06)]_128_[+3(7.06e-06)]_79
50318                            1.20e-03  175_[+3(8.44e-05)]_147_\
    [+1(1.28e-06)]_154
33610                            2.49e-05  148_[+2(1.05e-05)]_86_\
    [+1(1.28e-06)]_242
45427                            1.10e-03  273_[+3(2.68e-05)]_127_\
    [+2(6.64e-06)]_76
19979                            8.05e-07  67_[+2(4.66e-07)]_161_\
    [+1(9.90e-07)]_44_[+2(9.58e-05)]_116_[+3(7.31e-05)]_64
46169                            3.84e-05  31_[+2(4.66e-07)]_180_\
    [+3(1.36e-05)]_265
48305                            7.82e-04  45_[+2(1.12e-05)]_366_\
    [+3(2.96e-05)]_65
46951                            2.00e-05  91_[+1(1.82e-07)]_108_\
    [+3(3.85e-05)]_277
43741                            9.59e-06  287_[+1(1.82e-07)]_35_\
    [+3(2.06e-06)]_154
43651                            1.55e-05  186_[+1(1.28e-06)]_163_\
    [+2(3.58e-07)]_127
36598                            1.01e-04  180_[+2(9.79e-07)]_19_\
    [+3(1.36e-05)]_277
30909                            3.28e-08  45_[+3(3.70e-06)]_235_\
    [+1(1.82e-07)]_19_[+2(1.46e-06)]_165
49267                            2.05e-07  162_[+2(7.02e-07)]_81_\
    [+2(7.69e-05)]_50_[+1(1.82e-07)]_50_[+3(5.81e-05)]_109
44781                            1.03e-03  38_[+3(9.04e-05)]_140_\
    [+1(7.60e-06)]_298
41008                            4.71e-04  48_[+1(4.11e-06)]_43_[+2(7.94e-06)]_\
    385
45426                            4.54e-04  29_[+3(6.23e-06)]_242_\
    [+2(1.12e-05)]_205
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
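
The block diagrams in the summary above encode each sequence as alternating gap lengths and motif occurrences, for example `297_[+3(1.31e-06)]_46_[+1(1.30e-05)]_133`. The following sketch shows one way such a string might be split into its parts and checked against the sequence length of 500 listed in the training set. The helper names `parse_diagram` and `total_length` are hypothetical, and the code assumes every motif has width 12, which holds for this run but not for MEME output in general.

```python
import re

MOTIF_WIDTH = 12  # assumption: all three motifs in this run have width 12

# Matches one motif token such as "[+3(1.31e-06)]" or "[+1]".
MOTIF_TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_diagram(diagram):
    """Split a block diagram into ("gap", n) and ("motif", strand, id, p) parts.

    Line-continuation backslashes and spaces are dropped before splitting.
    """
    cleaned = diagram.replace("\\", "").replace(" ", "")
    parts = []
    for token in cleaned.split("_"):
        if not token:
            continue
        match = MOTIF_TOKEN.fullmatch(token)
        if match:
            strand, motif_id, pvalue = match.groups()
            parts.append(
                ("motif", strand, int(motif_id), float(pvalue) if pvalue else None)
            )
        else:
            parts.append(("gap", int(token)))
    return parts


def total_length(parts, width=MOTIF_WIDTH):
    """Sum gap lengths plus one motif width per motif occurrence."""
    return sum(part[1] if part[0] == "gap" else width for part in parts)


row = "297_[+3(1.31e-06)]_46_\\ [+1(1.30e-05)]_133"  # sequence 42836
parts = parse_diagram(row)
print(parts)
print(total_length(parts))  # 500
```

For sequence 42836 the parts add up to 297 + 12 + 46 + 12 + 133 = 500, the length given for every sequence in the training set, which is a quick consistency check when reading these diagrams by hand.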