********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/30/30.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9554                     1.0000    500  32285                    1.0000    500
43214                    1.0000    500  12981                    1.0000    500
47247                    1.0000    500  28620                    1.0000    500
43493                    1.0000    500  51591                    1.0000    500
19103                    1.0000    500  45240                    1.0000    500
45548                    1.0000    500  12043                    1.0000    500
35556                    1.0000    500  35637                    1.0000    500
39080                    1.0000    500  32311                    1.0000    500
43474                    1.0000    500  35061                    1.0000    500
43537                    1.0000    500  45776                    1.0000    500
37402                    1.0000    500  37401                    1.0000    500
47363                    1.0000    500  40378                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/30/30.seqs.fa -oc motifs/30 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       24    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           12000    N=              24
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.237 G 0.226 T 0.264
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.237 G 0.226 T 0.263
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =   12  sites =  24  llr = 209  E-value = 4.7e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  15:2:614::18
pos.-specific     C  7:8:2:8:1:7:
probability       G  222:4:::9:1:
matrix            T  :3:84416:a12

         bits    2.1
                 1.9          *
                 1.7          *
                 1.5         **
Relative         1.3   **    ** *
Entropy          1.1   **  * ** *
(12.5 bits)      0.9 * ** ***** *
                 0.6 * ** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CACTGACTGTCA
consensus             T  TT A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
32311                       339  1.88e-07 AATCAATCGA CTCTGACTGTCA GTAATAGCAC
43493                       190  5.72e-07 ATCAAGGTAT CTCTGTCTGTCA TGTACGGGTT
51591                       176  3.42e-06 CATTCGCAGT CACTCTCAGTCA ACCTGACGAT
43474                       240  3.88e-06 ACTCGGACCT CACTTACTGTTA CATTACCTAC
47363                         8  6.23e-06    CTCGAAA CTCTGACTCTCA AGGTTCAGCA
47247                       313  7.18e-06 TACTGTTAGT CGCTCACAGTCA ATCAGCAGCG
35556                       264  8.16e-06 GATAGTCTTT CACTGTCAGTCT GCCTGTAAAC
12043                        46  1.29e-05 TGGCTGGGCT GACTCACAGTCA TGTCACATCT
28620                       237  1.58e-05 AACGCCAATT CACACTCTGTCA AGGATTCTGG
35061                       262  1.73e-05 GGTGTCAAGA GACTTACTGTCT ACCCGTAAAT
45548                       146  1.73e-05 GATGGTTTTA CTCTGACAGTAA TCTCGAGACT
12981                       415  2.80e-05 TACAAATCGA CGGTTTCAGTCA CAACTCAGCG
40378                       273  3.43e-05 CTATCGGAGA ATCTGACTGTGA ACCAACACAT
45240                        31  3.43e-05 ATGGTGCCTT GTCTTACTGTTA TATTGTTGGA
43214                        76  3.43e-05 CGTCGAGAAC ATCTGACTGTGA ATATGTTGAC
45776                       320  3.79e-05 TCAGTCAGTA CACTTTCTGTGT TAGATAGGAT
35637                       464  6.08e-05 TGCTGCATGG CAGTATCTGTCA ACCACCATTA
32285                       206  8.55e-05 ACTATGATCA CACTGAAAGTAA AAGAACATTC
37401                       172  1.15e-04 TTGGGGGGTG CGGTTAAAGTCA AAGGCGCCGT
9554                        316  1.24e-04 GTCCGTGCAG CTTTGACTGTTA CCGCTACCGC
43537                       442  1.53e-04 CCTCACTGCA AACAGACTCTCA ACTACCTGTG
39080                       188  1.64e-04 CCAACAAACA CGCATTTTGTCA TAAGTAGCAA
19103                       146  3.97e-04 GAAAGCACGA CACATTAACTCA GAACTCTTTC
37402                        22  4.19e-04 TGATGCTCTT GAGTTATTGTCT CAATTTCTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32311                             1.9e-07  338_[+1]_150
43493                             5.7e-07  189_[+1]_299
51591                             3.4e-06  175_[+1]_313
43474                             3.9e-06  239_[+1]_249
47363                             6.2e-06  7_[+1]_481
47247                             7.2e-06  312_[+1]_176
35556                             8.2e-06  263_[+1]_225
12043                             1.3e-05  45_[+1]_443
28620                             1.6e-05  236_[+1]_252
35061                             1.7e-05  261_[+1]_227
45548                             1.7e-05  145_[+1]_343
12981                             2.8e-05  414_[+1]_74
40378                             3.4e-05  272_[+1]_216
45240                             3.4e-05  30_[+1]_458
43214                             3.4e-05  75_[+1]_413
45776                             3.8e-05  319_[+1]_169
35637                             6.1e-05  463_[+1]_25
32285                             8.5e-05  205_[+1]_283
37401                             0.00012  171_[+1]_317
9554                              0.00012  315_[+1]_173
43537                             0.00015  441_[+1]_47
39080                             0.00016  187_[+1]_301
19103                              0.0004  145_[+1]_343
37402                             0.00042  21_[+1]_467
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=24
32311                    (  339) CTCTGACTGTCA  1
43493                    (  190) CTCTGTCTGTCA  1
51591                    (  176) CACTCTCAGTCA  1
43474                    (  240) CACTTACTGTTA  1
47363                    (    8) CTCTGACTCTCA  1
47247                    (  313) CGCTCACAGTCA  1
35556                    (  264) CACTGTCAGTCT  1
12043                    (   46) GACTCACAGTCA  1
28620                    (  237) CACACTCTGTCA  1
35061                    (  262) GACTTACTGTCT  1
45548                    (  146) CTCTGACAGTAA  1
12981                    (  415) CGGTTTCAGTCA  1
40378                    (  273) ATCTGACTGTGA  1
45240                    (   31) GTCTTACTGTTA  1
43214                    (   76) ATCTGACTGTGA  1
45776                    (  320) CACTTTCTGTGT  1
35637                    (  464) CAGTATCTGTCA  1
32285                    (  206) CACTGAAAGTAA  1
37401                    (  172) CGGTTAAAGTCA  1
9554                     (  316) CTTTGACTGTTA  1
43537                    (  442) AACAGACTCTCA  1
39080                    (  188) CGCATTTTGTCA  1
19103                    (  146) CACATTAACTCA  1
37402                    (   22) GAGTTATTGTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 11736 bayes= 9.37898 E= 4.7e-002
  -113    158    -44  -1123
    87  -1123    -44     34
 -1123    174    -44   -266
   -71  -1123  -1123    166
  -271    -51     88     51
   119  -1123  -1123     51
  -113    174  -1123   -166
    45  -1123  -1123    125
 -1123    -92    195  -1123
 -1123  -1123  -1123    192
  -171    149    -85   -108
   161  -1123  -1123    -66
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 24 E= 4.7e-002
 0.125000  0.708333  0.166667  0.000000
 0.500000  0.000000  0.166667  0.333333
 0.000000  0.791667  0.166667  0.041667
 0.166667  0.000000  0.000000  0.833333
 0.041667  0.166667  0.416667  0.375000
 0.625000  0.000000  0.000000  0.375000
 0.125000  0.791667  0.000000  0.083333
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.125000  0.875000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.083333  0.666667  0.125000  0.125000
 0.833333  0.000000  0.000000  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[AT]CT[GT][AT]C[TA]GTCA
--------------------------------------------------------------------------------

Time  5.93 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =   20  sites =  10  llr = 145  E-value = 3.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::323:12:::3513::
pos.-specific     C  :::7:523211::332267:
probability       G  ::23:21:1:::17135:31
matrix            T  aa8:a:54787a9:3:21:9

         bits    2.1
                 1.9 **  *      *
                 1.7 **  *      *
                 1.5 **  *      **      *
Relative         1.3 *****      ***    **
Entropy          1.1 *****    * ***    **
(20.9 bits)      0.9 *****   ******    **
                 0.6 ******  ******   ***
                 0.4 ****** ******* * ***
                 0.2 ************** *****
                 0.0 --------------------

Multilevel           TTTCTCTTTTTTTGAAGCCT
consensus              GG AAAC A  CCGCAG
sequence                  GCC      TCT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            --------------------
47247                       186  4.43e-10 GGCCCATTTG TTTCTCTATTTTTGCGTCCT CAGCTGAGGA
32285                       118  1.12e-09 CGCCGCCCTT TTTCTCTTTTTTTCTATCCT TGCGGAAGTT
47363                       242  4.82e-08 ATAATGTCAA TTTCTATTTTTTGGAAACCT ATCGGTGCCT
43474                       423  4.82e-08 AATCCCCTAC TTTCTGTCTTTTTGCCCTCT AGACTTTACC
40378                       371  1.42e-07 TATCGTACGT TTGCTCATCTATTGAAGACT GATTTTAACA
9554                        247  1.67e-07 ATTCAGACTC TTTGTAGCTTCTTGCAGCGT ACATCCCTGT
35061                       205  1.96e-07 CGCCACGACT TTTGTCACGTTTTGGGGACT AACTGTAACA
45776                       183  2.28e-07 TCTGTTTTTT TTGGTACTTTTTTCTCGACT GCTCTAGTTT
43214                       407  3.72e-07 TCTATCACTT TTTCTGCATAATTGTGGCGT CTTCATCCTA
12043                       374  1.18e-06 GTACTCGTCT TTTCTCTACCTTTCAACCGG CTGATCCGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47247                             4.4e-10  185_[+2]_295
32285                             1.1e-09  117_[+2]_363
47363                             4.8e-08  241_[+2]_239
43474                             4.8e-08  422_[+2]_58
40378                             1.4e-07  370_[+2]_110
9554                              1.7e-07  246_[+2]_234
35061                               2e-07  204_[+2]_276
45776                             2.3e-07  182_[+2]_298
43214                             3.7e-07  406_[+2]_74
12043                             1.2e-06  373_[+2]_107
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=10
47247                    (  186) TTTCTCTATTTTTGCGTCCT  1
32285                    (  118) TTTCTCTTTTTTTCTATCCT  1
47363                    (  242) TTTCTATTTTTTGGAAACCT  1
43474                    (  423) TTTCTGTCTTTTTGCCCTCT  1
40378                    (  371) TTGCTCATCTATTGAAGACT  1
9554                     (  247) TTTGTAGCTTCTTGCAGCGT  1
35061                    (  205) TTTGTCACGTTTTGGGGACT  1
45776                    (  183) TTGGTACTTTTTTCTCGACT  1
43214                    (  407) TTTCTGCATAATTGTGGCGT  1
12043                    (  374) TTTCTCTACCTTTCAACCGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 11544 bayes= 10.4234 E= 3.9e+000
  -997   -997   -997    192
  -997   -997   -997    192
  -997   -997    -17    160
  -997    156     41   -997
  -997   -997   -997    192
    13    108    -17   -997
   -45    -25   -117     92
    13     34   -997     60
  -997    -25   -117    141
  -145   -124   -997    160
   -45   -124   -997    141
  -997   -997   -997    192
  -997   -997   -117    177
  -997     34    163   -997
    13     34   -117     19
    87    -25     41   -997
  -145    -25    115    -40
    13    134   -997   -140
  -997    156     41   -997
  -997   -997   -117    177
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 10 E= 3.9e+000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.700000  0.300000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.300000  0.500000  0.200000  0.000000
 0.200000  0.200000  0.100000  0.500000
 0.300000  0.300000  0.000000  0.400000
 0.000000  0.200000  0.100000  0.700000
 0.100000  0.100000  0.000000  0.800000
 0.200000  0.100000  0.000000  0.700000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.300000  0.700000  0.000000
 0.300000  0.300000  0.100000  0.300000
 0.500000  0.200000  0.300000  0.000000
 0.100000  0.200000  0.500000  0.200000
 0.300000  0.600000  0.000000  0.100000
 0.000000  0.700000  0.300000  0.000000
 0.000000  0.000000  0.100000  0.900000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TT[TG][CG]T[CAG][TAC][TAC][TC]T[TA]TT[GC][ACT][AGC][GCT][CA][CG]T
--------------------------------------------------------------------------------

Time 10.79 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =   20  sites =   6  llr = 108  E-value = 1.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :257::a5::3a::838::7
pos.-specific     C  :3322::25:3:5::228:3
probability       G  a5:28a::5a3:5a22:2a:
matrix            T  ::2::::3:::::::3::::

         bits    2.1 *    *   *   *    *
                 1.9 *    **  * * *    *
                 1.7 *    **  * * *    *
                 1.5 *   ***  * * *   **
Relative         1.3 *   ***  * * ** ***
Entropy          1.1 *   *** ** **** ****
(25.9 bits)      0.9 *   *** ** **** ****
                 0.6 ** **** ** **** ****
                 0.4 *************** ****
                 0.2 *************** ****
                 0.0 --------------------

Multilevel           GGAAGGAACGAACGAAACGA
consensus             CC    TG C G  T   C
sequence                       G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            --------------------
32311                       122  1.13e-10 TTATCCGTAG GGAAGGAACGAAGGAAACGC ACGATGCGGA
43537                       222  5.37e-10 TAGGTAGGTA GGTAGGATGGGAGGATACGA GGTACGTACA
45548                       355  1.76e-09 GAGAAAACGG GGACGGATGGCAGGACACGA AACAGAGCTG
43474                        84  4.92e-09 CGTTCCGAAC GAAACGAACGAACGAAACGA ACAAGATCCG
12043                       401  6.80e-09 CGGCTGATCC GCCAGGAACGCACGGGACGC CCGTCTTTGC
43493                       380  3.43e-08 GTTTGCGTAC GCCGGGACGGGACGATCGGA AATTTTGGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32311                             1.1e-10  121_[+3]_359
43537                             5.4e-10  221_[+3]_259
45548                             1.8e-09  354_[+3]_126
43474                             4.9e-09  83_[+3]_397
12043                             6.8e-09  400_[+3]_80
43493                             3.4e-08  379_[+3]_101
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=6
32311                    (  122) GGAAGGAACGAAGGAAACGC  1
43537                    (  222) GGTAGGATGGGAGGATACGA  1
45548                    (  355) GGACGGATGGCAGGACACGA  1
43474                    (   84) GAAACGAACGAACGAAACGA  1
12043                    (  401) GCCAGGAACGCACGGGACGC  1
43493                    (  380) GCCGGGACGGGACGATCGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 11544 bayes= 11.3568 E= 1.5e+001
  -923   -923    215   -923
   -71     49    115   -923
    87     49   -923    -66
   128    -51    -44   -923
  -923    -51    188   -923
  -923   -923    215   -923
   187   -923   -923   -923
    87    -51   -923     34
  -923    107    115   -923
  -923   -923    215   -923
    28     49     56   -923
   187   -923   -923   -923
  -923    107    115   -923
  -923   -923    215   -923
   161   -923    -44   -923
    28    -51    -44     34
   161    -51   -923   -923
  -923    181    -44   -923
  -923   -923    215   -923
   128     49   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 1.5e+001
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.333333  0.500000  0.000000
 0.500000  0.333333  0.000000  0.166667
 0.666667  0.166667  0.166667  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.166667  0.000000  0.333333
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.333333  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.333333  0.166667  0.166667  0.333333
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[GC][AC]AGGA[AT][CG]G[ACG]A[CG]GA[AT]ACG[AC]
--------------------------------------------------------------------------------

Time 15.60 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9554                             1.04e-04  246_[+2(1.67e-07)]_234
32285                            1.03e-06  117_[+2(1.12e-09)]_68_\
    [+1(8.55e-05)]_283
43214                            1.89e-04  75_[+1(3.43e-05)]_319_\
    [+2(3.72e-07)]_74
12981                            5.28e-02  414_[+1(2.80e-05)]_74
47247                            1.29e-07  185_[+2(4.43e-10)]_107_\
    [+1(7.18e-06)]_134_[+1(1.73e-05)]_30
28620                            2.86e-03  236_[+1(1.58e-05)]_252
43493                            2.10e-07  189_[+1(5.72e-07)]_178_\
    [+3(3.43e-08)]_101
51591                            1.51e-02  175_[+1(3.42e-06)]_313
19103                            1.78e-01  500
45240                            1.18e-01  30_[+1(3.43e-05)]_458
45548                            1.11e-06  47_[+1(4.16e-05)]_86_[+1(1.73e-05)]_\
    197_[+3(1.76e-09)]_126
12043                            4.00e-09  45_[+1(1.29e-05)]_316_\
    [+2(1.18e-06)]_7_[+3(6.80e-09)]_80
35556                            4.87e-02  263_[+1(8.16e-06)]_225
35637                            6.92e-02  463_[+1(6.08e-05)]_25
39080                            4.61e-01  500
32311                            1.56e-09  121_[+3(1.13e-10)]_197_\
    [+1(1.88e-07)]_54_[+1(1.45e-05)]_84
43474                            4.97e-11  83_[+3(4.92e-09)]_136_\
    [+1(3.88e-06)]_171_[+2(4.82e-08)]_58
35061                            6.91e-05  204_[+2(1.96e-07)]_37_\
    [+1(1.73e-05)]_227
43537                            3.18e-06  221_[+3(5.37e-10)]_259
45776                            1.65e-04  182_[+2(2.28e-07)]_117_\
    [+1(3.79e-05)]_169
37402                            5.72e-01  500
37401                            2.26e-01  500
47363                            4.11e-06  7_[+1(6.23e-06)]_222_[+2(4.82e-08)]_\
    239
40378                            1.03e-04  272_[+1(3.43e-05)]_86_\
    [+2(1.42e-07)]_110
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************