********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/369/369.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47253                    1.0000    500  6076                     1.0000    500
47583                    1.0000    500  22166                    1.0000    500
14867                    1.0000    500  48175                    1.0000    500
38884                    1.0000    500  15641                    1.0000    500
49186                    1.0000    500  8181                     1.0000    500
40282                    1.0000    500  49636                    1.0000    500
40521                    1.0000    500  16455                    1.0000    500
50155                    1.0000    500  7801                     1.0000    500
55200                    1.0000    500  33339                    1.0000    500
49588                    1.0000    500  48933                    1.0000    500
38018                    1.0000    500  49075                    1.0000    500
38426                    1.0000    500  37545                    1.0000    500
48041                    1.0000    500  41267                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/369/369.seqs.fa -oc motifs/369 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 26 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 13000 N= 26 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.273 C 0.232 G 0.233 T 0.263 Background letter frequencies (from dataset with add-one prior applied): A 0.273 C 0.232 G 0.233 T 0.263 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 12 sites = 24 llr = 213 E-value = 3.3e-003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 2:2a:2::3953 pos.-specific C 4:2:92::2::2 probability G ::6:::9:5:41 matrix T 4a::161a::15 bits 2.1 1.9 * * * 1.7 * ** * 1.5 * ** ** * Relative 1.3 * ** ** * Entropy 1.1 * ** ** * (12.8 bits) 0.8 * ** ** * 0.6 * ** ** ** 0.4 ********** 0.2 ************ 0.0 ------------ Multilevel CTGACTGTGAAT consensus T C A A GA sequence A C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 49075 202 1.41e-07 GCCCTTCTCG TTGACTGTGAAT GTAAGAGCGA 48933 229 4.09e-07 TTTCAAAGTG CTGACTGTGAAA GGCCTCTTTT 8181 1 6.31e-07 . 
TTGACTGTGAAA TCAACCCTCC 6076 23 1.44e-06 TGGGTGTTGG CTGACAGTGAAT AATGTAAAGC 41267 303 1.89e-06 TTCCTGTTAT TTGACCGTGAAT AGGTATTCCG 38018 341 5.22e-06 TGTACATGTT CTGACTGTGATT CCATCACAAT 47583 412 5.22e-06 GAAGCTTGAG TTCACTGTAAAT CCATCCGTAA 40521 85 8.20e-06 CGGGAAGGAA ATGACCGTGAGT CCGGAATGGG 49186 458 8.20e-06 TCCTCATTGT TTGACAGTGAAC CCGTGTAATC 49636 128 9.12e-06 TTTACGCTCA CTGACTTTGAGT ATGCATCCTA 33339 410 1.21e-05 TCTGCATGTA TTAACTGTAAGT GGCACGCAAA 49588 482 1.92e-05 TAGACTTTTC TTGACTGTTAGT TTACGTC 37545 338 2.15e-05 TGTGACATGT CTGACTTTGAAC AGAGCGCGCT 55200 91 4.31e-05 ACACGTTTTA CTCACTGTCAGG GTCATTTGGC 38884 259 5.64e-05 CAGAGCATGC ATAACAGTCAAT CCATGCGCAT 50155 402 7.03e-05 TCCTTTAACA ATGACTTTCAAA TGACATTTCC 7801 52 8.72e-05 TTGCGTAAAC CTAACAGTAAGC GCTGCGGGCA 48041 101 9.43e-05 CAGGACGGAC CTCACTGTGGAA TGGTTACACA 22166 198 1.26e-04 AACTTGTCCT CTAACGGTAAGT CATTTTGGTG 14867 204 1.34e-04 GTCAATTATG ATGATTGTAAGA GCGATAATTG 47253 310 1.94e-04 TCCTTTTTGC TTCATCGTAAAT ACTTTCTCAT 15641 263 2.58e-04 CCGCAATTCA TTTACCGTCAGC GCCTACGAAC 40282 180 2.86e-04 TAGTCATCTC ATCACTGTCTAA TGAGAGTTTC 16455 409 3.88e-04 CTCTTTGCGG GTGACAGTGATG ACGACGGCAC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 49075 1.4e-07 201_[+1]_287 48933 4.1e-07 228_[+1]_260 8181 6.3e-07 [+1]_488 6076 1.4e-06 22_[+1]_466 41267 1.9e-06 302_[+1]_186 38018 5.2e-06 340_[+1]_148 47583 5.2e-06 411_[+1]_77 40521 8.2e-06 84_[+1]_404 49186 8.2e-06 457_[+1]_31 49636 9.1e-06 127_[+1]_361 33339 1.2e-05 409_[+1]_79 49588 1.9e-05 481_[+1]_7 37545 2.1e-05 337_[+1]_151 55200 4.3e-05 90_[+1]_398 38884 5.6e-05 258_[+1]_230 50155 7e-05 401_[+1]_87 7801 8.7e-05 51_[+1]_437 48041 9.4e-05 100_[+1]_388 22166 0.00013 197_[+1]_291 
14867 0.00013 203_[+1]_285 47253 0.00019 309_[+1]_179 15641 0.00026 262_[+1]_226 40282 0.00029 179_[+1]_309 16455 0.00039 408_[+1]_80 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=12 seqs=24 49075 ( 202) TTGACTGTGAAT 1 48933 ( 229) CTGACTGTGAAA 1 8181 ( 1) TTGACTGTGAAA 1 6076 ( 23) CTGACAGTGAAT 1 41267 ( 303) TTGACCGTGAAT 1 38018 ( 341) CTGACTGTGATT 1 47583 ( 412) TTCACTGTAAAT 1 40521 ( 85) ATGACCGTGAGT 1 49186 ( 458) TTGACAGTGAAC 1 49636 ( 128) CTGACTTTGAGT 1 33339 ( 410) TTAACTGTAAGT 1 49588 ( 482) TTGACTGTTAGT 1 37545 ( 338) CTGACTTTGAAC 1 55200 ( 91) CTCACTGTCAGG 1 38884 ( 259) ATAACAGTCAAT 1 50155 ( 402) ATGACTTTCAAA 1 7801 ( 52) CTAACAGTAAGC 1 48041 ( 101) CTCACTGTGGAA 1 22166 ( 198) CTAACGGTAAGT 1 14867 ( 204) ATGATTGTAAGA 1 47253 ( 310) TTCATCGTAAAT 1 15641 ( 263) TTTACCGTCAGC 1 40282 ( 180) ATCACTGTCTAA 1 16455 ( 409) GTGACAGTGATG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 12714 bayes= 9.49463 E= 3.3e-003 -39 69 -248 51 -1123 -1123 -1123 193 -71 -15 133 -265 187 -1123 -1123 -1123 -1123 198 -1123 -166 -39 -47 -248 115 -1123 -1123 191 -107 -1123 -1123 -1123 193 -13 -15 110 -265 175 -1123 -248 -265 99 -1123 69 -166 -13 -47 -148 93 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: 
alength= 4 w= 12 nsites= 24 E= 3.3e-003 0.208333 0.375000 0.041667 0.375000 0.000000 0.000000 0.000000 1.000000 0.166667 0.208333 0.583333 0.041667 1.000000 0.000000 0.000000 0.000000 0.000000 0.916667 0.000000 0.083333 0.208333 0.166667 0.041667 0.583333 0.000000 0.000000 0.875000 0.125000 0.000000 0.000000 0.000000 1.000000 0.250000 0.208333 0.500000 0.041667 0.916667 0.000000 0.041667 0.041667 0.541667 0.000000 0.375000 0.083333 0.250000 0.166667 0.083333 0.500000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [CTA]T[GC]AC[TA]GT[GAC]A[AG][TA] -------------------------------------------------------------------------------- Time 5.43 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 19 sites = 7 llr = 114 E-value = 6.1e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :::3::::::1:::::a37 pos.-specific C :71:379::74::149:61 probability G :19::117131:::11::: matrix T a1:771:39:3aa94::11 bits 2.1 1.9 * ** * 1.7 * ** * 1.5 * * * ** ** Relative 1.3 * * **** *** ** Entropy 1.1 * *** **** *** ** (23.6 bits) 0.8 ********** *** ** * 0.6 ********** ******** 0.4 ********** ******** 0.2 ******************* 0.0 ------------------- Multilevel TCGTTCCGTCCTTTCCACA consensus AC T GT T A sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value 
-------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------- 48933 207 3.25e-11 CGCCACTACC TCGTTCCGTCCTTTTCAAA GTGCTGACTG 16455 16 1.63e-08 TCCTCTTGGA TCGTTTCGGCCTTTCCACT GCCCATTGCG 38426 84 1.77e-08 TTCCGGTAAA TCCATCCGTGATTTTCACA TAGACGAGGG 7801 426 1.77e-08 GACGATGGGC TTGTCCCTTCTTTTTCAAA ACCAGAACCC 14867 94 3.31e-08 TGATGCTTTT TCGTTGCTTCTTTTGCACC ATGGATTCTG 49588 118 3.57e-08 AATTGGTGTG TGGTTCGGTGCTTTCCATA GAGCTGGGAG 40521 121 5.78e-08 AGAGACGGCG TCGACCCGTCGTTCCGACA ATAAAAACCC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 48933 3.2e-11 206_[+2]_275 16455 1.6e-08 15_[+2]_466 38426 1.8e-08 83_[+2]_398 7801 1.8e-08 425_[+2]_56 14867 3.3e-08 93_[+2]_388 49588 3.6e-08 117_[+2]_364 40521 5.8e-08 120_[+2]_361 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=19 seqs=7 48933 ( 207) TCGTTCCGTCCTTTTCAAA 1 16455 ( 16) TCGTTTCGGCCTTTCCACT 1 38426 ( 84) TCCATCCGTGATTTTCACA 1 7801 ( 426) TTGTCCCTTCTTTTTCAAA 1 14867 ( 94) TCGTTGCTTCTTTTGCACC 1 49588 ( 118) TGGTTCGGTGCTTTCCATA 1 40521 ( 121) TCGACCCGTCGTTCCGACA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 19 n= 12532 bayes= 
10.649 E= 6.1e+001 -945 -945 -945 193 -945 162 -70 -88 -945 -70 188 -945 7 -945 -945 144 -945 30 -945 144 -945 162 -70 -88 -945 189 -70 -945 -945 -945 162 12 -945 -945 -70 170 -945 162 30 -945 -93 89 -70 12 -945 -945 -945 193 -945 -945 -945 193 -945 -70 -945 170 -945 89 -70 71 -945 189 -70 -945 187 -945 -945 -945 7 130 -945 -88 139 -70 -945 -88 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 19 nsites= 7 E= 6.1e+001 0.000000 0.000000 0.000000 1.000000 0.000000 0.714286 0.142857 0.142857 0.000000 0.142857 0.857143 0.000000 0.285714 0.000000 0.000000 0.714286 0.000000 0.285714 0.000000 0.714286 0.000000 0.714286 0.142857 0.142857 0.000000 0.857143 0.142857 0.000000 0.000000 0.000000 0.714286 0.285714 0.000000 0.000000 0.142857 0.857143 0.000000 0.714286 0.285714 0.000000 0.142857 0.428571 0.142857 0.285714 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.142857 0.000000 0.857143 0.000000 0.428571 0.142857 0.428571 0.000000 0.857143 0.142857 0.000000 1.000000 0.000000 0.000000 0.000000 0.285714 0.571429 0.000000 0.142857 0.714286 0.142857 0.000000 0.142857 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- TCG[TA][TC]CC[GT]T[CG][CT]TTT[CT]CA[CA]A -------------------------------------------------------------------------------- Time 10.70 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 10 llr = 133 E-value = 4.2e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::4::11:::::3:1: pos.-specific C 6:62316::3a::1:a probability G 3::1:52a12:3:3:: matrix T 1a:7731:95:7769: bits 2.1 * * * 1.9 * * * * 1.7 * * * * 1.5 * ** * ** Relative 1.3 * ** * ** Entropy 1.1 ** * ** *** ** (19.2 bits) 0.8 ***** ** *** ** 0.6 ***** ********* 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel CTCTTGCGTTCTTTTC consensus G ACCTG C GAG sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 14867 298 3.30e-09 CGGACCGTTT CTCTTGCGTCCGTTTC TTCGTCTGTT 47253 97 1.02e-07 ACCGACAATA CTATCCCGTTCTTTTC TTCCACAGTT 48933 287 2.23e-07 CATCTTTTCA GTCTTGTGTCCTTGTC AACGACGTAT 48041 413 2.47e-07 CATCCGTAGA CTCTTTCGGGCTTTTC CCACGTAATA 49075 151 2.47e-07 CAATATCTAC CTACCTCGTTCTTGTC TGCTCGACCA 38884 461 3.27e-07 AATTATACCT GTATTGCGTCCTTTAC CTTTGGGAGG 22166 122 9.04e-07 TACAGGAGCT TTCTTAGGTTCTTTTC TGCCAGTGCA 49636 44 1.12e-06 GGTGCAGGAG GTCCCGGGTTCGATTC CCGGTTTGGA 49588 7 1.21e-06 AGTTAG CTATTTCGTGCGACTC CTCCAGTGGG 7801 256 1.21e-06 GCAATCAACC CTCGTGAGTTCTAGTC GATGAAAGTC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 
block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 14867 3.3e-09 297_[+3]_187 47253 1e-07 96_[+3]_388 48933 2.2e-07 286_[+3]_198 48041 2.5e-07 412_[+3]_72 49075 2.5e-07 150_[+3]_334 38884 3.3e-07 460_[+3]_24 22166 9e-07 121_[+3]_363 49636 1.1e-06 43_[+3]_441 49588 1.2e-06 6_[+3]_478 7801 1.2e-06 255_[+3]_229 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=16 seqs=10 14867 ( 298) CTCTTGCGTCCGTTTC 1 47253 ( 97) CTATCCCGTTCTTTTC 1 48933 ( 287) GTCTTGTGTCCTTGTC 1 48041 ( 413) CTCTTTCGGGCTTTTC 1 49075 ( 151) CTACCTCGTTCTTGTC 1 38884 ( 461) GTATTGCGTCCTTTAC 1 22166 ( 122) TTCTTAGGTTCTTTTC 1 49636 ( 44) GTCCCGGGTTCGATTC 1 49588 ( 7) CTATTTCGTGCGACTC 1 7801 ( 256) CTCGTGAGTTCTAGTC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 12610 bayes= 10.5509 E= 4.2e+001 -997 137 37 -139 -997 -997 -997 193 55 137 -997 -997 -997 -21 -122 141 -997 37 -997 141 -145 -121 110 19 -145 137 -22 -139 -997 -997 210 -997 -997 -997 -122 178 -997 37 -22 93 -997 211 -997 -997 -997 -997 37 141 14 -997 -997 141 -997 -121 37 119 -145 -997 -997 178 -997 211 -997 -997 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- 
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 4.2e+001 0.000000 0.600000 0.300000 0.100000 0.000000 0.000000 0.000000 1.000000 0.400000 0.600000 0.000000 0.000000 0.000000 0.200000 0.100000 0.700000 0.000000 0.300000 0.000000 0.700000 0.100000 0.100000 0.500000 0.300000 0.100000 0.600000 0.200000 0.100000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.100000 0.900000 0.000000 0.300000 0.200000 0.500000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.300000 0.700000 0.300000 0.000000 0.000000 0.700000 0.000000 0.100000 0.300000 0.600000 0.100000 0.000000 0.000000 0.900000 0.000000 1.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [CG]T[CA][TC][TC][GT][CG]GT[TCG]C[TG][TA][TG]TC -------------------------------------------------------------------------------- Time 16.67 secs. 
******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 47253 5.42e-05 96_[+3(1.02e-07)]_388 6076 6.52e-03 22_[+1(1.44e-06)]_466 47583 1.80e-02 377_[+1(8.10e-05)]_4_[+1(3.95e-05)]_\ 6_[+1(5.22e-06)]_77 22166 1.03e-03 121_[+3(9.04e-07)]_363 14867 6.43e-10 93_[+2(3.31e-08)]_24_[+3(1.47e-05)]_\ 145_[+3(3.30e-09)]_187 48175 9.23e-01 500 38884 2.23e-04 258_[+1(5.64e-05)]_190_\ [+3(3.27e-07)]_24 15641 5.78e-01 500 49186 4.57e-02 457_[+1(8.20e-06)]_31 8181 8.80e-04 [+1(6.31e-07)]_488 40282 1.89e-02 406_[+2(2.57e-05)]_75 49636 2.25e-05 43_[+3(1.12e-06)]_68_[+1(9.12e-06)]_\ 361 40521 1.24e-05 84_[+1(8.20e-06)]_24_[+2(5.78e-08)]_\ 361 16455 3.47e-05 15_[+2(1.63e-08)]_466 50155 1.36e-01 401_[+1(7.03e-05)]_87 7801 5.68e-08 51_[+1(8.72e-05)]_192_\ [+3(1.21e-06)]_154_[+2(1.77e-08)]_14_[+3(4.36e-05)]_26 55200 9.50e-02 90_[+1(4.31e-05)]_398 33339 4.51e-02 409_[+1(1.21e-05)]_79 49588 2.73e-08 6_[+3(1.21e-06)]_95_[+2(3.57e-08)]_\ 35_[+2(7.71e-05)]_268_[+2(5.49e-05)]_4_[+1(1.92e-05)]_7 48933 2.27e-13 132_[+2(3.66e-05)]_55_\ [+2(3.25e-11)]_3_[+1(4.09e-07)]_46_[+3(2.23e-07)]_198 38018 4.33e-02 340_[+1(5.22e-06)]_148 49075 2.47e-07 92_[+1(5.64e-05)]_46_[+3(2.47e-07)]_\ 35_[+1(1.41e-07)]_149_[+1(7.54e-05)]_126 38426 2.96e-05 83_[+2(1.77e-08)]_376_\ [+2(3.24e-05)]_3 37545 5.18e-02 337_[+1(2.15e-05)]_151 48041 2.14e-05 100_[+1(9.43e-05)]_300_\ [+3(2.47e-07)]_72 41267 1.71e-02 302_[+1(1.89e-06)]_186 
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************