********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/354/354.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
25217                    1.0000    500  25218                    1.0000    500
33028                    1.0000    500  49730                    1.0000    500
43917                    1.0000    500  7736                     1.0000    500
43961                    1.0000    500  44077                    1.0000    500
45020                    1.0000    500  45560                    1.0000    500
50486                    1.0000    500  46511                    1.0000    500
39969                    1.0000    500  38235                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/354/354.seqs.fa -oc motifs/354 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.240 G 0.226 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.240 G 0.226 T 0.260
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  15  sites =   9  llr = 111  E-value = 2.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  7:281::2:::11::
pos.-specific     C  ::::2::8::::4:2
probability       G  238:18a:2:6:128
matrix            T  17:262::8a4938:

         bits    2.1       *
                 1.9       *  *
                 1.7       *  *
                 1.5       *  * *
Relative         1.3   *  ***** * **
Entropy          1.1  *** ******* **
(17.8 bits)      0.9  *** ******* **
                 0.6 **** ******* **
                 0.4 ************ **
                 0.2 ***************
                 0.0 ---------------

Multilevel           ATGATGGCTTGTCTG
consensus            GGATCT AG T TGC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
25217                       319  4.21e-09 TCGTGCGAGA ATGATGGCTTTTTTG CCTCGGACTT
43961                       180  1.02e-07 GGATTGTCAA GTGATTGCTTGTCTG TCACGAAGTC
46511                        83  2.70e-07 TGTTCACTAA AGGAGGGCTTGTCGG AAGTCGACAA
38235                        83  5.71e-07 TTGCGCTCAC ATGTCGGCTTGTATG GTGTTGGCCA
49730                        66  6.83e-07 TGAGTCAGTT GTGATGGCGTTTCTC TCATTGTATT
7736                         48  2.43e-06 CAGATTTTCG AGAATTGATTGTTTG GAACAAAGCC
45560                       202  3.44e-06 TGAAGCAAGG ATGACGGAGTGACTG CTCCATATAA
33028                       165  4.16e-06 GTTCCAATTT ATATTGGCTTTTGTC GATAGAATTT
39969                       252  4.44e-06 AGGGTCGGGT TGGAAGGCTTTTTGG ATACGGTTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25217                             4.2e-09  318_[+1]_167
43961                               1e-07  179_[+1]_306
46511                             2.7e-07  82_[+1]_403
38235                             5.7e-07  82_[+1]_403
49730                             6.8e-07  65_[+1]_420
7736                              2.4e-06  47_[+1]_438
45560                             3.4e-06  201_[+1]_284
33028                             4.2e-06  164_[+1]_321
39969                             4.4e-06  251_[+1]_234
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=9
25217                    (  319) ATGATGGCTTTTTTG  1
43961                    (  180) GTGATTGCTTGTCTG  1
46511                    (   83) AGGAGGGCTTGTCGG  1
38235                    (   83) ATGTCGGCTTGTATG  1
49730                    (   66) GTGATGGCGTTTCTC  1
7736                     (   48) AGAATTGATTGTTTG  1
45560                    (  202) ATGACGGAGTGACTG  1
33028                    (  165) ATATTGGCTTTTGTC  1
39969                    (  252) TGGAAGGCTTTTTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6804 bayes= 9.6948 E= 2.3e+002
   128   -982     -2   -122
  -982   -982     56    136
   -30   -982    178   -982
   150   -982   -982    -23
  -130    -11   -102    109
  -982   -982    178    -23
  -982   -982    215   -982
   -30    170   -982   -982
  -982   -982     -2    158
  -982   -982   -982    194
  -982   -982    130     77
  -130   -982   -982    177
  -130     89   -102     36
  -982   -982     -2    158
  -982    -11    178   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 9 E= 2.3e+002
 0.666667  0.000000  0.222222  0.111111
 0.000000  0.000000  0.333333  0.666667
 0.222222  0.000000  0.777778  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.111111  0.222222  0.111111  0.555556
 0.000000  0.000000  0.777778  0.222222
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.555556  0.444444
 0.111111  0.000000  0.000000  0.888889
 0.111111  0.444444  0.111111  0.333333
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.222222  0.777778  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AG][TG][GA][AT][TC][GT]G[CA][TG]T[GT]T[CT][TG][GC]
--------------------------------------------------------------------------------




Time  1.58 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =  12  llr = 135  E-value = 4.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  13::48:3:2:8:68a
pos.-specific     C  :41:1:76:3:2221:
probability       G  :2824:::a25:72::
matrix            T  92181331:45:211:

         bits    2.1         *
                 1.9         *      *
                 1.7         *      *
                 1.5 *       *      *
Relative         1.3 * **    *  *   *
Entropy          1.1 * ** ** * **  **
(16.2 bits)      0.9 * ** ** * *** **
                 0.6 * ** **** *** **
                 0.4 * ******* ******
                 0.2 ********* ******
                 0.0 ----------------

Multilevel           TCGTAACCGTGAGAAA
consensus             A  GTTA CT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
25217                        79  7.44e-10 ATTCTAGAGA TCGTAACCGTGAGAAA GGGGTCGCTC
25218                       195  7.57e-07 AATAAAGTGT TCGTGACTGTTCGAAA AGTATGAGAC
44077                       384  1.26e-06 GTAAAACCCT TTGTATTCGATAGAAA ATCACGAATG
43961                       355  1.26e-06 ATTTAAAAAA TAGTCATCGATAGAAA CTGTTCCCCG
45560                       437  2.42e-06 AATCTCAAAT TATTGACAGTGAGGAA GGTTACTGCT
7736                        187  2.42e-06 TCGTGTCCAT TCGTATCCGGTACCAA ATGAAGTATC
38235                       404  2.89e-06 GGCTTCCAGC TCCTGACAGTGAGTAA AAATGTCCAT
46511                       386  3.74e-06 AGGCGGACGG TAGTTTCCGGGAGGAA CAAAATCGAA
50486                       253  7.00e-06 AAACGCACCC ATGGAATCGTGAGAAA GAACAGCGAG
45020                       455  7.00e-06 TGGATAGTAT TGGTAACAGCGACACA GTTTGTTCAA
33028                       302  9.31e-06 TTAATTTGAC TCGTGACAGCTATCTA CTTTCATTAG
39969                        17  1.40e-05 GCGGAAGCTT TGGGGATCGCTCTAAA TTGGGGTCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25217                             7.4e-10  78_[+2]_406
25218                             7.6e-07  194_[+2]_290
44077                             1.3e-06  383_[+2]_101
43961                             1.3e-06  354_[+2]_130
45560                             2.4e-06  436_[+2]_48
7736                              2.4e-06  186_[+2]_298
38235                             2.9e-06  403_[+2]_81
46511                             3.7e-06  385_[+2]_99
50486                               7e-06  252_[+2]_232
45020                               7e-06  454_[+2]_30
33028                             9.3e-06  301_[+2]_183
39969                             1.4e-05  16_[+2]_468
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=12
25217                    (   79) TCGTAACCGTGAGAAA  1
25218                    (  195) TCGTGACTGTTCGAAA  1
44077                    (  384) TTGTATTCGATAGAAA  1
43961                    (  355) TAGTCATCGATAGAAA  1
45560                    (  437) TATTGACAGTGAGGAA  1
7736                     (  187) TCGTATCCGGTACCAA  1
38235                    (  404) TCCTGACAGTGAGTAA  1
46511                    (  386) TAGTTTCCGGGAGGAA  1
50486                    (  253) ATGGAATCGTGAGAAA  1
45020                    (  455) TGGTAACAGCGACACA  1
33028                    (  302) TCGTGACAGCTATCTA  1
39969                    (   17) TGGGGATCGCTCTAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 9.58982 E= 4.9e+001
  -172  -1023  -1023    182
   -13     80    -44    -64
 -1023   -152    188   -164
 -1023  -1023    -44    168
    60   -152     88   -164
   145  -1023  -1023     -6
 -1023    147  -1023     36
    28    128  -1023   -164
 -1023  -1023    215  -1023
   -72      6    -44     68
 -1023  -1023    115     94
   160    -53  -1023  -1023
 -1023    -53    156    -64
   109    -53    -44   -164
   160   -152  -1023   -164
   187  -1023  -1023  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 4.9e+001
 0.083333  0.000000  0.000000  0.916667
 0.250000  0.416667  0.166667  0.166667
 0.000000  0.083333  0.833333  0.083333
 0.000000  0.000000  0.166667  0.833333
 0.416667  0.083333  0.416667  0.083333
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.666667  0.000000  0.333333
 0.333333  0.583333  0.000000  0.083333
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.250000  0.166667  0.416667
 0.000000  0.000000  0.500000  0.500000
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.166667  0.666667  0.166667
 0.583333  0.166667  0.166667  0.083333
 0.833333  0.083333  0.000000  0.083333
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[CA]GT[AG][AT][CT][CA]G[TC][GT]AGAAA
--------------------------------------------------------------------------------




Time  3.46 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =  10  llr = 119  E-value = 1.2e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  2:::5a:95::2:::1
pos.-specific     C  13213:4:139:59:7
probability       G  ::891:6125:13:22
matrix            T  77::1:::2217218:

         bits    2.1
                 1.9      *
                 1.7    * *
                 1.5   ** * *  *  *
Relative         1.3   ** * *  *  **
Entropy          1.1  *** ***  *  **
(17.1 bits)      0.9 **** ***  ** ***
                 0.6 **** *** *******
                 0.4 **** *** *******
                 0.2 ****************
                 0.0 ----------------

Multilevel           TTGGAAGAAGCTCCTC
consensus            ACC C C GC AG GG
sequence                     TT  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
45020                       152  8.19e-08 GAACCCGTTG TTGGAACAGTCTGCTC ATCATCCCCC
49730                       449  8.19e-08 CAAATCTCGC TCGGCAGATCCTCCTC CACAGTCAGA
33028                       341  2.14e-07 TGAGGTAGTC TTCGAAGAACCTCCGC GAAGTCGTGC
44077                        41  3.02e-07 CGGTTAGAAT TTGGAAGGACCTTCTC TGGGGCAAAC
39969                        44  5.13e-07 TGGGGTCGGG ATGGAAGACGCACCTC CCGCACCGAC
25217                       293  5.13e-07 ACGCCGTCGG TTCGCAGATGCTCCTG TCGTGCGAGA
7736                        397  4.59e-06 CACCTTGTTT CCGGCACAATCAGCTC CTCAAGTCAA
38235                       340  5.26e-06 TGTTGGTCTA ATGGTACAAGCTTCTA TCCTGTTCCT
45560                        58  6.43e-06 CGTTCAGAGC TCGGAACAGGTTCCGG GTCTTCTGAC
46511                       435  1.81e-05 TCGGGTCGGA TTGCGAGAAGCGGTTC GATCAAAGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45020                             8.2e-08  151_[+3]_333
49730                             8.2e-08  448_[+3]_36
33028                             2.1e-07  340_[+3]_144
44077                               3e-07  40_[+3]_444
39969                             5.1e-07  43_[+3]_441
25217                             5.1e-07  292_[+3]_192
7736                              4.6e-06  396_[+3]_88
38235                             5.3e-06  339_[+3]_145
45560                             6.4e-06  57_[+3]_427
46511                             1.8e-05  434_[+3]_50
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=10
45020                    (  152) TTGGAACAGTCTGCTC  1
49730                    (  449) TCGGCAGATCCTCCTC  1
33028                    (  341) TTCGAAGAACCTCCGC  1
44077                    (   41) TTGGAAGGACCTTCTC  1
39969                    (   44) ATGGAAGACGCACCTC  1
25217                    (  293) TTCGCAGATGCTCCTG  1
7736                     (  397) CCGGCACAATCAGCTC  1
38235                    (  340) ATGGTACAAGCTTCTA  1
45560                    (   58) TCGGAACAGGTTCCGG  1
46511                    (  435) TTGCGAGAAGCGGTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 9.00392 E= 1.2e+003
   -46   -126   -997    143
  -997     32   -997    143
  -997    -26    182   -997
  -997   -126    199   -997
    87     32   -117   -138
   187   -997   -997   -997
  -997     74    141   -997
   171   -997   -117   -997
    87   -126    -17    -38
  -997     32    115    -38
  -997    191   -997   -138
   -46   -997   -117    143
  -997    106     41    -38
  -997    191   -997   -138
  -997   -997    -17    162
  -145    154    -17   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 1.2e+003
 0.200000  0.100000  0.000000  0.700000
 0.000000  0.300000  0.000000  0.700000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.100000  0.900000  0.000000
 0.500000  0.300000  0.100000  0.100000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.500000  0.100000  0.200000  0.200000
 0.000000  0.300000  0.500000  0.200000
 0.000000  0.900000  0.000000  0.100000
 0.200000  0.000000  0.100000  0.700000
 0.000000  0.500000  0.300000  0.200000
 0.000000  0.900000  0.000000  0.100000
 0.000000  0.000000  0.200000  0.800000
 0.100000  0.700000  0.200000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TA][TC][GC]G[AC]A[GC]A[AGT][GCT]C[TA][CGT]C[TG][CG]
--------------------------------------------------------------------------------




Time  5.16 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25217                            1.28e-13  78_[+2(7.44e-10)]_198_\
    [+3(5.13e-07)]_10_[+1(4.21e-09)]_167
25218                            1.14e-03  194_[+2(7.57e-07)]_290
33028                            2.24e-07  164_[+1(4.16e-06)]_122_\
    [+2(9.31e-06)]_23_[+3(2.14e-07)]_144
49730                            1.54e-06  65_[+1(6.83e-07)]_368_\
    [+3(8.19e-08)]_36
43917                            8.09e-01  500
7736                             6.55e-07  47_[+1(2.43e-06)]_124_\
    [+2(2.42e-06)]_194_[+3(4.59e-06)]_88
43961                            7.79e-07  179_[+1(1.02e-07)]_160_\
    [+2(1.26e-06)]_130
44077                            9.81e-06  40_[+3(3.02e-07)]_327_\
    [+2(1.26e-06)]_101
45020                            1.30e-05  151_[+3(8.19e-08)]_287_\
    [+2(7.00e-06)]_30
45560                            1.21e-06  57_[+3(6.43e-06)]_128_\
    [+1(3.44e-06)]_220_[+2(2.42e-06)]_48
50486                            3.06e-03  252_[+2(7.00e-06)]_232
46511                            4.58e-07  82_[+1(2.70e-07)]_288_\
    [+2(3.74e-06)]_33_[+3(1.81e-05)]_50
39969                            7.58e-07  16_[+2(1.40e-05)]_11_[+3(5.13e-07)]_\
    192_[+1(4.44e-06)]_234
38235                            2.34e-07  82_[+1(5.71e-07)]_242_\
    [+3(5.26e-06)]_48_[+2(2.89e-06)]_81
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************