********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/418/418.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42755                    1.0000    500  13031                    1.0000    500
36647                    1.0000    500  48685                    1.0000    500
39236                    1.0000    500  43419                    1.0000    500
49168                    1.0000    500  40689                    1.0000    500
40994                    1.0000    500  44347                    1.0000    500
44517                    1.0000    500  44774                    1.0000    500
34626                    1.0000    500  54405                    1.0000    500
38708                    1.0000    500  41493                    1.0000    500
36269                    1.0000    500  45324                    1.0000    500
49750                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/418/418.seqs.fa -oc motifs/418 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 19 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 9500 N= 19 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.282 C 0.232 G 0.215 T 0.271 Background letter frequencies (from dataset with add-one prior applied): A 0.282 C 0.232 G 0.215 T 0.271 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 12 sites = 18 llr = 160 E-value = 2.9e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 177746::::1: pos.-specific C :1::1:931:29 probability G 8322431::1:: matrix T 1:11:2:79971 bits 2.2 2.0 1.8 1.6 * * * Relative 1.3 * ** * Entropy 1.1 * **** * (12.9 bits) 0.9 ** ****** 0.7 ***** ****** 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel GAAAAACTTTTC consensus G GG C C sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 49750 39 1.05e-06 GACAGAAGTG GAGAGACTTTTC ATATGTATAC 39236 361 1.67e-06 ACACACGCGT GAAAAGCCTTTC TTTCCAAATT 43419 76 2.56e-06 GGACAGAATA GAAGAGCTTTTC 
TGGAGTTGCT 44517 145 3.62e-06 GGCTCTGCTA GAAAGAGTTTTC ATTTCTCTAT 44774 407 9.21e-06 AACAGGCCCA GGTAGGCTTTTC ACCTTCGTTT 42755 122 1.18e-05 GAGCTTCACT GGAAGGCTCTTC TTTTAGGCTT 54405 398 2.56e-05 CCAGATTTCG GGAAAACCCTTC CGACGAAACC 36269 75 2.82e-05 TTAAATCTAT GAAAAACCTTAC CTCAGATTTA 41493 288 2.82e-05 ACTCAACTCT GAAACTCTTTCC GCTGGGTTCT 49168 73 2.82e-05 TAGCTCGTGA AAGAGGCTTTTC TTTGTCTAGA 38708 227 3.26e-05 TATATACCTG GGAAGACTTTCT CACTTCACTT 34626 35 4.24e-05 GGCAAAGTCC GAAACACTTGTC AGCAGCAAAA 40994 390 4.63e-05 AAAGTGGGCG GGGAAACTTTTT GCTACAGCTG 48685 227 4.63e-05 TGGTCATAGT TAAAAACCTTCC TTCTCATCCA 44347 265 6.93e-05 GTGATTTATT TAATGACCTTTC GAGCTAACAT 45324 314 7.52e-05 GAGAATTCCG AAAGATCTTTTC AGTTAGAAAT 40689 323 1.32e-04 CTAATGAAGT GCTGAACTTTTC CGTTCCGAGA 36647 371 1.94e-04 CAGGTCTGGT GAATGTGTTTCC GCTAATGAGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 49750 1.1e-06 38_[+1]_450 39236 1.7e-06 360_[+1]_128 43419 2.6e-06 75_[+1]_413 44517 3.6e-06 144_[+1]_344 44774 9.2e-06 406_[+1]_82 42755 1.2e-05 121_[+1]_367 54405 2.6e-05 397_[+1]_91 36269 2.8e-05 74_[+1]_414 41493 2.8e-05 287_[+1]_201 49168 2.8e-05 72_[+1]_416 38708 3.3e-05 226_[+1]_262 34626 4.2e-05 34_[+1]_454 40994 4.6e-05 389_[+1]_99 48685 4.6e-05 226_[+1]_262 44347 6.9e-05 264_[+1]_224 45324 7.5e-05 313_[+1]_175 40689 0.00013 322_[+1]_166 36647 0.00019 370_[+1]_118 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=12 seqs=18 49750 ( 39) GAGAGACTTTTC 1 39236 ( 361) 
GAAAAGCCTTTC 1 43419 ( 76) GAAGAGCTTTTC 1 44517 ( 145) GAAAGAGTTTTC 1 44774 ( 407) GGTAGGCTTTTC 1 42755 ( 122) GGAAGGCTCTTC 1 54405 ( 398) GGAAAACCCTTC 1 36269 ( 75) GAAAAACCTTAC 1 41493 ( 288) GAAACTCTTTCC 1 49168 ( 73) AAGAGGCTTTTC 1 38708 ( 227) GGAAGACTTTCT 1 34626 ( 35) GAAACACTTGTC 1 40994 ( 390) GGGAAACTTTTT 1 48685 ( 227) TAAAAACCTTCC 1 44347 ( 265) TAATGACCTTTC 1 45324 ( 314) AAAGATCTTTTC 1 40689 ( 323) GCTGAACTTTTC 1 36647 ( 371) GAATGTGTTTCC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 9.14345 E= 2.9e+002 -134 -1081 186 -128 124 -206 37 -1081 136 -1081 -37 -128 136 -1081 -37 -128 66 -106 105 -1081 98 -1081 37 -70 -1081 194 -95 -1081 -1081 26 -1081 141 -1081 -106 -1081 171 -1081 -1081 -195 180 -234 -6 -1081 141 -1081 194 -1081 -128 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 18 E= 2.9e+002 0.111111 0.000000 0.777778 0.111111 0.666667 0.055556 0.277778 0.000000 0.722222 0.000000 0.166667 0.111111 0.722222 0.000000 0.166667 0.111111 0.444444 0.111111 0.444444 0.000000 0.555556 0.000000 0.277778 0.166667 0.000000 0.888889 0.111111 0.000000 0.000000 0.277778 0.000000 0.722222 0.000000 0.111111 0.000000 0.888889 0.000000 0.000000 0.055556 0.944444 0.055556 0.222222 0.000000 0.722222 0.000000 0.888889 0.000000 0.111111 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 
1 regular expression -------------------------------------------------------------------------------- G[AG]AA[AG][AG]C[TC]TT[TC]C -------------------------------------------------------------------------------- Time 3.23 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 15 sites = 4 llr = 68 E-value = 2.4e+003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::::53::a::a8:: pos.-specific C a:a::::8::5::8: probability G :::::5a::a5:::a matrix T :a:a53:3::::33: bits 2.2 * * * 2.0 * * * * * 1.8 **** * ** * * 1.6 **** * ** * * Relative 1.3 **** **** * ** Entropy 1.1 **** ********* (24.4 bits) 0.9 ***** ********* 0.7 ***** ********* 0.4 *************** 0.2 *************** 0.0 --------------- Multilevel CTCTAGGCAGCAACG consensus TA T G TT sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 44517 366 9.57e-09 CTTGCAGTGT CTCTTGGCAGGATCG ACTGTAATTC 43419 388 9.57e-09 CACTCTACTT CTCTATGCAGCAACG AATGCGAACC 34626 432 1.58e-08 CTCTCGTGTG CTCTTGGTAGCAACG TACGATCGGC 38708 164 3.47e-08 GGAGCTTTTC CTCTAAGCAGGAATG ACACCTTTGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION 
P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 44517 9.6e-09 365_[+2]_120 43419 9.6e-09 387_[+2]_98 34626 1.6e-08 431_[+2]_54 38708 3.5e-08 163_[+2]_322 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=15 seqs=4 44517 ( 366) CTCTTGGCAGGATCG 1 43419 ( 388) CTCTATGCAGCAACG 1 34626 ( 432) CTCTTGGTAGCAACG 1 38708 ( 164) CTCTAAGCAGGAATG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 15 n= 9234 bayes= 11.1721 E= 2.4e+003 -865 210 -865 -865 -865 -865 -865 188 -865 210 -865 -865 -865 -865 -865 188 82 -865 -865 88 -17 -865 122 -12 -865 -865 222 -865 -865 169 -865 -12 182 -865 -865 -865 -865 -865 222 -865 -865 110 122 -865 182 -865 -865 -865 141 -865 -865 -12 -865 169 -865 -12 -865 -865 222 -865 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 15 nsites= 4 E= 2.4e+003 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.500000 0.000000 0.000000 0.500000 0.250000 0.000000 0.500000 0.250000 0.000000 0.000000 1.000000 0.000000 0.000000 0.750000 0.000000 0.250000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.500000 0.500000 0.000000 1.000000 0.000000 
0.000000 0.000000 0.750000 0.000000 0.000000 0.250000 0.000000 0.750000 0.000000 0.250000 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- CTCT[AT][GAT]G[CT]AG[CG]A[AT][CT]G -------------------------------------------------------------------------------- Time 6.68 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 20 sites = 9 llr = 130 E-value = 3.2e+003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :313::1918:::8::1183 pos.-specific C :236:11:2:36:19::21: probability G 7:3:a71:22::9::a:7:1 matrix T 3421:2714:74111:9:16 bits 2.2 * * 2.0 * * 1.8 * * * 1.6 * * ** Relative 1.3 * * * *** Entropy 1.1 * * * **** *** (20.8 bits) 0.9 * ** * ********** 0.7 * *** * ********** 0.4 ** ***** *********** 0.2 ******************** 0.0 -------------------- Multilevel GTCCGGTATATCGACGTGAT consensus TAGA T CGCT C A sequence CT G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 48685 340 4.87e-10 ACTTGAAGAC GTCCGGTATACTGACGTCAT CAGATGTATC 36269 5 1.41e-09 CTGG GCGCGGTATATCGATGTGAT CCGTTGTGAC 40994 480 2.29e-08 GAAGAAGCGG GCCAGGTTGACCGACGTGAA A 54405 205 
6.68e-08 TGCGTTCTGT TTGAGCTATGTCGACGTCAT GGCGGCCGTA 45324 66 1.13e-07 CGTGACGACT TACAGTTACATTGACGTGTT TTTTCTGGCC 44517 87 4.19e-07 ATTCCGTTCT TTTCGGGATACTGACGAGAG TCGACATAAT 36647 426 4.49e-07 GGGTCGAGTT GTGCGGCAAATCTCCGTGAA TGCTGCGATG 40689 234 5.15e-07 AAGATTGATC GATTGGAACGTCGACGTGCT TGCTCGACTA 49750 408 8.11e-07 GAATGCACTC GAACGTTAGATTGTCGTAAA CGCCCTAGTA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 48685 4.9e-10 339_[+3]_141 36269 1.4e-09 4_[+3]_476 40994 2.3e-08 479_[+3]_1 54405 6.7e-08 204_[+3]_276 45324 1.1e-07 65_[+3]_415 44517 4.2e-07 86_[+3]_394 36647 4.5e-07 425_[+3]_55 40689 5.2e-07 233_[+3]_247 49750 8.1e-07 407_[+3]_73 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=20 seqs=9 48685 ( 340) GTCCGGTATACTGACGTCAT 1 36269 ( 5) GCGCGGTATATCGATGTGAT 1 40994 ( 480) GCCAGGTTGACCGACGTGAA 1 54405 ( 205) TTGAGCTATGTCGACGTCAT 1 45324 ( 66) TACAGTTACATTGACGTGTT 1 44517 ( 87) TTTCGGGATACTGACGAGAG 1 36647 ( 426) GTGCGGCAAATCTCCGTGAA 1 40689 ( 234) GATTGGAACGTCGACGTGCT 1 49750 ( 408) GAACGTTAGATTGTCGTAAA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 9139 bayes= 10.1209 E= 3.2e+003 -982 -982 163 30 24 -6 -982 71 -134 52 63 -29 24 126 -982 -128 -982 -982 222 
-982 -982 -106 163 -29 -134 -106 -95 130 166 -982 -982 -128 -134 -6 5 71 146 -982 5 -982 -982 52 -982 130 -982 126 -982 71 -982 -982 205 -128 146 -106 -982 -128 -982 194 -982 -128 -982 -982 222 -982 -134 -982 -982 171 -134 -6 163 -982 146 -106 -982 -128 24 -982 -95 104 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 9 E= 3.2e+003 0.000000 0.000000 0.666667 0.333333 0.333333 0.222222 0.000000 0.444444 0.111111 0.333333 0.333333 0.222222 0.333333 0.555556 0.000000 0.111111 0.000000 0.000000 1.000000 0.000000 0.000000 0.111111 0.666667 0.222222 0.111111 0.111111 0.111111 0.666667 0.888889 0.000000 0.000000 0.111111 0.111111 0.222222 0.222222 0.444444 0.777778 0.000000 0.222222 0.000000 0.000000 0.333333 0.000000 0.666667 0.000000 0.555556 0.000000 0.444444 0.000000 0.000000 0.888889 0.111111 0.777778 0.111111 0.000000 0.111111 0.000000 0.888889 0.000000 0.111111 0.000000 0.000000 1.000000 0.000000 0.111111 0.000000 0.000000 0.888889 0.111111 0.222222 0.666667 0.000000 0.777778 0.111111 0.000000 0.111111 0.333333 0.000000 0.111111 0.555556 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [GT][TAC][CGT][CA]G[GT]TA[TCG][AG][TC][CT]GACGT[GC]A[TA] -------------------------------------------------------------------------------- Time 10.16 secs. 
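
The regular expression reported for Motif 3 can be used to scan a sequence for candidate sites. The following is a minimal Python sketch, not part of MEME itself; the helper name `find_sites` is hypothetical, and the test fragment is the flank + site + flank text for sequence 48685 copied from the Motif 3 sites table. It searches the given strand only, consistent with `strands: +` in the command line summary.

```python
import re

# Regular expression reported for Motif 3 in this MEME output.
MOTIF3_PATTERN = r"[GT][TAC][CGT][CA]G[GT]TA[TCG][AG][TC][CT]GACGT[GC]A[TA]"

# A lookahead wrapper so that overlapping matches are also reported.
_SCANNER = re.compile("(?=(" + MOTIF3_PATTERN + "))")


def find_sites(seq):
    """Return (1-based start, matched text) for every match on the given strand."""
    return [(m.start() + 1, m.group(1)) for m in _SCANNER.finditer(seq.upper())]


# Left flank + site + right flank for sequence 48685 (Motif 3 sites table).
fragment = "ACTTGAAGAC" + "GTCCGGTATACTGACGTCAT" + "CAGATGTATC"
hits = find_sites(fragment)
print(hits)
```

The regular expression is a simplification of the position-specific probability matrix, so it may miss sites that MEME itself reports. For example, the site listed for sequence 44517 (TTTCGGGATACTGACGAGAG) does not match it.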
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42755                            6.24e-02  121_[+1(1.18e-05)]_191_\
                                           [+1(8.12e-05)]_164
13031                            1.00e+00  500
36647                            7.19e-04  425_[+3(4.49e-07)]_55
48685                            7.22e-07  226_[+1(4.63e-05)]_101_\
                                           [+3(4.87e-10)]_141
39236                            1.50e-02  360_[+1(1.67e-06)]_128
43419                            8.81e-08  75_[+1(2.56e-06)]_300_\
                                           [+2(9.57e-09)]_98
49168                            8.49e-02  72_[+1(2.82e-05)]_416
40689                            4.85e-04  233_[+3(5.15e-07)]_247
40994                            3.02e-05  389_[+1(4.63e-05)]_78_\
                                           [+3(2.29e-08)]_1
44347                            2.20e-01  264_[+1(6.93e-05)]_224
44517                            6.57e-10  86_[+3(4.19e-07)]_38_[+1(3.62e-06)]_\
                                           209_[+2(9.57e-09)]_120
44774                            2.58e-03  188_[+2(3.46e-05)]_203_\
                                           [+1(9.21e-06)]_82
34626                            2.05e-05  34_[+1(4.24e-05)]_385_\
                                           [+2(1.58e-08)]_54
54405                            2.73e-05  204_[+3(6.68e-08)]_129_\
                                           [+3(7.44e-05)]_24_[+1(2.56e-05)]_91
38708                            2.00e-05  163_[+2(3.47e-08)]_48_\
                                           [+1(3.26e-05)]_262
41493                            1.30e-01  287_[+1(2.82e-05)]_201
36269                            1.94e-07  4_[+3(1.41e-09)]_50_[+1(2.82e-05)]_\
                                           414
45324                            8.23e-05  65_[+3(1.13e-07)]_228_\
                                           [+1(7.52e-05)]_175
49750                            1.39e-05  38_[+1(1.05e-06)]_357_\
                                           [+3(8.11e-07)]_73
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
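
Each combined block diagram alternates spacer lengths with motif occurrences written as `[strand motif(p-value)]`, so site start positions can be recovered by accumulating spacer lengths and motif widths. The following is a minimal Python sketch under that reading; the function name `parse_diagram` is hypothetical, and the widths are taken from the `width =` fields of the three motifs reported above.

```python
import re

# Motif widths as reported in the MOTIF 1/2/3 header lines of this output.
WIDTHS = {1: 12, 2: 15, 3: 20}

# Either a motif occurrence such as [+3(4.19e-07)] or a plain spacer length.
_TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=WIDTHS):
    """Return ([(motif, strand, 1-based start, p-value), ...], total length)."""
    # Drop the backslash line continuations and any whitespace around them.
    cleaned = diagram.replace("\\", "").replace(" ", "")
    position = 0
    sites = []
    for match in _TOKEN.finditer(cleaned):
        strand, motif, p_value, spacer = match.groups()
        if spacer is not None:
            position += int(spacer)
        else:
            motif_id = int(motif)
            sites.append((motif_id, strand, position + 1, float(p_value)))
            position += widths[motif_id]
    return sites, position


# Diagram for sequence 44517, copied from the summary table.
diagram_44517 = r"86_[+3(4.19e-07)]_38_[+1(3.62e-06)]_\ 209_[+2(9.57e-09)]_120"
sites, total = parse_diagram(diagram_44517)
for site in sites:
    print(site)
print("sequence length:", total)
```

For 44517 the recovered start positions (87, 145 and 366) agree with the per-motif site tables, and the accumulated length equals the 500 bp listed in the training set.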
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
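
The relative entropy quoted in each motif description (for example 24.4 bits for Motif 2) can be approximately reproduced from the position-specific probability matrix and the background letter frequencies. The sketch below assumes the usual definition, the sum over positions and letters of p * log2(p / background), with zero-probability entries contributing nothing. MEME's internal computation may differ in small details, so this should be read as a consistency check and not as the program's exact method. The function name `relative_entropy` is hypothetical.

```python
import math

# Background letter frequencies reported in the command line summary.
BACKGROUND = {"A": 0.282, "C": 0.232, "G": 0.215, "T": 0.271}

# Motif 2 letter-probability matrix; columns are A, C, G, T.
MOTIF2 = [
    (0.00, 1.00, 0.00, 0.00),
    (0.00, 0.00, 0.00, 1.00),
    (0.00, 1.00, 0.00, 0.00),
    (0.00, 0.00, 0.00, 1.00),
    (0.50, 0.00, 0.00, 0.50),
    (0.25, 0.00, 0.50, 0.25),
    (0.00, 0.00, 1.00, 0.00),
    (0.00, 0.75, 0.00, 0.25),
    (1.00, 0.00, 0.00, 0.00),
    (0.00, 0.00, 1.00, 0.00),
    (0.00, 0.50, 0.50, 0.00),
    (1.00, 0.00, 0.00, 0.00),
    (0.75, 0.00, 0.00, 0.25),
    (0.00, 0.75, 0.00, 0.25),
    (0.00, 0.00, 1.00, 0.00),
]


def relative_entropy(matrix, background, alphabet="ACGT"):
    """Sum p * log2(p / background) over all positions and letters, in bits."""
    total = 0.0
    for row in matrix:
        for letter, p in zip(alphabet, row):
            if p > 0:  # a zero probability contributes nothing to the sum
                total += p * math.log2(p / background[letter])
    return total


bits = relative_entropy(MOTIF2, BACKGROUND)
print(f"Motif 2 relative entropy: {bits:.1f} bits")
```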