********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/181/181.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46858                    1.0000    500  14535                    1.0000    500
15709                    1.0000    500  16008                    1.0000    500
40933                    1.0000    500  10593                    1.0000    500
50453                    1.0000    500  7358                     1.0000    500
26515                    1.0000    500  44919                    1.0000    500
32031                    1.0000    500  35156                    1.0000    500
44804                    1.0000    500  49673                    1.0000    500
35378                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/181/181.seqs.fa -oc motifs/181 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.258 C 0.237 G 0.234 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.258 C 0.237 G 0.234 T 0.271
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  15  sites =  15  llr = 153  E-value = 2.8e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::9:7:::9361643
pos.-specific     C  :2192:2111:4111
probability       G  :6:1:a:713:5:1:
matrix            T  a21:1:82:24:346

         bits    2.1      *
                 1.9 *    *
                 1.7 *    *
                 1.5 *  * *
Relative         1.3 * ** ** *
Entropy          1.0 * ** ** * *
(14.7 bits)      0.8 * ** **** *
                 0.6 ********* *** *
                 0.4 ********* *** *
                 0.2 ********* *****
                 0.0 ---------------

Multilevel           TGACAGTGAAAGAAT
consensus             C  C CT GTCTTA
sequence              T       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
26515                       249  6.23e-09 TTTTTACTTT TGACAGTGAGACATT CACTCTCAAT
7358                        414  1.71e-08 GTCAGCTGAA TGACAGTGAATGATT ATAGAGTTGC
15709                       443  6.87e-07 AGTGCGCTGC TGACAGTGAAACCAT TCGCAATTCT
46858                       404  3.24e-06 TTCCAAGTTT TCACAGTCAGACAAA TTCCATGTGT
10593                        56  4.01e-06 TATCTAACAT TGAGAGTGATTCTTT TTTATTCCTA
32031                       295  7.88e-06 CCAATGTAAT TGACTGCGAAAAAAT GAGGTCAATA
44804                        45  9.43e-06 CTTGTGGGTT TGACTGTGAGAATGT ATATTGTTAG
14535                        44  1.12e-05 CAAGGTTGGT TGACAGTTGGACAAA AGTGGCCAGT
44919                       360  1.44e-05 AAGCTTCCTC TCACAGTCACAGATC AAGAAATGCC
35156                        53  1.70e-05 GCTGTGCAAC TTACAGCGATAGTAC TTGTCCTTAT
40933                       472  2.13e-05 CCCGTACGCA TGCCCGTGATTGATA CCGATAATGG
16008                        42  2.13e-05 GAAAGACTGT TTACAGTTAATGTCT TCTTGAGAAT
49673                       300  3.07e-05 TGTGTGTTGG TGTGAGTGAGTGAGT GAGTGACAAG
50453                       443  3.77e-05 GACTAGTTGA TCACCGTGCCAGTAT TCTGGTGTGA
35378                       236  4.59e-05 ACGGTAAATT TTACCGCTAATCATA ATTTAAGTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
26515                             6.2e-09  248_[+1]_237
7358                              1.7e-08  413_[+1]_72
15709                             6.9e-07  442_[+1]_43
46858                             3.2e-06  403_[+1]_82
10593                               4e-06  55_[+1]_430
32031                             7.9e-06  294_[+1]_191
44804                             9.4e-06  44_[+1]_441
14535                             1.1e-05  43_[+1]_442
44919                             1.4e-05  359_[+1]_126
35156                             1.7e-05  52_[+1]_433
40933                             2.1e-05  471_[+1]_14
16008                             2.1e-05  41_[+1]_444
49673                             3.1e-05  299_[+1]_186
50453                             3.8e-05  442_[+1]_43
35378                             4.6e-05  235_[+1]_250
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=15
26515                    (  249) TGACAGTGAGACATT  1
7358                     (  414) TGACAGTGAATGATT  1
15709                    (  443) TGACAGTGAAACCAT  1
46858                    (  404) TCACAGTCAGACAAA  1
10593                    (   56) TGAGAGTGATTCTTT  1
32031                    (  295) TGACTGCGAAAAAAT  1
44804                    (   45) TGACTGTGAGAATGT  1
14535                    (   44) TGACAGTTGGACAAA  1
44919                    (  360) TCACAGTCACAGATC  1
35156                    (   53) TTACAGCGATAGTAC  1
40933                    (  472) TGCCCGTGATTGATA  1
16008                    (   42) TTACAGTTAATGTCT  1
49673                    (  300) TGTGAGTGAGTGAGT  1
50453                    (  443) TCACCGTGCCAGTAT  1
35378                    (  236) TTACCGCTAATCATA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7290 bayes= 8.92184 E= 2.8e+000
 -1055  -1055  -1055    188
 -1055    -24    136    -44
   175   -183  -1055   -202
 -1055    187    -81  -1055
   137    -24  -1055   -102
 -1055  -1055    209  -1055
 -1055    -24  -1055    156
 -1055    -83    151    -44
   175   -183   -181  -1055
    37    -83     51    -44
   122  -1055  -1055     56
   -95     76     99  -1055
   122   -183  -1055     30
    63   -183    -81     56
     5    -83  -1055    115
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 15 E= 2.8e+000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.600000  0.200000
 0.866667  0.066667  0.000000  0.066667
 0.000000  0.866667  0.133333  0.000000
 0.666667  0.200000  0.000000  0.133333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.133333  0.666667  0.200000
 0.866667  0.066667  0.066667  0.000000
 0.333333  0.133333  0.333333  0.200000
 0.600000  0.000000  0.000000  0.400000
 0.133333  0.400000  0.466667  0.000000
 0.600000  0.066667  0.000000  0.333333
 0.400000  0.066667  0.133333  0.400000
 0.266667  0.133333  0.000000  0.600000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[GCT]AC[AC]G[TC][GT]A[AGT][AT][GC][AT][AT][TA]
--------------------------------------------------------------------------------




Time  2.04 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  11  llr = 117  E-value = 7.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1::::::::2::
pos.-specific     C  :61:15:5:43a
probability       G  :4:3::a:155:
matrix            T  9:9795:59:2:

         bits    2.1       *    *
                 1.9       *    *
                 1.7       *    *
                 1.5 * * * * *  *
Relative         1.3 * * * * *  *
Entropy          1.0 *********  *
(15.4 bits)      0.8 *********  *
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCTTTTGCTGGC
consensus             G G C T CC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value               Site
-------------             ----- ---------            ------------
46858                       117  1.27e-07 TGTTTGGGCA TCTTTTGCTGGC TGCACCACTG
7358                        489  6.71e-07 CAGCCCAATG TCTTTTGTTCGC
32031                       391  1.45e-06 CCGTGCCCTT TCTTTCGCTCCC CGTTTACTCC
44804                        79  1.71e-06 GGTTAGGTTT TCTTTCGTTGCC GATTTCCGAT
26515                       377  4.00e-06 TCGTAGTTGT TCTTTCGCTACC GAACGTCAGC
35378                       451  8.34e-06 TTGGCCTCGT TGTGTTGCTAGC GTTGCATTCC
10593                       113  8.34e-06 CCCTCTCCGT TGTTTCGTTCTC AAAGCAGTCA
49673                       108  8.61e-06 ATTGTTGCGC ACTTTCGTTGGC CGTATGCCGT
50453                       478  1.28e-05 CACAGACGCG TGTGTTGTTGTC CAAAAAAGGC
44919                       228  1.94e-05 GTCCGGTCAC TGTGCTGCTGGC CAGAGATGAT
40933                       351  3.38e-05 AAGAGCGGTT TCCTTTGCGCGC AATTATTCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46858                             1.3e-07  116_[+2]_372
7358                              6.7e-07  488_[+2]
32031                             1.4e-06  390_[+2]_98
44804                             1.7e-06  78_[+2]_410
26515                               4e-06  376_[+2]_112
35378                             8.3e-06  450_[+2]_38
10593                             8.3e-06  112_[+2]_376
49673                             8.6e-06  107_[+2]_381
50453                             1.3e-05  477_[+2]_11
44919                             1.9e-05  227_[+2]_261
40933                             3.4e-05  350_[+2]_138
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=11
46858                    (  117) TCTTTTGCTGGC  1
7358                     (  489) TCTTTTGTTCGC  1
32031                    (  391) TCTTTCGCTCCC  1
44804                    (   79) TCTTTCGTTGCC  1
26515                    (  377) TCTTTCGCTACC  1
35378                    (  451) TGTGTTGCTAGC  1
10593                    (  113) TGTTTCGTTCTC  1
49673                    (  108) ACTTTCGTTGGC  1
50453                    (  478) TGTGTTGTTGTC  1
44919                    (  228) TGTGCTGCTGGC  1
40933                    (  351) TCCTTTGCGCGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 9.73455 E= 7.5e+001
  -150  -1010  -1010    174
 -1010    143     63  -1010
 -1010   -138  -1010    174
 -1010  -1010     22    142
 -1010   -138  -1010    174
 -1010     94  -1010    101
 -1010  -1010    209  -1010
 -1010    120  -1010     74
 -1010  -1010   -136    174
   -50     62     95  -1010
 -1010     20    122    -58
 -1010    208  -1010  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 7.5e+001
 0.090909  0.000000  0.000000  0.909091
 0.000000  0.636364  0.363636  0.000000
 0.000000  0.090909  0.000000  0.909091
 0.000000  0.000000  0.272727  0.727273
 0.000000  0.090909  0.000000  0.909091
 0.000000  0.454545  0.000000  0.545455
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.545455  0.000000  0.454545
 0.000000  0.000000  0.090909  0.909091
 0.181818  0.363636  0.454545  0.000000
 0.000000  0.272727  0.545455  0.181818
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[CG]T[TG]T[TC]G[CT]T[GC][GC]C
--------------------------------------------------------------------------------




Time  3.88 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   5  llr = 94  E-value = 2.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a2::::2:::28:a::44:8a
pos.-specific     C  :6:42228:4:::::a::8::
probability       G  :2a:824:a262a:8:262::
matrix            T  :::6:622:42:::2:4::2:

         bits    2.1   *     *   *  *
                 1.9 * *     *   ** *    *
                 1.7 * *     *   ** *    *
                 1.5 * * *   *   ** *    *
Relative         1.3 * * *  **  *****  ***
Entropy          1.0 * ***  **  ***** ****
(27.1 bits)      0.8 * ***  **  ***** ****
                 0.6 ****** ** ****** ****
                 0.4 ****** **************
                 0.2 ****** **************
                 0.0 ---------------------

Multilevel           ACGTGTGCGCGAGAGCAGCAA
consensus             A CCCAT TAG  T TAGT
sequence              G   GC  GT     G
                           T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
44919                       479  3.68e-10 ACAATCAGTG AAGCGTGCGCAAGAGCTACAA A
15709                       384  3.68e-10 TTTGCGGTAT ACGTCTGCGTGAGATCAGCAA CAAAGCCGGA
35156                       340  5.05e-10 TATCATCGCT ACGTGCCTGTGAGAGCAGCAA TTCCGAAACA
49673                       405  1.89e-09 GTAGGGTCGT ACGCGTTCGCTAGAGCGGCTA GTGTAAGTGA
35378                        19  1.22e-08 ATTTTGAAAG AGGTGGACGGGGGAGCTAGAA ATCTCTTTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44919                             3.7e-10  478_[+3]_1
15709                             3.7e-10  383_[+3]_96
35156                               5e-10  339_[+3]_140
49673                             1.9e-09  404_[+3]_75
35378                             1.2e-08  18_[+3]_461
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
44919                    (  479) AAGCGTGCGCAAGAGCTACAA  1
15709                    (  384) ACGTCTGCGTGAGATCAGCAA  1
35156                    (  340) ACGTGCCTGTGAGAGCAGCAA  1
49673                    (  405) ACGCGTTCGCTAGAGCGGCTA  1
35378                    (   19) AGGTGGACGGGGGAGCTAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 10.7426 E= 2.1e+002
   195   -897   -897   -897
   -36    134    -23   -897
  -897   -897    209   -897
  -897     76   -897    114
  -897    -24    177   -897
  -897    -24    -23    114
   -36    -24     77    -44
  -897    175   -897    -44
  -897   -897    209   -897
  -897     76    -23     56
   -36   -897    135    -44
   163   -897    -23   -897
  -897   -897    209   -897
   195   -897   -897   -897
  -897   -897    177    -44
  -897    208   -897   -897
    63   -897    -23     56
    63   -897    135   -897
  -897    175    -23   -897
   163   -897   -897    -44
   195   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 2.1e+002
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.600000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.200000  0.200000  0.600000
 0.200000  0.200000  0.400000  0.200000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.400000  0.200000  0.400000
 0.200000  0.000000  0.600000  0.200000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.000000  0.200000  0.400000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.800000  0.000000  0.000000  0.200000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
A[CAG]G[TC][GC][TCG][GACT][CT]G[CTG][GAT][AG]GA[GT]C[ATG][GA][CG][AT]A
--------------------------------------------------------------------------------




Time  5.73 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46858                            1.02e-05  116_[+2(1.27e-07)]_275_\
    [+1(3.24e-06)]_82
14535                            1.21e-02  43_[+1(1.12e-05)]_442
15709                            6.51e-09  383_[+3(3.68e-10)]_38_\
    [+1(6.87e-07)]_43
16008                            1.02e-01  41_[+1(2.13e-05)]_444
40933                            3.25e-03  350_[+2(3.38e-05)]_109_\
    [+1(2.13e-05)]_14
10593                            2.98e-04  55_[+1(4.01e-06)]_42_[+2(8.34e-06)]_\
    376
50453                            4.35e-03  442_[+1(3.77e-05)]_20_\
    [+2(1.28e-05)]_11
7358                             4.63e-07  413_[+1(1.71e-08)]_60_\
    [+2(6.71e-07)]
26515                            8.73e-08  248_[+1(6.23e-09)]_113_\
    [+2(4.00e-06)]_112
44919                            3.99e-09  227_[+2(1.94e-05)]_120_\
    [+1(1.44e-05)]_104_[+3(3.68e-10)]_1
32031                            1.87e-04  294_[+1(7.88e-06)]_81_\
    [+2(1.45e-06)]_98
35156                            2.29e-07  52_[+1(1.70e-05)]_272_\
    [+3(5.05e-10)]_140
44804                            1.92e-04  44_[+1(9.43e-06)]_19_[+2(1.71e-06)]_\
    410
49673                            1.71e-08  107_[+2(8.61e-06)]_180_\
    [+1(3.07e-05)]_90_[+3(1.89e-09)]_75
35378                            1.32e-07  18_[+3(1.22e-08)]_196_\
    [+1(4.59e-05)]_200_[+2(8.34e-06)]_38
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************