********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/256/256.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9865                     1.0000    500  bd1039                   1.0000    500
bd1048                   1.0000    500  bd980                    1.0000    500
ThpsCp005                1.0000    500  ThpsCp043                1.0000    500
ThpsCp049                1.0000    500  ThpsCp087                1.0000    500
ThpsCp129                1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/256/256.seqs.fa -oc motifs/256 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.337 C 0.130 G 0.174 T 0.358
Background letter frequencies (from dataset with add-one prior applied):
A 0.337 C 0.131 G 0.174 T 0.358
********************************************************************************


********************************************************************************
MOTIF  1 MEME   width =  20  sites =   9  llr = 155  E-value = 1.6e-012
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :2::4::82:2::72:41:9 pos.-specific C :::3323:::4:::2:::9: probability G 4:42187:8a1a:1:4691: matrix T 68641::2::2:a266:::1 bits 2.9 2.6 * * 2.3 * * * 2.1 * * ** Relative 1.8 ** * * ** Entropy 1.5 ** ** ** ** (24.8 bits) 1.2 ** ** ** **** 0.9 *** ***** ** ***** 0.6 **** ***** ** ***** 0.3 ******************** 0.0 -------------------- Multilevel TTTTAGGAGGCGTATTGGCA consensus GAGCCCCTA A TAGA sequence G T C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- ThpsCp129 181 1.42e-13 ATTGGCATTA TTGCCGGAGGCGTATGGGCA AATGAAGCGT ThpsCp087 197 1.42e-13 ATTGGCATTA TTGCCGGAGGCGTATGGGCA AATGAAGCGT bd1048 180 1.42e-13 ATTGGCATTA TTGCCGGAGGCGTATGGGCA AATGAAGCGT ThpsCp005 386 5.08e-09 GAAAATATAC GTTTACGAGGAGTACTAGCA ATGCTTGCAC bd1039 387 5.08e-09 GAAAATATAC GTTTACGAGGAGTACTAGCA ATGCTTGCAC 9865 453 1.36e-07 AGCAACAAAT GTGGTGGTGGGGTGAGGGGA TTAGAGCAGG ThpsCp043 450 1.43e-07 ATGGCTGCTT TATTAGCAAGTGTTTTAGCA ATCCGTTTAG bd980 410 1.43e-07 ATGGCTGCTT TATTAGCAAGTGTTTTAGCA ATCCGTTTAG ThpsCp049 392 1.72e-07 ATAAGTCTAG GTTGGGCTGGCGTAATGACT ATGTTTACTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- ThpsCp129 1.4e-13 180_[+1]_300 ThpsCp087 1.4e-13 196_[+1]_284 bd1048 1.4e-13 179_[+1]_301 
ThpsCp005                         5.1e-09  385_[+1]_95
bd1039                            5.1e-09  386_[+1]_94
9865                              1.4e-07  452_[+1]_28
ThpsCp043                         1.4e-07  449_[+1]_31
bd980                             1.4e-07  409_[+1]_71
ThpsCp049                         1.7e-07  391_[+1]_89
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=9
ThpsCp129                (  181) TTGCCGGAGGCGTATGGGCA  1
ThpsCp087                (  197) TTGCCGGAGGCGTATGGGCA  1
bd1048                   (  180) TTGCCGGAGGCGTATGGGCA  1
ThpsCp005                (  386) GTTTACGAGGAGTACTAGCA  1
bd1039                   (  387) GTTTACGAGGAGTACTAGCA  1
9865                     (  453) GTGGTGGTGGGGTGAGGGGA  1
ThpsCp043                (  450) TATTAGCAAGTGTTTTAGCA  1
bd980                    (  410) TATTAGCAAGTGTTTTAGCA  1
ThpsCp049                (  392) GTTGGGCTGGCGTAATGACT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4329 bayes= 9.75622 E= 1.6e-012
  -982   -982    135     63
   -60   -982   -982    112
  -982   -982    135     63
  -982    135     35     31
    40    135    -64   -169
  -982     77    216   -982
  -982    135    194   -982
   120   -982   -982    -69
   -60   -982    216   -982
  -982   -982    252   -982
   -60    177    -64    -69
  -982   -982    252   -982
  -982   -982   -982    148
    98   -982    -64    -69
   -60     77   -982     63
  -982   -982    135     63
    40   -982    168   -982
  -160   -982    235   -982
  -982    277    -64   -982
   140   -982   -982   -169
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 9 E= 1.6e-012
 0.000000  0.000000  0.444444  0.555556
 0.222222  0.000000  0.000000  0.777778
 0.000000  0.000000  0.444444  0.555556
 0.000000  0.333333  0.222222  0.444444
 0.444444  0.333333  0.111111  0.111111
 0.000000  0.222222  0.777778  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.222222  0.000000  0.777778  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.444444  0.111111  0.222222
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.666667  0.000000  0.111111  0.222222
 0.222222  0.222222  0.000000  0.555556
 0.000000  0.000000  0.444444  0.555556
 0.444444  0.000000  0.555556  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.888889  0.000000  0.000000  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TG][TA][TG][TCG][AC][GC][GC][AT][GA]G[CAT]GT[AT][TAC][TG][GA]GCA
--------------------------------------------------------------------------------

Time  0.73 secs.

******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 8 llr = 153 E-value = 3.3e-012 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :9::35:4:::a::9:83::3 pos.-specific C 9::3:151:3:::::83:335 probability G 11:884:54:8:::13::68: matrix T ::a:::5:683:aa:::81:3 bits 2.9 2.6 2.3 * 2.1 * * Relative 1.8 * * * * Entropy 1.5 * *** **** * * (27.6 bits) 1.2 ***** * ******** ** 0.9 ***** *********** ** 0.6 ********************* 0.3 ********************* 0.0 --------------------- Multilevel CATGGACGTTGATTACATGGC consensus CAGTAGCT GCACCA sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- ThpsCp129 244 9.23e-13 CCTAAAGAAA CATGGGCATTGATTACATGGC TTGTATTTGC ThpsCp087 260 9.23e-13 CCTAAAGAAA CATGGGCATTGATTACATGGC TTGTATTTGC bd1048 243 9.23e-13 CCTAAAGAAA CATGGGCATTGATTACATGGC TTGTATTTGC ThpsCp005 301 1.83e-09 ATTAGTTATT CATGGATGGCGATTAGATCCA ATTCTTTTAT bd1039 302 1.83e-09 ATTAGTTATT CATGGATGGCGATTAGATCCA ATTCTTTTAT ThpsCp043 220 1.58e-08 TGAAATGGCT CATCAATGTTTATTACCAGGT GTTACTAAGT bd980 180 1.58e-08 TGAAATGGCT CATCAATGTTTATTACCAGGT GTTACTAAGT 9865 387 1.58e-08 TTGCCGTGGT GGTGGCCCGTGATTGCATTGC AACGATGGAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams 
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ThpsCp129                         9.2e-13  243_[+2]_236
ThpsCp087                         9.2e-13  259_[+2]_220
bd1048                            9.2e-13  242_[+2]_237
ThpsCp005                         1.8e-09  300_[+2]_179
bd1039                            1.8e-09  301_[+2]_178
ThpsCp043                         1.6e-08  219_[+2]_260
bd980                             1.6e-08  179_[+2]_300
9865                              1.6e-08  386_[+2]_93
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
ThpsCp129                (  244) CATGGGCATTGATTACATGGC  1
ThpsCp087                (  260) CATGGGCATTGATTACATGGC  1
bd1048                   (  243) CATGGGCATTGATTACATGGC  1
ThpsCp005                (  301) CATGGATGGCGATTAGATCCA  1
bd1039                   (  302) CATGGATGGCGATTAGATCCA  1
ThpsCp043                (  220) CATCAATGTTTATTACCAGGT  1
bd980                    (  180) CATCAATGTTTATTACCAGGT  1
9865                     (  387) GGTGGCCCGTGATTGCATTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4320 bayes= 9.07414 E= 3.3e-012
  -965    274    -48   -965
   137   -965    -48   -965
  -965   -965   -965    148
  -965     94    211   -965
   -43   -965    211   -965
    57     -6    111   -965
  -965    194   -965     48
    15     -6    152   -965
  -965   -965    111     80
  -965     94   -965    106
  -965   -965    211    -52
   157   -965   -965   -965
  -965   -965   -965    148
  -965   -965   -965    148
   137   -965    -48   -965
  -965    252     52   -965
   115     94   -965   -965
   -43   -965   -965    106
  -965     94    184   -152
  -965     94    211   -965
   -43    194   -965    -52
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 3.3e-012
 0.000000  0.875000  0.125000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.500000  0.125000  0.375000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.375000  0.125000  0.500000  0.000000
 0.000000  0.000000  0.375000  0.625000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  0.750000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.875000  0.000000  0.125000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.250000  0.625000  0.125000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.500000  0.000000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CAT[GC][GA][AG][CT][GA][TG][TC][GT]ATTA[CG][AC][TA][GC][GC][CAT]
--------------------------------------------------------------------------------

Time  1.40 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 20 sites = 8 llr = 148 E-value = 1.7e-012 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :64:4:835:44:a66:::: pos.-specific C a1:::1:1:94::::::::a probability G :::a::::1131a:44:3a: matrix T :36:69364::5::::a8:: bits 2.9 * * 2.6 * * * ** 2.3 * * * * ** 2.1 * * * * ** Relative 1.8 * * * * ** Entropy 1.5 * * * ** * ** (26.7 bits) 1.2 * * * * ** * ** 0.9 * * ** ** ******** 0.6 * ***** ** ******** 0.3 ******************** 0.0 -------------------- Multilevel CATGTTATACATGAAATTGC consensus TA A TAT CA GG G sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- ThpsCp005 220 7.32e-10 CTTCCTTCGA CTTGTTAAACCTGAAGTTGC AAGAGACGAA bd1039 221 7.32e-10 CTTCCTTCGA CTTGTTAAACCTGAAGTTGC AAGAGACGAA ThpsCp043 199 1.48e-09 AGTATCACGT CATGTTTTACGTGAAATGGC TCATCAATGT bd980 159 1.48e-09 AGTATCACGT CATGTTTTACGTGAAATGGC TCATCAATGT ThpsCp129 78 1.69e-09 ACCACTGGAA CAAGATATTCAAGAGATTGC TTTTTTAAAG ThpsCp087 94 1.69e-09 ACCACTGGAA CAAGATATTCAAGAGATTGC TTTTTTAAAG bd1048 77 1.69e-09 ACCACTGGAA CAAGATATTCAAGAGATTGC TTTTTTAAAG ThpsCp049 135 1.81e-08 AGAGCACCGC CCTGTCACGGCGGAAGTTGC GAGTTCGAGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ThpsCp005                         7.3e-10  219_[+3]_261
bd1039                            7.3e-10  220_[+3]_260
ThpsCp043                         1.5e-09  198_[+3]_282
bd980                             1.5e-09  158_[+3]_322
ThpsCp129                         1.7e-09  77_[+3]_403
ThpsCp087                         1.7e-09  93_[+3]_387
bd1048                            1.7e-09  76_[+3]_404
ThpsCp049                         1.8e-08  134_[+3]_346
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=8
ThpsCp005                (  220) CTTGTTAAACCTGAAGTTGC  1
bd1039                   (  221) CTTGTTAAACCTGAAGTTGC  1
ThpsCp043                (  199) CATGTTTTACGTGAAATGGC  1
bd980                    (  159) CATGTTTTACGTGAAATGGC  1
ThpsCp129                (   78) CAAGATATTCAAGAGATTGC  1
ThpsCp087                (   94) CAAGATATTCAAGAGATTGC  1
bd1048                   (   77) CAAGATATTCAAGAGATTGC  1
ThpsCp049                (  135) CCTGTCACGGCGGAAGTTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4329 bayes= 9.07715 E= 1.7e-012
  -965    294   -965   -965
    89     -6   -965    -52
    15   -965   -965     80
  -965   -965    252   -965
    15   -965   -965     80
  -965     -6   -965    129
   115   -965   -965    -52
   -43     -6   -965     80
    57   -965    -48      7
  -965    274    -48   -965
    15    152     52   -965
    15   -965    -48     48
  -965   -965    252   -965
   157   -965   -965   -965
    89   -965    111   -965
    89   -965    111   -965
  -965   -965   -965    148
  -965   -965     52    106
  -965   -965    252   -965
  -965    294   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 1.7e-012
 0.000000  1.000000  0.000000  0.000000
 0.625000  0.125000  0.000000  0.250000
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.000000  1.000000  0.000000
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.125000  0.000000  0.875000
 0.750000  0.000000  0.000000  0.250000
 0.250000  0.125000  0.000000  0.625000
 0.500000  0.000000  0.125000  0.375000
 0.000000  0.875000  0.125000  0.000000
 0.375000  0.375000  0.250000  0.000000
 0.375000  0.000000  0.125000  0.500000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.625000  0.000000  0.375000  0.000000
 0.625000  0.000000  0.375000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[AT][TA]G[TA]T[AT][TA][AT]C[ACG][TA]GA[AG][AG]T[TG]GC
--------------------------------------------------------------------------------

Time  2.06 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9865                             1.78e-08  386_[+2(1.58e-08)]_45_\
    [+1(1.36e-07)]_28
bd1039                           6.92e-16  220_[+3(7.32e-10)]_61_\
    [+2(1.83e-09)]_64_[+1(5.08e-09)]_94
bd1048                           4.43e-23  76_[+3(1.69e-09)]_83_[+1(1.42e-13)]_\
    43_[+2(9.23e-13)]_237
bd980                            2.49e-13  158_[+3(1.48e-09)]_1_[+2(1.58e-08)]_\
    209_[+1(1.43e-07)]_71
ThpsCp005                        6.92e-16  219_[+3(7.32e-10)]_61_\
    [+2(1.83e-09)]_64_[+1(5.08e-09)]_95
ThpsCp043                        2.49e-13  198_[+3(1.48e-09)]_1_[+2(1.58e-08)]_\
    209_[+1(1.43e-07)]_31
ThpsCp049                        1.71e-07  134_[+3(1.81e-08)]_237_\
    [+1(1.72e-07)]_89
ThpsCp087                        4.43e-23  93_[+3(1.69e-09)]_83_[+1(1.42e-13)]_\
    43_[+2(9.23e-13)]_220
ThpsCp129                        4.43e-23  77_[+3(1.69e-09)]_83_[+1(1.42e-13)]_\
    43_[+2(9.23e-13)]_236
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************