********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/72/72.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
3184                     1.0000    500  35229                    1.0000    500
35279                    1.0000    500  35330                    1.0000    500
bd974                    1.0000    500  ThpsCp047                1.0000    500
ThpsCp084                1.0000    500  ThpsCp132                1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/72/72.seqs.fa -oc motifs/72 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.299 C 0.185 G 0.206 T 0.309
Background letter frequencies (from dataset with add-one prior applied):
A 0.299 C 0.185 G 0.207 T 0.309
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 8 llr = 175 E-value = 1.2e-022
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a3::::::5::83:3::::::
pos.-specific     C  :8:5835a3::::a::::55:
probability       G  ::::335:38:35:8aa855a
matrix            T  ::a5:5:::3a:3::::3:::

         bits    2.4        *     *
                 2.2        *     * **   *
                 1.9        *     * **   *
                 1.7 * * *  *  *  * **   *
Relative         1.5 *** * **  *  * ** ***
Entropy          1.2 *** * ** **  ********
(31.6 bits)      1.0 ***** ** *** ********
                 0.7 ***** ** *** ********
                 0.5 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           ACTCCTCCAGTAGCGGGGCCG
consensus             A TGCG CT GA A  TGG
sequence                  G  G   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
35279                        52  7.68e-14 TCACTTCTTA ACTCCTCCAGTAGCGGGGGCG GACTTGCGGG
35229                        52  7.68e-14 TCACTTCTTA ACTCCTCCAGTAGCGGGGGCG GACTTGCGGG
35330                        95  4.02e-13 TCACTTCTTA ACTCCTCCGGTAGCGGGGGCG GACTTGCGGG
3184                         69  4.02e-13 TGACTTCTTG ACTCCTCCGGTAGCGGGGGCG GACTTGCGGG
ThpsCp047                    66  2.15e-10 CGAACTCGCA ACTTCCGCCGTGACAGGGCGG TGCTCTAACC
bd974                        67  2.15e-10 CGAACTCGCA ACTTCCGCCGTGACAGGGCGG TGCTCTAACC
ThpsCp132                   433  1.13e-09 AAAACTAAAA AATTGGGCATTATCGGGTCGG ATTCGTCTTG
ThpsCp084                   435  1.13e-09 AAAACTAAAA AATTGGGCATTATCGGGTCGG ATTCGTCTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35279                             7.7e-14  51_[+1]_428
35229                             7.7e-14  51_[+1]_428
35330                               4e-13  94_[+1]_385
3184                                4e-13  68_[+1]_411
ThpsCp047                         2.1e-10  65_[+1]_414
bd974                             2.1e-10  66_[+1]_413
ThpsCp132                         1.1e-09  432_[+1]_47
ThpsCp084                         1.1e-09  434_[+1]_45
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=8
35279                    (   52) ACTCCTCCAGTAGCGGGGGCG  1
35229                    (   52) ACTCCTCCAGTAGCGGGGGCG  1
35330                    (   95) ACTCCTCCGGTAGCGGGGGCG  1
3184                     (   69) ACTCCTCCGGTAGCGGGGGCG  1
ThpsCp047                (   66) ACTTCCGCCGTGACAGGGCGG  1
bd974                    (   67) ACTTCCGCCGTGACAGGGCGG  1
ThpsCp132                (  433) AATTGGGCATTATCGGGTCGG  1
ThpsCp084                (  435) AATTGGGCATTATCGGGTCGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3840 bayes= 8.90388 E= 1.2e-022
   174   -965   -965   -965
   -26    202   -965   -965
  -965   -965   -965    169
  -965    143   -965     70
  -965    202     28   -965
  -965     43     28     70
  -965    143    127   -965
  -965    243   -965   -965
    74     43     28   -965
  -965   -965    186    -30
  -965   -965   -965    169
   132   -965     28   -965
   -26   -965    127    -30
  -965    243   -965   -965
   -26   -965    186   -965
  -965   -965    227   -965
  -965   -965    227   -965
  -965   -965    186    -30
  -965    143    127   -965
  -965    143    127   -965
  -965   -965    227   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 1.2e-022
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.250000  0.250000  0.500000
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.750000  0.000000  0.250000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
A[CA]T[CT][CG][TCG][CG]C[ACG][GT]T[AG][GAT]C[GA]GG[GT][CG][CG]G
--------------------------------------------------------------------------------

Time 0.53 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 8 llr = 162 E-value = 1.1e-016
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :35:::::::a::63383:::
pos.-specific     C  a3::8::8:::5::::33a8:
probability       G  :5:8338:::::a485:5:3:
matrix            T  ::53:833aa:5:::3::::a

         bits    2.4 *                 *
                 2.2 *           *     *
                 1.9 *           *     *
                 1.7 *   *   *** *     ***
Relative         1.5 *   *  **** *     ***
Entropy          1.2 *  ** ***** * * * ***
(29.3 bits)      1.0 *  ************ * ***
                 0.7 *************** *****
                 0.5 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CGAGCTGCTTACGAGGAGCCT
consensus             ATTGGTT   T GAACA G
sequence              C             T C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
3184                        112  9.67e-14 GCCTTGGTGG CGAGCTGCTTACGAGGAGCCT TTCCTCCAGT
35330                       138  1.64e-13 GCTTTGGTGG CGAGCTGCTTACGGGGAGCCT TTCCTCCGGT
35279                        95  1.64e-13 GCTTTGGTGG CGAGCTGCTTACGGGGAGCCT TTCCTCCGGT
35229                        95  1.64e-13 GCTTTGGTGG CGAGCTGCTTACGGGGAGCCT TTCCTCCGGT
ThpsCp047                   327  2.90e-09 GAACCTACGA CCTTGGGCTTATGAGTCCCCT GCTCTAACCA
bd974                       328  2.90e-09 GAACCTACGA CCTTGGGCTTATGAGTCCCCT GCTCTAACCA
ThpsCp132                   325  7.51e-09 AGCAATGTAC CATGCTTTTTATGAAAAACGT GAATTTACAA
ThpsCp084                   327  7.51e-09 AGCAATGTAC CATGCTTTTTATGAAAAACGT GAATTTACAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3184                              9.7e-14  111_[+2]_368
35330                             1.6e-13  137_[+2]_342
35279                             1.6e-13  94_[+2]_385
35229                             1.6e-13  94_[+2]_385
ThpsCp047                         2.9e-09  326_[+2]_153
bd974                             2.9e-09  327_[+2]_152
ThpsCp132                         7.5e-09  324_[+2]_155
ThpsCp084                         7.5e-09  326_[+2]_153
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
3184                     (  112) CGAGCTGCTTACGAGGAGCCT  1
35330                    (  138) CGAGCTGCTTACGGGGAGCCT  1
35279                    (   95) CGAGCTGCTTACGGGGAGCCT  1
35229                    (   95) CGAGCTGCTTACGGGGAGCCT  1
ThpsCp047                (  327) CCTTGGGCTTATGAGTCCCCT  1
bd974                    (  328) CCTTGGGCTTATGAGTCCCCT  1
ThpsCp132                (  325) CATGCTTTTTATGAAAAACGT  1
ThpsCp084                (  327) CATGCTTTTTATGAAAAACGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3840 bayes= 8.90388 E= 1.1e-016
  -965    243   -965   -965
   -26     43    127   -965
    74   -965   -965     70
  -965   -965    186    -30
  -965    202     28   -965
  -965   -965     28    128
  -965   -965    186    -30
  -965    202   -965    -30
  -965   -965   -965    169
  -965   -965   -965    169
   174   -965   -965   -965
  -965    143   -965     70
  -965   -965    227   -965
   106   -965     86   -965
   -26   -965    186   -965
   -26   -965    127    -30
   132     43   -965   -965
   -26     43    127   -965
  -965    243   -965   -965
  -965    202     28   -965
  -965   -965   -965    169
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 1.1e-016
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.250000  0.500000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.625000  0.000000  0.375000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.250000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[GAC][AT][GT][CG][TG][GT][CT]TTA[CT]G[AG][GA][GAT][AC][GAC]C[CG]T
--------------------------------------------------------------------------------

Time 1.03 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 8 llr = 158 E-value = 1.1e-014
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  86a::3a:a38:4:333856:
pos.-specific     C  :1::a::8::3:63::8:53:
probability       G  :3:a:::3:8:a:6:::::1:
matrix            T  3::::8:::::::188:3::a

         bits    2.4     *
                 2.2    **      *
                 1.9    **      *
                 1.7   *** ***  *        *
Relative         1.5   *** ***  *    *   *
Entropy          1.2   *** *******   *   *
(28.5 bits)      1.0 * ***************** *
                 0.7 *********************
                 0.5 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AAAGCTACAGAGCGTTCAAAT
consensus            TG   A G AC ACAAATCC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
35279                       253  2.32e-13 AGTCTCGTAC AAAGCTACAGAGCGTTCACAT ACGACAATCT
35229                       253  2.32e-13 AGTCTCGTAC AAAGCTACAGAGCGTTCACAT ACGACAATCT
3184                        271  2.91e-11 AGTCTGGTAC AAAGCTACAGAGAGTTCACGT ACGACGATCT
35330                       307  1.42e-10 GCAATCTACT ACAGCTACAGAGCTTTCACAT GCGACAATCT
ThpsCp047                   359  3.53e-09 CTCTAACCAC TGAGCTACAGAGCCTTATACT ATATCTTTAT
bd974                       360  3.53e-09 CTCTAACCAC TGAGCTACAGAGCCTTATACT ATATCTTTAT
ThpsCp132                   177  7.80e-09 ATTTGCCTCA AAAGCAAGAACGAGAACAAAT TTTCAAAATT
ThpsCp084                   179  7.80e-09 ATTTGCCTCA AAAGCAAGAACGAGAACAAAT TTTCAAAATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35279                             2.3e-13  252_[+3]_227
35229                             2.3e-13  252_[+3]_227
3184                              2.9e-11  270_[+3]_209
35330                             1.4e-10  306_[+3]_173
ThpsCp047                         3.5e-09  358_[+3]_121
bd974                             3.5e-09  359_[+3]_120
ThpsCp132                         7.8e-09  176_[+3]_303
ThpsCp084                         7.8e-09  178_[+3]_301
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=8
35279                    (  253) AAAGCTACAGAGCGTTCACAT  1
35229                    (  253) AAAGCTACAGAGCGTTCACAT  1
3184                     (  271) AAAGCTACAGAGAGTTCACGT  1
35330                    (  307) ACAGCTACAGAGCTTTCACAT  1
ThpsCp047                (  359) TGAGCTACAGAGCCTTATACT  1
bd974                    (  360) TGAGCTACAGAGCCTTATACT  1
ThpsCp132                (  177) AAAGCAAGAACGAGAACAAAT  1
ThpsCp084                (  179) AAAGCAAGAACGAGAACAAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3840 bayes= 8.90388 E= 1.1e-014
   132   -965   -965    -30
   106    -57     28   -965
   174   -965   -965   -965
  -965   -965    227   -965
  -965    243   -965   -965
   -26   -965   -965    128
   174   -965   -965   -965
  -965    202     28   -965
   174   -965   -965   -965
   -26   -965    186   -965
   132     43   -965   -965
  -965   -965    227   -965
    32    175   -965   -965
  -965     43    160   -130
   -26   -965   -965    128
   -26   -965   -965    128
   -26    202   -965   -965
   132   -965   -965    -30
    74    143   -965   -965
   106     43    -72   -965
  -965   -965   -965    169
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 1.1e-014
 0.750000  0.000000  0.000000  0.250000
 0.625000  0.125000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.375000  0.625000  0.000000  0.000000
 0.000000  0.250000  0.625000  0.125000
 0.250000  0.000000  0.000000  0.750000
 0.250000  0.000000  0.000000  0.750000
 0.250000  0.750000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.500000  0.500000  0.000000  0.000000
 0.625000  0.250000  0.125000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AT][AG]AGC[TA]A[CG]A[GA][AC]G[CA][GC][TA][TA][CA][AT][AC][AC]T
--------------------------------------------------------------------------------

Time 1.54 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3184                             2.66e-25  68_[+1(4.02e-13)]_22_[+2(9.67e-14)]_\
    138_[+3(2.91e-11)]_209
35229                            8.17e-28  51_[+1(7.68e-14)]_22_[+2(1.64e-13)]_\
    137_[+3(2.32e-13)]_227
35279                            8.17e-28  51_[+1(7.68e-14)]_22_[+2(1.64e-13)]_\
    137_[+3(2.32e-13)]_227
35330                            2.06e-24  94_[+1(4.02e-13)]_22_[+2(1.64e-13)]_\
    148_[+3(1.42e-10)]_45_[+1(9.02e-05)]_107
bd974                            2.34e-16  66_[+1(2.15e-10)]_240_\
    [+2(2.90e-09)]_11_[+3(3.53e-09)]_120
ThpsCp047                        2.34e-16  65_[+1(2.15e-10)]_240_\
    [+2(2.90e-09)]_11_[+3(3.53e-09)]_121
ThpsCp084                        5.99e-15  178_[+3(7.80e-09)]_127_\
    [+2(7.51e-09)]_87_[+1(1.13e-09)]_45
ThpsCp132                        5.99e-15  176_[+3(7.80e-09)]_127_\
    [+2(7.51e-09)]_87_[+1(1.13e-09)]_47
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************