********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/336/336.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
25241                    1.0000    500  28199                    1.0000    500
bd2063                   1.0000    500  bd934                    1.0000    500
ThpsCp026                1.0000    500  ThpsCp077                1.0000    500
ThpsCp078                1.0000    500  ThpsCp138                1.0000    500
ThpsCp139                1.0000    500  ThpsCt021                1.0000    500
ThpsCt028                1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/336/336.seqs.fa -oc motifs/336 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.373 C 0.148 G 0.155 T 0.324
Background letter frequencies (from dataset with add-one prior applied):
A 0.373 C 0.148 G 0.155 T 0.324
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   8  llr = 181  E-value = 8.7e-023
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::3::a:aa:a33:::a8:::
pos.-specific     C  ::::::8:::::8:83::3::
probability       G  :::a::3::::8::3:::8::
matrix            T  aa8:a::::a:::a:8:3:aa

         bits    2.8    *
                 2.5    *
                 2.2    *
                 1.9    *  *       *   *
Relative         1.7 ** ** *  * ****   ***
Entropy          1.4 ** ************ * ***
(32.6 bits)      1.1 ** ************** ***
                 0.8 ***************** ***
                 0.6 *********************
                 0.3 *********************
                 0.0 ---------------------

Multilevel           TTTGTACAATAGCTCTAAGTT
consensus              A   G    AA GC TC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
ThpsCt028                   139  1.28e-12 TAAATTGTAT TTTGTACAATAGCTCTAAGTT AAAAAACTAG
ThpsCt021                   141  1.28e-12 TAAATTGTAT TTTGTACAATAGCTCTAAGTT AAAAAACTAG
ThpsCp139                   448  1.28e-12 TAAATTGTAT TTTGTACAATAGCTCTAAGTT AAAAAACTAG
ThpsCp138                    17  1.28e-12 TAAATTGTAT TTTGTACAATAGCTCTAAGTT AAAAAACTAG
ThpsCp078                    19  1.28e-12 TAAATTGTAT TTTGTACAATAGCTCTAAGTT AAAAAACTAG
ThpsCp077                   450  1.28e-12 TAAATTGTAT TTTGTACAATAGCTCTAAGTT AAAAAACTAG
ThpsCp026                   258  1.48e-09 AAATTATGGA TTAGTAGAATAAATGCATCTT CAAGAGAAAA
bd2063                      410  1.48e-09 AAATTATGGA TTAGTAGAATAAATGCATCTT CAAGAGAAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ThpsCt028                         1.3e-12  138_[+1]_341
ThpsCt021                         1.3e-12  140_[+1]_339
ThpsCp139                         1.3e-12  447_[+1]_32
ThpsCp138                         1.3e-12  16_[+1]_463
ThpsCp078                         1.3e-12  18_[+1]_461
ThpsCp077                         1.3e-12  449_[+1]_30
ThpsCp026                         1.5e-09  257_[+1]_222
bd2063                            1.5e-09  409_[+1]_70
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=8
ThpsCt028                (  139) TTTGTACAATAGCTCTAAGTT  1
ThpsCt021                (  141) TTTGTACAATAGCTCTAAGTT  1
ThpsCp139                (  448) TTTGTACAATAGCTCTAAGTT  1
ThpsCp138                (   17) TTTGTACAATAGCTCTAAGTT  1
ThpsCp078                (   19) TTTGTACAATAGCTCTAAGTT  1
ThpsCp077                (  450) TTTGTACAATAGCTCTAAGTT  1
ThpsCp026                (  258) TTAGTAGAATAAATGCATCTT  1
bd2063                   (  410) TTAGTAGAATAAATGCATCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 9.36413 E= 8.7e-023
  -965   -965   -965    162
  -965   -965   -965    162
   -58   -965   -965    121
  -965   -965    269   -965
  -965   -965   -965    162
   142   -965   -965   -965
  -965    234     69   -965
   142   -965   -965   -965
   142   -965   -965   -965
  -965   -965   -965    162
   142   -965   -965   -965
   -58   -965    228   -965
   -58    234   -965   -965
  -965   -965   -965    162
  -965    234     69   -965
  -965     76   -965    121
   142   -965   -965   -965
   101   -965   -965    -37
  -965     76    228   -965
  -965   -965   -965    162
  -965   -965   -965    162
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 8.7e-023
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.250000  0.000000  0.750000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TT[TA]GTA[CG]AATA[GA][CA]T[CG][TC]A[AT][GC]TT
--------------------------------------------------------------------------------

Time  1.08 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   9  llr = 166  E-value = 1.5e-013
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::a:2:4:::2:1:::7:23
pos.-specific     C  3:2:2:8:2718::3:2:284
probability       G  :28:7:2::37:3::1::8:2
matrix            T  78::18:68:2:797983:::

         bits    2.8
                 2.5
                 2.2
                 1.9   *   *  *        *
Relative         1.7   *   *  * *      **
Entropy          1.4   *** *  * *      **
(26.7 bits)      1.1 ***** * ********* **
                 0.8 ******* ********* ***
                 0.6 *********************
                 0.3 *********************
                 0.0 ---------------------

Multilevel           TTGAGTCTTCGCTTTTTAGCC
consensus            CGC CAGACGTAG C CTCAA
sequence                                 G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
ThpsCt028                    57  8.57e-13 TTTTTTTAAT TTGAGTCATCGCTTTTTAGCC ATATAATTAC
ThpsCt021                    58  8.57e-13 TTTTTTTAAT TTGAGTCATCGCTTTTTAGCC ATATAATTAC
ThpsCp139                   366  8.57e-13 TTTTTTTAAT TTGAGTCATCGCTTTTTAGCC ATATAATTAC
ThpsCp077                   367  8.57e-13 TTTTTTTAAT TTGAGTCATCGCTTTTTAGCC ATATAATTAC
ThpsCp138                   433  2.12e-08 AACTTTTGTT TTGAGAGTTCGAGTCTCTCCA TCCGCAAGTA
ThpsCp078                   435  2.12e-08 AACTTTTGTT TTGAGAGTTCGAGTCTCTCCA TCCGCAAGTA
ThpsCp026                   153  2.92e-08 TACAGAGGTG CGCACTCTCGTCTTTTTAGAG TTGCTAACCA
bd2063                      305  2.92e-08 TACAGAGGTG CGCACTCTCGTCTTTTTAGAG TTGCTAACCA
bd934                        90  5.89e-08 CACAGCGGTC CTGATTCTTGCCGACGTTGCA GTGCAGGAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ThpsCt028                         8.6e-13  56_[+2]_423
ThpsCt021                         8.6e-13  57_[+2]_422
ThpsCp139                         8.6e-13  365_[+2]_114
ThpsCp077                         8.6e-13  366_[+2]_113
ThpsCp138                         2.1e-08  432_[+2]_47
ThpsCp078                         2.1e-08  434_[+2]_45
ThpsCp026                         2.9e-08  152_[+2]_327
bd2063                            2.9e-08  304_[+2]_175
bd934                             5.9e-08  89_[+2]_390
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=9
ThpsCt028                (   57) TTGAGTCATCGCTTTTTAGCC  1
ThpsCt021                (   58) TTGAGTCATCGCTTTTTAGCC  1
ThpsCp139                (  366) TTGAGTCATCGCTTTTTAGCC  1
ThpsCp077                (  367) TTGAGTCATCGCTTTTTAGCC  1
ThpsCp138                (  433) TTGAGAGTTCGAGTCTCTCCA  1
ThpsCp078                (  435) TTGAGAGTTCGAGTCTCTCCA  1
ThpsCp026                (  153) CGCACTCTCGTCTTTTTAGAG  1
bd2063                   (  305) CGCACTCTCGTCTTTTTAGAG  1
bd934                    (   90) CTGATTCTTGCCGACGTTGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.043 E= 1.5e-013
  -982    117   -982    104
  -982   -982     52    126
  -982     59    233   -982
   142   -982   -982   -982
  -982     59    211   -154
   -75   -982   -982    126
  -982    239     52   -982
    25   -982   -982     78
  -982     59   -982    126
  -982    217    111   -982
  -982    -41    211    -54
   -75    239   -982   -982
  -982   -982    111    104
  -174   -982   -982    145
  -982    117   -982    104
  -982   -982    -48    145
  -982     59   -982    126
    84   -982   -982      4
  -982     59    233   -982
   -75    239   -982   -982
   -16    159     52   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 1.5e-013
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.222222  0.777778  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.222222  0.666667  0.111111
 0.222222  0.000000  0.000000  0.777778
 0.000000  0.777778  0.222222  0.000000
 0.444444  0.000000  0.000000  0.555556
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.111111  0.666667  0.222222
 0.222222  0.777778  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.111111  0.000000  0.000000  0.888889
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.222222  0.000000  0.777778
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.222222  0.777778  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.333333  0.444444  0.222222  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TC][TG][GC]A[GC][TA][CG][TA][TC][CG][GT][CA][TG]T[TC]T[TC][AT][GC][CA][CAG]
--------------------------------------------------------------------------------

Time  2.10 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   6  llr = 149  E-value = 3.2e-015
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::7:::aaa777:::a3
pos.-specific     C  ::7::33a7::::3::a:::7
probability       G  7a::a:::3:::::3::aa::
matrix            T  3:3a:7:::a:::::3:::::

         bits    2.8  *  *  *        ***
                 2.5  *  *  *        ***
                 2.2  *  *  *        ***
                 1.9  *  *  **       ***
Relative         1.7  * **  ***      ***
Entropy          1.4 *****  ******   *****
(35.8 bits)      1.1 ****** ******   *****
                 0.8 *************** *****
                 0.6 *********************
                 0.3 *********************
                 0.0 ---------------------

Multilevel           GGCTGTACCTAAAAAACGGAC
consensus            T T  CC G    CGT    A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
ThpsCt028                   193  3.40e-13 AAAACATCAT GGCTGTACCTAAAAAACGGAC ATCAAAAGCA
ThpsCt021                   195  3.40e-13 AAAACATCAT GGCTGTACCTAAAAAACGGAC ATCAAAAGCA
ThpsCp138                    71  3.40e-13 AAAACATCAT GGCTGTACCTAAAAAACGGAC ATCAAAAGCA
ThpsCp078                    73  3.40e-13 AAAACATCAT GGCTGTACCTAAAAAACGGAC ATCAAAAGCA
ThpsCp026                    97  3.84e-11 AACGAGGAAA TGTTGCCCGTAAACGTCGGAA GAAAATTTTA
bd2063                      249  3.84e-11 AACGAGGAAA TGTTGCCCGTAAACGTCGGAA GAAAATTTTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ThpsCt028                         3.4e-13  192_[+3]_287
ThpsCt021                         3.4e-13  194_[+3]_285
ThpsCp138                         3.4e-13  70_[+3]_409
ThpsCp078                         3.4e-13  72_[+3]_407
ThpsCp026                         3.8e-11  96_[+3]_383
bd2063                            3.8e-11  248_[+3]_231
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
ThpsCt028                (  193) GGCTGTACCTAAAAAACGGAC  1
ThpsCt021                (  195) GGCTGTACCTAAAAAACGGAC  1
ThpsCp138                (   71) GGCTGTACCTAAAAAACGGAC  1
ThpsCp078                (   73) GGCTGTACCTAAAAAACGGAC  1
ThpsCp026                (   97) TGTTGCCCGTAAACGTCGGAA  1
bd2063                   (  249) TGTTGCCCGTAAACGTCGGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.2276 E= 3.2e-015
  -923   -923    210      4
  -923   -923    269   -923
  -923    217   -923      4
  -923   -923   -923    162
  -923   -923    269   -923
  -923    117   -923    104
    84    117   -923   -923
  -923    276   -923   -923
  -923    217    111   -923
  -923   -923   -923    162
   142   -923   -923   -923
   142   -923   -923   -923
   142   -923   -923   -923
    84    117   -923   -923
    84   -923    111   -923
    84   -923   -923      4
  -923    276   -923   -923
  -923   -923    269   -923
  -923   -923    269   -923
   142   -923   -923   -923
   -16    217   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 3.2e-015
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.666667  0.333333  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GT]G[CT]TG[TC][AC]C[CG]TAAA[AC][AG][AT]CGGA[CA]
--------------------------------------------------------------------------------

Time  3.23 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25241                            9.52e-01  500
28199                            7.04e-01  500
bd2063                           1.79e-16  248_[+3(3.84e-11)]_35_\
    [+2(2.92e-08)]_84_[+1(1.48e-09)]_70
bd934                            8.22e-04  89_[+2(5.89e-08)]_390
ThpsCp026                        1.79e-16  96_[+3(3.84e-11)]_35_[+2(2.92e-08)]_\
    84_[+1(1.48e-09)]_222
ThpsCp077                        1.64e-16  366_[+2(8.57e-13)]_62_\
    [+1(1.28e-12)]_30
ThpsCp078                        1.61e-21  18_[+1(1.28e-12)]_33_[+3(3.40e-13)]_\
    341_[+2(2.12e-08)]_45
ThpsCp138                        1.61e-21  16_[+1(1.28e-12)]_33_[+3(3.40e-13)]_\
    341_[+2(2.12e-08)]_47
ThpsCp139                        1.64e-16  365_[+2(8.57e-13)]_61_\
    [+1(1.28e-12)]_32
ThpsCt021                        9.08e-26  57_[+2(8.57e-13)]_62_[+1(1.28e-12)]_\
    33_[+3(3.40e-13)]_285
ThpsCt028                        9.08e-26  56_[+2(8.57e-13)]_61_[+1(1.28e-12)]_\
    33_[+3(3.40e-13)]_287
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************