********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/234/234.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11077                    1.0000    500  11084                    1.0000    500
11221                    1.0000    500  11488                    1.0000    500
12114                    1.0000    500  1355                     1.0000    500
1537                     1.0000    500  2042                     1.0000    500
21831                    1.0000    500  264598                   1.0000    500
3103                     1.0000    500  3415                     1.0000    500
3765                     1.0000    500  5128                     1.0000    500
5381                     1.0000    500  5805                     1.0000    500
6793                     1.0000    500  681                      1.0000    500
7632                     1.0000    500  8517                     1.0000    500
879                      1.0000    500  8988                     1.0000    500
9161                     1.0000    500  9244                     1.0000    500
9298                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/234/234.seqs.fa -oc motifs/234 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       25    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           12500    N=              25
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.229 G 0.236 T 0.261
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.229 G 0.236 T 0.261
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 16 llr = 185 E-value = 3.6e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::412532::3::3::
pos.-specific     C  ::2:13::41:::111
probability       G  48438311192:a::9
matrix            T  6317::676:5a:691

         bits    2.1             *
                 1.9            **
                 1.7          * **
                 1.5          * ** **
Relative         1.3  *       * ** **
Entropy          1.1 **  *    * ** **
(16.7 bits)      0.9 ** **   ** ** **
                 0.6 ** ** **** *****
                 0.4 ** *************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TGATGATTTGTTGTTG
consensus            GTGG CA C A  A
sequence                  G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
5128                        265  5.07e-10 TGATTCAGAA GGGTGATTTGTTGTTG AATCGTATTC
8988                         52  1.57e-07 TGCCATTCCC TTGTGGATTGTTGTTG GTTTTCTGAC
264598                      232  1.57e-07 ATCTTGGCCA TGATGATACGATGTTG CAGTGTTTTA
11221                       233  1.57e-07 ATCTTGGCCA TGATGATACGATGTTG CAGTGTTTTA
11488                       267  4.72e-07 AGTTGGTGAT TGATGATGTGATGATG TGCATGTAGG
5381                        398  5.89e-07 CTTGATAGGG TGAGGAGTTGTTGATG TAGACATCGT
21831                       415  5.89e-07 TTGAGTACCG GGGTGGTATGATGATG TTTTGAGACA
11084                       328  2.07e-06 ATGACTGGCG TGCTAATGTGATGTTG ATGTGAATCG
879                         308  2.90e-06 TTCAGCTCTC TTATGAATCGTTGACG ACTAGGGACA
9244                         67  4.99e-06 GTGCATATGC GTATGCATCGTTGTTC ATCACCAACT
681                          99  4.99e-06 TCTGTTGCCA TGGAAAGTTGTTGTTG GTATCAGCAT
1537                        353  4.99e-06 CTATCATATG GGGGGCATCGGTGCTG TTCATTATTG
9298                          4  8.73e-06        AGG GGTTGGTTTGTTGATT GCAAGTAGAA
6793                        188  9.94e-06 CTAGAGATGT TGCTCGTTTGGTGTCG TGATACAACC
11077                       197  9.94e-06 GTTTCCTGGG GTGGACTTGGTTGTTG GTACACTGGC
12114                        87  1.20e-05 TCGTGAAGTC GGCGGCATCCGTGTTG AGGACAGCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5128                              5.1e-10  264_[+1]_220
8988                              1.6e-07  51_[+1]_433
264598                            1.6e-07  231_[+1]_253
11221                             1.6e-07  232_[+1]_252
11488                             4.7e-07  266_[+1]_218
5381                              5.9e-07  397_[+1]_87
21831                             5.9e-07  414_[+1]_70
11084                             2.1e-06  327_[+1]_157
879                               2.9e-06  307_[+1]_177
9244                                5e-06  66_[+1]_418
681                                 5e-06  98_[+1]_386
1537                                5e-06  352_[+1]_132
9298                              8.7e-06  3_[+1]_481
6793                              9.9e-06  187_[+1]_297
11077                             9.9e-06  196_[+1]_288
12114                             1.2e-05  86_[+1]_398
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=16
5128                     (  265) GGGTGATTTGTTGTTG  1
8988                     (   52) TTGTGGATTGTTGTTG  1
264598                   (  232) TGATGATACGATGTTG  1
11221                    (  233) TGATGATACGATGTTG  1
11488                    (  267) TGATGATGTGATGATG  1
5381                     (  398) TGAGGAGTTGTTGATG  1
21831                    (  415) GGGTGGTATGATGATG  1
11084                    (  328) TGCTAATGTGATGTTG  1
879                      (  308) TTATGAATCGTTGACG  1
9244                     (   67) GTATGCATCGTTGTTC  1
681                      (   99) TGGAAAGTTGTTGTTG  1
1537                     (  353) GGGGGCATCGGTGCTG  1
9298                     (    4) GGTTGGTTTGTTGATT  1
6793                     (  188) TGCTCGTTTGGTGTCG  1
11077                    (  197) GTGGACTTGGTTGTTG  1
12114                    (   87) GGCGGCATCCGTGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 12125 bayes= 9.56379 E= 3.6e-002
 -1064  -1064     89    111
 -1064  -1064    166     -6
    45    -29     66   -206
  -213  -1064      8    140
   -55   -187    166  -1064
    87     13      8  -1064
    19  -1064    -92    111
   -55  -1064    -92    140
 -1064     71   -192    111
 -1064   -187    199  -1064
    19  -1064    -33     94
 -1064  -1064  -1064    194
 -1064  -1064    208  -1064
    19   -187  -1064    126
 -1064    -87  -1064    174
 -1064   -187    189   -206
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 16 E= 3.6e-002
 0.000000  0.000000  0.437500  0.562500
 0.000000  0.000000  0.750000  0.250000
 0.375000  0.187500  0.375000  0.062500
 0.062500  0.000000  0.250000  0.687500
 0.187500  0.062500  0.750000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.312500  0.000000  0.125000  0.562500
 0.187500  0.000000  0.125000  0.687500
 0.000000  0.375000  0.062500  0.562500
 0.000000  0.062500  0.937500  0.000000
 0.312500  0.000000  0.187500  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.312500  0.062500  0.000000  0.625000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.062500  0.875000  0.062500
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TG][GT][AG][TG]G[ACG][TA]T[TC]G[TA]TG[TA]TG
--------------------------------------------------------------------------------

Time 5.23 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 9 llr = 150 E-value = 6.3e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  12961113::111::a:9763
pos.-specific     C  98::87324a61868:a1237
probability       G  ::1:12::::2:::::::11:
matrix            T  :::4::646:18142::::::

         bits    2.1          *      *
                 1.9          *     **
                 1.7 *        *     **
                 1.5 * *      *     ***
Relative         1.3 ***      *    ****
Entropy          1.1 *** *   ** *******  *
(24.1 bits)      0.9 ******  ** *******  *
                 0.6 ******* ** **********
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCAACCTTTCCTCCCACAAAC
consensus             A T GCAC G  TT   CCA
sequence                    C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
264598                      457  5.09e-11 ACCCCTCCCT CCATCCCTCCCTCTCACACAC TCTCACGGCA
11221                       458  5.09e-11 ACCCCTCCCT CCATCCCTCCCTCTCACACAC TCTCACGGCA
11084                       162  9.40e-11 ACGACTCCGA CCAACCTTCCTTCCCACAACC TCCGTACATC
11488                       409  6.53e-09 ACACACAAAC CAAACCAATCCTCTCACAAGC AGCAAAAAGC
9244                        410  1.01e-08 ACAACAGGCA CCAACATCTCCTCCCACCACA AGTGATCCGA
2042                        450  3.77e-08 TTGGTCTCGA CCAACGCATCGACCTACAAAA CGTTTCTCAG
9161                        298  9.37e-08 GAATAGCAAC ACAAGCTCTCGTCTCACAGCC CAACCACTCA
7632                        163  1.12e-07 TGGCGCCTGC CCGTCGTACCCCACCACAAAA AAAATCATAG
12114                       363  1.65e-07 TGGTTTCATT CAATACTTTCATTCTACAAAC ATACAAACCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
264598                            5.1e-11  456_[+2]_23
11221                             5.1e-11  457_[+2]_22
11084                             9.4e-11  161_[+2]_318
11488                             6.5e-09  408_[+2]_71
9244                                1e-08  409_[+2]_70
2042                              3.8e-08  449_[+2]_30
9161                              9.4e-08  297_[+2]_182
7632                              1.1e-07  162_[+2]_317
12114                             1.7e-07  362_[+2]_117
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=9
264598                   (  457) CCATCCCTCCCTCTCACACAC  1
11221                    (  458) CCATCCCTCCCTCTCACACAC  1
11084                    (  162) CCAACCTTCCTTCCCACAACC  1
11488                    (  409) CAAACCAATCCTCTCACAAGC  1
9244                     (  410) CCAACATCTCCTCCCACCACA  1
2042                     (  450) CCAACGCATCGACCTACAAAA  1
9161                     (  298) ACAAGCTCTCGTCTCACAGCC  1
7632                     (  163) CCGTCGTACCCCACCACAAAA  1
12114                    (  363) CAATACTTTCATTCTACAAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 12000 bayes= 10.5141 E= 6.3e-002
  -130    196   -982   -982
   -30    177   -982   -982
   170   -982   -109   -982
   102   -982   -982     77
  -130    177   -109   -982
  -130    154     -9   -982
  -130     54   -982    109
    28     -4   -982     77
  -982     96   -982    109
  -982    213   -982   -982
  -130    128     -9   -123
  -130   -104   -982    157
  -130    177   -982   -123
  -982    128   -982     77
  -982    177   -982    -23
   187   -982   -982   -982
  -982    213   -982   -982
   170   -104   -982   -982
   128     -4   -109   -982
   102     54   -109   -982
    28    154   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 6.3e-002
 0.111111  0.888889  0.000000  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.555556  0.000000  0.000000  0.444444
 0.111111  0.777778  0.111111  0.000000
 0.111111  0.666667  0.222222  0.000000
 0.111111  0.333333  0.000000  0.555556
 0.333333  0.222222  0.000000  0.444444
 0.000000  0.444444  0.000000  0.555556
 0.000000  1.000000  0.000000  0.000000
 0.111111  0.555556  0.222222  0.111111
 0.111111  0.111111  0.000000  0.777778
 0.111111  0.777778  0.000000  0.111111
 0.000000  0.555556  0.000000  0.444444
 0.000000  0.777778  0.000000  0.222222
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.666667  0.222222  0.111111  0.000000
 0.555556  0.333333  0.111111  0.000000
 0.333333  0.666667  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[CA]A[AT]C[CG][TC][TAC][TC]C[CG]TC[CT][CT]ACA[AC][AC][CA]
--------------------------------------------------------------------------------

Time 10.31 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 8 llr = 105 E-value = 2.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::9::4::a:
pos.-specific     C  9:aa::a:1a:5
probability       G  19:::::55:::
matrix            T  :1::1a:14::5

         bits    2.1   **  *  *
                 1.9   ** **  **
                 1.7   ** **  **
                 1.5 **** **  **
Relative         1.3 *******  **
Entropy          1.1 *******  ***
(19.0 bits)      0.9 *******  ***
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGCCATCGGCAC
consensus                   AT  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
8517                        126  1.64e-07 ACGGTGTTTC CGCCATCAGCAC CAAGACGTTG
8988                        132  2.60e-07 TCTTTACCTT CGCCATCAGCAT TGTCACGCAA
264598                      205  2.60e-07 CGTAGATTAG CGCCATCGTCAT CGTTGATCTT
11221                       206  2.60e-07 CGTAGATTAG CGCCATCGTCAT CGTTGATCTT
7632                        230  5.69e-07 ATCAAAACGC CGCCATCACCAC CCCAGAACGC
11488                       244  9.12e-07 GAGGAGATGG GGCCATCGGCAT CAGTTGGTGA
11084                       469  1.31e-06 GGACGTTGAA CTCCATCGTCAC GGATCCTGAC
2042                         85  2.11e-06 TGTGTTTATT CGCCTTCTGCAC GGCTTGTTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8517                              1.6e-07  125_[+3]_363
8988                              2.6e-07  131_[+3]_357
264598                            2.6e-07  204_[+3]_284
11221                             2.6e-07  205_[+3]_283
7632                              5.7e-07  229_[+3]_259
11488                             9.1e-07  243_[+3]_245
11084                             1.3e-06  468_[+3]_20
2042                              2.1e-06  84_[+3]_404
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
8517                     (  126) CGCCATCAGCAC  1
8988                     (  132) CGCCATCAGCAT  1
264598                   (  205) CGCCATCGTCAT  1
11221                    (  206) CGCCATCGTCAT  1
7632                     (  230) CGCCATCACCAC  1
11488                    (  244) GGCCATCGGCAT  1
11084                    (  469) CTCCATCGTCAC  1
2042                     (   85) CGCCTTCTGCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 12225 bayes= 11.3139 E= 2.4e-001
  -965    194    -92   -965
  -965   -965    189   -106
  -965    213   -965   -965
  -965    213   -965   -965
   167   -965   -965   -106
  -965   -965   -965    194
  -965    213   -965   -965
    45   -965    108   -106
  -965    -87    108     52
  -965    213   -965   -965
   187   -965   -965   -965
  -965    113   -965     94
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 2.4e-001
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.000000  0.875000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.375000  0.000000  0.500000  0.125000
 0.000000  0.125000  0.500000  0.375000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CGCCATC[GA][GT]CA[CT]
--------------------------------------------------------------------------------

Time 15.16 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11077                            5.45e-02  196_[+1(9.94e-06)]_288
11084                            1.50e-11  125_[+3(9.07e-05)]_24_\
    [+2(9.40e-11)]_145_[+1(2.07e-06)]_125_[+3(1.31e-06)]_20
11221                            1.62e-13  205_[+3(2.60e-07)]_15_\
    [+1(1.57e-07)]_209_[+2(5.09e-11)]_22
11488                            1.42e-10  243_[+3(9.12e-07)]_11_\
    [+1(4.72e-07)]_126_[+2(6.53e-09)]_71
12114                            1.11e-05  86_[+1(1.20e-05)]_30_[+2(4.34e-05)]_\
    209_[+2(1.65e-07)]_117
1355                             9.31e-02  500
1537                             1.92e-02  352_[+1(4.99e-06)]_132
2042                             2.81e-06  84_[+3(2.11e-06)]_353_\
    [+2(3.77e-08)]_30
21831                            3.39e-03  414_[+1(5.89e-07)]_70
264598                           1.62e-13  204_[+3(2.60e-07)]_15_\
    [+1(1.57e-07)]_209_[+2(5.09e-11)]_23
3103                             9.33e-01  500
3415                             7.91e-01  500
3765                             2.06e-01  500
5128                             4.89e-06  264_[+1(5.07e-10)]_220
5381                             4.99e-03  397_[+1(5.89e-07)]_87
5805                             2.19e-01  500
6793                             4.85e-02  187_[+1(9.94e-06)]_297
681                              4.77e-03  98_[+1(4.99e-06)]_386
7632                             7.85e-07  162_[+2(1.12e-07)]_46_\
    [+3(5.69e-07)]_123_[+3(2.21e-05)]_124
8517                             6.15e-04  125_[+3(1.64e-07)]_151_\
    [+3(3.87e-05)]_200
879                              1.10e-02  307_[+1(2.90e-06)]_177
8988                             1.53e-07  51_[+1(1.57e-07)]_64_[+3(2.60e-07)]_\
    131_[+1(8.73e-06)]_1_[+1(2.12e-05)]_193
9161                             2.76e-04  297_[+2(9.37e-08)]_182
9244                             3.54e-07  66_[+1(4.99e-06)]_327_\
    [+2(1.01e-08)]_70
9298                             3.47e-02  3_[+1(8.73e-06)]_481
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************