********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/389/389.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
12107                    1.0000    500  2016                     1.0000    500
21157                    1.0000    500  21652                    1.0000    500
21726                    1.0000    500  21824                    1.0000    500
22939                    1.0000    500  23066                    1.0000    500
23196                    1.0000    500  23750                    1.0000    500
23815                    1.0000    500  23896                    1.0000    500
24764                    1.0000    500  25675                    1.0000    500
25802                    1.0000    500  263228                   1.0000    500
264206                   1.0000    500  264808                   1.0000    500
270038                   1.0000    500  2804                     1.0000    500
3092                     1.0000    500  38153                    1.0000    500
bd539                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/389/389.seqs.fa -oc motifs/389 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 23 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 11500 N= 23 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.267 C 0.246 G 0.228 T 0.259 Background letter frequencies (from dataset with add-one prior applied): A 0.267 C 0.246 G 0.228 T 0.259 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 15 llr = 187 E-value = 8.0e-008 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :71:33173:21:74: pos.-specific C ::4::4::3::7::14 probability G a35a5193:a71a326 matrix T :1::32::4:1:::3: bits 2.1 * * * * 1.9 * * * * 1.7 * * * * * 1.5 * * * * * Relative 1.3 * * * * * Entropy 1.1 * * ** * ** * (18.0 bits) 0.9 ** * ** ***** * 0.6 **** ** ***** * 0.4 ***** ******** * 0.2 ***** ********** 0.0 ---------------- Multilevel GAGGGCGATGGCGAAG consensus GC AA GC A GTC sequence TT A G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 263228 305 2.41e-08 ATGCAATGGA GACGGAGACGGCGAGG ATGGAGGATA 
23815 312 2.98e-08 GTGGACAGTG GACGACGACGGCGAAC TAGATACCAT 21157 239 5.20e-08 GTTAGGAAGG GAGGGAGGCGGCGAGG AGCGCAGGAA 22939 64 8.61e-08 GGGAGGCGGT GGGGTCGGTGGCGAAG GAGATGGTGA 25675 71 1.00e-07 CGATGACGGA GACGGCGACGGCGACG GCGACGGCGA 25802 92 2.02e-07 TGATTGCTCC GGCGATGATGGCGAAG AGGCCGTATT 23896 68 2.30e-07 CCATCCTGAT GACGACGATGTCGAAG TGGTCTTTTC 12107 112 4.19e-07 GTATTATCAA GACGGAGAAGGAGAAG GGAAGAGCAT 38153 170 1.05e-06 ATTCAAAATG GAAGGCGACGACGAGC AACCCTTTCT 23066 268 1.05e-06 GTGAGCGTTG GAGGTCGATGGGGGTC AAATATATGT 2016 28 2.90e-06 GGTTCAACTA GAGGAGGAAGGGGGTG TCTAGTTGTG 21652 131 3.98e-06 CTTTATTGTT GGGGGTGGAGACGGTC TCCGTTGGGC 3092 157 5.26e-06 TGATCGCTGT GAGGTTAGAGGCGATC CAAAGTGTAA bd539 354 5.55e-06 AACTCGACGA GGAGGAGATGAAGAAG TGTATGAAGA 2804 403 1.25e-05 GGAGGTTGTC GTGGTGGGTGTCGGTC GGGCACAACT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 263228 2.4e-08 304_[+1]_180 23815 3e-08 311_[+1]_173 21157 5.2e-08 238_[+1]_246 22939 8.6e-08 63_[+1]_421 25675 1e-07 70_[+1]_414 25802 2e-07 91_[+1]_393 23896 2.3e-07 67_[+1]_417 12107 4.2e-07 111_[+1]_373 38153 1.1e-06 169_[+1]_315 23066 1.1e-06 267_[+1]_217 2016 2.9e-06 27_[+1]_457 21652 4e-06 130_[+1]_354 3092 5.3e-06 156_[+1]_328 bd539 5.5e-06 353_[+1]_131 2804 1.3e-05 402_[+1]_82 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=15 263228 ( 305) GACGGAGACGGCGAGG 1 23815 ( 312) GACGACGACGGCGAAC 1 21157 ( 239) GAGGGAGGCGGCGAGG 1 22939 ( 64) GGGGTCGGTGGCGAAG 1 25675 ( 71) 
GACGGCGACGGCGACG 1 25802 ( 92) GGCGATGATGGCGAAG 1 23896 ( 68) GACGACGATGTCGAAG 1 12107 ( 112) GACGGAGAAGGAGAAG 1 38153 ( 170) GAAGGCGACGACGAGC 1 23066 ( 268) GAGGTCGATGGGGGTC 1 2016 ( 28) GAGGAGGAAGGGGGTG 1 21652 ( 131) GGGGGTGGAGACGGTC 1 3092 ( 157) GAGGTTAGAGGCGATC 1 bd539 ( 354) GGAGGAGATGAAGAAG 1 2804 ( 403) GTGGTGGGTGTCGGTC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 11155 bayes= 10.8123 E= 8.0e-008 -1055 -1055 213 -1055 132 -1055 22 -195 -100 70 103 -1055 -1055 -1055 213 -1055 0 -1055 103 4 0 70 -78 -37 -200 -1055 203 -1055 132 -1055 55 -1055 0 44 -1055 63 -1055 -1055 213 -1055 -42 -1055 155 -96 -100 158 -78 -1055 -1055 -1055 213 -1055 146 -1055 22 -1055 58 -188 -19 36 -1055 70 139 -1055 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 15 E= 8.0e-008 0.000000 0.000000 1.000000 0.000000 0.666667 0.000000 0.266667 0.066667 0.133333 0.400000 0.466667 0.000000 0.000000 0.000000 1.000000 0.000000 0.266667 0.000000 0.466667 0.266667 0.266667 0.400000 0.133333 0.200000 0.066667 0.000000 0.933333 0.000000 0.666667 0.000000 0.333333 0.000000 0.266667 0.333333 0.000000 0.400000 0.000000 0.000000 1.000000 0.000000 0.200000 0.000000 0.666667 0.133333 0.133333 0.733333 0.133333 0.000000 0.000000 0.000000 1.000000 0.000000 0.733333 0.000000 0.266667 0.000000 0.400000 0.066667 0.200000 0.333333 0.000000 0.400000 0.600000 0.000000 -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- G[AG][GC]G[GAT][CAT]G[AG][TCA]G[GA]CG[AG][ATG][GC] -------------------------------------------------------------------------------- Time 4.67 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 17 llr = 172 E-value = 7.7e-004 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 6441a:94:99: pos.-specific C 2128:a:2a:1a probability G 2111::12:1:: matrix T :431:::2:::: bits 2.1 1.9 ** * * 1.7 ** * * 1.5 *** **** Relative 1.3 *** **** Entropy 1.1 *** **** (14.6 bits) 0.9 **** **** 0.6 * **** **** 0.4 * **** **** 0.2 ******* **** 0.0 ------------ Multilevel AAACACAACAAC consensus TT C sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 270038 417 1.86e-07 TAGCTTCACA AAACACAACAAC ATGGGTTTTT 264206 383 1.86e-07 TCGTCCTCCG AAACACAACAAC ATCGACGAAC 21652 472 5.39e-07 CAATCCAAGC AATCACAACAAC AGCAACCCAT 23066 5 7.20e-07 TATG AAACACATCAAC GCAATGGACT 38153 292 1.94e-06 TGAAGCGAAA ATGCACAACAAC TCACCAAAAA 3092 379 5.15e-06 ACCGCCCACC CAACACAGCAAC CCTCTCGGAT 21726 485 5.15e-06 CTCTCACTCA CTTCACACCAAC AACC 2804 344 8.00e-06 CTATCCAGTC GTCCACACCAAC GTTTCGGTGC 21157 458 8.00e-06 CATGAGTCAA ACACACATCAAC GTCAGGCACT 23815 
459 1.58e-05 CGCTTCCCAC ATACACGTCAAC CAATCACAAT 21824 128 1.58e-05 AAAGATGGTG ATTTACAACAAC AAAGCGCTCT 24764 488 2.68e-05 ACAGCCCCAG AACGACACCAAC A 23196 473 2.68e-05 CCTCGACCAA AGTCACACCACC GCACACCCTC 25802 443 2.89e-05 GAGTTTCAAC GTAAACAGCAAC TTTCTTTCTG 25675 486 3.31e-05 CACACTGAGC AGCAACAACAAC CGG 22939 431 5.52e-05 TTCTCACAGT GAGCACATCACC AGCCCTCACA 263228 136 5.74e-05 TCCAACGGAT CTTCACAGCGAC ACTCGTTGTC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 270038 1.9e-07 416_[+2]_72 264206 1.9e-07 382_[+2]_106 21652 5.4e-07 471_[+2]_17 23066 7.2e-07 4_[+2]_484 38153 1.9e-06 291_[+2]_197 3092 5.1e-06 378_[+2]_110 21726 5.1e-06 484_[+2]_4 2804 8e-06 343_[+2]_145 21157 8e-06 457_[+2]_31 23815 1.6e-05 458_[+2]_30 21824 1.6e-05 127_[+2]_361 24764 2.7e-05 487_[+2]_1 23196 2.7e-05 472_[+2]_16 25802 2.9e-05 442_[+2]_46 25675 3.3e-05 485_[+2]_3 22939 5.5e-05 430_[+2]_58 263228 5.7e-05 135_[+2]_353 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=17 270038 ( 417) AAACACAACAAC 1 264206 ( 383) AAACACAACAAC 1 21652 ( 472) AATCACAACAAC 1 23066 ( 5) AAACACATCAAC 1 38153 ( 292) ATGCACAACAAC 1 3092 ( 379) CAACACAGCAAC 1 21726 ( 485) CTTCACACCAAC 1 2804 ( 344) GTCCACACCAAC 1 21157 ( 458) ACACACATCAAC 1 23815 ( 459) ATACACGTCAAC 1 21824 ( 128) ATTTACAACAAC 1 24764 ( 488) AACGACACCAAC 1 23196 ( 473) AGTCACACCACC 1 25802 ( 443) GTAAACAGCAAC 1 25675 ( 486) AGCAACAACAAC 1 22939 ( 431) GAGCACATCACC 1 263228 ( 136) CTTCACAGCGAC 1 // 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 11247 bayes= 10.1632 E= 7.7e-004 128 -48 -37 -1073 63 -206 -96 67 63 -48 -96 18 -118 164 -195 -213 191 -1073 -1073 -1073 -1073 202 -1073 -1073 182 -1073 -195 -1073 40 -6 -37 -14 -1073 202 -1073 -1073 182 -1073 -195 -1073 172 -106 -1073 -1073 -1073 202 -1073 -1073 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 17 E= 7.7e-004 0.647059 0.176471 0.176471 0.000000 0.411765 0.058824 0.117647 0.411765 0.411765 0.176471 0.117647 0.294118 0.117647 0.764706 0.058824 0.058824 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.941176 0.000000 0.058824 0.000000 0.352941 0.235294 0.176471 0.235294 0.000000 1.000000 0.000000 0.000000 0.941176 0.000000 0.058824 0.000000 0.882353 0.117647 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- A[AT][AT]CACA[ACT]CAAC -------------------------------------------------------------------------------- Time 9.33 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 12 sites = 7 llr = 90 E-value = 3.4e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::41:6::a11: pos.-specific C :1:4:::::::a probability G :91:a::a::9: matrix T a:44:4a::9:: bits 2.1 * * 1.9 * * *** * 1.7 * * *** * 1.5 ** * *** ** Relative 1.3 ** * ****** Entropy 1.1 ** * ****** (18.4 bits) 0.9 ** ******** 0.6 ** ******** 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel TGACGATGATGC consensus TT T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 264808 377 5.25e-08 TCGAGAATAT TGTCGATGATGC TAATGGATCG 21157 4 5.25e-08 CAT TGTCGATGATGC GTGCTGACAG 2804 229 2.19e-07 TTCGGAGGGC TGATGATGATGC ACGATGTCTC 270038 327 1.67e-06 TTGTGAAGGC TGATGATGATAC TGTCCTGCCA 38153 380 2.00e-06 CAAAGCTCGC TGACGTTGAAGC TCCTCCAAAC 24764 158 2.00e-06 CGTGCAGTTC TCTTGTTGATGC GTGGTTGAGG 21824 328 2.36e-06 ACAGCAATGC TGGAGTTGATGC GCAACAACTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 264808 5.2e-08 376_[+3]_112 21157 5.2e-08 3_[+3]_485 2804 2.2e-07 
228_[+3]_260 270038 1.7e-06 326_[+3]_162 38153 2e-06 379_[+3]_109 24764 2e-06 157_[+3]_331 21824 2.4e-06 327_[+3]_161 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=12 seqs=7 264808 ( 377) TGTCGATGATGC 1 21157 ( 4) TGTCGATGATGC 1 2804 ( 229) TGATGATGATGC 1 270038 ( 327) TGATGATGATAC 1 38153 ( 380) TGACGTTGAAGC 1 24764 ( 158) TCTTGTTGATGC 1 21824 ( 328) TGGAGTTGATGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 11247 bayes= 10.4928 E= 3.4e+002 -945 -945 -945 195 -945 -78 191 -945 68 -945 -67 73 -90 80 -945 73 -945 -945 213 -945 110 -945 -945 73 -945 -945 -945 195 -945 -945 213 -945 190 -945 -945 -945 -90 -945 -945 173 -90 -945 191 -945 -945 202 -945 -945 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 3.4e+002 0.000000 0.000000 0.000000 1.000000 0.000000 0.142857 0.857143 0.000000 0.428571 0.000000 0.142857 0.428571 0.142857 0.428571 0.000000 0.428571 0.000000 0.000000 1.000000 0.000000 0.571429 0.000000 0.000000 0.428571 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.142857 0.000000 0.000000 0.857143 0.142857 0.000000 0.857143 0.000000 0.000000 1.000000 0.000000 0.000000 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- TG[AT][CT]G[AT]TGATGC -------------------------------------------------------------------------------- Time 13.94 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 12107 1.37e-03 111_[+1(4.19e-07)]_373 2016 3.42e-03 27_[+1(2.90e-06)]_457 21157 9.70e-10 3_[+3(5.25e-08)]_223_[+1(5.20e-08)]_\ 203_[+2(8.00e-06)]_31 21652 4.68e-05 130_[+1(3.98e-06)]_325_\ [+2(5.39e-07)]_17 21726 1.13e-02 484_[+2(5.15e-06)]_4 21824 4.26e-04 127_[+2(1.58e-05)]_188_\ [+3(2.36e-06)]_161 22939 6.09e-05 63_[+1(8.61e-08)]_351_\ [+2(5.52e-05)]_58 23066 1.09e-05 4_[+2(7.20e-07)]_251_[+1(1.05e-06)]_\ 103_[+2(8.52e-05)]_102 23196 9.09e-02 472_[+2(2.68e-05)]_16 23750 3.00e-01 500 23815 6.06e-06 311_[+1(2.98e-08)]_131_\ [+2(1.58e-05)]_30 23896 4.99e-04 67_[+1(2.30e-07)]_99_[+1(5.05e-05)]_\ 302 24764 2.55e-04 157_[+3(2.00e-06)]_318_\ [+2(2.68e-05)]_1 25675 8.76e-05 70_[+1(1.00e-07)]_2_[+1(4.65e-05)]_\ 381_[+2(3.31e-05)]_3 25802 2.96e-05 91_[+1(2.02e-07)]_335_\ [+2(2.89e-05)]_46 263228 3.15e-05 135_[+2(5.74e-05)]_157_\ [+1(2.41e-08)]_180 264206 1.49e-04 382_[+2(1.86e-07)]_106 264808 6.49e-04 376_[+3(5.25e-08)]_112 270038 2.82e-06 326_[+3(1.67e-06)]_78_\ [+2(1.86e-07)]_72 2804 5.48e-07 228_[+3(2.19e-07)]_103_\ 
[+2(8.00e-06)]_47_[+1(1.25e-05)]_82 3092 5.39e-04 156_[+1(5.26e-06)]_206_\ [+2(5.15e-06)]_110 38153 1.20e-07 169_[+1(1.05e-06)]_106_\ [+2(1.94e-06)]_47_[+2(1.73e-05)]_17_[+3(2.00e-06)]_109 bd539 1.90e-02 353_[+1(5.55e-06)]_131 -------------------------------------------------------------------------------- ******************************************************************************** ******************************************************************************** Stopped because nmotifs = 3 reached. ******************************************************************************** CPU: seaotter.hsd1.wa.comcast.net ********************************************************************************
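
Note on reading the matrices above: the nonzero entries of each log-odds
matrix appear to equal 100 * log2(p / background), rounded to the nearest
integer, where p comes from the matching letter-probability matrix. The
sketch below checks this against the first three rows of Motif 3. The
scaling is an assumption inferred from the printed numbers, not taken from
the MEME documentation, and the helper name log_odds_row is hypothetical.
Zero-probability entries (printed as -945 for Motif 3) depend on
pseudocounts that are not reproduced here, so they are returned as None.

```python
import math

# Background letter frequencies as printed in this report (A, C, G, T).
BACKGROUND = {"A": 0.267, "C": 0.246, "G": 0.228, "T": 0.259}

# First three rows of the Motif 3 letter-probability matrix, copied from above.
MOTIF3_PROBS = [
    [0.000000, 0.000000, 0.000000, 1.000000],
    [0.000000, 0.142857, 0.857143, 0.000000],
    [0.428571, 0.000000, 0.142857, 0.428571],
]


def log_odds_row(probs, background=BACKGROUND):
    """Return round(100 * log2(p / bg)) per letter, or None where p is 0.

    Assumed scaling (inferred from the report); zero entries are skipped
    because their printed value depends on pseudocounts not modelled here.
    """
    scores = []
    for letter, p in zip("ACGT", probs):
        if p == 0.0:
            scores.append(None)
        else:
            scores.append(round(100 * math.log2(p / background[letter])))
    return scores


if __name__ == "__main__":
    for row in MOTIF3_PROBS:
        print(log_odds_row(row))
```

The values returned for the nonzero entries can be compared by eye with the
first three rows of the Motif 3 position-specific scoring matrix.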