********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/67/67.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46656                    1.0000    500  46785                    1.0000    500
22117                    1.0000    500  18103                    1.0000    500
49810                    1.0000    500  44520                    1.0000    500
19191                    1.0000    500  33725                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/67/67.seqs.fa -oc motifs/67 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.234 G 0.226 T 0.276
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.234 G 0.226 T 0.275
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   8  llr = 88  E-value = 1.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :9a4:68a8:1a
pos.-specific     C  9:::4:::3:5:
probability       G  :1::6:3::93:
matrix            T  1::6:4:::11:

         bits    2.1
                 1.9   *    *   *
                 1.7   *    *   *
                 1.5 ***    * * *
Relative         1.3 ***    * * *
Entropy          1.1 *** * **** *
(15.9 bits)      0.9 ********** *
                 0.6 ********** *
                 0.4 ********** *
                 0.2 ************
                 0.0 ------------

Multilevel           CAATGAAAAGCA
consensus               ACTG C G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
46785                        30  1.39e-07 AAGAATCTGC CAAAGAAAAGCA ACGACTTGAA
49810                       335  2.85e-07 ATGCCAATCA CAATGTAAAGCA CAAGAACTCA
44520                        57  8.99e-07 CTCTTTTCGC CAATCAAAAGGA TGTACGATGC
33725                        42  2.27e-06 GCTATAGGAA CAAACAAACGCA GCTCTGATAA
18103                       238  3.96e-06 CAGGACCCTG CAATGAAACGAA AATTGCGTAC
22117                        99  1.28e-05 GAGATCACCG CGAAGTGAAGCA GTTTGATGAA
19191                        39  1.85e-05 AGAAAATGGC TAATGTAAAGTA AACGATCCTG
46656                       219  1.95e-05 AGACATTGGA CAATCAGAATGA TCCAAAGATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46785                             1.4e-07  29_[+1]_459
49810                             2.8e-07  334_[+1]_154
44520                               9e-07  56_[+1]_432
33725                             2.3e-06  41_[+1]_447
18103                               4e-06  237_[+1]_251
22117                             1.3e-05  98_[+1]_390
19191                             1.9e-05  38_[+1]_450
46656                               2e-05  218_[+1]_270
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=8
46785                    (   30) CAAAGAAAAGCA  1
49810                    (  335) CAATGTAAAGCA  1
44520                    (   57) CAATCAAAAGGA  1
33725                    (   42) CAAACAAACGCA  1
18103                    (  238) CAATGAAACGAA  1
22117                    (   99) CGAAGTGAAGCA  1
19191                    (   39) TAATGTAAAGTA  1
46656                    (  219) CAATCAGAATGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3912 bayes= 8.93074 E= 1.0e+001
  -965    190   -965   -114
   172   -965    -85   -965
   191   -965   -965   -965
    50   -965   -965    118
  -965     68    147   -965
   124   -965   -965     44
   150   -965     15   -965
   191   -965   -965   -965
   150     10   -965   -965
  -965   -965    195   -114
  -108    110     15   -114
   191   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 1.0e+001
 0.000000  0.875000  0.000000  0.125000
 0.875000  0.000000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.375000  0.625000  0.000000
 0.625000  0.000000  0.000000  0.375000
 0.750000  0.000000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.000000  0.875000  0.125000
 0.125000  0.500000  0.250000  0.125000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CAA[TA][GC][AT][AG]A[AC]G[CG]A
--------------------------------------------------------------------------------

Time  0.56 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  14  sites =   8  llr = 94  E-value = 1.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::91:38::9a365
pos.-specific     C  6:1338::9::43:
probability       G  48::8::a11:1:5
matrix            T  :3:6::3::::31:

         bits    2.1        *
                 1.9        *  *
                 1.7        *  *
                 1.5        ****
Relative         1.3  ** ** ****
Entropy          1.1 *** *******  *
(16.9 bits)      0.9 *** *******  *
                 0.6 *********** **
                 0.4 *********** **
                 0.2 *********** **
                 0.0 --------------

Multilevel           CGATGCAGCAACAA
consensus            GT CCAT    ACG
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
44520                        73  2.34e-07 AAAGGATGTA CGATGCTGCAAAAA GAAATTCAAA
46656                       317  3.11e-07 TTGTACAATT CGACCCAGCAACAA CAGACTGACA
18103                        29  1.33e-06 AGGAAAATTG CGATGAAGCAAGCG GTCGTACAGG
33725                         9  1.98e-06   GGGCTCTC CGCTGCAGCAACTG GAACTTCACG
46785                       151  1.98e-06 CTACTTAACA GTAAGCAGCAACAA AAAAAGAAGA
22117                        67  2.67e-06 TGGGGGATAC GGATGCAGCGATCA GCTTTACAGA
49810                        83  4.62e-06 CTTCCGATCC GTACGAAGCAAAAG GAACGAAGCG
19191                        53  6.00e-06 GTAAAGTAAA CGATCCTGGAATAG TTGTTTCCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44520                             2.3e-07  72_[+2]_414
46656                             3.1e-07  316_[+2]_170
18103                             1.3e-06  28_[+2]_458
33725                               2e-06  8_[+2]_478
46785                               2e-06  150_[+2]_336
22117                             2.7e-06  66_[+2]_420
49810                             4.6e-06  82_[+2]_404
19191                               6e-06  52_[+2]_434
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=8
44520                    (   73) CGATGCTGCAAAAA  1
46656                    (  317) CGACCCAGCAACAA  1
18103                    (   29) CGATGAAGCAAGCG  1
33725                    (    9) CGCTGCAGCAACTG  1
46785                    (  151) GTAAGCAGCAACAA  1
22117                    (   67) GGATGCAGCGATCA  1
49810                    (   83) GTACGAAGCAAAAG  1
19191                    (   53) CGATCCTGGAATAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 3896 bayes= 8.92481 E= 1.7e+001
  -965    142     73   -965
  -965   -965    173    -14
   172    -90   -965   -965
  -108     10   -965    118
  -965     10    173   -965
    -8    168   -965   -965
   150   -965   -965    -14
  -965   -965    215   -965
  -965    190    -85   -965
   172   -965    -85   -965
   191   -965   -965   -965
    -8     68    -85    -14
   124     10   -965   -114
    92   -965    115   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 8 E= 1.7e+001
 0.000000  0.625000  0.375000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.875000  0.125000  0.000000  0.000000
 0.125000  0.250000  0.000000  0.625000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.875000  0.125000  0.000000
 0.875000  0.000000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.375000  0.125000  0.250000
 0.625000  0.250000  0.000000  0.125000
 0.500000  0.000000  0.500000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CG][GT]A[TC][GC][CA][AT]GCAA[CAT][AC][AG]
--------------------------------------------------------------------------------

Time  1.07 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   8  llr = 83  E-value = 1.2e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :55:4:3:3:9:
pos.-specific     C  :::4458a:41a
probability       G  a:1614::86::
matrix            T  :54:11::::::

         bits    2.1 *      *   *
                 1.9 *      *   *
                 1.7 *      *   *
                 1.5 *      *   *
Relative         1.3 *     *** **
Entropy          1.1 *  *  ******
(14.9 bits)      0.9 ** *  ******
                 0.6 ** * *******
                 0.4 **** *******
                 0.2 ************
                 0.0 ------------

Multilevel           GAAGACCCGGAC
consensus             TTCCGA AC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
18103                       174  1.08e-07 CCACTTCTGT GAAGACCCGGAC GTAGTCCAAG
46656                       241  6.22e-07 TCCAAAGATC GTTGACCCGGAC CCTATTCGAA
49810                       123  4.59e-06 GTTCCTGGAA GAACCCCCAGAC GCGTCGAAGG
19191                       402  5.94e-06 TAACAACATT GATGCTCCGGAC GGCGCTAGCG
22117                        37  1.10e-05 GACATTGTTC GTACGGCCGCAC ATTGTCGCTG
33725                       423  1.29e-05 CCTGTGAGTC GTTGAGACGCAC TGACAGTGAG
46785                        93  1.97e-05 TGCTCGCATC GTGCCCCCAGAC GAGAAGGCCA
44520                        18  6.31e-05 ATCGTAACTT GAAGTGACGCCC CCCCCCCCCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
18103                             1.1e-07  173_[+3]_315
46656                             6.2e-07  240_[+3]_248
49810                             4.6e-06  122_[+3]_366
19191                             5.9e-06  401_[+3]_87
22117                             1.1e-05  36_[+3]_452
33725                             1.3e-05  422_[+3]_66
46785                               2e-05  92_[+3]_396
44520                             6.3e-05  17_[+3]_471
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
18103                    (  174) GAAGACCCGGAC  1
46656                    (  241) GTTGACCCGGAC  1
49810                    (  123) GAACCCCCAGAC  1
19191                    (  402) GATGCTCCGGAC  1
22117                    (   37) GTACGGCCGCAC  1
33725                    (  423) GTTGAGACGCAC  1
46785                    (   93) GTGCCCCCAGAC  1
44520                    (   18) GAAGTGACGCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3912 bayes= 9.66888 E= 1.2e+003
  -965   -965    215   -965
    92   -965   -965     86
    92   -965    -85     44
  -965     68    147   -965
    50     68    -85   -114
  -965    110     73   -114
    -8    168   -965   -965
  -965    210   -965   -965
    -8   -965    173   -965
  -965     68    147   -965
   172    -90   -965   -965
  -965    210   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 1.2e+003
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.500000  0.000000  0.125000  0.375000
 0.000000  0.375000  0.625000  0.000000
 0.375000  0.375000  0.125000  0.125000
 0.000000  0.500000  0.375000  0.125000
 0.250000  0.750000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.375000  0.625000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[AT][AT][GC][AC][CG][CA]C[GA][GC]AC
--------------------------------------------------------------------------------

Time  1.59 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46656                            1.12e-07  218_[+1(1.95e-05)]_10_\
    [+3(6.22e-07)]_64_[+2(3.11e-07)]_170
46785                            1.55e-07  29_[+1(1.39e-07)]_51_[+3(1.97e-05)]_\
    46_[+2(1.98e-06)]_336
22117                            7.04e-06  36_[+3(1.10e-05)]_18_[+2(2.67e-06)]_\
    18_[+1(1.28e-05)]_390
18103                            1.98e-08  28_[+2(1.33e-06)]_131_\
    [+3(1.08e-07)]_52_[+1(3.96e-06)]_251
49810                            1.72e-07  82_[+2(4.62e-06)]_26_[+3(4.59e-06)]_\
    200_[+1(2.85e-07)]_3_[+3(2.94e-05)]_139
44520                            3.46e-07  17_[+3(6.31e-05)]_27_[+1(8.99e-07)]_\
    4_[+2(2.34e-07)]_414
19191                            1.16e-05  38_[+1(1.85e-05)]_2_[+2(6.00e-06)]_\
    335_[+3(5.94e-06)]_87
33725                            1.33e-06  8_[+2(1.98e-06)]_19_[+1(2.27e-06)]_\
    369_[+3(1.29e-05)]_66
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************