********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/254/254.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
21026                    1.0000    500  25456                    1.0000    500
261179                   1.0000    500  262371                   1.0000    500
263127                   1.0000    500  26806                    1.0000    500
269064                   1.0000    500  269802                   1.0000    500
269821                   1.0000    500  34505                    1.0000    500
40925                    1.0000    500  40929                    1.0000    500
4295                     1.0000    500  8334                     1.0000    500
9912                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/254/254.seqs.fa -oc motifs/254 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.279 C 0.235 G 0.236 T 0.250
Background letter frequencies (from dataset with add-one prior applied):
A 0.279 C 0.235 G 0.236 T 0.250
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 14 llr = 161 E-value = 4.5e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :64:96:4:264:633
pos.-specific     C  81:a12916814a176
probability       G  226::1:4::32:1::
matrix            T  ::::1:124::::1:1

Relative Entropy (16.6 bits)
[bits histogram: column alignment not recoverable]

Multilevel           CAGCAACACCACCACC
consensus            GGA  C GTAGA  AA
sequence                    T   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                  Site
-------------            ----- ---------            ----------------
4295                       227  1.92e-09 GTCATTATAC CAACAACGCCACCACC ACTGCAGCAG
21026                      279  8.27e-08 TGTGGAAGCC CAGCAGCACCAGCACC TCTGTCTACC
269821                      58  3.32e-07 GGAGCAGATG CAGCAACGTCAGCAAA CTCTTCAAAA
263127                      98  3.32e-07 TGTCGTACAA CAACAACTCCAGCTCC ATCTGCACTA
25456                      325  3.32e-07 TGCTACTCCT CCGCAACTCCACCCCC CTCAATTGCG
269064                     343  5.47e-07 TGCAAATTGA CGACAACATCAACAAC GATCAACAAG
261179                       6  1.18e-06      ATGCT CAACAACCTCAACGCC TCCGAATCAC
40925                       99  1.93e-06 CTGCCGGTAG GAACAACACAACCACA GAGGAGACTT
40929                      449  3.00e-06 CGTAGACAGA CAGCCACACAGACACC TTCCCCCTCA
26806                      404  6.16e-06 CTCACCACCG CAGCACCGTCCACAAA TCCATCGGAC
8334                       429  8.17e-06 ATCTAGTGCG CCGCACCATCGACCAC CAGCTACCTC
262371                     460  1.13e-05 GGGTGTACGT GAACTGCGTCGCCACC TTTTATCTCG
269802                     138  1.80e-05 GCTCACTCTC CGGCAATGCCACCTCT GACTCTGGTC
34505                      415  2.97e-05 AGAATACGCA GGGCACCTCAGCCGCA CGTCACTGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4295                              1.9e-09  226_[+1]_258
21026                             8.3e-08  278_[+1]_206
269821                            3.3e-07  57_[+1]_427
263127                            3.3e-07  97_[+1]_387
25456                             3.3e-07  324_[+1]_160
269064                            5.5e-07  342_[+1]_142
261179                            1.2e-06  5_[+1]_479
40925                             1.9e-06  98_[+1]_386
40929                               3e-06  448_[+1]_36
26806                             6.2e-06  403_[+1]_81
8334                              8.2e-06  428_[+1]_56
262371                            1.1e-05  459_[+1]_25
269802                            1.8e-05  137_[+1]_347
34505                               3e-05  414_[+1]_70
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=14
4295                     (  227) CAACAACGCCACCACC  1
21026                    (  279) CAGCAGCACCAGCACC  1
269821                   (   58) CAGCAACGTCAGCAAA  1
263127                   (   98) CAACAACTCCAGCTCC  1
25456                    (  325) CCGCAACTCCACCCCC  1
269064                   (  343) CGACAACATCAACAAC  1
261179                   (    6) CAACAACCTCAACGCC  1
40925                    (   99) GAACAACACAACCACA  1
40929                    (  449) CAGCCACACAGACACC  1
26806                    (  404) CAGCACCGTCCACAAA  1
8334                     (  429) CCGCACCATCGACCAC  1
262371                   (  460) GAACTGCGTCGCCACC  1
269802                   (  138) CGGCAATGCCACCTCT  1
34505                    (  415) GGGCACCTCAGCCGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 9.62527 E= 4.5e-004
 -1045    174    -14  -1045
   120    -72    -14  -1045
    62  -1045    128  -1045
 -1045    209  -1045  -1045
   162   -172  -1045   -180
   120    -13    -72  -1045
 -1045    198  -1045   -180
    36   -172     60    -22
 -1045    128  -1045     78
   -38    174  -1045  -1045
   120   -172     28  -1045
    36     87    -14  -1045
 -1045    209  -1045  -1045
   103    -72    -72    -81
     3    160  -1045  -1045
     3    145  -1045   -180
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 14 E= 4.5e-004
 0.000000  0.785714  0.214286  0.000000
 0.642857  0.142857  0.214286  0.000000
 0.428571  0.000000  0.571429  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.857143  0.071429  0.000000  0.071429
 0.642857  0.214286  0.142857  0.000000
 0.000000  0.928571  0.000000  0.071429
 0.357143  0.071429  0.357143  0.214286
 0.000000  0.571429  0.000000  0.428571
 0.214286  0.785714  0.000000  0.000000
 0.642857  0.071429  0.285714  0.000000
 0.357143  0.428571  0.214286  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.571429  0.142857  0.142857  0.142857
 0.285714  0.714286  0.000000  0.000000
 0.285714  0.642857  0.000000  0.071429
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG][AG][GA]CA[AC]C[AGT][CT][CA][AG][CAG]CA[CA][CA]
--------------------------------------------------------------------------------

Time 1.92 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 20 sites = 9 llr = 140 E-value = 4.5e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :81:a4:74:31:3:281:2
pos.-specific     C  1:2:::::2::2:12:11::
probability       G  921a::a21941a678:3a:
matrix            T  ::6::6:12126::1:14:8

Relative Entropy (22.5 bits)
[bits histogram: column alignment not recoverable]

Multilevel           GATGATGAAGGTGGGGATGT
consensus             GC  A GC AC ACA G A
sequence                     T T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                    Site
-------------            ----- ---------            --------------------
8334                        61  1.83e-09 GCGTAGTGTA GATGATGTTGTTGGGGAGGT GCATGAGGTT
269064                      20  3.41e-09 TGATGCAATG GACGAAGAAGGAGAGGATGT GTTCTCACCG
261179                     169  6.08e-09 ATTGGTAGAA GAAGAAGAAGATGGGGAAGT GGTCGATCTA
25456                        8  1.72e-08    ATTGGAT GATGATGACGACGACGACGT GTCACCTGAA
21026                      333  2.07e-08 GTTCTTGAGT GACGATGACGGTGGTGTTGT TTCATCTGTG
9912                       100  3.55e-08 CTCTTTTGCT GATGATGGATGTGGCGATGA CACATCATAT
26806                      183  3.87e-08 CGTAGGTTGC GGTGATGGTGGCGAGGAGGA TTGTTGATGA
269821                     450  2.75e-07 GGGAGGAGCG CAGGAAGAAGTTGCGAATGT TGGAGGATAC
40929                        5  3.08e-07       AGAG GGTGAAGAGGAGGGGACGGT CGTTCATTGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8334                              1.8e-09  60_[+2]_420
269064                            3.4e-09  19_[+2]_461
261179                            6.1e-09  168_[+2]_312
25456                             1.7e-08  7_[+2]_473
21026                             2.1e-08  332_[+2]_148
9912                              3.6e-08  99_[+2]_381
26806                             3.9e-08  182_[+2]_298
269821                            2.8e-07  449_[+2]_31
40929                             3.1e-07  4_[+2]_476
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=9
8334                     (   61) GATGATGTTGTTGGGGAGGT  1
269064                   (   20) GACGAAGAAGGAGAGGATGT  1
261179                   (  169) GAAGAAGAAGATGGGGAAGT  1
25456                    (    8) GATGATGACGACGACGACGT  1
21026                    (  333) GACGATGACGGTGGTGTTGT  1
9912                     (  100) GATGATGGATGTGGCGATGA  1
26806                    (  183) GGTGATGGTGGCGAGGAGGA  1
269821                   (  450) CAGGAAGAAGTTGCGAATGT  1
40929                    (    5) GGTGAAGAGGAGGGGACGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7215 bayes= 10.4939 E= 4.5e-003
  -982   -108    191   -982
   148   -982     -9   -982
  -133     -8   -108    115
  -982   -982    208   -982
   184   -982   -982   -982
    67   -982   -982    115
  -982   -982    208   -982
   125   -982     -9   -117
    67     -8   -108    -17
  -982   -982    191   -117
    26   -982     91    -17
  -133     -8   -108    115
  -982   -982    208   -982
    26   -108    124   -982
  -982     -8    150   -117
   -33   -982    172   -982
   148   -108   -982   -117
  -133   -108     50     83
  -982   -982    208   -982
   -33   -982   -982    164
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 9 E= 4.5e-003
 0.000000  0.111111  0.888889  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.111111  0.222222  0.111111  0.555556
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.444444  0.000000  0.000000  0.555556
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.222222  0.111111
 0.444444  0.222222  0.111111  0.222222
 0.000000  0.000000  0.888889  0.111111
 0.333333  0.000000  0.444444  0.222222
 0.111111  0.222222  0.111111  0.555556
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.111111  0.555556  0.000000
 0.000000  0.222222  0.666667  0.111111
 0.222222  0.000000  0.777778  0.000000
 0.777778  0.111111  0.000000  0.111111
 0.111111  0.111111  0.333333  0.444444
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.000000  0.000000  0.777778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[AG][TC]GA[TA]G[AG][ACT]G[GAT][TC]G[GA][GC][GA]A[TG]G[TA]
--------------------------------------------------------------------------------

Time 3.87 secs.

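The Motif 2 regular expression is a compact summary of the probability matrix: in this output, letters with a frequency above roughly 0.2 at a position appear in the expression, and rarer letters are dropped. The following Python sketch, which is an illustration and not part of MEME, checks how many of the nine reported Motif 2 sites the expression matches in full. The sites are copied by hand from the BLOCKS listing; the names `MOTIF2_REGEX`, `MOTIF2_SITES`, and `sites_matching` are hypothetical.

```python
import re

# Motif 2 regular expression, copied from the MEME output.
MOTIF2_REGEX = re.compile(
    r"G[AG][TC]GA[TA]G[AG][ACT]G[GAT][TC]G[GA][GC][GA]A[TG]G[TA]"
)

# Motif 2 sites, copied from the BLOCKS listing (sequence name -> site).
MOTIF2_SITES = {
    "8334": "GATGATGTTGTTGGGGAGGT",
    "269064": "GACGAAGAAGGAGAGGATGT",
    "261179": "GAAGAAGAAGATGGGGAAGT",
    "25456": "GATGATGACGACGACGACGT",
    "21026": "GACGATGACGGTGGTGTTGT",
    "9912": "GATGATGGATGTGGCGATGA",
    "26806": "GGTGATGGTGGCGAGGAGGA",
    "269821": "CAGGAAGAAGTTGCGAATGT",
    "40929": "GGTGAAGAGGAGGGGACGGT",
}


def sites_matching(regex, sites):
    """Return the names of the sites that the regex matches over their full length."""
    return [name for name, site in sites.items() if regex.fullmatch(site)]


matched = sites_matching(MOTIF2_REGEX, MOTIF2_SITES)
print(f"{len(matched)} of {len(MOTIF2_SITES)} sites match: {matched}")
```

For the sites listed here only one of the nine (26806) matches in full, because each of the others contains at least one lower-frequency letter. This suggests the regular expression is best read as a summary, with the log-odds matrix being the more sensitive tool for scanning sequences.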
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 19 sites = 11 llr = 143 E-value = 5.8e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a:55337216742761125
pos.-specific     C  :a:1422882:38249:84
probability       G  ::5113::12:3:::::::
matrix            T  :::4331:::31:1::9:1

Relative Entropy (18.8 bits)
[bits histogram: column alignment not recoverable]

Multilevel           ACGACAACCAAACAACTCA
consensus              ATAG    TC  C   C
sequence                 TT     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                   Site
-------------            ----- ---------            -------------------
40925                      467  1.19e-08 CACATAATAC ACATCTACCCAACACCTCA ATTCCAACAG
34505                      443  2.09e-08 TCACTGTCAC ACATCAACCAACCAACTAA CGTTCACCAC
40929                      275  8.79e-08 AACTTATTAA ACATCAAACATACAACTCC TGCAGCGTCT
269802                      53  1.09e-07 AACAGTGGAT ACAATGCCCAAGCAACTCT CTCTTCCATC
263127                     473  1.34e-07 GTCAGAGTGC ACGATAACCGACAACCTCA AACAACAAC
25456                      281  2.92e-07 GGAGAGGAAG ACGTTCACCGATCCACTCA GAAAGATGCC
26806                      454  3.21e-07 ATACCATACC ACACACACCCTACAACTCA ACTCAAACAC
269064                     204  1.19e-06 GTGGTGGAAT ACGAAGACGATGAACCTCC GGTCGACGAT
261179                     283  2.37e-06 TGAAGGGACA ACGAATACCAAACTACAAC AACGACAACG
21026                      472  3.04e-06 ACCGCCACCG ACGGCTCCCAACCCCATCC CGAGCAGAGG
262371                     241  3.87e-06 TTGATAAAAG ACGAGGTAAAAGCAACTCA CCCTGAGAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40925                             1.2e-08  466_[+3]_15
34505                             2.1e-08  442_[+3]_39
40929                             8.8e-08  274_[+3]_207
269802                            1.1e-07  52_[+3]_429
263127                            1.3e-07  472_[+3]_9
25456                             2.9e-07  280_[+3]_201
26806                             3.2e-07  453_[+3]_28
269064                            1.2e-06  203_[+3]_278
261179                            2.4e-06  282_[+3]_199
21026                               3e-06  471_[+3]_10
262371                            3.9e-06  240_[+3]_241
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=11
40925                    (  467) ACATCTACCCAACACCTCA  1
34505                    (  443) ACATCAACCAACCAACTAA  1
40929                    (  275) ACATCAAACATACAACTCC  1
269802                   (   53) ACAATGCCCAAGCAACTCT  1
263127                   (  473) ACGATAACCGACAACCTCA  1
25456                    (  281) ACGTTCACCGATCCACTCA  1
26806                    (  454) ACACACACCCTACAACTCA  1
269064                   (  204) ACGAAGACGATGAACCTCC  1
261179                   (  283) ACGAATACCAAACTACAAC  1
21026                    (  472) ACGGCTCCCAACCCCATCC  1
262371                   (  241) ACGAGGTAAAAGCAACTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 7230 bayes= 9.71373 E= 5.8e+000
   184  -1010  -1010  -1010
 -1010    209  -1010  -1010
    70  -1010    121  -1010
    70   -137   -137     54
    -3     63   -137     13
    -3    -37     21     13
   138    -37  -1010   -146
   -62    180  -1010  -1010
  -162    180   -137  -1010
   119    -37    -37  -1010
   138  -1010  -1010     13
    38     21     21   -146
   -62    180  -1010  -1010
   138    -37  -1010   -146
   119     63  -1010  -1010
  -162    195  -1010  -1010
  -162  -1010  -1010    186
   -62    180  -1010  -1010
    97     63  -1010   -146
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 11 E= 5.8e+000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.454545  0.000000  0.545455  0.000000
 0.454545  0.090909  0.090909  0.363636
 0.272727  0.363636  0.090909  0.272727
 0.272727  0.181818  0.272727  0.272727
 0.727273  0.181818  0.000000  0.090909
 0.181818  0.818182  0.000000  0.000000
 0.090909  0.818182  0.090909  0.000000
 0.636364  0.181818  0.181818  0.000000
 0.727273  0.000000  0.000000  0.272727
 0.363636  0.272727  0.272727  0.090909
 0.181818  0.818182  0.000000  0.000000
 0.727273  0.181818  0.000000  0.090909
 0.636364  0.363636  0.000000  0.000000
 0.090909  0.909091  0.000000  0.000000
 0.090909  0.000000  0.000000  0.909091
 0.181818  0.818182  0.000000  0.000000
 0.545455  0.363636  0.000000  0.090909
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AC[GA][AT][CAT][AGT]ACCA[AT][ACG]CA[AC]CTC[AC]
--------------------------------------------------------------------------------

Time 5.62 secs.

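The Motif 3 log-odds entries appear to follow from its letter-probability matrix: each score looks like 100 times the base-2 logarithm of a pseudocount-smoothed site frequency divided by the background frequency. The Python sketch below is a reconstruction under stated assumptions, not MEME's actual code. It treats the `b= 0.01` value from the command line summary as a total pseudocount spread according to the background, uses the three-decimal background frequencies printed there, and rounds to the nearest integer. The function name `log_odds_row` and the constants are hypothetical.

```python
import math

# Background letter frequencies (A, C, G, T) as printed in the command line summary.
BACKGROUND = (0.279, 0.235, 0.236, 0.250)
# Assumption: the "b= 0.01" prior weight acts as a total pseudocount.
PSEUDOCOUNT = 0.01


def log_odds_row(probs, nsites, background=BACKGROUND, b=PSEUDOCOUNT):
    """Convert one letter-probability row into integer log-odds scores (bits x 100)."""
    scores = []
    for p, bg in zip(probs, background):
        count = p * nsites  # recover the site count from the printed frequency
        smoothed = (count + b * bg) / (nsites + b)  # assumed pseudocount smoothing
        scores.append(round(100 * math.log2(smoothed / bg)))
    return scores


# First four rows of the Motif 3 letter-probability matrix (nsites= 11).
MOTIF3_ROWS = [
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.000000, 1.000000, 0.000000, 0.000000),
    (0.454545, 0.000000, 0.545455, 0.000000),
    (0.454545, 0.090909, 0.090909, 0.363636),
]

for row in MOTIF3_ROWS:
    print(log_odds_row(row, nsites=11))
```

For these four rows the sketch reproduces the printed log-odds values, including the -1010 floor for letters that never occur at a position. Other rows may differ by one unit, since the background frequencies shown in the output are rounded to three decimals.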
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21026                            2.50e-10  278_[+1(8.27e-08)]_38_\
    [+2(2.07e-08)]_119_[+3(3.04e-06)]_10
25456                            8.62e-11  7_[+2(1.72e-08)]_57_[+2(2.87e-05)]_\
    176_[+3(2.92e-07)]_25_[+1(3.32e-07)]_160
261179                           7.45e-10  5_[+1(1.18e-06)]_15_[+2(4.56e-05)]_\
    112_[+2(6.08e-09)]_94_[+3(2.37e-06)]_199
262371                           6.33e-04  172_[+1(5.91e-05)]_52_\
    [+3(3.87e-06)]_200_[+1(1.13e-05)]_25
263127                           1.65e-06  97_[+1(3.32e-07)]_359_\
    [+3(1.34e-07)]_9
26806                            3.01e-09  182_[+2(3.87e-08)]_201_\
    [+1(6.16e-06)]_34_[+3(3.21e-07)]_28
269064                           1.12e-10  19_[+2(3.41e-09)]_103_\
    [+2(2.62e-05)]_13_[+2(2.42e-06)]_8_[+3(1.19e-06)]_120_[+1(5.47e-07)]_142
269802                           3.16e-05  52_[+3(1.09e-07)]_66_[+1(1.80e-05)]_\
    347
269821                           1.36e-06  57_[+1(3.32e-07)]_376_\
    [+2(2.75e-07)]_31
34505                            1.98e-05  414_[+1(2.97e-05)]_12_\
    [+3(2.09e-08)]_39
40925                            6.69e-07  98_[+1(1.93e-06)]_352_\
    [+3(1.19e-08)]_15
40929                            3.19e-09  4_[+2(3.08e-07)]_250_[+3(8.79e-08)]_\
    55_[+3(9.89e-05)]_81_[+1(3.00e-06)]_36
4295                             5.30e-05  226_[+1(1.92e-09)]_258
8334                             5.38e-07  60_[+2(1.83e-09)]_192_\
    [+2(4.13e-05)]_136_[+1(8.17e-06)]_56
9912                             2.96e-04  99_[+2(3.55e-08)]_381
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************