********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/391/391.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9490                     1.0000    500  42751                    1.0000    500
54049                    1.0000    500  4994                     1.0000    500
47361                    1.0000    500  14255                    1.0000    500
47656                    1.0000    500  38120                    1.0000    500
14639                    1.0000    500  14736                    1.0000    500
6940                     1.0000    500  40369                    1.0000    500
50328                    1.0000    500  43796                    1.0000    500
31234                    1.0000    500  50535                    1.0000    500
33867                    1.0000    500  35406                    1.0000    500
45731                    1.0000    500  51933                    1.0000    500
12269                    1.0000    500  20460                    1.0000    500
46192                    1.0000    500  12642                    1.0000    500
47072                    1.0000    500  38494                    1.0000    500
37399                    1.0000    500  35478                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/391/391.seqs.fa -oc motifs/391 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       28    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           14000    N=              28
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.243 G 0.235 T 0.254
Background letter frequencies (from dataset with add-one prior applied):
A 0.267 C 0.243 G 0.235 T 0.254
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  16  llr = 179  E-value = 3.8e-006
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:4a:3:13a6:
pos.-specific     C  3:3:a1::2:::
probability       G  414:::a:6:4:
matrix            T  39:::7:9:::a

         bits    2.1     * *
                 1.9    ** *  * *
                 1.7  * ** *  * *
                 1.5  * ** ** * *
Relative         1.3  * ** ** * *
Entropy          1.0  * ** ** ***
(16.1 bits)      0.8  * ***** ***
                 0.6  * *********
                 0.4  ***********
                 0.2 ************
                 0.0 ------------

Multilevel           GTAACTGTGAAT
consensus            T G  A  A G
sequence             C C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
35406                       246  1.77e-07 AACGTTGACA GTAACTGTGAAT TGGTTTTGGT
38494                       355  2.99e-07 ACGATCCGTT TTGACTGTGAAT ACCGGTAGTC
20460                       337  2.99e-07 ATGGTCGACG TTGACTGTGAAT CTGGTTTCAT
12642                        80  7.19e-07 GTGAATGGGA CTGACTGTGAGT AGGCAATCCC
42751                        49  7.19e-07 GGTCGTATGA CTGACTGTGAGT ATGCCGTTTT
50328                       448  2.65e-06 CATTTGGCAA TTAACTGTAAGT TTGGCGACGA
46192                       255  3.13e-06 TCCGAACGGT GTCACTGTCAAT TGCGGCTAGA
50535                       128  4.17e-06 TTGACAATGA CTAACAGTGAGT CAGACTCACT
43796                       416  4.36e-06 TCAAATGATC TTCACTGTCAGT TGTCTTTACA
47361                       335  4.92e-06 CCCCCACCCT CTCACTGTCAGT AGAGCATCTA
35478                       271  5.19e-06 AGTTTGAAGG GTAACAGTAAAT AGGATTGCAG
37399                       271  5.19e-06 AGTTTGAAGG GTAACAGTAAAT AGGATTGCAG
14639                       170  6.93e-06 TCCGGAGCAT GTCACTGAGAAT GGCGGGTACG
47072                       471  9.13e-06 GAGTTTACAT TGGACTGTGAAT TGTTACTACA
40369                       478  2.00e-05 ACGGATGGCC ATAACAGTAAAT CCCATTAAAG
14736                        61  2.41e-05 ACGGGATGGT GTGACCGAGAGT AGGAAGTTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35406                             1.8e-07  245_[+1]_243
38494                               3e-07  354_[+1]_134
20460                               3e-07  336_[+1]_152
12642                             7.2e-07  79_[+1]_409
42751                             7.2e-07  48_[+1]_440
50328                             2.6e-06  447_[+1]_41
46192                             3.1e-06  254_[+1]_234
50535                             4.2e-06  127_[+1]_361
43796                             4.4e-06  415_[+1]_73
47361                             4.9e-06  334_[+1]_154
35478                             5.2e-06  270_[+1]_218
37399                             5.2e-06  270_[+1]_218
14639                             6.9e-06  169_[+1]_319
47072                             9.1e-06  470_[+1]_18
40369                               2e-05  477_[+1]_11
14736                             2.4e-05  60_[+1]_428
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=16
35406                    (  246) GTAACTGTGAAT  1
38494                    (  355) TTGACTGTGAAT  1
20460                    (  337) TTGACTGTGAAT  1
12642                    (   80) CTGACTGTGAGT  1
42751                    (   49) CTGACTGTGAGT  1
50328                    (  448) TTAACTGTAAGT  1
46192                    (  255) GTCACTGTCAAT  1
50535                    (  128) CTAACAGTGAGT  1
43796                    (  416) TTCACTGTCAGT  1
47361                    (  335) CTCACTGTCAGT  1
35478                    (  271) GTAACAGTAAAT  1
37399                    (  271) GTAACAGTAAAT  1
14639                    (  170) GTCACTGAGAAT  1
47072                    (  471) TGGACTGTGAAT  1
40369                    (  478) ATAACAGTAAAT  1
14736                    (   61) GTGACCGAGAGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 13692 bayes= 10.477 E= 3.8e-006
  -209      4     67     30
 -1064  -1064   -191    188
    49      4     67  -1064
   190  -1064  -1064  -1064
 -1064    204  -1064  -1064
   -10   -196  -1064    144
 -1064  -1064    209  -1064
  -110  -1064  -1064    178
   -10    -37    126  -1064
   190  -1064  -1064  -1064
   107  -1064     89  -1064
 -1064  -1064  -1064    198
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 3.8e-006
 0.062500  0.250000  0.375000  0.312500
 0.000000  0.000000  0.062500  0.937500
 0.375000  0.250000  0.375000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.062500  0.000000  0.687500
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.250000  0.187500  0.562500  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.562500  0.000000  0.437500  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GTC]T[AGC]AC[TA]GT[GA]A[AG]T
--------------------------------------------------------------------------------




Time  6.67 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   7  llr = 126  E-value = 4.8e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3:1316339:3:::::1:7:
pos.-specific     C  :::6:63:6::3:::9::::1
probability       G  96a:7317117:3a1:a9:19
matrix            T  11:3::::::347:91::a1:

         bits    2.1   *          *  *
                 1.9   *          *  * *
                 1.7   *          *  * *
                 1.5 * *          ****** *
Relative         1.3 * * *  * **  ****** *
Entropy          1.0 * * *  * ** ******* *
(25.9 bits)      0.8 * * *  * ** *********
                 0.6 *********** *********
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GGGCGCAGCAGTTGTCGGTAG
consensus             A TAGCAA TAG
sequence                        C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
35478                       348  5.66e-12 CCCCCAACGT GGGCGCAACAGATGTCGGTAG CAGGACAGAC
37399                       348  5.66e-12 CCCCCAACGT GGGCGCAACAGATGTCGGTAG CAGGACAGAC
12642                       374  2.59e-10 GCGTCTTTTT GGGCGGCGCATCGGTCGGTAG GGGTCTTAAA
31234                        18  1.13e-08 TTCCCGAGCA GGGTGCGGCGGTTGTTGGTGG TGGTACTTTC
38494                       203  1.57e-08 AATGGACGGA GAGAACAGAAGCGGTCGATAG CTTTCTATAG
38120                       192  2.66e-08 AGTGCAGGCC TTGTAAAGGAGTTGTCGGTAG TAGTCCAGAA
4994                        196  4.91e-08 TGTCTTAATC GAGCGGCGAATTTGGCGGTTC ATGATTTGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35478                             5.7e-12  347_[+2]_132
37399                             5.7e-12  347_[+2]_132
12642                             2.6e-10  373_[+2]_106
31234                             1.1e-08  17_[+2]_462
38494                             1.6e-08  202_[+2]_277
38120                             2.7e-08  191_[+2]_288
4994                              4.9e-08  195_[+2]_284
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=7
35478                    (  348) GGGCGCAACAGATGTCGGTAG  1
37399                    (  348) GGGCGCAACAGATGTCGGTAG  1
12642                    (  374) GGGCGGCGCATCGGTCGGTAG  1
31234                    (   18) GGGTGCGGCGGTTGTTGGTGG  1
38494                    (  203) GAGAACAGAAGCGGTCGATAG  1
38120                    (  192) TTGTAAAGGAGTTGTCGGTAG  1
4994                     (  196) GAGCGGCGAATTTGGCGGTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 13440 bayes= 11.5121 E= 4.8e-001
  -945   -945    186    -83
     9   -945    128    -83
  -945   -945    209   -945
   -90    123   -945     17
     9   -945    160   -945
   -90    123     28   -945
   109     23    -72   -945
     9   -945    160   -945
     9    123    -72   -945
   168   -945    -72   -945
  -945   -945    160     17
     9     23   -945     75
  -945   -945     28    149
  -945   -945    209   -945
  -945   -945    -72    175
  -945    182   -945    -83
  -945   -945    209   -945
   -90   -945    186   -945
  -945   -945   -945    198
   142   -945    -72    -83
  -945    -77    186   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 4.8e-001
 0.000000  0.000000  0.857143  0.142857
 0.285714  0.000000  0.571429  0.142857
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.571429  0.000000  0.285714
 0.285714  0.000000  0.714286  0.000000
 0.142857  0.571429  0.285714  0.000000
 0.571429  0.285714  0.142857  0.000000
 0.285714  0.000000  0.714286  0.000000
 0.285714  0.571429  0.142857  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.000000  0.000000  0.714286  0.285714
 0.285714  0.285714  0.000000  0.428571
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.714286  0.000000  0.142857  0.142857
 0.000000  0.142857  0.857143  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[GA]G[CT][GA][CG][AC][GA][CA]A[GT][TAC][TG]GTCGGTAG
--------------------------------------------------------------------------------




Time 13.53 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  20  sites =   7  llr = 123  E-value = 5.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::6::::16::3::
pos.-specific     C  a31:7739:1a9::1:6::3
probability       G  :6::1::::::193:337:7
matrix            T  :19a137149::16371:a:

         bits    2.1 *         *
                 1.9 *  *      *       *
                 1.7 *  *      *       *
                 1.5 * **   * ****     *
Relative         1.3 * ** * * ****    ***
Entropy          1.0 * ** ********  * ***
(25.4 bits)      0.8 * ***********  * ***
                 0.6 ********************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CGTTCCTCATCCGTATCGTG
consensus             C   TC T    GTGGA C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
35478                        46  7.42e-13 AAAATCCTCT CGTTCCTCATCCGTATCGTG CGAATCGTTG
37399                        46  7.42e-13 AAAATCCTCT CGTTCCTCATCCGTATCGTG CGAATCGTTG
31234                        98  1.07e-08 ACGCCTGGGT CGTTCCCCTCCCGATGCGTG CCGTGCCCTA
20460                       273  2.11e-08 CTTTTCTTAG CTTTCTTCATCCGGCGTGTG CACCAAATGG
47072                       199  2.71e-08 TCTATGTCGG CCCTGCCCTTCCGTATGATG GAATAGTTGT
42751                       242  2.71e-08 CAAACAGCCA CCTTTCTCATCGGTTTCATC AATTTACTCT
14736                       265  3.19e-08 TCCCTTCCAC CGTTCTTTTTCCTGATGGTC ACGTCGAAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35478                             7.4e-13  45_[+3]_435
37399                             7.4e-13  45_[+3]_435
31234                             1.1e-08  97_[+3]_383
20460                             2.1e-08  272_[+3]_208
47072                             2.7e-08  198_[+3]_282
42751                             2.7e-08  241_[+3]_239
14736                             3.2e-08  264_[+3]_216
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=7
35478                    (   46) CGTTCCTCATCCGTATCGTG  1
37399                    (   46) CGTTCCTCATCCGTATCGTG  1
31234                    (   98) CGTTCCCCTCCCGATGCGTG  1
20460                    (  273) CTTTCTTCATCCGGCGTGTG  1
47072                    (  199) CCCTGCCCTTCCGTATGATG  1
42751                    (  242) CCTTTCTCATCGGTTTCATC  1
14736                    (  265) CGTTCTTTTTCCTGATGGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 13468 bayes= 11.5151 E= 5.6e+000
  -945    204   -945   -945
  -945     23    128    -83
  -945    -77   -945    175
  -945   -945   -945    198
  -945    155    -72    -83
  -945    155   -945     17
  -945     23   -945    149
  -945    182   -945    -83
   109   -945   -945     75
  -945    -77   -945    175
  -945    204   -945   -945
  -945    182    -72   -945
  -945   -945    186    -83
   -90   -945     28    117
   109    -77   -945     17
  -945   -945     28    149
  -945    123     28    -83
     9   -945    160   -945
  -945   -945   -945    198
  -945     23    160   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 5.6e+000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.285714  0.571429  0.142857
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.714286  0.142857  0.142857
 0.000000  0.714286  0.000000  0.285714
 0.000000  0.285714  0.000000  0.714286
 0.000000  0.857143  0.000000  0.142857
 0.571429  0.000000  0.000000  0.428571
 0.000000  0.142857  0.000000  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.857143  0.142857  0.000000
 0.000000  0.000000  0.857143  0.142857
 0.142857  0.000000  0.285714  0.571429
 0.571429  0.142857  0.000000  0.285714
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.571429  0.285714  0.142857
 0.285714  0.000000  0.714286  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.285714  0.714286  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[GC]TTC[CT][TC]C[AT]TCCG[TG][AT][TG][CG][GA]T[GC]
--------------------------------------------------------------------------------




Time 20.34 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9490                             3.14e-01  500
42751                            3.83e-08  48_[+1(7.19e-07)]_86_[+1(9.67e-06)]_\
                                           83_[+3(2.71e-08)]_103_[+2(6.26e-05)]_115
54049                            8.42e-01  500
4994                             7.38e-04  195_[+2(4.91e-08)]_284
47361                            7.00e-03  334_[+1(4.92e-06)]_19_\
                                           [+1(6.37e-05)]_123
14255                            4.96e-01  500
47656                            8.10e-01  500
38120                            2.96e-04  191_[+2(2.66e-08)]_288
14639                            8.37e-03  169_[+1(6.93e-06)]_319
14736                            7.49e-06  60_[+1(2.41e-05)]_43_[+3(7.88e-05)]_\
                                           129_[+3(3.19e-08)]_216
6940                             1.56e-01  500
40369                            3.25e-02  477_[+1(2.00e-05)]_11
50328                            1.45e-02  447_[+1(2.65e-06)]_41
43796                            2.54e-02  415_[+1(4.36e-06)]_73
31234                            8.60e-09  17_[+2(1.13e-08)]_59_[+3(1.07e-08)]_\
                                           383
50535                            2.02e-02  127_[+1(4.17e-06)]_361
33867                            7.05e-01  500
35406                            2.64e-03  245_[+1(1.77e-07)]_243
45731                            4.42e-01  500
51933                            5.22e-01  500
12269                            2.98e-01  500
20460                            1.82e-07  272_[+3(2.11e-08)]_44_\
                                           [+1(2.99e-07)]_152
46192                            9.95e-03  254_[+1(3.13e-06)]_234
12642                            3.14e-09  63_[+1(2.34e-06)]_4_[+1(7.19e-07)]_\
                                           282_[+2(2.59e-10)]_106
47072                            4.66e-06  198_[+3(2.71e-08)]_252_\
                                           [+1(9.13e-06)]_18
38494                            2.33e-07  202_[+2(1.57e-08)]_131_\
                                           [+1(2.99e-07)]_134
37399                            2.89e-18  45_[+3(7.42e-13)]_205_\
                                           [+1(5.19e-06)]_65_[+2(5.66e-12)]_132
35478                            2.89e-18  45_[+3(7.42e-13)]_205_\
                                           [+1(5.19e-06)]_65_[+2(5.66e-12)]_132
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************