********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/436/436.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11372                    1.0000    500  21432                    1.0000    500
24305                    1.0000    500  24452                    1.0000    500
25017                    1.0000    500  25516                    1.0000    500
261368                   1.0000    500  264837                   1.0000    500
2954                     1.0000    500  34487                    1.0000    500
34541                    1.0000    500  35381                    1.0000    500
37472                    1.0000    500  42133                    1.0000    500
750                      1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/436/436.seqs.fa -oc motifs/436 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.270 C 0.236 G 0.230 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.270 C 0.236 G 0.230 T 0.265
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  15  sites =  13  llr = 144  E-value = 6.7e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  114:241:112252:
pos.-specific     C  :62:41:2::::::1
probability       G  9:2a359:19252:9
matrix            T  :33:1::88:6338:

         bits    2.1    *
                 1.9    *
                 1.7 *  *  *  *    *
                 1.5 *  *  *  *    *
Relative         1.3 *  *  ** *   **
Entropy          1.1 *  *  ****   **
(16.0 bits)      0.8 ** *  ****   **
                 0.6 ** * ******  **
                 0.4 ** * **********
                 0.2 ** ************
                 0.0 ---------------

Multilevel           GCAGCGGTTGTGATG
consensus             TT GA C  ATT
sequence                 A      A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
34487                        69  2.30e-08 TAGGTTATAT GCAGCAGTTGTTATG TGTAAGGTAT
24452                       256  3.06e-07 ATGTGTAGGT GTGGCGGTTGTAATG GTGACACGTC
37472                       348  5.59e-07 GGTAAGTCAG GCCGCAGCTGTGTTG GGAATTGATC
35381                       332  6.24e-07 ATGTGAGTGG GCTGAGGTTGATTTG GCGAGAGCAC
34541                        67  8.66e-07 CTCAACAGCA GCAGGGGTTGGTGTG ATGATGTGCG
750                         109  1.17e-06 GATTAGGTCG GCTGCGGCGGTGATG CGGGAGGGGG
42133                       347  5.26e-06 ACAGAGCAAA GTCGGAGCTGAGTTG TCTGATATCA
2954                        120  5.26e-06 TGCGTGGTCC ATTGGGGTTGAGATG AGAGGTACAA
25017                        64  5.72e-06 AGATTGGCGT GTAGGCGTTGTAGTG TCTGGTTGCT
261368                      485  8.92e-06 ACGCCGAAAC GCTGCAATTGTTAAG A
25516                       102  9.57e-06 ACGATGATGG GCAGAGGTAGTGATC CATCAAGTGT
24305                       213  9.57e-06 TGAATAAAGT GAAGTGGTTGTGAAG AATGTGCTTT
11372                         6  3.28e-05      GTGTT GCGGAAGTTAGATTG TTCTTTCTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34487                             2.3e-08  68_[+1]_417
24452                             3.1e-07  255_[+1]_230
37472                             5.6e-07  347_[+1]_138
35381                             6.2e-07  331_[+1]_154
34541                             8.7e-07  66_[+1]_419
750                               1.2e-06  108_[+1]_377
42133                             5.3e-06  346_[+1]_139
2954                              5.3e-06  119_[+1]_366
25017                             5.7e-06  63_[+1]_422
261368                            8.9e-06  484_[+1]_1
25516                             9.6e-06  101_[+1]_384
24305                             9.6e-06  212_[+1]_273
11372                             3.3e-05  5_[+1]_480
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=13
34487                    (   69) GCAGCAGTTGTTATG  1
24452                    (  256) GTGGCGGTTGTAATG  1
37472                    (  348) GCCGCAGCTGTGTTG  1
35381                    (  332) GCTGAGGTTGATTTG  1
34541                    (   67) GCAGGGGTTGGTGTG  1
750                      (  109) GCTGCGGCGGTGATG  1
42133                    (  347) GTCGGAGCTGAGTTG  1
2954                     (  120) ATTGGGGTTGAGATG  1
25017                    (   64) GTAGGCGTTGTAGTG  1
261368                   (  485) GCTGCAATTGTTAAG  1
25516                    (  102) GCAGAGGTAGTGATC  1
24305                    (  213) GAAGTGGTTGTGAAG  1
11372                    (    6) GCGGAAGTTAGATTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7290 bayes= 9.66 E= 6.7e-002
  -181  -1035    200  -1035
  -181    138  -1035     22
    51    -61    -58     22
 -1035  -1035    212  -1035
   -22     71     42   -178
    51   -161    123  -1035
  -181  -1035    200  -1035
 -1035     -3  -1035    154
  -181  -1035   -158    168
  -181  -1035    200  -1035
   -22  -1035    -58    122
   -22  -1035    101     22
   100  -1035    -58     22
   -81  -1035  -1035    168
 -1035   -161    200  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 13 E= 6.7e-002
 0.076923  0.000000  0.923077  0.000000
 0.076923  0.615385  0.000000  0.307692
 0.384615  0.153846  0.153846  0.307692
 0.000000  0.000000  1.000000  0.000000
 0.230769  0.384615  0.307692  0.076923
 0.384615  0.076923  0.538462  0.000000
 0.076923  0.000000  0.923077  0.000000
 0.000000  0.230769  0.000000  0.769231
 0.076923  0.000000  0.076923  0.846154
 0.076923  0.000000  0.923077  0.000000
 0.230769  0.000000  0.153846  0.615385
 0.230769  0.000000  0.461538  0.307692
 0.538462  0.000000  0.153846  0.307692
 0.153846  0.000000  0.000000  0.846154
 0.000000  0.076923  0.923077  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[CT][AT]G[CGA][GA]G[TC]TG[TA][GTA][AT]TG
--------------------------------------------------------------------------------

Time  2.33 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =   8  llr = 114  E-value = 9.3e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::3:5::::14::
pos.-specific     C  1:8::344:1:61:::
probability       G  3a3:8:5::3::8::a
matrix            T  6::a3511a6a4:6a:

         bits    2.1  *             *
                 1.9  * *    * *   **
                 1.7  * *    * *   **
                 1.5  * *    * *   **
Relative         1.3  ****   * *   **
Entropy          1.1  ****   * *** **
(20.6 bits)      0.8  ****   * ******
                 0.6 ***** **********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TGCTGTGATTTCGTTG
consensus            G G TACC G T A
sequence                  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
24452                       348  4.18e-09 GACCGTTGCT TGCTGTGCTGTCGTTG GCCCCGACAC
21432                       348  2.42e-08 TTTTTGTTGC TGCTTTGATTTTGTTG CCTAATCCAG
25017                       133  5.21e-08 TGTTGATGTG TGCTGAGCTGTTGTTG AATATGTGAT
264837                      447  6.48e-08 AAAGTGTCGT TGGTGCCATTTCGATG GCAGTAAACC
750                         141  2.27e-07 GGGAGTGGGT GGCTGTCATTTTCTTG TTTTTGTGTG
42133                        46  3.88e-07 TCGATATGCT CGCTGTGCTTTCAATG GTTCCAGACG
25516                        67  3.88e-07 GTGTATCTTT GGGTGATATTTCGTTG CTCATTTTGA
2954                         62  1.05e-06 GAAATTCGCC TGCTTCCTTCTCGATG GATTGCTGGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24452                             4.2e-09  347_[+2]_137
21432                             2.4e-08  347_[+2]_137
25017                             5.2e-08  132_[+2]_352
264837                            6.5e-08  446_[+2]_38
750                               2.3e-07  140_[+2]_344
42133                             3.9e-07  45_[+2]_439
25516                             3.9e-07  66_[+2]_418
2954                              1.1e-06  61_[+2]_423
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=8
24452                    (  348) TGCTGTGCTGTCGTTG  1
21432                    (  348) TGCTTTGATTTTGTTG  1
25017                    (  133) TGCTGAGCTGTTGTTG  1
264837                   (  447) TGGTGCCATTTCGATG  1
750                      (  141) GGCTGTCATTTTCTTG  1
42133                    (   46) CGCTGTGCTTTCAATG  1
25516                    (   67) GGGTGATATTTCGTTG  1
2954                     (   62) TGCTTCCTTCTCGATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 10.5647 E= 9.3e-001
  -965    -91     12    124
  -965   -965    212   -965
  -965    167     12   -965
  -965   -965   -965    192
  -965   -965    170     -8
   -11      9   -965     92
  -965     67    112   -108
    89     67   -965   -108
  -965   -965   -965    192
  -965    -91     12    124
  -965   -965   -965    192
  -965    141   -965     50
  -111    -91    170   -965
    47   -965   -965    124
  -965   -965   -965    192
  -965   -965    212   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 9.3e-001
 0.000000  0.125000  0.250000  0.625000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.250000  0.000000  0.500000
 0.000000  0.375000  0.500000  0.125000
 0.500000  0.375000  0.000000  0.125000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.125000  0.250000  0.625000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.625000  0.000000  0.375000
 0.125000  0.125000  0.750000  0.000000
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TG]G[CG]T[GT][TAC][GC][AC]T[TG]T[CT]G[TA]TG
--------------------------------------------------------------------------------

Time  4.27 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   9  llr = 142  E-value = 5.5e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  377::6::a326::262:23:
pos.-specific     C  73:a842a:37:a78439337
probability       G  ::::::1::::1:2:::12::
matrix            T  ::3:2:7::313:1::4:233

         bits    2.1    *   *    *
                 1.9    *   **   *
                 1.7    *   **   *
                 1.5    *   **   *    *
Relative         1.3    **  **   * *  *
Entropy          1.1 ****** **   * ** *  *
(22.7 bits)      0.8 ********* * **** *  *
                 0.6 ********* ****** *  *
                 0.4 ****************** **
                 0.2 ****************** **
                 0.0 ---------------------

Multilevel           CAACCATCAACACCCATCCAC
consensus            ACT TCC  CAT GACC ACT
sequence                      T      A GT
                                       T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
34487                       140  1.44e-09 CAACAGAGCA AAACCACCATCACCCCTCCAC TCAGCTTCCA
261368                      412  1.44e-09 ACATGCATCG CAACCCTCAACGCCCATCGTC CCACGCATCT
750                         473  1.70e-09 CCTCCGCCCC CAACCCGCACCACCCACCGCC ACCACCA
2954                        480  2.21e-08 CCCTCGAGCA CAACCACCAACACCCCCGACC
37472                       475  4.19e-08 AACCGAGTAC AAACCCTCACCACCAAACTAT TCGCC
264837                      162  4.96e-08 TCGTTCCTCG CATCCATCAACTCTCATCATT TGCAAATCGA
21432                        88  9.77e-08 AGTAAGCCCC CCACTATCATATCGCCTCTCC CTCTTCCACC
35381                       465  1.92e-07 AATTCCCTTC CCTCCCTCATTTCCACCCCTT CCTCACCCTA
24452                       444  2.14e-07 GAAAAATCTC ACTCTATCACAACGCAACCAC TAGAACTGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34487                             1.4e-09  139_[+3]_340
261368                            1.4e-09  411_[+3]_68
750                               1.7e-09  472_[+3]_7
2954                              2.2e-08  479_[+3]
37472                             4.2e-08  474_[+3]_5
264837                              5e-08  161_[+3]_318
21432                             9.8e-08  87_[+3]_392
35381                             1.9e-07  464_[+3]_15
24452                             2.1e-07  443_[+3]_36
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=9
34487                    (  140) AAACCACCATCACCCCTCCAC  1
261368                   (  412) CAACCCTCAACGCCCATCGTC  1
750                      (  473) CAACCCGCACCACCCACCGCC  1
2954                     (  480) CAACCACCAACACCCCCGACC  1
37472                    (  475) AAACCCTCACCACCAAACTAT  1
264837                   (  162) CATCCATCAACTCTCATCATT  1
21432                    (   88) CCACTATCATATCGCCTCTCC  1
35381                    (  465) CCTCCCTCATTTCCACCCCTT  1
24452                    (  444) ACTCTATCACAACGCAACCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 9.77651 E= 5.5e-001
    31    150   -982   -982
   130     50   -982   -982
   130   -982   -982     33
  -982    208   -982   -982
  -982    172   -982    -25
   104     91   -982   -982
  -982     -8   -105    133
  -982    208   -982   -982
   189   -982   -982   -982
    31     50   -982     33
   -28    150   -982   -125
   104   -982   -105     33
  -982    208   -982   -982
  -982    150     -5   -125
   -28    172   -982   -982
   104     91   -982   -982
   -28     50   -982     75
  -982    191   -105   -982
   -28     50     -5    -25
    31     50   -982     33
  -982    150   -982     33
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 5.5e-001
 0.333333  0.666667  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.777778  0.000000  0.222222
 0.555556  0.444444  0.000000  0.000000
 0.000000  0.222222  0.111111  0.666667
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.222222  0.666667  0.000000  0.111111
 0.555556  0.000000  0.111111  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.222222  0.111111
 0.222222  0.777778  0.000000  0.000000
 0.555556  0.444444  0.000000  0.000000
 0.222222  0.333333  0.000000  0.444444
 0.000000  0.888889  0.111111  0.000000
 0.222222  0.333333  0.222222  0.222222
 0.333333  0.333333  0.000000  0.333333
 0.000000  0.666667  0.000000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CA][AC][AT]C[CT][AC][TC]CA[ACT][CA][AT]C[CG][CA][AC][TCA]C[CAGT][ACT][CT]
--------------------------------------------------------------------------------

Time  6.04 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11372                            2.44e-02  5_[+1(3.28e-05)]_480
21432                            3.44e-08  87_[+3(9.77e-08)]_239_\
    [+2(2.42e-08)]_137
24305                            3.30e-02  140_[+1(7.49e-05)]_57_\
    [+1(9.57e-06)]_273
24452                            1.59e-11  255_[+1(3.06e-07)]_77_\
    [+2(4.18e-09)]_80_[+3(2.14e-07)]_36
25017                            9.22e-06  63_[+1(5.72e-06)]_54_[+2(5.21e-08)]_\
    352
25516                            3.03e-05  66_[+2(3.88e-07)]_19_[+1(9.57e-06)]_\
    384
261368                           4.30e-07  143_[+3(4.85e-05)]_247_\
    [+3(1.44e-09)]_52_[+1(8.92e-06)]_1
264837                           2.38e-08  161_[+3(4.96e-08)]_264_\
    [+2(6.48e-08)]_38
2954                             4.68e-09  19_[+2(9.35e-05)]_26_[+2(1.05e-06)]_\
    42_[+1(5.26e-06)]_345_[+3(2.21e-08)]
34487                            8.92e-10  68_[+1(2.30e-08)]_56_[+3(1.44e-09)]_\
    340
34541                            1.86e-03  66_[+1(8.66e-07)]_419
35381                            1.86e-06  331_[+1(6.24e-07)]_118_\
    [+3(1.92e-07)]_15
37472                            9.31e-07  347_[+1(5.59e-07)]_76_\
    [+3(2.45e-05)]_15_[+3(4.19e-08)]_5
42133                            3.85e-05  45_[+2(3.88e-07)]_285_\
    [+1(5.26e-06)]_139
750                              2.56e-11  108_[+1(1.17e-06)]_17_\
    [+2(2.27e-07)]_11_[+1(4.05e-05)]_290_[+3(1.70e-09)]_7
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************