********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/171/171.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
5466                     1.0000    500  9005                     1.0000    500
43124                    1.0000    500  8770                     1.0000    500
46419                    1.0000    500  21207                    1.0000    500
48876                    1.0000    500  18036                    1.0000    500
5104                     1.0000    500  15795                    1.0000    500
49384                    1.0000    500  42767                    1.0000    500
43365                    1.0000    500  44879                    1.0000    500
49520                    1.0000    500  49564                    1.0000    500
48917                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/171/171.seqs.fa -oc motifs/171 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 17 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8500 N= 17 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.269 C 0.238 G 0.247 T 0.246 Background letter frequencies (from dataset with add-one prior applied): A 0.269 C 0.238 G 0.247 T 0.246 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 13 llr = 152 E-value = 5.0e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 1:42:12512:::2:: pos.-specific C :2:85:22::9:8:52 probability G 1:2:57::9::72:22 matrix T 884:1273:8131846 bits 2.1 1.9 1.7 * * 1.4 * * * * * Relative 1.2 ** * *** * Entropy 1.0 ** * ****** (16.8 bits) 0.8 ** **** ****** 0.6 ** ************* 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel TTACCGTAGTCGCTCT consensus T GT T A T TC sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 42767 368 1.27e-09 GTGGACCAGG TTTCGGTAGTCGCTTT TGGTCTGTCG 21207 16 8.76e-08 CGACTTAAAC 
TTTCCTTTGTCTCTTT ACAACGGAGT 43124 429 1.13e-07 ATCACCAATT TTTCCGCAGACGCTCT CAGTAAGAAT 15795 73 4.86e-07 TTCGCTTGCA TTACGACAGTCGCTCT AGAATACTAG 49384 465 1.52e-06 ATAAACACGG TCACCGTCGTCTCTTC ACATTGGGTG 48876 161 1.96e-06 ATCGAGGGTT TCGCCGTCGTCGCACT GATTCTTCCA 49564 189 2.13e-06 GTGTTTACAA TTAACGATGTCGCTCG TTCCATACGA 48917 69 2.96e-06 CGTACTCTAC TTTCGGTAGTCGGAGC TATAAATCAA 43365 368 3.21e-06 CGTGCAACGA TTGCGGAAATCGCTCC ATAGAGACTC 18036 25 3.47e-06 ACACGACACG ATACGTTTGACGCTTT CTCGCGTTTG 49520 220 7.21e-06 AATTCAAACG TTGATGTAGACGCTGT CGATGCGTCA 5104 413 8.84e-06 GATGTTTTAC TTACCTTTGTTTGTTT GAATCATGAG 9005 28 1.08e-05 CGAGAGCGCG GTTCGGTAGTCTTTCG CCAGAACAGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 42767 1.3e-09 367_[+1]_117 21207 8.8e-08 15_[+1]_469 43124 1.1e-07 428_[+1]_56 15795 4.9e-07 72_[+1]_412 49384 1.5e-06 464_[+1]_20 48876 2e-06 160_[+1]_324 49564 2.1e-06 188_[+1]_296 48917 3e-06 68_[+1]_416 43365 3.2e-06 367_[+1]_117 18036 3.5e-06 24_[+1]_460 49520 7.2e-06 219_[+1]_265 5104 8.8e-06 412_[+1]_72 9005 1.1e-05 27_[+1]_457 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=13 42767 ( 368) TTTCGGTAGTCGCTTT 1 21207 ( 16) TTTCCTTTGTCTCTTT 1 43124 ( 429) TTTCCGCAGACGCTCT 1 15795 ( 73) TTACGACAGTCGCTCT 1 49384 ( 465) TCACCGTCGTCTCTTC 1 48876 ( 161) TCGCCGTCGTCGCACT 1 49564 ( 189) TTAACGATGTCGCTCG 1 48917 ( 69) TTTCGGTAGTCGGAGC 1 43365 ( 368) TTGCGGAAATCGCTCC 1 18036 ( 25) ATACGTTTGACGCTTT 1 49520 ( 220) TTGATGTAGACGCTGT 1 
5104 ( 413) TTACCTTTGTTTGTTT 1 9005 ( 28) GTTCGGTAGTCTTTCG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 8245 bayes= 9.8378 E= 5.0e+000 -180 -1035 -168 178 -1035 -63 -1035 178 51 -1035 -10 64 -81 183 -1035 -1035 -1035 95 90 -168 -180 -1035 149 -9 -81 -63 -1035 149 100 -63 -1035 32 -180 -1035 190 -1035 -22 -1035 -1035 164 -1035 195 -1035 -168 -1035 -1035 149 32 -1035 169 -68 -168 -81 -1035 -1035 178 -1035 95 -68 64 -1035 -5 -68 132 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 5.0e+000 0.076923 0.000000 0.076923 0.846154 0.000000 0.153846 0.000000 0.846154 0.384615 0.000000 0.230769 0.384615 0.153846 0.846154 0.000000 0.000000 0.000000 0.461538 0.461538 0.076923 0.076923 0.000000 0.692308 0.230769 0.153846 0.153846 0.000000 0.692308 0.538462 0.153846 0.000000 0.307692 0.076923 0.000000 0.923077 0.000000 0.230769 0.000000 0.000000 0.769231 0.000000 0.923077 0.000000 0.076923 0.000000 0.000000 0.692308 0.307692 0.000000 0.769231 0.153846 0.076923 0.153846 0.000000 0.000000 0.846154 0.000000 0.461538 0.153846 0.384615 0.000000 0.230769 0.153846 0.615385 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- TT[ATG]C[CG][GT]T[AT]G[TA]C[GT]CT[CT][TC] 
-------------------------------------------------------------------------------- Time 2.57 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 12 llr = 127 E-value = 4.4e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 15:::9:3:::a pos.-specific C 823:4:a:::6: probability G ::1:11:182:: matrix T 137a5::7384: bits 2.1 * * 1.9 * * * 1.7 * * * 1.4 * ** * * Relative 1.2 * * ** ** * Entropy 1.0 * * ** **** (15.2 bits) 0.8 * ** ******* 0.6 * ********** 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel CATTTACTGTCA consensus TC C AT T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 49384 245 9.52e-07 CCCCCGTTCG CATTCACAGTCA CAGACCTACC 48917 313 1.63e-06 TTCATCGTCT CATTTACAGTTA ACTGTAAGTC 46419 243 1.63e-06 GGTGCCGGCG CATTTACAGTTA GACTAATGTA 21207 140 2.14e-06 ATAATAAAGT CTTTTACTTTTA TTAAAACTAT 43365 311 2.80e-06 ACAGAGACCG CTTTGACTGTCA AACCATCTGG 5466 27 3.13e-06 CAAGGTGGAA CCTTCACTTTCA GTCGCTTCGT 15795 240 3.43e-06 CGTGTACTTG CTTTTACTGGTA GAGCTGACGC 18036 216 3.43e-06 TTCGGAGTCA CAGTCACTGTCA GTCAGTCGCA 44879 278 1.13e-05 CACTCCTTTC CTCTCACGGTCA ACACAGGAGT 48876 249 1.54e-05 TCTATCATGT TACTTACTGTTA GTACCTTTAC 43124 118 2.84e-05 GACTCGTGAC ACTTCACTTTCA CCTTGTGCTT 49520 283 3.37e-05 TTTCCGAATG CACTTGCTGGCA CCGACAAGCT 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 49384 9.5e-07 244_[+2]_244 48917 1.6e-06 312_[+2]_176 46419 1.6e-06 242_[+2]_246 21207 2.1e-06 139_[+2]_349 43365 2.8e-06 310_[+2]_178 5466 3.1e-06 26_[+2]_462 15795 3.4e-06 239_[+2]_249 18036 3.4e-06 215_[+2]_273 44879 1.1e-05 277_[+2]_211 48876 1.5e-05 248_[+2]_240 43124 2.8e-05 117_[+2]_371 49520 3.4e-05 282_[+2]_206 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=12 49384 ( 245) CATTCACAGTCA 1 48917 ( 313) CATTTACAGTTA 1 46419 ( 243) CATTTACAGTTA 1 21207 ( 140) CTTTTACTTTTA 1 43365 ( 311) CTTTGACTGTCA 1 5466 ( 27) CCTTCACTTTCA 1 15795 ( 240) CTTTTACTGGTA 1 18036 ( 216) CAGTCACTGTCA 1 44879 ( 278) CTCTCACGGTCA 1 48876 ( 249) TACTTACTGTTA 1 43124 ( 118) ACTTCACTTTCA 1 49520 ( 283) CACTTGCTGGCA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.88212 E= 4.4e+001 -169 181 -1023 -156 89 -51 -1023 44 -1023 7 -156 144 -1023 -1023 -1023 202 -1023 81 -156 102 177 -1023 -156 -1023 -1023 207 -1023 -1023 -11 -1023 -156 144 -1023 -1023 160 2 -1023 -1023 -56 176 -1023 129 -1023 76 189 -1023 -1023 -1023 -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 4.4e+001 0.083333 0.833333 0.000000 0.083333 0.500000 0.166667 0.000000 0.333333 0.000000 0.250000 0.083333 0.666667 0.000000 0.000000 0.000000 1.000000 0.000000 0.416667 0.083333 0.500000 0.916667 0.000000 0.083333 0.000000 0.000000 1.000000 0.000000 0.000000 0.250000 0.000000 0.083333 0.666667 0.000000 0.000000 0.750000 0.250000 0.000000 0.000000 0.166667 0.833333 0.000000 0.583333 0.000000 0.416667 1.000000 0.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- C[AT][TC]T[TC]AC[TA][GT]T[CT]A -------------------------------------------------------------------------------- Time 5.09 secs. 
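
The Motif 2 log-odds matrix and letter-probability matrix above appear to be
related by score = round(100 * log2(p / background)), using the background
letter frequencies printed in the command line summary. The Python sketch
below checks that relation on the first two matrix rows. It is an
illustration under stated assumptions, not MEME's own code: the function name
log_odds_row is hypothetical, and the value -1023 for zero-probability
entries is copied from the printed matrix rather than derived.

```python
import math

# Background letter frequencies as printed in the command line summary.
BACKGROUND = {"A": 0.269, "C": 0.238, "G": 0.247, "T": 0.246}

# Score printed for zero-probability entries in the Motif 2 log-odds matrix.
# Copied from the report, not computed (assumption: it acts as a floor value).
ZERO_SCORE = -1023


def log_odds_row(prob_row, background=BACKGROUND, zero_score=ZERO_SCORE):
    """Convert one A, C, G, T probability row to scores in 1/100 bit units."""
    scores = []
    for letter, p in zip("ACGT", prob_row):
        if p == 0.0:
            scores.append(zero_score)
        else:
            scores.append(round(100 * math.log2(p / background[letter])))
    return scores


# First two rows of the Motif 2 letter-probability matrix.
row1 = log_odds_row([0.083333, 0.833333, 0.000000, 0.083333])
row2 = log_odds_row([0.500000, 0.166667, 0.000000, 0.333333])

print(row1)  # the report prints -169 181 -1023 -156 for this row
print(row2)  # the report prints 89 -51 -1023 44 for this row
```

The other motifs print a different value for zero-probability entries
(-1035 for Motif 1, -923 for Motif 3), so zero_score has to be taken from the
matrix being checked.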
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 20 sites = 6 llr = 105 E-value = 1.1e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A a2::5:27::5:3a5::2:a pos.-specific C ::aa:52388:7::382:8: probability G :2::23::2:52::2288:: matrix T :7::327::2:27:::::2: bits 2.1 ** 1.9 * ** * * 1.7 * ** * * 1.4 * ** ** * ***** Relative 1.2 * ** ** * ***** Entropy 1.0 * ** **** ** ***** (25.2 bits) 0.8 **** ******** ***** 0.6 **** ********* ***** 0.4 ******************** 0.2 ******************** 0.0 -------------------- Multilevel ATCCACTACCACTAACGGCA consensus TG C G A C sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 5466 320 7.74e-10 TGAACTCAAA ATCCAGAACCGCAACCGGCA AGAAGAAGGT 8770 452 3.02e-09 ATACTCCCGG ATCCGCCACTGCTAACGGCA ATGCGGGGAA 46419 161 3.72e-09 CCGTTGCGTC ATCCTCTCGCGCTAACGGTA GAATGGCTGG 15795 272 5.23e-09 TCTCCACTAG AGCCAGTACCACTAGCCGCA GCTCGCTTTT 18036 420 7.82e-09 AGGATAGCAA ATCCTTTACCAGTACCGACA GCTCCCACAT 48917 93 2.00e-08 GCTATAAATC AACCACTCCCATAAAGGGCA CTATTCCGAG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM 
------------- ---------------- ------------- 5466 7.7e-10 319_[+3]_161 8770 3e-09 451_[+3]_29 46419 3.7e-09 160_[+3]_320 15795 5.2e-09 271_[+3]_209 18036 7.8e-09 419_[+3]_61 48917 2e-08 92_[+3]_388 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=20 seqs=6 5466 ( 320) ATCCAGAACCGCAACCGGCA 1 8770 ( 452) ATCCGCCACTGCTAACGGCA 1 46419 ( 161) ATCCTCTCGCGCTAACGGTA 1 15795 ( 272) AGCCAGTACCACTAGCCGCA 1 18036 ( 420) ATCCTTTACCAGTACCGACA 1 48917 ( 93) AACCACTCCCATAAAGGGCA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 8177 bayes= 10.8591 E= 1.1e+002 189 -923 -923 -923 -69 -923 -56 144 -923 207 -923 -923 -923 207 -923 -923 89 -923 -56 44 -923 107 43 -56 -69 -51 -923 144 131 48 -923 -923 -923 181 -56 -923 -923 181 -923 -56 89 -923 102 -923 -923 148 -56 -56 31 -923 -923 144 189 -923 -923 -923 89 48 -56 -923 -923 181 -56 -923 -923 -51 176 -923 -69 -923 176 -923 -923 181 -923 -56 189 -923 -923 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 1.1e+002 1.000000 0.000000 0.000000 0.000000 0.166667 0.000000 0.166667 0.666667 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.500000 0.000000 0.166667 0.333333 0.000000 0.500000 0.333333 0.166667 0.166667 
0.166667 0.000000 0.666667 0.666667 0.333333 0.000000 0.000000 0.000000 0.833333 0.166667 0.000000 0.000000 0.833333 0.000000 0.166667 0.500000 0.000000 0.500000 0.000000 0.000000 0.666667 0.166667 0.166667 0.333333 0.000000 0.000000 0.666667 1.000000 0.000000 0.000000 0.000000 0.500000 0.333333 0.166667 0.000000 0.000000 0.833333 0.166667 0.000000 0.000000 0.166667 0.833333 0.000000 0.166667 0.000000 0.833333 0.000000 0.000000 0.833333 0.000000 0.166667 1.000000 0.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- ATCC[AT][CG]T[AC]CC[AG]C[TA]A[AC]CGGCA -------------------------------------------------------------------------------- Time 8.05 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 5466 1.00e-07 26_[+2(3.13e-06)]_281_\ [+3(7.74e-10)]_161 9005 7.38e-02 27_[+1(1.08e-05)]_457 43124 7.30e-05 117_[+2(2.84e-05)]_299_\ [+1(1.13e-07)]_56 8770 4.63e-05 451_[+3(3.02e-09)]_29 46419 2.25e-07 160_[+3(3.72e-09)]_62_\ [+2(1.63e-06)]_246 21207 6.06e-06 15_[+1(8.76e-08)]_108_\ [+2(2.14e-06)]_349 48876 3.39e-04 160_[+1(1.96e-06)]_72_\ [+2(1.54e-05)]_240 18036 3.66e-09 24_[+1(3.47e-06)]_175_\ [+2(3.43e-06)]_192_[+3(7.82e-09)]_61 5104 8.59e-02 412_[+1(8.84e-06)]_72 15795 4.08e-10 
60_[+2(3.99e-05)]_[+1(4.86e-07)]_\ 151_[+2(3.43e-06)]_20_[+3(5.23e-09)]_209 49384 1.09e-05 244_[+2(9.52e-07)]_208_\ [+1(1.52e-06)]_20 42767 5.61e-06 367_[+1(1.27e-09)]_117 43365 1.74e-04 310_[+2(2.80e-06)]_45_\ [+1(3.21e-06)]_117 44879 3.41e-02 277_[+2(1.13e-05)]_211 49520 1.24e-03 219_[+1(7.21e-06)]_47_\ [+2(3.37e-05)]_206 49564 1.27e-02 188_[+1(2.13e-06)]_296 48917 3.79e-09 68_[+1(2.96e-06)]_8_[+3(2.00e-08)]_\ 200_[+2(1.63e-06)]_176 -------------------------------------------------------------------------------- ******************************************************************************** ******************************************************************************** Stopped because nmotifs = 3 reached. ******************************************************************************** CPU: seaotter.hsd1.wa.comcast.net ********************************************************************************
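
The combined block diagrams in the summary can be read as alternating spacer
lengths and motif occurrences. The Python sketch below parses one diagram
(sequence 18036, with the line-continuation backslash removed) and recovers
the 1-based start positions and the total sequence length. It is a minimal
sketch under assumptions: the names TOKEN and parse_diagram are hypothetical,
and the motif widths are taken from the MOTIF 1, 2 and 3 header lines of this
report.

```python
import re

# Motif widths as stated in the MOTIF header lines of this report.
MOTIF_WIDTHS = {1: 16, 2: 12, 3: 20}

# A token is either a motif occurrence such as [+1(3.47e-06)]
# or a spacer length such as 175.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, length); each site is (motif, strand, start, p_value)."""
    position = 0  # number of sequence positions consumed so far
    sites = []
    for match in TOKEN.finditer(diagram):
        spacer = match.group(4)
        if spacer is not None:
            position += int(spacer)
            continue
        strand = match.group(1)
        motif = int(match.group(2))
        p_value = float(match.group(3))
        sites.append((motif, strand, position + 1, p_value))
        position += widths[motif]
    return sites, position


diagram_18036 = "24_[+1(3.47e-06)]_175_[+2(3.43e-06)]_192_[+3(7.82e-09)]_61"
sites_18036, length_18036 = parse_diagram(diagram_18036)

for motif, strand, start, p_value in sites_18036:
    print(motif, strand, start, p_value)
print("total length:", length_18036)
```

For sequence 18036 the recovered starts (25, 216, 420) agree with the Start
column of the per-motif site tables, and the total agrees with the length of
500 listed in the training set.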