********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/352/352.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
32097                    1.0000    500  13710                    1.0000    500
47535                    1.0000    500  47653                    1.0000    500
47769                    1.0000    500  48718                    1.0000    500
48882                    1.0000    500  48883                    1.0000    500
43686                    1.0000    500  49136                    1.0000    500
6914                     1.0000    500  44527                    1.0000    500
11183                    1.0000    500  44984                    1.0000    500
8445                     1.0000    500  45247                    1.0000    500
45419                    1.0000    500  35858                    1.0000    500
32815                    1.0000    500  35582                    1.0000    500
45136                    1.0000    500  43818                    1.0000    500
47760                    1.0000    500  34555                    1.0000    500
45610                    1.0000    500  46143                    1.0000    500
44558                    1.0000    500  48575                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/352/352.seqs.fa -oc motifs/352 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       28    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           14000    N=              28
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.231 G 0.219 T 0.277
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.231 G 0.219 T 0.277
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 12 sites = 14 llr = 150 E-value = 4.8e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a4611:a2::6:
pos.-specific     C  :6::16:417:9
probability       G  :::9:4:4:3::
matrix            T  ::4:9:::9:41

         bits    2.2
                 2.0 *     *
                 1.8 *  *  *
                 1.5 *  *  * *  *
Relative         1.3 *  *  * ** *
Entropy          1.1 ** **** ** *
(15.5 bits)      0.9 ******* ****
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ACAGTCACTCAC
consensus             AT  G G GT
sequence                    A
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47760                       374  6.23e-08 GGCAAGATTC ACAGTCACTCAC TTACTCACTA
47769                       369  4.33e-07 AGGGAATACG AAAGTCACTCAC GCGTTCTCCG
44527                       188  8.12e-07 TAGAACTCTG ACAGTGAGTCTC GGTGAACGGA
35582                       327  1.03e-06 GTATGTTCGC AAAGTGAGTCAC GTTGGAGCGT
47535                       203  1.03e-06 ACTTACAGCT ACAGTCAATCAC GACTGACTCT
34555                       211  4.24e-06 ACCCTATCTC ACTGTCACTGTC AGTCCTCCCC
45419                         9  4.24e-06   GTGCACTC ACTGTCACTGTC AGTCATTTCA
45136                       202  6.31e-06 CATGTCACTC ACAGTCACCCAC TCTACATCGC
48882                       135  7.06e-06 AAAATACCTG ACTGTGAATGAC ATGACTGCGT
32097                       141  9.50e-06 ATTGATTTTA ACAGCGACTCTC ATGACCGTTA
48718                       106  1.12e-05 ATTAAAAAGT AAAGTGAGTCTT ATCTACTGTT
43686                       236  1.94e-05 TTCCGTGACC AAAGACAATCAC GTCATTGCAC
48883                       433  2.79e-05 GTGTGAGAGA AATATGAGTCAC TGATACCGTC
43818                       385  2.93e-05 CTCACAGGCA AATGTCAGTGTT CGGCGGCTAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47760                             6.2e-08  373_[+1]_115
47769                             4.3e-07  368_[+1]_120
44527                             8.1e-07  187_[+1]_301
35582                               1e-06  326_[+1]_162
47535                               1e-06  202_[+1]_286
34555                             4.2e-06  210_[+1]_278
45419                             4.2e-06  8_[+1]_480
45136                             6.3e-06  201_[+1]_287
48882                             7.1e-06  134_[+1]_354
32097                             9.5e-06  140_[+1]_348
48718                             1.1e-05  105_[+1]_383
43686                             1.9e-05  235_[+1]_253
48883                             2.8e-05  432_[+1]_56
43818                             2.9e-05  384_[+1]_104
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=14
47760                    (  374) ACAGTCACTCAC  1
47769                    (  369) AAAGTCACTCAC  1
44527                    (  188) ACAGTGAGTCTC  1
35582                    (  327) AAAGTGAGTCAC  1
47535                    (  203) ACAGTCAATCAC  1
34555                    (  211) ACTGTCACTGTC  1
45419                    (    9) ACTGTCACTGTC  1
45136                    (  202) ACAGTCACCCAC  1
48882                    (  135) ACTGTGAATGAC  1
32097                    (  141) ACAGCGACTCTC  1
48718                    (  106) AAAGTGAGTCTT  1
43686                    (  236) AAAGACAATCAC  1
48883                    (  433) AATATGAGTCAC  1
43818                    (  385) AATGTCAGTGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 13692 bayes= 10.5384 E= 4.8e+001
   187  -1045  -1045  -1045
    65    131  -1045  -1045
   123  -1045  -1045     37
  -194  -1045    209  -1045
  -194   -169  -1045    163
 -1045    131     97  -1045
   187  -1045  -1045  -1045
   -35     89     71  -1045
 -1045   -169  -1045    174
 -1045    163     39  -1045
   106  -1045  -1045     63
 -1045    189  -1045    -95
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 4.8e+001
 1.000000  0.000000  0.000000  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.642857  0.000000  0.000000  0.357143
 0.071429  0.000000  0.928571  0.000000
 0.071429  0.071429  0.000000  0.857143
 0.000000  0.571429  0.428571  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.214286  0.428571  0.357143  0.000000
 0.000000  0.071429  0.000000  0.928571
 0.000000  0.714286  0.285714  0.000000
 0.571429  0.000000  0.000000  0.428571
 0.000000  0.857143  0.000000  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
A[CA][AT]GT[CG]A[CGA]T[CG][AT]C
--------------------------------------------------------------------------------

Time 6.51 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 18 sites = 4 llr = 79 E-value = 8.7e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::aa::3::::::5:::
pos.-specific     C  3:8:::53a::3a::::a
probability       G  8a3::335:a88:3:aa:
matrix            T  :::::83:::3::85:::

         bits    2.2  *      **  *  ***
                 2.0  * **   **  *  ***
                 1.8  * **   **  *  ***
                 1.5  * **   **  *  ***
Relative         1.3 *****   *****  ***
Entropy          1.1 ******  ****** ***
(28.5 bits)      0.9 ******  **********
                 0.7 ******************
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           GGCAATCGCGGGCTAGGC
consensus            C G  GGA  TC GT
sequence                   TC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
48575                       158  1.36e-10 AGCTTTAGGA GGCAATGCCGGGCTTGGC GGCAATTCCA
13710                        30  3.75e-10 GCCCGGATAG GGGAATTGCGGGCTAGGC GTCGCCCGGA
47760                       300  5.58e-10 AATAACCAGT GGCAAGCGCGTGCTAGGC TCCAACCACC
45419                       337  3.06e-09 GCACATAAGT CGCAATCACGGCCGTGGC ATTTTCTGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48575                             1.4e-10  157_[+2]_325
13710                             3.7e-10  29_[+2]_453
47760                             5.6e-10  299_[+2]_183
45419                             3.1e-09  336_[+2]_146
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=4
48575                    (  158) GGCAATGCCGGGCTTGGC  1
13710                    (   30) GGGAATTGCGGGCTAGGC  1
47760                    (  300) GGCAAGCGCGTGCTAGGC  1
45419                    (  337) CGCAATCACGGCCGTGGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 13524 bayes= 11.7228 E= 8.7e+001
  -865     12    178   -865
  -865   -865    219   -865
  -865    170     19   -865
   187   -865   -865   -865
   187   -865   -865   -865
  -865   -865     19    144
  -865    111     19    -15
   -13     12    119   -865
  -865    211   -865   -865
  -865   -865    219   -865
  -865   -865    178    -15
  -865     12    178   -865
  -865    211   -865   -865
  -865   -865     19    144
    87   -865   -865     85
  -865   -865    219   -865
  -865   -865    219   -865
  -865    211   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 4 E= 8.7e+001
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.500000  0.250000  0.250000
 0.250000  0.250000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.250000  0.750000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[GC]G[CG]AA[TG][CGT][GAC]CG[GT][GC]C[TG][AT]GGC
--------------------------------------------------------------------------------

Time 13.18 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 22 llr = 201 E-value = 3.1e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::37:11:67a
pos.-specific     C  :5::28:35:::
probability       G  73:7:216:43:
matrix            T  31a:1:8:5:::

         bits    2.2
                 2.0            *
                 1.8   *        *
                 1.5   *        *
Relative         1.3   *        *
Entropy          1.1 * ** *   ***
(13.2 bits)      0.9 * ** ** ****
                 0.7 * **********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GCTGACTGCAAA
consensus            TG A   CTGG
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
48718                       309  2.22e-07 CAGTCGTGGC GCTGACTGTGAA TTTGGGAAGT
48883                       185  1.31e-06 GGTGGTTTCC GCTAACTGTAAA CCTATCACGG
47653                       393  1.48e-06 TTTTAAAAAT GCTGCCTGCGAA GGTGTCATAA
49136                       184  5.79e-06 CCACCGGTCA TCTAACTGTAAA ACGAAATTGT
45419                       207  6.51e-06 TCTTGTTCAA TCTAACTGTGAA TGAAAAAGTT
48882                       307  6.51e-06 TGTGGGAATG GTTGACTGCAGA AGCCAAAGAA
35582                       215  1.38e-05 GGTCTCTGCC GCTATCTGTAAA AAATATAACC
32097                       407  1.66e-05 GAGGTGCAAC GGTGAGTGTGGA ATCACGTCAG
44527                         5  2.23e-05       GTTT TGTGAGTGTGAA GACTGGCGCT
43686                       434  2.23e-05 AGCGTTTTGG GGTATCTGCAAA TTCCCTGGAA
47769                       309  2.23e-05 AATTACCTAT GCTAACTACGAA TACCGTTCCG
45136                       306  2.46e-05 AGTACTCACG GCTGACGCTAGA GTGTCGACGA
43818                       483  2.71e-05 ACAACATATC TTTGACTCCAAA GCAGCC
13710                       148  2.97e-05 GGGCGACGCT GCTGACTTCAGA ATACTCGTCT
35858                        85  4.24e-05 ATACAAGCAT TGTGAGTCCAAA TACATCTTTC
45247                        71  4.24e-05 TTTCACCGCG TGTGACGGCAGA CGCGATGAAA
8445                         37  4.24e-05 GTCTTCCTTG GGTGTCTCCAGA GTCTCATTGT
47535                       277  5.03e-05 TAATATTCAA GATGCCTGTGAA AGAGGCAATA
46143                       315  6.29e-05 GGAAAAAAAG TCTAACAGTAAA CCCGGATGAA
34555                       328  1.11e-04 GACACGGCCG GCTGCATCCAAA ACATCCCGGA
48575                       226  1.18e-04 CTGTAGAAAT GCTGAGGACGAA TCCGTTTCTG
44558                       281  2.07e-04 TTACATTAGT GTTGCCACTGAA AGCCACCGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48718                             2.2e-07  308_[+3]_180
48883                             1.3e-06  184_[+3]_304
47653                             1.5e-06  392_[+3]_96
49136                             5.8e-06  183_[+3]_305
45419                             6.5e-06  206_[+3]_282
48882                             6.5e-06  306_[+3]_182
35582                             1.4e-05  214_[+3]_274
32097                             1.7e-05  406_[+3]_82
44527                             2.2e-05  4_[+3]_484
43686                             2.2e-05  433_[+3]_55
47769                             2.2e-05  308_[+3]_180
45136                             2.5e-05  305_[+3]_183
43818                             2.7e-05  482_[+3]_6
13710                               3e-05  147_[+3]_341
35858                             4.2e-05  84_[+3]_404
45247                             4.2e-05  70_[+3]_418
8445                              4.2e-05  36_[+3]_452
47535                               5e-05  276_[+3]_212
46143                             6.3e-05  314_[+3]_174
34555                             0.00011  327_[+3]_161
48575                             0.00012  225_[+3]_263
44558                             0.00021  280_[+3]_208
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=22
48718                    (  309) GCTGACTGTGAA  1
48883                    (  185) GCTAACTGTAAA  1
47653                    (  393) GCTGCCTGCGAA  1
49136                    (  184) TCTAACTGTAAA  1
45419                    (  207) TCTAACTGTGAA  1
48882                    (  307) GTTGACTGCAGA  1
35582                    (  215) GCTATCTGTAAA  1
32097                    (  407) GGTGAGTGTGGA  1
44527                    (    5) TGTGAGTGTGAA  1
43686                    (  434) GGTATCTGCAAA  1
47769                    (  309) GCTAACTACGAA  1
45136                    (  306) GCTGACGCTAGA  1
43818                    (  483) TTTGACTCCAAA  1
13710                    (  148) GCTGACTTCAGA  1
35858                    (   85) TGTGAGTCCAAA  1
45247                    (   71) TGTGACGGCAGA  1
8445                     (   37) GGTGTCTCCAGA  1
47535                    (  277) GATGCCTGTGAA  1
46143                    (  315) TCTAACAGTAAA  1
34555                    (  328) GCTGCATCCAAA  1
48575                    (  226) GCTGAGGACGAA  1
44558                    (  281) GTTGCCACTGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 13692 bayes= 10.3069 E= 3.1e+002
 -1110  -1110    164     20
  -259    124     32   -102
 -1110  -1110  -1110    185
    22  -1110    164  -1110
   132    -34  -1110   -102
  -259    174    -27  -1110
  -159  -1110    -68    148
  -159     24    143   -260
 -1110    111  -1110     85
   111  -1110     90  -1110
   141  -1110     32  -1110
   187  -1110  -1110  -1110
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 22 E= 3.1e+002
 0.000000  0.000000  0.681818  0.318182
 0.045455  0.545455  0.272727  0.136364
 0.000000  0.000000  0.000000  1.000000
 0.318182  0.000000  0.681818  0.000000
 0.681818  0.181818  0.000000  0.136364
 0.045455  0.772727  0.181818  0.000000
 0.090909  0.000000  0.136364  0.772727
 0.090909  0.272727  0.590909  0.045455
 0.000000  0.500000  0.000000  0.500000
 0.590909  0.000000  0.409091  0.000000
 0.727273  0.000000  0.272727  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[GT][CG]T[GA]ACT[GC][CT][AG][AG]A
--------------------------------------------------------------------------------

Time 19.37 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32097                            1.57e-03  140_[+1(9.50e-06)]_254_\
    [+3(1.66e-05)]_82
13710                            4.08e-07  29_[+2(3.75e-10)]_100_\
    [+3(2.97e-05)]_341
47535                            3.41e-04  202_[+1(1.03e-06)]_62_\
    [+3(5.03e-05)]_212
47653                            8.85e-03  392_[+3(1.48e-06)]_96
47769                            4.71e-05  308_[+3(2.23e-05)]_48_\
    [+1(4.33e-07)]_120
48718                            3.67e-05  2_[+3(3.59e-06)]_91_[+1(1.12e-05)]_\
    191_[+3(2.22e-07)]_180
48882                            4.41e-04  134_[+1(7.06e-06)]_160_\
    [+3(6.51e-06)]_182
48883                            2.55e-04  184_[+3(1.31e-06)]_236_\
    [+1(2.79e-05)]_56
43686                            9.10e-04  235_[+1(1.94e-05)]_186_\
    [+3(2.23e-05)]_55
49136                            2.28e-02  183_[+3(5.79e-06)]_305
6914                             4.63e-01  500
44527                            1.18e-04  4_[+3(2.23e-05)]_171_[+1(8.12e-07)]_\
    301
11183                            6.19e-01  500
44984                            7.78e-01  500
8445                             9.32e-02  36_[+3(4.24e-05)]_452
45247                            4.12e-02  70_[+3(4.24e-05)]_418
45419                            3.39e-09  8_[+1(4.24e-06)]_186_[+3(6.51e-06)]_\
    118_[+2(3.06e-09)]_116_[+1(1.02e-05)]_18
35858                            1.38e-01  84_[+3(4.24e-05)]_404
32815                            9.79e-01  500
35582                            2.65e-04  214_[+3(1.38e-05)]_100_\
    [+1(1.03e-06)]_162
45136                            3.93e-04  201_[+1(6.31e-06)]_92_\
    [+3(2.46e-05)]_183
43818                            7.66e-03  384_[+1(2.93e-05)]_86_\
    [+3(2.71e-05)]_6
47760                            2.71e-09  299_[+2(5.58e-10)]_56_\
    [+1(6.23e-08)]_115
34555                            3.70e-03  210_[+1(4.24e-06)]_278
45610                            1.17e-01  500
46143                            2.46e-01  314_[+3(6.29e-05)]_174
44558                            3.42e-01  500
48575                            7.02e-07  157_[+2(1.36e-10)]_325
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************