********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/345/345.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37272                    1.0000    500  38427                    1.0000    500
43593                    1.0000    500  54151                    1.0000    500
44809                    1.0000    500  34794                    1.0000    500
45213                    1.0000    500  27326                    1.0000    500
45986                    1.0000    500  36256                    1.0000    500
48182                    1.0000    500  48183                    1.0000    500
31535                    1.0000    500  48447                    1.0000    500
41618                    1.0000    500  32513                    1.0000    500
48976                    1.0000    500  45716                    1.0000    500
49099                    1.0000    500  50569                    1.0000    500
46024                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/345/345.seqs.fa -oc motifs/345 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       21    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10500    N=              21
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.227 G 0.214 T 0.285
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.227 G 0.214 T 0.285
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  21  llr = 184  E-value = 6.8e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a27769::2a98
pos.-specific     C  :223113a1:::
probability       G  :5::3:4:5::2
matrix            T  ::1:::2:1:::

         bits    [information content diagram, scale 0.0 to 2.2 bits]
Relative Entropy (12.7 bits)

Multilevel           AGAAAAGCGAAA
consensus             A CG C A
sequence              C    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
43593                       256  6.86e-08 AGGGCATAAA AGAAAAGCGAAA AACTAGAATA
38427                       379  5.61e-07 TTACGTTGGT AAAAAAGCGAAA GTGCGCAGTA
46024                        85  2.70e-06 ATAAGATCGT ACACAACCGAAA CGGGGTTGAT
41618                       446  3.97e-06 GCCTTTAGCA AGAAGAGCTAAA GAAAGCATTG
31535                       446  3.97e-06 GCCTTTAGCA AGAAGAGCTAAA GAAAGCATTG
48976                       339  5.74e-06 CATCAGAAAC AGCAGATCGAAA TGTTCTGTCT
54151                       371  1.97e-05 GTATGTGTGC ACCAAACCAAAA CAACCATCAC
44809                       182  2.17e-05 ACTAACTGTA AGAGAATCGAAA GCGCAAAAAG
34794                       231  2.36e-05 GCATTCACCA AATAAATCGAAA GGTGATCCTG
48182                       429  2.91e-05 ACAAGAATGC ACAAGACCAAAG CAGCGTCGGG
27326                        39  2.91e-05 ATACTTTCGT AGACGACCAAAG GCGTAAAAAC
45213                       198  3.83e-05 GGTATGAAAA AGCAAAGCGGAA TTTTGCCGCT
37272                       268  4.21e-05 CTCCACTTTC ATAAAACCAAAA CATGTCCTAT
49099                        14  4.54e-05 CAGAAACACA ACAAAACCCAAG ACCATAATGA
48183                       454  5.70e-05 CTTCACTGTA AGACAATCGAGA CATACAACCG
50569                        45  6.19e-05 AGAGTGCAAA AGAAACGCAAAG TGGGTGAAGA
45716                       369  6.19e-05 AGTGAATAAC AACCCAGCGAAA CACAGAAAGA
45986                       374  1.39e-04 GTAAATCAGC ACAACACCGACA CATATCGCCA
48447                       387  2.28e-04 GCCCCAGAAC AGTCACGCTAAA ATCTCTAAAG
36256                       245  2.57e-04 CCTCTACACT CATCGAGCGAAA GAAAAAGATT
32513                       273  2.72e-04 TCCCCTCTTT AAAAGATCCAAC TTGACCATGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43593                             6.9e-08  255_[+1]_233
38427                             5.6e-07  378_[+1]_110
46024                             2.7e-06  84_[+1]_404
41618                               4e-06  445_[+1]_43
31535                               4e-06  445_[+1]_43
48976                             5.7e-06  338_[+1]_150
54151                               2e-05  370_[+1]_118
44809                             2.2e-05  181_[+1]_307
34794                             2.4e-05  230_[+1]_258
48182                             2.9e-05  428_[+1]_60
27326                             2.9e-05  38_[+1]_450
45213                             3.8e-05  197_[+1]_291
37272                             4.2e-05  267_[+1]_221
49099                             4.5e-05  13_[+1]_475
48183                             5.7e-05  453_[+1]_35
50569                             6.2e-05  44_[+1]_444
45716                             6.2e-05  368_[+1]_120
45986                             0.00014  373_[+1]_115
48447                             0.00023  386_[+1]_102
36256                             0.00026  244_[+1]_244
32513                             0.00027  272_[+1]_216
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=21
43593                    (  256) AGAAAAGCGAAA  1
38427                    (  379) AAAAAAGCGAAA  1
46024                    (   85) ACACAACCGAAA  1
41618                    (  446) AGAAGAGCTAAA  1
31535                    (  446) AGAAGAGCTAAA  1
48976                    (  339) AGCAGATCGAAA  1
54151                    (  371) ACCAAACCAAAA  1
44809                    (  182) AGAGAATCGAAA  1
34794                    (  231) AATAAATCGAAA  1
48182                    (  429) ACAAGACCAAAG  1
27326                    (   39) AGACGACCAAAG  1
45213                    (  198) AGCAAAGCGGAA  1
37272                    (  268) ATAAAACCAAAA  1
49099                    (   14) ACAAAACCCAAG  1
48183                    (  454) AGACAATCGAGA  1
50569                    (   45) AGAAACGCAAAG  1
45716                    (  369) AACCCAGCGAAA  1
45986                    (  374) ACAACACCGACA  1
48447                    (  387) AGTCACGCTAAA  1
36256                    (  245) CATCGAGCGAAA  1
32513                    (  273) AAAAGATCCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10269 bayes= 8.93074 E= 6.8e-001
   180   -225  -1104  -1104
   -20      7    115   -258
   129    -26  -1104   -100
   129     33   -217  -1104
   106   -125     64  -1104
   173   -125  -1104  -1104
 -1104     55    100    -26
 -1104    214  -1104  -1104
   -20   -125    129   -100
   180  -1104   -217  -1104
   173   -225   -217  -1104
   148   -225    -17  -1104
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 21 E= 6.8e-001
 0.952381  0.047619  0.000000  0.000000
 0.238095  0.238095  0.476190  0.047619
 0.666667  0.190476  0.000000  0.142857
 0.666667  0.285714  0.047619  0.000000
 0.571429  0.095238  0.333333  0.000000
 0.904762  0.095238  0.000000  0.000000
 0.000000  0.333333  0.428571  0.238095
 0.000000  1.000000  0.000000  0.000000
 0.238095  0.095238  0.523810  0.142857
 0.952381  0.000000  0.047619  0.000000
 0.904762  0.047619  0.047619  0.000000
 0.761905  0.047619  0.190476  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
A[GAC]A[AC][AG]A[GCT]C[GA]AAA
--------------------------------------------------------------------------------

Time  3.95 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   6  llr = 114  E-value = 2.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3::5:::::::53:::32::
pos.-specific     C  5:2:32388:::::::a:2:2
probability       G  ::87::72:::::::7:7738
matrix            T  57:328::2aaa57a3:::7:

         bits    [information content diagram, scale 0.0 to 2.2 bits]
Relative Entropy (27.5 bits)

Multilevel           CTGGATGCCTTTATTGCGGTG
consensus            TA TC C     TA T A G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
41618                       213  1.90e-11 GACTTATAAG CAGGCTGCCTTTAATGCGGTG TTTTGTCCTC
31535                       213  1.90e-11 GGTACATGAG CAGGCTGCCTTTAATGCGGTG TTTTGTCCTC
27326                       186  5.52e-10 TGAAAAAGCC CTCGATGCCTTTATTTCAGTG AAGTTGCGCT
46024                       268  2.11e-09 CCTTCGAATT TTGGATGGCTTTTTTGCAAGG CTCGAGGTTC
45986                       302  4.65e-09 CAGTCCTGTT TTGTACCCTTTTTTTGCGGGG TCTATGGAAA
54151                       298  1.30e-08 CATCGGGTCC TTGTTTCCCTTTTTTTCGCTC ACAGACTGCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41618                             1.9e-11  212_[+2]_267
31535                             1.9e-11  212_[+2]_267
27326                             5.5e-10  185_[+2]_294
46024                             2.1e-09  267_[+2]_212
45986                             4.7e-09  301_[+2]_178
54151                             1.3e-08  297_[+2]_182
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
41618                    (  213) CAGGCTGCCTTTAATGCGGTG  1
31535                    (  213) CAGGCTGCCTTTAATGCGGTG  1
27326                    (  186) CTCGATGCCTTTATTTCAGTG  1
46024                    (  268) TTGGATGGCTTTTTTGCAAGG  1
45986                    (  302) TTGTACCCTTTTTTTGCGGGG  1
54151                    (  298) TTGTTTCCCTTTTTTTCGCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 10080 bayes= 11.1611 E= 2.0e+001
  -923    114   -923     81
    29   -923   -923    122
  -923    -45    196   -923
  -923   -923    164     23
    87     55   -923    -77
  -923    -45   -923    155
  -923     55    164   -923
  -923    187    -36   -923
  -923    187   -923    -77
  -923   -923   -923    181
  -923   -923   -923    181
  -923   -923   -923    181
    87   -923   -923     81
    29   -923   -923    122
  -923   -923   -923    181
  -923   -923    164     23
  -923    214   -923   -923
    29   -923    164   -923
   -71    -45    164   -923
  -923   -923     64    122
  -923    -45    196   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.0e+001
 0.000000  0.500000  0.000000  0.500000
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.500000  0.333333  0.000000  0.166667
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.500000  0.000000  0.000000  0.500000
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.166667  0.166667  0.666667  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.166667  0.833333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CT][TA]G[GT][AC]T[GC]CCTTT[AT][TA]T[GT]C[GA]G[TG]G
--------------------------------------------------------------------------------

Time  7.75 secs.

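The log-odds and letter-probability matrices reported for each motif appear
to be related in a simple way. The sketch below is an observation fitted to
the numbers in this file, not a statement of how MEME computes its matrices:
it assumes each score is 100 * log2(p' / background), where p' is the letter
probability smoothed with the b= 0.01 prior weight from the command line
summary. The function name log_odds and the smoothing formula are
assumptions made for illustration. The formula reproduces the entries for
unobserved letters exactly (-1104, -923 and -865 for 21, 6 and 4 sites) and
most other entries; a few may differ by one unit, possibly because the
background frequencies are printed to only three decimals.

```python
import math

# Background frequencies as printed (rounded) in the COMMAND LINE SUMMARY.
BACKGROUND = {"A": 0.273, "C": 0.227, "G": 0.214, "T": 0.285}
PRIOR_WEIGHT = 0.01  # the "b= 0.01" value on the "em:" line


def log_odds(prob, nsites, letter, b=PRIOR_WEIGHT, bg=BACKGROUND):
    """Approximate a log-odds entry from a letter-probability entry.

    Assumed relation (fitted to this file, not taken from MEME itself):
    smooth the observed count with b pseudo-sites drawn from the
    background, then report 100 * log2(smoothed / background).
    """
    count = prob * nsites
    smoothed = (count + b * bg[letter]) / (nsites + b)
    return round(100 * math.log2(smoothed / bg[letter]))


if __name__ == "__main__":
    # First row of the Motif 2 letter-probability matrix (nsites= 6).
    row = {"A": 0.0, "C": 0.5, "G": 0.0, "T": 0.5}
    # Compare with the first Motif 2 log-odds row: -923 114 -923 81
    print([log_odds(p, 6, letter) for letter, p in row.items()])
```

For a letter that never occurs at a position the background cancels out, so
the score depends only on the number of sites, which is why each motif has a
single large negative value throughout its log-odds matrix.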
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   4  llr = 90  E-value = 8.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::38:::a8::::833aa83:
pos.-specific     C  8:8:::5::aa:a3::::3::
probability       G  3a:::35::::a:::::::5a
matrix            T  :::3a8::3:::::88:::3:

         bits    [information content diagram, scale 0.0 to 2.2 bits]
Relative Entropy (32.3 bits)

Multilevel           CGCATTCAACCGCATTAAAGG
consensus            G AT GG T    CAA  CA
sequence                                T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
41618                       250  2.02e-13 CCTCAAACAC CGCATTGAACCGCATTAAAGG TTTTGAACTC
31535                       250  2.02e-13 CCTCAAACAC CGCATTGAACCGCATTAAAGG TTTTGAACTC
48182                        87  3.04e-10 GACTAAAATT CGCATTCATCCGCCAAAAAAG CCAACACTGT
45986                        15  4.18e-10 AATCTGGTGT GGATTGCAACCGCATTAACTG GAAAGGCAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41618                               2e-13  249_[+3]_230
31535                               2e-13  249_[+3]_230
48182                               3e-10  86_[+3]_393
45986                             4.2e-10  14_[+3]_465
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=4
41618                    (  250) CGCATTGAACCGCATTAAAGG  1
31535                    (  250) CGCATTGAACCGCATTAAAGG  1
48182                    (   87) CGCATTCATCCGCCAAAAAAG  1
45986                    (   15) GGATTGCAACCGCATTAACTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 10080 bayes= 11.2986 E= 8.6e+001
  -865    172     22   -865
  -865   -865    222   -865
   -13    172   -865   -865
   145   -865   -865    -19
  -865   -865   -865    181
  -865   -865     22    139
  -865    114    122   -865
   187   -865   -865   -865
   145   -865   -865    -19
  -865    213   -865   -865
  -865    213   -865   -865
  -865   -865    222   -865
  -865    213   -865   -865
   145     14   -865   -865
   -13   -865   -865    139
   -13   -865   -865    139
   187   -865   -865   -865
   187   -865   -865   -865
   145     14   -865   -865
   -13   -865    122    -19
  -865   -865    222   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 8.6e+001
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.500000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.250000  0.000000  0.000000  0.750000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CG]G[CA][AT]T[TG][CG]A[AT]CCGC[AC][TA][TA]AA[AC][GAT]G
--------------------------------------------------------------------------------

Time 11.57 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37272                            4.31e-02  267_[+1(4.21e-05)]_221
38427                            6.70e-04  378_[+1(5.61e-07)]_110
43593                            2.84e-05  255_[+1(6.86e-08)]_233
54151                            8.89e-06  297_[+2(1.30e-08)]_52_\
                                           [+1(1.97e-05)]_118
44809                            5.89e-02  181_[+1(2.17e-05)]_307
34794                            1.05e-01  230_[+1(2.36e-05)]_258
45213                            2.87e-02  197_[+1(3.83e-05)]_291
27326                            4.59e-07  38_[+1(2.91e-05)]_135_\
                                           [+2(5.52e-10)]_294
45986                            1.53e-11  14_[+3(4.18e-10)]_266_\
                                           [+2(4.65e-09)]_178
36256                            3.98e-01  500
48182                            1.80e-07  86_[+3(3.04e-10)]_321_\
                                           [+1(2.91e-05)]_60
48183                            7.77e-02  453_[+1(5.70e-05)]_35
31535                            2.04e-18  83_[+2(1.82e-05)]_108_\
                                           [+2(1.90e-11)]_16_[+3(2.02e-13)]_175_[+1(3.97e-06)]_43
48447                            1.66e-01  500
41618                            2.04e-18  212_[+2(1.90e-11)]_16_\
                                           [+3(2.02e-13)]_175_[+1(3.97e-06)]_43
32513                            2.64e-01  500
48976                            3.20e-02  338_[+1(5.74e-06)]_150
45716                            3.55e-02  368_[+1(6.19e-05)]_120
49099                            1.78e-01  13_[+1(4.54e-05)]_475
50569                            1.57e-01  44_[+1(6.19e-05)]_444
46024                            2.68e-07  84_[+1(2.70e-06)]_171_\
                                           [+2(2.11e-09)]_212
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
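
The combined block diagrams in the summary can be turned back into site
coordinates. In a diagram such as 212_[+2(1.90e-11)]_16_[+3(2.02e-13)]_175_
[+1(3.97e-06)]_43, each bare number is a spacer length and each bracket is a
motif occurrence with its strand, motif number and p-value. The sketch below
is one possible reading of that notation; parse_diagram and MOTIF_WIDTHS are
hypothetical names, the widths are copied from the MOTIF headers of this
file, and the trailing "\" is treated as a line continuation.

```python
import re

# Motif widths copied from the MOTIF 1..3 header lines of this file.
MOTIF_WIDTHS = {1: 12, 2: 21, 3: 21}

# Either a motif occurrence "[+2(1.90e-11)]" or a bare spacer length "212".
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for one combined block diagram.

    Each site is (motif, strand, start, p_value), with a 1-based start
    to match the "Start" column of the per-motif site tables.
    """
    # Join continuation lines: drop the backslash and any whitespace.
    diagram = "".join(diagram.replace("\\", "").split())
    position = 0
    sites = []
    for match in TOKEN.finditer(diagram):
        strand, motif, p_value, spacer = match.groups()
        if spacer is not None:
            position += int(spacer)
        else:
            motif = int(motif)
            sites.append((motif, strand, position + 1, float(p_value)))
            position += widths[motif]
    return sites, position


if __name__ == "__main__":
    # Diagram for sequence 41618, including its continuation backslash.
    text = r"212_[+2(1.90e-11)]_16_\ [+3(2.02e-13)]_175_[+1(3.97e-06)]_43"
    sites, length = parse_diagram(text)
    for motif, strand, start, p_value in sites:
        print(motif, strand, start, p_value)
    print("sequence length:", length)
```

For sequence 41618 this yields starts 213, 250 and 446 for motifs 2, 3 and
1, which agree with the per-motif site tables, and a total of 500, which
agrees with the sequence length in the training set.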