********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/421/421.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
49150                    1.0000    500  49170                    1.0000    500
8543                     1.0000    500  26405                    1.0000    500
45112                    1.0000    500  34724                    1.0000    500
12097                    1.0000    500  43416                    1.0000    500
35662                    1.0000    500  47347                    1.0000    500
40550                    1.0000    500  35771                    1.0000    500
44262                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/421/421.seqs.fa -oc motifs/421 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.290 C 0.232 G 0.205 T 0.273
Background letter frequencies (from dataset with add-one prior applied):
A 0.290 C 0.232 G 0.205 T 0.273
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   9  llr = 102  E-value = 1.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  6aa::1:2:::2
pos.-specific     C  ::::a1::::4:
probability       G  4::7:::3::37
matrix            T  :::3:8a4aa21

         bits    2.3
                 2.1     *
                 1.8  ** * * **
                 1.6  ** * * **
Relative         1.4  ** * * **
Entropy          1.1  **** * **
(16.4 bits)      0.9 ******* ** *
                 0.7 ******* ****
                 0.5 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AAAGCTTTTTCG
consensus            G  T   G  GA
sequence                    A  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
34724                        55  1.04e-07 ACTTCAACAA GAAGCTTTTTCG AAATATGCGG
12097                        62  8.00e-07 CCGAACTTGA AAAGCTTATTCG TGCTATTGCT
45112                        52  1.51e-06 GGATGTAATA GAATCTTGTTGG AAGCTCTAAG
26405                       290  1.82e-06 AAAGGTGAGC GAAGCTTTTTCA AACTGTGCTT
44262                       476  2.24e-06 AGCAGAACTC GAAGCTTTTTGA AGAAGTGCCT
40550                       396  2.42e-06 TGTCCTCCGG AAAGCTTATTTG TAGGTGTTTG
43416                       268  4.60e-06 TGGTCGCGCA AAAGCTTTTTCT TGGCATCGAT
49150                       319  1.12e-05 ATGGGAACAG AAATCATGTTGG CTGGTTTGCT
8543                         17  1.48e-05 TTGGATATAC AAATCCTGTTTG GCCACGGAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34724                               1e-07  54_[+1]_434
12097                               8e-07  61_[+1]_427
45112                             1.5e-06  51_[+1]_437
26405                             1.8e-06  289_[+1]_199
44262                             2.2e-06  475_[+1]_13
40550                             2.4e-06  395_[+1]_93
43416                             4.6e-06  267_[+1]_221
49150                             1.1e-05  318_[+1]_170
8543                              1.5e-05  16_[+1]_472
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
34724                    (   55) GAAGCTTTTTCG  1
12097                    (   62) AAAGCTTATTCG  1
45112                    (   52) GAATCTTGTTGG  1
26405                    (  290) GAAGCTTTTTCA  1
44262                    (  476) GAAGCTTTTTGA  1
40550                    (  396) AAAGCTTATTTG  1
43416                    (  268) AAAGCTTTTTCT  1
49150                    (  319) AAATCATGTTGG  1
8543                     (   17) AAATCCTGTTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.59664 E= 1.8e+001
    93  -982   112  -982
   178  -982  -982  -982
   178  -982  -982  -982
  -982  -982   170    29
  -982   211  -982  -982
  -138  -106  -982   151
  -982  -982  -982   187
   -39  -982    70    70
  -982  -982  -982   187
  -982  -982  -982   187
  -982    94    70   -30
   -39  -982   170  -129
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 1.8e+001
 0.555556  0.000000  0.444444  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.111111  0.111111  0.000000  0.777778
 0.000000  0.000000  0.000000  1.000000
 0.222222  0.000000  0.333333  0.444444
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.444444  0.333333  0.222222
 0.222222  0.000000  0.666667  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AG]AA[GT]CTT[TGA]TT[CGT][GA]
--------------------------------------------------------------------------------

Time  1.62 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   8  llr = 121  E-value = 3.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3:3:15:6435:9943aa5a
pos.-specific     C  638:93:41:3:1:36::3:
probability       G  1::::38:1638:141::3:
matrix            T  :8:a::3:41:3::::::::

         bits    2.3
                 2.1
                 1.8    *            ** *
                 1.6    **           ** *
Relative         1.4    ** *    ***  ** *
Entropy          1.1  **** *    ***  ** *
(21.8 bits)      0.9  **** ** * ***  ** *
                 0.7 ***** ** * *** *** *
                 0.5 ******** ***********
                 0.2 ******** ***********
                 0.0 --------------------

Multilevel           CTCTCAGAAGAGAAACAAAA
consensus            ACA  CTCTACT  GA  C
sequence                  G    G   C   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
44262                        46  5.26e-10 CATTAGCGAA CTCTCAGATGAGAACAAAAA ATAAATCAAA
26405                       206  4.55e-09 TAATCTTAGT CTCTCATACGAGAAGCAACA CGAGCTTGCC
40550                       263  2.28e-08 ATAGAGAAGC CTATCAGAATCGAAACAAAA CAAATAAAAT
34724                       277  2.28e-08 CGCTCACCCC CTCTACGCAGGGAAACAAGA GAAAGAGCTC
43416                        44  9.48e-08 TTGAAGCACA GCCTCGGCTGGTAAGCAAGA GAGGAAATGT
35662                       401  1.11e-07 CGCAACCAGA ATCTCAGATACGAACGAACA CAGCTGACCA
35771                       380  3.20e-07 ATTTTTGTAG CCCTCGTAAAAGAGGAAAAA TTTGACCTCA
45112                       187  5.46e-07 ATGAATCACA ATATCCGCGGATCAACAAAA ACGTACTTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44262                             5.3e-10  45_[+2]_435
26405                             4.5e-09  205_[+2]_275
40550                             2.3e-08  262_[+2]_218
34724                             2.3e-08  276_[+2]_204
43416                             9.5e-08  43_[+2]_437
35662                             1.1e-07  400_[+2]_80
35771                             3.2e-07  379_[+2]_101
45112                             5.5e-07  186_[+2]_294
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=8
44262                    (   46) CTCTCAGATGAGAACAAAAA  1
26405                    (  206) CTCTCATACGAGAAGCAACA  1
40550                    (  263) CTATCAGAATCGAAACAAAA  1
34724                    (  277) CTCTACGCAGGGAAACAAGA  1
43416                    (   44) GCCTCGGCTGGTAAGCAAGA  1
35662                    (  401) ATCTCAGATACGAACGAACA  1
35771                    (  380) CCCTCGTAAAAGAGGAAAAA  1
45112                    (  187) ATATCCGCGGATCAACAAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6253 bayes= 9.60849 E= 3.9e+001
   -22   143   -71  -965
  -965    11  -965   146
   -22   169  -965  -965
  -965  -965  -965   187
  -121   192  -965  -965
    78    11    29  -965
  -965  -965   187   -13
   110    69  -965  -965
    37   -89   -71    46
   -22  -965   161  -113
    78    11    29  -965
  -965  -965   187   -13
   159   -89  -965  -965
   159  -965   -71  -965
    37    11    87  -965
   -22   143   -71  -965
   178  -965  -965  -965
   178  -965  -965  -965
    78    11    29  -965
   178  -965  -965  -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 3.9e+001
 0.250000  0.625000  0.125000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.875000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.625000  0.375000  0.000000  0.000000
 0.375000  0.125000  0.125000  0.375000
 0.250000  0.000000  0.625000  0.125000
 0.500000  0.250000  0.250000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.875000  0.125000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.375000  0.250000  0.375000  0.000000
 0.250000  0.625000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CA][TC][CA]TC[ACG][GT][AC][AT][GA][ACG][GT]AA[AGC][CA]AA[ACG]A
--------------------------------------------------------------------------------

Time  3.03 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   6  llr = 104  E-value = 9.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :277:7:7:23:::3::::::
pos.-specific     C  8::3a2825:52:53:52:::
probability       G  :8:::2::28::7327::::a
matrix            T  2:3:::223:28322358aa:

         bits    2.3                     *
                 2.1     *               *
                 1.8     *             ***
                 1.6  *  *    *        ***
Relative         1.4 **  * *  * *     ****
Entropy          1.1 **  * *  * **  * ****
(25.1 bits)      0.9 ***** *  * **  ******
                 0.7 ********** *** ******
                 0.5 ************** ******
                 0.2 ************** ******
                 0.0 ---------------------

Multilevel           CGAACACACGCTGCAGCTTTG
consensus              TC    T A TGCTT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
43416                       456  5.74e-11 ACGAAGCAGA CGTCCACACGCTGCTGCTTTG CTAAAGCGAA
45112                       244  4.30e-09 TTCATGGTAC TGAACACACGCTGGATTCTTG GTACAAATTC
34724                       173  6.60e-09 ACATGCAAAG CGAACCCTCGTTTGCGTTTTG TTCTCAAACA
26405                       462  7.15e-09 TGCTTCGCAC CGTACGTAGGCTTCCGCTTTG GGGCTGGCGG
47347                       241  1.75e-08 TAAATCAAGC CGACCACCTGACGTGGTTTTG TTCAACTTGA
44262                       238  1.87e-08 GTAGAACAGG CAAACACATAATGCATCTTTG AATGACTATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43416                             5.7e-11  455_[+3]_24
45112                             4.3e-09  243_[+3]_236
34724                             6.6e-09  172_[+3]_307
26405                             7.2e-09  461_[+3]_18
47347                             1.7e-08  240_[+3]_239
44262                             1.9e-08  237_[+3]_242
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
43416                    (  456) CGTCCACACGCTGCTGCTTTG  1
45112                    (  244) TGAACACACGCTGGATTCTTG  1
34724                    (  173) CGAACCCTCGTTTGCGTTTTG  1
26405                    (  462) CGTACGTAGGCTTCCGCTTTG  1
47347                    (  241) CGACCACCTGACGTGGTTTTG  1
44262                    (  238) CAAACACATAATGCATCTTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6240 bayes= 9.67957 E= 9.1e+001
  -923   184  -923   -71
   -80  -923   202  -923
   120  -923  -923    29
   120    52  -923  -923
  -923   211  -923  -923
   120   -47   -30  -923
  -923   184  -923   -71
   120   -47  -923   -71
  -923   111   -30    29
   -80  -923   202  -923
    20   111  -923   -71
  -923   -47  -923   161
  -923  -923   170    29
  -923   111    70   -71
    20    52   -30   -71
  -923  -923   170    29
  -923   111  -923    87
  -923   -47  -923   161
  -923  -923  -923   187
  -923  -923  -923   187
  -923  -923   229  -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 9.1e+001
 0.000000  0.833333  0.000000  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.666667  0.333333  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.166667  0.166667  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.666667  0.166667  0.000000  0.166667
 0.000000  0.500000  0.166667  0.333333
 0.166667  0.000000  0.833333  0.000000
 0.333333  0.500000  0.000000  0.166667
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.500000  0.333333  0.166667
 0.333333  0.333333  0.166667  0.166667
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CG[AT][AC]CACA[CT]G[CA]T[GT][CG][AC][GT][CT]TTTG
--------------------------------------------------------------------------------

Time  4.29 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49150                            3.36e-02  318_[+1(1.12e-05)]_170
49170                            8.78e-01  500
8543                             4.33e-02  16_[+1(1.48e-05)]_472
26405                            3.79e-12  205_[+2(4.55e-09)]_64_\
    [+1(1.82e-06)]_160_[+3(7.15e-09)]_18
45112                            1.74e-10  51_[+1(1.51e-06)]_123_\
    [+2(5.46e-07)]_37_[+3(4.30e-09)]_236
34724                            1.08e-12  54_[+1(1.04e-07)]_106_\
    [+3(6.60e-09)]_83_[+2(2.28e-08)]_204
12097                            1.22e-02  61_[+1(8.00e-07)]_427
43416                            1.68e-12  43_[+2(9.48e-08)]_204_\
    [+1(4.60e-06)]_176_[+3(5.74e-11)]_24
35662                            1.40e-04  148_[+2(5.35e-05)]_232_\
    [+2(1.11e-07)]_33_[+3(5.73e-05)]_26
47347                            2.20e-05  240_[+3(1.75e-08)]_137_\
    [+2(4.36e-05)]_82
40550                            1.72e-06  262_[+2(2.28e-08)]_113_\
    [+1(2.42e-06)]_93
35771                            2.70e-03  379_[+2(3.20e-07)]_101
44262                            1.49e-12  45_[+2(5.26e-10)]_172_\
    [+3(1.87e-08)]_217_[+1(2.24e-06)]_13
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************