********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/230/230.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8772                     1.0000    500  8670                     1.0000    500
9409                     1.0000    500  15777                    1.0000    500
49969                    1.0000    500  44147                    1.0000    500
44172                    1.0000    500  19030                    1.0000    500
54420                    1.0000    500  35813                    1.0000    500
46291                    1.0000    500  40097                    1.0000    500
33842                    1.0000    500  39656                    1.0000    500
45016                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/230/230.seqs.fa -oc motifs/230 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.253 C 0.268 G 0.235 T 0.244
Background letter frequencies (from dataset with add-one prior applied):
A 0.253 C 0.268 G 0.235 T 0.244
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 10 llr = 110 E-value = 1.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1::1::22:8:
pos.-specific     C  7::a3a1:2:::
probability       G  391:::3:2a21
matrix            T  ::9:6:684::9

         bits    2.1          *
                 1.9    * *   *
                 1.7  *** *   * *
                 1.5  *** *   * *
Relative         1.3  *** * * ***
Entropy          1.0 **** * * ***
(15.8 bits)      0.8 **** *** ***
                 0.6 ******** ***
                 0.4 ******** ***
                 0.2 ******** ***
                 0.0 ------------

Multilevel           CGTCTCTTTGAT
consensus            G   C GAA G
sequence                     C
                             G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
15777                       228  5.68e-08 CGGTGATTGT CGTCTCTTTGAT CCAGCCGTCC
40097                        72  3.99e-07 TCGTGTTCCG CGTCTCTTCGAT TGTGGGGCCG
46291                       283  5.53e-07 CACTCACAAA CGTCTCTTTGGT GTGTGTTGCG
33842                       421  3.15e-06 AGTTTCCCAG CGTCTCCTGGAT TCACCGAACC
9409                        208  3.98e-06 TAGGCGTGGA GGTCCCTTTGGT AAGGTTGTTT
44172                       361  5.99e-06 GTCGTAGGTG CGTCCCTACGAT ACAAACAGCT
45016                       116  6.23e-06 AAAACGGTCG CGTCACGTAGAT CTTTTGGGCG
35813                       465  8.21e-06 ATCTCTGAAC GGTCTCGTTGAG AATCGGACTA
49969                       450  1.72e-05 GTATCGACAA GATCCCTTGGAT TCCAACACAC
8772                        446  2.48e-05 ATATTCTACA CGGCTCGAAGAT TCGTGGGTAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
15777                             5.7e-08  227_[+1]_261
40097                               4e-07  71_[+1]_417
46291                             5.5e-07  282_[+1]_206
33842                             3.1e-06  420_[+1]_68
9409                                4e-06  207_[+1]_281
44172                               6e-06  360_[+1]_128
45016                             6.2e-06  115_[+1]_373
35813                             8.2e-06  464_[+1]_24
49969                             1.7e-05  449_[+1]_39
8772                              2.5e-05  445_[+1]_43
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=10
15777                    (  228) CGTCTCTTTGAT  1
40097                    (   72) CGTCTCTTCGAT  1
46291                    (  283) CGTCTCTTTGGT  1
33842                    (  421) CGTCTCCTGGAT  1
9409                     (  208) GGTCCCTTTGGT  1
44172                    (  361) CGTCCCTACGAT  1
45016                    (  116) CGTCACGTAGAT  1
35813                    (  465) GGTCTCGTTGAG  1
49969                    (  450) GATCCCTTGGAT  1
8772                     (  446) CGGCTCGAAGAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 10.461 E= 1.1e+001
  -997    139     35   -997
  -134   -997    193   -997
  -997   -997   -123    188
  -997    190   -997   -997
  -134     16   -997    130
  -997    190   -997   -997
  -997   -142     35    130
   -34   -997   -997    171
   -34    -42    -23     71
  -997   -997    209   -997
   166   -997    -23   -997
  -997   -997   -123    188
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 1.1e+001
 0.000000  0.700000  0.300000  0.000000
 0.100000  0.000000  0.900000  0.000000
 0.000000  0.000000  0.100000  0.900000
 0.000000  1.000000  0.000000  0.000000
 0.100000  0.300000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.100000  0.300000  0.600000
 0.200000  0.000000  0.000000  0.800000
 0.200000  0.200000  0.200000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  0.100000  0.900000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG]GTC[TC]C[TG][TA][TACG]G[AG]T
--------------------------------------------------------------------------------


Time 2.14 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 9 llr = 127 E-value = 1.3e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::::1223:::111:2
pos.-specific     C  ::87:346:26231:1437::
probability       G  1223::141:11281321:a:
matrix            T  98::a74:97141196242:8

         bits    2.1     *              *
                 1.9     *              *
                 1.7     *              *
                 1.5 *   *   *     *    *
Relative         1.3 *** *   *     *    **
Entropy          1.0 ****** **    **    **
(20.3 bits)      0.8 ****** ***   **    **
                 0.6 **********   ***  ***
                 0.4 **********   ***  ***
                 0.2 ************ ********
                 0.0 ---------------------

Multilevel           TTCCTTCCTTCTAGTTCTCGT
consensus             GGG CTG CAAC  GGCT A
sequence                        CG   T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
39656                       283  2.01e-11 TCGTCTTTGC TTCCTTCCTTCTAGTGCCCGT GCAGACCCGG
46291                       120  2.45e-08 GCCGCCTTTG TTGGTCTCTTCAAGTTCACGT ACTGGTAGTC
54420                       399  3.05e-08 AAACGCAAAA TTCGTTGGTTACCGTTGCCGT TTCCTCTGTG
49969                       385  4.62e-08 ACCGTCACTG TTCCTTTGTAGTAGTCCTCGT TTGCTTGCTA
15777                       207  1.84e-07 CGGGGTTGCC TTGCTTCCTCCCGGTGATTGT CGTCTCTTTG
8670                        452  3.22e-07 CTCGTGTTGC TGCCTCCCGTAGCGTTCCCGT TGCTGCCCGC
19030                       359  5.00e-07 TCGCCAACGT TGCGTTTGTTCTCCTTGTAGA AGTATCGACG
33842                       382  1.31e-06 CAACAGTTCA TTCCTCTCTCCATTGTTTCGT CGCCAACAAG
40097                       134  1.46e-06 GGCCGGACGT GTCCTTCGTTTTGGTGTGTGA ATCGTGCGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39656                               2e-11  282_[+2]_197
46291                             2.5e-08  119_[+2]_360
54420                               3e-08  398_[+2]_81
49969                             4.6e-08  384_[+2]_95
15777                             1.8e-07  206_[+2]_273
8670                              3.2e-07  451_[+2]_28
19030                               5e-07  358_[+2]_121
33842                             1.3e-06  381_[+2]_98
40097                             1.5e-06  133_[+2]_346
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=9
39656                    (  283) TTCCTTCCTTCTAGTGCCCGT  1
46291                    (  120) TTGGTCTCTTCAAGTTCACGT  1
54420                    (  399) TTCGTTGGTTACCGTTGCCGT  1
49969                    (  385) TTCCTTTGTAGTAGTCCTCGT  1
15777                    (  207) TTGCTTCCTCCCGGTGATTGT  1
8670                     (  452) TGCCTCCCGTAGCGTTCCCGT  1
19030                    (  359) TGCGTTTGTTCTCCTTGTAGA  1
33842                    (  382) TTCCTCTCTCCATTGTTTCGT  1
40097                    (  134) GTCCTTCGTTTTGGTGTGTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 9.77651 E= 1.3e+003
  -982   -982   -108    186
  -982   -982     -8    167
  -982    154     -8   -982
  -982    131     50   -982
  -982   -982   -982    203
  -982     32   -982    145
  -982     73   -108     86
  -982    105     92   -982
  -982   -982   -108    186
  -118    -27   -982    145
   -19    105   -108   -113
   -19    -27   -108     86
    40     32     -8   -113
  -982   -127    172   -113
  -982   -982   -108    186
  -982   -127     50    119
  -118     73     -8    -13
  -118     32   -108     86
  -118    131   -982    -13
  -982   -982    209   -982
   -19   -982   -982    167
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 1.3e+003
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.777778  0.222222  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.444444  0.111111  0.444444
 0.000000  0.555556  0.444444  0.000000
 0.000000  0.000000  0.111111  0.888889
 0.111111  0.222222  0.000000  0.666667
 0.222222  0.555556  0.111111  0.111111
 0.222222  0.222222  0.111111  0.444444
 0.333333  0.333333  0.222222  0.111111
 0.000000  0.111111  0.777778  0.111111
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.111111  0.333333  0.555556
 0.111111  0.444444  0.222222  0.222222
 0.111111  0.333333  0.111111  0.444444
 0.111111  0.666667  0.000000  0.222222
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.000000  0.000000  0.777778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[TG][CG][CG]T[TC][CT][CG]T[TC][CA][TAC][ACG]GT[TG][CGT][TC][CT]G[TA]
--------------------------------------------------------------------------------


Time 4.46 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 8 llr = 124 E-value = 6.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a1464a:::::55:3a851:a
pos.-specific     C  :8:34:6:1:4:31::3143:
probability       G  :1411:4:3953198::356:
matrix            T  ::3:1::a61131::::1:1:

         bits    2.1 *    * *       *    *
                 1.9 *    * *       *    *
                 1.7 *    * *       *    *
                 1.5 *    * * *   * *    *
Relative         1.3 *    * * *   ****   *
Entropy          1.0 *    *** *   ****   *
(22.3 bits)      0.8 **   *****   ****  **
                 0.6 ** * ******* **** ***
                 0.4 **** ******* **** ***
                 0.2 *********************
                 0.0 ---------------------

Multilevel           ACAAAACTTGGAAGGAAAGGA
consensus              GCC G G CGC A CGCC
sequence               T        T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
19030                       202  6.67e-10 CAAAACCCTA ACAACACTTGGTCGAAAAGGA CACACGCCTG
44147                       451  1.31e-08 CACTCGTCCT ACAATACTTTCAAGGAAGCGA TTCGTGTCCG
35813                        70  1.76e-08 CTGTGAGTGA ACTCAACTTGGATGGAAACTA TGGGGATTCT
15777                       139  4.36e-08 ACGTTCGCCG ACGACAGTCGCGAGGAACGCA TTGCCCGGGC
40097                       197  4.73e-08 CGGGAAAACC ACAAAACTTGGTGGAACTGGA ACCCGAAATA
8772                         67  6.05e-08 ACCTCCTCCG ACGACAGTGGTGAGGACAAGA CAGCATCCAG
9409                         13  6.53e-08 TCGAAGCGGA AAGCGACTGGCACGGAAACGA TTTAAAACGC
54420                       146  2.67e-07 CTCGCCTGAC AGTGAAGTTGGAACGAAGGCA CACAATAGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
19030                             6.7e-10  201_[+3]_278
44147                             1.3e-08  450_[+3]_29
35813                             1.8e-08  69_[+3]_410
15777                             4.4e-08  138_[+3]_341
40097                             4.7e-08  196_[+3]_283
8772                                6e-08  66_[+3]_413
9409                              6.5e-08  12_[+3]_467
54420                             2.7e-07  145_[+3]_334
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=8
19030                    (  202) ACAACACTTGGTCGAAAAGGA  1
44147                    (  451) ACAATACTTTCAAGGAAGCGA  1
35813                    (   70) ACTCAACTTGGATGGAAACTA  1
15777                    (  139) ACGACAGTCGCGAGGAACGCA  1
40097                    (  197) ACAAAACTTGGTGGAACTGGA  1
8772                     (   67) ACGACAGTGGTGAGGACAAGA  1
9409                     (   13) AAGCGACTGGCACGGAAACGA  1
54420                    (  146) AGTGAAGTTGGAACGAAGGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 9.81218 E= 6.9e+001
   198   -965   -965   -965
  -101    148    -91   -965
    57   -965     67      3
   130    -10    -91   -965
    57     48    -91    -96
   198   -965   -965   -965
  -965    122     67   -965
  -965   -965   -965    203
  -965   -110      9    136
  -965   -965    189    -96
  -965     48    109    -96
    98   -965      9      3
    98    -10    -91    -96
  -965   -110    189   -965
    -2   -965    167   -965
   198   -965   -965   -965
   157    -10   -965   -965
    98   -110      9    -96
  -101     48    109   -965
  -965    -10    141    -96
   198   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 6.9e+001
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.750000  0.125000  0.000000
 0.375000  0.000000  0.375000  0.250000
 0.625000  0.250000  0.125000  0.000000
 0.375000  0.375000  0.125000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.625000  0.375000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.125000  0.250000  0.625000
 0.000000  0.000000  0.875000  0.125000
 0.000000  0.375000  0.500000  0.125000
 0.500000  0.000000  0.250000  0.250000
 0.500000  0.250000  0.125000  0.125000
 0.000000  0.125000  0.875000  0.000000
 0.250000  0.000000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.500000  0.125000  0.250000  0.125000
 0.125000  0.375000  0.500000  0.000000
 0.000000  0.250000  0.625000  0.125000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AC[AGT][AC][AC]A[CG]T[TG]G[GC][AGT][AC]G[GA]A[AC][AG][GC][GC]A
--------------------------------------------------------------------------------


Time 6.58 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8772                             3.90e-05  66_[+3(6.05e-08)]_358_\
    [+1(2.48e-05)]_43
8670                             2.28e-03  451_[+2(3.22e-07)]_28
9409                             6.32e-06  12_[+3(6.53e-08)]_174_\
    [+1(3.98e-06)]_281
15777                            2.56e-11  138_[+3(4.36e-08)]_47_\
    [+2(1.84e-07)]_[+1(5.68e-08)]_261
49969                            1.09e-05  384_[+2(4.62e-08)]_44_\
    [+1(1.72e-05)]_39
44147                            1.52e-04  450_[+3(1.31e-08)]_29
44172                            1.77e-02  360_[+1(5.99e-06)]_128
19030                            1.25e-08  201_[+3(6.67e-10)]_136_\
    [+2(5.00e-07)]_121
54420                            3.12e-07  145_[+3(2.67e-07)]_232_\
    [+2(3.05e-08)]_81
35813                            4.54e-06  69_[+3(1.76e-08)]_374_\
    [+1(8.21e-06)]_24
46291                            2.19e-07  119_[+2(2.45e-08)]_142_\
    [+1(5.53e-07)]_206
40097                            1.18e-09  71_[+1(3.99e-07)]_50_[+2(1.46e-06)]_\
    42_[+3(4.73e-08)]_283
33842                            7.28e-05  381_[+2(1.31e-06)]_18_\
    [+1(3.15e-06)]_68
39656                            2.96e-07  282_[+2(2.01e-11)]_197
45016                            6.56e-03  115_[+1(6.23e-06)]_373
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************