********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/408/408.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1986                     1.0000    500  20654                    1.0000    500
21238                    1.0000    500  22134                    1.0000    500
22168                    1.0000    500  23944                    1.0000    500
261548                   1.0000    500  261550                   1.0000    500
264441                   1.0000    500  268040                   1.0000    500
269097                   1.0000    500  27813                    1.0000    500
3653                     1.0000    500  38046                    1.0000    500
6511                     1.0000    500  7997                     1.0000    500
8136                     1.0000    500  8740                     1.0000    500
9221                     1.0000    500  9364                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/408/408.seqs.fa -oc motifs/408 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10000    N=              20
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.225 G 0.242 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.225 G 0.243 T 0.268
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 9 llr = 145 E-value = 3.0e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::8:47::2:2:11:13:127
pos.-specific     C  77:833a48a1678377:87:
probability       G  :2221::4::44::6::a1::
matrix            T  31::1::1::2:2112:::13

         bits    2.2       *  *
                 1.9       *  *       *
                 1.7       *  *       *
                 1.5       *  *       *
Relative         1.3    *  * **       *
Entropy          1.1 * ** ** ** * *  *** *
(23.2 bits)      0.9 **** ** ** *** ******
                 0.6 **** ***** **********
                 0.4 **** ***** **********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCACAACCCCGCCCGCCGCCA
consensus            TGGGCC GA AGT CTA  AT
sequence                       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
22168                       127  2.15e-11 TTCCCGAAAA CCACAACGACGGCCGCCGCCA GGCTGCCGAG
22134                       324  3.67e-09 ACTCGTTTGT TCGCCACGCCTGCCCTCGCCA ACAACCTATC
21238                       186  5.01e-09 CTGATGAGAA TCAGCACCCCGCCCGCAGACA CCACCTCGCC
20654                       398  2.08e-08 CATGCATTAT CTACACCCCCTCTCGTCGCCT GTCCTCTCTC
264441                      130  2.27e-08 TATGTTGTGG CGACAACTACAGCCGCCGCAA CGAGAGGAGG
261550                      209  2.27e-08 CAGTTTTCAT CCACCACCCCGCTCCAAGCTA TCCATCGTTC
7997                        399  3.73e-08 TTCTGTGACA CGGCGCCGCCGCCTGCCGCCT GCTCACAATT
8136                        479  1.55e-07 ACATCTGGTC TCACAACCCCCCCACCAGGAA C
27813                       251  1.65e-07 TGCACCTTCT CCAGTCCGCCAGACTCCGCCT GAGCAGTGGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22168                             2.2e-11  126_[+1]_353
22134                             3.7e-09  323_[+1]_156
21238                               5e-09  185_[+1]_294
20654                             2.1e-08  397_[+1]_82
264441                            2.3e-08  129_[+1]_350
261550                            2.3e-08  208_[+1]_271
7997                              3.7e-08  398_[+1]_81
8136                              1.6e-07  478_[+1]_1
27813                             1.7e-07  250_[+1]_229
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
22168                    (  127) CCACAACGACGGCCGCCGCCA  1
22134                    (  324) TCGCCACGCCTGCCCTCGCCA  1
21238                    (  186) TCAGCACCCCGCCCGCAGACA  1
20654                    (  398) CTACACCCCCTCTCGTCGCCT  1
264441                   (  130) CGACAACTACAGCCGCCGCAA  1
261550                   (  209) CCACCACCCCGCTCCAAGCTA  1
7997                     (  399) CGGCGCCGCCGCCTGCCGCCT  1
8136                     (  479) TCACAACCCCCCCACCAGGAA  1
27813                    (  251) CCAGTCCGCCAGACTCCGCCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9600 bayes= 11.4628 E= 3.0e-001
  -982    157   -982     31
  -982    157    -13   -127
   155   -982    -13   -982
  -982    179    -13   -982
    75     57   -112   -127
   133     57   -982   -982
  -982    215   -982   -982
  -982     98     87   -127
   -25    179   -982   -982
  -982    215   -982   -982
   -25   -101     87    -27
  -982    130     87   -982
  -125    157   -982    -27
  -125    179   -982   -127
  -982     57    120   -127
  -125    157   -982    -27
    33    157   -982   -982
  -982   -982    204   -982
  -125    179   -112   -982
   -25    157   -982   -127
   133   -982   -982     31
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 3.0e-001
  0.000000  0.666667  0.000000  0.333333
  0.000000  0.666667  0.222222  0.111111
  0.777778  0.000000  0.222222  0.000000
  0.000000  0.777778  0.222222  0.000000
  0.444444  0.333333  0.111111  0.111111
  0.666667  0.333333  0.000000  0.000000
  0.000000  1.000000  0.000000  0.000000
  0.000000  0.444444  0.444444  0.111111
  0.222222  0.777778  0.000000  0.000000
  0.000000  1.000000  0.000000  0.000000
  0.222222  0.111111  0.444444  0.222222
  0.000000  0.555556  0.444444  0.000000
  0.111111  0.666667  0.000000  0.222222
  0.111111  0.777778  0.000000  0.111111
  0.000000  0.333333  0.555556  0.111111
  0.111111  0.666667  0.000000  0.222222
  0.333333  0.666667  0.000000  0.000000
  0.000000  0.000000  1.000000  0.000000
  0.111111  0.777778  0.111111  0.000000
  0.222222  0.666667  0.000000  0.111111
  0.666667  0.000000  0.000000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CT][CG][AG][CG][AC][AC]C[CG][CA]C[GAT][CG][CT]C[GC][CT][CA]GC[CA][AT]
--------------------------------------------------------------------------------

Time 3.64 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 20 llr = 176 E-value = 4.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :247176:a226
pos.-specific     C  :152::1:1::1
probability       G  a621a34a:681
matrix            T  11:1:1:::2:3

         bits    2.2
                 1.9        *
                 1.7 *   *  **
                 1.5 *   *  **
Relative         1.3 *   *  ** *
Entropy          1.1 *   *  ** *
(12.7 bits)      0.9 *   * *** *
                 0.6 * *********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GGCAGAAGAGGA
consensus             AAC GG  AAT
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
22168                        30  1.39e-06 ACGGGTTGGG GACAGAGGAGGA CCTCCGGGCG
7997                          4  1.74e-06        GAG GCCAGAAGAGGA GCTGCCTCCT
261548                      346  7.37e-06 CAGACGAAGT GGACGAAGAAGA GCCTCTCTCC
264441                      196  8.32e-06 GGAGGAGGTC GGAGGAGGAGGA GGGACATTGA
261550                      345  9.44e-06 GGGGGCGAGC GGCAGGGGAGGC AATATTGTCC
269097                      398  1.19e-05 ATGGATGGTT GGATGAAGAGGA GAGTGGCAAA
22134                        52  1.33e-05 ACCACCTTGA GACAGAAGATGT GCATCGTTCC
6511                         18  1.65e-05 GTGTGGGTTA GGAAGGAGAAGT AGTATTGCCA
9221                        286  1.81e-05 ATGTTGAGAG GGCAGACGATGA TGGAAGTCTG
1986                         28  2.25e-05 ATATCAAAGC GGGAGAAGAGAT CCTCGTGAAC
38046                        57  2.99e-05 ACGAGGTTGA GGACGGAGAGAA TGTCGGTAAA
8740                        210  3.30e-05 TTGTCAAATC GACAGGGGAGAA ATATCTTGTC
8136                        289  3.30e-05 TTGTTAAACA GGCAGAGGAAGG TTATGTGAAG
9364                        154  4.77e-05 GGCGACTGCT GAGAGTAGAGGA GGAACACTGC
23944                       388  5.58e-05 TGTGTGGGGA GGACAAAGAGGA CAAAATAAGT
3653                        342  6.03e-05 GTACGAGTTC GCGAGAAGAGGC GTGGGATTGC
21238                       307  9.75e-05 AGATTCTTCG GGCAGAGGCTGT TGAGGGTGCT
268040                      145  1.21e-04 ATTCTATCTT GTAGGAAGAAGA CTTTTGGATA
27813                       134  1.47e-04 AGGGGCGATA GTCCGGGGAGAA TCATAGTCGT
20654                       207  3.27e-04 AGAAATTAAG TGCAGTAGATGT TGCTTTTGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22168                             1.4e-06  29_[+2]_459
7997                              1.7e-06  3_[+2]_485
261548                            7.4e-06  345_[+2]_143
264441                            8.3e-06  195_[+2]_293
261550                            9.4e-06  344_[+2]_144
269097                            1.2e-05  397_[+2]_91
22134                             1.3e-05  51_[+2]_437
6511                              1.6e-05  17_[+2]_471
9221                              1.8e-05  285_[+2]_203
1986                              2.2e-05  27_[+2]_461
38046                               3e-05  56_[+2]_432
8740                              3.3e-05  209_[+2]_279
8136                              3.3e-05  288_[+2]_200
9364                              4.8e-05  153_[+2]_335
23944                             5.6e-05  387_[+2]_101
3653                                6e-05  341_[+2]_147
21238                             9.7e-05  306_[+2]_182
268040                            0.00012  144_[+2]_344
27813                             0.00015  133_[+2]_355
20654                             0.00033  206_[+2]_282
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=20
22168                    (   30) GACAGAGGAGGA  1
7997                     (    4) GCCAGAAGAGGA  1
261548                   (  346) GGACGAAGAAGA  1
264441                   (  196) GGAGGAGGAGGA  1
261550                   (  345) GGCAGGGGAGGC  1
269097                   (  398) GGATGAAGAGGA  1
22134                    (   52) GACAGAAGATGT  1
6511                     (   18) GGAAGGAGAAGT  1
9221                     (  286) GGCAGACGATGA  1
1986                     (   28) GGGAGAAGAGAT  1
38046                    (   57) GGACGGAGAGAA  1
8740                     (  210) GACAGGGGAGAA  1
8136                     (  289) GGCAGAGGAAGG  1
9364                     (  154) GAGAGTAGAGGA  1
23944                    (  388) GGACAAAGAGGA  1
3653                     (  342) GCGAGAAGAGGC  1
21238                    (  307) GGCAGAGGCTGT  1
268040                   (  145) GTAGGAAGAAGA  1
27813                    (  134) GTCCGGGGAGAA  1
20654                    (  207) TGCAGTAGATGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9780 bayes= 8.93074 E= 4.9e+000
 -1097  -1097    197   -242
   -40   -117    131   -142
    40    115    -69  -1097
   130    -17   -128   -242
  -240  -1097    197  -1097
   130  -1097      4   -142
   118   -217     53  -1097
 -1097  -1097    204  -1097
   184   -217  -1097  -1097
   -40  -1097    131    -42
   -40  -1097    172  -1097
   118   -117   -228    -10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 20 E= 4.9e+000
  0.000000  0.000000  0.950000  0.050000
  0.200000  0.100000  0.600000  0.100000
  0.350000  0.500000  0.150000  0.000000
  0.650000  0.200000  0.100000  0.050000
  0.050000  0.000000  0.950000  0.000000
  0.650000  0.000000  0.250000  0.100000
  0.600000  0.050000  0.350000  0.000000
  0.000000  0.000000  1.000000  0.000000
  0.950000  0.050000  0.000000  0.000000
  0.200000  0.000000  0.600000  0.200000
  0.200000  0.000000  0.800000  0.000000
  0.600000  0.100000  0.050000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[GA][CA][AC]G[AG][AG]GA[GAT][GA][AT]
--------------------------------------------------------------------------------

Time 7.19 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 19 sites = 8 llr = 125 E-value = 5.3e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::5344:4835:63:88:9
pos.-specific     C  :a1841a138:a:5a1:8:
probability       G  :::::::1::1::::1::1
matrix            T  a:4:35:4::4:43::33:

         bits    2.2  *    *    *  *
                 1.9 **    *    *  *
                 1.7 **    *    *  *
                 1.5 **    *    *  *
Relative         1.3 ** *  *  * *  *  **
Entropy          1.1 ** *  * ** *  * ***
(22.6 bits)      0.9 ** *  * ** ** *****
                 0.6 ** *  * ** ** *****
                 0.4 ******* ***********
                 0.2 *******************
                 0.0 -------------------

Multilevel           TCACATCAACACACCAACA
consensus              TACA TCAT TA  TT
sequence                 T        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
261550                      482  5.19e-12 GCATAAGCTA TCACCTCTACACACCAACA
3653                        456  8.75e-09 ACTCAACGCC TCCCTACTACACAACAACA CAAAACACAA
6511                        144  1.12e-08 TCGGAACGTT TCTCCTCTACTCTTCATCA AACAGAAAGC
20654                       430  1.72e-08 TCCTCTCTCT TCAATTCAACACACCCACA TACTCGCCAC
9221                        192  6.74e-08 CAGCACCCAC TCACAACAACGCACCGATA AGTAGTGGAT
264441                      289  1.46e-07 GAATTTCGCC TCAAATCACATCTCCATCA GCAGTACCCA
21238                       154  1.85e-07 GCTGTTTCAC TCTCCCCCACTCTTCAACG GCTCTGATGA
22168                       474  2.19e-07 CACACAGCAA TCTCAACGCAACAACAATA CATACAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
261550                            5.2e-12  481_[+3]
3653                              8.8e-09  455_[+3]_26
6511                              1.1e-08  143_[+3]_338
20654                             1.7e-08  429_[+3]_52
9221                              6.7e-08  191_[+3]_290
264441                            1.5e-07  288_[+3]_193
21238                             1.9e-07  153_[+3]_328
22168                             2.2e-07  473_[+3]_8
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=8
261550                   (  482) TCACCTCTACACACCAACA  1
3653                     (  456) TCCCTACTACACAACAACA  1
6511                     (  144) TCTCCTCTACTCTTCATCA  1
20654                    (  430) TCAATTCAACACACCCACA  1
9221                     (  192) TCACAACAACGCACCGATA  1
264441                   (  289) TCAAATCACATCTCCATCA  1
21238                    (  154) TCTCCCCCACTCTTCAACG  1
22168                    (  474) TCTCAACGCAACAACAATA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 9640 bayes= 10.9711 E= 5.3e+000
  -965   -965   -965    190
  -965    215   -965   -965
    92    -84   -965     48
    -8    174   -965   -965
    50     74   -965    -10
    50    -84   -965     90
  -965    215   -965   -965
    50    -84    -95     48
   150     15   -965   -965
    -8    174   -965   -965
    92   -965    -95     48
  -965    215   -965   -965
   124   -965   -965     48
    -8    115   -965    -10
  -965    215   -965   -965
   150    -84    -95   -965
   150   -965   -965    -10
  -965    174   -965    -10
   172   -965    -95   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 8 E= 5.3e+000
  0.000000  0.000000  0.000000  1.000000
  0.000000  1.000000  0.000000  0.000000
  0.500000  0.125000  0.000000  0.375000
  0.250000  0.750000  0.000000  0.000000
  0.375000  0.375000  0.000000  0.250000
  0.375000  0.125000  0.000000  0.500000
  0.000000  1.000000  0.000000  0.000000
  0.375000  0.125000  0.125000  0.375000
  0.750000  0.250000  0.000000  0.000000
  0.250000  0.750000  0.000000  0.000000
  0.500000  0.000000  0.125000  0.375000
  0.000000  1.000000  0.000000  0.000000
  0.625000  0.000000  0.000000  0.375000
  0.250000  0.500000  0.000000  0.250000
  0.000000  1.000000  0.000000  0.000000
  0.750000  0.125000  0.125000  0.000000
  0.750000  0.000000  0.000000  0.250000
  0.000000  0.750000  0.000000  0.250000
  0.875000  0.000000  0.125000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TC[AT][CA][ACT][TA]C[AT][AC][CA][AT]C[AT][CAT]CA[AT][CT]A
--------------------------------------------------------------------------------

Time 11.18 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1986                             1.36e-01  27_[+2(2.25e-05)]_461
20654                            4.17e-09  397_[+1(2.08e-08)]_11_\
    [+3(1.72e-08)]_52
21238                            3.47e-09  153_[+3(1.85e-07)]_13_\
    [+1(5.01e-09)]_[+3(8.75e-05)]_81_[+2(9.75e-05)]_182
22134                            1.01e-06  51_[+2(1.33e-05)]_260_\
    [+1(3.67e-09)]_156
22168                            4.77e-13  29_[+2(1.39e-06)]_85_[+1(2.15e-11)]_\
    326_[+3(2.19e-07)]_8
23944                            1.34e-01  387_[+2(5.58e-05)]_101
261548                           5.08e-04  164_[+1(1.71e-05)]_160_\
    [+2(7.37e-06)]_143
261550                           8.88e-14  8_[+2(7.70e-05)]_188_[+1(2.27e-08)]_\
    115_[+2(9.44e-06)]_125_[+3(5.19e-12)]
264441                           1.18e-09  129_[+1(2.27e-08)]_45_\
    [+2(8.32e-06)]_81_[+3(1.46e-07)]_193
268040                           6.80e-02  500
269097                           4.79e-02  397_[+2(1.19e-05)]_91
27813                            9.26e-05  250_[+1(1.65e-07)]_229
3653                             5.94e-06  341_[+2(6.03e-05)]_102_\
    [+3(8.75e-09)]_26
38046                            1.45e-01  56_[+2(2.99e-05)]_432
6511                             1.01e-06  17_[+2(1.65e-05)]_114_\
    [+3(1.12e-08)]_306_[+3(9.76e-05)]_13
7997                             8.46e-07  3_[+2(1.74e-06)]_383_[+1(3.73e-08)]_\
    81
8136                             4.66e-05  288_[+2(3.30e-05)]_178_\
    [+1(1.55e-07)]_1
8740                             4.84e-03  209_[+2(3.30e-05)]_207_\
    [+3(8.12e-05)]_53
9221                             3.43e-05  191_[+3(6.74e-08)]_75_\
    [+2(1.81e-05)]_203
9364                             5.69e-02  153_[+2(4.77e-05)]_335
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************