********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/434/434.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10524                    1.0000    500  11951                    1.0000    500
18099                    1.0000    500  23666                    1.0000    500
24658                    1.0000    500  25331                    1.0000    500
5878                     1.0000    500  6182                     1.0000    500
9529                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/434/434.seqs.fa -oc motifs/434 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.252 C 0.219 G 0.244 T 0.284
Background letter frequencies (from dataset with add-one prior applied):
A 0.252 C 0.219 G 0.244 T 0.284
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  15  sites =   9  llr = 114  E-value = 1.1e-002
********************************************************************************
--------------------------------------------------------------------------------
    Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  8:9394:73691862
pos.-specific     C  2a171:817214248
probability       G  :::::::::2:::::
matrix            T  :::::622:::4:::

         bits    2.2  *
                 2.0  *
                 1.8  *
                 1.5  ** *     *
Relative         1.3 *** * *   * * *
Entropy          1.1 ***** * * * ***
(18.3 bits)      0.9 ******* * * ***
                 0.7 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           ACACATCACAACAAC
consensus            C  A ATTAC TCCA
sequence                      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
23666                       401  2.26e-08 TGGAAAGTGG ACACATCACGATACC CTCAAACCTC
6182                        485  4.61e-08 TTCGCCTATC ACAAATCAAAACAAC A
10524                         5  3.39e-07       GGAG ACACATCACGCTAAC CATAATAGTG
18099                       472  5.68e-07 GACTCAACTC ACAAATCTCAACAAA ATCTGTTCTT
25331                       459  6.31e-07 GATCTGTCTT CCACAACAAAACACA CCGATCCACT
5878                        298  1.04e-06 GTCATCGGTA ACCCATCACCATCAC TGTCGTGGAA
24658                       484  2.06e-06 GCCGCCGCCG ACACAATCACACAAC TC
11951                       446  2.06e-06 AGCTTCGCTT ACAAAACTCAAACCC CATCCATTCG
9529                        471  2.33e-06 GCTTTCGCCA CCACCATACAATACC TTCGTAAACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23666                             2.3e-08  400_[+1]_85
6182                              4.6e-08  484_[+1]_1
10524                             3.4e-07  4_[+1]_481
18099                             5.7e-07  471_[+1]_14
25331                             6.3e-07  458_[+1]_27
5878                                1e-06  297_[+1]_188
24658                             2.1e-06  483_[+1]_2
11951                             2.1e-06  445_[+1]_40
9529                              2.3e-06  470_[+1]_15
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
    Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=9
23666                    (  401) ACACATCACGATACC  1
6182                     (  485) ACAAATCAAAACAAC  1
10524                    (    5) ACACATCACGCTAAC  1
18099                    (  472) ACAAATCTCAACAAA  1
25331                    (  459) CCACAACAAAACACA  1
5878                     (  298) ACCCATCACCATCAC  1
24658                    (  484) ACACAATCACACAAC  1
11951                    (  446) ACAAAACTCAAACCC  1
9529                     (  471) CCACCATACAATACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4374 bayes= 8.92184 E= 1.1e-002
   162      2   -982   -982
  -982    219   -982   -982
   181    -98   -982   -982
    40    160   -982   -982
   181    -98   -982   -982
    82   -982   -982     97
  -982    182   -982    -35
   140    -98   -982    -35
    40    160   -982   -982
   114      2    -14   -982
   181    -98   -982   -982
  -118    102   -982     65
   162      2   -982   -982
   114    102   -982   -982
   -18    182   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 9 E= 1.1e-002
 0.777778  0.222222  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.444444  0.000000  0.000000  0.555556
 0.000000  0.777778  0.000000  0.222222
 0.666667  0.111111  0.000000  0.222222
 0.333333  0.666667  0.000000  0.000000
 0.555556  0.222222  0.222222  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.111111  0.444444  0.000000  0.444444
 0.777778  0.222222  0.000000  0.000000
 0.555556  0.444444  0.000000  0.000000
 0.222222  0.777778  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 regular expression
--------------------------------------------------------------------------------
[AC]CA[CA]A[TA][CT][AT][CA][ACG]A[CT][AC][AC][CA]
--------------------------------------------------------------------------------




Time 0.72 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   8  llr = 88  E-value = 1.7e+001
********************************************************************************
--------------------------------------------------------------------------------
    Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :39119:::4::
pos.-specific     C  :1:::1:193:6
probability       G  :61:9::9:1:1
matrix            T  a::9::a:13a3

         bits    2.2
                 2.0
                 1.8 *     *   *
                 1.5 * * ***** *
Relative         1.3 * ******* *
Entropy          1.1 * ******* *
(15.9 bits)      0.9 * ******* **
                 0.7 ********* **
                 0.4 ********* **
                 0.2 ********* **
                 0.0 ------------

Multilevel           TGATGATGCATC
consensus             A       C T
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
23666                       195  1.36e-07 GTTGACATGA TGATGATGCCTC GATTGATGGT
24658                       305  2.92e-07 TATTCTATCC TAATGATGCATC TCTCAACACT
5878                        167  1.78e-06 CGACGTGCCA TCATGATGCTTC CGTTTAGACA
6182                        166  2.68e-06 TGATATAATG TGAAGATGCTTC GGATAACCTG
9529                         74  3.62e-06 GATCGTCTGC TGATGCTGCGTC ATCGTCAACG
25331                       163  5.82e-06 CGACGAGATT TGGTGATGCCTT CCCCTTGAGC
11951                       398  1.96e-05 ATAGAGCCTC TAATGATGTATG TTTAGATCTA
18099                       231  2.33e-05 ATGATCTTTG TGATAATCCATT GCATCGTTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23666                             1.4e-07  194_[+2]_294
24658                             2.9e-07  304_[+2]_184
5878                              1.8e-06  166_[+2]_322
6182                              2.7e-06  165_[+2]_323
9529                              3.6e-06  73_[+2]_415
25331                             5.8e-06  162_[+2]_326
11951                               2e-05  397_[+2]_91
18099                             2.3e-05  230_[+2]_258
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=8
23666                    (  195) TGATGATGCCTC  1
24658                    (  305) TAATGATGCATC  1
5878                     (  167) TCATGATGCTTC  1
6182                     (  166) TGAAGATGCTTC  1
9529                     (   74) TGATGCTGCGTC  1
25331                    (  163) TGGTGATGCCTT  1
11951                    (  398) TAATGATGTATG  1
18099                    (  231) TGATAATCCATT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 9.10099 E= 1.7e+001
  -965   -965   -965    182
    -1    -81    135   -965
   179   -965    -97   -965
  -101   -965   -965    162
  -101   -965    184   -965
   179    -81   -965   -965
  -965   -965   -965    182
  -965    -81    184   -965
  -965    199   -965   -118
    57     19    -97    -18
  -965   -965   -965    182
  -965    151    -97    -18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 1.7e+001
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.125000  0.625000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.125000  0.000000  0.875000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.125000  0.875000  0.000000
 0.000000  0.875000  0.000000  0.125000
 0.375000  0.250000  0.125000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.625000  0.125000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 regular expression
--------------------------------------------------------------------------------
T[GA]ATGATGC[ACT]T[CT]
--------------------------------------------------------------------------------




Time 1.48 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   6  llr = 75  E-value = 5.7e+001
********************************************************************************
--------------------------------------------------------------------------------
    Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::a:::2::22
pos.-specific     C  :::::::2::2:
probability       G  :3a::aa22a:8
matrix            T  a7::a::58:7:

         bits    2.2
                 2.0   ** **  *
                 1.8 * *****  *
                 1.5 * *****  *
Relative         1.3 * *****  * *
Entropy          1.1 * ***** ** *
(18.1 bits)      0.9 ******* ** *
                 0.7 ******* ****
                 0.4 ******* ****
                 0.2 ************
                 0.0 ------------

Multilevel           TTGATGGTTGTG
consensus             G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
23666                       209  1.14e-07 GATGCCTCGA TTGATGGTTGTG TGATGAGAAG
11951                       297  2.13e-07 CTGCGTCTAA TGGATGGTTGTG GTTGTGGCGT
25331                       275  1.15e-06 CGGATCATTG TTGATGGTTGTA GCCTCCTGTT
18099                       387  1.70e-06 TATGTATTTG TTGATGGGTGCG GTGTGCTTTC
10524                       201  1.98e-06 TCAACGCACC TTGATGGATGAG TAACCTCCTC
9529                        115  3.36e-06 ACACCCACGT TGGATGGCGGTG AGGCGAGGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23666                             1.1e-07  208_[+3]_280
11951                             2.1e-07  296_[+3]_192
25331                             1.2e-06  274_[+3]_214
18099                             1.7e-06  386_[+3]_102
10524                               2e-06  200_[+3]_288
9529                              3.4e-06  114_[+3]_374
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=6
23666                    (  209) TTGATGGTTGTG  1
11951                    (  297) TGGATGGTTGTG  1
25331                    (  275) TTGATGGTTGTA  1
18099                    (  387) TTGATGGGTGCG  1
10524                    (  201) TTGATGGATGAG  1
9529                     (  115) TGGATGGCGGTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 9.17512 E= 5.7e+001
  -923   -923   -923    182
  -923   -923     45    123
  -923   -923    203   -923
   198   -923   -923   -923
  -923   -923   -923    182
  -923   -923    203   -923
  -923   -923    203   -923
   -60    -40    -55     82
  -923   -923    -55    155
  -923   -923    203   -923
   -60    -40   -923    123
   -60   -923    177   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 5.7e+001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.166667  0.166667  0.500000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.166667  0.000000  0.666667
 0.166667  0.000000  0.833333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 regular expression
--------------------------------------------------------------------------------
T[TG]GATGGTTGTG
--------------------------------------------------------------------------------




Time 2.32 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
    Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10524                            1.17e-05  4_[+1(3.39e-07)]_166_[+1(4.35e-05)]_\
    [+3(1.98e-06)]_288
11951                            2.34e-07  136_[+3(1.66e-05)]_148_\
    [+3(2.13e-07)]_89_[+2(1.96e-05)]_36_[+1(2.06e-06)]_40
18099                            5.61e-07  230_[+2(2.33e-05)]_144_\
    [+3(1.70e-06)]_73_[+1(5.68e-07)]_14
23666                            2.06e-11  194_[+2(1.36e-07)]_2_[+3(1.14e-07)]_\
    180_[+1(2.26e-08)]_85
24658                            1.42e-05  304_[+2(2.92e-07)]_167_\
    [+1(2.06e-06)]_2
25331                            1.24e-07  162_[+2(5.82e-06)]_100_\
    [+3(1.15e-06)]_172_[+1(6.31e-07)]_27
5878                             4.58e-05  166_[+2(1.78e-06)]_119_\
    [+1(1.04e-06)]_188
6182                             4.23e-06  165_[+2(2.68e-06)]_307_\
    [+1(4.61e-08)]_1
9529                             6.95e-07  73_[+2(3.62e-06)]_29_[+3(3.36e-06)]_\
    344_[+1(2.33e-06)]_15
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************