********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/289/289.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17265                    1.0000    500  47958                    1.0000    500
48361                    1.0000    500  43295                    1.0000    500
43884                    1.0000    500  44384                    1.0000    500
45361                    1.0000    500  46312                    1.0000    500
49123                    1.0000    500  47079                    1.0000    500
44268                    1.0000    500  49105                    1.0000    500
38045                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/289/289.seqs.fa -oc motifs/289 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.262 C 0.230 G 0.236 T 0.272
Background letter frequencies (from dataset with add-one prior applied):
A 0.262 C 0.230 G 0.236 T 0.272
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  13  sites =   8  llr = 102  E-value = 2.0e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:5:1:94a48
pos.-specific     C  :3:a:::4:::41
probability       G  :8:::a:616:3:
matrix            T  a:::5:9:::::1

         bits    2.1    * *
                 1.9 * ** *    *
                 1.7 * ** *    *
                 1.5 * ** *  * *
Relative         1.3 **** ** * *
Entropy          1.1 **** ******
(18.5 bits)      0.8 *********** *
                 0.6 *********** *
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           TGACAGTGAGAAA
consensus             C  T  C A C
sequence                        G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            -------------
49123                       376  3.05e-08 AGAAATACAG TGACTGTGAGACA TGCGCGTCCG
43884                       110  1.11e-07 GTAGTCGCAC TGACTGTGAGAGA AATAGCAAAC
47079                       272  2.48e-07 CTGGAGTATT TGACTGTGAAAAA GTGAAGCGAG
38045                       348  5.39e-07 TGGGTGCAGG TCACTGTCAGACA ACGGCGTCAA
44268                       167  6.34e-07 AGTACCATCG TGACAGTGAGAAC GCGAACACAA
49105                       137  1.24e-06 TACGCTGTAC TGACAGACAGACA GAGACAGACA
46312                        29  1.24e-06 TGGTAGGGAT TCACAGTCAAAAA AGCATTCTTG
45361                       253  5.31e-06 AACACTGTCT TGACAGTGGAAGT ATTTGCAGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49123                             3.1e-08  375_[+1]_112
43884                             1.1e-07  109_[+1]_378
47079                             2.5e-07  271_[+1]_216
38045                             5.4e-07  347_[+1]_140
44268                             6.3e-07  166_[+1]_321
49105                             1.2e-06  136_[+1]_351
46312                             1.2e-06  28_[+1]_459
45361                             5.3e-06  252_[+1]_235
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=13 seqs=8
49123                    (  376) TGACTGTGAGACA  1
43884                    (  110) TGACTGTGAGAGA  1
47079                    (  272) TGACTGTGAAAAA  1
38045                    (  348) TCACTGTCAGACA  1
44268                    (  167) TGACAGTGAGAAC  1
49105                    (  137) TGACAGACAGACA  1
46312                    (   29) TCACAGTCAAAAA  1
45361                    (  253) TGACAGTGGAAGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 6344 bayes= 9.62936 E= 2.0e+000
  -965   -965   -965    188
  -965     12    167   -965
   193   -965   -965   -965
  -965    212   -965   -965
    93   -965   -965     88
  -965   -965    208   -965
  -107   -965   -965    168
  -965     70    141   -965
   174   -965    -91   -965
    52   -965    141   -965
   193   -965   -965   -965
    52     70      9   -965
   152    -88   -965   -112
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 8 E= 2.0e+000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.375000  0.625000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.375000  0.000000  0.625000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.375000  0.375000  0.250000  0.000000
 0.750000  0.125000  0.000000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[GC]AC[AT]GT[GC]A[GA]A[ACG]A
--------------------------------------------------------------------------------




Time  1.40 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  11  llr = 114  E-value = 3.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  293::a:61:24
pos.-specific     C  8:5:1:a12221
probability       G  ::::::::7::5
matrix            T  :13a9::3:86:

         bits    2.1       *
                 1.9    * **
                 1.7    * **
                 1.5 ** ****
Relative         1.3 ** ****  *
Entropy          1.1 ** **** **
(15.0 bits)      0.8 ** **** **
                 0.6 ** *********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CACTTACAGTTG
consensus              A    T   A
sequence               T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
48361                       237  6.67e-08 TTCTCATCGT CACTTACAGTTG CCTAACCATC
49105                       448  1.41e-07 TACAGCGCAG CACTTACAGTTA CACCACCCAG
49123                       250  8.56e-07 TGTCAAAGTA CACTTACTGTTA GGGATTTGAA
43884                        73  1.02e-06 ACGTTGCGTG CATTTACTGTTG ACATCTCTTT
38045                       119  3.15e-06 GTGCCTAATC CAATTACAGTCA TTAATGGTCT
17265                       473  6.98e-06 ACAATTCAGC AACTTACACTTG CATTAGCGTC
47958                       466  1.77e-05 TTGCTTAAAG CACTTACTGCAA GCGTCCGAAT
47079                       115  2.22e-05 GGTCAATTGG CTTTTACACTTG CTCTGACAAA
44268                       440  3.55e-05 ATTGTTGCCA CAATCACAATTG TATCATTTTC
46312                       108  3.55e-05 CTCGCTCACT AAATTACAGCAG ACCCACGAAT
44384                        39  4.36e-05 CGTCGGCATC CATTTACCGTCC GACTCGGGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48361                             6.7e-08  236_[+2]_252
49105                             1.4e-07  447_[+2]_41
49123                             8.6e-07  249_[+2]_239
43884                               1e-06  72_[+2]_416
38045                             3.1e-06  118_[+2]_370
17265                               7e-06  472_[+2]_16
47958                             1.8e-05  465_[+2]_23
47079                             2.2e-05  114_[+2]_374
44268                             3.6e-05  439_[+2]_49
46312                             3.6e-05  107_[+2]_381
44384                             4.4e-05  38_[+2]_450
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=11
48361                    (  237) CACTTACAGTTG  1
49105                    (  448) CACTTACAGTTA  1
49123                    (  250) CACTTACTGTTA  1
43884                    (   73) CATTTACTGTTG  1
38045                    (  119) CAATTACAGTCA  1
17265                    (  473) AACTTACACTTG  1
47958                    (  466) CACTTACTGCAA  1
47079                    (  115) CTTTTACACTTG  1
44268                    (  440) CAATCACAATTG  1
46312                    (  108) AAATTACAGCAG  1
44384                    (   39) CATTTACCGTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.52784 E= 3.8e+001
   -53    183  -1010  -1010
   179  -1010  -1010   -158
     6     98  -1010      0
 -1010  -1010  -1010    188
 -1010   -134  -1010    174
   193  -1010  -1010  -1010
 -1010    212  -1010  -1010
   128   -134  -1010      0
  -153    -34    163  -1010
 -1010    -34  -1010    159
   -53    -34  -1010    122
    47   -134    121  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 3.8e+001
 0.181818  0.818182  0.000000  0.000000
 0.909091  0.000000  0.000000  0.090909
 0.272727  0.454545  0.000000  0.272727
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.090909  0.000000  0.909091
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.636364  0.090909  0.000000  0.272727
 0.090909  0.181818  0.727273  0.000000
 0.000000  0.181818  0.000000  0.818182
 0.181818  0.181818  0.000000  0.636364
 0.363636  0.090909  0.545455  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CA[CAT]TTAC[AT]GTT[GA]
--------------------------------------------------------------------------------




Time  2.92 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  20  sites =   3  llr = 68  E-value = 1.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :3:3::::3:::7:33::::
pos.-specific     C  a7:7aaaa::aa:::3a33a
probability       G  ::a:::::::::3::::7::
matrix            T  ::::::::7a:::a73::7:

         bits    2.1 * * ****  **    *  *
                 1.9 * * **** *** *  *  *
                 1.7 * * **** *** *  *  *
                 1.5 * * **** *** *  *  *
Relative         1.3 * * **** *** *  ** *
Entropy          1.1 *************** ****
(32.8 bits)      0.8 *************** ****
                 0.6 *************** ****
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CCGCCCCCTTCCATTACGTC
consensus             A A    A   G AC CC
sequence                            T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            --------------------
48361                       271  1.97e-11 GAAGGAACAA CCGCCCCCATCCATTCCCTC GCAAGAGTCT
43884                       393  3.07e-11 AATAAAATTG CCGCCCCCTTCCGTATCGTC CAGAACAAAG
49105                       398  8.86e-11 CACAAACAGA CAGACCCCTTCCATTACGCC AGCACCAGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48361                               2e-11  270_[+3]_210
43884                             3.1e-11  392_[+3]_88
49105                             8.9e-11  397_[+3]_83
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=3
48361                    (  271) CCGCCCCCATCCATTCCCTC  1
43884                    (  393) CCGCCCCCTTCCGTATCGTC  1
49105                    (  398) CAGACCCCTTCCATTACGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6253 bayes= 11.4723 E= 1.6e+002
  -823    212   -823   -823
    35    153   -823   -823
  -823   -823    208   -823
    35    153   -823   -823
  -823    212   -823   -823
  -823    212   -823   -823
  -823    212   -823   -823
  -823    212   -823   -823
    35   -823   -823    129
  -823   -823   -823    187
  -823    212   -823   -823
  -823    212   -823   -823
   134   -823     50   -823
  -823   -823   -823    187
    35   -823   -823    129
    35     53   -823     29
  -823    212   -823   -823
  -823     53    150   -823
  -823     53   -823    129
  -823    212   -823   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 3 E= 1.6e+002
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.333333  0.000000  0.000000  0.666667
 0.333333  0.333333  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[CA]G[CA]CCCC[TA]TCC[AG]T[TA][ACT]C[GC][TC]C
--------------------------------------------------------------------------------




Time  4.78 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17265                            2.59e-02  472_[+2(6.98e-06)]_16
47958                            4.11e-02  430_[+2(7.95e-05)]_23_\
    [+2(1.77e-05)]_23
48361                            9.28e-11  236_[+2(6.67e-08)]_22_\
    [+3(1.97e-11)]_210
43295                            9.68e-01  500
43884                            2.65e-13  72_[+2(1.02e-06)]_25_[+1(1.11e-07)]_\
    270_[+3(3.07e-11)]_88
44384                            9.35e-02  38_[+2(4.36e-05)]_450
45361                            1.64e-02  252_[+1(5.31e-06)]_235
46312                            5.81e-04  28_[+1(1.24e-06)]_66_[+2(3.55e-05)]_\
    381
49123                            6.23e-07  249_[+2(8.56e-07)]_114_\
    [+1(3.05e-08)]_34_[+1(1.08e-05)]_65
47079                            1.33e-04  114_[+2(2.22e-05)]_145_\
    [+1(2.48e-07)]_192_[+2(9.83e-05)]_12
44268                            3.05e-04  166_[+1(6.34e-07)]_260_\
    [+2(3.55e-05)]_49
49105                            1.08e-12  136_[+1(1.24e-06)]_248_\
    [+3(8.86e-11)]_30_[+2(1.41e-07)]_41
38045                            3.69e-06  118_[+2(3.15e-06)]_217_\
    [+1(5.39e-07)]_140
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************