********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/68/68.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
3211                     1.0000    500  46880                    1.0000    500
48302                    1.0000    500  48676                    1.0000    500
48881                    1.0000    500  43411                    1.0000    500
2936                     1.0000    500  10299                    1.0000    500
34321                    1.0000    500  42506                    1.0000    500
49299                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/68/68.seqs.fa -oc motifs/68 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.275 C 0.251 G 0.232 T 0.241
Background letter frequencies (from dataset with add-one prior applied):
A 0.275 C 0.251 G 0.232 T 0.241
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   9  llr = 129  E-value = 2.1e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :a42274::1a::2:6::::
pos.-specific     C  1:3341::12::36::111:
probability       G  8::1:2::77:6:2711399
matrix            T  1:233:6a2::47:3386:1

         bits    2.1        *
                 1.9  *     *  *
                 1.7  *     *  *       **
                 1.5  *     *  *       **
Relative         1.3  *     *  *   *   **
Entropy          1.1 **    **  *** * * **
(20.7 bits)      0.8 **    ******* * * **
                 0.6 **   ***************
                 0.4 *** ****************
                 0.2 *** ****************
                 0.0 --------------------

Multilevel           GAACCATTGGAGTCGATTGG
consensus              CTTGA TC TCATT G
sequence               TAA        G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
42506                       309  7.51e-09 ATTCGAGTGA GAATCATTGGATTGGATTCG AGATCGCACA
46880                       135  8.58e-09 GTTATGGGAG TAATTATTGGATTGGATTGG AAGTACCAAT
48302                       182  1.09e-08 CACCGTTCGT CATCCAATGGAGTCGATGGG CAACACGTCA
10299                       333  3.35e-08 GTCGCTGACG GACACGTTGGAGCAGTTGGG GGAACAACAA
48881                       434  1.24e-07 AAAACAGCTT GAATTGTTTCAGTCGGTGGG GACAAGGCAT
34321                       189  1.36e-07 GATACCGGCT GACCAAATGAAGTATTTTGG GCAATATCGT
2936                        144  3.67e-07 CGTAGAAGTC GATGCAATGGATTCTACCGG GCGATGGTAC
48676                       463  7.04e-07 AAGGACTAGT GACAAATTCCAGCCGAGTGG CCCTTCATCA
43411                        62  8.42e-07 TCTAGTATGC GAACTCATTGATCCTTTTGT CGACGCATGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42506                             7.5e-09  308_[+1]_172
46880                             8.6e-09  134_[+1]_346
48302                             1.1e-08  181_[+1]_299
10299                             3.3e-08  332_[+1]_148
48881                             1.2e-07  433_[+1]_47
34321                             1.4e-07  188_[+1]_292
2936                              3.7e-07  143_[+1]_337
48676                               7e-07  462_[+1]_18
43411                             8.4e-07  61_[+1]_419
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=9
42506                    (  309) GAATCATTGGATTGGATTCG  1
46880                    (  135) TAATTATTGGATTGGATTGG  1
48302                    (  182) CATCCAATGGAGTCGATGGG  1
10299                    (  333) GACACGTTGGAGCAGTTGGG  1
48881                    (  434) GAATTGTTTCAGTCGGTGGG  1
34321                    (  189) GACCAAATGAAGTATTTTGG  1
2936                     (  144) GATGCAATGGATTCTACCGG  1
48676                    (  463) GACAAATTCCAGCCGAGTGG  1
43411                    (   62) GAACTCATTGATCCTTTTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5291 bayes= 9.33146 E= 2.1e+000
  -982   -117    174   -112
   186   -982   -982   -982
    69     41   -982    -12
   -31     41   -106     46
   -31     82   -982     46
   127   -117     -6   -982
    69   -982   -982    120
  -982   -982   -982    205
  -982   -117    152    -12
  -131    -17    152   -982
   186   -982   -982   -982
  -982   -982    126     88
  -982     41   -982    146
   -31    115     -6   -982
  -982   -982    152     46
   101   -982   -106     46
  -982   -117   -106    169
  -982   -117     52    120
  -982   -117    194   -982
  -982   -982    194   -112
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 9 E= 2.1e+000
 0.000000  0.111111  0.777778  0.111111
 1.000000  0.000000  0.000000  0.000000
 0.444444  0.333333  0.000000  0.222222
 0.222222  0.333333  0.111111  0.333333
 0.222222  0.444444  0.000000  0.333333
 0.666667  0.111111  0.222222  0.000000
 0.444444  0.000000  0.000000  0.555556
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.111111  0.666667  0.222222
 0.111111  0.222222  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.555556  0.444444
 0.000000  0.333333  0.000000  0.666667
 0.222222  0.555556  0.222222  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.555556  0.000000  0.111111  0.333333
 0.000000  0.111111  0.111111  0.777778
 0.000000  0.111111  0.333333  0.555556
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.000000  0.888889  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GA[ACT][CTA][CTA][AG][TA]T[GT][GC]A[GT][TC][CAG][GT][AT]T[TG]GG
--------------------------------------------------------------------------------


Time  1.17 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  11  llr = 107  E-value = 5.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  2::::1:1:3:2
pos.-specific     C  :2:1651::475
probability       G  ::a:2431:::3
matrix            T  88:92:68a43:

         bits    2.1   *     *
                 1.9   *     *
                 1.7   **    *
                 1.5   **    *
Relative         1.3 ****   ** *
Entropy          1.1 ****   ** *
(14.0 bits)      0.8 ****  *** *
                 0.6 ********* **
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTGTCCTTTCCC
consensus                 GG  TTG
sequence                      A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
2936                        482  4.35e-08 GTACTAAGAT TTGTCCTTTTCC GGGCTAC
48676                       112  1.29e-07 TAGTTCAATT TTGTCGTTTTCC TAAACGCTTT
48302                       385  3.74e-06 CACATCGAAT TTGTGCTTTTTC CCATCGAGAC
43411                       287  7.20e-06 AATTGAAATC ATGTCGTTTCCG TCCGTGGTCT
46880                       333  8.05e-06 TTGCGACACA TTGTTCGTTCCG GATCCTCCGG
10299                        42  1.27e-05 ATCGTCGTCA TCGTCCTTTACA CACCGTATCC
49299                        74  1.80e-05 GAGACACTTT TTGTCCGATACC GTACCATTGG
48881                       171  4.48e-05 TTCATGCGAC ATGTCGCTTTCG GCCGGACATG
3211                        220  5.42e-05 TGTGCGTTAT TTGTTGTGTCTC CCATTTTATC
42506                       372  6.93e-05 ACTTCGGCAG TTGCGCTTTACA TTGCAAAATA
34321                       220  7.39e-05 CAATATCGTG TCGTCAGTTCTC TGGACGCTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
2936                              4.3e-08  481_[+2]_7
48676                             1.3e-07  111_[+2]_377
48302                             3.7e-06  384_[+2]_104
43411                             7.2e-06  286_[+2]_202
46880                               8e-06  332_[+2]_156
10299                             1.3e-05  41_[+2]_447
49299                             1.8e-05  73_[+2]_415
48881                             4.5e-05  170_[+2]_318
3211                              5.4e-05  219_[+2]_269
42506                             6.9e-05  371_[+2]_117
34321                             7.4e-05  219_[+2]_269
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=11
2936                     (  482) TTGTCCTTTTCC  1
48676                    (  112) TTGTCGTTTTCC  1
48302                    (  385) TTGTGCTTTTTC  1
43411                    (  287) ATGTCGTTTCCG  1
46880                    (  333) TTGTTCGTTCCG  1
10299                    (   42) TCGTCCTTTACA  1
49299                    (   74) TTGTCCGATACC  1
48881                    (  171) ATGTCGCTTTCG  1
3211                     (  220) TTGTTGTGTCTC  1
42506                    (  372) TTGCGCTTTACA  1
34321                    (  220) TCGTCAGTTCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 8.93074 E= 5.8e+002
   -60  -1010  -1010    176
 -1010    -46  -1010    176
 -1010  -1010    211  -1010
 -1010   -146  -1010    191
 -1010    134    -35    -41
  -160    112     65  -1010
 -1010   -146     23    140
  -160  -1010   -135    176
 -1010  -1010  -1010    205
    -1     53  -1010     59
 -1010    153  -1010     18
   -60    112     23  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 5.8e+002
 0.181818  0.000000  0.000000  0.818182
 0.000000  0.181818  0.000000  0.818182
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.090909  0.000000  0.909091
 0.000000  0.636364  0.181818  0.181818
 0.090909  0.545455  0.363636  0.000000
 0.000000  0.090909  0.272727  0.636364
 0.090909  0.000000  0.090909  0.818182
 0.000000  0.000000  0.000000  1.000000
 0.272727  0.363636  0.000000  0.363636
 0.000000  0.727273  0.000000  0.272727
 0.181818  0.545455  0.272727  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TTGTC[CG][TG]TT[CTA][CT][CG]
--------------------------------------------------------------------------------


Time  2.16 secs.

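The note below is not part of MEME's output. It is a minimal Python sketch of how the two Motif 2 matrices appear to follow from the 11 aligned sites in the BLOCKS listing: the letter-probability matrix as plain column frequencies, and the log-odds matrix as 100 * log2(frequency / background) after a small pseudocount. The function names and the smoothing formula are assumptions chosen because they reproduce the printed numbers (including the -1010 floor for letters that never occur); they are not taken from MEME's source. Because the background frequencies are printed to only three decimals, a recomputed score can differ from the report by one unit.

```python
import math
from collections import Counter

ALPHABET = "ACGT"
# Background frequencies as printed in this report (rounded to three decimals).
BACKGROUND = {"A": 0.275, "C": 0.251, "G": 0.232, "T": 0.241}
# "b= 0.01" from the "em:" line; using it as the pseudocount weight is an assumption.
PRIOR_B = 0.01

# The 11 Motif 2 sites, copied from the BLOCKS listing.
SITES = [
    "TTGTCCTTTTCC", "TTGTCGTTTTCC", "TTGTGCTTTTTC", "ATGTCGTTTCCG",
    "TTGTTCGTTCCG", "TCGTCCTTTACA", "TTGTCCGATACC", "ATGTCGCTTTCG",
    "TTGTTGTGTCTC", "TTGCGCTTTACA", "TCGTCAGTTCTC",
]

# Motif 2 log-odds rows as printed in the report (columns A, C, G, T).
REPORTED = [
    [-60, -1010, -1010, 176], [-1010, -46, -1010, 176],
    [-1010, -1010, 211, -1010], [-1010, -146, -1010, 191],
    [-1010, 134, -35, -41], [-160, 112, 65, -1010],
    [-1010, -146, 23, 140], [-160, -1010, -135, 176],
    [-1010, -1010, -1010, 205], [-1, 53, -1010, 59],
    [-1010, 153, -1010, 18], [-60, 112, 23, -1010],
]


def letter_probabilities(sites):
    """Column-wise letter frequencies of the aligned sites."""
    n = len(sites)
    rows = []
    for column in zip(*sites):
        counts = Counter(column)
        rows.append([counts[ch] / n for ch in ALPHABET])
    return rows


def log_odds(sites, background=BACKGROUND, b=PRIOR_B):
    """Scaled log-odds: round(100 * log2(smoothed frequency / background))."""
    n = len(sites)
    rows = []
    for column in zip(*sites):
        counts = Counter(column)
        row = []
        for ch in ALPHABET:
            q = background[ch]
            p = (counts[ch] + b * q) / (n + b)  # assumed smoothing, see note above
            row.append(round(100 * math.log2(p / q)))
        rows.append(row)
    return rows


probabilities = letter_probabilities(SITES)
scores = log_odds(SITES)
max_diff = max(
    abs(s - r)
    for score_row, reported_row in zip(scores, REPORTED)
    for s, r in zip(score_row, reported_row)
)
print(" ".join(f"{p:.6f}" for p in probabilities[0]))
print("largest difference from the printed log-odds:", max_diff)
```

The same two functions can be pointed at the site lists of Motif 1 or Motif 3 to cross-check their matrices, with the same caveat about the rounded background.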
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =  11  llr = 107  E-value = 1.2e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::17:5::1:a1
pos.-specific     C  71::3:14:a:7
probability       G  :9733414:::2
matrix            T  3:2:51839:::

         bits    2.1
                 1.9          **
                 1.7  *      ***
                 1.5  *      ***
Relative         1.3 **    * ***
Entropy          1.1 ****  * ***
(14.1 bits)      0.8 ****  * ****
                 0.6 **** ** ****
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGGATATCTCAC
consensus            T  GCG G
sequence                 G  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
42506                       441  1.20e-07 TCAACGTTAA CGGATATCTCAC ACTTTTGTCT
48881                       185  3.96e-07 CGCTTTCGGC CGGACATGTCAC ACTTGAATGC
48676                       259  2.87e-06 AAACCCTGTC CGGGGGTCTCAC CGAAGGATTG
10299                       371  6.79e-06 AAGTCTGGCA CGGATGGGTCAC CGAGCTGTCT
2936                         29  1.24e-05 CGGCCAAGGC TGGACGTGTCAG AAGGCTCGCA
3211                        183  1.41e-05 CTGGTTGGGA CGTATATTTCAG CTCGTCACTT
49299                       478  1.55e-05 CTCCCACGTT CGGGTATCTCAA CACAACTTTG
48302                       444  1.67e-05 GTGTGTGCGT TGTACATTTCAC AGGTAGACTT
43411                       327  3.45e-05 AGCCAACTAC CGGATTCCTCAC AAAGCTCGAT
34321                       284  3.93e-05 CATCCCATTG TCGAGATTTCAC ATTCAACATG
46880                       377  1.48e-04 GTCCGCGAAA CGAGGGTGACAC TTCCAGGATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42506                             1.2e-07  440_[+3]_48
48881                               4e-07  184_[+3]_304
48676                             2.9e-06  258_[+3]_230
10299                             6.8e-06  370_[+3]_118
2936                              1.2e-05  28_[+3]_460
3211                              1.4e-05  182_[+3]_306
49299                             1.6e-05  477_[+3]_11
48302                             1.7e-05  443_[+3]_45
43411                             3.5e-05  326_[+3]_162
34321                             3.9e-05  283_[+3]_205
46880                             0.00015  376_[+3]_112
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=11
42506                    (  441) CGGATATCTCAC  1
48881                    (  185) CGGACATGTCAC  1
48676                    (  259) CGGGGGTCTCAC  1
10299                    (  371) CGGATGGGTCAC  1
2936                     (   29) TGGACGTGTCAG  1
3211                     (  183) CGTATATTTCAG  1
49299                    (  478) CGGGTATCTCAA  1
48302                    (  444) TGTACATTTCAC  1
43411                    (  327) CGGATTCCTCAC  1
34321                    (  284) TCGAGATTTCAC  1
46880                    (  377) CGAGGGTGACAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 8.93074 E= 1.2e+003
 -1010    153  -1010     18
 -1010   -146    197  -1010
  -160  -1010    165    -41
   140  -1010     23  -1010
 -1010     12     23     91
    99  -1010     65   -141
 -1010   -146   -135    176
 -1010     53     65     18
  -160  -1010  -1010    191
 -1010    199  -1010  -1010
   186  -1010  -1010  -1010
  -160    153    -35  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 1.2e+003
 0.000000  0.727273  0.000000  0.272727
 0.000000  0.090909  0.909091  0.000000
 0.090909  0.000000  0.727273  0.181818
 0.727273  0.000000  0.272727  0.000000
 0.000000  0.272727  0.272727  0.454545
 0.545455  0.000000  0.363636  0.090909
 0.000000  0.090909  0.090909  0.818182
 0.000000  0.363636  0.363636  0.272727
 0.090909  0.000000  0.000000  0.909091
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.090909  0.727273  0.181818  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CT]GG[AG][TCG][AG]T[CGT]TCAC
--------------------------------------------------------------------------------


Time  3.10 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3211                             6.07e-03  182_[+3(1.41e-05)]_25_\
    [+2(5.42e-05)]_269
46880                            2.65e-07  134_[+1(8.58e-09)]_178_\
    [+2(8.05e-06)]_156
48302                            2.30e-08  181_[+1(1.09e-08)]_183_\
    [+2(3.74e-06)]_47_[+3(1.67e-05)]_45
48676                            9.56e-09  111_[+2(1.29e-07)]_135_\
    [+3(2.87e-06)]_192_[+1(7.04e-07)]_18
48881                            6.72e-08  170_[+2(4.48e-05)]_2_[+3(3.96e-07)]_\
    237_[+1(1.24e-07)]_47
43411                            4.11e-06  61_[+1(8.42e-07)]_205_\
    [+2(7.20e-06)]_28_[+3(3.45e-05)]_162
2936                             7.38e-09  28_[+3(1.24e-05)]_103_\
    [+1(3.67e-07)]_318_[+2(4.35e-08)]_7
10299                            8.62e-08  41_[+2(1.27e-05)]_279_\
    [+1(3.35e-08)]_18_[+3(6.79e-06)]_118
34321                            7.11e-06  188_[+1(1.36e-07)]_11_\
    [+2(7.39e-05)]_52_[+3(3.93e-05)]_205
42506                            2.51e-09  308_[+1(7.51e-09)]_43_\
    [+2(6.93e-05)]_57_[+3(1.20e-07)]_48
49299                            2.15e-03  73_[+2(1.80e-05)]_392_\
    [+3(1.55e-05)]_11
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
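
The note below is not part of MEME's output. It is a minimal Python sketch for reading one entry of the combined block diagrams. It assumes, as the numbers in this report suggest, that a bare number is a run of unmatched positions, that `[+m(p)]` is a site of motif m on the + strand with position p-value p, and that a trailing backslash only continues the line. `parse_diagram` and `MOTIF_WIDTHS` are hypothetical names introduced for illustration; the widths are taken from the three MOTIF header lines.

```python
import re

# Motif widths from the "MOTIF n ... width = w" header lines of this report.
MOTIF_WIDTHS = {1: 20, 2: 12, 3: 12}

# Either a bracketed site such as [+1(1.09e-08)] or a bare spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, strand, start, p_value), ...], total_length).

    Start positions are 1-based, matching the "Start" column of the
    per-motif site tables.
    """
    # Drop line-continuation backslashes and any whitespace around them.
    diagram = re.sub(r"[\\\s]", "", diagram)
    position = 0
    sites = []
    for match in TOKEN.finditer(diagram):
        if match.group(4) is not None:
            position += int(match.group(4))  # spacer between sites
        else:
            motif = int(match.group(2))
            sites.append((motif, match.group(1), position + 1, float(match.group(3))))
            position += widths[motif]
    return sites, position


# Entry for sequence 48302, copied from the summary table.
example = r"181_[+1(1.09e-08)]_183_\ [+2(3.74e-06)]_47_[+3(1.67e-05)]_45"
sites, total = parse_diagram(example)
for motif, strand, start, p_value in sites:
    print(f"motif {motif} ({strand}) starts at {start}, p = {p_value:.2e}")
print("sequence length implied by the diagram:", total)
```

For sequence 48302 the recovered starts (182, 385, 444) agree with the per-motif site tables, and the spacers plus motif widths add up to the 500-letter sequence length listed in the training set.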