********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/337/337.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
28056                    1.0000    500  40023                    1.0000    500
40841                    1.0000    500  50197                    1.0000    500
43808                    1.0000    500  44203                    1.0000    500
44658                    1.0000    500  44815                    1.0000    500
45041                    1.0000    500  36133                    1.0000    500
46819                    1.0000    500  37237                    1.0000    500
47167                    1.0000    500  47548                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/337/337.seqs.fa -oc motifs/337 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.276 C 0.219 G 0.217 T 0.288
Background letter frequencies (from dataset with add-one prior applied):
A 0.276 C 0.219 G 0.217 T 0.288
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  12  llr = 118  E-value = 3.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::1:2:4:
pos.-specific     C  232a::3:3:2:
probability       G  :8::8232::3:
matrix            T  8:8:38385a1a

         bits    2.2    *
                 2.0    *
                 1.8    *     * *
                 1.5    *     * *
Relative         1.3 ****** * * *
Entropy          1.1 ****** * * *
(14.2 bits)      0.9 ****** * * *
                 0.7 ****** * * *
                 0.4 ****** *** *
                 0.2 ************
                 0.0 ------------

Multilevel           TGTCGTCTTTAT
consensus             C  T G C G
sequence                   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
43808                       437  3.64e-07 CAGTCATCAG TGTCGTCTTTAT CTTCTGATAC
28056                       325  8.82e-07 TCCCTGACGC TGTCGTTTTTAT AGGTGTCATT
47167                       355  1.23e-06 ATCACAAATC TGTCGTCTTTCT CTAAAAGTCG
37237                       192  2.07e-06 GAGTTCACCG TGTCGTGTATGT TAACTGTGTG
40841                       294  5.56e-06 AAATAGGACA TGTCTTGTCTGT TATTTCAATT
40023                       300  1.45e-05 TTCGCGGTAA TGTCGGCTATAT ACGAGGGCAT
50197                       461  2.69e-05 GTAAATCTTT CGTCGTGTTTTT TGTAGACGAG
44203                        46  2.77e-05 GTTGGCAGTA TGTCGTAGCTGT CTCATAGATA
45041                       377  3.07e-05 CCGTGGTCTA TCTCTTTTCTGT GCCTTGCGTA
44815                       245  3.07e-05 ATTGGTTTCC TCTCTTGTTTCT GATATTTACA
44658                       235  6.43e-05 ATAACAGCCG CGCCGTCGCTAT TGTCTACAGG
47548                       310  6.96e-05 CCAAGCGTTA TCCCGGTTTTAT ATAGTGCATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43808                             3.6e-07  436_[+1]_52
28056                             8.8e-07  324_[+1]_164
47167                             1.2e-06  354_[+1]_134
37237                             2.1e-06  191_[+1]_297
40841                             5.6e-06  293_[+1]_195
40023                             1.4e-05  299_[+1]_189
50197                             2.7e-05  460_[+1]_28
44203                             2.8e-05  45_[+1]_443
45041                             3.1e-05  376_[+1]_112
44815                             3.1e-05  244_[+1]_244
44658                             6.4e-05  234_[+1]_254
47548                               7e-05  309_[+1]_179
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=12
43808                    (  437) TGTCGTCTTTAT  1
28056                    (  325) TGTCGTTTTTAT  1
47167                    (  355) TGTCGTCTTTCT  1
37237                    (  192) TGTCGTGTATGT  1
40841                    (  294) TGTCTTGTCTGT  1
40023                    (  300) TGTCGGCTATAT  1
50197                    (  461) CGTCGTGTTTTT  1
44203                    (   46) TGTCGTAGCTGT  1
45041                    (  377) TCTCTTTTCTGT  1
44815                    (  245) TCTCTTGTTTCT  1
44658                    (  235) CGCCGTCGCTAT  1
47548                    (  310) TCCCGGTTTTAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.60169 E= 3.4e+002
 -1023    -39  -1023    153
 -1023     19    179  -1023
 -1023    -39  -1023    153
 -1023    219  -1023  -1023
 -1023  -1023    179    -20
 -1023  -1023    -38    153
  -172     61     62    -20
 -1023  -1023    -38    153
   -73     61  -1023     80
 -1023  -1023  -1023    180
    59    -39     62   -179
 -1023  -1023  -1023    180
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 3.4e+002
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.166667  0.833333
 0.083333  0.333333  0.333333  0.250000
 0.000000  0.000000  0.166667  0.833333
 0.166667  0.333333  0.000000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.416667  0.166667  0.333333  0.083333
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[GC]TC[GT]T[CGT]T[TC]T[AG]T
--------------------------------------------------------------------------------

Time 1.78 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  14  sites =   7  llr = 91  E-value = 6.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a:139:644:a19:
pos.-specific     C  ::44111::::61:
probability       G  :a13:9366a:3:a
matrix            T  ::3:::::::::::

         bits    2.2  *       *   *
                 2.0  *       *   *
                 1.8 **       **  *
                 1.5 **   *   **  *
Relative         1.3 **  **   ** **
Entropy          1.1 **  ** **** **
(18.8 bits)      0.9 **  ** **** **
                 0.7 **  **********
                 0.4 ** ***********
                 0.2 **************
                 0.0 --------------

Multilevel           AGCCAGAGGGACAG
consensus              TA  GAA  G
sequence                G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            --------------
50197                       327  1.16e-08 TGACTTGAAA AGCAAGAGGGACAG GAACAACATT
46819                       156  5.00e-08 TGGAATTCAC AGTCAGAGAGACAG TGCGCGAGGA
43808                       180  3.40e-07 CAATGCCGAG AGGCAGGGGGAGAG GGGGGTTCGG
37237                       136  7.13e-07 GTCAAAACGG AGTCACAAGGACAG ACGTATACAA
36133                        38  9.69e-07 AAAGCACTCA AGCGAGCAAGAGAG ATGTTGGATG
47167                       470  1.80e-06 TGGGCTGAGA AGAAAGAAGGAAAG GATTCGGGAC
44203                       299  2.49e-06 AAAAAACAAG AGCGCGGGAGACCG CATCTCTCCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50197                             1.2e-08  326_[+2]_160
46819                               5e-08  155_[+2]_331
43808                             3.4e-07  179_[+2]_307
37237                             7.1e-07  135_[+2]_351
36133                             9.7e-07  37_[+2]_449
47167                             1.8e-06  469_[+2]_17
44203                             2.5e-06  298_[+2]_188
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=7
50197                    (  327) AGCAAGAGGGACAG  1
46819                    (  156) AGTCAGAGAGACAG  1
43808                    (  180) AGGCAGGGGGAGAG  1
37237                    (  136) AGTCACAAGGACAG  1
36133                    (   38) AGCGAGCAAGAGAG  1
47167                    (  470) AGAAAGAAGGAAAG  1
44203                    (  299) AGCGCGGGAGACCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 6818 bayes= 10.5325 E= 6.3e+001
   186   -945   -945   -945
  -945   -945    220   -945
   -95     97    -60     -1
     5     97     39   -945
   163    -61   -945   -945
  -945    -61    198   -945
   105    -61     39   -945
    63   -945    139   -945
    63   -945    139   -945
  -945   -945    220   -945
   186   -945   -945   -945
   -95    138     39   -945
   163    -61   -945   -945
  -945   -945    220   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 7 E= 6.3e+001
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.428571  0.142857  0.285714
 0.285714  0.428571  0.285714  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.571429  0.142857  0.285714  0.000000
 0.428571  0.000000  0.571429  0.000000
 0.428571  0.000000  0.571429  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.571429  0.285714  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
AG[CT][CAG]AG[AG][GA][GA]GA[CG]AG
--------------------------------------------------------------------------------

Time 3.49 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   3  llr = 56  E-value = 1.9e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :aa3:3::::::7:::
pos.-specific     C  ::::7:7:::a::a3a
probability       G  a::3:7:aaa::3:7:
matrix            T  :::33:3::::a::::

         bits    2.2 *      ****  * *
                 2.0 *      ****  * *
                 1.8 ***    ***** * *
                 1.5 ***    ***** * *
Relative         1.3 ***    ***** ***
Entropy          1.1 *** ************
(27.1 bits)      0.9 *** ************
                 0.7 *** ************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GAAACGCGGGCTACGC
consensus               GTAT     G C
sequence                T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ----------------
37237                        48  1.72e-09 CCGGATAGGG GAATTGCGGGCTACGC GTCGCCCGGA
43808                         8  1.72e-09    CTAGTTG GAAACGTGGGCTACGC TATCGCTACC
44658                       312  4.17e-09 AACGATGGAA GAAGCACGGGCTGCCC CTTCTCATCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37237                             1.7e-09  47_[+3]_437
43808                             1.7e-09  7_[+3]_477
44658                             4.2e-09  311_[+3]_173
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=3
37237                    (   48) GAATTGCGGGCTACGC  1
43808                    (    8) GAAACGTGGGCTACGC  1
44658                    (  312) GAAGCACGGGCTGCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 10.8024 E= 1.9e+003
  -823   -823    220   -823
   185   -823   -823   -823
   185   -823   -823   -823
    27   -823     62     21
  -823    160   -823     21
    27   -823    161   -823
  -823    160   -823     21
  -823   -823    220   -823
  -823   -823    220   -823
  -823   -823    220   -823
  -823    219   -823   -823
  -823   -823   -823    179
   127   -823     62   -823
  -823    219   -823   -823
  -823     61    161   -823
  -823    219   -823   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 3 E= 1.9e+003
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.333333  0.333333
 0.000000  0.666667  0.000000  0.333333
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GAA[AGT][CT][GA][CT]GGGCT[AG]C[GC]C
--------------------------------------------------------------------------------

Time 5.01 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
28056                            3.67e-04  16_[+3(3.53e-05)]_292_\
    [+1(8.82e-07)]_164
40023                            3.65e-02  299_[+1(1.45e-05)]_189
40841                            4.12e-02  293_[+1(5.56e-06)]_195
50197                            9.12e-06  326_[+2(1.16e-08)]_120_\
    [+1(2.69e-05)]_28
43808                            1.29e-11  7_[+3(1.72e-09)]_156_[+2(3.40e-07)]_\
    243_[+1(3.64e-07)]_52
44203                            4.41e-04  45_[+1(2.77e-05)]_241_\
    [+2(2.49e-06)]_188
44658                            3.86e-06  234_[+1(6.43e-05)]_65_\
    [+3(4.17e-09)]_173
44815                            9.34e-02  244_[+1(3.07e-05)]_244
45041                            1.89e-01  376_[+1(3.07e-05)]_112
36133                            3.08e-03  37_[+2(9.69e-07)]_449
46819                            8.70e-04  155_[+2(5.00e-08)]_331
37237                            1.31e-10  47_[+3(1.72e-09)]_72_[+2(7.13e-07)]_\
    42_[+1(2.07e-06)]_297
47167                            1.57e-05  354_[+1(1.23e-06)]_103_\
    [+2(1.80e-06)]_17
47548                            2.01e-01  309_[+1(6.96e-05)]_179
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
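
Note on reading the matrices: the finite entries in each log-odds matrix above appear consistent with 100 * log2(p / f), where p is the matching entry of the letter-probability matrix and f is the background frequency of that letter. The Python sketch below checks this against the first row of Motif 1. The function name approx_log_odds and the plain rounding are assumptions made for illustration; MEME's own calculation includes prior pseudocounts, so an occasional entry can differ by one unit, and zero-probability letters are printed as a large negative floor (for example -1023) rather than computed this way.

```python
import math

# Background frequencies copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.276, "C": 0.219, "G": 0.217, "T": 0.288}


def approx_log_odds(prob, letter, scale=100):
    """Approximate a log-odds entry as round(scale * log2(prob / background)).

    Returns None for a zero probability, where the report prints a large
    negative floor value instead of a computed score.
    """
    if prob <= 0.0:
        return None
    return round(scale * math.log2(prob / BACKGROUND[letter]))


# First row of the Motif 1 letter-probability matrix, in A, C, G, T order.
row = dict(zip("ACGT", (0.000000, 0.166667, 0.000000, 0.833333)))
scores = {letter: approx_log_odds(p, letter) for letter, p in row.items()}
print(scores)  # compare with the first log-odds row: -1023 -39 -1023 153
```

The same comparison can be repeated for any other row by substituting its four probabilities.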