********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/117/117.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31866                    1.0000    500  31918                    1.0000    500
32014                    1.0000    500  54066                    1.0000    500
47565                    1.0000    500  38086                    1.0000    500
48810                    1.0000    500  41383                    1.0000    500
10306                    1.0000    500  33329                    1.0000    500
44715                    1.0000    500  45413                    1.0000    500
46834                    1.0000    500  39888                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/117/117.seqs.fa -oc motifs/117 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.256 C 0.238 G 0.237 T 0.269
Background letter frequencies (from dataset with add-one prior applied):
A 0.256 C 0.238 G 0.237 T 0.269
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  12  sites =  11  llr = 121  E-value = 6.5e-002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  61:35:3::2a9
pos.-specific     C  19:3:a:::5::
probability       G  2::24::a:4:1
matrix            T  1:a31:7:a:::

         bits    2.1      * *
                 1.9   *  * ** *
                 1.7  **  * ** *
                 1.5  **  * ** **
Relative         1.2  **  * ** **
Entropy          1.0  **  **** **
(15.9 bits)      0.8  **  **** **
                 0.6  ** ********
                 0.4 *** ********
                 0.2 *** ********
                 0.0 ------------

Multilevel           ACTAACTGTCAA
consensus               CG A  G
sequence                T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
54066                       146  2.04e-07 ACGATGGCAG ACTTACTGTCAA TCTAGCATCA
31866                       326  2.04e-07 TAATACTTTG ACTTACTGTCAA TTAGTCGAAC
45413                       134  5.92e-07 TCGAACCACG ACTCGCTGTCAA TTTTGGGACT
38086                        25  5.92e-07 TATATTTTAC ACTTACTGTGAA GGATGAAGGA
48810                       217  1.15e-06 AATATGCGGA ACTCACAGTCAA ACGGAATGGC
10306                       372  3.28e-06 TGACTGTGAA GCTGACTGTGAA GCTGACTGTG
33329                        97  8.96e-06 CCTCTGGTTT AATCGCTGTCAA TGTGGATGCT
41383                       319  1.15e-05 GAATTGCGTC GCTGGCAGTGAA CACAGGCAGG
39888                        15  1.30e-05 GTAATGGAAC ACTAACTGTAAG GCAGCGTCGC
32014                        51  1.49e-05 TAGGCCCCAA CCTAGCAGTGAA ACAAGCATGA
31918                       142  3.24e-05 GTCGAATTGA TCTATCTGTAAA GGAGAAGATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54066                               2e-07  145_[+1]_343
31866                               2e-07  325_[+1]_163
45413                             5.9e-07  133_[+1]_355
38086                             5.9e-07  24_[+1]_464
48810                             1.2e-06  216_[+1]_272
10306                             3.3e-06  371_[+1]_117
33329                               9e-06  96_[+1]_392
41383                             1.2e-05  318_[+1]_170
39888                             1.3e-05  14_[+1]_474
32014                             1.5e-05  50_[+1]_438
31918                             3.2e-05  141_[+1]_347
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=11
54066                    (  146) ACTTACTGTCAA  1
31866                    (  326) ACTTACTGTCAA  1
45413                    (  134) ACTCGCTGTCAA  1
38086                    (   25) ACTTACTGTGAA  1
48810                    (  217) ACTCACAGTCAA  1
10306                    (  372) GCTGACTGTGAA  1
33329                    (   97) AATCGCTGTCAA  1
41383                    (  319) GCTGGCAGTGAA  1
39888                    (   15) ACTAACTGTAAG  1
32014                    (   51) CCTAGCAGTGAA  1
31918                    (  142) TCTATCTGTAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.6349 E= 6.5e-002
   131   -139    -38   -156
  -149    193  -1010  -1010
 -1010  -1010  -1010    190
     9     19    -38      2
   109  -1010     62   -156
 -1010    207  -1010  -1010
     9  -1010  -1010    144
 -1010  -1010    208  -1010
 -1010  -1010  -1010    190
   -49     93     62  -1010
   196  -1010  -1010  -1010
   183  -1010   -138  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 6.5e-002
 0.636364  0.090909  0.181818  0.090909
 0.090909  0.909091  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.272727  0.272727  0.181818  0.272727
 0.545455  0.000000  0.363636  0.090909
 0.000000  1.000000  0.000000  0.000000
 0.272727  0.000000  0.000000  0.727273
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.181818  0.454545  0.363636  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.909091  0.000000  0.090909  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
ACT[ACT][AG]C[TA]GT[CG]AA
--------------------------------------------------------------------------------




Time  1.64 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  21  sites =   8  llr = 120  E-value = 4.9e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :81a13:9:5:6::3334:::
pos.-specific     C  8:3:53:18::18a:3:1:91
probability       G  3:4:::a:333:1:1343::9
matrix            T  :33:45:::3831:6343a1:

         bits    2.1       *      *
                 1.9    *  *      *    *
                 1.7    *  *      *    *
                 1.5    *  **     *    ***
Relative         1.2 *  *  ***    *    ***
Entropy          1.0 ** *  *** * **    ***
(21.7 bits)      0.8 ** *  *** * **    ***
                 0.6 ** ** *** *****   ***
                 0.4 ** ************ * ***
                 0.2 *************** * ***
                 0.0 ---------------------

Multilevel           CAGACTGACATACCTAGATCG
consensus            GTC TA  GGGT  ACTG
sequence               T  C   T     GAT
                                    T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
48810                       307  4.29e-11 ACAAATGTCC CAGACTGACTTACCTATATCG CGGGGGGGGG
31918                       115  7.07e-09 AAAACGCCAT CACATAGACGTACCATGGTCG AATTGATCTA
39888                       435  3.56e-08 ACTCGAGCAT GAAATTGAGAGACCTAGATCG ATCTGGTTTC
33329                       216  4.29e-08 AGGCAATTCT CATATTGACATTCCTGATTCC AATCACTCGT
45413                        98  9.28e-08 TGTGCCCAAA CAGAACGACGGACCGGAATCG TGCCATCGAA
44715                       218  2.08e-07 ACCTTTCTAG CTTACAGAGATAGCTCTCTCG AATAGCGAGC
46834                       378  2.53e-07 GCCAAATCTC GTGACCGCCATCCCTCGTTCG TTCACGGACA
41383                       426  4.82e-07 TAGATATCTG CACACTGACTTTTCATTGTTG GTGCATCGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48810                             4.3e-11  306_[+2]_173
31918                             7.1e-09  114_[+2]_365
39888                             3.6e-08  434_[+2]_45
33329                             4.3e-08  215_[+2]_264
45413                             9.3e-08  97_[+2]_382
44715                             2.1e-07  217_[+2]_262
46834                             2.5e-07  377_[+2]_102
41383                             4.8e-07  425_[+2]_54
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
48810                    (  307) CAGACTGACTTACCTATATCG  1
31918                    (  115) CACATAGACGTACCATGGTCG  1
39888                    (  435) GAAATTGAGAGACCTAGATCG  1
33329                    (  216) CATATTGACATTCCTGATTCC  1
45413                    (   98) CAGAACGACGGACCGGAATCG  1
44715                    (  218) CTTACAGAGATAGCTCTCTCG  1
46834                    (  378) GTGACCGCCATCCCTCGTTCG  1
41383                    (  426) CACACTGACTTTTCATTGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6720 bayes= 9.71253 E= 4.9e+002
  -965    165      8   -965
   155   -965   -965    -10
  -103      7     66    -10
   196   -965   -965   -965
  -103    107   -965     48
    -3      7   -965     90
  -965   -965    207   -965
   177    -93   -965   -965
  -965    165      8   -965
    96   -965      8    -10
  -965   -965      8    148
   129    -93   -965    -10
  -965    165    -92   -110
  -965    207   -965   -965
    -3   -965    -92    122
    -3      7      8    -10
    -3   -965     66     48
    55    -93      8    -10
  -965   -965   -965    190
  -965    188   -965   -110
  -965    -93    188   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 4.9e+002
 0.000000  0.750000  0.250000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.125000  0.250000  0.375000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.500000  0.000000  0.375000
 0.250000  0.250000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.500000  0.000000  0.250000  0.250000
 0.000000  0.000000  0.250000  0.750000
 0.625000  0.125000  0.000000  0.250000
 0.000000  0.750000  0.125000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.125000  0.625000
 0.250000  0.250000  0.250000  0.250000
 0.250000  0.000000  0.375000  0.375000
 0.375000  0.125000  0.250000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.875000  0.000000  0.125000
 0.000000  0.125000  0.875000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[CG][AT][GCT]A[CT][TAC]GA[CG][AGT][TG][AT]CC[TA][ACGT][GTA][AGT]TCG
--------------------------------------------------------------------------------




Time  3.65 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  21  sites =   3  llr = 65  E-value = 2.0e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::3:7:::::::::3::373:
pos.-specific     C  3:::::::::33:33:3::::
probability       G  7a7a3aaaaa73a33a7737a
matrix            T  :::::::::::3:3:::::::

         bits    2.1  * * *****  *  *    *
                 1.9  * * *****  *  *    *
                 1.7  * * *****  *  *    *
                 1.5  * * *****  *  *    *
Relative         1.2 ** * ****** *  **   *
Entropy          1.0 *********** *  ******
(31.1 bits)      0.8 *********** *  ******
                 0.6 *********** *  ******
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GGGGAGGGGGGCGCAGGGAGG
consensus            C A G     CG GC CAGA
sequence                        T TG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
48810                       329  2.45e-11 CCTATATCGC GGGGGGGGGGGGGGGGGGGGG GGGGGGTCTC
38086                        67  9.23e-11 TACTGTGACA CGAGAGGGGGGCGTAGGGAGG AATGCCCTTT
31866                       241  4.19e-10 GGGCAGACTA GGGGAGGGGGCTGCCGCAAAG CTCGGGAGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48810                             2.5e-11  328_[+3]_151
38086                             9.2e-11  66_[+3]_413
31866                             4.2e-10  240_[+3]_239
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=3
48810                    (  329) GGGGGGGGGGGGGGGGGGGGG  1
38086                    (   67) CGAGAGGGGGGCGTAGGGAGG  1
31866                    (  241) GGGGAGGGGGCTGCCGCAAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6720 bayes= 11.5763 E= 2.0e+003
  -823     48    149   -823
  -823   -823    207   -823
    38   -823    149   -823
  -823   -823    207   -823
   138   -823     49   -823
  -823   -823    207   -823
  -823   -823    207   -823
  -823   -823    207   -823
  -823   -823    207   -823
  -823   -823    207   -823
  -823     48    149   -823
  -823     48     49     31
  -823   -823    207   -823
  -823     48     49     31
    38     48     49   -823
  -823   -823    207   -823
  -823     48    149   -823
    38   -823    149   -823
   138   -823     49   -823
    38   -823    149   -823
  -823   -823    207   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 3 E= 2.0e+003
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.333333  0.333333  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.333333  0.333333  0.333333
 0.333333  0.333333  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[GC]G[GA]G[AG]GGGGG[GC][CGT]G[CGT][ACG]G[GC][GA][AG][GA]G
--------------------------------------------------------------------------------




Time  5.59 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31866                            2.67e-09  71_[+1(7.30e-05)]_38_[+1(9.90e-05)]_\
    107_[+3(4.19e-10)]_64_[+1(2.04e-07)]_163
31918                            6.83e-06  114_[+2(7.07e-09)]_6_[+1(3.24e-05)]_\
    347
32014                            8.50e-02  50_[+1(1.49e-05)]_438
54066                            3.27e-03  145_[+1(2.04e-07)]_343
47565                            9.20e-01  500
38086                            3.69e-09  24_[+1(5.92e-07)]_30_[+3(9.23e-11)]_\
    413
48810                            1.35e-16  216_[+1(1.15e-06)]_78_\
    [+2(4.29e-11)]_1_[+3(2.45e-11)]_151
41383                            1.12e-05  318_[+1(1.15e-05)]_95_\
    [+2(4.82e-07)]_54
10306                            1.36e-02  359_[+1(9.08e-05)]_[+1(3.28e-06)]_\
    [+1(3.28e-06)]_105
33329                            1.34e-05  96_[+1(8.96e-06)]_107_\
    [+2(4.29e-08)]_264
44715                            2.09e-03  217_[+2(2.08e-07)]_262
45413                            2.06e-06  97_[+2(9.28e-08)]_15_[+1(5.92e-07)]_\
    355
46834                            5.31e-03  377_[+2(2.53e-07)]_102
39888                            2.38e-06  14_[+1(1.30e-05)]_408_\
    [+2(3.56e-08)]_45
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************