********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/479/479.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
36581                    1.0000    500  48535                    1.0000    500
39557                    1.0000    500  40022                    1.0000    500
50519                    1.0000    500  33648                    1.0000    500
54330                    1.0000    500  32984                    1.0000    500
40534                    1.0000    500  46821                    1.0000    500
34440                    1.0000    500  38113                    1.0000    500
45721                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/479/479.seqs.fa -oc motifs/479 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.270 C 0.232 G 0.214 T 0.284
Background letter frequencies (from dataset with add-one prior applied):
A 0.270 C 0.232 G 0.214 T 0.284
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  21  sites =   6  llr = 105  E-value = 1.1e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a57::22aa82::a25:3:::
pos.-specific     C  :::52::::2238:252:3:2
probability       G  :535788:::752:3:8352:
matrix            T  ::::2::::::2::3::3288

         bits    2.2
                 2.0 *      **    *
                 1.8 *      **    *
                 1.6 *    ****   **  *
Relative         1.3 *    *****  **  *  *
Entropy          1.1 **** *****  **  *  **
(25.2 bits)      0.9 *********** ** **  **
                 0.7 ************** ** ***
                 0.4 ************** ******
                 0.2 ************** ******
                 0.0 ---------------------

Multilevel           AAACGGGAAAGGCAGAGAGTT
consensus             GGG       C  TC GC
sequence                              T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
46821                       158  4.90e-11 GGAGTTCGCC AGACGGGAAAGTCAGAGGGTT GGCTTTTGGC
48535                       354  9.15e-10 CTGTTACGGT AGGGGGGAAAGGCACCGTCGT CTTCGTACCA
40534                        10  2.28e-09  GGAGGGGCC AAAGTAGAAAGGCATCGGGTT GCACCAAGAC
40022                       125  4.31e-09 ATGTAAATGT AGACGGAAAAACCATCGAGTT CGCAATGAAA
36581                       284  2.55e-08 GGATAAATAG AAGGGGGAAACCGAGAGACTC CTTCTGCAAA
32984                       445  5.60e-08 CTTTCGTACA AAACCGGAACGGCAAACTTTT TTGAAGGTTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46821                             4.9e-11  157_[+1]_322
48535                             9.2e-10  353_[+1]_126
40534                             2.3e-09  9_[+1]_470
40022                             4.3e-09  124_[+1]_355
36581                             2.5e-08  283_[+1]_196
32984                             5.6e-08  444_[+1]_35
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
46821                    (  158) AGACGGGAAAGTCAGAGGGTT  1
48535                    (  354) AGGGGGGAAAGGCACCGTCGT  1
40534                    (   10) AAAGTAGAAAGGCATCGGGTT  1
40022                    (  125) AGACGGAAAAACCATCGAGTT  1
36581                    (  284) AAGGGGGAAACCGAGAGACTC  1
32984                    (  445) AAACCGGAACGGCAAACTTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6240 bayes= 10.4688 E= 1.1e+002
   189   -923   -923   -923
    89   -923    122   -923
   130   -923     64   -923
  -923    111    122   -923
  -923    -47    164    -77
   -69   -923    196   -923
   -69   -923    196   -923
   189   -923   -923   -923
   189   -923   -923   -923
   163    -47   -923   -923
   -69    -47    164   -923
  -923     52    122    -77
  -923    184    -36   -923
   189   -923   -923   -923
   -69    -47     64     23
    89    111   -923   -923
  -923    -47    196   -923
    30   -923     64     23
  -923     52    122    -77
  -923   -923    -36    155
  -923    -47   -923    155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.1e+002
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.166667  0.666667  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.166667  0.000000  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.166667  0.166667  0.666667  0.000000
 0.000000  0.333333  0.500000  0.166667
 0.000000  0.833333  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.166667  0.333333  0.333333
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.333333  0.000000  0.333333  0.333333
 0.000000  0.333333  0.500000  0.166667
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.166667  0.000000  0.833333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
A[AG][AG][CG]GGGAAAG[GC]CA[GT][AC]G[AGT][GC]TT
--------------------------------------------------------------------------------




Time  1.78 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  12  sites =   4  llr = 57  E-value = 6.6e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::a:3::3::
pos.-specific     C  :::3:8::8:::
probability       G  :a:8::8:38aa
matrix            T  a:a::3:a::::

         bits    2.2  *        **
                 2.0  *  *     **
                 1.8 *** *  *  **
                 1.6 *** *  *  **
Relative         1.3 ************
Entropy          1.1 ************
(20.6 bits)      0.9 ************
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGTGACGTCGGG
consensus               C TA GA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
38113                       307  9.61e-08 CCGGTGTTGG TGTCACGTCGGG ATGGCGCCAA
39557                        65  2.16e-07 GGTTGGTGAC TGTGACATCGGG AGTAACCAAC
48535                       162  2.16e-07 ATCTGGGCTT TGTGACGTCAGG TCTCATTTTT
46821                       209  3.58e-07 CAGATTGGAT TGTGATGTGGGG TTTCCATTAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38113                             9.6e-08  306_[+2]_182
39557                             2.2e-07  64_[+2]_424
48535                             2.2e-07  161_[+2]_327
46821                             3.6e-07  208_[+2]_280
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=4
38113                    (  307) TGTCACGTCGGG  1
39557                    (   65) TGTGACATCGGG  1
48535                    (  162) TGTGACGTCAGG  1
46821                    (  209) TGTGATGTGGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 10.6332 E= 6.6e+002
  -865   -865   -865    181
  -865   -865    222   -865
  -865   -865   -865    181
  -865     11    181   -865
   189   -865   -865   -865
  -865    169   -865    -18
   -11   -865    181   -865
  -865   -865   -865    181
  -865    169     22   -865
   -11   -865    181   -865
  -865   -865    222   -865
  -865   -865    222   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 4 E= 6.6e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
TGT[GC]A[CT][GA]T[CG][GA]GG
--------------------------------------------------------------------------------




Time  3.55 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  12  sites =   8  llr = 92  E-value = 3.6e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1::91a:::55:
pos.-specific     C  1:4:9:::654:
probability       G  166:::a:4:1a
matrix            T  64:1:::a::::

         bits    2.2       *    *
                 2.0      **    *
                 1.8      ***   *
                 1.6     ****   *
Relative         1.3   ******   *
Entropy          1.1  ********  *
(16.5 bits)      0.9  ********* *
                 0.7  ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGGACAGTCAAG
consensus             TC     GCC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
40534                       257  1.26e-07 GATACAATAG TGGACAGTCCCG AACACGCAAT
33648                       261  6.70e-07 TCGCGGTATC TTGACAGTCAAG GTAGATGGAG
34440                       256  2.34e-06 TCGGTTGCTA GGGACAGTGCCG TGAGGGATAC
32984                        90  2.47e-06 ATATACTCAA AGGACAGTGCAG CACGTAACTT
54330                       138  3.42e-06 AAAGATTTAT TGGTCAGTCAAG TCGACCGTCT
48535                       112  5.24e-06 CGATAAGCAA TTCACAGTCAGG CTTTTCCACC
46821                       473  7.44e-06 GGAAGGACTT CTCACAGTCACG AGACGTTTTC
40022                       322  7.44e-06 ATGAAAGGCA TGCAAAGTGCAG GGAAGACACG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40534                             1.3e-07  256_[+3]_232
33648                             6.7e-07  260_[+3]_228
34440                             2.3e-06  255_[+3]_233
32984                             2.5e-06  89_[+3]_399
54330                             3.4e-06  137_[+3]_351
48535                             5.2e-06  111_[+3]_377
46821                             7.4e-06  472_[+3]_16
40022                             7.4e-06  321_[+3]_167
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
40534                    (  257) TGGACAGTCCCG  1
33648                    (  261) TTGACAGTCAAG  1
34440                    (  256) GGGACAGTGCCG  1
32984                    (   90) AGGACAGTGCAG  1
54330                    (  138) TGGTCAGTCAAG  1
48535                    (  112) TTCACAGTCAGG  1
46821                    (  473) CTCACAGTCACG  1
40022                    (  322) TGCAAAGTGCAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 10.37 E= 3.6e+002
  -111    -89    -78    114
  -965   -965    154     40
  -965     69    154   -965
   170   -965   -965   -118
  -111    192   -965   -965
   189   -965   -965   -965
  -965   -965    222   -965
  -965   -965   -965    181
  -965    143     81   -965
    89    111   -965   -965
    89     69    -78   -965
  -965   -965    222   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 3.6e+002
 0.125000  0.125000  0.125000  0.625000
 0.000000  0.000000  0.625000  0.375000
 0.000000  0.375000  0.625000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.125000  0.875000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.625000  0.375000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.500000  0.375000  0.125000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
T[GT][GC]ACAGT[CG][AC][AC]G
--------------------------------------------------------------------------------




Time  5.17 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36581                            1.12e-04  283_[+1(2.55e-08)]_196
48535                            5.62e-11  111_[+3(5.24e-06)]_38_\
    [+2(2.16e-07)]_180_[+1(9.15e-10)]_126
39557                            8.60e-04  64_[+2(2.16e-07)]_424
40022                            3.48e-07  124_[+1(4.31e-09)]_176_\
    [+3(7.44e-06)]_167
50519                            7.15e-01  500
33648                            4.48e-03  260_[+3(6.70e-07)]_228
54330                            1.30e-02  137_[+3(3.42e-06)]_351
32984                            1.23e-06  89_[+3(2.47e-06)]_343_\
    [+1(5.60e-08)]_35
40534                            3.83e-09  9_[+1(2.28e-09)]_226_[+3(1.26e-07)]_\
    232
46821                            8.07e-12  157_[+1(4.90e-11)]_30_\
    [+2(3.58e-07)]_252_[+3(7.44e-06)]_16
34440                            1.38e-03  255_[+3(2.34e-06)]_233
38113                            1.81e-03  306_[+2(9.61e-08)]_182
45721                            6.43e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************