********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/454/454.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37030                    1.0000    500  37354                    1.0000    500
38549                    1.0000    500  29561                    1.0000    500
54187                    1.0000    500  44753                    1.0000    500
27077                    1.0000    500  12405                    1.0000    500
33082                    1.0000    500  36323                    1.0000    500
49318                    1.0000    500  38429                    1.0000    500
40830                    1.0000    500  43782                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/454/454.seqs.fa -oc motifs/454 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.276 C 0.236 G 0.214 T 0.275
Background letter frequencies (from dataset with add-one prior applied):
A 0.276 C 0.236 G 0.214 T 0.275
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  19  sites =   6  llr = 103  E-value = 2.5e+001
********************************************************************************
--------------------------------------------------------------------------------
    Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  87:7:a885::::22::::
pos.-specific     C  ::5:2::2523:3332::a
probability       G  :35:8:2::8:87:3::a:
matrix            T  2::3::::::72:528a::

         bits    2.2                  *
                 2.0                  **
                 1.8      *          ***
                 1.6     **   * *    ***
Relative         1.3     **** * **  ****
Entropy          1.1 *** **** ****  ****
(24.7 bits)      0.9 *************  ****
                 0.7 *************  ****
                 0.4 ************** ****
                 0.2 *******************
                 0.0 -------------------

Multilevel           AACAGAAAAGTGGTCTTGC
consensus             GGT    C C CCG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
40830                       430  3.33e-10 CAACAACAAC AACAGAAAAGTGCCGTTGC GTCCTTTACC
29561                       421  6.72e-10 GCAGCGAAGA AACAGAAAAGCGCTCTTGC CGTCCTAGGC
49318                        61  8.16e-09 CCTCCATCAG AAGAGAAACCCGGACTTGC TGTTATGGCG
43782                       336  1.21e-08 AAGAGACACA AGGTCAAACGTGGTTTTGC TCCACGGTCT
12405                        61  2.17e-08 GAGTTAACTG TACAGAAACGTTGTATTGC ATGAAAGACT
54187                       103  5.72e-08 TGCGTGACCA AGGTGAGCAGTGGCGCTGC GGGACGCGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40830                             3.3e-10  429_[+1]_52
29561                             6.7e-10  420_[+1]_61
49318                             8.2e-09  60_[+1]_421
43782                             1.2e-08  335_[+1]_146
12405                             2.2e-08  60_[+1]_421
54187                             5.7e-08  102_[+1]_379
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=6
40830                    (  430) AACAGAAAAGTGCCGTTGC  1
29561                    (  421) AACAGAAAAGCGCTCTTGC  1
49318                    (   61) AAGAGAAACCCGGACTTGC  1
43782                    (  336) AGGTCAAACGTGGTTTTGC  1
12405                    (   61) TACAGAAACGTTGTATTGC  1
54187                    (  103) AGGTGAGCAGTGGCGCTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 6748 bayes= 10.5818 E= 2.5e+001
   159   -923   -923    -72
   127   -923     64   -923
  -923    108    122   -923
   127   -923   -923     28
  -923    -50    196   -923
   186   -923   -923   -923
   159   -923    -36   -923
   159    -50   -923   -923
    86    108   -923   -923
  -923    -50    196   -923
  -923     50   -923    128
  -923   -923    196    -72
  -923     50    164   -923
   -72     50   -923     86
   -72     50     64    -72
  -923    -50   -923    160
  -923   -923   -923    186
  -923   -923    222   -923
  -923    208   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 6 E= 2.5e+001
 0.833333  0.000000  0.000000  0.166667
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.333333  0.666667  0.000000
 0.166667  0.333333  0.000000  0.500000
 0.166667  0.333333  0.333333  0.166667
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 regular expression
--------------------------------------------------------------------------------
A[AG][CG][AT]GAAA[AC]G[TC]G[GC][TC][CG]TTGC
--------------------------------------------------------------------------------

Time  1.83 secs.

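
The nonzero entries of the Motif 1 log-odds matrix appear to be the corresponding letter probabilities divided by the background frequencies, taken as log base 2, scaled by 100 and rounded. The following Python sketch checks that relationship on values copied from this report. The function name `log_odds_score` is hypothetical, and the handling of zero-probability entries (reported here as -923) is deliberately left out, since it depends on pseudocounts that this file does not spell out.

```python
import math

# Background frequencies copied from the "Background letter frequencies" line.
BACKGROUND = {"A": 0.276, "C": 0.236, "G": 0.214, "T": 0.275}


def log_odds_score(p, letter, background=BACKGROUND):
    """Return round(100 * log2(p / background[letter])), or None when p is 0.

    Zero probabilities are skipped on purpose: the large negative values in
    the report come from a prior that this sketch does not try to reproduce.
    """
    if p <= 0.0:
        return None
    return round(100 * math.log2(p / background[letter]))


# First row of the Motif 1 letter-probability matrix (columns A, C, G, T).
row = dict(zip("ACGT", [0.833333, 0.000000, 0.000000, 0.166667]))
scores = {letter: log_odds_score(p, letter) for letter, p in row.items()}
print(scores)  # {'A': 159, 'C': None, 'G': None, 'T': -72}
```

The values 159 and -72 agree with the first row of the Motif 1 log-odds matrix. Because the background frequencies are printed to three decimals only, an occasional off-by-one difference on other rows would not be surprising.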
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   6  llr = 76  E-value = 6.9e+002
********************************************************************************
--------------------------------------------------------------------------------
    Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::38a22::2
pos.-specific     C  ::7:7:::7aa:
probability       G  aa3a:2:2:::5
matrix            T  :::::::72::3

         bits    2.2 ** *
                 2.0 ** *     **
                 1.8 ** *  *  **
                 1.6 ** *  *  **
Relative         1.3 ** * **  **
Entropy          1.1 *******  **
(18.3 bits)      0.9 *******  **
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GGCGCAATCCCG
consensus              G A      T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44753                       273  2.03e-07 TAGAAGCTGT GGGGCAATCCCT TTCACGAGCC
37030                       129  3.01e-07 TCAAAAAATA GGCGCAAACCCG CCAGAAGGAA
29561                       266  3.84e-07 AACAGTGAGA GGCGAAATCCCT CGCAAATTGG
27077                       483  4.59e-07 ATGCTCTGGT GGCGCAATACCG ACCACG
43782                        65  2.60e-06 TCGGTCGCAA GGCGAGAGCCCG GCTAAACGAA
49318                         8  2.94e-06    AAATTTG GGGGCAATTCCA ATTTACATAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44753                               2e-07  272_[+2]_216
37030                               3e-07  128_[+2]_360
29561                             3.8e-07  265_[+2]_223
27077                             4.6e-07  482_[+2]_6
43782                             2.6e-06  64_[+2]_424
49318                             2.9e-06  7_[+2]_481
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=6
44753                    (  273) GGGGCAATCCCT  1
37030                    (  129) GGCGCAAACCCG  1
29561                    (  266) GGCGAAATCCCT  1
27077                    (  483) GGCGCAATACCG  1
43782                    (   65) GGCGAGAGCCCG  1
49318                    (    8) GGGGCAATTCCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.81344 E= 6.9e+002
  -923   -923    222   -923
  -923   -923    222   -923
  -923    150     64   -923
  -923   -923    222   -923
    27    150   -923   -923
   159   -923    -36   -923
   186   -923   -923   -923
   -72   -923    -36    128
   -72    150   -923    -72
  -923    208   -923   -923
  -923    208   -923   -923
   -72   -923    122     28
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 6.9e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.166667  0.666667
 0.166667  0.666667  0.000000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.000000  0.500000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 regular expression
--------------------------------------------------------------------------------
GG[CG]G[CA]AATCCC[GT]
--------------------------------------------------------------------------------

Time  3.45 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  20  sites =   5  llr = 89  E-value = 2.2e+003
********************************************************************************
--------------------------------------------------------------------------------
    Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :4a4::66222:86::::4:
pos.-specific     C  :::2:8:46::::::::82:
probability       G  a::2a22:2:84::a::22:
matrix            T  :6:2::2::8:624:aa:2a

         bits    2.2 *   *         *
                 2.0 *   *         *
                 1.8 * * *         ***  *
                 1.6 * * *         ***  *
Relative         1.3 * * **    *   **** *
Entropy          1.1 * * **   **** **** *
(25.8 bits)      0.9 *** ** * ********* *
                 0.7 *** ************** *
                 0.4 *** ************** *
                 0.2 *** ************** *
                 0.0 --------------------

Multilevel           GTAAGCAACTGTAAGTTCAT
consensus             A C GGCAAAGTT   GC
sequence                G  T G         G
                        T              T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
33082                        56  1.87e-10 ACAACTGGCA GTATGCACCTGTAAGTTCCT TCGTCCCACA
43782                       203  3.49e-10 AGACAAAGAG GAAAGCGACTGGATGTTCAT TGGGGCGAAA
38549                       368  9.12e-10 CTTGGCGAAC GTAAGCTAATGTAAGTTCAT GCTTTCGCCT
27077                       208  2.47e-08 TATTTTTTCC GAAGGGAACAGTAAGTTGTT TCAAGCGAAG
54187                       480  2.80e-08 CATCCTTAGG GTACGCACGTAGTTGTTCGT C
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33082                             1.9e-10  55_[+3]_425
43782                             3.5e-10  202_[+3]_278
38549                             9.1e-10  367_[+3]_113
27077                             2.5e-08  207_[+3]_273
54187                             2.8e-08  479_[+3]_1
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=5
33082                    (   56) GTATGCACCTGTAAGTTCCT  1
43782                    (  203) GAAAGCGACTGGATGTTCAT  1
38549                    (  368) GTAAGCTAATGTAAGTTCAT  1
27077                    (  208) GAAGGGAACAGTAAGTTGTT  1
54187                    (  480) GTACGCACGTAGTTGTTCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6734 bayes= 10.646 E= 2.2e+003
  -897   -897    222   -897
    54   -897   -897    113
   186   -897   -897   -897
    54    -24    -10    -46
  -897   -897    222   -897
  -897    176    -10   -897
   112   -897    -10    -46
   112     76   -897   -897
   -46    135    -10   -897
   -46   -897   -897    154
   -46   -897    190   -897
  -897   -897     90    113
   154   -897   -897    -46
   112   -897   -897     54
  -897   -897    222   -897
  -897   -897   -897    186
  -897   -897   -897    186
  -897    176    -10   -897
    54    -24    -10    -46
  -897   -897   -897    186
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 2.2e+003
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.000000  0.600000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.600000  0.000000  0.200000  0.200000
 0.600000  0.400000  0.000000  0.000000
 0.200000  0.600000  0.200000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.000000  0.400000  0.600000
 0.800000  0.000000  0.000000  0.200000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.800000  0.200000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 regular expression
--------------------------------------------------------------------------------
G[TA]A[ACGT]G[CG][AGT][AC][CAG][TA][GA][TG][AT][AT]GTT[CG][ACGT]T
--------------------------------------------------------------------------------

Time  5.04 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
    Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37030                            1.85e-04  86_[+3(3.23e-05)]_22_[+2(3.01e-07)]_\
    360
37354                            7.07e-01  500
38549                            4.33e-05  367_[+3(9.12e-10)]_113
29561                            1.31e-08  265_[+2(3.84e-07)]_143_\
    [+1(6.72e-10)]_61
54187                            1.95e-08  102_[+1(5.72e-08)]_358_\
    [+3(2.80e-08)]_1
44753                            3.58e-03  272_[+2(2.03e-07)]_216
27077                            6.51e-08  207_[+3(2.47e-08)]_255_\
    [+2(4.59e-07)]_6
12405                            6.30e-04  60_[+1(2.17e-08)]_421
33082                            1.72e-06  55_[+3(1.87e-10)]_425
36323                            6.25e-01  500
49318                            1.06e-06  7_[+2(2.94e-06)]_41_[+1(8.16e-09)]_\
    421
38429                            5.78e-01  500
40830                            5.58e-07  429_[+1(3.33e-10)]_52
43782                            7.80e-13  64_[+2(2.60e-06)]_126_\
    [+3(3.49e-10)]_113_[+1(1.21e-08)]_146
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
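
The motif diagrams in the summary alternate gap lengths with bracketed motif occurrences, so the 1-based start of each site can be recovered by walking the string and adding up gaps and motif widths. The Python sketch below does this for the diagram reported for sequence 43782. The names `MOTIF_WIDTHS`, `TOKEN` and `parse_diagram` are hypothetical, the widths are copied from the three `MOTIF` header lines, and the token pattern is an assumption based only on the diagrams that appear in this file.

```python
import re

# Motif widths copied from the "MOTIF n ... width =" header lines of this report.
MOTIF_WIDTHS = {1: 19, 2: 12, 3: 20}

# Either a bracketed site such as [+3(3.49e-10)] or [+1], or a bare gap length.
# This pattern is an assumption derived from the diagrams in this file.
TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for a block diagram string.

    Each site is (motif_number, strand, one_based_start, p_value_or_None).
    Backslashes used as line continuations in the report are ignored.
    """
    position = 0  # number of sequence positions consumed so far
    sites = []
    for match in TOKEN.finditer(diagram.replace("\\", "")):
        strand, motif, p_value, gap = match.groups()
        if gap is not None:
            position += int(gap)
        else:
            motif = int(motif)
            p = float(p_value) if p_value else None
            sites.append((motif, strand, position + 1, p))
            position += widths[motif]
    return sites, position


diagram = "64_[+2(2.60e-06)]_126_[+3(3.49e-10)]_113_[+1(1.21e-08)]_146"
sites, length = parse_diagram(diagram)
for motif, strand, start, p in sites:
    print(motif, strand, start, p)
print("sequence length:", length)  # sequence length: 500
```

For this diagram the recovered starts are 65, 203 and 336, which agree with the starts listed for sequence 43782 in the per-motif site tables, and the total of 500 matches the sequence length in the training set.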