********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/287/287.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
50695                    1.0000    500  47712                    1.0000    500
47778                    1.0000    500  22367                    1.0000    500
43299                    1.0000    500  49189                    1.0000    500
50182                    1.0000    500  50309                    1.0000    500
50428                    1.0000    500  44468                    1.0000    500
45154                    1.0000    500  48439                    1.0000    500
47661                    1.0000    500  39877                    1.0000    500
49254                    1.0000    500  49872                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/287/287.seqs.fa -oc motifs/287 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.263 C 0.238 G 0.229 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.238 G 0.229 T 0.271
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 15 sites = 10 llr = 123 E-value = 3.9e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  6741794:9:2::69
pos.-specific     C  :2:2::1::a81::1
probability       G  2154315:1::991:
matrix            T  2:13:::a::::13:

         bits    2.1          *
                 1.9        * *
                 1.7        * * **
                 1.5      * *** ** *
Relative         1.3      * ****** *
Entropy          1.1     ** ****** *
(17.7 bits)      0.9  *  ** ****** *
                 0.6 *** ***********
                 0.4 *** ***********
                 0.2 ***************
                 0.0 ---------------

Multilevel           AAGGAAGTACCGGAA
consensus            GCATG A   A  T
sequence             T  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
47661                        78  3.38e-08 ACCATATTCA AAGCAAGTACCGGTA CCCGCGATTT
49872                        46  1.28e-07 GTAGCAGTGT AGGTAAGTACCGGAA ACAGGCATTC
50695                        71  2.85e-07 GAAACACCTG GAATGAATACCGGAA ACGAAAAATG
50428                       237  3.20e-07 GCTCAGTTTG TAACAAGTACCGGTA GTGAGAGTCC
44468                       179  3.48e-07 ACTCTCGTTA ACATAAATACCGGTA TACCTGGCAA
43299                       302  6.90e-07 TTGCTCAAAC TATGGAGTACCGGAA AAACGCACAA
47778                       310  2.15e-06 GCAACCTCGC AAAGAAGTACCGTAC CGGTCGTACA
39877                       170  3.49e-06 TCGACAACGG AAGGAACTGCCGGGA AATGTCGTGG
45154                       314  5.10e-06 GCTTTTGCGG ACGGGAATACACGAA ATTTTGAGCT
48439                       384  7.19e-06 AAGCACGGAA GAGAAGATACAGGAA AGCTCACTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47661                             3.4e-08  77_[+1]_408
49872                             1.3e-07  45_[+1]_440
50695                             2.9e-07  70_[+1]_415
50428                             3.2e-07  236_[+1]_249
44468                             3.5e-07  178_[+1]_307
43299                             6.9e-07  301_[+1]_184
47778                             2.2e-06  309_[+1]_176
39877                             3.5e-06  169_[+1]_316
45154                             5.1e-06  313_[+1]_172
48439                             7.2e-06  383_[+1]_102
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=10
47661                    (   78) AAGCAAGTACCGGTA  1
49872                    (   46) AGGTAAGTACCGGAA  1
50695                    (   71) GAATGAATACCGGAA  1
50428                    (  237) TAACAAGTACCGGTA  1
44468                    (  179) ACATAAATACCGGTA  1
43299                    (  302) TATGGAGTACCGGAA  1
47778                    (  310) AAAGAAGTACCGTAC  1
39877                    (  170) AAGGAACTGCCGGGA  1
45154                    (  314) ACGGGAATACACGAA  1
48439                    (  384) GAGAAGATACAGGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7776 bayes= 9.85286 E= 3.9e+000
   119   -997    -19    -44
   141    -25   -119   -997
    61   -997    113   -144
  -139    -25     81     15
   141   -997     39   -997
   178   -997   -119   -997
    61   -125    113   -997
  -997   -997   -997    188
   178   -997   -119   -997
  -997    207   -997   -997
   -39    175   -997   -997
  -997   -125    198   -997
  -997   -997    198   -144
   119   -997   -119     15
   178   -125   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 10 E= 3.9e+000
 0.600000  0.000000  0.200000  0.200000
 0.700000  0.200000  0.100000  0.000000
 0.400000  0.000000  0.500000  0.100000
 0.100000  0.200000  0.400000  0.300000
 0.700000  0.000000  0.300000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.400000  0.100000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.900000  0.000000  0.100000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.100000  0.900000  0.000000
 0.000000  0.000000  0.900000  0.100000
 0.600000  0.000000  0.100000  0.300000
 0.900000  0.100000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[AGT][AC][GA][GTC][AG]A[GA]TAC[CA]GG[AT]A
--------------------------------------------------------------------------------




Time  2.37 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 18 sites = 6 llr = 98 E-value = 6.9e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a:a27772::5:22::2
pos.-specific     C  2:2::::22a:2:25a::
probability       G  8:8:833:7:7:5:3:a8
matrix            T  :::::::2::3357::::

         bits    2.1          *     **
                 1.9  * *     *     **
                 1.7  * *     *     **
                 1.5 *****    *     ***
Relative         1.3 *****    *     ***
Entropy          1.1 *******  ** *  ***
(23.5 bits)      0.9 ******* *** *  ***
                 0.6 *********** ******
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           GAGAGAAAGCGAGTCCGG
consensus                 GG   TTT G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
48439                        61  2.32e-10 GTCTAGCCTA GAGAGAGAGCGTTTCCGG GATTTTACCA
50182                       101  4.16e-10 CCAAATGAGA GAGAGAAAGCTATTGCGG TGAAGAGTCG
45154                       276  1.39e-08 CTGGATTGGT GAGAGGAACCGAGTCCGA CGCCGGGATC
22367                         2  2.92e-08          G GAGAGGGTACGTGTCCGG GGTGTCTGTA
47778                       214  1.14e-07 TCAGACGGCT GACAAAAAGCGCGAGCGG TAATAGCTTC
43299                       452  1.40e-07 ACTGCCGGGA CAGAGAACGCTATCACGG TAGTCCTCCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48439                             2.3e-10  60_[+2]_422
50182                             4.2e-10  100_[+2]_382
45154                             1.4e-08  275_[+2]_207
22367                             2.9e-08  1_[+2]_481
47778                             1.1e-07  213_[+2]_269
43299                             1.4e-07  451_[+2]_31
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=6
48439                    (   61) GAGAGAGAGCGTTTCCGG  1
50182                    (  101) GAGAGAAAGCTATTGCGG  1
45154                    (  276) GAGAGGAACCGAGTCCGA  1
22367                    (    2) GAGAGGGTACGTGTCCGG  1
47778                    (  214) GACAAAAAGCGCGAGCGG  1
43299                    (  452) CAGAGAACGCTATCACGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 7728 bayes= 9.98846 E= 6.9e+001
  -923    -51    186   -923
   193   -923   -923   -923
  -923    -51    186   -923
   193   -923   -923   -923
   -65   -923    186   -923
   134   -923     54   -923
   134   -923     54   -923
   134    -51   -923    -70
   -65    -51    154   -923
  -923    207   -923   -923
  -923   -923    154     30
    93    -51   -923     30
  -923   -923    113     88
   -65    -51   -923    130
   -65    107     54   -923
  -923    207   -923   -923
  -923   -923    213   -923
   -65   -923    186   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 6 E= 6.9e+001
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.666667  0.166667  0.000000  0.166667
 0.166667  0.166667  0.666667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.500000  0.166667  0.000000  0.333333
 0.000000  0.000000  0.500000  0.500000
 0.166667  0.166667  0.000000  0.666667
 0.166667  0.500000  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
GAGAG[AG][AG]AGC[GT][AT][GT]T[CG]CGG
--------------------------------------------------------------------------------




Time  4.48 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 11 llr = 131 E-value = 2.1e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  aa8457584561a322
pos.-specific     C  :::62:524:19:147
probability       G  ::2:1:::15:::15:
matrix            T  ::::23::2:3::5:1

         bits    2.1
                 1.9 **          *
                 1.7 **         **
                 1.5 **         **
Relative         1.3 ***    *   **
Entropy          1.1 **** *** * **
(17.1 bits)      0.9 **** *** * **  *
                 0.6 **** *** **** **
                 0.4 **** *** **** **
                 0.2 ****************
                 0.0 ----------------

Multilevel           AAACAACAAGACATGC
consensus               A TA CAT  AC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
50428                        84  1.28e-08 ACGCTATTGC AAACAACATGACATCC GATGAAAGAT
49872                       117  1.58e-07 CGACTAGGGG AAAAAACCCGACATCC CAAACCCCCC
49254                       256  4.35e-07 CTATTAGCAT AAACCACAAAACACGC TGAATTTGAT
22367                       206  4.80e-07 TTCCCGGGAC AAACCAAACATCAAGC CCTACCCCTA
50182                        57  1.02e-06 CACTTCCTAC AAACTTAAAATCATGC ATACCCTGAT
49189                        32  1.61e-06 ACCCACAGAA AAAAAAACAGACAAAC CAAACCGACC
50309                       315  3.37e-06 ATTTCAGGAG AAACAAAAGGACAGCA ACCTCGTAGG
39877                       447  3.61e-06 GTATAGGTAA AAACATCATGTCATGT GTGCCAGGCT
45154                       354  4.79e-06 CACACTTGTC AAGAATCACGCCATCC TCCTCCGCAA
43299                       102  5.86e-06 TTGTTACCGG AAGCTAAACAACATAA TGTAAATTTG
48439                       362  7.07e-06 AGTCTTGGGG AAAAGACAAAAAAAGC ACGGAAGAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50428                             1.3e-08  83_[+3]_401
49872                             1.6e-07  116_[+3]_368
49254                             4.4e-07  255_[+3]_229
22367                             4.8e-07  205_[+3]_279
50182                               1e-06  56_[+3]_428
49189                             1.6e-06  31_[+3]_453
50309                             3.4e-06  314_[+3]_170
39877                             3.6e-06  446_[+3]_38
45154                             4.8e-06  353_[+3]_131
43299                             5.9e-06  101_[+3]_383
48439                             7.1e-06  361_[+3]_123
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=11
50428                    (   84) AAACAACATGACATCC  1
49872                    (  117) AAAAAACCCGACATCC  1
49254                    (  256) AAACCACAAAACACGC  1
22367                    (  206) AAACCAAACATCAAGC  1
50182                    (   57) AAACTTAAAATCATGC  1
49189                    (   32) AAAAAAACAGACAAAC  1
50309                    (  315) AAACAAAAGGACAGCA  1
39877                    (  447) AAACATCATGTCATGT  1
45154                    (  354) AAGAATCACGCCATCC  1
43299                    (  102) AAGCTAAACAACATAA  1
48439                    (  362) AAAAGACAAAAAAAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 9.8159 E= 2.1e+002
   193  -1010  -1010  -1010
   193  -1010  -1010  -1010
   164  -1010    -33  -1010
    47    142  -1010  -1010
   105    -39   -133    -58
   147  -1010  -1010      1
    79    120  -1010  -1010
   164    -39  -1010  -1010
    47     61   -133    -58
    79  -1010    125  -1010
   128   -138  -1010      1
  -153    193  -1010  -1010
   193  -1010  -1010  -1010
     5   -138   -133    101
   -53     61     99  -1010
   -53    161  -1010   -157
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 2.1e+002
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.818182  0.000000  0.181818  0.000000
 0.363636  0.636364  0.000000  0.000000
 0.545455  0.181818  0.090909  0.181818
 0.727273  0.000000  0.000000  0.272727
 0.454545  0.545455  0.000000  0.000000
 0.818182  0.181818  0.000000  0.000000
 0.363636  0.363636  0.090909  0.181818
 0.454545  0.000000  0.545455  0.000000
 0.636364  0.090909  0.000000  0.272727
 0.090909  0.909091  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.272727  0.090909  0.090909  0.545455
 0.181818  0.363636  0.454545  0.000000
 0.181818  0.727273  0.000000  0.090909
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
AAA[CA]A[AT][CA]A[AC][GA][AT]CA[TA][GC]C
--------------------------------------------------------------------------------




Time  6.91 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50695                            4.30e-04  70_[+1(2.85e-07)]_415
47712                            8.94e-01  500
47778                            1.24e-06  213_[+2(1.14e-07)]_78_\
    [+1(2.15e-06)]_176
22367                            2.26e-07  1_[+2(2.92e-08)]_23_[+2(6.43e-05)]_\
    145_[+3(4.80e-07)]_279
43299                            1.93e-08  101_[+3(5.86e-06)]_184_\
    [+1(6.90e-07)]_135_[+2(1.40e-07)]_31
49189                            2.15e-03  31_[+3(1.61e-06)]_453
50182                            1.16e-08  56_[+3(1.02e-06)]_28_[+2(4.16e-10)]_\
    382
50309                            8.00e-03  314_[+3(3.37e-06)]_170
50428                            1.65e-07  83_[+3(1.28e-08)]_137_\
    [+1(3.20e-07)]_249
44468                            3.30e-03  178_[+1(3.48e-07)]_307
45154                            1.20e-08  275_[+2(1.39e-08)]_20_\
    [+1(5.10e-06)]_25_[+3(4.79e-06)]_131
48439                            5.38e-10  60_[+2(2.32e-10)]_283_\
    [+3(7.07e-06)]_6_[+1(7.19e-06)]_102
47661                            6.54e-04  77_[+1(3.38e-08)]_408
39877                            1.01e-04  169_[+1(3.49e-06)]_262_\
    [+3(3.61e-06)]_38
49254                            2.93e-03  255_[+3(4.35e-07)]_229
49872                            2.88e-07  45_[+1(1.28e-07)]_56_[+3(1.58e-07)]_\
    368
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************