********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/273/273.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
34027                    1.0000    500  26565                    1.0000    500
34576                    1.0000    500  46056                    1.0000    500
31534                    1.0000    500  31537                    1.0000    500
48910                    1.0000    500  40038                    1.0000    500
48172                    1.0000    500  44462                    1.0000    500
44506                    1.0000    500  37225                    1.0000    500
37456                    1.0000    500  38978                    1.0000    500
35444                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/273/273.seqs.fa -oc motifs/273 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.288 C 0.219 G 0.231 T 0.261
Background letter frequencies (from dataset with add-one prior applied):
A 0.288 C 0.219 G 0.231 T 0.261
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   6  llr = 108  E-value = 6.5e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :2:728:3a:5:::277:8:
pos.-specific     C  :852::a::323:::::8:7
probability       G  a:3272:7:7:7::83:2:3
matrix            T  ::2:2:::::3:aa::3:2:

         bits    2.2 *     *
                 2.0 *     *     **
                 1.8 *     * *   **
                 1.5 **    * *   **   *
Relative         1.3 **    * ** ****  * *
Entropy          1.1 **   ***** ****  ***
(26.0 bits)      0.9 **  ****** *********
                 0.7 ********** *********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GCCAGACGAGAGTTGAACAC
consensus              G    A CTC   GT  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
34576                       396  4.60e-11 TTCGTTAGAA GCCAGACGAGTCTTGATCAC CTTGTGCTGT
37225                       348  2.32e-10 TTTGTTAGAA GCCAGACAAGTCTTGATCAC CTCGTGCTGT
37456                       382  9.74e-10 CATTTCTTCT GCGAGGCGACCGTTGAACAC TTCACACTTT
46056                       168  3.95e-09 GAGTTCGTTG GCTCGACAAGAGTTGGACAG AGATCGTGCC
44506                        99  1.50e-08 GTCGAACGTA GACATACGAGAGTTAAACAG TTGATTCTGT
48172                        50  3.69e-08 GGAACTGTCT GCGGAACGACAGTTGGAGTC GGGCAATAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34576                             4.6e-11  395_[+1]_85
37225                             2.3e-10  347_[+1]_133
37456                             9.7e-10  381_[+1]_99
46056                               4e-09  167_[+1]_313
44506                             1.5e-08  98_[+1]_382
48172                             3.7e-08  49_[+1]_431
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=6
34576                    (  396) GCCAGACGAGTCTTGATCAC  1
37225                    (  348) GCCAGACAAGTCTTGATCAC  1
37456                    (  382) GCGAGGCGACCGTTGAACAC  1
46056                    (  168) GCTCGACAAGAGTTGGACAG  1
44506                    (   99) GACATACGAGAGTTAAACAG  1
48172                    (   50) GCGGAACGACAGTTGGAGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7215 bayes= 9.88926 E= 6.5e+000
  -923   -923    211   -923
   -79    193   -923   -923
  -923    119     53    -65
   121    -39    -47   -923
   -79   -923    152    -65
   153   -923    -47   -923
  -923    219   -923   -923
    21   -923    152   -923
   179   -923   -923   -923
  -923     61    152   -923
    79    -39   -923     35
  -923     61    152   -923
  -923   -923   -923    194
  -923   -923   -923    194
   -79   -923    185   -923
   121   -923     53   -923
   121   -923   -923     35
  -923    193    -47   -923
   153   -923   -923    -65
  -923    160     53   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 6.5e+000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.500000  0.333333  0.166667
 0.666667  0.166667  0.166667  0.000000
 0.166667  0.000000  0.666667  0.166667
 0.833333  0.000000  0.166667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.500000  0.166667  0.000000  0.333333
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.000000  0.833333  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.833333  0.166667  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.666667  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GC[CG]AGAC[GA]A[GC][AT][GC]TTG[AG][AT]CA[CG]
--------------------------------------------------------------------------------


Time  2.33 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =  10  llr = 124  E-value = 2.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  8::2:22:2::2::11
pos.-specific     C  :1611:1521::::42
probability       G  ::2:183519::1a5:
matrix            T  29278:4:5:a89::7

         bits    2.2              *
                 2.0           *  *
                 1.8          **  *
                 1.5  *       ** **
Relative         1.3  *   *   ** **
Entropy          1.1 **  ** * *****
(18.0 bits)      0.9 ** *** * ***** *
                 0.7 ****** * *******
                 0.4 ****** * *******
                 0.2 ****************
                 0.0 ----------------

Multilevel           ATCTTGTCTGTTTGGT
consensus            T GA AGGA  A  CC
sequence               T   A C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
37225                        34  4.93e-09 GTCGAAACTA ATCTTGTGTGTTTGGC CATAAATCCC
34576                        55  1.02e-08 GCCAAAACTA ATCATGTGTGTTTGGT CACAGATTGC
35444                       370  2.33e-07 GAACCAATTG TTGTTGTCCGTTTGCT TGGTGTTCGT
44506                         6  7.34e-07      GTTGG ATCTTAGGAGTATGCT GTCCGAATGA
37456                       431  8.01e-07 GCCGCTCACT ATTTTGTCACTTTGGT GCGCAACGGC
44462                       223  1.65e-06 GATTGGCACA TTGATGCCTGTTTGCT AGTCCGATAG
38978                       192  1.92e-06 CGAGAAACGA ATCTTGAGCGTTGGAT GCCTATTCGA
26565                        54  2.75e-06 ACGCGACGAC ACCTTGGCTGTATGCA AAGACGGCGT
46056                       280  2.94e-06 AATATGGATG ATTTGGGGGGTTTGGC GCTCTGCCAT
34027                        39  3.37e-06 GGTAATCCTG ATCCCAACTGTTTGGT GTGCAACGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37225                             4.9e-09  33_[+2]_451
34576                               1e-08  54_[+2]_430
35444                             2.3e-07  369_[+2]_115
44506                             7.3e-07  5_[+2]_479
37456                               8e-07  430_[+2]_54
44462                             1.7e-06  222_[+2]_262
38978                             1.9e-06  191_[+2]_293
26565                             2.8e-06  53_[+2]_431
46056                             2.9e-06  279_[+2]_205
34027                             3.4e-06  38_[+2]_446
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=10
37225                    (   34) ATCTTGTGTGTTTGGC  1
34576                    (   55) ATCATGTGTGTTTGGT  1
35444                    (  370) TTGTTGTCCGTTTGCT  1
44506                    (    6) ATCTTAGGAGTATGCT  1
37456                    (  431) ATTTTGTCACTTTGGT  1
44462                    (  223) TTGATGCCTGTTTGCT  1
38978                    (  192) ATCTTGAGCGTTGGAT  1
26565                    (   54) ACCTTGGCTGTATGCA  1
46056                    (  280) ATTTGGGGGGTTTGGC  1
34027                    (   39) ATCCCAACTGTTTGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 9.75668 E= 2.0e+001
   147   -997   -997    -38
  -997   -113   -997    178
  -997    145    -21    -38
   -53   -113   -997    142
  -997   -113   -121    161
   -53   -997    179   -997
   -53   -113     37     61
  -997    119    111   -997
   -53    -13   -121     94
  -997   -113    196   -997
  -997   -997   -997    194
   -53   -997   -997    161
  -997   -997   -121    178
  -997   -997    211   -997
  -153     87    111   -997
  -153    -13   -997    142
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 2.0e+001
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.100000  0.000000  0.900000
 0.000000  0.600000  0.200000  0.200000
 0.200000  0.100000  0.000000  0.700000
 0.000000  0.100000  0.100000  0.800000
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.100000  0.300000  0.400000
 0.000000  0.500000  0.500000  0.000000
 0.200000  0.200000  0.100000  0.500000
 0.000000  0.100000  0.900000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.000000  1.000000  0.000000
 0.100000  0.400000  0.500000  0.000000
 0.100000  0.200000  0.000000  0.700000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AT]T[CGT][TA]T[GA][TGA][CG][TAC]GT[TA]TG[GC][TC]
--------------------------------------------------------------------------------


Time  4.16 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   5  llr = 98  E-value = 3.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::2a2a::::a:8a8:4:4a
pos.-specific     C  ::::::::6:4::2:224:2:
probability       G  4:6::6:a2::::::::22::
matrix            T  6a48:2::2a6:a:::8:84:

         bits    2.2        *
                 2.0  *     * *  *
                 1.8  *  * ** * ** *     *
                 1.5  *  * ** * ** *     *
Relative         1.3  *  * ** * ** * * * *
Entropy          1.1 ***** ** ******** * *
(28.4 bits)      0.9 ***** ** ******** * *
                 0.7 ***************** * *
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TTGTAGAGCTTATAAATATAA
consensus            G TA A  G C  C CCCGT
sequence                  T  T        G C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
34576                       324  1.54e-11 TTTGGCTTTA GTTTAGAGCTTATAAATCTTA ACAACGATTG
37225                       276  1.80e-10 TAGCTCTTTA GTTTAGAGTTTATAAATCTTA ACAACAATTG
31534                       330  5.76e-10 GTTCTTGCGT TTGTAGAGCTCATCAATAGAA CTAATAAGAA
40038                       403  7.01e-10 TTTTCTCTTT TTGTATAGCTTATAAACATCA TTGTATGAAA
44506                       305  6.24e-09 TCATGGCGTT TTGAAAAGGTCATAACTGTAA AGTCAAGTAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34576                             1.5e-11  323_[+3]_156
37225                             1.8e-10  275_[+3]_204
31534                             5.8e-10  329_[+3]_150
40038                               7e-10  402_[+3]_77
44506                             6.2e-09  304_[+3]_175
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
34576                    (  324) GTTTAGAGCTTATAAATCTTA  1
37225                    (  276) GTTTAGAGTTTATAAATCTTA  1
31534                    (  330) TTGTAGAGCTCATCAATAGAA  1
40038                    (  403) TTGTATAGCTTATAAACATCA  1
44506                    (  305) TTGAAAAGGTCATAACTGTAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 10.7426 E= 3.7e+002
  -897   -897     79    120
  -897   -897   -897    193
  -897   -897    137     61
   -53   -897   -897    161
   179   -897   -897   -897
   -53   -897    137    -38
   179   -897   -897   -897
  -897   -897    211   -897
  -897    145    -21    -38
  -897   -897   -897    193
  -897     87   -897    120
   179   -897   -897   -897
  -897   -897   -897    193
   147    -13   -897   -897
   179   -897   -897   -897
   147    -13   -897   -897
  -897    -13   -897    161
    47     87    -21   -897
  -897   -897    -21    161
    47    -13   -897     61
   179   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 3.7e+002
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.600000  0.400000
 0.200000  0.000000  0.000000  0.800000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.600000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.600000  0.200000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.400000  0.000000  0.600000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.800000  0.200000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.400000  0.400000  0.200000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.400000  0.200000  0.000000  0.400000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TG]T[GT][TA]A[GAT]AG[CGT]T[TC]AT[AC]A[AC][TC][ACG][TG][ATC]A
--------------------------------------------------------------------------------


Time  5.92 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34027                            3.24e-02  38_[+2(3.37e-06)]_446
26565                            7.90e-03  53_[+2(2.75e-06)]_431
34576                            9.93e-19  54_[+2(1.02e-08)]_253_\
                                           [+3(1.54e-11)]_51_[+1(4.60e-11)]_85
46056                            5.53e-07  167_[+1(3.95e-09)]_92_\
                                           [+2(2.94e-06)]_205
31534                            1.76e-05  329_[+3(5.76e-10)]_150
31537                            1.92e-01  500
48910                            8.57e-01  500
40038                            1.70e-05  402_[+3(7.01e-10)]_77
48172                            2.95e-04  49_[+1(3.69e-08)]_431
44462                            3.70e-03  222_[+2(1.65e-06)]_262
44506                            4.31e-12  5_[+2(7.34e-07)]_77_[+1(1.50e-08)]_\
                                           186_[+3(6.24e-09)]_175
37225                            2.46e-17  33_[+2(4.93e-09)]_226_\
                                           [+3(1.80e-10)]_51_[+1(2.32e-10)]_71_[+2(7.39e-06)]_46
37456                            4.97e-08  381_[+1(9.74e-10)]_29_\
                                           [+2(8.01e-07)]_54
38978                            5.66e-03  191_[+2(1.92e-06)]_293
35444                            1.66e-04  369_[+2(2.33e-07)]_115
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************