********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/84/84.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42619                    1.0000    500  32261                    1.0000    500
43171                    1.0000    500  13208                    1.0000    500
37171                    1.0000    500  52200                    1.0000    500
48011                    1.0000    500  48762                    1.0000    500
49498                    1.0000    500  49964                    1.0000    500
44190                    1.0000    500  33955                    1.0000    500
34010                    1.0000    500  44955                    1.0000    500
20308                    1.0000    500  43131                    1.0000    500
38402                    1.0000    500  47828                    1.0000    500
36104                    1.0000    500
********************************************************************************


********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/84/84.seqs.fa -oc motifs/84 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.246 G 0.227 T 0.253
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.246 G 0.227 T 0.253
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 11 llr = 126 E-value = 2.0e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3432:a:5:::7
pos.-specific     C  :5::::a1:::3
probability       G  71:17:::a:a:
matrix            T  :1773::4:a::

bits         2.1 * *
             1.9 ** ***
             1.7 ** ***
             1.5 ** ***
Relative     1.3 * *** ***
Entropy      1.1 * * *** ****
(16.6 bits)  0.9 * ***** ****
             0.6 * **********
             0.4 * **********
             0.2 ************
             0.0 ------------

Multilevel       GCTTGACAGTGA
consensus        AAA T T C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
33955                       303  5.35e-08 TTCACGCTAT GCTTGACAGTGA CACGCGTGGA
49964                       226  1.63e-07 GGCATGATTT GATTGACAGTGA CGGTGATTGC
44955                        25  2.18e-07 TGTAATCTGG GATTGACTGTGA ACTCAAGGCG
47828                       329  5.46e-07 GGGTTTTGGA ACTTGACAGTGA GATTCGCGGT
44190                         4  5.46e-07        GAC GCTTGACTGTGC CTTCCAATAA
20308                        75  8.43e-07 ACGGTCACAT GAATGACAGTGA ATGCAGTTGA
34010                       169  5.58e-06 AGCGTCCCTC GCATTACAGTGC CTCTCTCTAG
42619                       194  9.76e-06 ACTGCAATGG AATTTACTGTGC AGCTCTCGTT
36104                       193  1.23e-05 ATGGTATCCT GTTATACAGTGA GAGAGTCCTT
43131                       214  1.46e-05 TTTTAATAGC AGTAGACTGTGA TGTTATGATA
37171                       219  1.66e-05 CGAAATCGGC GCAGGACCGTGA CATTCGCGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33955                             5.4e-08  302_[+1]_186
49964                             1.6e-07  225_[+1]_263
44955                             2.2e-07  24_[+1]_464
47828                             5.5e-07  328_[+1]_160
44190                             5.5e-07  3_[+1]_485
20308                             8.4e-07  74_[+1]_414
34010                             5.6e-06  168_[+1]_320
42619                             9.8e-06  193_[+1]_295
36104                             1.2e-05  192_[+1]_296
43131                             1.5e-05  213_[+1]_275
37171                             1.7e-05  218_[+1]_270
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=11
33955                    (  303) GCTTGACAGTGA  1
49964                    (  226) GATTGACAGTGA  1
44955                    (   25) GATTGACTGTGA  1
47828                    (  329) ACTTGACAGTGA  1
44190                    (    4) GCTTGACTGTGC  1
20308                    (   75) GAATGACAGTGA  1
34010                    (  169) GCATTACAGTGC  1
42619                    (  194) AATTTACTGTGC  1
36104                    (  193) GTTATACAGTGA  1
43131                    (  214) AGTAGACTGTGA  1
37171                    (  219) GCAGGACCGTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 10.7478 E= 2.0e-001
    -1  -1010    168  -1010
    41     89   -132   -147
    -1  -1010  -1010    152
   -59  -1010   -132    152
 -1010  -1010    168     11
   186  -1010  -1010  -1010
 -1010    202  -1010  -1010
    99   -143  -1010     53
 -1010  -1010    214  -1010
 -1010  -1010  -1010    198
 -1010  -1010    214  -1010
   141     15  -1010  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 2.0e-001
 0.272727  0.000000  0.727273  0.000000
 0.363636  0.454545  0.090909  0.090909
 0.272727  0.000000  0.000000  0.727273
 0.181818  0.000000  0.090909  0.727273
 0.000000  0.000000  0.727273  0.272727
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.545455  0.090909  0.000000  0.363636
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.727273  0.272727  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA][CA][TA]T[GT]AC[AT]GTG[AC]
--------------------------------------------------------------------------------




Time  3.10 secs.
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 5 llr = 81 E-value = 2.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :2a:2:6::::a8:::
pos.-specific     C  2::224:2a:8::a::
probability       G  88::2644:a2:::aa
matrix            T  :::84::4::::2:::

bits         2.1 * **
             1.9 * ** * ***
             1.7 * ** * ***
             1.5 * * ** * ***
Relative     1.3 **** **** ***
Entropy      1.1 **** ** ********
(23.5 bits)  0.9 **** ** ********
             0.6 **** ** ********
             0.4 **** ***********
             0.2 **** ***********
             0.0 ----------------

Multilevel       GGATTGAGCGCAACGG
consensus        CA CACGT G T
sequence         C C G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
33955                       333  1.66e-09 GACTGTGAAG GGATGGATCGCAACGG AATGCGATAT
13208                       219  7.33e-09 CTTGCTCGAC GGATTCGCCGCAACGG ATCGTTTCCA
43171                        17  3.37e-08 CAAACTCATC GGATCCGGCGGAACGG CGGCATCCGC
44190                       135  4.35e-08 GTCTAATTTC GAATTGATCGCATCGG CAGCTCCTTT
38402                       140  7.33e-08 GATCATCCAT CGACAGAGCGCAACGG AATGAGCATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33955                             1.7e-09  332_[+2]_152
13208                             7.3e-09  218_[+2]_266
43171                             3.4e-08  16_[+2]_468
44190                             4.3e-08  134_[+2]_350
38402                             7.3e-08  139_[+2]_345
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=5
33955                    (  333) GGATGGATCGCAACGG  1
13208                    (  219) GGATTCGCCGCAACGG  1
43171                    (   17) GGATCCGGCGGAACGG  1
44190                    (  135) GAATTGATCGCATCGG  1
38402                    (  140) CGACAGAGCGCAACGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9215 bayes= 11.0987 E= 2.4e+002
  -897    -30    182   -897
   -46   -897    182   -897
   186   -897   -897   -897
  -897    -30   -897    166
   -46    -30    -18     66
  -897     70    140   -897
   113   -897     82   -897
  -897    -30     82     66
  -897    202   -897   -897
  -897   -897    214   -897
  -897    170    -18   -897
   186   -897   -897   -897
   154   -897   -897    -34
  -897    202   -897   -897
  -897   -897    214   -897
  -897   -897    214   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 2.4e+002
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.000000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.200000  0.200000  0.200000  0.400000
 0.000000  0.400000  0.600000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.000000  0.200000  0.400000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC][GA]A[TC][TACG][GC][AG][GTC]CG[CG]A[AT]CGG
--------------------------------------------------------------------------------




Time  6.26 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 8 llr = 119 E-value = 3.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::1:33:6:84::3:13::1
pos.-specific     C  ::4::1:19:::46133a1:
probability       G  8:3a1:a::11151331:99
matrix            T  3a3:66:311591:644:::

bits         2.1 * *
             1.9 * * * *
             1.7 * * * *
             1.5 * * * * * ***
Relative     1.3 ** * * * * ***
Entropy      1.1 ** * * * * ***
(21.4 bits)  0.9 ** * * ** * ***
             0.6 ** ************ ***
             0.4 ** ************ ***
             0.2 **************** ***
             0.0 --------------------

Multilevel       GTCGTTGACATTGCTTTCGG
consensus        T G AA T A CAGCA
sequence         T GC

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
34010                        93  2.61e-10 ATACTCTACC GTTGTTGACATTGATTACGG TGCATCGGTC
20308                       440  1.35e-08 GGCAGTCTTC GTGGTAGTCTTTGCTGTCGG ACGCAGATTT
52200                       233  3.42e-08 CACCCATCAT TTGGTTGACGTTGGTCTCGG AATGCGCCCA
48762                        79  5.47e-08 GGTACGTCAC GTCGTCGACATGTCTTCCGG TTCAACCCAG
36104                       305  1.00e-07 TGTTTAAGTA TTCGTTGTCAATCATAGCGG GTTTTGATCA
48011                       110  1.27e-07 ACGGTTAGTT GTCGATGATAGTCCGGCCGG TCCGCGAGAA
33955                       211  2.65e-07 CGGTCATTTT GTTGGAGACAATCCGCTCGA AATCTTCGAG
47828                        99  5.98e-07 AACAGTTTAA GTAGATGCCAATGCCTACCG GCAGGCTAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34010                             2.6e-10  92_[+3]_388
20308                             1.4e-08  439_[+3]_41
52200                             3.4e-08  232_[+3]_248
48762                             5.5e-08  78_[+3]_402
36104                               1e-07  304_[+3]_176
48011                             1.3e-07  109_[+3]_371
33955                             2.7e-07  210_[+3]_270
47828                               6e-07  98_[+3]_382
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=8
34010                    (   93) GTTGTTGACATTGATTACGG  1
20308                    (  440) GTGGTAGTCTTTGCTGTCGG  1
52200                    (  233) TTGGTTGACGTTGGTCTCGG  1
48762                    (   79) GTCGTCGACATGTCTTCCGG  1
36104                    (  305) TTCGTTGTCAATCATAGCGG  1
48011                    (  110) GTCGATGATAGTCCGGCCGG  1
33955                    (  211) GTTGGAGACAATCCGCTCGA  1
47828                    (   99) GTAGATGCCAATGCCTACCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 9139 bayes= 10.894 E= 3.3e+002
  -965   -965    172     -2
  -965   -965   -965    198
  -113     61     14     -2
  -965   -965    214   -965
   -13   -965    -86    131
   -13    -97   -965    131
  -965   -965    214   -965
   119    -97   -965     -2
  -965    183   -965   -101
   145   -965    -86   -101
    45   -965    -86     98
  -965   -965    -86    179
  -965     61    114   -101
   -13    134    -86   -965
  -965    -97     14    131
  -113      2     14     57
   -13      2    -86     57
  -965    202   -965   -965
  -965    -97    195   -965
  -113   -965    195   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 3.3e+002
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.375000  0.250000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.125000  0.625000
 0.250000  0.125000  0.000000  0.625000
 0.000000  0.000000  1.000000  0.000000
 0.625000  0.125000  0.000000  0.250000
 0.000000  0.875000  0.000000  0.125000
 0.750000  0.000000  0.125000  0.125000
 0.375000  0.000000  0.125000  0.500000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.375000  0.500000  0.125000
 0.250000  0.625000  0.125000  0.000000
 0.000000  0.125000  0.250000  0.625000
 0.125000  0.250000  0.250000  0.375000
 0.250000  0.250000  0.125000  0.375000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.125000  0.000000  0.875000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GT]T[CGT]G[TA][TA]G[AT]CA[TA]T[GC][CA][TG][TCG][TAC]CGG
--------------------------------------------------------------------------------




Time  9.76 secs.
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42619                            2.12e-02  193_[+1(9.76e-06)]_295
32261                            7.91e-01  500
43171                            4.07e-04  16_[+2(3.37e-08)]_468
13208                            9.87e-05  218_[+2(7.33e-09)]_266
37171                            5.72e-02  218_[+1(1.66e-05)]_270
52200                            2.23e-04  232_[+3(3.42e-08)]_248
48011                            3.90e-04  109_[+3(1.27e-07)]_371
48762                            6.22e-04  78_[+3(5.47e-08)]_402
49498                            8.77e-01  500
49964                            3.28e-04  225_[+1(1.63e-07)]_157_\
                                           [+1(2.62e-05)]_94
44190                            4.46e-07  3_[+1(5.46e-07)]_119_[+2(4.35e-08)]_\
                                           350
33955                            1.60e-12  210_[+3(2.65e-07)]_72_\
                                           [+1(5.35e-08)]_18_[+2(1.66e-09)]_119_[+3(9.53e-05)]_13
34010                            3.48e-09  48_[+3(5.68e-06)]_24_[+3(2.61e-10)]_\
                                           56_[+1(5.58e-06)]_232_[+3(5.28e-05)]_68
44955                            1.30e-03  24_[+1(2.18e-07)]_464
20308                            1.68e-07  74_[+1(8.43e-07)]_353_\
                                           [+3(1.35e-08)]_41
43131                            5.80e-02  213_[+1(1.46e-05)]_275
38402                            1.25e-03  139_[+2(7.33e-08)]_345
47828                            1.07e-05  98_[+3(5.98e-07)]_210_\
                                           [+1(5.46e-07)]_160
36104                            8.08e-06  192_[+1(1.23e-05)]_100_\
                                           [+3(1.00e-07)]_176
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************