********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/461/461.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
54051                    1.0000    500  46672                    1.0000    500
47148                    1.0000    500  47270                    1.0000    500
3295                     1.0000    500  14571                    1.0000    500
9903                     1.0000    500  43517                    1.0000    500
50806                    1.0000    500  34301                    1.0000    500
45024                    1.0000    500  45390                    1.0000    500
12713                    1.0000    500  44631                    1.0000    500
45706                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/461/461.seqs.fa -oc motifs/461 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.264 C 0.239 G 0.229 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.264 C 0.239 G 0.229 T 0.268
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 20 sites = 10 llr = 144 E-value = 1.1e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3a2a9:26:81:324::59:
pos.-specific     C  ::3::11::21:2:112216
probability       G  7:3::9619:892:1553:4
matrix            T  ::2:1:131::138443:::

         bits    2.1
                 1.9  * *
                 1.7  * * *  *  *
                 1.5  * ***  *  *      *
Relative         1.3 ** ***  **** *    *
Entropy          1.1 ** ***  **** *    **
(20.7 bits)      0.9 ** ***  **** *    **
                 0.6 ** *** ***** * ** **
                 0.4 ** ********* * *****
                 0.2 ** ********* *******
                 0.0 --------------------

Multilevel           GACAAGGAGAGGATAGGAAC
consensus            A G   AT C  TATTTG G
sequence               A         C   CC
                       T         G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
14571                       265  5.73e-12 CCAGACGAAG GACAAGGAGAGGATTGGAAC AACTTCGATT
3295                        192  7.28e-11 AGATCAAATT GACAAGGAGAGGATAGGGAG CAACAGGAGT
45706                       307  5.54e-08 GAGCCATCGA AAAAAGGTGAGGTTCTGGAG CACGCTGTGA
47270                       164  9.49e-08 CTTTGTTAGA GAGAAGGTTAGGTTATCCAC AGAGTCTAGC
47148                       316  1.56e-07 CGCGGATCCC GACAAGGAGACGCATCGAAC TTACGGACTG
54051                        12  3.33e-07 GTTGTAAGGA GATAACGTGAGGGAATTGAC TTTCTTATTA
45024                       243  5.03e-07 TCCACAGCAT GAGAAGAAGCGTGTGGTAAC ATTTATCATA
12713                        62  5.37e-07 TCAGGGATCG GATAAGCGGAGGATAGTACG ATTACCGCCC
9903                        277  5.37e-07 CTTACGGATG AAGAAGAAGCAGTTTTGCAC GATCGATTAC
34301                       371  6.94e-07 ACGGCGATAC AAAATGTAGAGGCTTGCAAG CTTTGCGTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14571                             5.7e-12  264_[+1]_216
3295                              7.3e-11  191_[+1]_289
45706                             5.5e-08  306_[+1]_174
47270                             9.5e-08  163_[+1]_317
47148                             1.6e-07  315_[+1]_165
54051                             3.3e-07  11_[+1]_469
45024                               5e-07  242_[+1]_238
12713                             5.4e-07  61_[+1]_419
9903                              5.4e-07  276_[+1]_204
34301                             6.9e-07  370_[+1]_110
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=10
14571                    (  265) GACAAGGAGAGGATTGGAAC  1
3295                     (  192) GACAAGGAGAGGATAGGGAG  1
45706                    (  307) AAAAAGGTGAGGTTCTGGAG  1
47270                    (  164) GAGAAGGTTAGGTTATCCAC  1
47148                    (  316) GACAAGGAGACGCATCGAAC  1
54051                    (   12) GATAACGTGAGGGAATTGAC  1
45024                    (  243) GAGAAGAAGCGTGTGGTAAC  1
12713                    (   62) GATAAGCGGAGGATAGTACG  1
9903                     (  277) AAGAAGAAGCAGTTTTGCAC  1
34301                    (  371) AAAATGTAGAGGCTTGCAAG  1
//

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7215 bayes= 10.4372 E= 1.1e-002
    18  -997   161  -997
   192  -997  -997  -997
   -40    33    39   -42
   192  -997  -997  -997
   177  -997  -997  -142
  -997  -126   197  -997
   -40  -126   139  -142
   118  -997  -119    16
  -997  -997   197  -142
   160   -26  -997  -997
  -140  -126   180  -997
  -997  -997   197  -142
    18   -26   -19    16
   -40  -997  -997   158
    60  -126  -119    58
  -997  -126   113    58
  -997   -26   113    16
    92   -26    39  -997
   177  -126  -997  -997
  -997   133    80  -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 10 E= 1.1e-002
 0.300000  0.000000  0.700000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.300000  0.300000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.000000  0.100000  0.900000  0.000000
 0.200000  0.100000  0.600000  0.100000
 0.600000  0.000000  0.100000  0.300000
 0.000000  0.000000  0.900000  0.100000
 0.800000  0.200000  0.000000  0.000000
 0.100000  0.100000  0.800000  0.000000
 0.000000  0.000000  0.900000  0.100000
 0.300000  0.200000  0.200000  0.300000
 0.200000  0.000000  0.000000  0.800000
 0.400000  0.100000  0.100000  0.400000
 0.000000  0.100000  0.500000  0.400000
 0.000000  0.200000  0.500000  0.300000
 0.500000  0.200000  0.300000  0.000000
 0.900000  0.100000  0.000000  0.000000
 0.000000  0.600000  0.400000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA]A[CGAT]AAG[GA][AT]G[AC]GG[ATCG][TA][AT][GT][GTC][AGC]A[CG]
--------------------------------------------------------------------------------

Time 1.79 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 7 llr = 116 E-value = 5.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  19474731:::1:311::97:
pos.-specific     C  613::349:a33:799191:7
probability       G  3:116:3:1:4:a:::9::1:
matrix            T  ::11::::9:36:::::1:13

         bits    2.1          *  *
                 1.9          *  *
                 1.7          *  *
                 1.5        * *  * ****
Relative         1.3  *     ***  * *****
Entropy          1.1  *  ** ***  ******* *
(24.0 bits)      0.9  * *** ***  *********
                 0.6 ** *** *** **********
                 0.4 ** ******************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CAAAGACCTCGTGCCCGCAAC
consensus            G C ACA   CC A      T
sequence                   G   T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
50806                       421  1.80e-10 TCAGGAGAGA GAAAGAACTCTTGACCGCAAC AATGAGGGCG
14571                       395  8.44e-10 CTTTGCTCGC CAAAGCGCTCGAGCCCGCAAT CGCGACCGAG
44631                       212  1.50e-09 CGCGTACTAC AACAAAGCTCGTGCCCGCAGC GACGCCATCG
9903                        429  1.84e-08 ACAAGCATTT CAGAAACCTCTCGCACCCAAC GAACAACGTG
47270                       391  2.15e-08 GTAGCAGCAA CAAAAAACTCCCGCCCGTATT CTAATTCACA
47148                        88  4.15e-08 TTGCCCGCCT GCCTGCCCGCCTGCCCGCAAC CTTGCAAACG
45706                       163  1.64e-07 TTATCTGGAC CATGGACATCGTGACAGCCAC ATTGCACAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50806                             1.8e-10  420_[+2]_59
14571                             8.4e-10  394_[+2]_85
44631                             1.5e-09  211_[+2]_268
9903                              1.8e-08  428_[+2]_51
47270                             2.2e-08  390_[+2]_89
47148                             4.1e-08  87_[+2]_392
45706                             1.6e-07  162_[+2]_317
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=7
50806                    (  421) GAAAGAACTCTTGACCGCAAC  1
14571                    (  395) CAAAGCGCTCGAGCCCGCAAT  1
44631                    (  212) AACAAAGCTCGTGCCCGCAGC  1
9903                     (  429) CAGAAACCTCTCGCACCCAAC  1
47270                    (  391) CAAAAAACTCCCGCCCGTATT  1
47148                    (   88) GCCTGCCCGCCTGCCCGCAAC  1
45706                    (  163) CATGGACATCGTGACAGCCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 9.84874 E= 5.9e+001
   -89   126    32  -945
   170   -74  -945  -945
    70    26   -68   -90
   143  -945   -68   -90
    70  -945   132  -945
   143    26  -945  -945
    11    84    32  -945
   -89   184  -945  -945
  -945  -945   -68   168
  -945   206  -945  -945
  -945    26    90     9
   -89    26  -945   109
  -945  -945   213  -945
    11   158  -945  -945
   -89   184  -945  -945
   -89   184  -945  -945
  -945   -74   190  -945
  -945   184  -945   -90
   170   -74  -945  -945
   143  -945   -68   -90
  -945   158  -945     9
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 5.9e+001
 0.142857  0.571429  0.285714  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.428571  0.285714  0.142857  0.142857
 0.714286  0.000000  0.142857  0.142857
 0.428571  0.000000  0.571429  0.000000
 0.714286  0.285714  0.000000  0.000000
 0.285714  0.428571  0.285714  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.285714  0.428571  0.285714
 0.142857  0.285714  0.000000  0.571429
 0.000000  0.000000  1.000000  0.000000
 0.285714  0.714286  0.000000  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.857143  0.142857  0.000000  0.000000
 0.714286  0.000000  0.142857  0.142857
 0.000000  0.714286  0.000000  0.285714
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CG]A[AC]A[GA][AC][CAG]CTC[GCT][TC]G[CA]CCGCAA[CT]
--------------------------------------------------------------------------------

Time 3.59 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 8 llr = 108 E-value = 2.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  3:5:41:1:13:4a:9
pos.-specific     C  ::::664::3:::::1
probability       G  8a41:3:1:1883:a:
matrix            T  ::19::68a5:34:::

         bits    2.1  *            *
                 1.9  *      *    **
                 1.7  *      *    **
                 1.5  * *    *    ***
Relative         1.3 ** *    * ** ***
Entropy          1.1 ** ** * * ** ***
(19.6 bits)      0.9 ** ****** ** ***
                 0.6 ********* ** ***
                 0.4 ********* ******
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGATCCTTTTGGAAGA
consensus            A G AGC  CATT
sequence                         G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
45706                       271  2.11e-09 ATGGTAGAGC GGGTCCCTTTGGAAGA ATACCAAGCG
9903                         19  1.41e-08 GAAAGTACGG AGGTCCTTTTGGAAGA GACGATCCAA
46672                        57  1.66e-07 ACAGCTCCAG GGATAGTTTTGTGAGA TCCGGCCCGA
34301                         7  1.88e-07     GAAAGC AGATCCTTTCAGAAGA GCCGTATGGC
47270                       148  2.51e-07 AACCATCTTC GGGTAGCTTTGTTAGA GAGAAGGTTA
3295                         93  1.08e-06 TTTAACCCGC GGAGCATTTGGGGAGA TTTCACTCGC
54051                       260  1.15e-06 ATTTGAATCG GGATACTATAAGTAGA CCCGTGACTG
44631                        18  2.11e-06 TCTCGTCTGT GGTTCCCGTCGGTAGC CGAGTTGTAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45706                             2.1e-09  270_[+3]_214
9903                              1.4e-08  18_[+3]_466
46672                             1.7e-07  56_[+3]_428
34301                             1.9e-07  6_[+3]_478
47270                             2.5e-07  147_[+3]_337
3295                              1.1e-06  92_[+3]_392
54051                             1.2e-06  259_[+3]_225
44631                             2.1e-06  17_[+3]_467
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
45706                    (  271) GGGTCCCTTTGGAAGA  1
9903                     (   19) AGGTCCTTTTGGAAGA  1
46672                    (   57) GGATAGTTTTGTGAGA  1
34301                    (    7) AGATCCTTTCAGAAGA  1
47270                    (  148) GGGTAGCTTTGTTAGA  1
3295                     (   93) GGAGCATTTGGGGAGA  1
54051                    (  260) GGATACTATAAGTAGA  1
44631                    (   18) GGTTCCCGTCGGTAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 9.82714 E= 2.7e+002
    -8  -965   171  -965
  -965  -965   213  -965
    92  -965    71  -110
  -965  -965   -87   171
    50   139  -965  -965
  -108   139    13  -965
  -965    65  -965   122
  -108  -965   -87   148
  -965  -965  -965   190
  -108     6   -87    90
    -8  -965   171  -965
  -965  -965   171   -10
    50  -965    13    49
   192  -965  -965  -965
  -965  -965   213  -965
   173   -93  -965  -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 2.7e+002
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.375000  0.125000
 0.000000  0.000000  0.125000  0.875000
 0.375000  0.625000  0.000000  0.000000
 0.125000  0.625000  0.250000  0.000000
 0.000000  0.375000  0.000000  0.625000
 0.125000  0.000000  0.125000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.250000  0.125000  0.500000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.375000  0.000000  0.250000  0.375000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GA]G[AG]T[CA][CG][TC]TT[TC][GA][GT][ATG]AGA
--------------------------------------------------------------------------------

Time 5.50 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54051                            4.18e-06  11_[+1(3.33e-07)]_228_\
    [+3(1.15e-06)]_225
46672                            5.64e-04  56_[+3(1.66e-07)]_428
47148                            2.71e-07  87_[+2(4.15e-08)]_207_\
    [+1(1.56e-07)]_165
47270                            2.85e-11  147_[+3(2.51e-07)]_[+1(9.49e-08)]_\
    207_[+2(2.15e-08)]_89
3295                             5.22e-09  92_[+3(1.08e-06)]_83_[+1(7.28e-11)]_\
    289
14571                            1.59e-13  264_[+1(5.73e-12)]_110_\
    [+2(8.44e-10)]_85
9903                             8.42e-12  18_[+3(1.41e-08)]_242_\
    [+1(5.37e-07)]_132_[+2(1.84e-08)]_51
43517                            8.37e-03  473_[+2(1.17e-05)]_6
50806                            7.75e-06  420_[+2(1.80e-10)]_59
34301                            4.25e-06  6_[+3(1.88e-07)]_348_[+1(6.94e-07)]_\
    110
45024                            4.18e-03  242_[+1(5.03e-07)]_238
45390                            9.64e-01  500
12713                            2.23e-03  61_[+1(5.37e-07)]_419
44631                            1.63e-07  17_[+3(2.11e-06)]_178_\
    [+2(1.50e-09)]_268
45706                            1.30e-12  162_[+2(1.64e-07)]_87_\
    [+3(2.11e-09)]_20_[+1(5.54e-08)]_174
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************