********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/498/498.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11306                    1.0000    500  18741                    1.0000    500
2073                     1.0000    500  21389                    1.0000    500
21452                    1.0000    500  22454                    1.0000    500
23303                    1.0000    500  23308                    1.0000    500
23653                    1.0000    500  24003                    1.0000    500
24606                    1.0000    500  263346                   1.0000    500
33774                    1.0000    500  36995                    1.0000    500
7873                     1.0000    500  bd895                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/498/498.seqs.fa -oc motifs/498 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.277 C 0.242 G 0.225 T 0.257
Background letter frequencies (from dataset with add-one prior applied):
A 0.277 C 0.242 G 0.225 T 0.257
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 7 llr = 122 E-value = 1.3e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :a3:191346a:7796:44:4
pos.-specific     C  1:1:::::34:a::1::6::6
probability       G  9:3a9:933:::33:19::6:
matrix            T  ::3::1:4:::::::31:64:

         bits    2.2    *       *
                 1.9  * *      **
                 1.7  * *      **
                 1.5 ** ** *   **    *
Relative         1.3 ** ****   **  * *
Entropy          1.1 ** ****   ***** ** **
(25.1 bits)      0.9 ** ****  ****** *****
                 0.6 ** ****  ************
                 0.4 ** ******************
                 0.2 ** ******************
                 0.0 ---------------------

Multilevel           GAAGGAGTAAACAAAAGCTGC
consensus              G    ACC  GG T AATA
sequence               T    GG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
36995                       420  9.07e-11 TGCCCTAGGC GATGGAGTGAACGAAAGCTGC GCCATCATAT
18741                       420  9.07e-11 TGCCCTAGGC GATGGAGTGAACGAAAGCTGC GCCATCATAT
21452                        74  3.25e-10 TCCGTTGCTT GAAGGAGAACACAAAAGATGA CTATTCCATA
33774                       222  2.28e-09 AGGCAGATGA GAGGGAGGACACAGATGAAGA TGATCCAGAT
24606                       411  7.66e-09 AATTGCGAGA CAAGGAGAAAACAAATGCTTA TTAGGTACAC
22454                       234  7.33e-08 GATGATTATC GAGGGTATCCACAGAGGAATC AGAATTTTGT
2073                        355  9.36e-08 CAAGAGCGAC GACGAAGGCAACAACATCATC CTAAACATCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36995                             9.1e-11  419_[+1]_60
18741                             9.1e-11  419_[+1]_60
21452                             3.2e-10  73_[+1]_406
33774                             2.3e-09  221_[+1]_258
24606                             7.7e-09  410_[+1]_69
22454                             7.3e-08  233_[+1]_246
2073                              9.4e-08  354_[+1]_125
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=7
36995                    (  420) GATGGAGTGAACGAAAGCTGC  1
18741                    (  420) GATGGAGTGAACGAAAGCTGC  1
21452                    (   74) GAAGGAGAACACAAAAGATGA  1
33774                    (  222) GAGGGAGGACACAGATGAAGA  1
24606                    (  411) CAAGGAGAAAACAAATGCTTA  1
22454                    (  234) GAGGGTATCCACAGAGGAATC  1
2073                     (  355) GACGAAGGCAACAACATCATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 10.7044 E= 1.3e+000
  -945    -76    193   -945
   185   -945   -945   -945
     5    -76     35     15
  -945   -945    215   -945
   -95   -945    193   -945
   163   -945   -945    -85
   -95   -945    193   -945
     5   -945     35     74
    63     24     35   -945
   104     83   -945   -945
   185   -945   -945   -945
  -945    205   -945   -945
   137   -945     35   -945
   137   -945     35   -945
   163    -76   -945   -945
   104   -945    -65     15
  -945   -945    193    -85
    63    124   -945   -945
    63   -945   -945    115
  -945   -945    135     74
    63    124   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 1.3e+000
 0.000000  0.142857  0.857143  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.142857  0.285714  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.142857  0.000000  0.857143  0.000000
 0.285714  0.000000  0.285714  0.428571
 0.428571  0.285714  0.285714  0.000000
 0.571429  0.428571  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.571429  0.000000  0.142857  0.285714
 0.000000  0.000000  0.857143  0.142857
 0.428571  0.571429  0.000000  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  0.000000  0.571429  0.428571
 0.428571  0.571429  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GA[AGT]GGAG[TAG][ACG][AC]AC[AG][AG]A[AT]G[CA][TA][GT][CA]
--------------------------------------------------------------------------------

Time 2.09 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 20 sites = 6 llr = 109 E-value = 1.8e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::3::::75::32:::
pos.-specific     C  237:5:3::33::::7::87
probability       G  ::2::a3:a2522:a::a::
matrix            T  872a5::a:5223a::8:23

         bits    2.2      *  *     *  *
                 1.9    * * **    **  *
                 1.7    * * **    **  *
                 1.5    * * **    **  *
Relative         1.3 *  * * **    ** ***
Entropy          1.1 ** *** **    *******
(26.3 bits)      0.9 ****** **    *******
                 0.6 ****** ***** *******
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           TTCTCGATGTGAATGCTGCC
consensus             C  T C  CC T  A   T
sequence                   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
36995                       478  4.17e-12 CCTGAAAGAC TTCTCGATGTGAATGCTGCC AAG
18741                       478  4.17e-12 CCTGAAAGAC TTCTCGATGTGAATGCTGCC AAG
2073                        132  2.52e-09 ACGCTTGCGA TTCTTGCTGCCTTTGATGCC ATGGTTTTGG
263346                      192  1.09e-08 GGTAGTAGCT CTCTTGGTGCCAGTGCAGCC GACGGTGTTT
23308                        93  1.75e-08 TATTGAATTG TCTTCGCTGTTAATGCTGTT CATTAAGGGT
22454                        22  2.26e-08 TTGATGGTGT TCGTTGGTGGGGTTGATGCT CGGATGGACG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36995                             4.2e-12  477_[+2]_3
18741                             4.2e-12  477_[+2]_3
2073                              2.5e-09  131_[+2]_349
263346                            1.1e-08  191_[+2]_289
23308                             1.7e-08  92_[+2]_388
22454                             2.3e-08  21_[+2]_459
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=6
36995                    (  478) TTCTCGATGTGAATGCTGCC  1
18741                    (  478) TTCTCGATGTGAATGCTGCC  1
2073                     (  132) TTCTTGCTGCCTTTGATGCC  1
263346                   (  192) CTCTTGGTGCCAGTGCAGCC  1
23308                    (   93) TCTTCGCTGTTAATGCTGTT  1
22454                    (   22) TCGTTGGTGGGGTTGATGCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7696 bayes= 10.7716 E= 1.8e+000
  -923    -53   -923    170
  -923     46   -923    137
  -923    146    -43    -62
  -923   -923   -923    196
  -923    105   -923     96
  -923   -923    215   -923
    27     46     57   -923
  -923   -923   -923    196
  -923   -923    215   -923
  -923     46    -43     96
  -923     46    115    -62
   127   -923    -43    -62
    85   -923    -43     37
  -923   -923   -923    196
  -923   -923    215   -923
    27    146   -923   -923
   -73   -923   -923    170
  -923   -923    215   -923
  -923    178   -923    -62
  -923    146   -923     37
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 1.8e+000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.666667  0.166667  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.333333  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.333333  0.166667  0.500000
 0.000000  0.333333  0.500000  0.166667
 0.666667  0.000000  0.166667  0.166667
 0.500000  0.000000  0.166667  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.666667  0.000000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[TC]CT[CT]G[ACG]TG[TC][GC]A[AT]TG[CA]TGC[CT]
--------------------------------------------------------------------------------

Time 4.03 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 13 sites = 8 llr = 104 E-value = 1.8e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::4a1::344:1a
pos.-specific     C  3::::::::::9:
probability       G  8a:::1a4:69::
matrix            T  ::6:99:46:1::

         bits    2.2  *    *
                 1.9  * *  *     *
                 1.7  * *  *     *
                 1.5  * ****   ***
Relative         1.3 ** ****   ***
Entropy          1.1 ** ****  ****
(18.7 bits)      0.9 ******* *****
                 0.6 ******* *****
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           GGTATTGGTGGCA
consensus            C A    TAA
sequence                    A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            -------------
23653                         2  2.25e-08          T GGTATTGTTGGCA GCGTCCCCCC
36995                       453  5.81e-08 CATCATATTA GGTATTGGAGGCA AGCCTGAAAG
18741                       453  5.81e-08 CATCATATTA GGTATTGGAGGCA AGCCTGAAAG
bd895                       393  3.78e-07 TCGTCGTCTT CGTATTGATGGCA CAATCCGATT
263346                      116  1.06e-06 CTCCGGTGCC GGAATTGTTGGAA ACAATAGCAA
21389                       114  1.06e-06 AGCTATGTTT GGTAATGTTAGCA CGGCTGCCGA
23308                         5  2.26e-06       TTGA GGAATTGATATCA CCAAGAATGT
24606                        37  3.95e-06 CCGACACAAA CGAATGGGAAGCA CGAGACGGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23653                             2.3e-08  1_[+3]_486
36995                             5.8e-08  452_[+3]_35
18741                             5.8e-08  452_[+3]_35
bd895                             3.8e-07  392_[+3]_95
263346                            1.1e-06  115_[+3]_372
21389                             1.1e-06  113_[+3]_374
23308                             2.3e-06  4_[+3]_483
24606                             3.9e-06  36_[+3]_451
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=13 seqs=8
23653                    (    2) GGTATTGTTGGCA  1
36995                    (  453) GGTATTGGAGGCA  1
18741                    (  453) GGTATTGGAGGCA  1
bd895                    (  393) CGTATTGATGGCA  1
263346                   (  116) GGAATTGTTGGAA  1
21389                    (  114) GGTAATGTTAGCA  1
23308                    (    5) GGAATTGATATCA  1
24606                    (   37) CGAATGGGAAGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 7808 bayes= 11.2521 E= 1.8e+000
  -965      5    174   -965
  -965   -965    215   -965
    44   -965   -965    128
   185   -965   -965   -965
  -115   -965   -965    177
  -965   -965    -84    177
  -965   -965    215   -965
   -15   -965     74     54
    44   -965   -965    128
    44   -965    148   -965
  -965   -965    196   -104
  -115    186   -965   -965
   185   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 8 E= 1.8e+000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.375000  0.000000  0.000000  0.625000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.375000  0.375000
 0.375000  0.000000  0.000000  0.625000
 0.375000  0.000000  0.625000  0.000000
 0.000000  0.000000  0.875000  0.125000
 0.125000  0.875000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC]G[TA]ATTG[GTA][TA][GA]GCA
--------------------------------------------------------------------------------

Time 6.01 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11306                            1.44e-01  500
18741                            2.91e-18  419_[+1(9.07e-11)]_12_\
    [+3(5.81e-08)]_12_[+2(4.17e-12)]_3
2073                             6.95e-09  131_[+2(2.52e-09)]_203_\
    [+1(9.36e-08)]_125
21389                            4.75e-03  113_[+3(1.06e-06)]_374
21452                            6.59e-06  73_[+1(3.25e-10)]_406
22454                            8.64e-08  21_[+2(2.26e-08)]_192_\
    [+1(7.33e-08)]_246
23303                            3.79e-01  500
23308                            1.58e-06  4_[+3(2.26e-06)]_75_[+2(1.75e-08)]_\
    388
23653                            2.08e-04  1_[+3(2.25e-08)]_486
24003                            3.75e-01  500
24606                            1.10e-06  36_[+3(3.95e-06)]_361_\
    [+1(7.66e-09)]_69
263346                           2.45e-07  115_[+3(1.06e-06)]_63_\
    [+2(1.09e-08)]_245_[+2(7.90e-05)]_24
33774                            4.56e-05  221_[+1(2.28e-09)]_258
36995                            2.91e-18  419_[+1(9.07e-11)]_12_\
    [+3(5.81e-08)]_12_[+2(4.17e-12)]_3
7873                             7.52e-02  500
bd895                            3.36e-04  359_[+1(4.88e-05)]_12_\
    [+3(3.78e-07)]_95
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************