********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/430/430.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47158                    1.0000    500  48319                    1.0000    500
40634                    1.0000    500  33840                    1.0000    500
46006                    1.0000    500  36568                    1.0000    500
40189                    1.0000    500  40186                    1.0000    500
39769                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/430/430.seqs.fa -oc motifs/430 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.255 C 0.235 G 0.216 T 0.294
Background letter frequencies (from dataset with add-one prior applied):
A 0.255 C 0.235 G 0.216 T 0.294
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 18 sites = 6 llr = 101 E-value = 1.2e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  7::5:a8a::3232:aa5
pos.-specific     C  :::2a:2:2358:5a:::
probability       G  ::22::::83::7::::5
matrix            T  3a82:::::32::3::::

         bits    2.2
                 2.0     ** *      ***
                 1.8  *  ** *      ***
                 1.5  *  ** **     ***
Relative         1.3  *  *****  *  ***
Entropy          1.1  ** *****  ** ****
(24.3 bits)      0.9 *** *****  ** ****
                 0.7 *** *****  ** ****
                 0.4 *** **************
                 0.2 ******************
                 0.0 ------------------

Multilevel           ATTACAAAGCCCGCCAAA
consensus            T        GA AT   G
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
36568                         6  3.95e-10      GCTTT TTTACAAAGCCCGCCAAA ACAGAAAATA
40186                       462  1.24e-09 CAACGAGCGA ATTACAAAGTACACCAAG CTGTAAGCTG
40189                       462  1.24e-09 CAACGAGCGA ATTACAAAGTACACCAAG CTGTAAGCTG
48319                       137  1.62e-08 AACACAACAA ATTCCAAACGCCGTCAAA AATCGGTCTG
39769                       467  6.26e-08 GAAAAACTGG ATTTCACAGCCAGTCAAG GTGTAATTCG
40634                       452  8.39e-08 ATTTCGGCCC TTGGCAAAGGTCGACAAA AGTTGTACAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36568                             3.9e-10  5_[+1]_477
40186                             1.2e-09  461_[+1]_21
40189                             1.2e-09  461_[+1]_21
48319                             1.6e-08  136_[+1]_346
39769                             6.3e-08  466_[+1]_16
40634                             8.4e-08  451_[+1]_31
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=6
36568                    (    6) TTTACAAAGCCCGCCAAA  1
40186                    (  462) ATTACAAAGTACACCAAG  1
40189                    (  462) ATTACAAAGTACACCAAG  1
48319                    (  137) ATTCCAAACGCCGTCAAA  1
39769                    (  467) ATTTCACAGCCAGTCAAG  1
40634                    (  452) TTGGCAAAGGTCGACAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 4347 bayes= 9.15728 E= 1.2e-001
   139   -923   -923     18
  -923   -923   -923    176
  -923   -923    -37    150
    97    -50    -37    -82
  -923    209   -923   -923
   197   -923   -923   -923
   171    -50   -923   -923
   197   -923   -923   -923
  -923    -50    195   -923
  -923     50     63     18
    39    109   -923    -82
   -61    182   -923   -923
    39   -923    163   -923
   -61    109   -923     18
  -923    209   -923   -923
   197   -923   -923   -923
   197   -923   -923   -923
    97   -923    121   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 6 E= 1.2e-001
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.166667  0.833333
 0.500000  0.166667  0.166667  0.166667
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.333333  0.333333  0.333333
 0.333333  0.500000  0.000000  0.166667
 0.166667  0.833333  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.166667  0.500000  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AT]TTACAAAG[CGT][CA]C[GA][CT]CAA[AG]
--------------------------------------------------------------------------------

Time 0.84 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 5 llr = 86 E-value = 2.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::a2::48:::2:8a2
pos.-specific     C  a:::28::4a:6a::8
probability       G  :a::8:::::a::2::
matrix            T  :::8:2626::2::::

         bits    2.2  *        *
                 2.0 ***      ** * *
                 1.8 ***      ** * *
                 1.5 *** *    ** * *
Relative         1.3 *** **   ** ****
Entropy          1.1 ****** * ** ****
(24.8 bits)      0.9 *********** ****
                 0.7 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CGATGCTATCGCCAAC
consensus               ACTATC  A G A
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
40186                       316  1.82e-10 CTTTTGAGCC CGATGCTATCGCCAAC TCTTCTTCCC
40189                       316  1.82e-10 CTTTTGAGCC CGATGCTATCGCCAAC TCTTCTTCCC
47158                       411  1.55e-08 TCGTTCCCTG CGATGTTACCGACAAC AGAAATGGAA
33840                       283  2.29e-08 GCTTGTGCGT CGATGCATTCGCCAAA CAAATCCGTA
48319                       178  7.72e-08 GTCCTTTCGC CGAACCAACCGTCGAC CCCCTCTTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40186                             1.8e-10  315_[+2]_169
40189                             1.8e-10  315_[+2]_169
47158                             1.6e-08  410_[+2]_74
33840                             2.3e-08  282_[+2]_202
48319                             7.7e-08  177_[+2]_307
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=5
40186                    (  316) CGATGCTATCGCCAAC  1
40189                    (  316) CGATGCTATCGCCAAC  1
47158                    (  411) CGATGTTACCGACAAC  1
33840                    (  283) CGATGCATTCGCCAAA  1
48319                    (  178) CGAACCAACCGTCGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4365 bayes= 10.02 E= 2.6e-001
  -897    209   -897   -897
  -897   -897    221   -897
   197   -897   -897   -897
   -35   -897   -897    144
  -897    -23    189   -897
  -897    176   -897    -56
    65   -897   -897    103
   165   -897   -897    -56
  -897     77   -897    103
  -897    209   -897   -897
  -897   -897    221   -897
   -35    135   -897    -56
  -897    209   -897   -897
   165   -897    -11   -897
   197   -897   -897   -897
   -35    176   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 2.6e-001
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.400000  0.000000  0.000000  0.600000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.400000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.600000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CGA[TA][GC][CT][TA][AT][TC]CG[CAT]C[AG]A[CA]
--------------------------------------------------------------------------------

Time 1.52 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 6 llr = 91 E-value = 3.3e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a2::8:2:::52:a:
pos.-specific     C  :855:a::533:2::
probability       G  :::52::a:728::a
matrix            T  ::5:::8:5:::8::

         bits    2.2        *      *
                 2.0 *    * *     **
                 1.8 *    * *     **
                 1.5 *    * *   * **
Relative         1.3 **  ** *   * **
Entropy          1.1 ** ***** * ****
(21.9 bits)      0.9 ********** ****
                 0.7 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           ACCCACTGCGAGTAG
consensus              TG    TCC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
40186                       108  4.84e-09 CGAGTGGACC ACCCACTGCGCGTAG ACAGGACGTG
40189                       108  4.84e-09 CGAGTGGACC ACCCACTGCGCGTAG ACAGGACGTG
40634                       186  5.66e-09 AATTAGCAGG ACTGACTGTGAGTAG ACACTACGAT
33840                       433  3.07e-08 GATTGATTGT ACTGACTGCGAGCAG TTCATCCTCC
39769                       361  4.01e-07 ACCTTCGTTT ACTCACAGTCAATAG GAGTAACGGG
46006                       191  4.92e-07 ATGTGAATAA AACGGCTGTCGGTAG TATATGCGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40186                             4.8e-09  107_[+3]_378
40189                             4.8e-09  107_[+3]_378
40634                             5.7e-09  185_[+3]_300
33840                             3.1e-08  432_[+3]_53
39769                               4e-07  360_[+3]_125
46006                             4.9e-07  190_[+3]_295
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=6
40186                    (  108) ACCCACTGCGCGTAG  1
40189                    (  108) ACCCACTGCGCGTAG  1
40634                    (  186) ACTGACTGTGAGTAG  1
33840                    (  433) ACTGACTGCGAGCAG  1
39769                    (  361) ACTCACAGTCAATAG  1
46006                    (  191) AACGGCTGTCGGTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4374 bayes= 9.95578 E= 3.3e-001
   197   -923   -923   -923
   -61    182   -923   -923
  -923    109   -923     76
  -923    109    121   -923
   171   -923    -37   -923
  -923    209   -923   -923
   -61   -923   -923    150
  -923   -923    221   -923
  -923    109   -923     76
  -923     50    163   -923
    97     50    -37   -923
   -61   -923    195   -923
  -923    -50   -923    150
   197   -923   -923   -923
  -923   -923    221   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 3.3e-001
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.500000  0.500000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.333333  0.666667  0.000000
 0.500000  0.333333  0.166667  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.166667  0.000000  0.833333
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AC[CT][CG]ACTG[CT][GC][AC]GTAG
--------------------------------------------------------------------------------

Time 2.22 secs.

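
Note on the Motif 3 regular expression: it is a simplification of the position-specific probability matrix, so it need not match every site MEME reported. In this file the letters it omits at variable positions are those with probability 0.166667. The sketch below is not part of MEME; it only applies Python's standard `re` module to the six Motif 3 sites copied from the BLOCKS section. The names `MOTIF3_REGEX`, `MOTIF3_SITES` and `sites_matching_regex` are made up for illustration.

```python
import re

# Motif 3 regular expression, copied from the MEME output above.
MOTIF3_REGEX = re.compile(r"AC[CT][CG]ACTG[CT][GC][AC]GTAG")

# The six Motif 3 sites from the BLOCKS section, keyed by sequence name.
MOTIF3_SITES = {
    "40186": "ACCCACTGCGCGTAG",
    "40189": "ACCCACTGCGCGTAG",
    "40634": "ACTGACTGTGAGTAG",
    "33840": "ACTGACTGCGAGCAG",
    "39769": "ACTCACAGTCAATAG",
    "46006": "AACGGCTGTCGGTAG",
}


def sites_matching_regex(sites, pattern):
    """Return the sequence names whose whole site is matched by the pattern."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


matched = sites_matching_regex(MOTIF3_SITES, MOTIF3_REGEX)
missed = [name for name in MOTIF3_SITES if name not in matched]
print("matched:", matched)
print("missed: ", missed)
```

Only three of the six sites (40186, 40189, 40634) pass the regular expression. The other three each differ at one position where the regular expression allows fewer letters than the matrix. For scanning new sequences, the log-odds matrix is therefore likely the more sensitive choice.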
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47158                            1.14e-04  410_[+2(1.55e-08)]_74
48319                            2.95e-08  136_[+1(1.62e-08)]_23_\
    [+2(7.72e-08)]_307
40634                            1.87e-08  185_[+3(5.66e-09)]_251_\
    [+1(8.39e-08)]_31
33840                            3.23e-08  282_[+2(2.29e-08)]_134_\
    [+3(3.07e-08)]_53
46006                            8.64e-03  190_[+3(4.92e-07)]_295
36568                            2.36e-06  5_[+1(3.95e-10)]_477
40189                            1.24e-16  107_[+3(4.84e-09)]_193_\
    [+2(1.82e-10)]_130_[+1(1.24e-09)]_21
40186                            1.24e-16  107_[+3(4.84e-09)]_193_\
    [+2(1.82e-10)]_130_[+1(1.24e-09)]_21
39769                            2.07e-07  360_[+3(4.01e-07)]_91_\
    [+1(6.26e-08)]_16
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
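
Note on reading the combined block diagrams: each diagram alternates spacer lengths with motif occurrences written as `[strand motif(p-value)]`, so start positions can be recovered by walking the diagram and adding up spacers and motif widths. The sketch below is a minimal illustration, not MEME or MAST code. It assumes the widths 18, 16 and 15 from the MOTIF headers of this file, and the names `MOTIF_WIDTHS`, `TOKEN` and `parse_diagram` are hypothetical.

```python
import re

# Motif widths as reported in the MOTIF headers of this file.
MOTIF_WIDTHS = {1: 18, 2: 16, 3: 15}

# A token is either a motif occurrence such as "[+3(4.84e-09)]"
# (strand, motif number, position p-value) or a bare spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths):
    """Walk a combined block diagram.

    Returns a list of (motif, strand, start, p_value) tuples with 1-based
    start positions, and the total sequence length implied by the diagram.
    """
    # Drop the "\" line-continuation marker and any whitespace around it.
    flat = "".join(diagram.replace("\\", "").split())
    pos = 0  # number of sequence positions consumed so far
    sites = []
    for match in TOKEN.finditer(flat):
        strand, motif, p_value, spacer = match.groups()
        if spacer is not None:
            pos += int(spacer)
        else:
            motif = int(motif)
            sites.append((motif, strand, pos + 1, float(p_value)))
            pos += widths[motif]
    return sites, pos


# Diagram for sequence 40186, including the continuation marker as printed.
DIAGRAM_40186 = r"107_[+3(4.84e-09)]_193_\ [+2(1.82e-10)]_130_[+1(1.24e-09)]_21"

sites, length = parse_diagram(DIAGRAM_40186, MOTIF_WIDTHS)
for motif, strand, start, p_value in sites:
    print(f"motif {motif} strand {strand} start {start} p-value {p_value:g}")
print("implied sequence length:", length)
```

For sequence 40186 this gives starts 108, 316 and 462 for motifs 3, 2 and 1, which agree with the per-motif site tables, and an implied length of 500, which agrees with the TRAINING SET table.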