********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/35/35.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42478                    1.0000    500  28684                    1.0000    500
37922                    1.0000    500  47655                    1.0000    500
50489                    1.0000    500  34431                    1.0000    500
46140                    1.0000    500  46157                    1.0000    500
46165                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/35/35.seqs.fa -oc motifs/35 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.276 C 0.218 G 0.223 T 0.282
Background letter frequencies (from dataset with add-one prior applied):
A 0.276 C 0.218 G 0.223 T 0.282
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  18  sites =   5  llr = 94  E-value = 6.7e-002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1
Description -------------------------------------------------------------------------------- Simplified A ::::::::::a622:888 pos.-specific C :a2:a28:6::4::a::2 probability G a::6::::4a::48:2:: matrix T ::84:82a::::4:::2: bits 2.2 ** * * * 2.0 ** * * * 1.8 ** * * ** * 1.5 ** * * ** * Relative 1.3 ** * ***** ** Entropy 1.1 ************ ***** (27.0 bits) 0.9 ************ ***** 0.7 ************ ***** 0.4 ****************** 0.2 ****************** 0.0 ------------------ Multilevel GCTGCTCTCGAAGGCAAA consensus CT CT G CTA GTC sequence A -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------ 47655 163 1.18e-10 TGTTACCTTA GCTTCTCTCGAATGCAAA AATCGCTATG 37922 163 1.18e-10 TGTTACCTTA GCTTCTCTCGAATGCAAA AATCGCTATG 42478 82 1.90e-09 TCGGGAAGGG GCTGCTCTCGACGGCATC GCCGATCCTG 46165 106 6.60e-09 TTCACGGTTA GCTGCTTTGGAAAGCGAA ATATTGTCAG 50489 250 8.46e-09 ACGCGGAGTT GCCGCCCTGGACGACAAA GGTTCGGGAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 47655 1.2e-10 162_[+1]_320 37922 1.2e-10 162_[+1]_320 42478 1.9e-09 81_[+1]_401 46165 6.6e-09 105_[+1]_377 50489 8.5e-09 249_[+1]_233 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=18 seqs=5 47655 ( 163) 
GCTTCTCTCGAATGCAAA  1
37922                    (  163) GCTTCTCTCGAATGCAAA  1
42478                    (   82) GCTGCTCTCGACGGCATC  1
46165                    (  106) GCTGCTTTGGAAAGCGAA  1
50489                    (  250) GCCGCCCTGGACGACAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 4347 bayes= 9.19582 E= 6.7e-002
  -897   -897    216   -897
  -897    220   -897   -897
  -897    -12   -897    150
  -897   -897    143     50
  -897    220   -897   -897
  -897    -12   -897    150
  -897    187   -897    -50
  -897   -897   -897    182
  -897    146     84   -897
  -897   -897    216   -897
   185   -897   -897   -897
   112     87   -897   -897
   -47   -897     84     50
   -47   -897    184   -897
  -897    220   -897   -897
   153   -897    -16   -897
   153   -897   -897    -50
   153    -12   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 5 E= 6.7e-002
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  0.600000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.200000  0.000000  0.400000  0.400000
 0.200000  0.000000  0.800000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.800000  0.200000  0.000000  0.000000
--------------------------------------------------------------------------------
-------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- GC[TC][GT]C[TC][CT]T[CG]GA[AC][GTA][GA]C[AG][AT][AC] -------------------------------------------------------------------------------- Time 0.88 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 8 llr = 93 E-value = 4.2e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 6::1:9::6911 pos.-specific C 3::94::8::99 probability G ::::6::331:: matrix T 1aa::1a:1::: bits 2.2 2.0 1.8 ** * 1.5 *** * ** Relative 1.3 ******* *** Entropy 1.1 ******* *** (16.8 bits) 0.9 ******* *** 0.7 ************ 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel ATTCGATCAACC consensus C C GG sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 47655 364 1.30e-07 GAGCATTGCA ATTCCATCAACC TTTGATGATG 37922 364 1.30e-07 GAGCATTGCA ATTCCATCAACC TTTGATGATG 50489 368 1.81e-07 CCTCGGCTTC CTTCGATCAACC CTGCACAAAC 28684 485 6.45e-07 ATCCGATCCA ATTCGATCTACC AACA 34431 92 8.46e-07 TCCCAAGAAA ATTCGATGGACC ATGCTGAGCG 46157 152 8.17e-06 AAAACAGTTG TTTCCATCAACA ACAGTAAGTA 46140 197 1.12e-05 AAAAGTTAAT CTTAGTTCAACC CAAATGTCTT 42478 349 2.01e-05 GTCAGTTTGA ATTCGATGGGAC GGTCATACCA 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47655                             1.3e-07  363_[+2]_125
37922                             1.3e-07  363_[+2]_125
50489                             1.8e-07  367_[+2]_121
28684                             6.5e-07  484_[+2]_4
34431                             8.5e-07  91_[+2]_397
46157                             8.2e-06  151_[+2]_337
46140                             1.1e-05  196_[+2]_292
42478                               2e-05  348_[+2]_140
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=8
47655                    (  364) ATTCCATCAACC  1
37922                    (  364) ATTCCATCAACC  1
50489                    (  368) CTTCGATCAACC  1
28684                    (  485) ATTCGATCTACC  1
34431                    (   92) ATTCGATGGACC  1
46157                    (  152) TTTCCATCAACA  1
46140                    (  197) CTTAGTTCAACC  1
42478                    (  349) ATTCGATGGGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 9.83901 E= 4.2e-001
   118     20   -965   -117
  -965   -965   -965    182
  -965   -965   -965    182
  -114    200   -965   -965
  -965     78    148   -965
   166   -965   -965   -117
  -965   -965   -965    182
  -965    178     16   -965
   118   -965     16   -117
   166   -965    -83   -965
  -114    200   -965   -965
  -114    200   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites=
8 E= 4.2e-001 0.625000 0.250000 0.000000 0.125000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.125000 0.875000 0.000000 0.000000 0.000000 0.375000 0.625000 0.000000 0.875000 0.000000 0.000000 0.125000 0.000000 0.000000 0.000000 1.000000 0.000000 0.750000 0.250000 0.000000 0.625000 0.000000 0.250000 0.125000 0.875000 0.000000 0.125000 0.000000 0.125000 0.875000 0.000000 0.000000 0.125000 0.875000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [AC]TTC[GC]AT[CG][AG]ACC -------------------------------------------------------------------------------- Time 1.60 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 20 sites = 4 llr = 87 E-value = 1.2e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::338:::8::::a5a:::3 pos.-specific C a:83::533:a5a:5:::a: probability G :a::3558::::::::a::8 matrix T :::5:5:::a:5:::::a:: bits 2.2 ** * * * * 2.0 ** * * * * 1.8 ** ** ** **** 1.5 ** ** ** **** Relative 1.3 *** * ** ** ***** Entropy 1.1 *** **************** (31.3 bits) 0.9 *** **************** 0.7 *** **************** 0.4 ******************** 0.2 ******************** 0.0 -------------------- Multilevel CGCTAGCGATCCCAAAGTCG consensus AAGTGCC T C A sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position 
p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            --------------------
47655                       184  6.53e-12 ATGCAAAAAT CGCTATGGATCCCAAAGTCG TGTCCCTCCG
37922                       184  6.53e-12 ATGCAAAAAT CGCTATGGATCCCAAAGTCG TGTCCCTCCG
50489                       131  3.04e-10 TCCATGTGAT CGAAGGCGATCTCACAGTCG AGATGGCGGA
42478                       292  4.72e-10 TCAGACGATA CGCCAGCCCTCTCACAGTCA ACGCGCCATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47655                             6.5e-12  183_[+3]_297
37922                             6.5e-12  183_[+3]_297
50489                               3e-10  130_[+3]_350
42478                             4.7e-10  291_[+3]_189
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=4
47655                    (  184) CGCTATGGATCCCAAAGTCG  1
37922                    (  184) CGCTATGGATCCCAAAGTCG  1
50489                    (  131) CGAAGGCGATCTCACAGTCG  1
42478                    (  292) CGCCAGCCCTCTCACAGTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4329 bayes= 10.0785 E= 1.2e+000
  -865    219   -865   -865
  -865   -865    216   -865
   -14    178   -865   -865
   -14     20   -865     82
   144   -865     16   -865
  -865   -865    116     82
  -865    120    116   -865
  -865     20    175   -865
   144     20   -865   -865
  -865   -865   -865    182
  -865    219   -865   -865
  -865    120   -865     82
  -865    219   -865   -865
   185   -865   -865   -865
    85    120   -865   -865
   185   -865   -865   -865
  -865   -865    216   -865
  -865   -865   -865
   182
  -865    219   -865   -865
   -14   -865    175   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 1.2e+000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.250000  0.250000  0.000000  0.500000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
CG[CA][TAC][AG][GT][CG][GC][AC]TC[CT]CA[AC]AGTC[GA]
--------------------------------------------------------------------------------

Time  2.44 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42478                            1.24e-12  81_[+1(1.90e-09)]_192_\
    [+3(4.72e-10)]_37_[+2(2.01e-05)]_140
28684                            6.56e-03  484_[+2(6.45e-07)]_4
37922                            1.25e-17  162_[+1(1.18e-10)]_3_[+3(6.53e-12)]_\
    160_[+2(1.30e-07)]_91_[+2(2.04e-05)]_22
47655                            1.25e-17  162_[+1(1.18e-10)]_3_[+3(6.53e-12)]_\
    160_[+2(1.30e-07)]_91_[+2(2.04e-05)]_22
50489                            3.92e-14  130_[+3(3.04e-10)]_99_\
    [+1(8.46e-09)]_100_[+2(1.81e-07)]_121
34431                            5.79e-03  91_[+2(8.46e-07)]_397
46140                            3.68e-02  196_[+2(1.12e-05)]_292
46157                            4.11e-02  151_[+2(8.17e-06)]_337
46165                            1.04e-04  105_[+1(6.60e-09)]_377
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************