********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/136/136.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9124                     1.0000    500  42907                    1.0000    500
8802                     1.0000    500  55013                    1.0000    500
43850                    1.0000    500  12757                    1.0000    500
34876                    1.0000    500  47348                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/136/136.seqs.fa -oc motifs/136 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.278 C 0.239 G 0.217 T 0.266
Background letter frequencies (from dataset with add-one prior applied):
A 0.278 C 0.239 G 0.217 T 0.266
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 8 llr = 86 E-value = 3.4e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  9:::6:43::58
pos.-specific     C  :::949:::a:1
probability       G  151:::18::5:
matrix            T  :591:15:a::1

         bits    2.2
                 2.0         **
                 1.8         **
                 1.5    * *  **
Relative         1.3 * ** * ***
Entropy          1.1 **** * ****
(15.4 bits)      0.9 ****** *****
                 0.7 ****** *****
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AGTCACTGTCAA
consensus             T  C AA  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
12757                       395  3.05e-07 GCCCACAGTC AGTCCCTGTCGA ACGAAGTCTT
43850                       437  3.94e-07 TACTCTGTCC ATTCACTGTCAA TGCCTTTTTC
55013                       198  6.63e-07 AGTACTCGAT AGTCACAGTCAA ACCTAATTCC
47348                       355  3.50e-06 CAGTCAAGTA AGGCCCTGTCGA AGGACGAGGC
9124                        354  7.78e-06 CAACGCTGTC ATTCACAGTCAT ATCATCTTTG
8802                        450  1.27e-05 GCACAAATGC ATTCATGGTCGA GACTCAAATA
34876                        73  2.18e-05 TGGAAGTCTT AGTTCCTATCAA TCACTGTAAG
42907                       352  4.26e-05 GAATCAGCCT GTTCACAATCGC AATATGGATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12757                             3.1e-07  394_[+1]_94
43850                             3.9e-07  436_[+1]_52
55013                             6.6e-07  197_[+1]_291
47348                             3.5e-06  354_[+1]_134
9124                              7.8e-06  353_[+1]_135
8802                              1.3e-05  449_[+1]_39
34876                             2.2e-05  72_[+1]_416
42907                             4.3e-05  351_[+1]_137
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=8
12757                    (  395) AGTCCCTGTCGA  1
43850                    (  437) ATTCACTGTCAA  1
55013                    (  198) AGTCACAGTCAA  1
47348                    (  355) AGGCCCTGTCGA  1
9124                     (  354) ATTCACAGTCAT  1
8802                     (  450) ATTCATGGTCGA  1
34876                    (   73) AGTTCCTATCAA  1
42907                    (  352) GTTCACAATCGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3912 bayes= 8.93074 E= 3.4e+002
   165   -965    -79   -965
  -965   -965    120     91
  -965   -965    -79    172
  -965    187   -965   -109
   117     65   -965   -965
  -965    187   -965   -109
    43   -965    -79     91
   -15   -965    179   -965
  -965   -965   -965    191
  -965    206   -965   -965
    85   -965    120   -965
   143    -93   -965   -109
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 3.4e+002
 0.875000  0.000000  0.125000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.875000  0.000000  0.125000
 0.625000  0.375000  0.000000  0.000000
 0.000000  0.875000  0.000000  0.125000
 0.375000  0.000000  0.125000  0.500000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.750000  0.125000  0.000000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
A[GT]TC[AC]C[TA][GA]TC[AG]A
--------------------------------------------------------------------------------

Time 0.62 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 18 sites = 5 llr = 82 E-value = 4.0e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::84:266::8::2a:2:
pos.-specific     C  a2246824622::2::::
probability       G  :8::4:2::8:6a6:a2a
matrix            T  :::2::::4::4::::6:

         bits    2.2             *  * *
                 2.0 *           *  * *
                 1.8 *           * ** *
                 1.5 **       *  * ** *
Relative         1.3 **   *   *  * ** *
Entropy          1.1 *** **  ***** ** *
(23.7 bits)      0.9 *** ** ****** ** *
                 0.7 *** **************
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           CGAACCAACGAGGGAGTG
consensus             CCCGACCTCCT A  A
sequence                T  G      C  G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
34876                       305  2.81e-09 GTCGTGGTTT CGACGCCATGATGGAGTG CTGCGAACAC
12757                       159  4.50e-09 ATTACCTTAA CGATCCGACGAGGGAGGG GGCCAACTGA
42907                        92  1.08e-08 TTGGACAGGG CCAACCACCGAGGAAGTG TACGTACAAT
43850                       133  4.68e-08 AGTGTGTCAT CGACCCACTCCGGGAGAG GTCCTATATG
8802                        420  6.41e-08 CGATGGCACA CGCAGAAACGATGCAGTG TGGCACAAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34876                             2.8e-09  304_[+2]_178
12757                             4.5e-09  158_[+2]_324
42907                             1.1e-08  91_[+2]_391
43850                             4.7e-08  132_[+2]_350
8802                              6.4e-08  419_[+2]_63
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=5
34876                    (  305) CGACGCCATGATGGAGTG  1
12757                    (  159) CGATCCGACGAGGGAGGG  1
42907                    (   92) CCAACCACCGAGGAAGTG  1
43850                    (  133) CGACCCACTCCGGGAGAG  1
8802                     (  420) CGCAGAAACGATGCAGTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 3864 bayes= 9.84392 E= 4.0e+002
  -897    206   -897   -897
  -897    -26    188   -897
   152    -26   -897   -897
    53     74   -897    -41
  -897    133     88   -897
   -47    174   -897   -897
   111    -26    -12   -897
   111     74   -897   -897
  -897    133   -897     59
  -897    -26    188   -897
   152    -26   -897   -897
  -897   -897    147     59
  -897   -897    220   -897
   -47    -26    147   -897
   185   -897   -897   -897
  -897   -897    220   -897
   -47   -897    -12    117
  -897   -897    220   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 5 E= 4.0e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.400000  0.400000  0.000000  0.200000
 0.000000  0.600000  0.400000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.600000  0.200000  0.200000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.200000  0.800000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.000000  0.600000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.200000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
C[GC][AC][ACT][CG][CA][ACG][AC][CT][GC][AC][GT]G[GAC]AG[TAG]G
--------------------------------------------------------------------------------

Time 1.29 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 4 llr = 63 E-value = 7.9e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::3:3:::8a
pos.-specific     C  a:a33:83::a:3::
probability       G  :::88a::3::883:
matrix            T  :a::::3588:3:::

         bits    2.2      *
                 2.0 ***  *    *
                 1.8 ***  *    *   *
                 1.5 ***  *    *   *
Relative         1.3 *******   *** *
Entropy          1.1 ******* *******
(22.6 bits)      0.9 ******* *******
                 0.7 ******* *******
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CTCGGGCTTTCGGAA
consensus               CC TAGA TCG
sequence                    C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
12757                       307  6.82e-09 ATAAAATGAC CTCGGGCTTTCTGAA ACATGTTTTG
43850                       322  4.03e-08 CTGTGCTTCT CTCGGGTCTTCGGGA AGTGTCAAAT
47348                       311  7.38e-08 GTGGATGGTG CTCGGGCTGACGCAA CGTCCAAACT
8802                        241  7.38e-08 ACCGGATTCA CTCCCGCATTCGGAA CCAAGCGTAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12757                             6.8e-09  306_[+3]_179
43850                               4e-08  321_[+3]_164
47348                             7.4e-08  310_[+3]_175
8802                              7.4e-08  240_[+3]_245
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=4
12757                    (  307) CTCGGGCTTTCTGAA  1
43850                    (  322) CTCGGGTCTTCGGGA  1
47348                    (  311) CTCGGGCTGACGCAA  1
8802                     (  241) CTCCCGCATTCGGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 3888 bayes= 9.92333 E= 7.9e+002
  -865    206   -865   -865
  -865   -865   -865    191
  -865    206   -865   -865
  -865      6    179   -865
  -865      6    179   -865
  -865   -865    220   -865
  -865    165   -865     -9
   -15      6   -865     91
  -865   -865     20    149
   -15   -865   -865    149
  -865    206   -865   -865
  -865   -865    179     -9
  -865      6    179   -865
   143   -865     20   -865
   185   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 4 E= 7.9e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.250000  0.000000  0.500000
 0.000000  0.000000  0.250000  0.750000
 0.250000  0.000000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.250000  0.750000  0.000000
 0.750000  0.000000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
CTC[GC][GC]G[CT][TAC][TG][TA]C[GT][GC][AG]A
--------------------------------------------------------------------------------

Time 1.90 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9124                             2.81e-02  353_[+1(7.78e-06)]_135
42907                            3.25e-06  91_[+2(1.08e-08)]_242_\
    [+1(4.26e-05)]_137
8802                             2.45e-09  240_[+3(7.38e-08)]_164_\
    [+2(6.41e-08)]_12_[+1(1.27e-05)]_39
55013                            5.91e-03  197_[+1(6.63e-07)]_291
43850                            4.12e-11  132_[+2(4.68e-08)]_171_\
    [+3(4.03e-08)]_100_[+1(3.94e-07)]_52
12757                            6.78e-13  158_[+2(4.50e-09)]_86_\
    [+3(8.73e-05)]_29_[+3(6.82e-09)]_73_[+1(3.05e-07)]_94
34876                            2.11e-06  72_[+1(2.18e-05)]_220_\
    [+2(2.81e-09)]_178
47348                            5.24e-06  310_[+3(7.38e-08)]_29_\
    [+1(3.50e-06)]_134
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************