********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/422/422.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
48164                    1.0000    500  43412                    1.0000    500
49062                    1.0000    500  49844                    1.0000    500
44684                    1.0000    500  36008                    1.0000    500
36009                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/422/422.seqs.fa -oc motifs/422 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.280 C 0.237 G 0.211 T 0.272
Background letter frequencies (from dataset with add-one prior applied):
A 0.280 C 0.237 G 0.211 T 0.272
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   5  llr = 71  E-value = 1.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::::a:2::::
pos.-specific     C  a:::8:a:a4a:
probability       G  :8:82::8:4:a
matrix            T  :2a2:::::2::

         bits    2.2            *
                 2.0 *     * * **
                 1.8 * *  ** * **
                 1.6 * *  ** * **
Relative         1.3 ********* **
Entropy          1.1 ********* **
(20.6 bits)      0.9 ********* **
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGTGCACGCCCG
consensus             T TG  A G
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
36009                       236  5.06e-08 TGTATTGGTG CGTGCACGCCCG CTTGCTTCTC
36008                       236  5.06e-08 TGTATTGGTG CGTGCACGCCCG CTTGCTTCTC
49844                       213  2.19e-07 TTCCAACTTC CGTTCACGCGCG GGGAGAGAGT
43412                       469  3.51e-07 CAACATAAAC CGTGGACGCTCG AGGAAACTCA
49062                       425  7.68e-07 AAAAATATGT CTTGCACACGCG GCATACCAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36009                             5.1e-08  235_[+1]_253
36008                             5.1e-08  235_[+1]_253
49844                             2.2e-07  212_[+1]_276
43412                             3.5e-07  468_[+1]_20
49062                             7.7e-07  424_[+1]_64
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=5
36009                    (  236) CGTGCACGCCCG  1
36008                    (  236) CGTGCACGCCCG  1
49844                    (  213) CGTTCACGCGCG  1
43412                    (  469) CGTGGACGCTCG  1
49062                    (  425) CTTGCACACGCG  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 9.66888 E= 1.4e-001
  -897    207   -897   -897
  -897   -897    192    -44
  -897   -897   -897    188
  -897   -897    192    -44
  -897    175     -8   -897
   184   -897   -897   -897
  -897    207   -897   -897
   -48   -897    192   -897
  -897    207   -897   -897
  -897     75     92    -44
  -897    207   -897   -897
  -897   -897    224   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 5 E= 1.4e-001
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.800000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.400000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[GT]T[GT][CG]AC[GA]C[CGT]CG
--------------------------------------------------------------------------------

Time  0.42 secs.
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   4  llr = 87  E-value = 6.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::5:::3::338::::::
pos.-specific     C  8aa3a:aa8:a:53:a5a:a8
probability       G  3::8::::38:3:5::3:8:3
matrix            T  :::::5:::::83:3:3:3::

         bits    2.2
                 2.0  ** * **  *    * * *
                 1.8  ** * **  *    * * *
                 1.6  ** * **  *    * * *
Relative         1.3 ***** *****    * ****
Entropy          1.1 ***** ******  ** ****
(31.4 bits)      0.9 ************  ** ****
                 0.7 ************ ********
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCCGCACCCGCTCGACCCGCC
consensus            G  C T  GA GAAT G T G
sequence                         TC  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
36009                       378  1.47e-13 CTTGTTGGCT CCCGCTCCCGCTCGACCCGCC TTTCGTTGAA
36008                       378  2.75e-12 CTTGTTGGCT CCCGCTCCCACTCGACCCGCC TTTCGTCGAA
49844                       400  2.40e-10 TCTACGAACG CCCGCACCCGCGAATCTCGCC GCCCAGCACC
43412                       362  1.13e-09 GAATCAGCGC GCCCCACCGGCTTCACGCTCG AGGAAGTTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36009                             1.5e-13  377_[+2]_102
36008                             2.7e-12  377_[+2]_102
49844                             2.4e-10  399_[+2]_80
43412                             1.1e-09  361_[+2]_118
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=4
36009                    (  378) CCCGCTCCCGCTCGACCCGCC  1
36008                    (  378) CCCGCTCCCACTCGACCCGCC  1
49844                    (  400) CCCGCACCCGCGAATCTCGCC  1
43412                    (  362) GCCCCACCGGCTTCACGCTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 9.71253 E= 6.4e-001
  -865    166     24   -865
  -865    207   -865   -865
  -865    207   -865   -865
  -865      8    182   -865
  -865    207   -865   -865
    84   -865   -865     88
  -865    207   -865   -865
  -865    207   -865   -865
  -865    166     24   -865
   -16   -865    182   -865
  -865    207   -865   -865
  -865   -865     24    146
   -16    107   -865    -12
   -16      8    124   -865
   142   -865   -865    -12
  -865    207   -865   -865
  -865    107     24    -12
  -865    207   -865   -865
  -865   -865    182    -12
  -865    207   -865   -865
  -865    166     24   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 6.4e-001
 0.000000  0.750000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.250000  0.500000  0.000000  0.250000
 0.250000  0.250000  0.500000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CG]CC[GC]C[AT]CC[CG][GA]C[TG][CAT][GAC][AT]C[CGT]C[GT]C[CG]
--------------------------------------------------------------------------------

Time  0.93 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   7  llr = 98  E-value = 6.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::37:16:71::aa
pos.-specific     C  1:4:6::919::3a::
probability       G  :4:9::::1:17::::
matrix            T  966113a:11117:::

         bits    2.2
                 2.0              *
                 1.8       *      ***
                 1.6    *  *      ***
Relative         1.3 *  *  ** *   ***
Entropy          1.1 ** *  ** *  ****
(20.2 bits)      0.9 **** *** * *****
                 0.7 ******** *******
                 0.4 ******** *******
                 0.2 ****************
                 0.0 ----------------

Multilevel           TTTGCATCACAGTCAA
consensus             GC AT      C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
36009                        26  6.39e-10 ATCGCGTGTT TGTGCATCACAGTCAA ACGCGTATGT
36008                        26  6.39e-10 ATCGCGTGTT TGTGCATCACAGTCAA ACGCGTATGT
48164                        22  1.66e-07 CGAATCTAGA TTTTTATCACAGTCAA TTTTAGTTTG
44684                       338  2.53e-07 ATGCGTACAA TTCGCATCGTAGCCAA TCCAGATTCG
43412                       446  2.75e-07 TTTCAGCGGA TTTGCTTCTCAATCAA CATAAACCGT
49844                       313  5.67e-07 GTAATCACGA CTCGATTCACGGTCAA TCCGGTGGTC
49062                       205  4.47e-06 CCAGATTGAC TGCGAATACCTTCCAA GGAATCCTAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36009                             6.4e-10  25_[+3]_459
36008                             6.4e-10  25_[+3]_459
48164                             1.7e-07  21_[+3]_463
44684                             2.5e-07  337_[+3]_147
43412                             2.7e-07  445_[+3]_39
49844                             5.7e-07  312_[+3]_172
49062                             4.5e-06  204_[+3]_280
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=7
36009                    (   26) TGTGCATCACAGTCAA  1
36008                    (   26) TGTGCATCACAGTCAA  1
48164                    (   22) TTTTTATCACAGTCAA  1
44684                    (  338) TTCGCATCGTAGCCAA  1
43412                    (  446) TTTGCTTCTCAATCAA  1
49844                    (  313) CTCGATTCACGGTCAA  1
49062                    (  205) TGCGAATACCTTCCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 3395 bayes= 8.91886 E= 6.6e-001
  -945    -73   -945    166
  -945   -945    102    107
  -945     85   -945    107
  -945   -945    202    -93
     3    127   -945    -93
   135   -945   -945      7
  -945   -945   -945    188
   -97    185   -945   -945
   103    -73    -56    -93
  -945    185   -945    -93
   135   -945    -56    -93
   -97   -945    175    -93
  -945     27   -945    139
  -945    207   -945   -945
   184   -945   -945   -945
   184   -945   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 6.6e-001
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.428571  0.000000  0.571429
 0.000000  0.000000  0.857143  0.142857
 0.285714  0.571429  0.000000  0.142857
 0.714286  0.000000  0.000000  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.857143  0.000000  0.000000
 0.571429  0.142857  0.142857  0.142857
 0.000000  0.857143  0.000000  0.142857
 0.714286  0.000000  0.142857  0.142857
 0.142857  0.000000  0.714286  0.142857
 0.000000  0.285714  0.000000  0.714286
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[TG][TC]G[CA][AT]TCACAG[TC]CAA
--------------------------------------------------------------------------------

Time  1.41 secs.
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48164                            8.78e-04  21_[+3(1.66e-07)]_463
43412                            6.79e-12  361_[+2(1.13e-09)]_63_\
    [+3(2.75e-07)]_7_[+1(3.51e-07)]_20
49062                            2.68e-05  204_[+3(4.47e-06)]_204_\
    [+1(7.68e-07)]_64
49844                            2.00e-12  212_[+1(2.19e-07)]_88_\
    [+3(5.67e-07)]_71_[+2(2.40e-10)]_80
44684                            2.97e-03  337_[+3(2.53e-07)]_147
36008                            1.12e-17  25_[+3(6.39e-10)]_194_\
    [+1(5.06e-08)]_102_[+3(6.73e-05)]_12_[+2(2.75e-12)]_102
36009                            6.75e-19  25_[+3(6.39e-10)]_194_\
    [+1(5.06e-08)]_102_[+3(6.73e-05)]_12_[+2(1.47e-13)]_102
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************