********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/474/474.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
53961                    1.0000    500  13230                    1.0000    500
944                      1.0000    500  34415                    1.0000    500
45662                    1.0000    500  46013                    1.0000    500
46239                    1.0000    500  44073                    1.0000    500
50572                    1.0000    500  33054                    1.0000    500
48943                    1.0000    500  50187                    1.0000    500
50184                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/474/474.seqs.fa -oc motifs/474 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.296 C 0.228 G 0.204 T 0.272
Background letter frequencies (from dataset with add-one prior applied):
A 0.296 C 0.228 G 0.204 T 0.272
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 13 llr = 121 E-value = 1.0e+001
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :::::::1:25: pos.-specific C 11::4:54a215 probability G ::::2:22:221 matrix T 99aa5a33:534 bits 2.3 2.1 * 1.8 ** * * 1.6 **** * * Relative 1.4 **** * * Entropy 1.1 **** * * (13.4 bits) 0.9 **** * * 0.7 ******* * * 0.5 ******* * * 0.2 ************ 0.0 ------------ Multilevel TTTTTTCCCTAC consensus C TT TT sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 46013 274 1.62e-07 GAGCGAGCAA TTTTCTCCCTAC TCACTCCATA 50572 98 2.23e-06 AATGTCAATT TTTTTTCGCTAT TTGGCTCTGG 53961 21 2.23e-06 TTGTATTTTC TTTTTTCGCTAT CGAATTAAAT 45662 218 4.87e-06 TCCTTAGTCT TTTTCTCTCGAC TGTTATATAG 50184 218 1.27e-05 GAAAGACATA TTTTCTGCCTTT TTGTCAAAAG 33054 439 1.61e-05 ATTCCATAAA TTTTTTCGCATC CTTACAGTTG 50187 222 2.40e-05 TCTCGTTCGG TTTTTTCTCAGC AGTCATTTTC 46239 104 3.66e-05 TTTGATTGCC TTTTCTTCCTTG CCTTCGTGGT 34415 216 4.04e-05 CGCTCAGAAT TTTTTTTACTTC CACCCAAAAT 48943 192 6.45e-05 GATACAACGG TTTTGTTCCTCT TAAGCCTTAT 13230 407 7.20e-05 ACGGTCCTCC TTTTGTTTCCGC GTCAAGTAAG 44073 328 9.49e-05 CATTTGGGGA CTTTCTCTCGAT TGTTTTGTCT 944 332 1.15e-04 ACAATCTTGA TCTTTTGCCCAC ACCTGGCGTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 46013 1.6e-07 273_[+1]_215 50572 2.2e-06 97_[+1]_391 53961 2.2e-06 
20_[+1]_468 45662 4.9e-06 217_[+1]_271 50184 1.3e-05 217_[+1]_271 33054 1.6e-05 438_[+1]_50 50187 2.4e-05 221_[+1]_267 46239 3.7e-05 103_[+1]_385 34415 4e-05 215_[+1]_273 48943 6.4e-05 191_[+1]_297 13230 7.2e-05 406_[+1]_82 44073 9.5e-05 327_[+1]_161 944 0.00012 331_[+1]_157 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=12 seqs=13 46013 ( 274) TTTTCTCCCTAC 1 50572 ( 98) TTTTTTCGCTAT 1 53961 ( 21) TTTTTTCGCTAT 1 45662 ( 218) TTTTCTCTCGAC 1 50184 ( 218) TTTTCTGCCTTT 1 33054 ( 439) TTTTTTCGCATC 1 50187 ( 222) TTTTTTCTCAGC 1 46239 ( 104) TTTTCTTCCTTG 1 34415 ( 216) TTTTTTTACTTC 1 48943 ( 192) TTTTGTTCCTCT 1 13230 ( 407) TTTTGTTTCCGC 1 44073 ( 328) CTTTCTCTCGAT 1 944 ( 332) TCTTTTGCCCAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 8.93074 E= 1.0e+001 -1035 -157 -1035 176 -1035 -157 -1035 176 -1035 -1035 -1035 188 -1035 -1035 -1035 188 -1035 75 -41 76 -1035 -1035 -1035 188 -1035 124 -41 18 -194 75 18 18 -1035 213 -1035 -1035 -94 -57 -41 99 64 -157 -41 18 -1035 124 -141 50 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 1.0e+001 0.000000 0.076923 0.000000 0.923077 0.000000 0.076923 0.000000 0.923077 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 
0.000000 1.000000 0.000000 0.384615 0.153846 0.461538 0.000000 0.000000 0.000000 1.000000 0.000000 0.538462 0.153846 0.307692 0.076923 0.384615 0.230769 0.307692 0.000000 1.000000 0.000000 0.000000 0.153846 0.153846 0.153846 0.538462 0.461538 0.076923 0.153846 0.307692 0.000000 0.538462 0.076923 0.384615 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- TTTT[TC]T[CT][CTG]CT[AT][CT] -------------------------------------------------------------------------------- Time 1.61 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 5 llr = 83 E-value = 1.0e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :::4:6:a:::::8:a pos.-specific C 8:26:2:::8::::8: probability G ::8:6:a:2:a6a:2: matrix T 2a::42::82:4:2:: bits 2.3 * * * 2.1 * * * 1.8 * ** * * * 1.6 ** ** * * * Relative 1.4 *** ** ** * ** Entropy 1.1 *** * ********** (24.1 bits) 0.9 ***** ********** 0.7 ***** ********** 0.5 **************** 0.2 **************** 0.0 ---------------- Multilevel CTGCGAGATCGGGACA consensus T CATC GT T TG sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 50187 178 3.33e-09 GCCAAAAACC 
CTGCTAGATCGGGTCA TTTATCAGAG 50184 49 8.28e-09 AACCAGTAGA CTGAGAGAGCGTGACA TTTTTGGGTC 944 55 1.25e-08 GAAATAAGCG CTGCTCGATCGGGAGA CATTCGGGCC 45662 485 1.41e-08 AGGTAATCTC TTGAGAGATCGTGACA 50572 42 4.06e-08 GACCTGGAGT CTCCGTGATTGGGACA ACATTGGCCT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 50187 3.3e-09 177_[+2]_307 50184 8.3e-09 48_[+2]_436 944 1.3e-08 54_[+2]_430 45662 1.4e-08 484_[+2] 50572 4.1e-08 41_[+2]_443 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=16 seqs=5 50187 ( 178) CTGCTAGATCGGGTCA 1 50184 ( 49) CTGAGAGAGCGTGACA 1 944 ( 55) CTGCTCGATCGGGAGA 1 45662 ( 485) TTGAGAGATCGTGACA 1 50572 ( 42) CTCCGTGATTGGGACA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.73306 E= 1.0e+001 -897 181 -897 -44 -897 -897 -897 188 -897 -19 197 -897 43 139 -897 -897 -897 -897 155 56 102 -19 -897 -44 -897 -897 229 -897 176 -897 -897 -897 -897 -897 -3 156 -897 181 -897 -44 -897 -897 229 -897 -897 -897 155 56 -897 -897 229 -897 143 -897 -897 -44 -897 181 -3 -897 176 -897 -897 -897 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 
position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 1.0e+001 0.000000 0.800000 0.000000 0.200000 0.000000 0.000000 0.000000 1.000000 0.000000 0.200000 0.800000 0.000000 0.400000 0.600000 0.000000 0.000000 0.000000 0.000000 0.600000 0.400000 0.600000 0.200000 0.000000 0.200000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.200000 0.800000 0.000000 0.800000 0.000000 0.200000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.600000 0.400000 0.000000 0.000000 1.000000 0.000000 0.800000 0.000000 0.000000 0.200000 0.000000 0.800000 0.200000 0.000000 1.000000 0.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [CT]T[GC][CA][GT][ACT]GA[TG][CT]G[GT]G[AT][CG]A -------------------------------------------------------------------------------- Time 3.25 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 12 sites = 7 llr = 84 E-value = 4.8e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::3a:a:::74: pos.-specific C 714:a:::6::9 probability G ::1:::a::361 matrix T 391::::a4::: bits 2.3 * 2.1 * * 1.8 ***** 1.6 ***** * Relative 1.4 * ***** * Entropy 1.1 ** ********* (17.2 bits) 0.9 ** ********* 0.7 ** ********* 0.5 ** ********* 0.2 ************ 0.0 ------------ Multilevel CTCACAGTCAGC consensus T A TGA sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 45662 382 4.20e-07 CAAACAATCG TTCACAGTCAGC CCCTAATCAC 50184 174 6.75e-07 TTTACATGCC CTCACAGTTAAC TAGCTGTAAA 48943 455 6.75e-07 AACTTTGCGA CTAACAGTTAGC ATTAGTAGTT 33054 451 2.58e-06 TTTTCGCATC CTTACAGTTGGC TGAATAGAAA 50187 371 3.23e-06 CGCCTCCTTA CTCACAGTCAAG GTTCATATTG 53961 381 3.79e-06 GGCTGATTGC TTGACAGTCAAC GCAACTGGAT 13230 131 4.19e-06 ACGGTCAACA CCAACAGTCGGC GGCTGCGGCC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 45662 4.2e-07 381_[+3]_107 50184 6.7e-07 173_[+3]_315 48943 6.7e-07 
454_[+3]_34 33054 2.6e-06 450_[+3]_38 50187 3.2e-06 370_[+3]_118 53961 3.8e-06 380_[+3]_108 13230 4.2e-06 130_[+3]_358 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=12 seqs=7 45662 ( 382) TTCACAGTCAGC 1 50184 ( 174) CTCACAGTTAAC 1 48943 ( 455) CTAACAGTTAGC 1 33054 ( 451) CTTACAGTTGGC 1 50187 ( 371) CTCACAGTCAAG 1 53961 ( 381) TTGACAGTCAAC 1 13230 ( 131) CCAACAGTCGGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.66888 E= 4.8e+002 -945 164 -945 7 -945 -68 -945 166 -5 91 -52 -92 176 -945 -945 -945 -945 213 -945 -945 176 -945 -945 -945 -945 -945 229 -945 -945 -945 -945 188 -945 132 -945 66 127 -945 48 -945 53 -945 148 -945 -945 191 -52 -945 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 4.8e+002 0.000000 0.714286 0.000000 0.285714 0.000000 0.142857 0.000000 0.857143 0.285714 0.428571 0.142857 0.142857 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.571429 0.000000 0.428571 0.714286 0.000000 0.285714 0.000000 0.428571 0.000000 0.571429 0.000000 0.000000 0.857143 0.142857 0.000000 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[CT]T[CA]ACAGT[CT][AG][GA]C
--------------------------------------------------------------------------------

Time 4.87 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
53961                            7.59e-05  20_[+1(2.23e-06)]_348_\
                                           [+3(3.79e-06)]_108
13230                            9.58e-04  130_[+3(4.19e-06)]_264_\
                                           [+1(7.20e-05)]_82
944                              2.42e-05  54_[+2(1.25e-08)]_430
34415                            1.63e-01  215_[+1(4.04e-05)]_273
45662                            1.26e-09  217_[+1(4.87e-06)]_152_\
                                           [+3(4.20e-07)]_91_[+2(1.41e-08)]
46013                            4.23e-03  273_[+1(1.62e-07)]_215
46239                            1.51e-01  103_[+1(3.66e-05)]_385
44073                            3.89e-02  327_[+1(9.49e-05)]_144_\
                                           [+3(6.19e-05)]_5
50572                            9.98e-07  41_[+2(4.06e-08)]_40_[+1(2.23e-06)]_\
                                           391
33054                            3.37e-04  438_[+1(1.61e-05)]_[+3(2.58e-06)]_\
                                           38
48943                            7.78e-04  191_[+1(6.45e-05)]_251_\
                                           [+3(6.75e-07)]_34
50187                            9.50e-09  177_[+2(3.33e-09)]_28_\
                                           [+1(2.40e-05)]_137_[+3(3.23e-06)]_118
50184                            2.90e-09  48_[+2(8.28e-09)]_109_\
                                           [+3(6.75e-07)]_32_[+1(1.27e-05)]_271
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************