********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/319/319.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
28191                    1.0000    500  36848                    1.0000    500
46904                    1.0000    500  13897                    1.0000    500
14327                    1.0000    500  48159                    1.0000    500
48362                    1.0000    500  44012                    1.0000    500
52750                    1.0000    500  45504                    1.0000    500
45802                    1.0000    500  43513                    1.0000    500
45870                    1.0000    500  40662                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/319/319.seqs.fa -oc motifs/319 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.234 G 0.230 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.234 G 0.230 T 0.263
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 6 llr = 108 E-value = 4.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :75::3:28::223::3:::2
pos.-specific     C  7::::7:::8:52:25:55:8
probability       G  323:a:28:2a373::75:a:
matrix            T  :22a::8:2::::385::5::

         bits    2.1     *     *        *
                 1.9    **     *        *
                 1.7    **     *        *
                 1.5    **  * **        **
Relative         1.3 *  ** *****   *    **
Entropy          1.1 *  ********   *******
(26.0 bits)      0.8 *  ******** * *******
                 0.6 ** ********** *******
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CAATGCTGACGCGATCGCCGC
consensus            G G  A     G G TAGT
sequence                          T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
45504                        53  1.11e-10 TGCTGGACAC GAATGCTGACGCGTTCACTGC GGATGATCGA
52750                       113  1.19e-09 CAACACCAAA CAATGCTGAGGCGATCGGCGA GAATAGAGAA
45802                       133  1.45e-09 TGCTGGCAGC CTGTGCGGACGGGTTCGCCGC TCTTGTTGCA
48362                       208  3.00e-09 GTGGGAACTG GGATGCTGACGGAGTTGCTGC AAATTTGGAC
48159                        37  1.29e-08 GGGTAGCAGT CATTGATGACGCCGCTAGCGC TCGGGAATCG
13897                        11  1.73e-08 AAATACCGCG CAGTGATATCGAGATTGGTGC CTTACTTGGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45504                             1.1e-10  52_[+1]_427
52750                             1.2e-09  112_[+1]_367
45802                             1.4e-09  132_[+1]_347
48362                               3e-09  207_[+1]_272
48159                             1.3e-08  36_[+1]_443
13897                             1.7e-08  10_[+1]_469
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
45504                    (   53) GAATGCTGACGCGTTCACTGC  1
52750                    (  113) CAATGCTGAGGCGATCGGCGA  1
45802                    (  133) CTGTGCGGACGGGTTCGCCGC  1
48362                    (  208) GGATGCTGACGGAGTTGCTGC  1
48159                    (   37) CATTGATGACGCCGCTAGCGC  1
13897                    (   11) CAGTGATATCGAGATTGGTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6720 bayes= 10.5758 E= 4.1e+001
  -923    151     54   -923
   129   -923    -46    -66
    87   -923     54    -66
  -923   -923   -923    192
  -923   -923    212   -923
    29    151   -923   -923
  -923   -923    -46    166
   -71   -923    186   -923
   161   -923   -923    -66
  -923    183    -46   -923
  -923   -923    212   -923
   -71    109     54   -923
   -71    -49    154   -923
    29   -923     54     34
  -923    -49   -923    166
  -923    109   -923     92
    29   -923    154   -923
  -923    109    112   -923
  -923    109   -923     92
  -923   -923    212   -923
   -71    183   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 4.1e+001
 0.000000  0.666667  0.333333  0.000000
 0.666667  0.000000  0.166667  0.166667
 0.500000  0.000000  0.333333  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.166667  0.000000  0.833333  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.500000  0.333333  0.000000
 0.166667  0.166667  0.666667  0.000000
 0.333333  0.000000  0.333333  0.333333
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.500000  0.000000  0.500000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG]A[AG]TG[CA]TGACG[CG]G[AGT]T[CT][GA][CG][CT]GC
--------------------------------------------------------------------------------

Time 1.73 secs.

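The nonzero entries of the Motif 1 log-odds matrix appear to follow from the letter-probability matrix and the background frequencies. The sketch below assumes each entry is 100 * log2(p / background) rounded to an integer. That convention is an assumption (it is not stated in this file), although it reproduces the 151 and 54 in the first matrix row. The helper name log_odds_entry is hypothetical. Entries for zero probabilities (the -923 values) are not covered, and because the background frequencies are printed to only three decimals, a recomputed entry may differ from the listed one by 1.

```python
import math

# Background letter frequencies copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.273, "C": 0.234, "G": 0.230, "T": 0.263}


def log_odds_entry(prob, letter, scale=100):
    """Scaled log2 odds of a motif probability against the background.

    Assumed convention (not stated in this file): scale * log2(p / bg),
    rounded to the nearest integer. Only valid for prob > 0.
    """
    return round(scale * math.log2(prob / BACKGROUND[letter]))


# First row of the Motif 1 letter-probability matrix (column order A, C, G, T).
row1 = {"A": 0.0, "C": 0.666667, "G": 0.333333, "T": 0.0}

for letter, prob in row1.items():
    if prob > 0:  # zero-probability entries are skipped in this sketch
        print(letter, log_odds_entry(prob, letter))
```

The same call can be applied to any other nonzero entry, for example log_odds_entry(0.666667, "A") for the second matrix row.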
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 7 llr = 96 E-value = 8.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::3:::::1::::11a
pos.-specific     C  :33:17:6::9:9:4:
probability       G  9:31:3a141:9:94:
matrix            T  17199::349111:::

         bits    2.1       *
                 1.9       *        *
                 1.7       *        *
                 1.5 *     *   **** *
Relative         1.3 *  ****  ***** *
Entropy          1.1 ** ****  ***** *
(19.8 bits)      0.8 ** ****  ***** *
                 0.6 ** *************
                 0.4 ** *************
                 0.2 ** *************
                 0.0 ----------------

Multilevel           GTATTCGCGTCGCGCA
consensus             CC  G TT     G
sequence               G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
52750                       252  6.03e-09 TTTTCTGAGC GTCTTGGCGTCGCGGA ATGAGACCAC
48159                       243  4.42e-08 GACTGTGACA GCATTCGGGTCGCGCA TCGTGAAATG
45802                        55  2.18e-07 AATGTTGGTG GTGGTCGCTTCTCGGA ATCACATGAA
48362                        59  2.62e-07 TAGTACTTTT GCATTCGTTTTGCGGA AATCTTTCGA
45870                       453  3.09e-07 AATCCAACCT GTTTTCGCGTCGTGAA GTAGCCCTTG
28191                       455  6.41e-07 TGGATAGCGT TTCTTCGCAGCGCGCA GCGCTTGGGG
14327                        81  1.04e-06 TTGTGAGGCA GTGTCGGTTTCGCACA AAATTTAGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
52750                               6e-09  251_[+2]_233
48159                             4.4e-08  242_[+2]_242
45802                             2.2e-07  54_[+2]_430
48362                             2.6e-07  58_[+2]_426
45870                             3.1e-07  452_[+2]_32
28191                             6.4e-07  454_[+2]_30
14327                               1e-06  80_[+2]_404
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=7
52750                    (  252) GTCTTGGCGTCGCGGA  1
48159                    (  243) GCATTCGGGTCGCGCA  1
45802                    (   55) GTGGTCGCTTCTCGGA  1
48362                    (   59) GCATTCGTTTTGCGGA  1
45870                    (  453) GTTTTCGCGTCGTGAA  1
28191                    (  455) TTCTTCGCAGCGCGCA  1
14327                    (   81) GTGTCGGTTTCGCACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 10.5266 E= 8.0e+002
  -945   -945    190    -88
  -945     29   -945    144
     6     29     32    -88
  -945   -945    -68    170
  -945    -71   -945    170
  -945    161     32   -945
  -945   -945    212   -945
  -945    129    -68     12
   -93   -945     90     70
  -945   -945    -68    170
  -945    187   -945    -88
  -945   -945    190    -88
  -945    187   -945    -88
   -93   -945    190   -945
   -93     87     90   -945
   187   -945   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 8.0e+002
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.285714  0.000000  0.714286
 0.285714  0.285714  0.285714  0.142857
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.714286  0.285714  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.571429  0.142857  0.285714
 0.142857  0.000000  0.428571  0.428571
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.857143  0.000000  0.142857
 0.142857  0.000000  0.857143  0.000000
 0.142857  0.428571  0.428571  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[TC][ACG]TT[CG]G[CT][GT]TCGCG[CG]A
--------------------------------------------------------------------------------

Time 3.47 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 11 llr = 114 E-value = 5.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  5:::21::1:19
pos.-specific     C  :7:2::129a::
probability       G  ::2117::::7:
matrix            T  53877298::21

         bits    2.1          *
                 1.9          *
                 1.7         **
                 1.5       * ** *
Relative         1.3  **   **** *
Entropy          1.1  **  *******
(15.0 bits)      0.8 ************
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCTTTGTTCCGA
consensus            AT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
45802                         9  4.28e-07   GCCCGTTT TCTTTGTCCCGA GATAGCATAC
40662                       174  5.48e-07 TGTTGGCCGA ACTCTGTTCCGA TCTGGTACTC
45870                       154  1.54e-06 TTGGAAGAGC TCTTTGTTCCGT AAACAAGTTC
44012                        55  2.01e-06 CTAATGACAA TCTTTGTTACGA GGATGAATTG
48362                       373  4.02e-06 AGACCGCCTT ACTCTTTTCCGA CGAATTTTCC
48159                       442  6.42e-06 ACAGCATATT TCTTGGTTCCTA GTATTGTTTT
36848                       120  6.42e-06 AATAGGCGTT ATTTTATTCCGA GACGAGCCCT
46904                       315  1.38e-05 ACACATCCCG TCGTTGTCCCTA AAGTAGACTC
52750                       398  1.91e-05 TCGTGGAAGA ATTGAGTTCCGA AACCAATCGA
14327                        99  2.42e-05 TTCGCACAAA ATTTAGTTCCAA CACCGTAGAT
28191                       394  2.63e-05 TCATGATCGT TCGTTTCTCCGA GTTTCGATGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45802                             4.3e-07  8_[+3]_480
40662                             5.5e-07  173_[+3]_315
45870                             1.5e-06  153_[+3]_335
44012                               2e-06  54_[+3]_434
48362                               4e-06  372_[+3]_116
48159                             6.4e-06  441_[+3]_47
36848                             6.4e-06  119_[+3]_369
46904                             1.4e-05  314_[+3]_174
52750                             1.9e-05  397_[+3]_91
14327                             2.4e-05  98_[+3]_390
28191                             2.6e-05  393_[+3]_95
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=11
45802                    (    9) TCTTTGTCCCGA  1
40662                    (  174) ACTCTGTTCCGA  1
45870                    (  154) TCTTTGTTCCGT  1
44012                    (   55) TCTTTGTTACGA  1
48362                    (  373) ACTCTTTTCCGA  1
48159                    (  442) TCTTGGTTCCTA  1
36848                    (  120) ATTTTATTCCGA  1
46904                    (  315) TCGTTGTCCCTA  1
52750                    (  398) ATTGAGTTCCGA  1
14327                    (   99) ATTTAGTTCCAA  1
28191                    (  394) TCGTTTCTCCGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.6349 E= 5.4e+002
    73  -1010  -1010    105
 -1010    164  -1010      5
 -1010  -1010    -34    163
 -1010    -36   -133    147
   -59  -1010   -133    147
  -159  -1010    166    -53
 -1010   -136  -1010    179
 -1010    -36  -1010    163
  -159    196  -1010  -1010
 -1010    210  -1010  -1010
  -159  -1010    166    -53
   173  -1010  -1010   -153
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 5.4e+002
 0.454545  0.000000  0.000000  0.545455
 0.000000  0.727273  0.000000  0.272727
 0.000000  0.000000  0.181818  0.818182
 0.000000  0.181818  0.090909  0.727273
 0.181818  0.000000  0.090909  0.727273
 0.090909  0.000000  0.727273  0.181818
 0.000000  0.090909  0.000000  0.909091
 0.000000  0.181818  0.000000  0.818182
 0.090909  0.909091  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.090909  0.000000  0.727273  0.181818
 0.909091  0.000000  0.000000  0.090909
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TA][CT]TTTGTTCCGA
--------------------------------------------------------------------------------

Time 5.02 secs.

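The regular expression printed for each motif is a compact summary, so the sites listed for that motif need not all match it exactly. The following sketch, using only Python's standard re module, counts exact matches among the sites copied from the Motif 2 and Motif 3 tables. The names MOTIFS and count_exact_matches are hypothetical and not part of MEME.

```python
import re

# Regular expressions and site strings copied from the Motif 2 and Motif 3
# sections of this file. The dictionary layout is an illustrative choice.
MOTIFS = {
    2: ("G[TC][ACG]TT[CG]G[CT][GT]TCGCG[CG]A", [
        "GTCTTGGCGTCGCGGA", "GCATTCGGGTCGCGCA", "GTGGTCGCTTCTCGGA",
        "GCATTCGTTTTGCGGA", "GTTTTCGCGTCGTGAA", "TTCTTCGCAGCGCGCA",
        "GTGTCGGTTTCGCACA",
    ]),
    3: ("[TA][CT]TTTGTTCCGA", [
        "TCTTTGTCCCGA", "ACTCTGTTCCGA", "TCTTTGTTCCGT", "TCTTTGTTACGA",
        "ACTCTTTTCCGA", "TCTTGGTTCCTA", "ATTTTATTCCGA", "TCGTTGTCCCTA",
        "ATTGAGTTCCGA", "ATTTAGTTCCAA", "TCGTTTCTCCGA",
    ]),
}


def count_exact_matches(pattern, sites):
    """Count the sites that match the whole regular expression."""
    compiled = re.compile(pattern)
    return sum(1 for site in sites if compiled.fullmatch(site))


for motif_id, (pattern, sites) in MOTIFS.items():
    print(motif_id, count_exact_matches(pattern, sites), "of", len(sites))
```

With the strings above, this reports 1 of 7 for Motif 2 and 0 of 11 for Motif 3, which suggests the expressions are better suited to a rough scan than to recovering the listed sites.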
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
28191                            3.50e-05  393_[+3(2.63e-05)]_49_\
    [+2(6.41e-07)]_30
36848                            2.30e-03  119_[+3(6.42e-06)]_218_\
    [+1(2.51e-05)]_130
46904                            9.81e-02  314_[+3(1.38e-05)]_174
13897                            1.01e-04  10_[+1(1.73e-08)]_469
14327                            1.77e-04  80_[+2(1.04e-06)]_2_[+3(2.42e-05)]_\
    390
48159                            1.81e-10  36_[+1(1.29e-08)]_185_\
    [+2(4.42e-08)]_183_[+3(6.42e-06)]_47
48362                            1.58e-10  58_[+2(2.62e-07)]_133_\
    [+1(3.00e-09)]_144_[+3(4.02e-06)]_116
44012                            2.09e-02  54_[+3(2.01e-06)]_434
52750                            8.39e-12  112_[+1(1.19e-09)]_118_\
    [+2(6.03e-09)]_130_[+3(1.91e-05)]_91
45504                            3.03e-06  52_[+1(1.11e-10)]_427
45802                            8.26e-12  8_[+3(4.28e-07)]_34_[+2(2.18e-07)]_\
    62_[+1(1.45e-09)]_347
43513                            5.56e-01  500
45870                            1.38e-06  153_[+3(1.54e-06)]_287_\
    [+2(3.09e-07)]_32
40662                            2.09e-03  173_[+3(5.48e-07)]_315
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
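The motif diagrams in the summary encode spacer lengths and motif occurrences separated by underscores. As a rough sketch of how such a string could be decoded, the Python below walks one diagram (with any backslash continuation already joined into a single string), recovers the 1-based start of each site, and sums the total sequence length. The function name parse_diagram and the MOTIF_WIDTHS table are hypothetical; the widths are taken from the MOTIF header lines of this file.

```python
import re

# Motif widths from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 21, 2: 16, 3: 12}

# One occurrence token: strand, motif number, optional p-value in parentheses.
SITE_TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for a joined motif diagram string.

    Each site is (motif, strand, one_based_start, p_value_or_None).
    """
    sites = []
    offset = 0  # number of sequence positions consumed so far
    for token in diagram.split("_"):
        match = SITE_TOKEN.fullmatch(token)
        if match:
            motif = int(match.group(2))
            p_value = float(match.group(3)) if match.group(3) else None
            sites.append((motif, match.group(1), offset + 1, p_value))
            offset += widths[motif]
        else:
            offset += int(token)  # plain spacer length
    return sites, offset


# Diagram for sequence 52750, with the continuation line joined.
sites, total = parse_diagram(
    "112_[+1(1.19e-09)]_118_[+2(6.03e-09)]_130_[+3(1.91e-05)]_91"
)
for site in sites:
    print(site)
print("total length:", total)
```

For sequence 52750 the recovered starts (113, 252, 398) agree with the Start column of the per-motif site tables, and the total matches the sequence length of 500 given in the TRAINING SET section.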