********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/261/261.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47057                    1.0000    500  14962                    1.0000    500
15937                    1.0000    500  40679                    1.0000    500
50251                    1.0000    500  26807                    1.0000    500
48400                    1.0000    500  34091                    1.0000    500
35746                    1.0000    500  40659                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/261/261.seqs.fa -oc motifs/261 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.278 C 0.238 G 0.226 T 0.258
Background letter frequencies (from dataset with add-one prior applied):
A 0.278 C 0.238 G 0.226 T 0.258
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  21  sites =   5  llr = 96  E-value = 1.8e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::::8a:a6::8::2:422a
pos.-specific     C  ::a28::2:::6::::2246:
probability       G  aa:62::6:4a::4288242:
matrix            T  :::2:2:2:::4268::2:::

[Information content histogram: bits per position, axis 0.0 to 2.1;
relative entropy 27.6 bits]

Multilevel           GGCGCAAGAAGCATTGGACCA
consensus               CGT C G TTGGACCGA
sequence                T   T         GAG
                                      T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
40659                       264  3.05e-11 TGTACGTGTT GGCCCAAGAAGTATTGGACCA AAATATGAGT
35746                       175  2.77e-10 AATTGGCAAT GGCTCAATAAGCATTGGCGCA ATGGCCAAGG
15937                       296  8.51e-10 CGGCCGCCAA GGCGGAACAGGCATTGGAGAA CAGTACAGGC
14962                       461  4.06e-09 CAGAACAATT GGCGCAAGAAGTTGTGCTCGA GAAGAAGGAG
50251                       209  6.04e-09 CATGTTCACA GGCGCTAGAGGCAGGAGGACA GCTCAGCTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40659                             3.1e-11  263_[+1]_216
35746                             2.8e-10  174_[+1]_305
15937                             8.5e-10  295_[+1]_184
14962                             4.1e-09  460_[+1]_19
50251                               6e-09  208_[+1]_271
--------------------------------------------------------------------------------
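
As a consistency check on the block diagrams above, each diagram should account for the full sequence: the spacer lengths plus the motif width add up to the length of 500 listed in the training set. The following is a minimal Python sketch, assuming the simple `N_[+M]_N` form shown for Motif 1. The name `diagram_length` and the two dictionaries are illustrative and are not part of MEME; the combined diagrams in the summary section carry extra p-value annotations that this sketch does not parse.

```python
import re

# Motif width and per-sequence diagrams copied from the Motif 1 section above.
MOTIF_WIDTHS = {1: 21}
MOTIF1_DIAGRAMS = {
    "40659": "263_[+1]_216",
    "35746": "174_[+1]_305",
    "15937": "295_[+1]_184",
    "14962": "460_[+1]_19",
    "50251": "208_[+1]_271",
}

def diagram_length(diagram, widths):
    """Return spacer lengths plus motif widths for a simple block diagram."""
    total = 0
    for token in diagram.split("_"):
        motif = re.fullmatch(r"\[[+-](\d+)\]", token)
        if motif:
            total += widths[int(motif.group(1))]  # motif occurrence
        else:
            total += int(token)  # spacer before, between, or after occurrences
    return total

for name, diagram in MOTIF1_DIAGRAMS.items():
    print(name, diagram_length(diagram, MOTIF_WIDTHS))
```

For the five Motif 1 diagrams the totals all come to 500, which agrees with the sequence lengths in the training set table.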
--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=5
40659                    (  264) GGCCCAAGAAGTATTGGACCA  1
35746                    (  175) GGCTCAATAAGCATTGGCGCA  1
15937                    (  296) GGCGGAACAGGCATTGGAGAA  1
14962                    (  461) GGCGCAAGAAGTTGTGCTCGA  1
50251                    (  209) GGCGCTAGAGGCAGGAGGACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 10.1572 E= 1.8e+001
  -897   -897    214   -897
  -897   -897    214   -897
  -897    207   -897   -897
  -897    -25    141    -37
  -897    175    -18   -897
   153   -897   -897    -37
   185   -897   -897   -897
  -897    -25    141    -37
   185   -897   -897   -897
   111   -897     82   -897
  -897   -897    214   -897
  -897    133   -897     63
   153   -897   -897    -37
  -897   -897     82    121
  -897   -897    -18    163
   -47   -897    182   -897
  -897    -25    182   -897
    53    -25    -18    -37
   -47     75     82   -897
   -47    133    -18   -897
   185   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 1.8e+001
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.600000  0.200000
 0.000000  0.800000  0.200000  0.000000
 0.800000  0.000000  0.000000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.200000  0.600000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.200000  0.400000  0.400000  0.000000
 0.200000  0.600000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
GGC[GCT][CG][AT]A[GCT]A[AG]G[CT][AT][TG][TG][GA][GC][ACGT][CGA][CAG]A
--------------------------------------------------------------------------------

Time  1.11 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  16  sites =  10  llr = 115  E-value = 2.0e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  2::::17:11::21:1
pos.-specific     C  13:2:5::23184:3:
probability       G  ::a73433:32:::69
matrix            T  77:17::77372491:

[Information content histogram: bits per position, axis 0.0 to 2.1;
relative entropy 16.5 bits]

Multilevel           TTGGTCATTCTCCTGG
consensus            AC CGGGGCGGTT C
sequence                      T  A
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ----------------
47057                        81  5.05e-08 AAGACAACCT TTGGTCGTCGTCTTGG TGAGGTCACC
50251                        74  1.74e-07 ATCGACCTTT TCGGTGATTCTTTTGG GAGCTTGTGG
48400                       283  3.35e-07 CTGCCGTTGC TTGGGAGTTGTCCTGG TATGGTACAC
26807                        90  7.17e-07 ATCAACGTGC ATGGTGATTTGCTTCG TGACGGATTA
15937                       121  8.71e-07 TTCTCTTGCG TTGCTCAGTTTTTTGG CTTTGGTCTG
35746                       326  3.11e-06 ATACTACGTC ACGGGGATTTCCCTGG TCTCCCACCG
34091                        67  4.25e-06 TCGAGAATCA TTGCTCATTGTCAACG CGGATGACCT
40679                       283  4.59e-06 ATACAGAACT TTGGGCAGACTCCTTG CCAGTTACTT
14962                       391  6.61e-06 GTCCTTCGAT TTGGTGGGCAGCCTCG TCGATTATTA
40659                       395  3.30e-05 AAGAAACCAG CCGTTCATTCTCATGA TGCAAGCAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47057                               5e-08  80_[+2]_404
50251                             1.7e-07  73_[+2]_411
48400                             3.3e-07  282_[+2]_202
26807                             7.2e-07  89_[+2]_395
15937                             8.7e-07  120_[+2]_364
35746                             3.1e-06  325_[+2]_159
34091                             4.2e-06  66_[+2]_418
40679                             4.6e-06  282_[+2]_202
14962                             6.6e-06  390_[+2]_94
40659                             3.3e-05  394_[+2]_90
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=10
47057                    (   81) TTGGTCGTCGTCTTGG  1
50251                    (   74) TCGGTGATTCTTTTGG  1
48400                    (  283) TTGGGAGTTGTCCTGG  1
26807                    (   90) ATGGTGATTTGCTTCG  1
15937                    (  121) TTGCTCAGTTTTTTGG  1
35746                    (  326) ACGGGGATTTCCCTGG  1
34091                    (   67) TTGCTCATTGTCAACG  1
40679                    (  283) TTGGGCAGACTCCTTG  1
14962                    (  391) TTGGTGGGCAGCCTCG  1
40659                    (  395) CCGTTCATTCTCATGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 8.91886 E= 2.0e+002
   -47   -125   -997    144
  -997     33   -997    144
  -997   -997    214   -997
  -997    -25    163   -137
  -997   -997     41    144
  -147    107     82   -997
   133   -997     41   -997
  -997   -997     41    144
  -147    -25   -997    144
  -147     33     41     22
  -997   -125    -18    144
  -997    175   -997    -37
   -47     75   -997     63
  -147   -997   -997    180
  -997     33    141   -137
  -147   -997    199   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 2.0e+002
 0.200000  0.100000  0.000000  0.700000
 0.000000  0.300000  0.000000  0.700000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.200000  0.700000  0.100000
 0.000000  0.000000  0.300000  0.700000
 0.100000  0.500000  0.400000  0.000000
 0.700000  0.000000  0.300000  0.000000
 0.000000  0.000000  0.300000  0.700000
 0.100000  0.200000  0.000000  0.700000
 0.100000  0.300000  0.300000  0.300000
 0.000000  0.100000  0.200000  0.700000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.400000  0.000000  0.400000
 0.100000  0.000000  0.000000  0.900000
 0.000000  0.300000  0.600000  0.100000
 0.100000  0.000000  0.900000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[TA][TC]G[GC][TG][CG][AG][TG][TC][CGT][TG][CT][CTA]T[GC]G
--------------------------------------------------------------------------------

Time  2.06 secs.

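
The regular expression reported for a motif can be used directly to scan a sequence for candidate sites. The sketch below is a minimal Python illustration using the Motif 2 expression and the first site listed for sequence 47057 together with its flanking residues. The names `MOTIF2_RE` and `find_sites` are hypothetical helpers, not MEME output. Because the expression appears to keep only the more frequent letters at each position, it is stricter than the probability matrix and may not match every site MEME reports.

```python
import re

# Motif 2 regular expression, copied from the section above.
MOTIF2_RE = re.compile(
    r"[TA][TC]G[GC][TG][CG][AG][TG][TC][CGT][TG][CT][CTA]T[GC]G"
)

def find_sites(sequence):
    """Return (1-based start, matched text) for non-overlapping matches."""
    return [(m.start() + 1, m.group()) for m in MOTIF2_RE.finditer(sequence)]

# Left flank + site + right flank for sequence 47057 in the Motif 2 sites table.
fragment = "AAGACAACCT" + "TTGGTCGTCGTCTTGG" + "TGAGGTCACC"
print(find_sites(fragment))

# The last site in the table (sequence 40659) starts with C, which the
# first character class [TA] does not allow, so the expression misses it.
print(find_sites("CCGTTCATTCTCATGA"))
```

`re.finditer` reports non-overlapping matches only; overlapping candidates would need a lookahead pattern. Start positions are relative to the fragment passed in, not to the full 500-residue sequence.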
********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  12  sites =  10  llr = 100  E-value = 1.7e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1222:::::7:a
pos.-specific     C  52812:::5:9:
probability       G  26:::3::33::
matrix            T  2::787aa2:1:

[Information content histogram: bits per position, axis 0.0 to 2.1;
relative entropy 14.4 bits]

Multilevel           CGCTTTTTCACA
consensus            GAAACG  GG
sequence             TC      T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value               Site
-------------             ----- ---------            ------------
47057                       262  4.33e-07 ATTCAGCACA TGCTTTTTCACA TTCTCACTGT
40659                       308  7.41e-07 ACAGGAGTTG CGCTCTTTCACA AGCAAATAGC
50251                       197  2.76e-06 GTCAAAATGA CGCATGTTCACA GGCGCTAGAG
15937                       391  5.03e-06 ATTACTCGAG CCCTTGTTCGCA GTGCTGAAGA
35746                       353  5.55e-06 CTCCCACCGC GGCTTTTTTGCA TTTTTCAGTC
14962                       144  1.21e-05 GAGCTGCTTA CAATTTTTGACA AAACCTTCTA
34091                       418  1.72e-05 TCTAGATCTA CACTTTTTCATA TGATACTACC
40679                       444  2.47e-05 ATACGCGAAA GGCACTTTGACA AAAGTGCACG
26807                       464  3.97e-05 AATGGCTTGG ACCTTTTTTGCA GCTGTCATAA
48400                       200  7.77e-05 CGTACCAGCG TGACTGTTGACA GGAACACAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47057                             4.3e-07  261_[+3]_227
40659                             7.4e-07  307_[+3]_181
50251                             2.8e-06  196_[+3]_292
15937                               5e-06  390_[+3]_98
35746                             5.5e-06  352_[+3]_136
14962                             1.2e-05  143_[+3]_345
34091                             1.7e-05  417_[+3]_71
40679                             2.5e-05  443_[+3]_45
26807                               4e-05  463_[+3]_25
48400                             7.8e-05  199_[+3]_289
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
47057                    (  262) TGCTTTTTCACA  1
40659                    (  308) CGCTCTTTCACA  1
50251                    (  197) CGCATGTTCACA  1
15937                    (  391) CCCTTGTTCGCA  1
35746                    (  353) GGCTTTTTTGCA  1
14962                    (  144) CAATTTTTGACA  1
34091                    (  418) CACTTTTTCATA  1
40679                    (  444) GGCACTTTGACA  1
26807                    (  464) ACCTTTTTTGCA  1
48400                    (  200) TGACTGTTGACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 8.93074 E= 1.7e+002
  -147    107    -18    -37
   -47    -25    141   -997
   -47    175   -997   -997
   -47   -125   -997    144
  -997    -25   -997    163
  -997   -997     41    144
  -997   -997   -997    195
  -997   -997   -997    195
  -997    107     41    -37
   133   -997     41   -997
  -997    192   -997   -137
   185   -997   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 1.7e+002
 0.100000  0.500000  0.200000  0.200000
 0.200000  0.200000  0.600000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.200000  0.100000  0.000000  0.700000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  0.300000  0.700000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.300000  0.200000
 0.700000  0.000000  0.300000  0.000000
 0.000000  0.900000  0.000000  0.100000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[CGT][GAC][CA][TA][TC][TG]TT[CGT][AG]CA
--------------------------------------------------------------------------------

Time  2.92 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47057                            9.01e-07  80_[+2(5.05e-08)]_165_\
    [+3(4.33e-07)]_227
14962                            1.16e-08  143_[+3(1.21e-05)]_235_\
    [+2(6.61e-06)]_21_[+1(1.26e-05)]_12_[+1(4.06e-09)]_19
15937                            1.85e-10  120_[+2(8.71e-07)]_159_\
    [+1(8.51e-10)]_74_[+3(5.03e-06)]_98
40679                            1.25e-03  282_[+2(4.59e-06)]_145_\
    [+3(2.47e-05)]_45
50251                            1.46e-10  73_[+2(1.74e-07)]_107_\
    [+3(2.76e-06)]_[+1(6.04e-09)]_271
26807                            5.60e-04  89_[+2(7.17e-07)]_358_\
    [+3(3.97e-05)]_25
48400                            1.83e-04  199_[+3(7.77e-05)]_71_\
    [+2(3.35e-07)]_202
34091                            6.46e-04  66_[+2(4.25e-06)]_122_\
    [+3(9.86e-05)]_201_[+3(1.72e-05)]_71
35746                            2.33e-10  174_[+1(2.77e-10)]_130_\
    [+2(3.11e-06)]_11_[+3(5.55e-06)]_136
40659                            4.08e-11  263_[+1(3.05e-11)]_23_\
    [+3(7.41e-07)]_75_[+2(3.30e-05)]_90
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
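
The log-odds and letter-probability matrices reported for each motif above appear to be two views of the same data. The entries are consistent with a score of 100 * log2(p / background), where p is smoothed with a small pseudocount so that zero-probability cells yield finite scores such as -897 (5 sites) or -997 (10 sites). The Python sketch below illustrates that relationship. It is an inference from the numbers in this file, not a statement of MEME's internal method: the pseudocount form `b * background` (with `b= 0.01` taken from the command line summary) and the function name `log_odds` are assumptions. Since the background frequencies are printed to only three decimals, a recomputed score can differ from the reported one by 1.

```python
import math

# Background frequencies as printed in the command line summary (3 decimals).
BACKGROUND = {"A": 0.278, "C": 0.238, "G": 0.226, "T": 0.258}

def log_odds(prob, nsites, bg, b=0.01):
    """Scaled log-odds for one matrix cell.

    Assumption: the count implied by `prob` is smoothed with a pseudocount
    of b * bg before computing 100 * log2(p / bg).
    """
    smoothed = (prob * nsites + b * bg) / (nsites + b)
    return round(100 * math.log2(smoothed / bg))

# First row of the Motif 2 matrices (nsites= 10):
# probabilities 0.2 0.1 0.0 0.7 and reported scores -47 -125 -997 144.
row_probs = [0.2, 0.1, 0.0, 0.7]
row_scores = [-47, -125, -997, 144]
for letter, p, reported in zip("ACGT", row_probs, row_scores):
    print(letter, log_odds(p, 10, BACKGROUND[letter]), reported)
```

Under this assumption a zero-probability cell reduces to 100 * log2(b / (nsites + b)), which does not depend on the background and gives -897 for 5 sites and -997 for 10 sites.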