********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/69/69.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
54017                    1.0000    500  21513                    1.0000    500
48510                    1.0000    500  49679                    1.0000    500
44401                    1.0000    500  11174                    1.0000    500
11459                    1.0000    500  12186                    1.0000    500
11990                    1.0000    500  35509                    1.0000    500
35587                    1.0000    500  44400                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/69/69.seqs.fa -oc motifs/69 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.253 C 0.252 G 0.227 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.253 C 0.252 G 0.227 T 0.268
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 12 llr = 118 E-value = 2.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :19a9a:18363
pos.-specific     C  731:1::3241:
probability       G  36::::91:327
matrix            T  :1::::15::2:

         bits    2.1
                 1.9    * *
                 1.7    * **
                 1.5   *****
Relative         1.3   ***** *
Entropy          1.1 * ***** *  *
(14.1 bits)      0.9 * ***** *  *
                 0.6 * ***** *  *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGAAAAGTACAG
consensus            GC     C G A
sequence                      A
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
21513                       209  5.22e-08 TGACTGACGG CGAAAAGTACAG TCACCCAAAA
11174                       445  1.20e-06 ATTGCCGCTT CGAAAAGTAGTG ATACCATGAC
44400                       326  1.38e-06 TACTTGTACA GGAAAAGCAAAG TATGTTTTTG
12186                       416  4.15e-06 AGTTTTTAAA GGAAAAGCAAAA CCGTATCATC
11459                       197  9.35e-06 TTCTGTTCGA CGAAAAGAACGG ACAATCCTCG
44401                        75  1.55e-05 GCTTCTGCTG CCAAAAGTAAGA AAGAAGCGGA
49679                       164  1.55e-05 GGAAGAGTGG CCAAAAGTCCAA ATTTATAATA
35509                       258  1.68e-05 TCGCAACTAT CCAAAAGTCGAA ATCGCCAACG
54017                       484  2.85e-05 TTCTGAGCGA GAAAAAGTAGTG GCATC
11990                        89  4.85e-05 TTAGGTTGCT GGAAAATGACAG ACATTTTATA
48510                       430  4.85e-05 CAACAACCGA CGAACAGCACCG ATACCGTCGT
35587                       200  5.81e-05 TTTTCTGAAA CTCAAAGCAGAG ATGATATAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21513                             5.2e-08  208_[+1]_280
11174                             1.2e-06  444_[+1]_44
44400                             1.4e-06  325_[+1]_163
12186                             4.2e-06  415_[+1]_73
11459                             9.4e-06  196_[+1]_292
44401                             1.6e-05  74_[+1]_414
49679                             1.6e-05  163_[+1]_325
35509                             1.7e-05  257_[+1]_231
54017                             2.9e-05  483_[+1]_5
11990                             4.9e-05  88_[+1]_400
48510                             4.9e-05  429_[+1]_59
35587                             5.8e-05  199_[+1]_289
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=12
21513                    (  209) CGAAAAGTACAG  1
11174                    (  445) CGAAAAGTAGTG  1
44400                    (  326) GGAAAAGCAAAG  1
12186                    (  416) GGAAAAGCAAAA  1
11459                    (  197) CGAAAAGAACGG  1
44401                    (   75) CCAAAAGTAAGA  1
49679                    (  164) CCAAAAGTCCAA  1
35509                    (  258) CCAAAAGTCGAA  1
54017                    (  484) GAAAAAGTAGTG  1
11990                    (   89) GGAAAATGACAG  1
48510                    (  430) CGAACAGCACCG  1
35587                    (  200) CTCAAAGCAGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 8.93074 E= 2.0e+001
 -1023    140     56  -1023
  -160     -1    136   -168
   186   -160  -1023  -1023
   198  -1023  -1023  -1023
   186   -160  -1023  -1023
   198  -1023  -1023  -1023
 -1023  -1023    202   -168
  -160     40   -144     90
   172    -60  -1023  -1023
    -2     72     56  -1023
   120   -160    -44    -68
    40  -1023    156  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 2.0e+001
 0.000000  0.666667  0.333333  0.000000
 0.083333  0.250000  0.583333  0.083333
 0.916667  0.083333  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.916667  0.083333  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.916667  0.083333
 0.083333  0.333333  0.083333  0.500000
 0.833333  0.166667  0.000000  0.000000
 0.250000  0.416667  0.333333  0.000000
 0.583333  0.083333  0.166667  0.166667
 0.333333  0.000000  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG][GC]AAAAG[TC]A[CGA]A[GA]
--------------------------------------------------------------------------------

Time 1.25 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 7 llr = 85 E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::3::9:::1
pos.-specific     C  3:::4a9:a:11
probability       G  7a6a3:1::9::
matrix            T  ::4::::1:197

         bits    2.1  * *
                 1.9  * * *  *
                 1.7  * * *  *
                 1.5  * * ** **
Relative         1.3 ** * ******
Entropy          1.1 **** ******
(17.4 bits)      0.9 **** *******
                 0.6 **** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GGGGCCCACGTT
consensus            C T A
sequence                 G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
44401                       398  4.42e-08 CAGATTCACC GGGGCCCACGTT TCTATCAGGA
48510                        85  3.73e-07 TCCACGTGCT CGGGGCCACGTT CCGAGTTTTC
44400                       412  1.05e-06 ATTGGACAAT GGGGGCGACGTT AGAGGTGAAG
54017                       447  1.35e-06 CGCTTATTTT GGGGACCACGCT CAGTGTCAGC
12186                       468  1.76e-06 GCAGTCCTCG GGTGCCCACTTT ACGCGAGATG
35587                       346  4.13e-06 AGCCAACTGA CGTGACCACGTA CTTCGGCGTG
11174                        42  5.94e-06 ATTGCTAACC GGTGCCCTCGTC AGGATCTTAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44401                             4.4e-08  397_[+2]_91
48510                             3.7e-07  84_[+2]_404
44400                               1e-06  411_[+2]_77
54017                             1.4e-06  446_[+2]_42
12186                             1.8e-06  467_[+2]_21
35587                             4.1e-06  345_[+2]_143
11174                             5.9e-06  41_[+2]_447
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=7
44401                    (  398) GGGGCCCACGTT  1
48510                    (   85) CGGGGCCACGTT  1
44400                    (  412) GGGGGCGACGTT  1
54017                    (  447) GGGGACCACGCT  1
12186                    (  468) GGTGCCCACTTT  1
35587                    (  346) CGTGACCACGTA  1
11174                    (   42) GGTGCCCTCGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 10.3159 E= 1.5e+002
  -945     18    166   -945
  -945   -945    214   -945
  -945   -945    133     68
  -945   -945    214   -945
    17     76     33   -945
  -945    199   -945   -945
  -945    176    -66   -945
   176   -945   -945    -91
  -945    199   -945   -945
  -945   -945    192    -91
  -945    -82   -945    168
   -82    -82   -945    141
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 1.5e+002
 0.000000  0.285714  0.714286  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.571429  0.428571
 0.000000  0.000000  1.000000  0.000000
 0.285714  0.428571  0.285714  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.857143  0.142857  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.142857  0.000000  0.857143
 0.142857  0.142857  0.000000  0.714286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC]G[GT]G[CAG]CCACGTT
--------------------------------------------------------------------------------

Time 2.79 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 5 llr = 66 E-value = 2.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  2:aa2::a:2:8
pos.-specific     C  :2::2a:::2::
probability       G  88::4:a::6a2
matrix            T  ::::2:::a:::

         bits    2.1       *   *
                 1.9   ** **** *
                 1.7   ** **** *
                 1.5   ** **** *
Relative         1.3 **** **** **
Entropy          1.1 **** **** **
(19.0 bits)      0.9 **** **** **
                 0.6 **** *******
                 0.4 **** *******
                 0.2 ************
                 0.0 ------------

Multilevel           GGAAGCGATGGA
consensus            AC  A    A G
sequence                 C    C
                         T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
12186                       446  2.51e-07 TCGTACAGTT GGAAGCGATAGA GCAGTCCTCG
11459                        77  3.69e-07 CTACCACAGC GCAAGCGATGGA ATCCCAGGAA
11990                       403  6.56e-07 TTGCTGATAC GGAATCGATCGA AGGCCGCTGT
35509                        55  7.32e-07 GACACGACGG GGAACCGATGGG GGTGACTGTG
44400                       241  9.59e-07 CAATCGATAC AGAAACGATGGA CGATTTTATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12186                             2.5e-07  445_[+3]_43
11459                             3.7e-07  76_[+3]_412
11990                             6.6e-07  402_[+3]_86
35509                             7.3e-07  54_[+3]_434
44400                             9.6e-07  240_[+3]_248
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=5
12186                    (  446) GGAAGCGATAGA  1
11459                    (   77) GCAAGCGATGGA  1
11990                    (  403) GGAATCGATCGA  1
35509                    (   55) GGAACCGATGGG  1
44400                    (  241) AGAAACGATGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 10.4472 E= 2.8e+002
   -34   -897    182   -897
  -897    -33    182   -897
   198   -897   -897   -897
   198   -897   -897   -897
   -34    -33     82    -42
  -897    198   -897   -897
  -897   -897    214   -897
   198   -897   -897   -897
  -897   -897   -897    190
   -34    -33    140   -897
  -897   -897    214   -897
   166   -897    -18   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 5 E= 2.8e+002
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.200000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.200000  0.400000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.200000  0.600000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GA][GC]AA[GACT]CGAT[GAC]G[AG]
--------------------------------------------------------------------------------

Time 4.07 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54017                            2.68e-04  446_[+2(1.35e-06)]_25_\
                                           [+1(2.85e-05)]_5
21513                            2.86e-04  208_[+1(5.22e-08)]_280
48510                            1.65e-04  84_[+2(3.73e-07)]_333_\
                                           [+1(4.85e-05)]_59
49679                            2.22e-02  163_[+1(1.55e-05)]_325
44401                            1.67e-05  74_[+1(1.55e-05)]_311_\
                                           [+2(4.42e-08)]_91
11174                            1.67e-04  41_[+2(5.94e-06)]_391_\
                                           [+1(1.20e-06)]_44
11459                            6.37e-05  76_[+3(3.69e-07)]_108_\
                                           [+1(9.35e-06)]_292
12186                            5.79e-08  415_[+1(4.15e-06)]_18_\
                                           [+3(2.51e-07)]_10_[+2(1.76e-06)]_21
11990                            4.28e-04  88_[+1(4.85e-05)]_302_\
                                           [+3(6.56e-07)]_86
35509                            1.78e-04  54_[+3(7.32e-07)]_191_\
                                           [+1(1.68e-05)]_231
35587                            2.84e-03  199_[+1(5.81e-05)]_134_\
                                           [+2(4.13e-06)]_143
44400                            4.52e-08  240_[+3(9.59e-07)]_73_\
                                           [+1(1.38e-06)]_74_[+2(1.05e-06)]_77
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
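
The following is not part of the MEME output. It is a minimal Python sketch, under stated assumptions, of how the "letter-probability matrix" blocks in a report like this one could be read back and checked against the "Relative Entropy" figure in the motif description. The function names `parse_probability_matrices` and `relative_entropy` are hypothetical helpers, not MEME Suite API; the background frequencies are copied from the COMMAND LINE SUMMARY section, and the matrix is Motif 3 from this report. The sketch sums p * log2(p / background) over all cells, which may differ slightly from how MEME itself computes the reported value, so the result should only be expected to land close to 19.0 bits.

```python
import math
import re

# Matches the header line that precedes each probability matrix.
HEADER = re.compile(
    r"letter-probability matrix:\s*alength=\s*(\d+)\s+w=\s*(\d+)"
    r"\s+nsites=\s*(\d+)\s+E=\s*(\S+)"
)


def parse_probability_matrices(text):
    """Return one matrix per header found; each matrix is a list of rows,
    one row per motif position, in alphabet order (A, C, G, T here).

    Tokens are split on any whitespace, so this also works on text whose
    line breaks were lost.
    """
    matrices = []
    for match in HEADER.finditer(text):
        alength, width = int(match.group(1)), int(match.group(2))
        tokens = text[match.end():].split()[: alength * width]
        values = [float(tok) for tok in tokens]
        matrices.append(
            [values[i * alength:(i + 1) * alength] for i in range(width)]
        )
    return matrices


def relative_entropy(rows, background):
    """Sum over all positions of p * log2(p / background), in bits."""
    total = 0.0
    for row in rows:
        for p, b in zip(row, background):
            if p > 0:  # zero-probability letters contribute nothing
                total += p * math.log2(p / b)
    return total


# Motif 3 matrix, copied from this report.
MOTIF_3 = """letter-probability matrix: alength= 4 w= 12 nsites= 5 E= 2.8e+002
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.200000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.200000  0.400000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.200000  0.600000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
"""

# Background letter frequencies (A, C, G, T) from the command line summary.
BACKGROUND = [0.253, 0.252, 0.227, 0.268]

matrices = parse_probability_matrices(MOTIF_3)
bits = relative_entropy(matrices[0], BACKGROUND)
print(f"{len(matrices[0])} positions, {bits:.1f} bits")
```

Passing the full report text instead of `MOTIF_3` should return one matrix per motif, in the order the motifs appear.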