********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/307/307.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
19106                    1.0000    500  20792                    1.0000    500
21246                    1.0000    500  21867                    1.0000    500
23478                    1.0000    500  24019                    1.0000    500
9607                     1.0000    500  bd735                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/307/307.seqs.fa -oc motifs/307 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.243 G 0.227 T 0.259
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.243 G 0.227 T 0.259
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 6 llr = 75 E-value = 4.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :35a72:3::a:
pos.-specific     C  ::3::8::a::8
probability       G  a7::3:8::a:2
matrix            T  ::2:::27::::

         bits    2.1 *       **
                 1.9 *  *    ***
                 1.7 *  *    ***
                 1.5 *  *  * ****
Relative         1.3 *  * ** ****
Entropy          1.1 ** *********
(18.0 bits)      0.9 ** *********
                 0.6 ** *********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GGAAACGTCGAC
consensus             AC G  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
20792                        79  5.32e-08 TATTCGGTGA GGAAACGTCGAC GGTCTAAATC
24019                       486  3.05e-07 GGCTGCACAC GGCAGCGTCGAC ACA
21867                       200  7.63e-07 TACCACCACA GGAAACGTCGAG CGCCACGCTT
23478                       436  1.15e-06 AGACGACAAC GACAACGACGAC AACGACAACG
bd735                       240  2.27e-06 GCCACTGGTT GAAAAAGTCGAC TGGGTTCTCC
19106                       116  5.26e-06 TGACTGGTTT GGTAGCTACGAC TGAGTTGCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20792                             5.3e-08  78_[+1]_410
24019                               3e-07  485_[+1]_3
21867                             7.6e-07  199_[+1]_289
23478                             1.1e-06  435_[+1]_53
bd735                             2.3e-06  239_[+1]_249
19106                             5.3e-06  115_[+1]_373
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=6
20792                    (   79) GGAAACGTCGAC  1
24019                    (  486) GGCAGCGTCGAC  1
21867                    (  200) GGAAACGTCGAG  1
23478                    (  436) GACAACGACGAC  1
bd735                    (  240) GAAAAAGTCGAC  1
19106                    (  116) GGTAGCTACGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3912 bayes= 9.79456 E= 4.6e+001
  -923   -923    214   -923
    30   -923    156   -923
    88     45   -923    -63
   188   -923   -923   -923
   129   -923     56   -923
   -70    177   -923   -923
  -923   -923    188    -63
    30   -923   -923    136
  -923    204   -923   -923
  -923   -923    214   -923
   188   -923   -923   -923
  -923    177    -44   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 4.6e+001
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.500000  0.333333  0.000000  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.333333  0.000000  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[GA][AC]A[AG]CG[TA]CGAC
--------------------------------------------------------------------------------




Time  0.56 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 8 llr = 98 E-value = 5.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3:8:1:::44:1:1:
pos.-specific     C  ::4:43::8::8411:
probability       G  985114a::66:19:8
matrix            T  1:1153:a3::34:83

         bits    2.1       *
                 1.9       **
                 1.7       **
                 1.5 *     **     *
Relative         1.3 **    ***  * * *
Entropy          1.1 **    ****** * *
(17.7 bits)      0.9 ** *  ****** ***
                 0.6 ***** ****** ***
                 0.4 ***** ****** ***
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGGATGGTCGGCCGTG
consensus             AC CC  TAATT  T
sequence                  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
23478                       275  5.86e-09 TGTTAGGATT GGCACTGTCGGCCGTG GCAGATGATA
21867                       125  1.12e-07 GGTTTTGTGA GAGACGGTCAACCGTG AAGTGGACAA
19106                       180  1.82e-07 GAGGTTGATG GGGATGGTTGGCTGAG GTTGATGCAT
21246                       300  8.89e-07 ATCAGTGGAT TGGATTGTTAGCCGTG GTAGACATTA
bd735                       106  1.71e-06 TGGAGGACGT GGTATCGTCGGCTGCT CCATCATGTC
9607                        296  3.26e-06 GGTTGAAGTA GACACAGTCGGTTGTT GTCTCTGTTG
24019                       112  4.51e-06 CGACTGCGGG GGGGGGGTCAATGGTG TCCATTTGAT
20792                       194  5.09e-06 TATGTGCATT GGCTTCGTCGACACTG CAGCTTGTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23478                             5.9e-09  274_[+2]_210
21867                             1.1e-07  124_[+2]_360
19106                             1.8e-07  179_[+2]_305
21246                             8.9e-07  299_[+2]_185
bd735                             1.7e-06  105_[+2]_379
9607                              3.3e-06  295_[+2]_189
24019                             4.5e-06  111_[+2]_373
20792                             5.1e-06  193_[+2]_291
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=8
23478                    (  275) GGCACTGTCGGCCGTG  1
21867                    (  125) GAGACGGTCAACCGTG  1
19106                    (  180) GGGATGGTTGGCTGAG  1
21246                    (  300) TGGATTGTTAGCCGTG  1
bd735                    (  106) GGTATCGTCGGCTGCT  1
9607                     (  296) GACACAGTCGGTTGTT  1
24019                    (  112) GGGGGGGTCAATGGTG  1
20792                    (  194) GGCTTCGTCGACACTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 3880 bayes= 8.91886 E= 5.0e+001
  -965   -965    195   -105
   -12   -965    173   -965
  -965     62    114   -105
   146   -965    -86   -105
  -965     62    -86     95
  -112      4     73     -5
  -965   -965    214   -965
  -965   -965   -965    195
  -965    162   -965     -5
    47   -965    146   -965
    47   -965    146   -965
  -965    162   -965     -5
  -112     62    -86     53
  -965    -96    195   -965
  -112    -96   -965    153
  -965   -965    173     -5
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 5.0e+001
 0.000000  0.000000  0.875000  0.125000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.375000  0.500000  0.125000
 0.750000  0.000000  0.125000  0.125000
 0.000000  0.375000  0.125000  0.500000
 0.125000  0.250000  0.375000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.000000  0.250000
 0.375000  0.000000  0.625000  0.000000
 0.375000  0.000000  0.625000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.125000  0.375000  0.125000  0.375000
 0.000000  0.125000  0.875000  0.000000
 0.125000  0.125000  0.000000  0.750000
 0.000000  0.000000  0.750000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[GA][GC]A[TC][GCT]GT[CT][GA][GA][CT][CT]GT[GT]
--------------------------------------------------------------------------------




Time  1.08 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 8 llr = 85 E-value = 5.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  4a631:31:9::
pos.-specific     C  6:48:9118::a
probability       G  ::::::631:::
matrix            T  ::::91:511a:

         bits    2.1            *
                 1.9  *        **
                 1.7  *        **
                 1.5  *  **    **
Relative         1.3  * ***   ***
Entropy          1.1 ******  ****
(15.3 bits)      0.9 ******* ****
                 0.6 ******* ****
                 0.4 ******* ****
                 0.2 ************
                 0.0 ------------

Multilevel           CAACTCGTCATC
consensus            A CA  AG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
19106                       404  6.65e-08 TGTCGTCCGT CAACTCGTCATC GTGTCAATAC
21867                       482  1.08e-06 TCCTACACCA AAACTCATCATC CATTCTG
24019                       343  2.05e-06 CTATATTAAG CAACACGTCATC GGTCATTTGA
21246                       382  2.30e-06 GCCCTTCGCA CACATCGGCATC GGCTTACCTT
bd735                       178  1.32e-05 GCTGTCACTA CACCTCGACTTC GCTGAATGGA
20792                       134  1.72e-05 CTACTAGCTC AAAATCGGTATC GGAGCATTGC
9607                        214  1.97e-05 AGCGTCCGCC AAACTTCTCATC CTTCAGAAGA
23478                       142  2.45e-05 AAAAGAAGGG CACCTCACGATC CGGATTTGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
19106                             6.7e-08  403_[+3]_85
21867                             1.1e-06  481_[+3]_7
24019                               2e-06  342_[+3]_146
21246                             2.3e-06  381_[+3]_107
bd735                             1.3e-05  177_[+3]_311
20792                             1.7e-05  133_[+3]_355
9607                                2e-05  213_[+3]_275
23478                             2.4e-05  141_[+3]_347
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
19106                    (  404) CAACTCGTCATC  1
21867                    (  482) AAACTCATCATC  1
24019                    (  343) CAACACGTCATC  1
21246                    (  382) CACATCGGCATC  1
bd735                    (  178) CACCTCGACTTC  1
20792                    (  134) AAAATCGGTATC  1
9607                     (  214) AAACTTCTCATC  1
23478                    (  142) CACCTCACGATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3912 bayes= 8.93074 E= 5.8e+002
    47    136   -965   -965
   188   -965   -965   -965
   120     62   -965   -965
   -12    162   -965   -965
  -112   -965   -965    176
  -965    185   -965   -105
   -12    -96    146   -965
  -112    -96     14     95
  -965    162    -86   -105
   169   -965   -965   -105
  -965   -965   -965    195
  -965    204   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 5.8e+002
 0.375000  0.625000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.625000  0.375000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.875000  0.000000  0.125000
 0.250000  0.125000  0.625000  0.000000
 0.125000  0.125000  0.250000  0.500000
 0.000000  0.750000  0.125000  0.125000
 0.875000  0.000000  0.000000  0.125000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CA]A[AC][CA]TC[GA][TG]CATC
--------------------------------------------------------------------------------




Time  1.61 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
19106                            2.62e-09  97_[+2(8.25e-05)]_2_[+1(5.26e-06)]_\
    52_[+2(1.82e-07)]_208_[+3(6.65e-08)]_85
20792                            1.34e-07  78_[+1(5.32e-08)]_43_[+3(1.72e-05)]_\
    48_[+2(5.09e-06)]_291
21246                            4.01e-05  299_[+2(8.89e-07)]_66_\
    [+3(2.30e-06)]_107
21867                            3.71e-09  124_[+2(1.12e-07)]_59_\
    [+1(7.63e-07)]_270_[+3(1.08e-06)]_7
23478                            6.28e-09  141_[+3(2.45e-05)]_121_\
    [+2(5.86e-09)]_145_[+1(1.15e-06)]_53
24019                            8.51e-08  111_[+2(4.51e-06)]_215_\
    [+3(2.05e-06)]_131_[+1(3.05e-07)]_3
9607                             7.39e-04  213_[+3(1.97e-05)]_70_\
    [+2(3.26e-06)]_189
bd735                            1.18e-06  105_[+2(1.71e-06)]_56_\
    [+3(1.32e-05)]_50_[+1(2.27e-06)]_249
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************