********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/249/249.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10091                    1.0000    500  18047                    1.0000    500
260875                   1.0000    500  261040                   1.0000    500
31412                    1.0000    500  35340                    1.0000    500
3804                     1.0000    500  7794                     1.0000    500
9966                     1.0000    500  bd168                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/249/249.seqs.fa -oc motifs/249 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.236 G 0.232 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.236 G 0.232 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 18 sites = 10 llr = 126 E-value = 1.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  523:23:21::4:1:::1
pos.-specific     C  :8186378:3913916a7
probability       G  5:2:111:1:1:::11::
matrix            T  ::42132:87:57:83:2

         bits    2.1                 *
                 1.9                 *
                 1.7           *  *  *
                 1.5           *  *  *
Relative         1.3  * *   *  *  *  *
Entropy          1.1 ** *   **** *** *
(18.2 bits)      0.8 ** *  ***** *** **
                 0.6 ** *  ************
                 0.4 ** ** ************
                 0.2 ***** ************
                 0.0 ------------------

Multilevel           ACTCCACCTTCTTCTCCC
consensus            GAATACTA C AC  T T
sequence               G  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------------
bd168                         7  2.05e-10     TAAAAA GCACCTCCTTCTTCTCCC TCGTGTCTTT
35340                       128  1.23e-07 GGACCCCTAA ACGCCCCCACCTCCTCCC AACCTCTTCT
261040                      432  1.69e-07 CCTCCTCCTC GCTTCTCCTCCATCGCCC GTCTCTCCTC
260875                      398  2.75e-07 AGACATTACT ACTTCCCCTTGATCTTCC TCGAAAAAAA
9966                        461  5.20e-07 ACAGTAGTGG ACACGATCTTCATCTCCT CTCACGAGCC
7794                         70  6.16e-07 CGAAACTCGT ACGCCGCATTCTTCTTCT TGGTGCCAGC
18047                       454  1.02e-06 GTGACGAGAA GATCCTCATCCATCTCCA CCAAAACATC
10091                       311  3.73e-06 AACTACAAAA GCCCTCCCGTCTCCTGCC TGTCCCCGAT
31412                       392  3.97e-06 CAGAGACCTC AATCAATCTTCTTATTCC CTTTGCCCCG
3804                        415  4.51e-06 CCCTGTACGG GCACAAGCTTCCCCCCCC CCCGGTTCCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd168                               2e-10  6_[+1]_476
35340                             1.2e-07  127_[+1]_355
261040                            1.7e-07  431_[+1]_51
260875                            2.7e-07  397_[+1]_85
9966                              5.2e-07  460_[+1]_22
7794                              6.2e-07  69_[+1]_413
18047                               1e-06  453_[+1]_29
10091                             3.7e-06  310_[+1]_172
31412                               4e-06  391_[+1]_91
3804                              4.5e-06  414_[+1]_68
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=10
bd168                    (    7) GCACCTCCTTCTTCTCCC  1
35340                    (  128) ACGCCCCCACCTCCTCCC  1
261040                   (  432) GCTTCTCCTCCATCGCCC  1
260875                   (  398) ACTTCCCCTTGATCTTCC  1
9966                     (  461) ACACGATCTTCATCTCCT  1
7794                     (   70) ACGCCGCATTCTTCTTCT  1
18047                    (  454) GATCCTCATCCATCTCCA  1
10091                    (  311) GCCCTCCCGTCTCCTGCC  1
31412                    (  392) AATCAATCTTCTTATTCC  1
3804                     (  415) GCACAAGCTTCCCCCCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 4830 bayes= 8.91289 E= 1.4e+001
    92   -997    110   -997
   -40    176   -997   -997
    18   -124    -22     58
  -997    176   -997    -42
   -40    134   -121   -141
    18     34   -121     17
  -997    157   -121    -42
   -40    176   -997   -997
  -140   -997   -121    158
  -997     34   -997    139
  -997    193   -121   -997
    60   -124   -997     91
  -997     34   -997    139
  -140    193   -997   -997
  -997   -124   -121    158
  -997    134   -121     17
  -997    208   -997   -997
  -140    157   -997    -42
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 10 E= 1.4e+001
 0.500000  0.000000  0.500000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.300000  0.100000  0.200000  0.400000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.600000  0.100000  0.100000
 0.300000  0.300000  0.100000  0.300000
 0.000000  0.700000  0.100000  0.200000
 0.200000  0.800000  0.000000  0.000000
 0.100000  0.000000  0.100000  0.800000
 0.000000  0.300000  0.000000  0.700000
 0.000000  0.900000  0.100000  0.000000
 0.400000  0.100000  0.000000  0.500000
 0.000000  0.300000  0.000000  0.700000
 0.100000  0.900000  0.000000  0.000000
 0.000000  0.100000  0.100000  0.800000
 0.000000  0.600000  0.100000  0.300000
 0.000000  1.000000  0.000000  0.000000
 0.100000  0.700000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AG][CA][TAG][CT][CA][ACT][CT][CA]T[TC]C[TA][TC]CT[CT]C[CT]
--------------------------------------------------------------------------------




Time 1.02 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 14 sites = 8 llr = 98 E-value = 3.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :1:1:5:::::::1
pos.-specific     C  :1::6:::::::1:
probability       G  91164:a3:93a61
matrix            T  1693:5:8a18:38

         bits    2.1       *    *
                 1.9       * *  *
                 1.7       * *  *
                 1.5 * *   * ** *
Relative         1.3 * *   * ** *
Entropy          1.1 * * * ******
(17.7 bits)      0.8 * * **********
                 0.6 * ************
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           GTTGCAGTTGTGGT
consensus               TGT G  G T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            --------------
3804                        223  7.09e-09 CCGGAGGTGT GTTGCAGTTGTGGT AGAGTAAATG
bd168                       114  6.10e-08 TTCTCTCCCT GTTGGTGGTGTGGT TGAACATGAG
7794                        246  4.67e-07 GGGCAAGGAC GTTGCTGGTGTGGA GAGCAGCAAG
31412                       302  6.88e-07 TTACGAAGCT GTTTCAGTTGGGTT GGGTGAAGAA
35340                        78  2.63e-06 AAGTGTATTA GAGGCTGTTGTGTT CCCTTCCCCC
261040                      262  3.09e-06 CCAGTGGCTG GCTGGAGTTGGGGG AGGCCATGGC
9966                        256  4.84e-06 AGGAATCGTA GTTTGAGTTTTGCT TTGCAAATAA
18047                       355  4.84e-06 TGCGAGAGGA TGTACTGTTGTGGT GTAACGGTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3804                              7.1e-09  222_[+2]_264
bd168                             6.1e-08  113_[+2]_373
7794                              4.7e-07  245_[+2]_241
31412                             6.9e-07  301_[+2]_185
35340                             2.6e-06  77_[+2]_409
261040                            3.1e-06  261_[+2]_225
9966                              4.8e-06  255_[+2]_231
18047                             4.8e-06  354_[+2]_132
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=8
3804                     (  223) GTTGCAGTTGTGGT  1
bd168                    (  114) GTTGGTGGTGTGGT  1
7794                     (  246) GTTGCTGGTGTGGA  1
31412                    (  302) GTTTCAGTTGGGTT  1
35340                    (   78) GAGGCTGTTGTGTT  1
261040                   (  262) GCTGGAGTTGGGGG  1
9966                     (  256) GTTTGAGTTTTGCT  1
18047                    (  355) TGTACTGTTGTGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 4870 bayes= 9.24733 E= 3.9e+001
  -965   -965    191   -109
  -108    -92    -89    123
  -965   -965    -89    171
  -108   -965    143     -9
  -965    140     69   -965
    92   -965   -965     91
  -965   -965    210   -965
  -965   -965     11    149
  -965   -965   -965    190
  -965   -965    191   -109
  -965   -965     11    149
  -965   -965    210   -965
  -965    -92    143     -9
  -108   -965    -89    149
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 8 E= 3.9e+001
 0.000000  0.000000  0.875000  0.125000
 0.125000  0.125000  0.125000  0.625000
 0.000000  0.000000  0.125000  0.875000
 0.125000  0.000000  0.625000  0.250000
 0.000000  0.625000  0.375000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.875000  0.125000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.125000  0.625000  0.250000
 0.125000  0.000000  0.125000  0.750000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GTT[GT][CG][AT]G[TG]TG[TG]G[GT]T
--------------------------------------------------------------------------------




Time 1.90 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 8 llr = 87 E-value = 1.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::135:1:::::
pos.-specific     C  ::151:3::9::
probability       G  3a11:a1aa:4a
matrix            T  8:614:5::16:

         bits    2.1  *   * **  *
                 1.9  *   * **  *
                 1.7  *   * **  *
                 1.5  *   * *** *
Relative         1.3  *   * *** *
Entropy          1.1 **   * *****
(15.6 bits)      0.8 **   * *****
                 0.6 **  ** *****
                 0.4 *** ** *****
                 0.2 ************
                 0.0 ------------

Multilevel           TGTCAGTGGCTG
consensus            G  AT C   G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
35340                       459  2.90e-07 GGTACTGTGT TGTAAGTGGCTG GCTATCATAT
260875                      288  1.76e-06 GGGCAATGTG TGGCTGTGGCTG CGACCGAGGA
bd168                       382  2.01e-06 GGTTTTGCCT GGTCTGCGGCTG CTTATCGTGA
10091                        22  2.46e-06 AGTCGATCGA TGACAGTGGCGG AAGCGACGGA
261040                      250  4.39e-06 TTGCCGAGGC GGCCAGTGGCTG GCTGGAGTTG
18047                        76  8.25e-06 CTTAGGTTTT TGTGAGAGGCGG AAATGAACCC
31412                        29  1.46e-05 TCCATCTTGT TGTTCGCGGCGG AGCTGCTTGA
3804                        274  2.30e-05 GTCACCTTGG TGTATGGGGTTG ATGTTGATGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35340                             2.9e-07  458_[+3]_30
260875                            1.8e-06  287_[+3]_201
bd168                               2e-06  381_[+3]_107
10091                             2.5e-06  21_[+3]_467
261040                            4.4e-06  249_[+3]_239
18047                             8.3e-06  75_[+3]_413
31412                             1.5e-05  28_[+3]_460
3804                              2.3e-05  273_[+3]_215
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
35340                    (  459) TGTAAGTGGCTG  1
260875                   (  288) TGGCTGTGGCTG  1
bd168                    (  382) GGTCTGCGGCTG  1
10091                    (   22) TGACAGTGGCGG  1
261040                   (  250) GGCCAGTGGCTG  1
18047                    (   76) TGTGAGAGGCGG  1
31412                    (   29) TGTTCGCGGCGG  1
3804                     (  274) TGTATGGGGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 9.25326 E= 1.8e+002
  -965   -965     11    149
  -965   -965    210   -965
  -108    -92    -89    123
    -8    108    -89   -109
    92    -92   -965     49
  -965   -965    210   -965
  -108      8    -89     91
  -965   -965    210   -965
  -965   -965    210   -965
  -965    189   -965   -109
  -965   -965     69    123
  -965   -965    210   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 1.8e+002
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.125000  0.125000  0.625000
 0.250000  0.500000  0.125000  0.125000
 0.500000  0.125000  0.000000  0.375000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.250000  0.125000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.875000  0.000000  0.125000
 0.000000  0.000000  0.375000  0.625000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TG]GT[CA][AT]G[TC]GGC[TG]G
--------------------------------------------------------------------------------




Time 2.79 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10091                            2.07e-04  21_[+3(2.46e-06)]_277_\
    [+1(3.73e-06)]_172
18047                            9.54e-07  75_[+3(8.25e-06)]_267_\
    [+2(4.84e-06)]_85_[+1(1.02e-06)]_29
260875                           7.48e-06  287_[+3(1.76e-06)]_98_\
    [+1(2.75e-07)]_85
261040                           7.01e-08  249_[+3(4.39e-06)]_[+2(3.09e-06)]_\
    156_[+1(1.69e-07)]_51
31412                            9.35e-07  28_[+3(1.46e-05)]_261_\
    [+2(6.88e-07)]_76_[+1(3.97e-06)]_91
35340                            3.72e-09  77_[+2(2.63e-06)]_36_[+1(1.23e-07)]_\
    313_[+3(2.90e-07)]_30
3804                             2.46e-08  222_[+2(7.09e-09)]_37_\
    [+3(2.30e-05)]_129_[+1(4.51e-06)]_68
7794                             8.48e-06  69_[+1(6.16e-07)]_158_\
    [+2(4.67e-07)]_241
9966                             6.50e-05  255_[+2(4.84e-06)]_191_\
    [+1(5.20e-07)]_22
bd168                            1.72e-12  6_[+1(2.05e-10)]_89_[+2(6.10e-08)]_\
    254_[+3(2.01e-06)]_107
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************