********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/279/279.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11623                    1.0000    500  11758                    1.0000    500
22354                    1.0000    500  24129                    1.0000    500
25671                    1.0000    500  31200                    1.0000    500
36960                    1.0000    500  7208                     1.0000    500
7416                     1.0000    500  7687                     1.0000    500
bd630                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/279/279.seqs.fa -oc motifs/279 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.284 C 0.232 G 0.219 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.284 C 0.232 G 0.219 T 0.265
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  10  llr = 121  E-value = 2.7e+000
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :::::::4::::11:2 pos.-specific C a::9::6:56:344:: probability G :a::82213:651175 matrix T ::a1282524424433 bits 2.2 ** 2.0 *** 1.8 *** 1.5 **** Relative 1.3 ****** * Entropy 1.1 ****** ** * (17.5 bits) 0.9 ****** ** * 0.7 ******* **** ** 0.4 ************ ** 0.2 **************** 0.0 ---------------- Multilevel CGTCGTCTCCGGCCGG consensus TGGAGTTCTTTT sequence T T T A -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 11758 466 2.62e-08 TCAATTGGTA CGTCGTCTCTTCCCGG CCTTGAAGAC 7208 299 3.11e-08 TTGTTCTCGC CGTCGTTTCCTGCCGG AATTCCAATG 7687 413 2.83e-07 TCGCCAGCCC CGTCGTCTCCGCGTTG CGAACTGCTT bd630 285 4.09e-07 AATTTCTCGC CGTCGGTACTGGTCGG CGACCTTCTA 24129 6 4.09e-07 GGGCG CGTCGTCTGTGGATGT AACTTAGGTG 7416 304 1.06e-06 AAGCCGTGGT CGTCGTGGGCGGTCTG TCAAGGCCAG 36960 3 2.51e-06 GA CGTCTTCATTGCTTGT GATTAGTTTG 11623 159 3.88e-06 TTGCCAATAG CGTTGTCTCCTTTTGA GGGATTGAAT 25671 419 8.81e-06 TCATCATTGG CGTCGTGATCGTCGTT GGCAACGCTT 31200 203 9.71e-06 AGGAGCTCAA CGTCTGCAGCTGCAGA AGATGACGAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 11758 2.6e-08 465_[+1]_19 7208 3.1e-08 298_[+1]_186 7687 2.8e-07 412_[+1]_72 bd630 4.1e-07 284_[+1]_200 24129 4.1e-07 5_[+1]_479 7416 1.1e-06 303_[+1]_181 
36960                             2.5e-06  2_[+1]_482
11623                             3.9e-06  158_[+1]_326
25671                             8.8e-06  418_[+1]_66
31200                             9.7e-06  202_[+1]_282
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=10
11758                    (  466) CGTCGTCTCTTCCCGG  1
7208                     (  299) CGTCGTTTCCTGCCGG  1
7687                     (  413) CGTCGTCTCCGCGTTG  1
bd630                    (  285) CGTCGGTACTGGTCGG  1
24129                    (    6) CGTCGTCTGTGGATGT  1
7416                     (  304) CGTCGTGGGCGGTCTG  1
36960                    (    3) CGTCTTCATTGCTTGT  1
11623                    (  159) CGTTGTCTCCTTTTGA  1
25671                    (  419) CGTCGTGATCGTCGTT  1
31200                    (  203) CGTCTGCAGCTGCAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5335 bayes= 10.0014 E= 2.7e+000
  -997    210   -997   -997
  -997   -997    219   -997
  -997   -997   -997    192
  -997    195   -997   -140
  -997   -997    187    -40
  -997   -997    -13    160
  -997    137    -13    -40
    49   -997   -113     92
  -997    110     45    -40
  -997    137   -997     60
  -997   -997    145     60
  -997     37    119    -40
  -150     78   -113     60
  -150     78   -113     60
  -997   -997    167     18
   -50   -997    119     18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 2.7e+000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.900000  0.000000  0.100000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.600000  0.200000  0.200000
 0.400000  0.000000  0.100000  0.500000
 0.000000
0.500000 0.300000 0.200000 0.000000 0.600000 0.000000 0.400000 0.000000 0.000000 0.600000 0.400000 0.000000 0.300000 0.500000 0.200000 0.100000 0.400000 0.100000 0.400000 0.100000 0.400000 0.100000 0.400000 0.000000 0.000000 0.700000 0.300000 0.200000 0.000000 0.500000 0.300000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- CGTC[GT][TG][CGT][TA][CGT][CT][GT][GCT][CT][CT][GT][GTA] -------------------------------------------------------------------------------- Time 1.15 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 6 llr = 106 E-value = 2.2e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :::8::275:2:57:3:82aa pos.-specific C 7a:::8223::a:32:a:7:: probability G ::722232::3:3:85:2::: matrix T 3:3:8:3:2a5:2::2::2:: bits 2.2 * * * 2.0 * * * * 1.8 * * * * ** 1.5 * * * * * * ** Relative 1.3 * *** * * * ** ** Entropy 1.1 ****** * * ** ** ** (25.6 bits) 0.9 ****** * * ** ***** 0.7 ****** * * * ******** 0.4 ****** ************** 0.2 ****** ************** 0.0 --------------------- Multilevel CCGATCGAATTCAAGGCACAA consensus T T T C G GC A sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- 
                                ---------            ---------------------
31200                       427  7.18e-11 GAGGAATCGG CCGATCAAATTCGAGACACAA ACGCTTCCAT
7416                        408  9.71e-11 CTAGTAGGAC CCGATCTACTTCAAGGCATAA CAGCGGCAGC
24129                       461  4.55e-10 ATGGATTCAT CCGATGGATTGCAAGGCACAA CACCAACCTC
bd630                        58  9.13e-09 AAGTGTCTGG TCTATCTAATGCGCCACACAA ACTGTCTGGC
7208                        179  4.13e-08 TATCGACCGT TCTGTCCGATTCTAGTCACAA GACATTGAAC
36960                       432  4.31e-08 AAGATGCGAT CCGAGCGCCTACACGGCGAAA CGTCGAGCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31200                             7.2e-11  426_[+2]_53
7416                              9.7e-11  407_[+2]_72
24129                             4.5e-10  460_[+2]_19
bd630                             9.1e-09  57_[+2]_422
7208                              4.1e-08  178_[+2]_301
36960                             4.3e-08  431_[+2]_48
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
31200                    (  427) CCGATCAAATTCGAGACACAA  1
7416                     (  408) CCGATCTACTTCAAGGCATAA  1
24129                    (  461) CCGATGGATTGCAAGGCACAA  1
bd630                    (   58) TCTATCTAATGCGCCACACAA  1
7208                     (  179) TCTGTCCGATTCTAGTCACAA  1
36960                    (  432) CCGAGCGCCTACACGGCGAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.2276 E= 2.2e+001
  -923    152   -923     33
  -923    210   -923   -923
  -923   -923    160     33
   155   -923    -40   -923
  -923   -923    -40    165
  -923    184    -40   -923
   -77    -48     60     33
   123    -48    -40   -923
    82     52   -923    -67
  -923   -923   -923    192
   -77   -923     60     92
  -923    210   -923   -923
    82   -923     60    -67
   123     52   -923   -923
  -923    -48    192   -923
    23   -923    119    -67
  -923    210   -923   -923
   155   -923    -40   -923
   -77    152   -923    -67
   182   -923   -923   -923
   182   -923   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.2e+001
 0.000000  0.666667  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.833333  0.166667  0.000000
 0.166667  0.166667  0.333333  0.333333
 0.666667  0.166667  0.166667  0.000000
 0.500000  0.333333  0.000000  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.000000  0.333333  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.333333  0.166667
 0.666667  0.333333  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.333333  0.000000  0.500000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.166667  0.666667  0.000000  0.166667
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CT]C[GT]ATC[GT]A[AC]T[TG]C[AG][AC]G[GA]CACAA
--------------------------------------------------------------------------------




Time  2.25 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 10 llr = 135 E-value = 3.8e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::3592:51133:21144:4: pos.-specific C ::4::8113162::::::63: probability G 22351:9135::7:39:23:: matrix T 88:::::33315386:6413a bits 2.2 2.0 * 1.8 * * * 1.5 * * * Relative 1.3 ** *** * * * Entropy 1.1 ** **** ** * * (19.5 bits) 0.9 ** **** ** ** * * 0.7 ** **** * ***** * * 0.4 ******* ************ 0.2 ********************* 0.0 --------------------- Multilevel TTCAACGACGCTGTTGTACAT consensus GGAG A TGTAATAG ATGC sequence G T C G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 7208 62 8.37e-10 TGAGTGAGTG TTAAACGATGCCGTTGTACAT ATGATTACGA 31200 319 4.06e-08 ATTGATTATG TTGGACGTTTCTGAGGTTGAT AAAAATTATC 11623 45 4.06e-08 GCTGTCTCTC GTGAACGACTAAGTTGTTCTT AGATACTCTT 25671 178 6.36e-08 ATTGAACAGT TTCAACGTGGTTTTTGAACTT CACAAGCTCC 7416 46 7.89e-08 TTCATTGGCG TTAGGCGAGGAAGTTGTTGCT GACAATTGCC 36960 27 2.11e-07 GTGATTAGTT TGGAACGACACATTTGTACCT GTCCCACGTG bd630 419 6.68e-07 TGATGAACCA GTAGAAGGTTCTGTTGATGAT CCTACGACAT 22354 297 1.52e-06 TCTTCATGTG TGCGACGACGCCGTGATGTCT GTTGAGTTGA 7687 112 2.19e-06 TGCATATGAG TTCGAAGCGCATTTGGAGCAT AACATATATA 24129 333 2.60e-06 CCAAAATAGT TTCAACCTAGCTGAAGAACTT TTCTCATTCT 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7208                              8.4e-10  61_[+3]_418
31200                             4.1e-08  318_[+3]_161
11623                             4.1e-08  44_[+3]_435
25671                             6.4e-08  177_[+3]_302
7416                              7.9e-08  45_[+3]_434
36960                             2.1e-07  26_[+3]_453
bd630                             6.7e-07  418_[+3]_61
22354                             1.5e-06  296_[+3]_183
7687                              2.2e-06  111_[+3]_368
24129                             2.6e-06  332_[+3]_147
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=10
7208                     (   62) TTAAACGATGCCGTTGTACAT  1
31200                    (  319) TTGGACGTTTCTGAGGTTGAT  1
11623                    (   45) GTGAACGACTAAGTTGTTCTT  1
25671                    (  178) TTCAACGTGGTTTTTGAACTT  1
7416                     (   46) TTAGGCGAGGAAGTTGTTGCT  1
36960                    (   27) TGGAACGACACATTTGTACCT  1
bd630                    (  419) GTAGAAGGTTCTGTTGATGAT  1
22354                    (  297) TGCGACGACGCCGTGATGTCT  1
7687                     (  112) TTCGAAGCGCATTTGGAGCAT  1
24129                    (  333) TTCAACCTAGCTGAAGAACTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 9.29364 E= 3.8e+001
  -997   -997    -13    160
  -997   -997    -13    160
     8     78     45   -997
    82   -997    119   -997
   166   -997   -113   -997
   -50    178   -997   -997
  -997   -121    204   -997
    82   -121   -113     18
  -150     37     45     18
  -150   -121    119     18
     8    137   -997   -140
     8    -22   -997     92
  -997   -997    167     18
   -50   -997   -997    160
  -150   -997     45    118
  -150   -997    204   -997
    49   -997   -997    118
    49   -997    -13     60
  -997    137     45   -140
    49     37   -997     18
  -997   -997   -997    192
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 3.8e+001
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  0.200000  0.800000
 0.300000  0.400000  0.300000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.100000  0.900000  0.000000
 0.500000  0.100000  0.100000  0.300000
 0.100000  0.300000  0.300000  0.300000
 0.100000  0.100000  0.500000  0.300000
 0.300000  0.600000  0.000000  0.100000
 0.300000  0.200000  0.000000  0.500000
 0.000000  0.000000  0.700000  0.300000
 0.200000  0.000000  0.000000  0.800000
 0.100000  0.000000  0.300000  0.600000
 0.100000  0.000000  0.900000  0.000000
 0.400000  0.000000  0.000000  0.600000
 0.400000  0.000000  0.200000  0.400000
 0.000000  0.600000  0.300000  0.100000
 0.400000  0.300000  0.000000  0.300000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TG][TG][CAG][AG]A[CA]G[AT][CGT][GT][CA][TAC][GT][TA][TG]G[TA][ATG][CG][ACT]T
--------------------------------------------------------------------------------




Time  3.36 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11623                            2.49e-06  44_[+3(4.06e-08)]_93_[+1(3.88e-06)]_\
    326
11758                            7.57e-04  465_[+1(2.62e-08)]_19
22354                            1.58e-02  296_[+3(1.52e-06)]_183
24129                            2.69e-11  5_[+1(4.09e-07)]_311_[+3(2.60e-06)]_\
    107_[+2(4.55e-10)]_19
25671                            1.77e-05  177_[+3(6.36e-08)]_220_\
    [+1(8.81e-06)]_66
31200                            1.86e-12  202_[+1(9.71e-06)]_100_\
    [+3(4.06e-08)]_87_[+2(7.18e-11)]_53
36960                            9.78e-10  2_[+1(2.51e-06)]_8_[+3(2.11e-07)]_\
    384_[+2(4.31e-08)]_48
7208                             8.52e-14  61_[+3(8.37e-10)]_96_[+2(4.13e-08)]_\
    99_[+1(3.11e-08)]_186
7416                             5.75e-13  45_[+3(7.89e-08)]_29_[+2(7.04e-06)]_\
    187_[+1(1.06e-06)]_88_[+2(9.71e-11)]_72
7687                             1.60e-05  111_[+3(2.19e-06)]_280_\
    [+1(2.83e-07)]_72
bd630                            1.25e-10  57_[+2(9.13e-09)]_102_\
    [+2(9.13e-09)]_83_[+1(4.09e-07)]_118_[+3(6.68e-07)]_61
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************