********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/76/76.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47891                    1.0000    500  48289                    1.0000    500
48414                    1.0000    500  39777                    1.0000    500
15802                    1.0000    500  2739                     1.0000    500
1389                     1.0000    500  44149                    1.0000    500
50359                    1.0000    500  48560                    1.0000    500
47349                    1.0000    500  47774                    1.0000    500
47751                    1.0000    500  50142                    1.0000    500
37898                    1.0000    500  40362                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/76/76.seqs.fa -oc motifs/76 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.267 C 0.247 G 0.217 T 0.269
Background letter frequencies (from dataset with add-one prior applied):
A 0.267 C 0.247 G 0.217 T 0.269
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 19 sites = 4 llr = 78 E-value = 5.8e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::35::3:::::::::::
pos.-specific     C  ::3:38:35:38:::::38
probability       G  :a883:a3:a8:a:3:a83
matrix            T  a::::3:35::3:a8a:::

         bits    2.2  *    *  *  *   *
                 2.0 **    *  *  ** **
                 1.8 **    *  *  ** **
                 1.5 **    *  *  ** **
Relative         1.3 ****  *  ** ** ****
Entropy          1.1 **** **  **********
(28.3 bits)      0.9 **** ** ***********
                 0.7 **** ** ***********
                 0.4 ******* ***********
                 0.2 ******* ***********
                 0.0 -------------------

Multilevel           TGGGACGACGGCGTTTGGC
consensus              CACT CT CT  G  CG
sequence                 G  G
                            T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
48560                        13  1.64e-10 CAATCACCAC TGGGGCGATGGCGTTTGGG ATGGTCCGAT
47349                        21  2.03e-10 TCAGCGGTTT TGCGCCGGCGGCGTTTGGC AGGCCTGGCG
1389                        255  1.43e-09 AAATCCCCAA TGGGATGCCGCCGTGTGGC TGTCAAATTT
44149                       268  2.43e-09 GGCTGCCTAA TGGAACGTTGGTGTTTGCC TTTTAGCTAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48560                             1.6e-10  12_[+1]_469
47349                               2e-10  20_[+1]_461
1389                              1.4e-09  254_[+1]_227
44149                             2.4e-09  267_[+1]_214
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=4
48560                    (   13) TGGGGCGATGGCGTTTGGG  1
47349                    (   21) TGCGCCGGCGGCGTTTGGC  1
1389                     (  255) TGGGATGCCGCCGTGTGGC  1
44149                    (  268) TGGAACGTTGGTGTTTGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 7712 bayes= 10.9121 E= 5.8e+001
  -865   -865   -865    189
  -865   -865    220   -865
  -865      2    179   -865
   -10   -865    179   -865
    90      2     20   -865
  -865    160   -865    -11
  -865   -865    220   -865
   -10      2     20    -11
  -865    102   -865     89
  -865   -865    220   -865
  -865      2    179   -865
  -865    160   -865    -11
  -865   -865    220   -865
  -865   -865   -865    189
  -865   -865     20    148
  -865   -865   -865    189
  -865   -865    220   -865
  -865      2    179   -865
  -865    160     20   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 5.8e+001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.250000  0.250000  0.250000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.750000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
TG[GC][GA][ACG][CT]G[ACGT][CT]G[GC][CT]GT[TG]TG[GC][CG]
--------------------------------------------------------------------------------

Time 2.08 secs.
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 6 llr = 110 E-value = 1.6e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  2:7:7::2:::a32:::::83
pos.-specific     C  :227:2:::87:3::::a::7
probability       G  :8:3278::22:23:27:a::
matrix            T  8:2:2228a:2:25a83::2:

         bits    2.2                   *
                 2.0         *  *  *  **
                 1.8         *  *  *  **
                 1.5  *    * *  *  *  **
Relative         1.3 **    **** *  ** ***
Entropy          1.1 ** *  **** *  *******
(26.4 bits)      0.9 ** * *******  *******
                 0.7 ************  *******
                 0.4 ************ ********
                 0.2 ************ ********
                 0.0 ---------------------

Multilevel           TGACAGGTTCCAATTTGCGAC
consensus               G        CG  T   A
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
47751                       414  7.04e-11 CCGACGGAGT TGAGAGGTTCCAAGTTTCGAA ACTCTGCGGT
48414                       327  3.36e-10 ACTCTAGGGT TGAGATGTTCCACTTGGCGAC GTCATTCAAG
1389                        344  1.94e-09 GTTACGCTCT TGCCACGTTCTATTTTGCGAC GAGTGGGCGG
2739                        366  2.29e-09 GTATTCGCGA TCACGGGATCCAGTTTGCGAC GAAGTCGGTC
47349                       333  7.66e-09 GACTGTCCAT TGACAGTTTGGACGTTGCGTC AGCCCGAGCA
48560                       205  1.36e-08 ATGTCTGGGA AGTCTGGTTCCAAATTTCGAA AGATTCCGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47751                               7e-11  413_[+2]_66
48414                             3.4e-10  326_[+2]_153
1389                              1.9e-09  343_[+2]_136
2739                              2.3e-09  365_[+2]_114
47349                             7.7e-09  332_[+2]_147
48560                             1.4e-08  204_[+2]_275
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
47751                    (  414) TGAGAGGTTCCAAGTTTCGAA  1
48414                    (  327) TGAGATGTTCCACTTGGCGAC  1
1389                     (  344) TGCCACGTTCTATTTTGCGAC  1
2739                     (  366) TCACGGGATCCAGTTTGCGAC  1
47349                    (  333) TGACAGTTTGGACGTTGCGTC  1
48560                    (  205) AGTCTGGTTCCAAATTTCGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 10.7686 E= 1.6e+001
   -68   -923   -923    163
  -923    -56    194   -923
   132    -56   -923    -69
  -923    143     62   -923
   132   -923    -38    -69
  -923    -56    162    -69
  -923   -923    194    -69
   -68   -923   -923    163
  -923   -923   -923    189
  -923    175    -38   -923
  -923    143    -38    -69
   190   -923   -923   -923
    32     43    -38    -69
   -68   -923     62     89
  -923   -923   -923    189
  -923   -923    -38    163
  -923   -923    162     31
  -923    202   -923   -923
  -923   -923    220   -923
   164   -923   -923    -69
    32    143   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.6e+001
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.166667  0.833333  0.000000
 0.666667  0.166667  0.000000  0.166667
 0.000000  0.666667  0.333333  0.000000
 0.666667  0.000000  0.166667  0.166667
 0.000000  0.166667  0.666667  0.166667
 0.000000  0.000000  0.833333  0.166667
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.666667  0.166667  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.333333  0.166667  0.166667
 0.166667  0.000000  0.333333  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.666667  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.333333  0.666667  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
TGA[CG]AGGTTCCA[AC][TG]TT[GT]CGA[CA]
--------------------------------------------------------------------------------

Time 4.06 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 8 llr = 95 E-value = 1.7e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  9::3:4::aa51
pos.-specific     C  :1:::66::::9
probability       G  19a36:4a::4:
matrix            T  :::54:::::1:

         bits    2.2   *    *
                 2.0   *    ***
                 1.8   *    ***
                 1.5  **    *** *
Relative         1.3 ***    *** *
Entropy          1.1 *** ****** *
(17.1 bits)      0.9 *** ****** *
                 0.7 *** ********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AGGTGCCGAAAC
consensus               ATAG   G
sequence                G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47774                         3  4.57e-08         GT AGGTGCCGAAAC GACTTGATCG
50359                       205  7.14e-07 ACAGTAAGAA AGGTGAGGAAGC TACATTAGAT
1389                        446  9.63e-07 TCTCGGGAGA AGGAGACGAAAC CATCACTGTT
47891                       466  1.25e-06 ATCAAGCGCC AGGATCCGAAAC AACATCAACC
47751                       344  1.82e-06 AACGCACGCG AGGGTACGAAAC GCCTGCAGTT
39777                         9  2.41e-06   CTAGTCTC ACGTGCCGAAGC GCGAGATATG
48414                       279  7.12e-06 CGTCGAAGCT GGGGTCGGAAGC ATCGTAAACC
37898                       217  8.92e-06 GTGTAAATGG AGGTGCGGAATA GACATCCAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47774                             4.6e-08  2_[+3]_486
50359                             7.1e-07  204_[+3]_284
1389                              9.6e-07  445_[+3]_43
47891                             1.3e-06  465_[+3]_23
47751                             1.8e-06  343_[+3]_145
39777                             2.4e-06  8_[+3]_480
48414                             7.1e-06  278_[+3]_210
37898                             8.9e-06  216_[+3]_272
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
47774                    (    3) AGGTGCCGAAAC  1
50359                    (  205) AGGTGAGGAAGC  1
1389                     (  446) AGGAGACGAAAC  1
47891                    (  466) AGGATCCGAAAC  1
47751                    (  344) AGGGTACGAAAC  1
39777                    (    9) ACGTGCCGAAGC  1
48414                    (  279) GGGGTCGGAAGC  1
37898                    (  217) AGGTGCGGAATA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 10.6698 E= 1.7e+002
   171   -965    -79   -965
  -965    -98    201   -965
  -965   -965    220   -965
   -10   -965     20     89
  -965   -965    152     48
    49    134   -965   -965
  -965    134     79   -965
  -965   -965    220   -965
   190   -965   -965   -965
   190   -965   -965   -965
    90   -965     79   -110
  -109    182   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 1.7e+002
 0.875000  0.000000  0.125000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.250000  0.500000
 0.000000  0.000000  0.625000  0.375000
 0.375000  0.625000  0.000000  0.000000
 0.000000  0.625000  0.375000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.375000  0.125000
 0.125000  0.875000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
AGG[TAG][GT][CA][CG]GAA[AG]C
--------------------------------------------------------------------------------

Time 6.15 secs.
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47891                            1.51e-02  156_[+3(2.48e-05)]_297_\
    [+3(1.25e-06)]_23
48289                            9.31e-01  500
48414                            5.14e-08  278_[+3(7.12e-06)]_36_\
    [+2(3.36e-10)]_153
39777                            2.83e-02  8_[+3(2.41e-06)]_480
15802                            7.07e-01  500
2739                             1.69e-05  365_[+2(2.29e-09)]_114
1389                             2.03e-13  254_[+1(1.43e-09)]_70_\
    [+2(1.94e-09)]_81_[+3(9.63e-07)]_43
44149                            9.13e-06  267_[+1(2.43e-09)]_214
50359                            5.76e-03  204_[+3(7.14e-07)]_284
48560                            5.26e-11  12_[+1(1.64e-10)]_173_\
    [+2(1.36e-08)]_275
47349                            1.04e-10  20_[+1(2.03e-10)]_293_\
    [+2(7.66e-09)]_147
47774                            5.72e-04  2_[+3(4.57e-08)]_486
47751                            7.97e-09  343_[+3(1.82e-06)]_58_\
    [+2(7.04e-11)]_66
50142                            3.73e-01  500
37898                            2.44e-02  216_[+3(8.92e-06)]_272
40362                            8.13e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************