********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/190/190.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42830                    1.0000    500  9413                     1.0000    500
43024                    1.0000    500  47036                    1.0000    500
13890                    1.0000    500  21897                    1.0000    500
47843                    1.0000    500  51114                    1.0000    500
29821                    1.0000    500  39681                    1.0000    500
49487                    1.0000    500  44364                    1.0000    500
44479                    1.0000    500  11016                    1.0000    500
11305                    1.0000    500  45017                    1.0000    500
45140                    1.0000    500  34898                    1.0000    500
34923                    1.0000    500  35589                    1.0000    500
36180                    1.0000    500  50441                    1.0000    500
35852                    1.0000    500  35863                    1.0000    500
34287                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/190/190.seqs.fa -oc motifs/190 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 25 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 12500 N= 25 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.267 C 0.238 G 0.222 T 0.274 Background letter frequencies (from dataset with add-one prior applied): A 0.267 C 0.238 G 0.222 T 0.274 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 12 sites = 8 llr = 103 E-value = 2.0e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :6::5a:1::3a pos.-specific C 1:6:::a1:::: probability G 9:::5:::a:8: matrix T :44a:::8:a:: bits 2.2 * * 2.0 * ** ** * 1.7 * ** ** * 1.5 * * ** ** * Relative 1.3 * * ** **** Entropy 1.1 * ***** **** (18.6 bits) 0.9 ************ 0.7 ************ 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel GACTAACTGTGA consensus TT G A sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 34898 390 2.36e-07 CTGGTGTCTG GATTGACTGTGA AAAAAGGCGA 11305 229 2.36e-07 GATCCAAATT GATTGACTGTGA ATGTGAACTT 49487 269 
3.02e-07 TGTTCGCGAC GTCTAACTGTGA TTCTAACCGT 36180 477 4.41e-07 ACGTTGGACA GTTTGACTGTGA AGCAGAGTAA 51114 258 6.60e-07 TTATCCAATA GACTAACTGTAA GGGATATGAA 21897 279 6.60e-07 AGAGTCTCCT GACTAACTGTAA AAAAGGGAGA 44479 370 1.36e-06 CCTTCGTTTA GTCTGACCGTGA GCAGTGAAAG 44364 99 4.02e-06 TTTCCACGAA CACTAACAGTGA AGGTCGAGCG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 34898 2.4e-07 389_[+1]_99 11305 2.4e-07 228_[+1]_260 49487 3e-07 268_[+1]_220 36180 4.4e-07 476_[+1]_12 51114 6.6e-07 257_[+1]_231 21897 6.6e-07 278_[+1]_210 44479 1.4e-06 369_[+1]_119 44364 4e-06 98_[+1]_390 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=12 seqs=8 34898 ( 390) GATTGACTGTGA 1 11305 ( 229) GATTGACTGTGA 1 49487 ( 269) GTCTAACTGTGA 1 36180 ( 477) GTTTGACTGTGA 1 51114 ( 258) GACTAACTGTAA 1 21897 ( 279) GACTAACTGTAA 1 44479 ( 370) GTCTGACCGTGA 1 44364 ( 99) CACTAACAGTGA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 12225 bayes= 10.5766 E= 2.0e+001 -965 -92 198 -965 123 -965 -965 45 -965 139 -965 45 -965 -965 -965 187 90 -965 117 -965 190 -965 -965 -965 -965 207 -965 -965 -109 -92 -965 145 -965 -965 217 -965 -965 -965 -965 187 -9 -965 176 -965 190 -965 -965 -965 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 2.0e+001
 0.000000  0.125000  0.875000  0.000000
 0.625000  0.000000  0.000000  0.375000
 0.000000  0.625000  0.000000  0.375000
 0.000000  0.000000  0.000000  1.000000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.125000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.000000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
G[AT][CT]T[AG]ACTGT[GA]A
--------------------------------------------------------------------------------

Time  4.97 secs.

******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 19 llr = 179 E-value = 4.7e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 87:6:8a9754: pos.-specific C :25:1:::2:2a probability G 21539::1142: matrix T :::1:2:::22: bits 2.2 * 2.0 * * 1.7 * * * 1.5 * ** * Relative 1.3 * **** * Entropy 1.1 * * **** * (13.6 bits) 0.9 *** ***** * 0.7 ********* * 0.4 ********** * 0.2 ********** * 0.0 ------------ Multilevel AAGAGAAAAAAC consensus GCCG GC sequence G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 11016 288 1.24e-06 ATGCTGGTAG AACGGAAAAAAC GATGTTGTTG 36180 181 1.62e-06 TGTGATGTAC AAGGGAAAAAGC TAGTAGAGCC 44364 254 3.95e-06 TTTCTCCAAC GAGAGAAAAAGC ACAACGATCT 9413 395 3.95e-06 TCACGGTGCA AACAGAAACAAC CGCTGGTGTA 50441 42 5.15e-06 GCAATCGCCC ACGAGAAAAATC AACCGTCCAT 39681 56 6.89e-06 AGGAACCAAA AAGGGAAACAAC CTTTGATTGG 34898 331 7.50e-06 TTTGACGCTT AACTGAAAAAAC ACAACTATGC 45140 413 9.91e-06 GCAAGGCTAC AAGGGAAAATTC ACGCCGCTTG 21897 199 9.91e-06 GATGGCGTAG AAGAGAAAGACC AACCCGTCGG 43024 68 1.32e-05 CGACACAATT AGCGGAAAAGAC AAAATTGGAG 49487 300 1.56e-05 TAACAAAGTT AACAGTAAAGTC GCCTTCCCAG 35863 431 1.79e-05 TACACAATTG AAGACAAAAGCC TCCAGTAACG 35852 332 2.38e-05 GAAAATTGCT AACAGTAAATAC TTTTGAATGG 47843 22 2.52e-05 GCTTGCCATT GAGAGAAGAGAC ATACGAGCTG 34923 422 3.56e-05 TCATTTTTCC AGCAGAAAATCC AGAATCCACC 34287 
255 7.41e-05 TGCACATAGT AAGGGTAAGGGC TGTTGATTGA 11305 377 1.09e-04 CGAATCAGAC GCGACAAAAGGC ACGATCGTTG 44479 181 1.09e-04 ATAGCCACCG GCCAGAAGAATC TTACCCGGCG 47036 170 1.09e-04 TATTCAGAGA ACCTGAAACGCC TAGCCAGACG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 11016 1.2e-06 287_[+2]_201 36180 1.6e-06 180_[+2]_308 44364 3.9e-06 253_[+2]_235 9413 3.9e-06 394_[+2]_94 50441 5.2e-06 41_[+2]_447 39681 6.9e-06 55_[+2]_433 34898 7.5e-06 330_[+2]_158 45140 9.9e-06 412_[+2]_76 21897 9.9e-06 198_[+2]_290 43024 1.3e-05 67_[+2]_421 49487 1.6e-05 299_[+2]_189 35863 1.8e-05 430_[+2]_58 35852 2.4e-05 331_[+2]_157 47843 2.5e-05 21_[+2]_467 34923 3.6e-05 421_[+2]_67 34287 7.4e-05 254_[+2]_234 11305 0.00011 376_[+2]_112 44479 0.00011 180_[+2]_308 47036 0.00011 169_[+2]_319 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=19 11016 ( 288) AACGGAAAAAAC 1 36180 ( 181) AAGGGAAAAAGC 1 44364 ( 254) GAGAGAAAAAGC 1 9413 ( 395) AACAGAAACAAC 1 50441 ( 42) ACGAGAAAAATC 1 39681 ( 56) AAGGGAAACAAC 1 34898 ( 331) AACTGAAAAAAC 1 45140 ( 413) AAGGGAAAATTC 1 21897 ( 199) AAGAGAAAGACC 1 43024 ( 68) AGCGGAAAAGAC 1 49487 ( 300) AACAGTAAAGTC 1 35863 ( 431) AAGACAAAAGCC 1 35852 ( 332) AACAGTAAATAC 1 47843 ( 22) GAGAGAAGAGAC 1 34923 ( 422) AGCAGAAAATCC 1 34287 ( 255) AAGGGTAAGGGC 1 11305 ( 377) GCGACAAAAGGC 1 44479 ( 181) GCCAGAAGAATC 1 47036 ( 170) ACCTGAAACGCC 1 // -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 12225 bayes= 10.2258 E= 4.7e+000
   156  -1089     -7  -1089
   136    -17   -107  -1089
 -1089    100    125  -1089
   112  -1089     51   -138
 -1089   -117    201  -1089
   166  -1089  -1089    -80
   191  -1089  -1089  -1089
   174  -1089   -107  -1089
   146    -59   -107  -1089
    83  -1089     73    -80
    46    -17     -7    -38
 -1089    207  -1089  -1089
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 19 E= 4.7e+000
 0.789474  0.000000  0.210526  0.000000
 0.684211  0.210526  0.105263  0.000000
 0.000000  0.473684  0.526316  0.000000
 0.578947  0.000000  0.315789  0.105263
 0.000000  0.105263  0.894737  0.000000
 0.842105  0.000000  0.000000  0.157895
 1.000000  0.000000  0.000000  0.000000
 0.894737  0.000000  0.105263  0.000000
 0.736842  0.157895  0.105263  0.000000
 0.473684  0.000000  0.368421  0.157895
 0.368421  0.210526  0.210526  0.210526
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[AG][AC][GC][AG]GAAAA[AG][ACGT]C
--------------------------------------------------------------------------------

Time 10.00 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 20 sites = 6 llr = 107 E-value = 2.9e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::::::5:a::::78a8:2: pos.-specific C :7:222:2:282::2::2:: probability G :2a275:8:225a3:::::8 matrix T a2:7235::7:3::::2882 bits 2.2 * * 2.0 * * * * * 1.7 * * * * * 1.5 * * ** * * * * Relative 1.3 * * ** * * ****** Entropy 1.1 * * ** * ******** (25.7 bits) 0.9 *** * *** * ******** 0.7 ******************** 0.4 ******************** 0.2 ******************** 0.0 -------------------- Multilevel TCGTGGAGATCGGAAAATTG consensus TT T G sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 11305 151 7.22e-11 AAGTCAATGC TCGCGGAGATCTGAAAATTG CAGCAGCGTT 21897 400 1.58e-09 TGTCTACATT TGGTCGAGAGCGGAAAATTG AGATCCCAAT 35852 310 2.99e-09 AATCGCTATT TCGGTCAGATCTGAAAATTG CTAACAGTAA 29821 186 4.18e-09 CGTCAGGTCC TTGTGGTGACCGGGAAATAG AATTTCCCGT 42830 357 1.43e-08 GCGTTGCCGT TCGTGTTCATCGGGAAACTT CAGTAATCCC 36180 295 1.73e-08 CCTACGCCTT TCGTGTTGATGCGACATTTG GCATACGTCC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM 
------------- ---------------- ------------- 11305 7.2e-11 150_[+3]_330 21897 1.6e-09 399_[+3]_81 35852 3e-09 309_[+3]_171 29821 4.2e-09 185_[+3]_295 42830 1.4e-08 356_[+3]_124 36180 1.7e-08 294_[+3]_186 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=20 seqs=6 11305 ( 151) TCGCGGAGATCTGAAAATTG 1 21897 ( 400) TGGTCGAGAGCGGAAAATTG 1 35852 ( 310) TCGGTCAGATCTGAAAATTG 1 29821 ( 186) TTGTGGTGACCGGGAAATAG 1 42830 ( 357) TCGTGTTCATCGGGAAACTT 1 36180 ( 295) TCGTGTTGATGCGACATTTG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 12025 bayes= 11.4157 E= 2.9e+002 -923 -923 -923 187 -923 149 -41 -72 -923 -923 217 -923 -923 -51 -41 128 -923 -51 159 -72 -923 -51 117 28 90 -923 -923 87 -923 -51 191 -923 190 -923 -923 -923 -923 -51 -41 128 -923 181 -41 -923 -923 -51 117 28 -923 -923 217 -923 132 -923 59 -923 164 -51 -923 -923 190 -923 -923 -923 164 -923 -923 -72 -923 -51 -923 160 -68 -923 -923 160 -923 -923 191 -72 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 2.9e+002 0.000000 0.000000 0.000000 1.000000 0.000000 0.666667 0.166667 0.166667 0.000000 0.000000 1.000000 0.000000 0.000000 0.166667 0.166667 0.666667 0.000000 0.166667 0.666667 0.166667 0.000000 0.166667 0.500000 0.333333 
0.500000 0.000000 0.000000 0.500000 0.000000 0.166667 0.833333 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.166667 0.166667 0.666667 0.000000 0.833333 0.166667 0.000000 0.000000 0.166667 0.500000 0.333333 0.000000 0.000000 1.000000 0.000000 0.666667 0.000000 0.333333 0.000000 0.833333 0.166667 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.833333 0.000000 0.000000 0.166667 0.000000 0.166667 0.000000 0.833333 0.166667 0.000000 0.000000 0.833333 0.000000 0.000000 0.833333 0.166667 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- TCGTG[GT][AT]GATC[GT]G[AG]AAATTG -------------------------------------------------------------------------------- Time 15.15 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 42830 1.69e-04 356_[+3(1.43e-08)]_124 9413 2.40e-02 394_[+2(3.95e-06)]_94 43024 1.92e-02 67_[+2(1.32e-05)]_421 47036 1.29e-01 500 13890 4.66e-01 500 21897 4.81e-10 198_[+2(9.91e-06)]_68_\ [+1(6.60e-07)]_109_[+3(1.58e-09)]_81 47843 3.54e-02 21_[+2(2.52e-05)]_467 51114 1.47e-03 257_[+1(6.60e-07)]_231 29821 2.54e-05 185_[+3(4.18e-09)]_295 39681 3.67e-02 55_[+2(6.89e-06)]_433 49487 3.17e-05 268_[+1(3.02e-07)]_[+1(9.48e-05)]_7_\ [+2(1.56e-05)]_59_[+2(6.63e-05)]_118 44364 2.92e-04 98_[+1(4.02e-06)]_143_\ 
[+2(3.95e-06)]_235 44479 1.02e-04 369_[+1(1.36e-06)]_59_\ [+3(5.57e-05)]_40 11016 4.50e-03 287_[+2(1.24e-06)]_201 11305 9.49e-11 150_[+3(7.22e-11)]_58_\ [+1(2.36e-07)]_260 45017 5.68e-01 500 45140 4.29e-02 412_[+2(9.91e-06)]_76 34898 1.94e-05 330_[+2(7.50e-06)]_47_\ [+1(2.36e-07)]_99 34923 7.76e-02 421_[+2(3.56e-05)]_67 35589 2.62e-01 500 36180 5.69e-10 180_[+2(1.62e-06)]_102_\ [+3(1.73e-08)]_162_[+1(4.41e-07)]_12 50441 2.83e-02 41_[+2(5.15e-06)]_447 35852 7.94e-07 309_[+3(2.99e-09)]_2_[+2(2.38e-05)]_\ 157 35863 4.74e-02 430_[+2(1.79e-05)]_58 34287 1.70e-02 254_[+2(7.41e-05)]_86_\ [+1(9.48e-05)]_136 -------------------------------------------------------------------------------- ******************************************************************************** ******************************************************************************** Stopped because nmotifs = 3 reached. ******************************************************************************** CPU: seaotter.hsd1.wa.comcast.net ********************************************************************************