********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/218/218.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46957                    1.0000    500  47351                    1.0000    500
37957                    1.0000    500  47754                    1.0000    500
51092                    1.0000    500  14397                    1.0000    500
48750                    1.0000    500  49009                    1.0000    500
54983                    1.0000    500  49238                    1.0000    500
16036                    1.0000    500  16120                    1.0000    500
42307                    1.0000    500  49642                    1.0000    500
52619                    1.0000    500  51232                    1.0000    500
23629                    1.0000    500  50026                    1.0000    500
41327                    1.0000    500  50387                    1.0000    500
33217                    1.0000    500  44046                    1.0000    500
12902                    1.0000    500  55010                    1.0000    500
50304                    1.0000    500  55004                    1.0000    500
48868                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/218/218.seqs.fa -oc motifs/218 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       27    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           13500    N=              27
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.264 C 0.251 G 0.227 T 0.258
Background letter frequencies (from dataset with add-one prior applied):
A 0.264 C 0.251 G 0.227 T 0.258
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 11 llr = 145 E-value = 8.0e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a:1:1:93:682:951
pos.-specific     C  :8:1:1:73:2:7:::
probability       G  :25:97::73:53::6
matrix            T  ::59:21::1:3:153

         bits    2.1
                 1.9 *
                 1.7 *   *
                 1.5 *  ** *      *
Relative         1.3 ** ** * * * **
Entropy          1.1 ** ****** * **
(19.1 bits)      0.9 ** ****** * ****
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           ACGTGGACGAAGCAAG
consensus              T    ACG TG TT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
51232                       269  8.31e-09 ACTGTATGCC ACTTGGACGAAACATG CCGACCACCT
55010                        14  4.99e-08 TCAAATGCCA ACTTGGAAGGAGGAAG TTGTTGCATC
16036                       455  5.98e-08 GAATGATGAC AGGTGGACGACGCAAG GCTTCGAGCC
37957                       462  1.47e-07 CTATAAGTAC ACGTGTACGAATCATT GGTGGGCAGC
23629                       243  1.61e-07 GGTGTGTGCT ACTTGGAAGAAACATT CTATTGGCAA
55004                       364  2.30e-07 TACTGTTGCC ACGTAGACGGAGGAAG TTCCCCGTTC
14397                       467  2.49e-07 CTAGTGAAAC ACTTGGACCTATCAAG GAGGGATACG
16120                        96  1.08e-06 ACCGACGGAG ACGCGTACGGAGGATG CTCCCACTGT
49238                       466  1.72e-06 CCCTCCCATT AGTTGGAACAAGCTTG TCCTCTGGCA
51092                       231  1.95e-06 TCTAAGACTG ACGTGGTCCAATCAAA TTTTCGCTAT
50304                       105  2.08e-06 TCAAGTTATT ACATGCACGACGCAAT TTAATATGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
51232                             8.3e-09  268_[+1]_216
55010                               5e-08  13_[+1]_471
16036                               6e-08  454_[+1]_30
37957                             1.5e-07  461_[+1]_23
23629                             1.6e-07  242_[+1]_242
55004                             2.3e-07  363_[+1]_121
14397                             2.5e-07  466_[+1]_18
16120                             1.1e-06  95_[+1]_389
49238                             1.7e-06  465_[+1]_19
51092                               2e-06  230_[+1]_254
50304                             2.1e-06  104_[+1]_380
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=11
51232                    (  269) ACTTGGACGAAACATG  1
55010                    (   14) ACTTGGAAGGAGGAAG  1
16036                    (  455) AGGTGGACGACGCAAG  1
37957                    (  462) ACGTGTACGAATCATT  1
23629                    (  243) ACTTGGAAGAAACATT  1
55004                    (  364) ACGTAGACGGAGGAAG  1
14397                    (  467) ACTTGGACCTATCAAG  1
16120                    (   96) ACGCGTACGGAGGATG  1
49238                    (  466) AGTTGGAACAAGCTTG  1
51092                    (  231) ACGTGGTCCAATCAAA  1
50304                    (  105) ACATGCACGACGCAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 13095 bayes= 9.76818 E= 8.0e+000
   192  -1010  -1010  -1010
 -1010    170    -32  -1010
  -154  -1010    100     82
 -1010   -146  -1010    182
  -154  -1010    200  -1010
 -1010   -146    168    -50
   178  -1010  -1010   -150
     5    153  -1010  -1010
 -1010     12    168  -1010
   127  -1010     27   -150
   163    -47  -1010  -1010
   -54  -1010    126      8
 -1010    153     27  -1010
   178  -1010  -1010   -150
   105  -1010  -1010     82
  -154  -1010    149      8
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 8.0e+000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.818182  0.181818  0.000000
 0.090909  0.000000  0.454545  0.454545
 0.000000  0.090909  0.000000  0.909091
 0.090909  0.000000  0.909091  0.000000
 0.000000  0.090909  0.727273  0.181818
 0.909091  0.000000  0.000000  0.090909
 0.272727  0.727273  0.000000  0.000000
 0.000000  0.272727  0.727273  0.000000
 0.636364  0.000000  0.272727  0.090909
 0.818182  0.181818  0.000000  0.000000
 0.181818  0.000000  0.545455  0.272727
 0.000000  0.727273  0.272727  0.000000
 0.909091  0.000000  0.000000  0.090909
 0.545455  0.000000  0.000000  0.454545
 0.090909  0.000000  0.636364  0.272727
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
AC[GT]TGGA[CA][GC][AG]A[GT][CG]A[AT][GT]
--------------------------------------------------------------------------------

Time 6.39 secs.
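
Note on the two Motif 1 matrices: the integer entries of the log-odds matrix appear consistent with round(100 * log2(p / background)), where p is the matching entry of the letter-probability matrix and background is the frequency listed in the command line summary. The Python sketch below checks this for the first two positions of motif 1. It is an approximation only: it ignores any pseudocounts MEME applies internally, and the helper name log_odds_score is hypothetical, not part of MEME.

```python
import math

# Background frequencies copied from the command line summary.
BACKGROUND = {"A": 0.264, "C": 0.251, "G": 0.227, "T": 0.258}

def log_odds_score(p, letter):
    """Approximate one log-odds entry (hypothetical helper, not MEME code)."""
    if p == 0.0:
        # The report prints a large negative floor here (e.g. -1010).
        return None
    return round(100 * math.log2(p / BACKGROUND[letter]))

# First two rows of the motif 1 letter-probability matrix (order A, C, G, T).
rows = [
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.000000, 0.818182, 0.181818, 0.000000],
]

for row in rows:
    print([log_odds_score(p, letter) for p, letter in zip(row, "ACGT")])
```

For these two rows the sketch gives 192 for A, then 170 for C and -32 for G, which matches the first two rows of the log-odds matrix.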
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 10 llr = 134 E-value = 1.5e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3837751:9::a:::
pos.-specific     C  :::12:221::::61:
probability       G  921611179::8:29:
matrix            T  151::22::1a2:2:a

         bits    2.1
                 1.9           * *  *
                 1.7 *       * * * **
                 1.5 *       *** * **
Relative         1.3 *       ***** **
Entropy          1.1 * *     ***** **
(19.3 bits)      0.9 * **** ****** **
                 0.6 * **** *********
                 0.4 ****** *********
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTAGAAAGGATGACGT
consensus             A ACTCC   T G
sequence              G    T      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
12902                       237  1.38e-09 GAGATACGAA GTAGAAAGGATGAGGT CAGAGCTTTC
14397                       409  3.68e-08 GCTATGAGAC GTAAATTGGATGACGT CTTGAAGGAG
51232                       145  8.64e-08 GTCCGGGAGA GGAAAAAAGATGACGT CGCAACGTTC
23629                       380  1.16e-07 CTCAGCACCT GTTGAGAGGATGACGT CTCCTATACG
41327                       283  1.85e-07 ATGGACAAAC GGAGCACGGATTACGT TGTTCTCCGG
47351                       456  5.51e-07 AGCAATCCAT GTAGGTACGATGAGGT GAATTTGCTA
44046                        94  8.00e-07 GACCGATCGG GAACCATCGATGACGT CAACAACCGG
48868                       113  1.22e-06 GTTGTTGGTT TTGAAAAGGATGATGT TGCATGTGGT
52619                       285  1.30e-06 ACTACGGAAT GAAGAACGGATTATCT GCTACTTTTT
49238                        91  1.39e-06 TGCAGAGAAT GAAGAAGGCTTGACGT TGACCGTGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12902                             1.4e-09  236_[+2]_248
14397                             3.7e-08  408_[+2]_76
51232                             8.6e-08  144_[+2]_340
23629                             1.2e-07  379_[+2]_105
41327                             1.9e-07  282_[+2]_202
47351                             5.5e-07  455_[+2]_29
44046                               8e-07  93_[+2]_391
48868                             1.2e-06  112_[+2]_372
52619                             1.3e-06  284_[+2]_200
49238                             1.4e-06  90_[+2]_394
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=10
12902                    (  237) GTAGAAAGGATGAGGT  1
14397                    (  409) GTAAATTGGATGACGT  1
51232                    (  145) GGAAAAAAGATGACGT  1
23629                    (  380) GTTGAGAGGATGACGT  1
41327                    (  283) GGAGCACGGATTACGT  1
47351                    (  456) GTAGGTACGATGAGGT  1
44046                    (   94) GAACCATCGATGACGT  1
48868                    (  113) TTGAAAAGGATGATGT  1
52619                    (  285) GAAGAACGGATTATCT  1
49238                    (   91) GAAGAAGGCTTGACGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 13095 bayes= 10.6054 E= 1.5e+001
  -997   -997    199   -136
    18   -997    -18     95
   160   -997   -118   -136
    18   -133    140   -997
   141    -33   -118   -997
   141   -997   -118    -37
    92    -33   -118    -37
  -140    -33    162   -997
  -997   -133    199   -997
   177   -997   -997   -136
  -997   -997   -997    195
  -997   -997    182    -37
   192   -997   -997   -997
  -997    126    -18    -37
  -997   -133    199   -997
  -997   -997   -997    195
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 1.5e+001
 0.000000  0.000000  0.900000  0.100000
 0.300000  0.000000  0.200000  0.500000
 0.800000  0.000000  0.100000  0.100000
 0.300000  0.100000  0.600000  0.000000
 0.700000  0.200000  0.100000  0.000000
 0.700000  0.000000  0.100000  0.200000
 0.500000  0.200000  0.100000  0.200000
 0.100000  0.200000  0.700000  0.000000
 0.000000  0.100000  0.900000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.800000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.600000  0.200000  0.200000
 0.000000  0.100000  0.900000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
G[TAG]A[GA][AC][AT][ACT][GC]GAT[GT]A[CGT]GT
--------------------------------------------------------------------------------

Time 12.31 secs.
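
Note on the Motif 2 regular expression: it can be used for a quick scan of the forward strand (this run used strands: +). The Python sketch below applies it to the 12902 site with its flanking bases as printed in the sites table. The function name find_sites is hypothetical. The regular expression is a simplification of the probability matrix, so it can miss sites that the model reports: site 48868 begins with T, which the expression does not allow at position 1.

```python
import re

# Regular expression reported for motif 2.
MOTIF2 = re.compile(r"G[TAG]A[GA][AC][AT][ACT][GC]GAT[GT]A[CGT]GT")

def find_sites(sequence):
    """Return 1-based start positions of all matches, overlapping ones included.

    Hypothetical helper for illustration; not part of MEME or MAST.
    """
    # A zero-width lookahead lets finditer report overlapping matches.
    lookahead = re.compile("(?=" + MOTIF2.pattern + ")")
    return [m.start() + 1 for m in lookahead.finditer(sequence)]

# Site from sequence 12902 with its 10-base flanks, as listed in the sites table.
example = "GAGATACGAA" + "GTAGAAAGGATGAGGT" + "CAGAGCTTTC"
print(find_sites(example))             # [11]: the site follows the 10-base flank
print(find_sites("TTGAAAAGGATGATGT"))  # []: site 48868 is not matched
```

Positions returned here are relative to the string passed in, not to the 500-base training sequence.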
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 4 llr = 83 E-value = 3.6e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  33a:58:::3:::5:a:::8
pos.-specific     C  :::8:3::::a::5::::::
probability       G  85:33:a3a::a3:a::5a3
matrix            T  :3::3::8:8::8:::a5::

         bits    2.1       * *  *  *   *
                 1.9   *   * * **  *** *
                 1.7   *   * * **  *** *
                 1.5   *   * * **  *** *
Relative         1.3 * **  *** *** *** *
Entropy          1.1 * ** ******** ******
(29.8 bits)      0.9 * ** ***************
                 0.6 **** ***************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GGACAAGTGTCGTAGATGGA
consensus            AA GGC G A  GC   T G
sequence              T  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
16120                       164  2.01e-11 CCCAAGTAAG GGACACGTGTCGTCGATGGA TTGAAACGCC
33217                       101  2.56e-10 TTTAGCTAGT GGACGAGGGTCGGCGATTGA TACGCACGCG
52619                       169  3.91e-10 ATTCCCAGCA GTAGAAGTGTCGTAGATTGG CTTGCCTGAA
47754                       134  9.18e-10 TGTAAGAAAC AAACTAGTGACGTAGATGGA CATGTCCGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
16120                               2e-11  163_[+3]_317
33217                             2.6e-10  100_[+3]_380
52619                             3.9e-10  168_[+3]_312
47754                             9.2e-10  133_[+3]_347
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=4
16120                    (  164) GGACACGTGTCGTCGATGGA  1
33217                    (  101) GGACGAGGGTCGGCGATTGA  1
52619                    (  169) GTAGAAGTGTCGTAGATTGG  1
47754                    (  134) AAACTAGTGACGTAGATGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 12987 bayes= 11.6643 E= 3.6e+003
    -8   -865    172   -865
    -8   -865    114     -4
   192   -865   -865   -865
  -865    158     14   -865
    92   -865     14     -4
   150     -1   -865   -865
  -865   -865    214   -865
  -865   -865     14    154
  -865   -865    214   -865
    -8   -865   -865    154
  -865    199   -865   -865
  -865   -865    214   -865
  -865   -865     14    154
    92     99   -865   -865
  -865   -865    214   -865
   192   -865   -865   -865
  -865   -865   -865    195
  -865   -865    114     95
  -865   -865    214   -865
   150   -865     14   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 3.6e+003
 0.250000  0.000000  0.750000  0.000000
 0.250000  0.000000  0.500000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.500000  0.000000  0.250000  0.250000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[GA][GAT]A[CG][AGT][AC]G[TG]G[TA]CG[TG][AC]GAT[GT]G[AG]
--------------------------------------------------------------------------------

Time 18.30 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46957                            1.30e-01  500
47351                            4.26e-03  455_[+2(5.51e-07)]_29
37957                            1.78e-03  461_[+1(1.47e-07)]_23
47754                            1.96e-05  133_[+3(9.18e-10)]_347
51092                            6.96e-03  230_[+1(1.95e-06)]_254
14397                            1.86e-08  282_[+2(4.85e-05)]_29_\
    [+3(6.04e-05)]_61_[+2(3.68e-08)]_42_[+1(2.49e-07)]_18
48750                            1.63e-01  138_[+3(4.79e-05)]_342
49009                            7.67e-01  500
54983                            5.54e-01  500
49238                            1.84e-05  90_[+2(1.39e-06)]_359_\
    [+1(1.72e-06)]_19
16036                            6.25e-04  454_[+1(5.98e-08)]_30
16120                            1.24e-09  95_[+1(1.08e-06)]_52_[+3(2.01e-11)]_\
    317
42307                            8.51e-01  500
49642                            1.92e-01  500
52619                            2.30e-09  168_[+3(3.91e-10)]_96_\
    [+2(1.30e-06)]_200
51232                            2.71e-08  144_[+2(8.64e-08)]_108_\
    [+1(8.31e-09)]_216
23629                            4.25e-07  242_[+1(1.61e-07)]_121_\
    [+2(1.16e-07)]_105
50026                            4.80e-01  500
41327                            6.40e-05  241_[+3(8.07e-05)]_21_\
    [+2(1.85e-07)]_202
50387                            4.18e-01  500
33217                            1.24e-06  100_[+3(2.56e-10)]_380
44046                            3.91e-03  93_[+2(8.00e-07)]_391
12902                            2.48e-05  236_[+2(1.38e-09)]_248
55010                            1.75e-04  13_[+1(4.99e-08)]_471
50304                            1.23e-02  104_[+1(2.08e-06)]_380
55004                            4.84e-04  363_[+1(2.30e-07)]_121
48868                            5.04e-03  112_[+2(1.22e-06)]_372
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************