********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/48/48.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43165                    1.0000    500  9612                     1.0000    500
46356                    1.0000    500  14401                    1.0000    500
14967                    1.0000    500  48969                    1.0000    500
49575                    1.0000    500  33450                    1.0000    500
16955                    1.0000    500  18927                    1.0000    500
11634                    1.0000    500  45513                    1.0000    500
54478                    1.0000    500  43307                    1.0000    500
46514                    1.0000    500  41173                    1.0000    500
48925                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/48/48.seqs.fa -oc motifs/48 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.254 C 0.252 G 0.229 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.254 C 0.252 G 0.229 T 0.265
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 20 sites = 12 llr = 159 E-value = 5.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::31421:632688::a48
pos.-specific     C  1::481:314433::a4:5:
probability       G  2282113:9:3611::4:11
matrix            T  88321457::::112:2::1

         bits    2.1
                 1.9                * *
                 1.7         *      * *
                 1.5         *      * *
Relative         1.3  **     *    *** * *
Entropy          1.1  **     **   *** * *
(19.1 bits)      0.8 *** *  ***   *** * *
                 0.6 *** * **** * *******
                 0.4 *** * **************
                 0.2 ********************
                 0.0 --------------------

Multilevel           TTGCCATTGACGAAACCACA
consensus              TA TGC CACC   G A
sequence                       G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
46356                       400  2.44e-09 TCCAGTAGGT TTTCCATTGACGAAACTACA ACCGTACACA
48925                         4  3.33e-09        CGC TTGCCTTTGCAGTAACCACA TTCGGACCAA
43165                       390  6.42e-08 CTTCCCATGG GTGCCAGTCCCGAAACGAAA CTCTTTGATG
45513                       385  1.44e-07 TGTGCTGACA CTGACTGTGAACCAACCAAA CCCGATACTG
43307                       172  1.58e-07 GATGCCAATT TTGCTTGTGAAGAAACGAAT ATTATTATTT
54478                       332  3.57e-07 TCGAATTAGT TTGACTGTGAGCAATCGACG CATCATCCAA
41173                       368  4.23e-07 CGTGAATGAA TTTACTTCGACGATACTACA TCACAATCGA
33450                       132  4.23e-07 GTGCTTTAAA TTGGGATAGAGCAAACGACA TTTCCAGAAA
14967                       187  1.09e-06 GATGGACACT TGGGCAACGCCGCATCCAAA TTCGTGGCCG
48969                       196  1.79e-06 CACATTTTCT TGTTAATTGCGAAAACGACA AGAAAAAGGT
49575                       130  1.92e-06 CGAGAACTGG GTGCCGTTGCAACGACCAAA ATTGGGACTG
16955                       267  2.05e-06 CCGCTTACAC TTGTCCACGACGGAACCAGA AACTGCTGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46356                             2.4e-09  399_[+1]_81
48925                             3.3e-09  3_[+1]_477
43165                             6.4e-08  389_[+1]_91
45513                             1.4e-07  384_[+1]_96
43307                             1.6e-07  171_[+1]_309
54478                             3.6e-07  331_[+1]_149
41173                             4.2e-07  367_[+1]_113
33450                             4.2e-07  131_[+1]_349
14967                             1.1e-06  186_[+1]_294
48969                             1.8e-06  195_[+1]_285
49575                             1.9e-06  129_[+1]_351
16955                               2e-06  266_[+1]_214
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=12
46356                    ( 400) TTTCCATTGACGAAACTACA  1
48925                    (   4) TTGCCTTTGCAGTAACCACA  1
43165                    ( 390) GTGCCAGTCCCGAAACGAAA  1
45513                    ( 385) CTGACTGTGAACCAACCAAA  1
43307                    ( 172) TTGCTTGTGAAGAAACGAAT  1
54478                    ( 332) TTGACTGTGAGCAATCGACG  1
41173                    ( 368) TTTACTTCGACGATACTACA  1
33450                    ( 132) TTGGGATAGAGCAAACGACA  1
14967                    ( 187) TGGGCAACGCCGCATCCAAA  1
48969                    ( 196) TGTTAATTGCGAAAACGACA  1
49575                    ( 130) GTGCCGTTGCAACGACCAAA  1
16955                    ( 267) TTGTCCACGACGGAACCAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 8177 bayes= 9.8583 E= 5.6e+000
 -1023   -160    -46    150
 -1023  -1023    -46    165
 -1023  -1023    171     -8
    -2     72    -46    -67
  -160    157   -146   -166
    71   -160   -146     65
   -61  -1023     54     92
  -160     -1  -1023    133
 -1023   -160    200  -1023
   120     72  -1023  -1023
    39     72     12  -1023
   -61     -1    135  -1023
   120     -1   -146   -166
   171  -1023   -146   -166
   171  -1023  -1023    -67
 -1023    199  -1023  -1023
 -1023     72     86    -67
   198  -1023  -1023  -1023
    71     99   -146  -1023
   171  -1023   -146   -166
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 12 E= 5.6e+000
 0.000000  0.083333  0.166667  0.750000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.416667  0.166667  0.166667
 0.083333  0.750000  0.083333  0.083333
 0.416667  0.083333  0.083333  0.416667
 0.166667  0.000000  0.333333  0.500000
 0.083333  0.250000  0.000000  0.666667
 0.000000  0.083333  0.916667  0.000000
 0.583333  0.416667  0.000000  0.000000
 0.333333  0.416667  0.250000  0.000000
 0.166667  0.250000  0.583333  0.000000
 0.583333  0.250000  0.083333  0.083333
 0.833333  0.000000  0.083333  0.083333
 0.833333  0.000000  0.000000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.416667  0.416667  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.416667  0.500000  0.083333  0.000000
 0.833333  0.000000  0.083333  0.083333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TT[GT][CA]C[AT][TG][TC]G[AC][CAG][GC][AC]AAC[CG]A[CA]A
--------------------------------------------------------------------------------

Time 2.92 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 17 llr = 153 E-value = 2.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  8211:32::::2
pos.-specific     C  2::257:::39:
probability       G  1618::2132:8
matrix            T  :18:5:69751:

         bits    2.1
                 1.9
                 1.7        *  *
                 1.5        *  **
Relative         1.3        *  **
Entropy          1.1 * ** * ** **
(12.9 bits)      0.8 ****** ** **
                 0.6 ********* **
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AGTGTCTTTTCG
consensus             A  CAA GC
sequence                      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
48969                       120  6.13e-07 TCCCGCGTGT AGTGTATTTTCG GTGTCTGTTG
16955                       150  1.10e-06 AGCCGTTGTG AATGCCTTTTCG GTCGCGCCGG
48925                       488  2.54e-06 GTCCGTTTGG AGTCCCTTTTCG G
33450                        45  2.83e-06 GCGCCTACCA AGTGCCGTTGCG AGATCTGGAA
18927                       137  9.02e-06 CCGCGCAAAA AGTCTCGTTTCG CGCGTGAGGA
45513                       258  1.01e-05 GTTTTTCGGA AATGCATTTGCG TAACATAGCG
43307                       459  1.63e-05 TGTTTCGACC AGAGCCTTGCCG ATACCGGTTA
43165                       170  1.63e-05 TAGAAGATGC ATTGTCATTCCG ACGATTCGCG
46514                       337  1.82e-05 TCCTTTTACA AGTGCATTTGCA CAATGCTCCA
9612                        166  2.43e-05 TCCTAGCCAG AGAGTCATGTCG AGACATGGTT
46356                       152  3.63e-05 CCTCAGTGTC AGTGTCAGTCCG ACGTCCGACC
54478                        60  5.59e-05 CTTATAAGTC AGTCTCTTTTTG AAACGCGGCA
14401                       168  7.88e-05 ATTGATGATT CGTATCTTTGCG GCGTTGGTGG
14967                       166  1.12e-04 GAGCAGTTTT ATGGTATTTTCG ATGGACACTT
49575                        34  1.20e-04 GGACTTATCC GATGCCATGTCG GAAGAATCCC
41173                       219  1.27e-04 TCATTCGCTT CATGTCTTGCCA AATGAAATTC
11634                        68  2.28e-04 CGCGTTCGGG CGTGCAGTGCCA CGCCACCGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48969                             6.1e-07  119_[+2]_369
16955                             1.1e-06  149_[+2]_339
48925                             2.5e-06  487_[+2]_1
33450                             2.8e-06  44_[+2]_444
18927                               9e-06  136_[+2]_352
45513                               1e-05  257_[+2]_231
43307                             1.6e-05  458_[+2]_30
43165                             1.6e-05  169_[+2]_319
46514                             1.8e-05  336_[+2]_152
9612                              2.4e-05  165_[+2]_323
46356                             3.6e-05  151_[+2]_337
54478                             5.6e-05  59_[+2]_429
14401                             7.9e-05  167_[+2]_321
14967                             0.00011  165_[+2]_323
49575                             0.00012  33_[+2]_455
41173                             0.00013  218_[+2]_270
11634                             0.00023  67_[+2]_421
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=17
48969                    ( 120) AGTGTATTTTCG  1
16955                    ( 150) AATGCCTTTTCG  1
48925                    ( 488) AGTCCCTTTTCG  1
33450                    (  45) AGTGCCGTTGCG  1
18927                    ( 137) AGTCTCGTTTCG  1
45513                    ( 258) AATGCATTTGCG  1
43307                    ( 459) AGAGCCTTGCCG  1
43165                    ( 170) ATTGTCATTCCG  1
46514                    ( 337) AGTGCATTTGCA  1
9612                     ( 166) AGAGTCATGTCG  1
46356                    ( 152) AGTGTCAGTCCG  1
54478                    (  60) AGTCTCTTTTTG  1
14401                    ( 168) CGTATCTTTGCG  1
14967                    ( 166) ATGGTATTTTCG  1
49575                    (  34) GATGCCATGTCG  1
41173                    ( 219) CATGTCTTGCCA  1
11634                    (  68) CGTGCAGTGCCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.00042 E= 2.6e+002
   159    -51   -196  -1073
   -11  -1073    150   -117
  -111  -1073   -196    164
  -211    -51    174  -1073
 -1073     90  -1073    100
    21    148  -1073  -1073
   -11  -1073    -38    115
 -1073  -1073   -196    183
 -1073  -1073     36    141
 -1073     22      4     83
 -1073    190  -1073   -217
   -52  -1073    184  -1073
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 17 E= 2.6e+002
 0.764706  0.176471  0.058824  0.000000
 0.235294  0.000000  0.647059  0.117647
 0.117647  0.000000  0.058824  0.823529
 0.058824  0.176471  0.764706  0.000000
 0.000000  0.470588  0.000000  0.529412
 0.294118  0.705882  0.000000  0.000000
 0.235294  0.000000  0.176471  0.588235
 0.000000  0.000000  0.058824  0.941176
 0.000000  0.000000  0.294118  0.705882
 0.000000  0.294118  0.235294  0.470588
 0.000000  0.941176  0.000000  0.058824
 0.176471  0.000000  0.823529  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
A[GA]TG[TC][CA][TA]T[TG][TCG]CG
--------------------------------------------------------------------------------

Time 5.91 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 6 llr = 87 E-value = 1.6e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :27:::::::23:8:
pos.-specific     C  :2::a5::7:8::28
probability       G  a333:53:3a:7a::
matrix            T  :3:7::7a::::::2

         bits    2.1 *        *  *
                 1.9 *   *  * *  *
                 1.7 *   *  * *  *
                 1.5 *   *  * *  *
Relative         1.3 *   *  * ** ***
Entropy          1.1 * *************
(20.9 bits)      0.8 * *************
                 0.6 * *************
                 0.4 * *************
                 0.2 * *************
                 0.0 ---------------

Multilevel           GGATCCTTCGCGGAC
consensus             TGG GG G  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
11634                       383  2.80e-08 TTGCCGAAAA GGATCCGTGGCGGAC CGTTCGTACA
46356                       368  4.78e-08 TCACGTTGTT GTATCCGTGGCGGAC TCTTTGATCC
45513                       449  6.12e-08 TTGTCTTGAC GTGTCGTTCGCAGAC TCCTCATTCT
33450                       481  1.31e-07 CTCTTCTACT GGGTCGTTCGAGGAC TGACA
14967                       306  3.14e-07 AATATTGACA GAAGCCTTCGCGGAT GTCCATCGAA
16955                       129  4.82e-07 GGTGTTGCTA GCAGCGTTCGCAGCC GTTGTGAATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11634                             2.8e-08  382_[+3]_103
46356                             4.8e-08  367_[+3]_118
45513                             6.1e-08  448_[+3]_37
33450                             1.3e-07  480_[+3]_5
14967                             3.1e-07  305_[+3]_180
16955                             4.8e-07  128_[+3]_357
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=6
11634                    ( 383) GGATCCGTGGCGGAC  1
46356                    ( 368) GTATCCGTGGCGGAC  1
45513                    ( 449) GTGTCGTTCGCAGAC  1
33450                    ( 481) GGGTCGTTCGAGGAC  1
14967                    ( 306) GAAGCCTTCGCGGAT  1
16955                    ( 129) GCAGCGTTCGCAGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8262 bayes= 11.5264 E= 1.6e+003
  -923   -923    212   -923
   -61    -60     54     33
   139   -923     54   -923
  -923   -923     54    133
  -923    199   -923   -923
  -923     99    112   -923
  -923   -923     54    133
  -923   -923   -923    192
  -923    140     54   -923
  -923   -923    212   -923
   -61    172   -923   -923
    39   -923    154   -923
  -923   -923    212   -923
   171    -60   -923   -923
  -923    172   -923    -67
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 1.6e+003
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.166667  0.333333  0.333333
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[GT][AG][TG]C[CG][TG]T[CG]GC[GA]GAC
--------------------------------------------------------------------------------

Time 9.11 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43165                            1.80e-05  169_[+2(1.63e-05)]_208_\
    [+1(6.42e-08)]_91
9612                             2.01e-02  165_[+2(2.43e-05)]_323
46356                            2.08e-10  151_[+2(3.63e-05)]_42_\
    [+1(9.20e-05)]_108_[+3(3.09e-05)]_19_[+3(4.78e-08)]_17_[+1(2.44e-09)]_81
14401                            6.43e-02  167_[+2(7.88e-05)]_321
14967                            8.80e-07  186_[+1(1.09e-06)]_99_\
    [+3(3.14e-07)]_180
48969                            8.47e-06  119_[+2(6.13e-07)]_64_\
    [+1(1.79e-06)]_285
49575                            2.83e-03  129_[+1(1.92e-06)]_351
33450                            5.95e-09  44_[+2(2.83e-06)]_75_[+1(4.23e-07)]_\
    329_[+3(1.31e-07)]_5
16955                            3.54e-08  128_[+3(4.82e-07)]_6_[+2(1.10e-06)]_\
    105_[+1(2.05e-06)]_214
18927                            1.86e-02  136_[+2(9.02e-06)]_352
11634                            1.18e-04  382_[+3(2.80e-08)]_103
45513                            3.52e-09  257_[+2(1.01e-05)]_115_\
    [+1(1.44e-07)]_44_[+3(6.12e-08)]_37
54478                            1.72e-04  59_[+2(5.59e-05)]_260_\
    [+1(3.57e-07)]_149
43307                            6.10e-05  171_[+1(1.58e-07)]_267_\
    [+2(1.63e-05)]_30
46514                            6.48e-02  336_[+2(1.82e-05)]_152
41173                            8.41e-04  367_[+1(4.23e-07)]_113
48925                            3.93e-07  3_[+1(3.33e-09)]_464_[+2(2.54e-06)]_\
    1
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************