********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/138/138.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
36955                    1.0000    500  15121                    1.0000    500
41790                    1.0000    500  16944                    1.0000    500
17043                    1.0000    500  54265                    1.0000    500
11391                    1.0000    500  11593                    1.0000    500
19761                    1.0000    500  19821                    1.0000    500
35396                    1.0000    500  27240                    1.0000    500
20135                    1.0000    500  27762                    1.0000    500
32532                    1.0000    500  34275                    1.0000    500
47338                    1.0000    500  48489                    1.0000    500
********************************************************************************


********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/138/138.seqs.fa -oc motifs/138 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.269 C 0.252 G 0.237 T 0.242
Background letter frequencies (from dataset with add-one prior applied):
A 0.269 C 0.252 G 0.237 T 0.242
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =  15  llr = 182  E-value = 2.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :49397229363:19:83:87
pos.-specific     C  73131251:5117:16212:1
probability       G  13:3::161:3329:3:661:
matrix            T  2::::111:3:31:11::211

         bits    2.1
                 1.9
                 1.7
                 1.5   *     *    *
Relative         1.2   * *   *    ** *
Entropy          1.0 * * *   *    ** *  *
(17.5 bits)      0.8 * * **  * * *** ** **
                 0.6 * * **  * * *********
                 0.4 ****** **** *********
                 0.2 *********** *********
                 0.0 ---------------------

Multilevel           CAAAAACGACAGCGACAGGAA
consensus            TG C CAA AGAG  GCAC
sequence              C G     T T      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
11391                       375  1.83e-11 TCAGGCACGG CAAGAACGATATCGACAGGAA CTGCGTGCAC
35396                       354  9.15e-10 GTCACCACAA CAACAACGACAACGACAACAA CAACGGCAAC
41790                       479  4.16e-08 AGACCAAAGC CGACAATGAAGTGGAGAGGAA T
36955                       129  2.92e-07 AACAAGGTAC CAACCACCACGGCGACAGGTA TCGAAACGTA
48489                       295  6.79e-07 CTTATGAGGG CGAGAAACAACGTGACAGGAA GATGGCCACC
54265                        31  7.42e-07 TTTAACCAGA CAAAAAAGAAAACGAGCGTAT ACGTGCCTCC
19761                       351  1.05e-06 CTGGGTGTCG GGAAAAAGACAGCGATACGAA TGAAAGGCGC
16944                        25  1.73e-06 GCCCGGCAAG CCACACCGGTACCGACCGGAC CTTCCCCCAC
15121                       158  2.19e-06 CAATGGCTCC CAAAACCGGCAACAATAGCAA CAAAGCAAAA
32532                       305  2.56e-06 GGACCGAAGA CAAAACCAACAGTGACAAGGT GTCGCGCGTT
20135                         2  2.77e-06          A CCAAAACAATACCGTGAACAA GAAATTCTTG
34275                       227  4.01e-06 CACAGCGCCT TCCGAACGATGGGAAGAGGAA TGTTACATTC
17043                       284  5.32e-06 CGGCCAAACT TGAGAAGAAAATGGCCAAGAA ACAGCAAGTC
11593                       410  7.47e-06 CCAGTGCCAC CGAGAATTACGACGACAATGC ACGACTCCCG
27240                       128  1.17e-05 GTCATGTATT TCACCTGGACGTCGACCGTAA TGGCGCTCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11391                             1.8e-11  374_[+1]_105
35396                             9.1e-10  353_[+1]_126
41790                             4.2e-08  478_[+1]_1
36955                             2.9e-07  128_[+1]_351
48489                             6.8e-07  294_[+1]_185
54265                             7.4e-07  30_[+1]_449
19761                             1.1e-06  350_[+1]_129
16944                             1.7e-06  24_[+1]_455
15121                             2.2e-06  157_[+1]_322
32532                             2.6e-06  304_[+1]_175
20135                             2.8e-06  1_[+1]_478
34275                               4e-06  226_[+1]_253
17043                             5.3e-06  283_[+1]_196
11593                             7.5e-06  409_[+1]_70
27240                             1.2e-05  127_[+1]_352
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=15
11391                    (  375) CAAGAACGATATCGACAGGAA  1
35396                    (  354) CAACAACGACAACGACAACAA  1
41790                    (  479) CGACAATGAAGTGGAGAGGAA  1
36955                    (  129) CAACCACCACGGCGACAGGTA  1
48489                    (  295) CGAGAAACAACGTGACAGGAA  1
54265                    (   31) CAAAAAAGAAAACGAGCGTAT  1
19761                    (  351) GGAAAAAGACAGCGATACGAA  1
16944                    (   25) CCACACCGGTACCGACCGGAC  1
15121                    (  158) CAAAACCGGCAACAATAGCAA  1
32532                    (  305) CAAAACCAACAGTGACAAGGT  1
20135                    (    2) CCAAAACAATACCGTGAACAA  1
34275                    (  227) TCCGAACGATGGGAAGAGGAA  1
17043                    (  284) TGAGAAGAAAATGGCCAAGAA  1
11593                    (  410) CGAGAATTACGACGACAATGC  1
27240                    (  128) TCACCTGGACGTCGACCGTAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 9.09232 E= 2.4e+001
 -1055    154   -183    -28
    57      8     49  -1055
   180   -192  -1055  -1055
    31     40     49  -1055
   169    -92  -1055  -1055
   145    -34  -1055   -186
   -43    108    -83    -86
   -43    -92    134   -186
   169  -1055    -83  -1055
    -1     89  -1055     14
   116   -192     49  -1055
    -1    -92     49     14
 -1055    140    -24    -86
  -101  -1055    187  -1055
   169   -192  -1055   -186
 -1055    125     17    -86
   157    -34  -1055  -1055
    31   -192    134  -1055
 -1055    -34    134    -28
   157  -1055    -83   -186
   145    -92  -1055    -86
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 15 E= 2.4e+001
 0.000000  0.733333  0.066667  0.200000
 0.400000  0.266667  0.333333  0.000000
 0.933333  0.066667  0.000000  0.000000
 0.333333  0.333333  0.333333  0.000000
 0.866667  0.133333  0.000000  0.000000
 0.733333  0.200000  0.000000  0.066667
 0.200000  0.533333  0.133333  0.133333
 0.200000  0.133333  0.600000  0.066667
 0.866667  0.000000  0.133333  0.000000
 0.266667  0.466667  0.000000  0.266667
 0.600000  0.066667  0.333333  0.000000
 0.266667  0.133333  0.333333  0.266667
 0.000000  0.666667  0.200000  0.133333
 0.133333  0.000000  0.866667  0.000000
 0.866667  0.066667  0.000000  0.066667
 0.000000  0.600000  0.266667  0.133333
 0.800000  0.200000  0.000000  0.000000
 0.333333  0.066667  0.600000  0.000000
 0.000000  0.200000  0.600000  0.200000
 0.800000  0.000000  0.133333  0.066667
 0.733333  0.133333  0.000000  0.133333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CT][AGC]A[ACG]A[AC][CA][GA]A[CAT][AG][GAT][CG]GA[CG][AC][GA][GCT]AA
--------------------------------------------------------------------------------




Time  3.33 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  15  sites =   6  llr = 89  E-value = 1.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::::22::::
pos.-specific     C  :22:::::525::3:
probability       G  :52:aa::2228::a
matrix            T  a37a::aa3522a7:

         bits    2.1 *  *****    * *
                 1.9 *  *****    * *
                 1.7 *  *****    * *
                 1.5 *  *****   ** *
Relative         1.2 *  *****   ** *
Entropy          1.0 *  *****   ****
(21.4 bits)      0.8 * ******   ****
                 0.6 *********  ****
                 0.4 *********  ****
                 0.2 ***************
                 0.0 ---------------

Multilevel           TGTTGGTTCTCGTTG
consensus             T      T    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
34275                         1  2.41e-08          . TGTTGGTTGTCGTCG TCCTTACAAT
35396                       306  2.41e-08 GGTGCCGTGT TTGTGGTTCTCGTTG TTTTCTCGTT
27762                       100  5.05e-08 AAGATACGCT TGTTGGTTCCAGTTG CTGTTTCATT
20135                       194  1.80e-07 CAACTGTTCA TTTTGGTTTTGTTTG GCCATACAAA
19821                       354  1.80e-07 GCAAATCATC TCTTGGTTCATGTTG ACTTGTGAAT
41790                       417  2.40e-07 GACGACTCGC TGCTGGTTTGCGTCG GCCGTCGAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34275                             2.4e-08  [+2]_485
35396                             2.4e-08  305_[+2]_180
27762                             5.1e-08  99_[+2]_386
20135                             1.8e-07  193_[+2]_292
19821                             1.8e-07  353_[+2]_132
41790                             2.4e-07  416_[+2]_69
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=6
34275                    (    1) TGTTGGTTGTCGTCG  1
35396                    (  306) TTGTGGTTCTCGTTG  1
27762                    (  100) TGTTGGTTCCAGTTG  1
20135                    (  194) TTTTGGTTTTGTTTG  1
19821                    (  354) TCTTGGTTCATGTTG  1
41790                    (  417) TGCTGGTTTGCGTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8748 bayes= 10.9565 E= 1.6e+001
  -923   -923   -923    204
  -923    -60    108     46
  -923    -60    -51    146
  -923   -923   -923    204
  -923   -923    208   -923
  -923   -923    208   -923
  -923   -923   -923    204
  -923   -923   -923    204
  -923     99    -51     46
   -69    -60    -51    104
   -69     99    -51    -54
  -923   -923    181    -54
  -923   -923   -923    204
  -923     40   -923    146
  -923   -923    208   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 1.6e+001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.500000  0.333333
 0.000000  0.166667  0.166667  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.166667  0.333333
 0.166667  0.166667  0.166667  0.500000
 0.166667  0.500000  0.166667  0.166667
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[GT]TTGGTT[CT]TCGT[TC]G
--------------------------------------------------------------------------------




Time  6.91 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  20  sites =   5  llr = 95  E-value = 2.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::2::4::2248:::::
pos.-specific     C  :6::2:6:6:222::::4::
probability       G  a::a8:28:6:::62::4::
matrix            T  :4a::822:4866::aa2aa

         bits    2.1 * **           ** **
                 1.9 * **           ** **
                 1.7 * **           ** **
                 1.5 * **           ** **
Relative         1.2 * **** *  *   *** **
Entropy          1.0 ****** ****  **** **
(27.5 bits)      0.8 ****** ****  **** **
                 0.6 ***************** **
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GCTGGTCGCGTTTGATTCTT
consensus             T  CAGTATCAAAG  G
sequence                   T    CC    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
20135                       126  3.69e-11 TCATTGTGGC GTTGGTCGAGTTTGATTTTT TCGGTGGAAA
54265                       320  2.35e-10 AGTCTGCCGA GCTGGTGGATTTTAATTCTT CCGCACTTAA
27762                        41  1.84e-09 ATCGCGTGTT GCTGGTCTCGCTTGGTTGTT CCAAAGGCAT
36955                       281  3.78e-09 AATGATGTGG GCTGCTCGCTTCAAATTCTT GGCAATTGGA
11391                       272  5.07e-09 AGAAACAGTG GTTGGATGCGTACGATTGTT ACACTCGGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20135                             3.7e-11  125_[+3]_355
54265                             2.4e-10  319_[+3]_161
27762                             1.8e-09  40_[+3]_440
36955                             3.8e-09  280_[+3]_200
11391                             5.1e-09  271_[+3]_209
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=5
20135                    (  126) GTTGGTCGAGTTTGATTTTT  1
54265                    (  320) GCTGGTGGATTTTAATTCTT  1
27762                    (   41) GCTGGTCTCGCTTGGTTGTT  1
36955                    (  281) GCTGCTCGCTTCAAATTCTT  1
11391                    (  272) GTTGGATGCGTACGATTGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 8658 bayes= 11.0087 E= 2.4e+002
  -897   -897    208   -897
  -897    125   -897     72
  -897   -897   -897    204
  -897   -897    208   -897
  -897    -33    175   -897
   -43   -897   -897    172
  -897    125    -24    -28
  -897   -897    175    -28
    57    125   -897   -897
  -897   -897    134     72
  -897    -33   -897    172
   -43    -33   -897    131
   -43    -33   -897    131
    57   -897    134   -897
   157   -897    -24   -897
  -897   -897   -897    204
  -897   -897   -897    204
  -897     66     76    -28
  -897   -897   -897    204
  -897   -897   -897    204
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 2.4e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.600000  0.200000  0.200000
 0.000000  0.000000  0.800000  0.200000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.000000  0.600000  0.400000
 0.000000  0.200000  0.000000  0.800000
 0.200000  0.200000  0.000000  0.600000
 0.200000  0.200000  0.000000  0.600000
 0.400000  0.000000  0.600000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.400000  0.400000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[CT]TG[GC][TA][CGT][GT][CA][GT][TC][TAC][TAC][GA][AG]TT[CGT]TT
--------------------------------------------------------------------------------




Time 10.18 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36955                            5.61e-08  128_[+1(2.92e-07)]_131_\
    [+3(3.78e-09)]_200
15121                            3.26e-03  157_[+1(2.19e-06)]_322
41790                            2.36e-07  416_[+2(2.40e-07)]_47_\
    [+1(4.16e-08)]_1
16944                            1.69e-02  24_[+1(1.73e-06)]_455
17043                            7.33e-03  283_[+1(5.32e-06)]_196
54265                            7.79e-09  30_[+1(7.42e-07)]_268_\
    [+3(2.35e-10)]_161
11391                            5.80e-12  162_[+1(9.79e-05)]_88_\
    [+3(5.07e-09)]_83_[+1(1.83e-11)]_105
11593                            1.87e-02  409_[+1(7.47e-06)]_70
19761                            3.76e-03  350_[+1(1.05e-06)]_129
19821                            8.62e-04  353_[+2(1.80e-07)]_132
35396                            3.00e-10  305_[+2(2.41e-08)]_33_\
    [+1(9.15e-10)]_5_[+1(6.67e-05)]_100
27240                            3.31e-02  127_[+1(1.17e-05)]_352
20135                            1.25e-12  1_[+1(2.77e-06)]_59_[+3(2.45e-08)]_\
    24_[+3(3.69e-11)]_48_[+2(1.80e-07)]_292
27762                            5.88e-09  40_[+3(1.84e-09)]_39_[+2(5.05e-08)]_\
    196_[+2(1.32e-05)]_175
32532                            7.65e-03  304_[+1(2.56e-06)]_175
34275                            3.42e-06  [+2(2.41e-08)]_211_[+1(4.01e-06)]_\
    253
47338                            8.50e-01  500
48489                            3.73e-03  294_[+1(6.79e-07)]_185
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************