********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/391/391.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17140                    1.0000    500  20590                    1.0000    500
21477                    1.0000    500  21683                    1.0000    500
21795                    1.0000    500  22984                    1.0000    500
23416                    1.0000    500  24818                    1.0000    500
25713                    1.0000    500  261820                   1.0000    500
262099                   1.0000    500  263142                   1.0000    500
263246                   1.0000    500  263313                   1.0000    500
268329                   1.0000    500  31085                    1.0000    500
8365                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/391/391.seqs.fa -oc motifs/391 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 17 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8500 N= 17 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.272 C 0.236 G 0.232 T 0.260 Background letter frequencies (from dataset with add-one prior applied): A 0.272 C 0.236 G 0.232 T 0.260 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 12 sites = 17 llr = 155 E-value = 1.9e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :78:96213:96 pos.-specific C 93:a:244:9:2 probability G 1:2::2121:11 matrix T ::::1:3361:: bits 2.1 * 1.9 * 1.7 * * 1.5 * ** ** Relative 1.3 * *** ** Entropy 1.1 ***** ** (13.2 bits) 0.8 ***** ** 0.6 ****** **** 0.4 ****** **** 0.2 ************ 0.0 ------------ Multilevel CAACAACCTCAA consensus C TTA C sequence AG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 24818 173 3.23e-07 CACCAACATA CAACAATCTCAA CTCACGCGCA 17140 63 1.06e-06 GCAGTCAGTT CCACAACTTCAA CAGTGTAGCG 263313 436 4.92e-06 GTTGACGCTG 
CCACAAAGTCAA AGCCGACAAC 25713 488 6.51e-06 AACAAACGAA CAACAACAACAA A 20590 155 9.43e-06 TTACTTATTG CAACAATTGCAA TGCTGTTGGC 21683 326 1.13e-05 GCGAAATTTC CAACACTCTCAC GATTTGGTTG 31085 253 1.28e-05 TCTTGGTTTC CAACAAATGCAA ATAGCGATGC 263246 170 1.44e-05 TCATGAATAA CAACAAAGTCAG GAAGAAAGTG 21477 280 1.44e-05 TCGTTCACGG CAACAGTTTCAC ACATCGTTCG 261820 487 2.57e-05 CTTCCACAGC CAGCAGCCACAA AT 268329 156 4.94e-05 AGTCAACCAT CAACAGCAACAC ATAATCGTCT 22984 14 5.33e-05 CATGGACACT CAACAAAGTTAA TAAATTGGAG 8365 282 7.52e-05 CATCGAACCA GCACAATGTCAC CATGCAATTG 263142 369 8.52e-05 TGATCCATCT CCACACCCTCGA TCCTTTCGCG 262099 132 9.74e-05 ACAGTACTTT CAGCACCCACAG TACAATCAGA 23416 355 9.74e-05 GTTAGTTGTC GAACAAGCACAA GATCAGAGAC 21795 408 1.24e-04 AGGCTACTTC CCGCTACTTCAA CTGGTAGTTA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 24818 3.2e-07 172_[+1]_316 17140 1.1e-06 62_[+1]_426 263313 4.9e-06 435_[+1]_53 25713 6.5e-06 487_[+1]_1 20590 9.4e-06 154_[+1]_334 21683 1.1e-05 325_[+1]_163 31085 1.3e-05 252_[+1]_236 263246 1.4e-05 169_[+1]_319 21477 1.4e-05 279_[+1]_209 261820 2.6e-05 486_[+1]_2 268329 4.9e-05 155_[+1]_333 22984 5.3e-05 13_[+1]_475 8365 7.5e-05 281_[+1]_207 263142 8.5e-05 368_[+1]_120 262099 9.7e-05 131_[+1]_357 23416 9.7e-05 354_[+1]_134 21795 0.00012 407_[+1]_81 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=12 seqs=17 24818 ( 173) CAACAATCTCAA 1 17140 ( 63) CCACAACTTCAA 1 263313 ( 436) CCACAAAGTCAA 1 25713 ( 488) CAACAACAACAA 1 20590 ( 155) 
CAACAATTGCAA 1 21683 ( 326) CAACACTCTCAC 1 31085 ( 253) CAACAAATGCAA 1 263246 ( 170) CAACAAAGTCAG 1 21477 ( 280) CAACAGTTTCAC 1 261820 ( 487) CAGCAGCCACAA 1 268329 ( 156) CAACAGCAACAC 1 22984 ( 14) CAACAAAGTTAA 1 8365 ( 282) GCACAATGTCAC 1 263142 ( 369) CCACACCCTCGA 1 262099 ( 132) CAGCACCCACAG 1 23416 ( 355) GAACAAGCACAA 1 21795 ( 408) CCGCTACTTCAA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.00042 E= 1.9e+000 -1073 190 -98 -1073 138 32 -1073 -1073 160 -1073 -40 -1073 -1073 209 -1073 -1073 179 -1073 -1073 -214 125 -42 -40 -1073 -21 81 -198 18 -121 58 2 18 11 -1073 -98 118 -1073 200 -1073 -214 179 -1073 -198 -1073 125 0 -98 -1073 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 17 E= 1.9e+000 0.000000 0.882353 0.117647 0.000000 0.705882 0.294118 0.000000 0.000000 0.823529 0.000000 0.176471 0.000000 0.000000 1.000000 0.000000 0.000000 0.941176 0.000000 0.000000 0.058824 0.647059 0.176471 0.176471 0.000000 0.235294 0.411765 0.058824 0.294118 0.117647 0.352941 0.235294 0.294118 0.294118 0.000000 0.117647 0.588235 0.000000 0.941176 0.000000 0.058824 0.941176 0.000000 0.058824 0.000000 0.647059 0.235294 0.117647 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- 
C[AC]ACAA[CTA][CTG][TA]CA[AC] -------------------------------------------------------------------------------- Time 2.74 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 19 sites = 9 llr = 126 E-value = 3.4e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::6:13:27:2:1:6:81: pos.-specific C ::3::3::112:::1::4: probability G 971a31a1171198:a228 matrix T 13::62:71249:23::22 bits 2.1 * * * 1.9 * * * 1.7 * * * * 1.5 * * * ** * Relative 1.3 * * * *** ** * Entropy 1.1 ** * * *** ** * (20.2 bits) 0.8 ** * * * *** ** * 0.6 ***** ** * ****** * 0.4 ***** **** ****** * 0.2 ***** ************* 0.0 ------------------- Multilevel GGAGTAGTAGTTGGAGACG consensus TC GC A TA TT GGT sequence T C T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------- 31085 111 6.17e-11 TTGAATCGTT GGAGTAGTAGATGGAGACG TCTGCACTGG 261820 101 2.16e-08 GATGGAGATA GGGGTAGGAGCTGGAGACG ATGGCAGCTG 263142 449 3.36e-08 CGACTTGGCT GTAGTCGAGGTTGGAGAGG ATATCGACTG 263313 414 5.63e-08 ACACGTCTCG GTCGTCGTCGTTGTTGACG CTGCCACAAA 21683 52 3.01e-07 AATAAACAGC GGCGGTGTATTTGTCGATG CTGTGAATGA 25713 63 4.70e-07 GCGGGGAGGT GTCGATGTTGCTGGAGGCG TAGGGGCAAA 23416 138 4.70e-07 ACGTGGTAAT GGAGGAGAATATGGTGGAG GGCACGATCG 263246 353 5.80e-07 CAGAACACTT TGAGGCGTACTTGGTGAGT CAAATTTCCC 21795 248 2.24e-06 GCAGTATATT GGAGTGGTAGGGAGAGATT CAGAGGAGAG 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31085                             6.2e-11  110_[+2]_371
261820                            2.2e-08  100_[+2]_381
263142                            3.4e-08  448_[+2]_33
263313                            5.6e-08  413_[+2]_68
21683                               3e-07  51_[+2]_430
25713                             4.7e-07  62_[+2]_419
23416                             4.7e-07  137_[+2]_344
263246                            5.8e-07  352_[+2]_129
21795                             2.2e-06  247_[+2]_234
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=9
31085                    (  111) GGAGTAGTAGATGGAGACG  1
261820                   (  101) GGGGTAGGAGCTGGAGACG  1
263142                   (  449) GTAGTCGAGGTTGGAGAGG  1
263313                   (  414) GTCGTCGTCGTTGTTGACG  1
21683                    (   52) GGCGGTGTATTTGTCGATG  1
25713                    (   63) GTCGATGTTGCTGGAGGCG  1
23416                    (  138) GGAGGAGAATATGGTGGAG  1
263246                   (  353) TGAGGCGTACTTGGTGAGT  1
21795                    (  248) GGAGTGGTAGGGAGAGATT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 8194 bayes= 9.96328 E= 3.4e+001
  -982   -982    193   -122
  -982   -982    152     36
   103     50   -106   -982
  -982   -982    210   -982
  -129   -982     52    109
    29     50   -106    -23
  -982   -982    210   -982
   -29   -982   -106    136
   129   -108   -106   -122
  -982   -108    152    -23
   -29     -8   -106     77
  -982   -982   -106    177
  -129   -982    193   -982
  -982   -982    174    -23
   103   -108   -982     36
  -982   -982    210   -982
   151   -982     -7   -982
  -129     92     -7    -23
  -982   -982    174    -23
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 9 E= 3.4e+001
 0.000000  0.000000  0.888889  0.111111
 0.000000  0.000000  0.666667  0.333333
 0.555556  0.333333  0.111111  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.000000  0.333333  0.555556
 0.333333  0.333333  0.111111  0.222222
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.000000  0.111111  0.666667
 0.666667  0.111111  0.111111  0.111111
 0.000000  0.111111  0.666667  0.222222
 0.222222  0.222222  0.111111  0.444444
 0.000000  0.000000  0.111111  0.888889
 0.111111  0.000000  0.888889  0.000000
 0.000000  0.000000  0.777778  0.222222
 0.555556  0.111111  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.111111  0.444444  0.222222  0.222222
 0.000000  0.000000  0.777778  0.222222
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
G[GT][AC]G[TG][ACT]G[TA]A[GT][TAC]TG[GT][AT]G[AG][CGT][GT]
--------------------------------------------------------------------------------

Time  5.42 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 17 llr = 165 E-value = 3.2e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 243::1:42::232:: pos.-specific C 13111:4112:2511: probability G 5422:96:32:113:a matrix T 1:589::556a5259: bits 2.1 * 1.9 * * 1.7 * * * 1.5 ** * ** Relative 1.3 ** * ** Entropy 1.1 **** * ** (14.0 bits) 0.8 **** * ** 0.6 ***** ** ** 0.4 ** ***** ** ** 0.2 **************** 0.0 ---------------- Multilevel GATTTGGTTTTTCTTG consensus AGA CAGG AG sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 21795 451 1.22e-09 GTGTTGAACG GATTTGGATTTTCTTG TGTCTTTGTC 262099 364 3.08e-07 TCCGCAGATA GCTTTGCTTTTCATTG TAGTATTATT 20590 206 4.29e-07 ACTAAAGCTA GATTTGGTCTTTCGTG TTTGTGACAA 17140 226 1.41e-06 ACTAGAGTAT ACTTTGCCGTTTCTTG AAGCATTTGC 8365 60 3.84e-06 CTAGTTCTTG TATTTGCTGTTTTGTG GTTTGCTTGT 261820 334 5.99e-06 TGGGGGGGTG AGTTTGGATTTACTCG TAAGGTCCGA 21683 80 5.99e-06 GCTGTGAATG AGGTTGCTTTTGATTG ATATCGAAAA 25713 28 7.39e-06 AGGAAGATGC GGTGTGGTATTTTATG GCACTGAAAG 22984 130 1.89e-05 GCTTCGAGCG GCATTGCTGGTCCCTG TTGTTCTGCA 263313 372 2.24e-05 TTTGTCACGC ACAGTGGCTCTTCTTG CGACGCTTCA 24818 27 2.24e-05 ACTAATAATA GATTTACATGTAATTG ATCATATAAA 23416 337 3.59e-05 ATGCAAATAT GAGGTGGAGTTAGTTG TCGAACAAGC 263246 149 5.14e-05 TCTCGTTCTA CGCTTGGAAGTTCATG AATAACAACA 263142 472 5.14e-05 
GAGAGGATAT CGACTGCTTCTTCGTG GGTTGTCGTC 31085 24 5.88e-05 ACGACGAACT GCGTTAGATGTGAGTG ACATGTAGGC 268329 217 8.07e-05 CCTACGTCTT TGATTGGTGTTCAACG AAGTAAGCTC 21477 165 8.07e-05 GTGTTGTGCT GAATCGGAACTTTGTG ACGAACCCAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 21795 1.2e-09 450_[+3]_34 262099 3.1e-07 363_[+3]_121 20590 4.3e-07 205_[+3]_279 17140 1.4e-06 225_[+3]_259 8365 3.8e-06 59_[+3]_425 261820 6e-06 333_[+3]_151 21683 6e-06 79_[+3]_405 25713 7.4e-06 27_[+3]_457 22984 1.9e-05 129_[+3]_355 263313 2.2e-05 371_[+3]_113 24818 2.2e-05 26_[+3]_458 23416 3.6e-05 336_[+3]_148 263246 5.1e-05 148_[+3]_336 263142 5.1e-05 471_[+3]_13 31085 5.9e-05 23_[+3]_461 268329 8.1e-05 216_[+3]_268 21477 8.1e-05 164_[+3]_320 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=16 seqs=17 21795 ( 451) GATTTGGATTTTCTTG 1 262099 ( 364) GCTTTGCTTTTCATTG 1 20590 ( 206) GATTTGGTCTTTCGTG 1 17140 ( 226) ACTTTGCCGTTTCTTG 1 8365 ( 60) TATTTGCTGTTTTGTG 1 261820 ( 334) AGTTTGGATTTACTCG 1 21683 ( 80) AGGTTGCTTTTGATTG 1 25713 ( 28) GGTGTGGTATTTTATG 1 22984 ( 130) GCATTGCTGGTCCCTG 1 263313 ( 372) ACAGTGGCTCTTCTTG 1 24818 ( 27) GATTTACATGTAATTG 1 23416 ( 337) GAGGTGGAGTTAGTTG 1 263246 ( 149) CGCTTGGAAGTTCATG 1 263142 ( 472) CGACTGCTTCTTCGTG 1 31085 ( 24) GCGTTAGATGTGAGTG 1 268329 ( 217) TGATTGGTGTTCAACG 1 21477 ( 165) GAATCGGAACTTTGTG 1 // -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 8245 bayes= 8.98854 E= 3.2e+001 -21 -100 119 -114 38 32 60 -1073 11 -200 -40 86 -1073 -200 -40 156 -1073 -200 -1073 186 -121 -1073 192 -1073 -1073 81 134 -1073 60 -100 -1073 86 -62 -200 34 86 -1073 -42 2 118 -1073 -1073 -1073 194 -62 -42 -98 103 11 100 -198 -56 -62 -200 34 86 -1073 -100 -1073 176 -1073 -1073 210 -1073 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 17 E= 3.2e+001 0.235294 0.117647 0.529412 0.117647 0.352941 0.294118 0.352941 0.000000 0.294118 0.058824 0.176471 0.470588 0.000000 0.058824 0.176471 0.764706 0.000000 0.058824 0.000000 0.941176 0.117647 0.000000 0.882353 0.000000 0.000000 0.411765 0.588235 0.000000 0.411765 0.117647 0.000000 0.470588 0.176471 0.058824 0.294118 0.470588 0.000000 0.176471 0.235294 0.588235 0.000000 0.000000 0.000000 1.000000 0.176471 0.176471 0.117647 0.529412 0.294118 0.470588 0.058824 0.176471 0.176471 0.058824 0.294118 0.470588 0.000000 0.117647 0.000000 0.882353 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [GA][AGC][TA]TTG[GC][TA][TG][TG]TT[CA][TG]TG -------------------------------------------------------------------------------- Time 8.02 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17140                            4.45e-05  62_[+1(1.06e-06)]_151_\
    [+3(1.41e-06)]_259
20590                            6.29e-05  111_[+3(1.59e-05)]_27_\
    [+1(9.43e-06)]_39_[+3(4.29e-07)]_279
21477                            7.09e-03  164_[+3(8.07e-05)]_99_\
    [+1(1.44e-05)]_209
21683                            5.06e-07  51_[+2(3.01e-07)]_9_[+3(5.99e-06)]_\
    230_[+1(1.13e-05)]_163
21795                            1.18e-08  136_[+2(4.89e-05)]_92_\
    [+2(2.24e-06)]_184_[+3(1.22e-09)]_34
22984                            8.28e-03  13_[+1(5.33e-05)]_104_\
    [+3(1.89e-05)]_355
23416                            2.49e-05  137_[+2(4.70e-07)]_180_\
    [+3(3.59e-05)]_2_[+1(9.74e-05)]_134
24818                            7.64e-05  26_[+3(2.24e-05)]_130_\
    [+1(3.23e-07)]_316
25713                            5.57e-07  27_[+3(7.39e-06)]_19_[+2(4.70e-07)]_\
    406_[+1(6.51e-06)]_1
261820                           9.72e-08  100_[+2(2.16e-08)]_187_\
    [+2(6.39e-06)]_8_[+3(5.99e-06)]_137_[+1(2.57e-05)]_2
262099                           2.48e-04  131_[+1(9.74e-05)]_220_\
    [+3(3.08e-07)]_121
263142                           2.92e-06  368_[+1(8.52e-05)]_68_\
    [+2(3.36e-08)]_4_[+3(5.14e-05)]_13
263246                           7.70e-06  148_[+3(5.14e-05)]_5_[+1(1.44e-05)]_\
    171_[+2(5.80e-07)]_129
263313                           1.72e-07  371_[+3(2.24e-05)]_26_\
    [+2(5.63e-08)]_3_[+1(4.92e-06)]_53
268329                           2.48e-02  155_[+1(4.94e-05)]_49_\
    [+3(8.07e-05)]_268
31085                            1.90e-09  23_[+3(5.88e-05)]_71_[+2(6.17e-11)]_\
    123_[+1(1.28e-05)]_28_[+2(3.88e-06)]_189
8365                             2.24e-03  59_[+3(3.84e-06)]_206_\
    [+1(7.52e-05)]_207
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************