********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/126/126.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31900                    1.0000    500  46818                    1.0000    500
46843                    1.0000    500  47276                    1.0000    500
47367                    1.0000    500  7718                     1.0000    500
49372                    1.0000    500  40467                    1.0000    500
44241                    1.0000    500  44273                    1.0000    500
44453                    1.0000    500  45032                    1.0000    500
34732                    1.0000    500  35564                    1.0000    500
50959                    1.0000    500  33888                    1.0000    500
48248                    1.0000    500  42856                    1.0000    500
48785                    1.0000    500  48851                    1.0000    500
40172                    1.0000    500  47310                    1.0000    500
44015                    1.0000    500  47938                    1.0000    500
45975                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/126/126.seqs.fa -oc motifs/126 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       25    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           12500    N=              25
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.269 C 0.230 G 0.216 T 0.285
Background letter frequencies (from dataset with add-one prior applied):
A 0.269 C 0.230 G 0.216 T 0.285
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  18  sites =  17  llr = 210  E-value = 1.5e-006
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  83::5141:29252:292
pos.-specific     C  22:516:2:6:81:1611
probability       G  :2114::5:1:1:8:::7
matrix            T  :394:262a11:4:91::

         bits    2.2
                 2.0
                 1.8         *
                 1.5         * *  **
Relative         1.3   *     * *  ** *
Entropy          1.1 * *     * ** ** **
(17.8 bits)      0.9 * * * * * ** ** **
                 0.7 * ***** **** *****
                 0.4 * ****************
                 0.2 * ****************
                 0.0 ------------------

Multilevel           AATCACTGTCACAGTCAG
consensus            CT TGTAT    T  A A
sequence              G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ------------------
35564                        65  7.29e-10 ACACAACTTC ATTCACAGTCACTGTCAG TTTGCCTACC
40172                       191  2.90e-09 TTAGCGCTAA CTTCACAGTCACAGTCAG CAATTTGTTT
44273                       361  4.17e-09 GCAATACACC ATTCACTGTCACAATCAG GACAGCCTGG
47938                       281  5.56e-09 TCACCATCCG ACTCACACTCACAGTCAG TCAACGCTCA
48248                       330  4.63e-07 TCTACATGAT AGTTGCTGTTACAGTTAG TTCATTCCAC
31900                        18  5.59e-07 AAAACATCCC AATCGTTCTGACAGTCAA AATCCAATCA
40467                        95  6.73e-07 AACGCTTGGA AATTGTAGTCACCGTCCG CCTTAGGGAG
47367                       308  6.73e-07 GATCCGCCAA AAGCACTGTCAGCGTCAG CCGCCCATCT
33888                       454  8.06e-07 GTAGTTTCCG CTTCGCTGTCTCTGTCAA TTAGCCGTCT
50959                       320  8.06e-07 ACCAGCGCCA AAGTGCATTCACTGTCAA TCGAACCAGA
48785                       149  1.47e-06 AATACCGAAC AGTTAATTTAACAGTAAG ATCGATTCAA
45975                       319  2.56e-06 GATAGGTAAG ATTTCTTGTCACTGTCAC ATCACATATG
44241                        94  2.97e-06 AGGTGCAGTC AATGGCTTTGACTGTAAA TGCAGTTATG
44015                        24  3.97e-06 TCTAAATCAG AGTCGTTTTCAAAATCCG ACTCTAAATT
47310                        32  8.21e-06 CTGTACCATA ACTTACTATAAAAGTTAG CAGTAGTAAC
34732                        81  9.29e-06 TCATAGGTCA CGTCAATCTAACAATAAG CGCTGGCAAG
46843                       106  1.32e-05 TCTTCGTAAA CCTGACAGTCAATGCAAG AAATAGTAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35564                             7.3e-10  64_[+1]_418
40172                             2.9e-09  190_[+1]_292
44273                             4.2e-09  360_[+1]_122
47938                             5.6e-09  280_[+1]_202
48248                             4.6e-07  329_[+1]_153
31900                             5.6e-07  17_[+1]_465
40467                             6.7e-07  94_[+1]_388
47367                             6.7e-07  307_[+1]_175
33888                             8.1e-07  453_[+1]_29
50959                             8.1e-07  319_[+1]_163
48785                             1.5e-06  148_[+1]_334
45975                             2.6e-06  318_[+1]_164
44241                               3e-06  93_[+1]_389
44015                               4e-06  23_[+1]_459
47310                             8.2e-06  31_[+1]_451
34732                             9.3e-06  80_[+1]_402
46843                             1.3e-05  105_[+1]_377
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=17
35564                    (   65) ATTCACAGTCACTGTCAG  1
40172                    (  191) CTTCACAGTCACAGTCAG  1
44273                    (  361) ATTCACTGTCACAATCAG  1
47938                    (  281) ACTCACACTCACAGTCAG  1
48248                    (  330) AGTTGCTGTTACAGTTAG  1
31900                    (   18) AATCGTTCTGACAGTCAA  1
40467                    (   95) AATTGTAGTCACCGTCCG  1
47367                    (  308) AAGCACTGTCAGCGTCAG  1
33888                    (  454) CTTCGCTGTCTCTGTCAA  1
50959                    (  320) AAGTGCATTCACTGTCAA  1
48785                    (  149) AGTTAATTTAACAGTAAG  1
45975                    (  319) ATTTCTTGTCACTGTCAC  1
44241                    (   94) AATGGCTTTGACTGTAAA  1
44015                    (   24) AGTCGTTTTCAAAATCCG  1
47310                    (   32) ACTTACTATAAAAGTTAG  1
34732                    (   81) CGTCAATCTAACAATAAG  1
46843                    (  106) CCTGACAGTCAATGCAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 12075 bayes= 10.8365 E= 1.5e-006
   150      3  -1073  -1073
    13    -38     12      5
 -1073  -1073    -87    163
 -1073    120    -87     31
    97   -196     93  -1073
  -119    149  -1073    -28
    39  -1073  -1073    118
  -219    -38    129    -28
 -1073  -1073  -1073    181
   -61    149    -87   -227
   180  -1073  -1073   -227
   -61    173   -187  -1073
    97    -97  -1073     31
   -61  -1073    193  -1073
 -1073   -196  -1073    172
   -19    149  -1073   -127
   171    -97  -1073  -1073
   -19   -196    171  -1073
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 17 E= 1.5e-006
 0.764706  0.235294  0.000000  0.000000
 0.294118  0.176471  0.235294  0.294118
 0.000000  0.000000  0.117647  0.882353
 0.000000  0.529412  0.117647  0.352941
 0.529412  0.058824  0.411765  0.000000
 0.117647  0.647059  0.000000  0.235294
 0.352941  0.000000  0.000000  0.647059
 0.058824  0.176471  0.529412  0.235294
 0.000000  0.000000  0.000000  1.000000
 0.176471  0.647059  0.117647  0.058824
 0.941176  0.000000  0.000000  0.058824
 0.176471  0.764706  0.058824  0.000000
 0.529412  0.117647  0.000000  0.352941
 0.176471  0.000000  0.823529  0.000000
 0.000000  0.058824  0.000000  0.941176
 0.235294  0.647059  0.000000  0.117647
 0.882353  0.117647  0.000000  0.000000
 0.235294  0.058824  0.705882  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AC][ATG]T[CT][AG][CT][TA][GT]TCAC[AT]GT[CA]A[GA]
--------------------------------------------------------------------------------




Time  6.51 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  13  llr = 139  E-value = 2.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :47888:74a:1
pos.-specific     C  ::1:1::21:28
probability       G  a122:2a25:81
matrix            T  :51:1:::::::

         bits    2.2 *     *
                 2.0 *     *  *
                 1.8 *     *  *
                 1.5 *     *  **
Relative         1.3 *     *  ***
Entropy          1.1 *  ****  ***
(15.5 bits)      0.9 *  *********
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GTAAAAGAGAGC
consensus             A G G  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value               Site
-------------             ----- ---------            ------------
47310                       463  5.41e-08 ACCTTCGCTT GTAAAAGAGAGC ATCTCTTGAG
49372                        17  1.05e-07 AAGTCGTGGC GAAAAAGAGAGC AATGAGTACA
34732                        38  7.28e-07 CTCTTTTCTT GTAAAGGAAAGC ACTGACAATA
47367                       182  9.78e-07 TACTTCGAAT GTAAAAGGAAGC TATGTTTACA
45032                       326  1.57e-06 CGGAACGAAG GTGAAGGAGAGC TGCTTGTGAA
44273                       428  3.72e-06 CTGTGCCTTT GAGAAAGCGAGC TGGTACTCAC
48785                       192  6.83e-06 AGGAGTCCTA GTAGTAGAGAGC GTATAATGTA
42856                       399  6.83e-06 TTTTGATATT GTAGAAGAGAGA ACATTAATTG
40467                       289  8.00e-06 CGGCACTCTC GAAGAAGAAACC CCGTCGTGGA
46818                       410  9.45e-06 ACTGCGCTAT GATAAAGGGAGC CTCTGCTCCA
7718                        389  1.83e-05 GCATTGTATG GAAACAGACAGC TGTACGCGCC
47276                        54  2.80e-05 GATTCGCAGC GGCAAGGAAAGC TTGTCCTTTA
44241                        70  4.40e-05 TCAATTGACA GTAAAAGCAACG AGAGGTGCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47310                             5.4e-08  462_[+2]_26
49372                             1.1e-07  16_[+2]_472
34732                             7.3e-07  37_[+2]_451
47367                             9.8e-07  181_[+2]_307
45032                             1.6e-06  325_[+2]_163
44273                             3.7e-06  427_[+2]_61
48785                             6.8e-06  191_[+2]_297
42856                             6.8e-06  398_[+2]_90
40467                               8e-06  288_[+2]_200
46818                             9.5e-06  409_[+2]_79
7718                              1.8e-05  388_[+2]_100
47276                             2.8e-05  53_[+2]_435
44241                             4.4e-05  69_[+2]_419
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=13
47310                    (  463) GTAAAAGAGAGC  1
49372                    (   17) GAAAAAGAGAGC  1
34732                    (   38) GTAAAGGAAAGC  1
47367                    (  182) GTAAAAGGAAGC  1
45032                    (  326) GTGAAGGAGAGC  1
44273                    (  428) GAGAAAGCGAGC  1
48785                    (  192) GTAGTAGAGAGC  1
42856                    (  399) GTAGAAGAGAGA  1
40467                    (  289) GAAGAAGAAACC  1
46818                    (  410) GATAAAGGGAGC  1
7718                     (  389) GAAACAGACAGC  1
47276                    (   54) GGCAAGGAAAGC  1
44241                    (   70) GTAAAAGCAACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 12225 bayes= 10.4066 E= 2.0e+001
 -1035  -1035    221  -1035
    51  -1035   -149     92
   136   -158    -49   -189
   151  -1035     10  -1035
   165   -158  -1035   -189
   151  -1035     10  -1035
 -1035  -1035    221  -1035
   136    -58    -49  -1035
    51   -158    132  -1035
   189  -1035  -1035  -1035
 -1035    -58    197  -1035
  -181    188   -149  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 2.0e+001
 0.000000  0.000000  1.000000  0.000000
 0.384615  0.000000  0.076923  0.538462
 0.692308  0.076923  0.153846  0.076923
 0.769231  0.000000  0.230769  0.000000
 0.846154  0.076923  0.000000  0.076923
 0.769231  0.000000  0.230769  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.692308  0.153846  0.153846  0.000000
 0.384615  0.076923  0.538462  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.153846  0.846154  0.000000
 0.076923  0.846154  0.076923  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[TA]A[AG]A[AG]GA[GA]AGC
--------------------------------------------------------------------------------




Time 13.23 secs.

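As a rough illustration of how the Motif 2 regular expression above could be used, the following Python sketch scans the forward strand of a sequence for exact matches (the run used strands: +). The function name find_sites and the test sequence, which is the 47310 site joined with its flanking bases from the sites table, are assumptions made for this example and are not part of MEME. The regular expression is only a simplification of the position-specific probability matrix: the 44241 site GTAAAAGCAACG, for instance, does not match it.

```python
import re

# Regular expression reported for Motif 2 in this file.
MOTIF2_REGEX = "G[TA]A[AG]A[AG]GA[GA]AGC"


def find_sites(sequence, pattern=MOTIF2_REGEX):
    """Return (1-based start, matched site) pairs on the forward strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start() + 1, m.group(1))
            for m in lookahead.finditer(sequence.upper())]


# Site 47310 with its 10-base flanks, copied from the Motif 2 sites table.
example = "ACCTTCGCTT" + "GTAAAAGAGAGC" + "ATCTCTTGAG"
print(find_sites(example))
```

The reported start is relative to the string that is passed in, so scanning the full 500-base sequence would be needed to recover the start column of the sites table.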
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   6  llr = 80  E-value = 9.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::8a::8:772:
pos.-specific     C  :a2:5a:8338:
probability       G  a:::5:22:::a
matrix            T  ::::::::::::

         bits    2.2 **   *     *
                 2.0 ** * *     *
                 1.8 ** * *     *
                 1.5 ** * * *   *
Relative         1.3 **** ***  **
Entropy          1.1 ************
(19.3 bits)      0.9 ************
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GCAACCACAACG
consensus                G   CC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value               Site
-------------             ----- ---------            ------------
7718                        174  8.20e-08 GCAAAATAAA GCAACCACAACG ATGCGCGTGT
35564                       116  2.22e-07 TTGGATTTTG GCAACCACCACG ACTACAACAA
44453                        40  3.48e-07 AGGACGCTCT GCAAGCGCAACG TCGAGGCGGC
44015                       107  4.95e-07 ACCGTAACGG GCAACCAGAACG CCAGCTCGGT
49372                       196  1.03e-06 ATAATTAATT GCAAGCACACAG GCTAGTACAT
45975                        22  1.17e-06 GAATAGACAA GCCAGCACCCCG CAAATCCAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7718                              8.2e-08  173_[+3]_315
35564                             2.2e-07  115_[+3]_373
44453                             3.5e-07  39_[+3]_449
44015                               5e-07  106_[+3]_382
49372                               1e-06  195_[+3]_293
45975                             1.2e-06  21_[+3]_467
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=6
7718                     (  174) GCAACCACAACG  1
35564                    (  116) GCAACCACCACG  1
44453                    (   40) GCAAGCGCAACG  1
44015                    (  107) GCAACCAGAACG  1
49372                    (  196) GCAAGCACACAG  1
45975                    (   22) GCCAGCACCCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 12225 bayes= 10.6507 E= 9.0e+002
  -923   -923    221   -923
  -923    212   -923   -923
   163    -46   -923   -923
   189   -923   -923   -923
  -923    112    121   -923
  -923    212   -923   -923
   163   -923    -37   -923
  -923    186    -37   -923
   131     53   -923   -923
   131     53   -923   -923
   -69    186   -923   -923
  -923   -923    221   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 9.0e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GCAA[CG]CAC[AC][AC]CG
--------------------------------------------------------------------------------




Time 19.79 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31900                            3.61e-03  17_[+1(5.59e-07)]_465
46818                            5.76e-03  240_[+3(5.91e-05)]_157_\
    [+2(9.45e-06)]_79
46843                            6.46e-02  105_[+1(1.32e-05)]_377
47276                            3.00e-02  53_[+2(2.80e-05)]_435
47367                            1.36e-05  181_[+2(9.78e-07)]_114_\
    [+1(6.73e-07)]_175
7718                             3.65e-05  173_[+3(8.20e-08)]_203_\
    [+2(1.83e-05)]_100
49372                            3.75e-06  16_[+2(1.05e-07)]_167_\
    [+3(1.03e-06)]_293
40467                            7.77e-05  94_[+1(6.73e-07)]_176_\
    [+2(8.00e-06)]_200
44241                            2.65e-04  69_[+2(4.40e-05)]_12_[+1(2.97e-06)]_\
    389
44273                            1.34e-07  256_[+1(9.12e-05)]_86_\
    [+1(4.17e-09)]_49_[+2(3.72e-06)]_61
44453                            3.97e-03  39_[+3(3.48e-07)]_449
45032                            2.63e-03  325_[+2(1.57e-06)]_163
34732                            3.76e-05  37_[+2(7.28e-07)]_31_[+1(9.29e-06)]_\
    402
35564                            5.80e-09  64_[+1(7.29e-10)]_33_[+3(2.22e-07)]_\
    373
50959                            3.22e-03  319_[+1(8.06e-07)]_163
33888                            9.40e-03  453_[+1(8.06e-07)]_29
48248                            7.36e-04  54_[+2(9.68e-05)]_263_\
    [+1(4.63e-07)]_153
42856                            3.75e-02  398_[+2(6.83e-06)]_40_\
    [+2(3.36e-05)]_38
48785                            1.81e-04  148_[+1(1.47e-06)]_25_\
    [+2(6.83e-06)]_297
48851                            3.16e-01  500
40172                            7.54e-06  190_[+1(2.90e-09)]_292
47310                            8.22e-06  31_[+1(8.21e-06)]_413_\
    [+2(5.41e-08)]_26
44015                            2.97e-05  23_[+1(3.97e-06)]_65_[+3(4.95e-07)]_\
    382
47938                            2.11e-05  280_[+1(5.56e-09)]_202
45975                            1.77e-05  21_[+3(1.17e-06)]_285_\
    [+1(2.56e-06)]_164
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
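
The motif diagrams in the summary above encode spacer lengths and motif occurrences, for example 181_[+2(9.78e-07)]_114_[+1(6.73e-07)]_175 for sequence 47367. The Python sketch below shows one way such a diagram might be decoded into 1-based start positions. It is a hedged illustration only: parse_diagram and MOTIF_WIDTHS are names invented for this example, the widths are copied from the MOTIF header lines of this file, and the handling of the "\" line continuation is an assumption based on how the diagrams wrap here.

```python
import re

# Motif widths taken from the "MOTIF n MEME width = ..." lines in this file.
MOTIF_WIDTHS = {1: 18, 2: 12, 3: 12}

# One motif occurrence, e.g. "[+2(9.78e-07)]": strand, motif number, p-value.
SITE = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Decode a combined block diagram.

    Returns (sites, total_length), where each site is
    (motif number, strand, 1-based start, p-value).
    """
    # Drop continuation backslashes and any whitespace from wrapped diagrams.
    cleaned = re.sub(r"[\\\s]", "", diagram)
    position = 0
    sites = []
    for token in cleaned.split("_"):
        match = SITE.fullmatch(token)
        if match:
            motif = int(match.group(2))
            sites.append((motif, match.group(1), position + 1,
                          float(match.group(3))))
            position += widths[motif]
        elif token:
            position += int(token)  # spacer length between occurrences
    return sites, position


sites, length = parse_diagram("181_[+2(9.78e-07)]_114_\\\n    [+1(6.73e-07)]_175")
print(sites, length)
```

For sequence 47367 the decoded starts (182 for motif 2, 308 for motif 1) agree with the Start column of the per-motif sites tables, and the spacers plus motif widths add up to the sequence length of 500 given in the training set.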