********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/162/162.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1446                     1.0000    500  20065                    1.0000    500
21909                    1.0000    500  23464                    1.0000    500
24624                    1.0000    500  260982                   1.0000    500
2623                     1.0000    500  263193                   1.0000    500
268737                   1.0000    500  268822                   1.0000    500
2974                     1.0000    500  4973                     1.0000    500
5538                     1.0000    500  6112                     1.0000    500
7547                     1.0000    500  993                      1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/162/162.seqs.fa -oc motifs/162 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.255 C 0.224 G 0.254 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.255 C 0.224 G 0.254 T 0.266
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 21 sites = 10 llr = 149 E-value = 3.2e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :212:14113:112:2::4::
pos.-specific     C  1741984992693:63:a3:6
probability       G  81::::1::3::18323:1::
matrix            T  1:57111::24:5:137:2a4

         bits    2.2                  *
                 1.9                  * *
                 1.7     *  **  *     * *
                 1.5     *  **  *     * *
Relative         1.3     ** **  * *   * *
Entropy          1.1 *   ** ** ** *  ** **
(21.5 bits)      0.9 ** *** ** ** ** ** **
                 0.6 ****** ** ** ** ** **
                 0.4 ****** ** ** ** ** **
                 0.2 ********* ***** *****
                 0.0 ---------------------

Multilevel           GCTTCCACCACCTGCCTCATC
consensus             ACA  C  GT CAGTG C T
sequence                      C     A  T
                              T     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
24624                       439  1.27e-10 TTCGCGCGCG GCTTCCCCCCCCTGGCTCATT AAACCCCACA
268822                      202  5.78e-09 CCCCCGCGAA GGCTCCACCGTCCGGCTCATC GGTCTCAAAA
7547                        219  1.10e-08 ACCACGTGAA CCATCCACCACCTGCTTCCTT CTGCTGCGCT
4973                        405  3.85e-08 GCTCCCTCCG GCCTCCACCGCCGAGTTCGTC GGCTACAATA
23464                        15  4.59e-08 CTGGCAATAA GATTCCACCATCCGTAGCATC CAACGTATCC
1446                        384  6.47e-08 TCTGCTGCAT GCCACCGACACCTGCTTCCTT CCTCGCCGTC
2974                        424  7.64e-08 AGCATACTAT GCTTTCCCCCTCCACGTCTTC ACTGTCCTGC
263193                      455  2.39e-07 TTTGCTGTTT GCTCCTCCCTCCAGCCGCTTC CCTCCTGGCT
21909                       467  4.66e-07 CGGACGAGCA GATACACCAGTCTGCATCATC GACAGTACAA
993                         307  5.62e-07 CAACCCCACA TCCTCCTCCTCATGCGGCCTT TGGCGAGTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24624                             1.3e-10  438_[+1]_41
268822                            5.8e-09  201_[+1]_278
7547                              1.1e-08  218_[+1]_261
4973                              3.9e-08  404_[+1]_75
23464                             4.6e-08  14_[+1]_465
1446                              6.5e-08  383_[+1]_96
2974                              7.6e-08  423_[+1]_56
263193                            2.4e-07  454_[+1]_25
21909                             4.7e-07  466_[+1]_13
993                               5.6e-07  306_[+1]_173
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=10
24624                    (  439) GCTTCCCCCCCCTGGCTCATT  1
268822                   (  202) GGCTCCACCGTCCGGCTCATC  1
7547                     (  219) CCATCCACCACCTGCTTCCTT  1
4973                     (  405) GCCTCCACCGCCGAGTTCGTC  1
23464                    (   15) GATTCCACCATCCGTAGCATC  1
1446                     (  384) GCCACCGACACCTGCTTCCTT  1
2974                     (  424) GCTTTCCCCCTCCACGTCTTC  1
263193                   (  455) GCTCCTCCCTCCAGCCGCTTC  1
21909                    (  467) GATACACCAGTCTGCATCATC  1
993                      (  307) TCCTCCTCCTCATGCGGCCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 9.83492 E= 3.2e-002
  -997   -116    165   -141
   -35    164   -134   -997
  -135     83   -997     91
   -35   -116   -997    139
  -997    200   -997   -141
  -135    183   -997   -141
    65     83   -134   -141
  -135    200   -997   -997
  -135    200   -997   -997
    23    -17     24    -41
  -997    142   -997     59
  -135    200   -997   -997
  -135     42   -134     91
   -35   -997    165   -997
  -997    142     24   -141
   -35     42    -35     17
  -997   -997     24    139
  -997    216   -997   -997
    65     42   -134    -41
  -997   -997   -997    191
  -997    142   -997     59
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 3.2e-002
 0.000000  0.100000  0.800000  0.100000
 0.200000  0.700000  0.100000  0.000000
 0.100000  0.400000  0.000000  0.500000
 0.200000  0.100000  0.000000  0.700000
 0.000000  0.900000  0.000000  0.100000
 0.100000  0.800000  0.000000  0.100000
 0.400000  0.400000  0.100000  0.100000
 0.100000  0.900000  0.000000  0.000000
 0.100000  0.900000  0.000000  0.000000
 0.300000  0.200000  0.300000  0.200000
 0.000000  0.600000  0.000000  0.400000
 0.100000  0.900000  0.000000  0.000000
 0.100000  0.300000  0.100000  0.500000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.600000  0.300000  0.100000
 0.200000  0.300000  0.200000  0.300000
 0.000000  0.000000  0.300000  0.700000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.300000  0.100000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.000000  0.400000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[CA][TC][TA]CC[AC]CC[AGCT][CT]C[TC][GA][CG][CTAG][TG]C[ACT]T[CT]
--------------------------------------------------------------------------------

Time  2.65 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 21 sites = 9 llr = 140 E-value = 3.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::243:216::::1:31:6::
pos.-specific     C  :6:32:6::29133:6123::
probability       G  ::722a::4::7::a1:7::6
matrix            T  a41:2:29:81276::811a4

         bits    2.2
                 1.9 *    *        *    *
                 1.7 *    *    *   *    *
                 1.5 *    * *  *   *    *
Relative         1.3 *    * * **   *    *
Entropy          1.1 **   * **** * *    *
(22.5 bits)      0.9 **   * ****** * ** **
                 0.6 ***  ****************
                 0.4 **** ****************
                 0.2 **** ****************
                 0.0 ---------------------

Multilevel           TCGAAGCTATCGTTGCTGATG
consensus             TACC A GC TCC A CC T
sequence                 GG T
                          T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
23464                       158  2.39e-10 ATGTACAACG TCGAAGATGTCGCTGCTGATG ATCGTGTGGC
4973                        289  1.38e-09 CACTACACCA TCGGCGCTGTCGCTGATGCTG GAAACAATCG
268737                       73  3.06e-09 TGATGGGCTG TTGCCGTTATCGTCGCTCATG GCTGTGGAGA
260982                        9  3.91e-09   TTGGCAAG TTGAGGCTGTCGTTGACGATG ACGAAGCCGA
7547                        272  1.61e-08 GCCACTCTCT TCGCAGCTATCCTTGCAGCTT GCACCAGGCG
268822                      341  8.62e-08 CCGAAGCAGA TTGAGGTTATCGCCGGTCCTT CGAGGAGGGC
20065                        88  2.37e-07 GATATAGACC TCTCTGAAATCTTCGCTGATT ACTTCAGTGA
263193                      434  2.94e-07 CTTTTAGGAC TCAAAGCTACTTTTGCTGTTT GCTCCTCCCT
2974                        242  3.60e-07 CTGCTGTTGT TTAGTGCTGCCGTAGATTATG GGATTGGTAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23464                             2.4e-10  157_[+2]_322
4973                              1.4e-09  288_[+2]_191
268737                            3.1e-09  72_[+2]_407
260982                            3.9e-09  8_[+2]_471
7547                              1.6e-08  271_[+2]_208
268822                            8.6e-08  340_[+2]_139
20065                             2.4e-07  87_[+2]_392
263193                            2.9e-07  433_[+2]_46
2974                              3.6e-07  241_[+2]_238
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=9
23464                    (  158) TCGAAGATGTCGCTGCTGATG  1
4973                     (  289) TCGGCGCTGTCGCTGATGCTG  1
268737                   (   73) TTGCCGTTATCGTCGCTCATG  1
260982                   (    9) TTGAGGCTGTCGTTGACGATG  1
7547                     (  272) TCGCAGCTATCCTTGCAGCTT  1
268822                   (  341) TTGAGGTTATCGCCGGTCCTT  1
20065                    (   88) TCTCTGAAATCTTCGCTGATT  1
263193                   (  434) TCAAAGCTACTTTTGCTGTTT  1
2974                     (  242) TTAGTGCTGCCGTAGATTATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 9.86973 E= 3.2e+000
  -982   -982   -982    191
  -982    131   -982     74
   -20   -982    139   -126
    80     57    -19   -982
    39     -1    -19    -26
  -982   -982    197   -982
   -20    131   -982    -26
  -120   -982   -982    174
   112   -982     81   -982
  -982     -1   -982    154
  -982    199   -982   -126
  -982   -101    139    -26
  -982     57   -982    132
  -120     57   -982    106
  -982   -982    197   -982
    39    131   -119   -982
  -120   -101   -982    154
  -982     -1    139   -126
   112     57   -982   -126
  -982   -982   -982    191
  -982   -982    113     74
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 3.2e+000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.555556  0.000000  0.444444
 0.222222  0.000000  0.666667  0.111111
 0.444444  0.333333  0.222222  0.000000
 0.333333  0.222222  0.222222  0.222222
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.555556  0.000000  0.222222
 0.111111  0.000000  0.000000  0.888889
 0.555556  0.000000  0.444444  0.000000
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.888889  0.000000  0.111111
 0.000000  0.111111  0.666667  0.222222
 0.000000  0.333333  0.000000  0.666667
 0.111111  0.333333  0.000000  0.555556
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.555556  0.111111  0.000000
 0.111111  0.111111  0.000000  0.777778
 0.000000  0.222222  0.666667  0.111111
 0.555556  0.333333  0.000000  0.111111
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.555556  0.444444
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[CT][GA][ACG][ACGT]G[CAT]T[AG][TC]C[GT][TC][TC]G[CA]T[GC][AC]T[GT]
--------------------------------------------------------------------------------

Time  5.18 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 15 sites = 7 llr = 98 E-value = 5.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::4::1:1:::::::
pos.-specific     C  a339::a4:a:3a::
probability       G  :33:44:3::a::39
matrix            T  :4:164:1a::7:71

         bits    2.2 *     *  *  *
                 1.9 *     * *** *
                 1.7 *     * *** *
                 1.5 *  *  * *** *
Relative         1.3 *  *  * *** * *
Entropy          1.1 *  *  * *******
(20.2 bits)      0.9 *  ** * *******
                 0.6 *  ** * *******
                 0.4 ******* *******
                 0.2 ***************
                 0.0 ---------------

Multilevel           CTACTGCCTCGTCTG
consensus             CC GT G   C G
sequence              GG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
4973                        362  1.44e-09 TATTCTCTCA CTACTTCCTCGTCTG CTCTCACTGG
7547                         72  8.29e-08 CTGTTACCGA CGGCGGCGTCGTCTG CTTTTGCAAA
268737                      417  8.29e-08 AACGCTTCGT CGCCTGCCTCGCCTG AGAACATGCT
993                         478  2.90e-07 ATAGATTGAT CCACTTCTTCGTCGG TTAACGGA
268822                      234  4.30e-07 GTCTCAAAAA CCCCGACCTCGTCGG CCAAGAGGGT
20065                       358  4.60e-07 TCCAATGTAT CTGTTTCGTCGTCTG GTGTAGGGGC
21909                        58  9.97e-07 CTATTTGTAG CTACGGCATCGCCTT CAATGGGTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4973                              1.4e-09  361_[+3]_124
7547                              8.3e-08  71_[+3]_414
268737                            8.3e-08  416_[+3]_69
993                               2.9e-07  477_[+3]_8
268822                            4.3e-07  233_[+3]_252
20065                             4.6e-07  357_[+3]_128
21909                               1e-06  57_[+3]_428
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=7
4973                     (  362) CTACTTCCTCGTCTG  1
7547                     (   72) CGGCGGCGTCGTCTG  1
268737                   (  417) CGCCTGCCTCGCCTG  1
993                      (  478) CCACTTCTTCGTCGG  1
268822                   (  234) CCCCGACCTCGTCGG  1
20065                    (  358) CTGTTTCGTCGTCTG  1
21909                    (   58) CTACGGCATCGCCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7776 bayes= 9.95989 E= 5.8e+001
  -945    216   -945   -945
  -945     35     17     68
    75     35     17   -945
  -945    193   -945    -90
  -945   -945     75    110
   -83   -945     75     68
  -945    216   -945   -945
   -83     93     17    -90
  -945   -945   -945    191
  -945    216   -945   -945
  -945   -945    197   -945
  -945     35   -945    142
  -945    216   -945   -945
  -945   -945     17    142
  -945   -945    175    -90
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 5.8e+001
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.285714  0.285714  0.428571
 0.428571  0.285714  0.285714  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.000000  0.428571  0.571429
 0.142857  0.000000  0.428571  0.428571
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.428571  0.285714  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.285714  0.000000  0.714286
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.000000  0.857143  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[TCG][ACG]C[TG][GT]C[CG]TCG[TC]C[TG]G
--------------------------------------------------------------------------------

Time  7.67 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1446                             5.38e-05  383_[+1(6.47e-08)]_96
20065                            1.34e-06  87_[+2(2.37e-07)]_249_\
    [+3(4.60e-07)]_128
21909                            9.35e-06  57_[+3(9.97e-07)]_394_\
    [+1(4.66e-07)]_13
23464                            2.17e-10  14_[+1(4.59e-08)]_122_\
    [+2(2.39e-10)]_322
24624                            3.45e-06  438_[+1(1.27e-10)]_41
260982                           3.09e-05  8_[+2(3.91e-09)]_471
2623                             4.36e-01  500
263193                           2.44e-06  433_[+2(2.94e-07)]_[+1(2.39e-07)]_\
    25
268737                           2.72e-09  72_[+2(3.06e-09)]_323_\
    [+3(8.29e-08)]_69
268822                           1.26e-11  201_[+1(5.78e-09)]_11_\
    [+3(4.30e-07)]_92_[+2(8.62e-08)]_139
2974                             9.95e-08  241_[+2(3.60e-07)]_161_\
    [+1(7.64e-08)]_56
4973                             6.94e-15  288_[+2(1.38e-09)]_52_\
    [+3(1.44e-09)]_28_[+1(3.85e-08)]_75
5538                             4.26e-01  500
6112                             4.42e-01  500
7547                             1.01e-12  71_[+3(8.29e-08)]_132_\
    [+1(1.10e-08)]_32_[+2(1.61e-08)]_208
993                              1.76e-06  306_[+1(5.62e-07)]_150_\
    [+3(2.90e-07)]_8
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************