********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/288/288.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
16746                    1.0000    500  17811                    1.0000    500
20602                    1.0000    500  2170                     1.0000    500
22389                    1.0000    500  2247                     1.0000    500
23142                    1.0000    500  270199                   1.0000    500
27208                    1.0000    500  30478                    1.0000    500
31150                    1.0000    500  33775                    1.0000    500
38025                    1.0000    500  5175                     1.0000    500
5323                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/288/288.seqs.fa -oc motifs/288 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.251 C 0.232 G 0.249 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.251 C 0.232 G 0.249 T 0.268
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  20  sites =   9  llr = 137  E-value = 3.2e-002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  33112:47:989179:6:3:
pos.-specific     C  7179:a22312:8119341a
probability       G  :6::1:::7::1:2:1:22:
matrix            T  ::2:7:31::::1:::133:

         bits    2.1      *             *
                 1.9      *             *
                 1.7    * *         *   *
                 1.5    * *   * *  **   *
Relative         1.3    * *   ***  **   *
Entropy          1.1 *  * *  ***** **   *
(22.0 bits)      0.8 * ** * *********   *
                 0.6 ****** **********  *
                 0.4 ****************** *
                 0.2 ****************** *
                 0.0 --------------------

Multilevel           CGCCTCAAGAAACAACACAC
consensus            AAT A TCC C  G  CTT
sequence                   C          GG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
270199                      348  1.15e-11 GCCACGTCGT CGCCTCAAGAAACAACAGAC TGAATTGAAG
17811                       312  1.15e-11 GCCACGTCGT CGCCTCAAGAAACAACAGAC TGAATTGAAG
2247                        308  1.78e-08 GATTTTTCAT CAACTCTAGAAGCAACACTC AAGCAAGAGA
5175                        464  5.71e-08 ACTGAGCCAC AATCTCTCGACACAACCTTC TGCAATTCAC
30478                        46  7.88e-08 CCTCTTGCAA CGCCTCACGCAATGACACAC TTGCTAGTTG
31150                       470  2.04e-07 GGACGCCGCC CACCACCACAAACGCCCCCC ACCACATAAC
16746                       400  2.18e-07 ACACGAAAGC CCCCGCCACAAACAAGCCGC CCCCCGCCGC
22389                       339  4.08e-07 AGTTGTTTGC AGCAACATGAAACAACTTGC AAAAGGATAC
2170                        476  4.33e-07 CCCGAGAAGA AGTCTCTACACAACACATTC CCATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
270199                            1.2e-11  347_[+1]_133
17811                             1.2e-11  311_[+1]_169
2247                              1.8e-08  307_[+1]_173
5175                              5.7e-08  463_[+1]_17
30478                             7.9e-08  45_[+1]_435
31150                               2e-07  469_[+1]_11
16746                             2.2e-07  399_[+1]_81
22389                             4.1e-07  338_[+1]_142
2170                              4.3e-07  475_[+1]_5
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=9
270199                   (  348) CGCCTCAAGAAACAACAGAC  1
17811                    (  312) CGCCTCAAGAAACAACAGAC  1
2247                     (  308) CAACTCTAGAAGCAACACTC  1
5175                     (  464) AATCTCTCGACACAACCTTC  1
30478                    (   46) CGCCTCACGCAATGACACAC  1
31150                    (  470) CACCACCACAAACGCCCCCC  1
16746                    (  400) CCCCGCCACAAACAAGCCGC  1
22389                    (  339) AGCAACATGAAACAACTTGC  1
2170                     (  476) AGTCTCTACACAACACATTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7215 bayes= 10.4939 E= 3.2e-002
    41    152   -982   -982
    41   -106    116   -982
  -117    152   -982    -27
  -117    194   -982   -982
   -18   -982   -116    131
  -982    211   -982   -982
    82     -6   -982     31
   141     -6   -982   -127
  -982     52    142   -982
   182   -106   -982   -982
   163     -6   -982   -982
   182   -982   -116   -982
  -117    175   -982   -127
   141   -106    -16   -982
   182   -106   -982   -982
  -982    194   -116   -982
   114     52   -982   -127
  -982     94    -16     31
    41   -106    -16     31
  -982    211   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 9 E= 3.2e-002
 0.333333  0.666667  0.000000  0.000000
 0.333333  0.111111  0.555556  0.000000
 0.111111  0.666667  0.000000  0.222222
 0.111111  0.888889  0.000000  0.000000
 0.222222  0.000000  0.111111  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.444444  0.222222  0.000000  0.333333
 0.666667  0.222222  0.000000  0.111111
 0.000000  0.333333  0.666667  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.111111  0.777778  0.000000  0.111111
 0.666667  0.111111  0.222222  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.555556  0.333333  0.000000  0.111111
 0.000000  0.444444  0.222222  0.333333
 0.333333  0.111111  0.222222  0.333333
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[CA][GA][CT]C[TA]C[ATC][AC][GC]A[AC]AC[AG]AC[AC][CTG][ATG]C
--------------------------------------------------------------------------------

Time  1.89 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  21  sites =   7  llr = 128  E-value = 6.7e-003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :46:1a36::9a:::36::17
pos.-specific     C  :3:1:::::1:::37::11:3
probability       G  a1:99:71a91:a7173196:
matrix            T  :14::::3::::::1:17:3:

         bits    2.1 *       *   *
                 1.9 *    *  *  **
                 1.7 *    *  *  **
                 1.5 *  ***  *****     *
Relative         1.3 *  ***  ******    * *
Entropy          1.1 * ***** ****** *  * *
(26.4 bits)      0.8 * ***** ******** ** *
                 0.6 * *******************
                 0.4 * *******************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GAAGGAGAGGAAGGCGATGGA
consensus             CT   AT     C AG  TC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
270199                      125  4.30e-13 CGGCACCGAG GCAGGAGAGGAAGGCGATGGA GGAGATCTTC
17811                        89  4.30e-13 CGGCACCGAG GCAGGAGAGGAAGGCGATGGA GGAGATCTTC
5175                        225  9.12e-09 ATAATGCGTT GATGGAGGGGAAGCGATTGGA ACGTGCGTCT
30478                       176  9.12e-09 ACGTCTCTGC GTTGGAAAGGAAGCCAGTGTC TGCTGCCATG
23142                       346  9.12e-09 GAGTGCTGGT GGTCGAGAGCAAGGTGATGGA AGTGTCCTCT
38025                       287  1.41e-08 CAGCTGGCGC GAAGAAATGGAAGGCGGCGAA TGTGACGCAT
33775                       205  1.66e-08 TGAGTTGGAT GAAGGAGTGGGAGGCGAGCTC GCCTGTGGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
270199                            4.3e-13  124_[+2]_355
17811                             4.3e-13  88_[+2]_391
5175                              9.1e-09  224_[+2]_255
30478                             9.1e-09  175_[+2]_304
23142                             9.1e-09  345_[+2]_134
38025                             1.4e-08  286_[+2]_193
33775                             1.7e-08  204_[+2]_275
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=7
270199                   (  125) GCAGGAGAGGAAGGCGATGGA  1
17811                    (   89) GCAGGAGAGGAAGGCGATGGA  1
5175                     (  225) GATGGAGGGGAAGCGATTGGA  1
30478                    (  176) GTTGGAAAGGAAGCCAGTGTC  1
23142                    (  346) GGTCGAGAGCAAGGTGATGGA  1
38025                    (  287) GAAGAAATGGAAGGCGGCGAA  1
33775                    (  205) GAAGGAGTGGGAGGCGAGCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 9.84874 E= 6.7e-003
  -945   -945    200   -945
    77     30    -80    -91
   118   -945   -945     68
  -945    -70    178   -945
   -81   -945    178   -945
   199   -945   -945   -945
    19   -945    152   -945
   118   -945    -80      9
  -945   -945    200   -945
  -945    -70    178   -945
   177   -945    -80   -945
   199   -945   -945   -945
  -945   -945    200   -945
  -945     30    152   -945
  -945    162    -80    -91
    19   -945    152   -945
   118   -945     20    -91
  -945    -70    -80    141
  -945    -70    178   -945
   -81   -945    120      9
   151     30   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 6.7e-003
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.285714  0.142857  0.142857
 0.571429  0.000000  0.000000  0.428571
 0.000000  0.142857  0.857143  0.000000
 0.142857  0.000000  0.857143  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.000000  0.714286  0.000000
 0.571429  0.000000  0.142857  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.857143  0.000000  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.285714  0.714286  0.000000
 0.000000  0.714286  0.142857  0.142857
 0.285714  0.000000  0.714286  0.000000
 0.571429  0.000000  0.285714  0.142857
 0.000000  0.142857  0.142857  0.714286
 0.000000  0.142857  0.857143  0.000000
 0.142857  0.000000  0.571429  0.285714
 0.714286  0.285714  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
G[AC][AT]GGA[GA][AT]GGAAG[GC]C[GA][AG]TG[GT][AC]
--------------------------------------------------------------------------------

Time  3.65 secs.

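The Motif 2 regular expression is a compact summary of the probability matrix, so it need not accept every site that contributed to the motif. The following sketch, which is not part of the MEME output, checks the expression against the seven Motif 2 sites from the BLOCKS listing; the helper name `matching_sites` is made up for illustration.

```python
import re

# Motif 2 regular expression, copied from the report.
MOTIF2_RE = re.compile(r"G[AC][AT]GGA[GA][AT]GGAAG[GC]C[GA][AG]TG[GT][AC]")

# Motif 2 sites from the BLOCKS listing, keyed by sequence name.
SITES = {
    "270199": "GCAGGAGAGGAAGGCGATGGA",
    "17811": "GCAGGAGAGGAAGGCGATGGA",
    "5175": "GATGGAGGGGAAGCGATTGGA",
    "30478": "GTTGGAAAGGAAGCCAGTGTC",
    "23142": "GGTCGAGAGCAAGGTGATGGA",
    "38025": "GAAGAAATGGAAGGCGGCGAA",
    "33775": "GAAGGAGTGGGAGGCGAGCTC",
}


def matching_sites(pattern, sites):
    """Return the sorted names of sites that the pattern matches in full."""
    return sorted(name for name, seq in sites.items() if pattern.fullmatch(seq))


matched = matching_sites(MOTIF2_RE, SITES)
print(matched)
```

Only the two identical top-scoring sites pass in full; the other five each differ from the expression in at least one column, which suggests the scoring matrix is the better tool for scanning new sequences.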
********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  12  sites =  15  llr = 147  E-value = 4.7e-002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::24:1::3:1:
pos.-specific     C  ::::::51:2::
probability       G  :a8:95:6632a
matrix            T  a::61453157:

         bits    2.1  *         *
                 1.9 **         *
                 1.7 **  *      *
                 1.5 **  *      *
Relative         1.3 *** *      *
Entropy          1.1 ***** *    *
(14.1 bits)      0.8 ***** **  **
                 0.6 ********* **
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGGTGGTGGTTG
consensus              AA TCTAGG
sequence                      C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
5323                         80  1.53e-07 CATACGACGA TGGTGGCGGTTG AGCCTCACTA
270199                      192  3.18e-07 AGTTGGTGGT TGGTGTTGGTTG GTGGTTGACA
17811                       156  3.18e-07 AGTTGGTGGT TGGTGTTGGTTG GTGGTTGACA
38025                       155  9.22e-07 GGTCGTCGGC TGGTGGCGGGTG GCGACCACTT
33775                       185  1.99e-06 GTCTTGGGCG TGGAGGCGGGTG AGTTGGATGA
31150                       166  6.66e-06 ACTTGAGCGA TGGAGGTTATTG TCCAGTGGAT
2170                        273  6.66e-06 AAGTTCCAAG TGGAGGTGAGTG TGCACCAAAG
23142                       256  9.49e-06 TGATTGTAGA TGGTGGTGATGG CACTGCTTTT
22389                       326  1.01e-05 AACTATGCCG TGGAGTTGTTTG CAGCAACATG
20602                       305  1.25e-05 TGCCTGAGTT TGGTGTTCGTTG TTCACTGCTT
2247                         61  2.19e-05 GTGTTGTGGA TGGTGTCTATGG TAGATGTGTG
30478                       268  5.85e-05 GTGTTGCCGA TGATGACTGCTG TTGATGTTCG
27208                        71  6.51e-05 GCACGACGAA TGGAGACTGCGG AGGCGATGTC
5175                         91  1.21e-04 AAACCAACGA TGAAGGCTGCAG CAGCTCCACC
16746                       216  1.64e-04 TGCCGAGTTG TGATTTTGTGTG CGGCGGTGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5323                              1.5e-07  79_[+3]_409
270199                            3.2e-07  191_[+3]_297
17811                             3.2e-07  155_[+3]_333
38025                             9.2e-07  154_[+3]_334
33775                               2e-06  184_[+3]_304
31150                             6.7e-06  165_[+3]_323
2170                              6.7e-06  272_[+3]_216
23142                             9.5e-06  255_[+3]_233
22389                               1e-05  325_[+3]_163
20602                             1.2e-05  304_[+3]_184
2247                              2.2e-05  60_[+3]_428
30478                             5.8e-05  267_[+3]_221
27208                             6.5e-05  70_[+3]_418
5175                              0.00012  90_[+3]_398
16746                             0.00016  215_[+3]_273
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=15
5323                     (   80) TGGTGGCGGTTG  1
270199                   (  192) TGGTGTTGGTTG  1
17811                    (  156) TGGTGTTGGTTG  1
38025                    (  155) TGGTGGCGGGTG  1
33775                    (  185) TGGAGGCGGGTG  1
31150                    (  166) TGGAGGTTATTG  1
2170                     (  273) TGGAGGTGAGTG  1
23142                    (  256) TGGTGGTGATGG  1
22389                    (  326) TGGAGTTGTTTG  1
20602                    (  305) TGGTGTTCGTTG  1
2247                     (   61) TGGTGTCTATGG  1
30478                    (  268) TGATGACTGCTG  1
27208                    (   71) TGGAGACTGCGG  1
5175                     (   91) TGAAGGCTGCAG  1
16746                    (  216) TGATTTTGTGTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 8.93074 E= 4.7e-002
 -1055  -1055  -1055    190
 -1055  -1055    200  -1055
   -33  -1055    168  -1055
    67  -1055  -1055    116
 -1055  -1055    191   -200
   -91  -1055     91     58
 -1055    101  -1055     99
 -1055   -179    127     31
     9  -1055    127   -101
 -1055    -21     10     99
  -191  -1055    -32    145
 -1055  -1055    200  -1055
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 15 E= 4.7e-002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.000000  0.933333  0.066667
 0.133333  0.000000  0.466667  0.400000
 0.000000  0.466667  0.000000  0.533333
 0.000000  0.066667  0.600000  0.333333
 0.266667  0.000000  0.600000  0.133333
 0.000000  0.200000  0.266667  0.533333
 0.066667  0.000000  0.200000  0.733333
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
TG[GA][TA]G[GT][TC][GT][GA][TGC][TG]G
--------------------------------------------------------------------------------

Time  5.50 secs.

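The log-odds and letter-probability matrices for a motif carry closely related information. As a rough sketch, and assuming that each nonzero score is approximately 100 * log2(probability / background), the Python below recomputes the third row of the Motif 3 scoring matrix from the corresponding probability row. The function name `approx_log_odds` is hypothetical, the background values are the rounded figures printed in the command line summary, and zero probabilities are skipped because their large negative scores depend on a prior that this sketch does not model.

```python
import math

# Rounded background letter frequencies as printed in the report.
BACKGROUND = {"A": 0.251, "C": 0.232, "G": 0.249, "T": 0.268}


def approx_log_odds(prob, background):
    """Approximate a score as round(100 * log2(prob / background)).

    Returns None for a zero probability, which this sketch does not handle.
    """
    if prob <= 0.0:
        return None
    return round(100 * math.log2(prob / background))


# Third row of the Motif 3 letter-probability matrix (order A, C, G, T).
row3 = {"A": 0.2, "C": 0.0, "G": 0.8, "T": 0.0}
scores = {base: approx_log_odds(p, BACKGROUND[base]) for base, p in row3.items()}
print(scores)
```

Because the printed background frequencies are rounded to three decimals, other rows may differ from the reported scores by one unit under this approximation.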
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
16746                            5.40e-04  399_[+1(2.18e-07)]_81
17811                            2.32e-19  88_[+2(4.30e-13)]_46_[+3(3.18e-07)]_\
                                           144_[+1(1.15e-11)]_169
20602                            6.52e-02  304_[+3(1.25e-05)]_184
2170                             5.50e-05  272_[+3(6.66e-06)]_116_\
                                           [+1(4.85e-05)]_55_[+1(4.33e-07)]_5
22389                            5.03e-05  325_[+3(1.01e-05)]_1_[+1(4.08e-07)]_\
                                           142
2247                             2.36e-06  60_[+3(2.19e-05)]_235_\
                                           [+1(1.78e-08)]_173
23142                            2.89e-06  255_[+3(9.49e-06)]_78_\
                                           [+2(9.12e-09)]_134
270199                           2.32e-19  11_[+3(4.37e-05)]_101_\
                                           [+2(4.30e-13)]_46_[+3(3.18e-07)]_144_[+1(1.15e-11)]_133
27208                            5.58e-02  70_[+3(6.51e-05)]_418
30478                            1.72e-09  45_[+1(7.88e-08)]_110_\
                                           [+2(9.12e-09)]_71_[+3(5.85e-05)]_221
31150                            1.53e-05  165_[+3(6.66e-06)]_292_\
                                           [+1(2.04e-07)]_11
33775                            1.05e-07  89_[+3(9.34e-05)]_83_[+3(1.99e-06)]_\
                                           8_[+2(1.66e-08)]_275
38025                            1.93e-07  154_[+3(9.22e-07)]_120_\
                                           [+2(1.41e-08)]_193
5175                             2.47e-09  224_[+2(9.12e-09)]_218_\
                                           [+1(5.71e-08)]_17
5323                             1.31e-03  79_[+3(1.53e-07)]_409
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
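The combined block diagrams encode site positions as gap lengths between motif occurrences. The sketch below, which is not produced by MEME, shows one way to turn such a diagram into 1-based start positions. It assumes the diagram has already been joined across its backslash continuation lines, takes the motif widths from the three motif headers, and uses a hypothetical helper named `parse_diagram`.

```python
import re

# Motif widths from the "MOTIF n ... width = w" headers in the report.
MOTIF_WIDTHS = {1: 20, 2: 21, 3: 12}

# Either a motif token such as [+2(4.30e-13)] or a plain gap length.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths):
    """Return ([(motif, start, p_value), ...], total_length) for one diagram."""
    pos = 1  # 1-based position of the next unread base
    sites = []
    for match in TOKEN.finditer(diagram):
        if match.group(4) is not None:
            pos += int(match.group(4))  # skip a gap of this many bases
        else:
            motif = int(match.group(2))
            sites.append((motif, pos, float(match.group(3))))
            pos += widths[motif]  # skip over the motif occurrence itself
    return sites, pos - 1


# Diagram for sequence 17811 with the continuation line joined.
diagram = "88_[+2(4.30e-13)]_46_[+3(3.18e-07)]_144_[+1(1.15e-11)]_169"
sites, length = parse_diagram(diagram, MOTIF_WIDTHS)
print(sites, length)
```

For sequence 17811 the recovered starts (89, 156, and 312) agree with the Start column of the per-motif site tables, and the gaps plus motif widths add up to the training-set sequence length of 500.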