********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/227/227.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
12972                    1.0000    500  52058                    1.0000    500
13093                    1.0000    500  28237                    1.0000    500
292                      1.0000    500  15422                    1.0000    500
49528                    1.0000    500  51604                    1.0000    500
8293                     1.0000    500  54310                    1.0000    500
11356                    1.0000    500  432                      1.0000    500
11409                    1.0000    500  49300                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/227/227.seqs.fa -oc motifs/227 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.270 C 0.235 G 0.247 T 0.248
Background letter frequencies (from dataset with add-one prior applied):
A 0.270 C 0.235 G 0.247 T 0.248
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 14 llr = 165 E-value = 2.0e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  5771186743163:112a4:8
pos.-specific     C  4:121122:314:1622:2::
probability       G  13:48111418::1:66:29:
matrix            T  ::22::::331:7741::112

         bits    2.1
                 1.9                  *
                 1.7                  *
                 1.5                  * *
Relative         1.3                  * **
Entropy          1.0  *  **     **    * **
(17.0 bits)      0.8  ** ** *  *****  * **
                 0.6 *** ****  ***** ** **
                 0.4 *** ***** ******** **
                 0.2 ********* ******** **
                 0.0 ---------------------

Multilevel           AAAGGAAAAAGATTCGGAAGA
consensus            CGTC  CCGC CA TCA C T
sequence                T    TT      C G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
52058                       273  2.80e-09 ACGAAAGGTG AAAGGAGAGTGATTCCGAAGA ACAGACTTTC
8293                        101  4.12e-08 AAGAAAGAGA AAAGGAACTCGCTCTGGACGA CGAGGATGAT
49300                        36  2.20e-07 GATATGTACA AGTGGAAAGCGCTTCGCATGT GCGAACCGAT
12972                       130  2.20e-07 GCCGGTACCA CAACGGAAGTGCATTGGACGA GACACGTCGT
51604                       296  5.60e-07 ATTCGCAGTT CATCGCCAATGATTTGGATGA GCTGCAGGCA
54310                       424  6.18e-07 TTAACCCCAC CGTTGAAGAAGATTCCGAAGA AAGCATGGCA
11409                       133  9.04e-07 GAAACAATCG CAATGAAATCGAATTGCAATT CCCAGAATCC
28237                         2  1.19e-06          A CGAAGAACGAGCTGCGAAGGA AAGAAAAATG
292                         335  1.42e-06 CGCCCTTTAA AACGGCAAAAGATTCTCAGGA CCCGCTCTTT
432                         368  1.69e-06 CGTTGAAAGA AAAGGACAAGCCTTCGAACGT CTTCACCAAA
13093                       356  4.46e-06 ACTTGATTTT GAACGAAATTGAATACAAAGA ACCTGTAAAA
49528                       107  8.53e-06 GTCGAAAAAA AAAGCAAATATATTTTGAGTA ACCCTAGCAT
11356                        78  1.04e-05 AGGGTCTCTT CGAAAACAAGGCACCGGAAGA GCAACAGCGT
15422                       470  4.17e-05 ATGATGGTCA AAATCAGCGCAATGCAGAAGA ATCCTCCCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
52058                             2.8e-09  272_[+1]_207
8293                              4.1e-08  100_[+1]_379
49300                             2.2e-07  35_[+1]_444
12972                             2.2e-07  129_[+1]_350
51604                             5.6e-07  295_[+1]_184
54310                             6.2e-07  423_[+1]_56
11409                               9e-07  132_[+1]_347
28237                             1.2e-06  1_[+1]_478
292                               1.4e-06  334_[+1]_145
432                               1.7e-06  367_[+1]_112
13093                             4.5e-06  355_[+1]_124
49528                             8.5e-06  106_[+1]_373
11356                               1e-05  77_[+1]_402
15422                             4.2e-05  469_[+1]_10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=14
52058                    (  273) AAAGGAGAGTGATTCCGAAGA  1
8293                     (  101) AAAGGAACTCGCTCTGGACGA  1
49300                    (   36) AGTGGAAAGCGCTTCGCATGT  1
12972                    (  130) CAACGGAAGTGCATTGGACGA  1
51604                    (  296) CATCGCCAATGATTTGGATGA  1
54310                    (  424) CGTTGAAGAAGATTCCGAAGA  1
11409                    (  133) CAATGAAATCGAATTGCAATT  1
28237                    (    2) CGAAGAACGAGCTGCGAAGGA  1
292                      (  335) AACGGCAAAAGATTCTCAGGA  1
432                      (  368) AAAGGACAAGCCTTCGAACGT  1
13093                    (  356) GAACGAAATTGAATACAAAGA  1
49528                    (  107) AAAGCAAATATATTTTGAGTA  1
11356                    (   78) CGAAAACAAGGCACCGGAAGA  1
15422                    (  470) AAATCAGCGCAATGCAGAAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6720 bayes= 8.90388 E= 2.0e+002
    89     87   -179  -1045
   140  -1045     21  -1045
   140   -171  -1045    -21
   -92    -13     79    -21
  -191    -72    167  -1045
   154    -72   -179  -1045
   125    -13    -79  -1045
   140    -13   -179  -1045
    40  -1045     53     20
     8     28    -79     20
  -191   -171    167   -179
   108     87  -1045  -1045
     8  -1045  -1045    152
 -1045    -72    -79    152
  -191    128  -1045     53
  -191    -13    121    -80
   -33    -13    121  -1045
   189  -1045  -1045  -1045
    67    -13    -21    -80
 -1045  -1045    179    -80
   154  -1045  -1045    -21
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 14 E= 2.0e+002
 0.500000  0.428571  0.071429  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.714286  0.071429  0.000000  0.214286
 0.142857  0.214286  0.428571  0.214286
 0.071429  0.142857  0.785714  0.000000
 0.785714  0.142857  0.071429  0.000000
 0.642857  0.214286  0.142857  0.000000
 0.714286  0.214286  0.071429  0.000000
 0.357143  0.000000  0.357143  0.285714
 0.285714  0.285714  0.142857  0.285714
 0.071429  0.071429  0.785714  0.071429
 0.571429  0.428571  0.000000  0.000000
 0.285714  0.000000  0.000000  0.714286
 0.000000  0.142857  0.142857  0.714286
 0.071429  0.571429  0.000000  0.357143
 0.071429  0.214286  0.571429  0.142857
 0.214286  0.214286  0.571429  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.428571  0.214286  0.214286  0.142857
 0.000000  0.000000  0.857143  0.142857
 0.785714  0.000000  0.000000  0.214286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[AC][AG][AT][GCT]GA[AC][AC][AGT][ACT]G[AC][TA]T[CT][GC][GAC]A[ACG]G[AT]
--------------------------------------------------------------------------------

Time 1.94 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 14 llr = 127 E-value = 1.0e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  62::7:::4349
pos.-specific     C  ::412:aa:361
probability       G  13391:::14:1
matrix            T  353::a::41::

         bits    2.1      ***
                 1.9      ***
                 1.7      ***
                 1.5    * ***
Relative         1.3    * ***   *
Entropy          1.0    * ***  **
(13.1 bits)      0.8 *  *****  **
                 0.6 *  *****  **
                 0.4 ********* **
                 0.2 ************
                 0.0 ------------

Multilevel           ATCGATCCAGCA
consensus            TGG C   TAA
sequence              AT      C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
54310                       340  4.73e-07 CGAATGCACG ATGGATCCTGCA AAACGCCAGA
28237                       457  1.12e-06 ACATTGGGTA AGCGATCCTCCA GTCTCAACGG
13093                       126  3.81e-06 TGGGGACTAC TTTGATCCAGCA TATATCGAGA
432                          15  4.60e-06 AGCCAGCATC AACGATCCAACA TTCAGTCTTC
11409                       410  1.13e-05 CGGCCGCTGA ATTGCTCCAACA AAGGATTATG
8293                        457  2.16e-05 CCGCAAATCG AGGGATCCGACA ACCCTTATCG
12972                       371  2.55e-05 GTACTTCCAA AACCATCCTGCA ACACGTTAAC
15422                         2  3.60e-05          G GTCGATCCTCAA AACCGAGATT
51604                       350  3.92e-05 ACGCACAGCG TGCGCTCCAGAA GGCCCACCAG
49528                       292  5.87e-05 TTTATTGAGG ATGGGTCCTCAA TATAGGACAG
49300                       394  6.25e-05 TATATTGCAC TGTCATCCTCCA AACACAGTTT
292                         282  8.28e-05 CGAATATTTT TTTGATCCAACC TTGCTCATCA
11356                       334  9.38e-05 TAGATTTGAG AACGATCCAGAG AGTTTCATAT
52058                       383  1.61e-04 GTGCTCACGT ATGGCTCCGTAA GGGAGATTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54310                             4.7e-07  339_[+2]_149
28237                             1.1e-06  456_[+2]_32
13093                             3.8e-06  125_[+2]_363
432                               4.6e-06  14_[+2]_474
11409                             1.1e-05  409_[+2]_79
8293                              2.2e-05  456_[+2]_32
12972                             2.5e-05  370_[+2]_118
15422                             3.6e-05  1_[+2]_487
51604                             3.9e-05  349_[+2]_139
49528                             5.9e-05  291_[+2]_197
49300                             6.3e-05  393_[+2]_95
292                               8.3e-05  281_[+2]_207
11356                             9.4e-05  333_[+2]_155
52058                             0.00016  382_[+2]_106
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=14
54310                    (  340) ATGGATCCTGCA  1
28237                    (  457) AGCGATCCTCCA  1
13093                    (  126) TTTGATCCAGCA  1
432                      (   15) AACGATCCAACA  1
11409                    (  410) ATTGCTCCAACA  1
8293                     (  457) AGGGATCCGACA  1
12972                    (  371) AACCATCCTGCA  1
15422                    (    2) GTCGATCCTCAA  1
51604                    (  350) TGCGCTCCAGAA  1
49528                    (  292) ATGGGTCCTCAA  1
49300                    (  394) TGTCATCCTCCA  1
292                      (  282) TTTGATCCAACC  1
11356                    (  334) AACGATCCAGAG  1
52058                    (  383) ATGGCTCCGTAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.53747 E= 1.0e+003
   125  -1045   -179     20
   -33  -1045     21    101
 -1045     87     21     20
 -1045    -72    179  -1045
   140    -13   -179  -1045
 -1045  -1045  -1045    201
 -1045    209  -1045  -1045
 -1045    209  -1045  -1045
    67  -1045    -79     79
     8     28     53   -179
    40    145  -1045  -1045
   167   -171   -179  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 1.0e+003
 0.642857  0.000000  0.071429  0.285714
 0.214286  0.000000  0.285714  0.500000
 0.000000  0.428571  0.285714  0.285714
 0.000000  0.142857  0.857143  0.000000
 0.714286  0.214286  0.071429  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.000000  0.142857  0.428571
 0.285714  0.285714  0.357143  0.071429
 0.357143  0.642857  0.000000  0.000000
 0.857143  0.071429  0.071429  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[AT][TGA][CGT]G[AC]TCC[AT][GAC][CA]A
--------------------------------------------------------------------------------

Time 3.76 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 5 llr = 76 E-value = 2.1e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::a::::6:24:82:
pos.-specific     C  88::828a:a:48:88
probability       G  2:a::42:4:8:::::
matrix            T  :2::24:::::222:2

         bits    2.1   *    * *
                 1.9   **   * *
                 1.7   **   * *
                 1.5   **   * *
Relative         1.3 ***** ** ** ****
Entropy          1.0 ***** ***** ****
(21.9 bits)      0.8 ***** ***** ****
                 0.6 ***** ***** ****
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CCGACGCCACGACACC
consensus            GT  TTG G ACTTAT
sequence                  C     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ----------------
8293                        148  2.47e-09 GACGCTGACG CCGACGCCGCGTCACC TAAAGACAAT
15422                       124  2.01e-08 GAAATCTGCT CCGACCCCACGACACT GGCCGATATT
12972                       250  3.96e-08 AGACCCTGCA CCGATGCCACACCACC ACTGGATCAT
49300                        85  1.63e-07 AGGAAAGTTG CTGACTCCACGATTCC CATCTGTAAC
292                         151  2.44e-07 TGTGGCGGTG GCGACTGCGCGCCAAC GGTCACCCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8293                              2.5e-09  147_[+3]_337
15422                               2e-08  123_[+3]_361
12972                               4e-08  249_[+3]_235
49300                             1.6e-07  84_[+3]_400
292                               2.4e-07  150_[+3]_334
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=5
8293                     (  148) CCGACGCCGCGTCACC  1
15422                    (  124) CCGACCCCACGACACT  1
12972                    (  250) CCGATGCCACACCACC  1
49300                    (   85) CTGACTCCACGATTCC  1
292                      (  151) GCGACTGCGCGCCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 10.6579 E= 2.1e+003
  -897    177    -31   -897
  -897    177   -897    -31
  -897   -897    201   -897
   189   -897   -897   -897
  -897    177   -897    -31
  -897    -23     69     69
  -897    177    -31   -897
  -897    209   -897   -897
   115   -897     69   -897
  -897    209   -897   -897
   -43   -897    169   -897
    57     77   -897    -31
  -897    177   -897    -31
   157   -897   -897    -31
   -43    177   -897   -897
  -897    177   -897    -31
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 2.1e+003
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.200000  0.400000  0.400000
 0.000000  0.800000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.400000  0.400000  0.000000  0.200000
 0.000000  0.800000  0.000000  0.200000
 0.800000  0.000000  0.000000  0.200000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][CT]GA[CT][GTC][CG]C[AG]C[GA][ACT][CT][AT][CA][CT]
--------------------------------------------------------------------------------

Time 5.56 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12972                            8.11e-09  129_[+1(2.20e-07)]_99_\
    [+3(3.96e-08)]_105_[+2(2.55e-05)]_118
52058                            1.24e-05  272_[+1(2.80e-09)]_207
13093                            2.44e-04  125_[+2(3.81e-06)]_218_\
    [+1(4.46e-06)]_124
28237                            3.99e-05  1_[+1(1.19e-06)]_434_[+2(1.12e-06)]_\
    32
292                              6.78e-07  150_[+3(2.44e-07)]_115_\
    [+2(8.28e-05)]_41_[+1(1.42e-06)]_145
15422                            7.11e-07  1_[+2(3.60e-05)]_110_[+3(2.01e-08)]_\
    330_[+1(4.17e-05)]_10
49528                            4.12e-03  106_[+1(8.53e-06)]_164_\
    [+2(5.87e-05)]_197
51604                            3.42e-04  295_[+1(5.60e-07)]_33_\
    [+2(3.92e-05)]_139
8293                             1.12e-10  100_[+1(4.12e-08)]_26_\
    [+3(2.47e-09)]_293_[+2(2.16e-05)]_32
54310                            5.04e-06  339_[+2(4.73e-07)]_72_\
    [+1(6.18e-07)]_56
11356                            7.43e-03  77_[+1(1.04e-05)]_235_\
    [+2(9.38e-05)]_155
432                              2.71e-05  14_[+2(4.60e-06)]_341_\
    [+1(1.69e-06)]_112
11409                            6.33e-05  132_[+1(9.04e-07)]_256_\
    [+2(1.13e-05)]_79
49300                            6.72e-08  35_[+1(2.20e-07)]_28_[+3(1.63e-07)]_\
    293_[+2(6.25e-05)]_95
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************