********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/393/393.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10488                    1.0000    500  10985                    1.0000    500
11853                    1.0000    500  20912                    1.0000    500
21071                    1.0000    500  21590                    1.0000    500
23778                    1.0000    500  23832                    1.0000    500
24418                    1.0000    500  25573                    1.0000    500
3630                     1.0000    500  5419                     1.0000    500
7056                     1.0000    500  7596                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/393/393.seqs.fa -oc motifs/393 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.259 C 0.224 G 0.245 T 0.273
Background letter frequencies (from dataset with add-one prior applied):
A 0.259 C 0.224 G 0.245 T 0.273
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 16 sites = 10 llr = 127 E-value = 2.0e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3:5:33::4:::::::
pos.-specific     C  714a2378:9756a:1
probability       G  :31::2:1:11:::1:
matrix            T  :6::52316:254:99

         bits    2.2    *         *
                 1.9    *         *
                 1.7    *     *   *
                 1.5    *     *   ***
Relative         1.3 *  *  ** *   ***
Entropy          1.1 *  *  ** * *****
(18.3 bits)      0.9 *  *  **********
                 0.6 ****  **********
                 0.4 ***** **********
                 0.2 ***** **********
                 0.0 ----------------

Multilevel           CTACTACCTCCCCCTT
consensus            AGC ACT A TTT
sequence                 CG
                          T

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
3630                        469  1.20e-09 TGCGGTACTA CTACTACCTCCTCCTT GAGTGGATTG
7596                         62  1.29e-08 ATTGTGCTTA CTACTGCCACCTCCTT CACCACAGTC
11853                       272  5.35e-08 GCATTACTTT CTCCCACCACCCTCTT ACCGTCGTCA
23778                       103  6.08e-08 CCAGCGTCCA CTCCACTCACCCCCTT AGGTCCCAAA
21071                        43  6.09e-07 AAGTACCCAC CGACTCTCTCTCTCTT CGAGCCACTT
25573                       336  1.29e-06 TAACTATTCG CGACTCTCTCGTTCTT GTCATAATCT
5419                        282  3.37e-06 TACCCCAACG CGACTTCTTCCTCCTC GCCAGCATCC
10985                       466  3.58e-06 AAACGACGCC ACCCAACCTGCTCCTT CCCACGTTTC
24418                       337  4.78e-06 ATCCACTCCA ATCCAGCGTCTCTCTT TGTCGCTGTC
23832                       242  5.05e-06 GCTCGTTGCG ATGCCTCCACCCCCGT CGTATCGTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3630                              1.2e-09  468_[+1]_16
7596                              1.3e-08  61_[+1]_423
11853                             5.3e-08  271_[+1]_213
23778                             6.1e-08  102_[+1]_382
21071                             6.1e-07  42_[+1]_442
25573                             1.3e-06  335_[+1]_149
5419                              3.4e-06  281_[+1]_203
10985                             3.6e-06  465_[+1]_19
24418                             4.8e-06  336_[+1]_148
23832                               5e-06  241_[+1]_243
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=10
3630                     (  469) CTACTACCTCCTCCTT  1
7596                     (   62) CTACTGCCACCTCCTT  1
11853                    (  272) CTCCCACCACCCTCTT  1
23778                    (  103) CTCCACTCACCCCCTT  1
21071                    (   43) CGACTCTCTCTCTCTT  1
25573                    (  336) CGACTCTCTCGTTCTT  1
5419                     (  282) CGACTTCTTCCTCCTC  1
10985                    (  466) ACCCAACCTGCTCCTT  1
24418                    (  337) ATCCAGCGTCTCTCTT  1
23832                    (  242) ATGCCTCCACCCCCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 10.3496 E= 2.0e+000
    21    164   -997   -997
  -997   -116     29    114
    95     84   -129   -997
  -997    216   -997   -997
    21    -16   -997     87
    21     42    -29    -45
  -997    164   -997     14
  -997    184   -129   -144
    63   -997   -997    114
  -997    201   -129   -997
  -997    164   -129    -45
  -997    116   -997     87
  -997    142   -997     55
  -997    216   -997   -997
  -997   -997   -129    172
  -997   -116   -997    172
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 2.0e+000
 0.300000  0.700000  0.000000  0.000000
 0.000000  0.100000  0.300000  0.600000
 0.500000  0.400000  0.100000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.300000  0.200000  0.000000  0.500000
 0.300000  0.300000  0.200000  0.200000
 0.000000  0.700000  0.000000  0.300000
 0.000000  0.800000  0.100000  0.100000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.900000  0.100000  0.000000
 0.000000  0.700000  0.100000  0.200000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.600000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.100000  0.000000  0.900000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CA][TG][AC]C[TAC][ACGT][CT]C[TA]C[CT][CT][CT]CTT
--------------------------------------------------------------------------------

Time  1.91 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 12 sites = 14 llr = 132 E-value = 1.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  22:16::a268:
pos.-specific     C  118::4::::1:
probability       G  64:91:8:8319
matrix            T  :42:362::1:1

         bits    2.2
                 1.9        *
                 1.7    *   *   *
                 1.5    *   *   *
Relative         1.3   **  ***  *
Entropy          1.1   ** **** **
(13.6 bits)      0.9   ** *******
                 0.6 * **********
                 0.4 * **********
                 0.2 ************
                 0.0 ------------

Multilevel           GGCGATGAGAAG
consensus            ATT TCT AG
sequence              A

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
23832                        64  1.75e-07 AGGGAGAGGC GGCGACGAGAAG TTAGAGCAGT
5419                         46  5.25e-07 TTCGCTCATG GTCGATGAGGAG GCACGGAGGG
20912                       317  5.25e-07 TGCTACGACC GACGACGAGAAG AGCAACTTGA
11853                         2  2.09e-06          G GTTGATGAGAAG GGAGGCAATG
21071                       253  5.87e-06 CCTTGTGGAT AGCGATGAAAAG CCAAAGGATG
24418                        84  6.50e-06 GACGGTTGTC GTTGTTGAGAAG GAACAGGGTC
10488                       298  1.56e-05 ACTTATGGAT AGCGTTGAAAAG AAAAAGTAGG
21590                        93  2.21e-05 TGCAGTGGGC GTTGTCGAGGAG TCTTTTAAGG
10985                       321  2.21e-05 AAAGTGCCGT GCCGGTGAGAAG GAGGCCGGAG
7056                        414  4.21e-05 AGACATCATC CACGATGAGGGG AGGTAATGCT
23778                       354  6.70e-05 GGAGTCGACA GTCGACTAGGCG ACTTTTTCTG
7596                        354  9.85e-05 GTTGACTCTT CGCAATTAGAAG CTTGATTTTT
3630                        295  1.04e-04 TGCGAGGGTT AGCGACTAGAAT GAAGTCGCTC
25573                        74  2.05e-04 TATTTTCGTT GACGTTGAATGG TTGGCGCTAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23832                             1.7e-07  63_[+2]_425
5419                              5.3e-07  45_[+2]_443
20912                             5.3e-07  316_[+2]_172
11853                             2.1e-06  1_[+2]_487
21071                             5.9e-06  252_[+2]_236
24418                             6.5e-06  83_[+2]_405
10488                             1.6e-05  297_[+2]_191
21590                             2.2e-05  92_[+2]_396
10985                             2.2e-05  320_[+2]_168
7056                              4.2e-05  413_[+2]_75
23778                             6.7e-05  353_[+2]_135
7596                              9.8e-05  353_[+2]_135
3630                               0.0001  294_[+2]_194
25573                              0.0002  73_[+2]_415
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=14
23832                    (   64) GGCGACGAGAAG  1
5419                     (   46) GTCGATGAGGAG  1
20912                    (  317) GACGACGAGAAG  1
11853                    (    2) GTTGATGAGAAG  1
21071                    (  253) AGCGATGAAAAG  1
24418                    (   84) GTTGTTGAGAAG  1
10488                    (  298) AGCGTTGAAAAG  1
21590                    (   93) GTTGTCGAGGAG  1
10985                    (  321) GCCGGTGAGAAG  1
7056                     (  414) CACGATGAGGGG  1
23778                    (  354) GTCGACTAGGCG  1
7596                     (  354) CGCAATTAGAAG  1
3630                     (  295) AGCGACTAGAAT  1
25573                    (   74) GACGTTGAATGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 8.93074 E= 1.6e+001
   -27    -65    139  -1045
   -27   -164     54     39
 -1045    181  -1045    -35
  -185  -1045    192  -1045
   131  -1045   -177      7
 -1045     67  -1045    124
 -1045  -1045    168    -35
   195  -1045  -1045  -1045
   -27  -1045    168  -1045
   131  -1045     22   -193
   160   -164    -78  -1045
 -1045  -1045    192   -193
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 1.6e+001
 0.214286  0.142857  0.642857  0.000000
 0.214286  0.071429  0.357143  0.357143
 0.000000  0.785714  0.000000  0.214286
 0.071429  0.000000  0.928571  0.000000
 0.642857  0.000000  0.071429  0.285714
 0.000000  0.357143  0.000000  0.642857
 0.000000  0.000000  0.785714  0.214286
 1.000000  0.000000  0.000000  0.000000
 0.214286  0.000000  0.785714  0.000000
 0.642857  0.000000  0.285714  0.071429
 0.785714  0.071429  0.142857  0.000000
 0.000000  0.000000  0.928571  0.071429
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GA][GTA][CT]G[AT][TC][GT]A[GA][AG]AG
--------------------------------------------------------------------------------

Time  3.70 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 10 llr = 109 E-value = 1.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::3::2:32:::
pos.-specific     C  9::a1:a:1923
probability       G  :9::9::3:15:
matrix            T  117::8:47:37

         bits    2.2    *  *
                 1.9    *  *
                 1.7 *  *  *  *
                 1.5 ** ** *  *
Relative         1.3 ** ** *  *
Entropy          1.1 *******  * *
(15.7 bits)      0.9 ******* ** *
                 0.6 ******* ** *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGTCGTCTTCGT
consensus              A  A AA TC
sequence                    G  C

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
21590                       450  1.07e-07 CGCTCCCTCT CGTCGTCGTCGT CATAGCCTGC
11853                       290  5.14e-07 ACCCTCTTAC CGTCGTCATCGC GCAATCCTAC
20912                         2  7.67e-07          T CGTCGTCATCCT CTTTGAGGAG
24418                       458  2.63e-06 AAAGATACAC CGACGTCGACGT CTGACAACTG
7056                        488  4.64e-06 TTTATCTTAT CGACGTCTACTT C
23832                       101  8.16e-06 TGGACGGCAA TGACGTCTTCGT GCGCGGGGTC
10488                       344  8.16e-06 GTTGTATTGA CTTCGTCGTCTT GTGCTGAAGT
10985                       191  1.07e-05 TCTGGAACTG CGTCGTCTTGTC GGAGATTGGT
5419                        227  1.38e-05 TACGAACTAC CGTCGACTCCCT CTCGTCGGAC
7596                        431  1.83e-05 CCGAACACTT CGTCCACATCGC ACCGACGACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21590                             1.1e-07  449_[+3]_39
11853                             5.1e-07  289_[+3]_199
20912                             7.7e-07  1_[+3]_487
24418                             2.6e-06  457_[+3]_31
7056                              4.6e-06  487_[+3]_1
23832                             8.2e-06  100_[+3]_388
10488                             8.2e-06  343_[+3]_145
10985                             1.1e-05  190_[+3]_298
5419                              1.4e-05  226_[+3]_262
7596                              1.8e-05  430_[+3]_58
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
21590                    (  450) CGTCGTCGTCGT  1
11853                    (  290) CGTCGTCATCGC  1
20912                    (    2) CGTCGTCATCCT  1
24418                    (  458) CGACGTCGACGT  1
7056                     (  488) CGACGTCTACTT  1
23832                    (  101) TGACGTCTTCGT  1
10488                    (  344) CTTCGTCGTCTT  1
10985                    (  191) CGTCGTCTTGTC  1
5419                     (  227) CGTCGACTCCCT  1
7596                     (  431) CGTCCACATCGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 10.3614 E= 1.6e+001
  -997    201   -997   -144
  -997   -997    188   -144
    21   -997   -997    136
  -997    216   -997   -997
  -997   -116    188   -997
   -37   -997   -997    155
  -997    216   -997   -997
    21   -997     29     55
   -37   -116   -997    136
  -997    201   -129   -997
  -997    -16    103     14
  -997     42   -997    136
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 1.6e+001
 0.000000  0.900000  0.000000  0.100000
 0.000000  0.000000  0.900000  0.100000
 0.300000  0.000000  0.000000  0.700000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.100000  0.900000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.300000  0.000000  0.300000  0.400000
 0.200000  0.100000  0.000000  0.700000
 0.000000  0.900000  0.100000  0.000000
 0.000000  0.200000  0.500000  0.300000
 0.000000  0.300000  0.000000  0.700000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CG[TA]CG[TA]C[TAG][TA]C[GTC][TC]
--------------------------------------------------------------------------------

Time  5.39 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10488                            1.52e-03  297_[+2(1.56e-05)]_34_\
    [+3(8.16e-06)]_145
10985                            1.43e-05  190_[+3(1.07e-05)]_118_\
    [+2(2.21e-05)]_133_[+1(3.58e-06)]_19
11853                            2.39e-09  1_[+2(2.09e-06)]_258_[+1(5.35e-08)]_\
    2_[+3(5.14e-07)]_199
20912                            1.16e-05  1_[+3(7.67e-07)]_303_[+2(5.25e-07)]_\
    172
21071                            5.17e-05  42_[+1(6.09e-07)]_194_\
    [+2(5.87e-06)]_236
21590                            2.88e-05  92_[+2(2.21e-05)]_345_\
    [+3(1.07e-07)]_39
23778                            3.84e-05  102_[+1(6.08e-08)]_235_\
    [+2(6.70e-05)]_135
23832                            2.00e-07  63_[+2(1.75e-07)]_25_[+3(8.16e-06)]_\
    129_[+1(5.05e-06)]_243
24418                            1.80e-06  83_[+2(6.50e-06)]_241_\
    [+1(4.78e-06)]_105_[+3(2.63e-06)]_31
25573                            2.58e-04  47_[+3(9.43e-05)]_276_\
    [+1(1.29e-06)]_149
3630                             1.73e-06  468_[+1(1.20e-09)]_16
5419                             6.07e-07  45_[+2(5.25e-07)]_169_\
    [+3(1.38e-05)]_43_[+1(3.37e-06)]_203
7056                             2.21e-03  413_[+2(4.21e-05)]_62_\
    [+3(4.64e-06)]_1
7596                             5.68e-07  61_[+1(1.29e-08)]_276_\
    [+2(9.85e-05)]_65_[+3(1.83e-05)]_58
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************