********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/356/356.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
24493                    1.0000    500  47919                    1.0000    500
14778                    1.0000    500  44327                    1.0000    500
54257                    1.0000    500  51720                    1.0000    500
20188                    1.0000    500  45778                    1.0000    500
33318                    1.0000    500  46676                    1.0000    500
47145                    1.0000    500  41149                    1.0000    500
47278                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/356/356.seqs.fa -oc motifs/356 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.258 C 0.251 G 0.228 T 0.264
Background letter frequencies (from dataset with add-one prior applied):
A 0.258 C 0.251 G 0.228 T 0.264
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  19  sites =   6  llr = 102  E-value = 5.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :53527aa:75:55:::7a
pos.-specific     C  83:::2::::583::::::
probability       G  2:7272::83::2::aa3:
matrix            T  :2:32:::2::2:5a::::

         bits    2.1                **
                 1.9       **      *** *
                 1.7       **      *** *
                 1.5       ***     *** *
Relative         1.3 *     ***  *  *** *
Entropy          1.1 * *   ******  *****
(24.6 bits)      0.9 * * * ****** ******
                 0.6 * * ***************
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           CAGAGAAAGAACAATGGAA
consensus             CAT     GC CT   G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
24493                       182  2.71e-11 GAACGAAACC CAGAGAAAGAACCATGGAA TAGGGCTTCT
47278                        50  1.34e-09 TCCGGTCTGC CAGAAAAAGAACAATGGGA CCGTGAATCG
20188                       181  1.98e-09 GCAGAGCCGA CCAGGAAAGACCATTGGAA GACAGGAAAT
45778                       420  2.31e-08 ACGAAACCAA CTGTGAAAGACTGTTGGAA CATCGTGGAA
33318                       460  2.47e-08 TTTCGGTATG CCGATCAAGGCCATTGGGA CAAAATACCT
44327                       383  9.94e-08 TATTTGGTAT GAATGGAATGACCATGGAA CGAAAAATCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24493                             2.7e-11  181_[+1]_300
47278                             1.3e-09  49_[+1]_432
20188                               2e-09  180_[+1]_301
45778                             2.3e-08  419_[+1]_62
33318                             2.5e-08  459_[+1]_22
44327                             9.9e-08  382_[+1]_99
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=6
24493                    (  182) CAGAGAAAGAACCATGGAA  1
47278                    (   50) CAGAAAAAGAACAATGGGA  1
20188                    (  181) CCAGGAAAGACCATTGGAA  1
45778                    (  420) CTGTGAAAGACTGTTGGAA  1
33318                    (  460) CCGATCAAGGCCATTGGGA  1
44327                    (  383) GAATGGAATGACCATGGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 6266 bayes= 10.4748 E= 5.4e+001
  -923    173    -45   -923
    95     41   -923    -66
    37   -923    155   -923
    95   -923    -45     34
   -63   -923    155    -66
   137    -59    -45   -923
   195   -923   -923   -923
   195   -923   -923   -923
  -923   -923    187    -66
   137   -923     55   -923
    95    100   -923   -923
  -923    173   -923    -66
    95     41    -45   -923
    95   -923   -923     92
  -923   -923   -923    192
  -923   -923    213   -923
  -923   -923    213   -923
   137   -923     55   -923
   195   -923   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 6 E= 5.4e+001
 0.000000  0.833333  0.166667  0.000000
 0.500000  0.333333  0.000000  0.166667
 0.333333  0.000000  0.666667  0.000000
 0.500000  0.000000  0.166667  0.333333
 0.166667  0.000000  0.666667  0.166667
 0.666667  0.166667  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.666667  0.000000  0.333333  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.500000  0.333333  0.166667  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[AC][GA][AT]GAAAG[AG][AC]C[AC][AT]TGG[AG]A
--------------------------------------------------------------------------------




Time  1.51 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  15  sites =  13  llr = 135  E-value = 7.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :82::::3:1a3::5
pos.-specific     C  222152:551::141
probability       G  7:14:82258:7:4:
matrix            T  2:555:8:11::925

         bits    2.1
                 1.9           *
                 1.7           *
                 1.5      *    * *
Relative         1.3  *   **   ***
Entropy          1.1  *  ***   ***
(14.9 bits)      0.9 **  ***  ****
                 0.6 ** ********** *
                 0.4 ** ************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GATTCGTCCGAGTCA
consensus              AGT GAG  A GT
sequence                          T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
51720                       441  2.85e-08 CATGTCGCTA GATTCGTCGGAGTTA GTCGATCAGG
24493                       222  7.26e-07 GATTACTTAC TATTCGTACGAGTCT ACCTCGTTTG
46676                       390  8.21e-07 TTTGGAAGCA GAAGTGTAGGAGTTT CGTATTGTTA
54257                        74  9.31e-07 TACCACGCTG GCTTTGTACGAGTGA GCACGAAACA
33318                       222  1.05e-06 AACATTTGGA GACGTGTCGGAATGA AAAAATCTCA
47278                       379  2.52e-06 CTGCTTTTTC GATGCCGCCGAGTGA CGCCCGTATC
47919                       165  4.55e-06 TCTGCCGGGG GATTTCGCCGAGTTT TCATCGGCTA
45778                       189  5.43e-06 TAACCAACTA GAATCGTACTAGTCT AGAATAGCAT
44327                       142  2.26e-05 TACGTCGACG TCGGCGTCGGAGTCA TATCATTTCT
14778                       329  2.94e-05 GCAAAACATT GACTTGGCGCAATGT GAACATGACT
20188                       116  3.35e-05 TCTGTCTACA CATCCGTCCAAGTGA TCTCGTTCAG
47145                       332  4.31e-05 CCTACGCATG GATTCGTGGGAACCC GAGACGAACG
41149                       126  6.06e-05 AATGAGATGG CAAGTGTGTGAATCT CATAAACTCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
51720                             2.8e-08  440_[+2]_45
24493                             7.3e-07  221_[+2]_264
46676                             8.2e-07  389_[+2]_96
54257                             9.3e-07  73_[+2]_412
33318                             1.1e-06  221_[+2]_264
47278                             2.5e-06  378_[+2]_107
47919                             4.6e-06  164_[+2]_321
45778                             5.4e-06  188_[+2]_297
44327                             2.3e-05  141_[+2]_344
14778                             2.9e-05  328_[+2]_157
20188                             3.4e-05  115_[+2]_370
47145                             4.3e-05  331_[+2]_154
41149                             6.1e-05  125_[+2]_360
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=13
51720                    (  441) GATTCGTCGGAGTTA  1
24493                    (  222) TATTCGTACGAGTCT  1
46676                    (  390) GAAGTGTAGGAGTTT  1
54257                    (   74) GCTTTGTACGAGTGA  1
33318                    (  222) GACGTGTCGGAATGA  1
47278                    (  379) GATGCCGCCGAGTGA  1
47919                    (  165) GATTTCGCCGAGTTT  1
45778                    (  189) GAATCGTACTAGTCT  1
44327                    (  142) TCGGCGTCGGAGTCA  1
14778                    (  329) GACTTGGCGCAATGT  1
20188                    (  116) CATCCGTCCAAGTGA  1
47145                    (  332) GATTCGTGGGAACCC  1
41149                    (  126) CAAGTGTGTGAATCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6318 bayes= 8.92184 E= 7.3e+002
 -1035    -70    160    -78
   171    -70  -1035  -1035
   -16    -70   -156    103
 -1035   -170     76    103
 -1035    110  -1035     81
 -1035    -70    189  -1035
 -1035  -1035      2    154
    25    110    -57  -1035
 -1035     88    102   -178
  -174   -170    176   -178
   195  -1035  -1035  -1035
    25  -1035    160  -1035
 -1035   -170  -1035    181
 -1035     62     76    -19
    84   -170  -1035     81
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 13 E= 7.3e+002
 0.000000  0.153846  0.692308  0.153846
 0.846154  0.153846  0.000000  0.000000
 0.230769  0.153846  0.076923  0.538462
 0.000000  0.076923  0.384615  0.538462
 0.000000  0.538462  0.000000  0.461538
 0.000000  0.153846  0.846154  0.000000
 0.000000  0.000000  0.230769  0.769231
 0.307692  0.538462  0.153846  0.000000
 0.000000  0.461538  0.461538  0.076923
 0.076923  0.076923  0.769231  0.076923
 1.000000  0.000000  0.000000  0.000000
 0.307692  0.000000  0.692308  0.000000
 0.000000  0.076923  0.000000  0.923077
 0.000000  0.384615  0.384615  0.230769
 0.461538  0.076923  0.000000  0.461538
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GA[TA][TG][CT]G[TG][CA][CG]GA[GA]T[CGT][AT]
--------------------------------------------------------------------------------




Time  2.82 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =  13  llr = 120  E-value = 4.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  2:5:::22::::
pos.-specific     C  58:a:822:1:4
probability       G  322:912:3:1:
matrix            T  1:3:11577996

         bits    2.1
                 1.9    *
                 1.7    **
                 1.5  * **    **
Relative         1.3  * ***   **
Entropy          1.1  * ***  ****
(13.3 bits)      0.9  * ***  ****
                 0.6  ***** *****
                 0.4  ***** *****
                 0.2 ************
                 0.0 ------------

Multilevel           CCACGCTTTTTT
consensus            G T     G  C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
46676                       100  5.17e-07 GGAGATGAGA GCTCGCTTTTTT TCAAATCTCT
47278                       162  3.92e-06 GACCCGGGAT CGACGCTTTTTC GTTGGACGAA
47919                       261  8.02e-06 AGTAGTAGTA GCGCGCGTTTTT CAAAGAGTCT
24493                       263  1.00e-05 TATGTACGTA CCTCGCTAGTTT TCGATCGAGG
51720                       414  1.19e-05 TGAACCATCG CCTCGGTTTTTT TCGAGCATGT
47145                       422  1.35e-05 CCCACTCGTC CCACGCGCTTTC CTCCTTTCAC
20188                       261  1.58e-05 AATCACGTCT GGACGCCTTTTT TTCGTTCCAT
44327                        75  2.25e-05 CAGTGCCCTA CCACGCTTGCTC GTTTACTACT
33318                       441  2.67e-05 ATGTTGTGTC CCACGTCTTTTT CGGTATGCCG
41149                        83  3.99e-05 CTAATCGAGA ACGCGCTCTTTC ATGAACCAAC
54257                       302  4.36e-05 TTCATTTACG GCACTCATTTTT TCGCGCGACA
45778                       113  7.53e-05 ATAGCTCGTT ACACGCAAGTTC ATAGATGTGC
14778                       454  1.15e-04 ACACCGTCTG TCTCGCTTGTGT GATCCTTCCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46676                             5.2e-07  99_[+3]_389
47278                             3.9e-06  161_[+3]_327
47919                               8e-06  260_[+3]_228
24493                               1e-05  262_[+3]_226
51720                             1.2e-05  413_[+3]_75
47145                             1.3e-05  421_[+3]_67
20188                             1.6e-05  260_[+3]_228
44327                             2.2e-05  74_[+3]_414
33318                             2.7e-05  440_[+3]_48
41149                               4e-05  82_[+3]_406
54257                             4.4e-05  301_[+3]_187
45778                             7.5e-05  112_[+3]_376
14778                             0.00012  453_[+3]_35
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=13
46676                    (  100) GCTCGCTTTTTT  1
47278                    (  162) CGACGCTTTTTC  1
47919                    (  261) GCGCGCGTTTTT  1
24493                    (  263) CCTCGCTAGTTT  1
51720                    (  414) CCTCGGTTTTTT  1
47145                    (  422) CCACGCGCTTTC  1
20188                    (  261) GGACGCCTTTTT  1
44327                    (   75) CCACGCTTGCTC  1
33318                    (  441) CCACGTCTTTTT  1
41149                    (   83) ACGCGCTCTTTC  1
54257                    (  302) GCACTCATTTTT  1
45778                    (  113) ACACGCAAGTTC  1
14778                    (  454) TCTCGCTTGTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 8.93074 E= 4.9e+002
   -74     88     43   -178
 -1035    175    -57  -1035
   106  -1035    -57     22
 -1035    200  -1035  -1035
 -1035  -1035    202   -178
 -1035    175   -156   -178
   -74    -70    -57    103
   -74    -70  -1035    139
 -1035  -1035     43    139
 -1035   -170  -1035    181
 -1035  -1035   -156    181
 -1035     62  -1035    122
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 4.9e+002
 0.153846  0.461538  0.307692  0.076923
 0.000000  0.846154  0.153846  0.000000
 0.538462  0.000000  0.153846  0.307692
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.923077  0.076923
 0.000000  0.846154  0.076923  0.076923
 0.153846  0.153846  0.153846  0.538462
 0.153846  0.153846  0.000000  0.692308
 0.000000  0.000000  0.307692  0.692308
 0.000000  0.076923  0.000000  0.923077
 0.000000  0.000000  0.076923  0.923077
 0.000000  0.384615  0.000000  0.615385
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CG]C[AT]CGCTT[TG]TT[TC]
--------------------------------------------------------------------------------




Time  4.15 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24493                            1.18e-11  181_[+1(2.71e-11)]_21_\
    [+2(7.26e-07)]_26_[+3(1.00e-05)]_226
47919                            1.16e-04  164_[+2(4.55e-06)]_81_\
    [+3(8.02e-06)]_228
14778                            1.54e-02  328_[+2(2.94e-05)]_157
44327                            1.15e-06  74_[+3(2.25e-05)]_55_[+2(2.26e-05)]_\
    226_[+1(9.94e-08)]_99
54257                            7.21e-04  73_[+2(9.31e-07)]_213_\
    [+3(4.36e-05)]_187
51720                            5.98e-07  230_[+1(7.30e-05)]_164_\
    [+3(1.19e-05)]_15_[+2(2.85e-08)]_45
20188                            3.38e-08  115_[+2(3.35e-05)]_50_\
    [+1(1.98e-09)]_61_[+3(1.58e-05)]_228
45778                            2.49e-07  112_[+3(7.53e-05)]_64_\
    [+2(5.43e-06)]_216_[+1(2.31e-08)]_62
33318                            2.33e-08  221_[+2(1.05e-06)]_204_\
    [+3(2.67e-05)]_7_[+1(2.47e-08)]_22
46676                            1.08e-05  99_[+3(5.17e-07)]_278_\
    [+2(8.21e-07)]_96
47145                            4.76e-03  331_[+2(4.31e-05)]_75_\
    [+3(1.35e-05)]_67
41149                            1.17e-02  82_[+3(3.99e-05)]_31_[+2(6.06e-05)]_\
    360
47278                            6.06e-10  49_[+1(1.34e-09)]_93_[+3(3.92e-06)]_\
    205_[+2(2.52e-06)]_107
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************