********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/277/277.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10723                    1.0000    500  11118                    1.0000    500
1543                     1.0000    500  1843                     1.0000    500
21087                    1.0000    500  21263                    1.0000    500
22033                    1.0000    500  24385                    1.0000    500
263212                   1.0000    500  263213                   1.0000    500
2642                     1.0000    500  35133                    1.0000    500
bd779                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/277/277.seqs.fa -oc motifs/277 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.247 C 0.244 G 0.241 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.247 C 0.244 G 0.241 T 0.268
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 20 sites = 7 llr = 126 E-value = 3.6e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :73::1:4:::a::6:46:4
pos.-specific     C  3::::311::::::3::31:
probability       G  71:9a:7::aa:7a:96196
matrix            T  :171:614a:::3:11::::

         bits    2.1     *    *** *
                 1.8     *   **** *
                 1.6     *   **** *
                 1.4    **   **** * *  *
Relative         1.2 *  **   ****** *  *
Entropy          1.0 * ***   ****** ** **
(26.1 bits)      0.8 ***** * ****** ** **
                 0.6 ********************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GATGGTGATGGAGGAGGAGG
consensus            C A  C T    T C AC A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
10723                       289  1.05e-11 GGTCGTCGAG GATGGTGTTGGAGGCGGAGG CCAGTGATAC
bd779                       400  1.44e-10 GGTTTGTGGC GATGGCGCTGGAGGAGGAGA GACGATCGAA
24385                        53  8.47e-10 GAAGCTTCAC CATTGTGTTGGAGGAGGAGA CGTTTGAGAA
21087                       217  3.41e-09 CACCGGCGGC GAAGGTCATGGAGGAGAGGA AGGTGTTGTG
11118                       222  6.31e-09 TTTGCGCCGA GATGGCGTTGGATGTGGACG GAGAGGAGGC
1843                        344  1.75e-08 CTGGCAGACG GTTGGTGATGGATGCTACGG TGGTTAGTGA
263213                      260  3.32e-08 AGGAGATGTA CGAGGATATGGAGGAGACGG ACGATGATGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10723                             1.1e-11  288_[+1]_192
bd779                             1.4e-10  399_[+1]_81
24385                             8.5e-10  52_[+1]_428
21087                             3.4e-09  216_[+1]_264
11118                             6.3e-09  221_[+1]_259
1843                              1.7e-08  343_[+1]_137
263213                            3.3e-08  259_[+1]_221
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=7
10723                    (  289) GATGGTGTTGGAGGCGGAGG  1
bd779                    (  400) GATGGCGCTGGAGGAGGAGA  1
24385                    (   53) CATTGTGTTGGAGGAGGAGA  1
21087                    (  217) GAAGGTCATGGAGGAGAGGA  1
11118                    (  222) GATGGCGTTGGATGTGGACG  1
1843                     (  344) GTTGGTGATGGATGCTACGG  1
263213                   (  260) CGAGGATATGGAGGAGACGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6253 bayes= 10.4076 E= 3.6e-004
  -945     23    156   -945
   153   -945    -75    -91
    21   -945   -945    141
  -945   -945    183    -91
  -945   -945    205   -945
   -79     23   -945    109
  -945    -77    156    -91
    80    -77   -945     67
  -945   -945   -945    190
  -945   -945    205   -945
  -945   -945    205   -945
   202   -945   -945   -945
  -945   -945    156      9
  -945   -945    205   -945
   121     23   -945    -91
  -945   -945    183    -91
    80   -945    124   -945
   121     23    -75   -945
  -945    -77    183   -945
    80   -945    124   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 3.6e-004
 0.000000  0.285714  0.714286  0.000000
 0.714286  0.000000  0.142857  0.142857
 0.285714  0.000000  0.000000  0.714286
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.285714  0.000000  0.571429
 0.000000  0.142857  0.714286  0.142857
 0.428571  0.142857  0.000000  0.428571
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.714286  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.571429  0.285714  0.000000  0.142857
 0.000000  0.000000  0.857143  0.142857
 0.428571  0.000000  0.571429  0.000000
 0.571429  0.285714  0.142857  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.428571  0.000000  0.571429  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GC]A[TA]GG[TC]G[AT]TGGA[GT]G[AC]G[GA][AC]G[GA]
--------------------------------------------------------------------------------

Time 1.42 secs.
********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 21 sites = 9 llr = 137 E-value = 6.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :971861:338649a4:67:1
pos.-specific     C  4:36:21:73:31::1:2329
probability       G  6::22:8a:32:41:37::1:
matrix            T  :1:1:2:::::1:::132:7:

         bits    2.1        *      *
                 1.8        *      *
                 1.6        *      *
                 1.4  *     *     **     *
Relative         1.2  *  *  *  *  **     *
Entropy          1.0 *** * *** *  ** * * *
(21.9 bits)      0.8 *** * *** *  ** * ***
                 0.6 *** ***** ***** *****
                 0.4 *************** *****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GAACAAGGCAAAAAAAGAATC
consensus            C CGGC  ACGCG  GTCCC
sequence                  T   G       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
263212                       88  2.93e-12 GTGTTCCTGT GAACAAGGCAAAGAAGGAATC GATGCGTGAT
263213                      365  4.21e-12 GTGCTCCTGT CAACAAGGCCAAGAAGGAATC ATTTCGGGAT
35133                       351  4.13e-08 CACAGCAAAC GAACAAAGCAACGAAAGCATA TCACGCAGGA
bd779                       302  8.07e-08 CATTTATATC CACAACGGACACAAAATACTC AATTCAAATT
21087                       398  1.47e-07 ATTAAATAGA CAATAACGAAAAAAAGTAACC TGGTGTATTT
2642                        159  1.69e-07 CATCATCAGC GACGGCGGCGGCGAAAGCCTC TGCTTCTGCC
21263                       170  2.69e-07 TGGAGAAGCG CTACGTGGCCAAAAAATTACC CTTAAGGACT
1843                        260  3.87e-07 CCGGATGAAC GACCATGGCGATCAATGTCTC ACAATTTCAT
11118                       264  3.87e-07 AGGGCGAATC GAAGAAGGAGGAAGACGAAGC AAGCCGGTAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
263212                            2.9e-12  87_[+2]_392
263213                            4.2e-12  364_[+2]_115
35133                             4.1e-08  350_[+2]_129
bd779                             8.1e-08  301_[+2]_178
21087                             1.5e-07  397_[+2]_82
2642                              1.7e-07  158_[+2]_321
21263                             2.7e-07  169_[+2]_310
1843                              3.9e-07  259_[+2]_220
11118                             3.9e-07  263_[+2]_216
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=9
263212                   (   88) GAACAAGGCAAAGAAGGAATC  1
263213                   (  365) CAACAAGGCCAAGAAGGAATC  1
35133                    (  351) GAACAAAGCAACGAAAGCATA  1
bd779                    (  302) CACAACGGACACAAAATACTC  1
21087                    (  398) CAATAACGAAAAAAAGTAACC  1
2642                     (  159) GACGGCGGCGGCGAAAGCCTC  1
21263                    (  170) CTACGTGGCCAAAAAATTACC  1
1843                     (  260) GACCATGGCGATCAATGTCTC  1
11118                    (  264) GAAGAAGGAGGAAGACGAAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6240 bayes= 10.2842 E= 6.6e+000
  -982     87    120   -982
   185   -982   -982   -127
   143     45   -982   -982
  -115    119    -12   -127
   166   -982    -12   -982
   117    -13   -982    -27
  -115   -113    169   -982
  -982   -982    205   -982
    43    145   -982   -982
    43     45     47   -982
   166   -982    -12   -982
   117     45   -982   -127
    85   -113     88   -982
   185   -982   -112   -982
   202   -982   -982   -982
    85   -113     47   -127
  -982   -982    147     31
   117    -13   -982    -27
   143     45   -982   -982
  -982    -13   -112    131
  -115    187   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 6.6e+000
 0.000000  0.444444  0.555556  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.666667  0.333333  0.000000  0.000000
 0.111111  0.555556  0.222222  0.111111
 0.777778  0.000000  0.222222  0.000000
 0.555556  0.222222  0.000000  0.222222
 0.111111  0.111111  0.777778  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.333333  0.333333  0.333333  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.555556  0.333333  0.000000  0.111111
 0.444444  0.111111  0.444444  0.000000
 0.888889  0.000000  0.111111  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.444444  0.111111  0.333333  0.111111
 0.000000  0.000000  0.666667  0.333333
 0.555556  0.222222  0.000000  0.222222
 0.666667  0.333333  0.000000  0.000000
 0.000000  0.222222  0.111111  0.666667
 0.111111  0.888889  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC]A[AC][CG][AG][ACT]GG[CA][ACG][AG][AC][AG]AA[AG][GT][ACT][AC][TC]C
--------------------------------------------------------------------------------

Time 2.95 secs.
********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 20 sites = 8 llr = 125 E-value = 7.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::533:1:::4::13:31:4
pos.-specific     C  aa46689:8a48961563a6
probability       G  ::1:1::8:::1:::::1::
matrix            T  :::1:3:33:31136515::

         bits    2.1 **       *        *
                 1.8 **       *        *
                 1.6 **       *        *
                 1.4 **    *  *  *     *
Relative         1.2 **   *****  *     *
Entropy          1.0 **   ***** **  *  **
(22.5 bits)      0.8 **  ****** **  *  **
                 0.6 ********** ****** **
                 0.4 ***************** **
                 0.2 ********************
                 0.0 --------------------

Multilevel           CCACCCCGCCACCCTCCTCC
consensus              CAAT TT C  TATAC A
sequence                       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
1843                        232  5.66e-12 CGTGCATCTC CCCCCCCGCCCCCCTTCTCC GGATGAACGA
22033                       140  1.02e-08 TCCGAATCAT CCAACCAGCCTCCCACCTCC GGCTACATGA
263212                      224  3.21e-08 ACCACTCTAC CCCTACCTCCACCTTTCTCC TCCCCCAACA
1543                        395  5.31e-08 ACATCGCAGG CCACCTCGCCAGCCCTCCCA CTGTCCTCCG
2642                        137  6.23e-08 CGAGAGGCGC CCACACCGCCATCATCATCA GCGACGGCGG
263213                      478  6.23e-08 TCCTCTAACT CCACCTCGTCCCTCTCACCC CCA
21263                       435  1.05e-07 GATATGAGAG CCGCGCCTCCCCCCACCACC TCTCATTTCT
21087                       473  2.28e-07 CCTGTCTGTA CCCACCCGTCTCCTTTTGCA AATCGCCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1843                              5.7e-12  231_[+3]_249
22033                               1e-08  139_[+3]_341
263212                            3.2e-08  223_[+3]_257
1543                              5.3e-08  394_[+3]_86
2642                              6.2e-08  136_[+3]_344
263213                            6.2e-08  477_[+3]_3
21263                               1e-07  434_[+3]_46
21087                             2.3e-07  472_[+3]_8
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=8
1843                     (  232) CCCCCCCGCCCCCCTTCTCC  1
22033                    (  140) CCAACCAGCCTCCCACCTCC  1
263212                   (  224) CCCTACCTCCACCTTTCTCC  1
1543                     (  395) CCACCTCGCCAGCCCTCCCA  1
2642                     (  137) CCACACCGCCATCATCATCA  1
263213                   (  478) CCACCTCGTCCCTCTCACCC  1
21263                    (  435) CCGCGCCTCCCCCCACCACC  1
21087                    (  473) CCCACCCGTCTCCTTTTGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6253 bayes= 9.60849 E= 7.6e+000
  -965    204   -965   -965
  -965    204   -965   -965
   102     62    -95   -965
     2    136   -965   -110
     2    136    -95   -965
  -965    162   -965    -10
   -98    184   -965   -965
  -965   -965    164    -10
  -965    162   -965    -10
  -965    204   -965   -965
    60     62   -965    -10
  -965    162    -95   -110
  -965    184   -965   -110
   -98    136   -965    -10
     2    -96   -965    122
  -965    104   -965     90
     2    136   -965   -110
   -98      4    -95     90
  -965    204   -965   -965
    60    136   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 7.6e+000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.375000  0.125000  0.000000
 0.250000  0.625000  0.000000  0.125000
 0.250000  0.625000  0.125000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.125000  0.875000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.750000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.375000  0.375000  0.000000  0.250000
 0.000000  0.750000  0.125000  0.125000
 0.000000  0.875000  0.000000  0.125000
 0.125000  0.625000  0.000000  0.250000
 0.250000  0.125000  0.000000  0.625000
 0.000000  0.500000  0.000000  0.500000
 0.250000  0.625000  0.000000  0.125000
 0.125000  0.250000  0.125000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.375000  0.625000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CC[AC][CA][CA][CT]C[GT][CT]C[ACT]CC[CT][TA][CT][CA][TC]C[CA]
--------------------------------------------------------------------------------

Time 4.33 secs.
********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10723                            2.32e-08  195_[+1(5.20e-05)]_73_\
    [+1(1.05e-11)]_85_[+1(4.15e-05)]_87
11118                            2.49e-08  221_[+1(6.31e-09)]_22_\
    [+2(3.87e-07)]_216
1543                             1.56e-03  394_[+3(5.31e-08)]_86
1843                             3.57e-15  231_[+3(5.66e-12)]_8_[+2(3.87e-07)]_\
    63_[+1(1.75e-08)]_2_[+1(1.73e-05)]_115
21087                            6.94e-12  216_[+1(3.41e-09)]_161_\
    [+2(1.47e-07)]_54_[+3(2.28e-07)]_8
21263                            7.63e-07  169_[+2(2.69e-07)]_244_\
    [+3(1.05e-07)]_46
22033                            3.28e-04  139_[+3(1.02e-08)]_341
24385                            2.14e-05  52_[+1(8.47e-10)]_428
263212                           9.27e-13  87_[+2(2.93e-12)]_115_\
    [+3(3.21e-08)]_257
263213                           8.72e-16  112_[+1(5.39e-05)]_127_\
    [+1(3.32e-08)]_85_[+2(4.21e-12)]_92_[+3(6.23e-08)]_3
2642                             8.18e-08  136_[+3(6.23e-08)]_2_[+2(1.69e-07)]_\
    321
35133                            7.22e-04  350_[+2(4.13e-08)]_129
bd779                            2.79e-10  301_[+2(8.07e-08)]_77_\
    [+1(1.44e-10)]_81
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************