********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/358/358.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10194                    1.0000    500  10917                    1.0000    500
11636                    1.0000    500  11921                    1.0000    500
12014                    1.0000    500  12078                    1.0000    500
12206                    1.0000    500  264602                   1.0000    500
268368                   1.0000    500  269871                   1.0000    500
3404                     1.0000    500  34097                    1.0000    500
5296                     1.0000    500  6360                     1.0000    500
7199                     1.0000    500  7844                     1.0000    500
8807                     1.0000    500  8918                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/358/358.seqs.fa -oc motifs/358 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.234 G 0.232 T 0.269
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.234 G 0.232 T 0.269
********************************************************************************

********************************************************************************
MOTIF  1 MEME    width =  16  sites =  18  llr = 183  E-value = 1.8e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  27315:8:9824:232
pos.-specific     C  12333:2::1::2::6
probability       G  71441a:a11743742
matrix            T  1::22:::::12612:

         bits    2.1      * *
                 1.9      * *
                 1.7      * *
                 1.5      * **
Relative         1.3      ****
Entropy          1.1      *****
(14.6 bits)      0.8 **   ******  *
                 0.6 **   ****** ** *
                 0.4 ***  ***********
                 0.2 ****************
                 0.0 ----------------

Multilevel           GAGGAGAGAAGATGGC
consensus              ACC      GG AA
sequence               CT       T  TG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             -----  --------            ----------------
10917                       357  9.44e-08 AATGGTAGTA GAGGAGAGAAGAGGAA GGGGGGTATC
7844                         15  1.31e-07 ATCTCATTCT GAATAGAGAAGGTGGA TTTGTCCTTG
5296                        406  9.27e-07 ACATGCAGCT GCACAGAGAAGATGAG AAGAGAACTC
7199                        212  1.34e-06 ACTCTTCCCG GCCTCGAGAAGATGTC TGCAAGTGAG
3404                        377  1.69e-06 CATCACCGTA GAAGCGAGGAGATGTC GGCTGTACGG
11921                       270  1.69e-06 CCTGGGCCGA GAGGCGAGACAGTGGC GACCGAGACG
12078                       174  2.39e-06 TATCAACGAG GACGAGAGAAGTCAAC ACTCGTGACT
8918                        264  3.29e-06 GACAGGCTGT AGCCAGAGAAGGGGGC GTTGTTGTGG
12206                       234  6.04e-06 CTTGTTTCGC GAGTCGAGGAGGTTGC TCTAGTGTTA
6360                        326  7.34e-06 GGCAACAGCG GCGTTGAGAAGTTGAA CTCTTTTTGC
8807                         54  9.64e-06 TGTTGTCAAG GAAGCGAGAGAGTGGG TGGTTGGTGT
11636                       325  1.06e-05 TTCAGGGGAC AAGCTGAGAAGAGAAC CCACTTTGAC
10194                       283  1.63e-05 TCTATTTGAA AACAAGAGACGGTGGC TTGTTGGGAC
269871                      280  2.24e-05 GTGCTATTCA GAGCTGAGAATGGAGA GTAGGAATTC
268368                      284  2.24e-05 TCCTTTGCTT GACGAGCGAAATGGTG GAATTTGTAA
264602                      260  2.83e-05 TTATCGTTAT TAGCAGAGAGGACGTC CTGCTCACCA
34097                       453  9.26e-05 TCGCACAACG CAACAGCGAAGACTAC AAAGTAAAAA
12014                       249  1.28e-04 GGAGATGGAG GGGGGGCGAATTTGGG GATGTGACAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10917                             9.4e-08  356_[+1]_128
7844                              1.3e-07  14_[+1]_470
5296                              9.3e-07  405_[+1]_79
7199                              1.3e-06  211_[+1]_273
3404                              1.7e-06  376_[+1]_108
11921                             1.7e-06  269_[+1]_215
12078                             2.4e-06  173_[+1]_311
8918                              3.3e-06  263_[+1]_221
12206                               6e-06  233_[+1]_251
6360                              7.3e-06  325_[+1]_159
8807                              9.6e-06  53_[+1]_431
11636                             1.1e-05  324_[+1]_160
10194                             1.6e-05  282_[+1]_202
269871                            2.2e-05  279_[+1]_205
268368                            2.2e-05  283_[+1]_201
264602                            2.8e-05  259_[+1]_225
34097                             9.3e-05  452_[+1]_32
12014                             0.00013  248_[+1]_236
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=18
10917                    (  357) GAGGAGAGAAGAGGAA  1
7844                     (   15) GAATAGAGAAGGTGGA  1
5296                     (  406) GCACAGAGAAGATGAG  1
7199                     (  212) GCCTCGAGAAGATGTC  1
3404                     (  377) GAAGCGAGGAGATGTC  1
11921                    (  270) GAGGCGAGACAGTGGC  1
12078                    (  174) GACGAGAGAAGTCAAC  1
8918                     (  264) AGCCAGAGAAGGGGGC  1
12206                    (  234) GAGTCGAGGAGGTTGC  1
6360                     (  326) GCGTTGAGAAGTTGAA  1
8807                     (   54) GAAGCGAGAGAGTGGG  1
11636                    (  325) AAGCTGAGAAGAGAAC  1
10194                    (  283) AACAAGAGACGGTGGC  1
269871                   (  280) GAGCTGAGAATGGAGA  1
268368                   (  284) GACGAGCGAAATGGTG  1
264602                   (  260) TAGCAGAGAGGACGTC  1
34097                    (  453) CAACAGCGAAGACTAC  1
12014                    (  249) GGGGGGCGAATTTGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 9.05343 E= 1.8e-001
   -67   -207    164   -227
   144    -49   -106  -1081
     7     25     94  -1081
  -225     51     75    -27
    91     25   -206    -69
 -1081  -1081    211  -1081
   165    -49  -1081  -1081
 -1081  -1081    211  -1081
   174  -1081   -106  -1081
   155   -107   -106  -1081
   -67  -1081    164   -127
    55  -1081     75    -27
 -1081    -49     26    105
   -67  -1081    164   -127
    33  -1081     94    -27
   -26    125     -6  -1081
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 18 E= 1.8e-001
 0.166667  0.055556  0.722222  0.055556
 0.722222  0.166667  0.111111  0.000000
 0.277778  0.277778  0.444444  0.000000
 0.055556  0.333333  0.388889  0.222222
 0.500000  0.277778  0.055556  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.777778  0.111111  0.111111  0.000000
 0.166667  0.000000  0.722222  0.111111
 0.388889  0.000000  0.388889  0.222222
 0.000000  0.166667  0.277778  0.555556
 0.166667  0.000000  0.722222  0.111111
 0.333333  0.000000  0.444444  0.222222
 0.222222  0.555556  0.222222  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GA[GAC][GCT][AC]GAGAAG[AGT][TG]G[GAT][CAG]
--------------------------------------------------------------------------------

Time 2.86 secs.

********************************************************************************

********************************************************************************
MOTIF  2 MEME    width =  21  sites =  13  llr = 166  E-value = 3.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1:2:55613985:612::1:9
pos.-specific     C  5:15:::4:122:1:2:128:
probability       G  :22:23127:::8:5241621
matrix            T  58552234:::32345681::

         bits    2.1
                 1.9
                 1.7
                 1.5          *  *      **
Relative         1.3  *       ** *      **
Entropy          1.1  * *    *** *   ** **
(18.4 bits)      0.8  * *    *** *   ** **
                 0.6 ** * ** *** *** *****
                 0.4 ** **** ******* *****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CTTCAAACGAAAGAGTTTGCA
consensus            T GTGGTTA CT TTAG C
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             -----  --------            ---------------------
7199                        318  1.59e-08 ATCTTTTGCC CTTTGAAGGAAAGCGTTTGCA TCGAAAGTCG
5296                          8  4.89e-08    GCAGTTT CTGCTGAAGAAAGAGAGTGCA TGGCCCACGT
264602                      227  1.07e-07 GATGTAAGTT CTTTAAATGACAGAAATTCCA CGTTATCGTT
3404                        311  1.98e-07 CAGATTCAGT CTATAAATGAAAGATGGTGCG GTCGGTTGAG
11636                       253  1.98e-07 CTCTTTCTAA TTTTTTTCGAAATATTTTGCA GTGGACAATA
34097                        36  6.00e-07 GCCATTGCAA TTGCAGTCAACTGAGCGTCCA TCATCCGCTG
269871                      389  6.53e-07 CCGCTTCTAA TGACAATCAAACGTGTGTGCA TTTCACGATG
10917                       429  7.73e-07 GACATCGTGA TTTCAAGCAAAAGAGTTTAGA ACATTTTAGT
8918                         56  1.07e-06 ATTTCGCTGT TTGCAAATACATGTTGGTGCA CCACCCGAGT
7844                        455  1.07e-06 CCAGAGACTG TTTCGGAGGAAAGTGTTCTCA CAACTTTAAA
12078                       327  1.57e-06 GTAATCACCT CTTTGTACGAACGTTTTGCCA GCTTAGACTG
268368                       25  1.69e-06 AAAGGCTGTT AGTCTGATGAATGAGATTGGA CTTTATCCGA
11921                         2  1.82e-06          T CTCTAATTGACTTATCTTGCA AGCTTAATAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7199                              1.6e-08  317_[+2]_162
5296                              4.9e-08  7_[+2]_472
264602                            1.1e-07  226_[+2]_253
3404                                2e-07  310_[+2]_169
11636                               2e-07  252_[+2]_227
34097                               6e-07  35_[+2]_444
269871                            6.5e-07  388_[+2]_91
10917                             7.7e-07  428_[+2]_51
8918                              1.1e-06  55_[+2]_424
7844                              1.1e-06  454_[+2]_25
12078                             1.6e-06  326_[+2]_153
268368                            1.7e-06  24_[+2]_455
11921                             1.8e-06  1_[+2]_478
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=13
7199                     (  318) CTTTGAAGGAAAGCGTTTGCA  1
5296                     (    8) CTGCTGAAGAAAGAGAGTGCA  1
264602                   (  227) CTTTAAATGACAGAAATTCCA  1
3404                     (  311) CTATAAATGAAAGATGGTGCG  1
11636                    (  253) TTTTTTTCGAAATATTTTGCA  1
34097                    (   36) TTGCAGTCAACTGAGCGTCCA  1
269871                   (  389) TGACAATCAAACGTGTGTGCA  1
10917                    (  429) TTTCAAGCAAAAGAGTTTAGA  1
8918                     (   56) TTGCAAATACATGTTGGTGCA  1
7844                     (  455) TTTCGGAGGAAAGTGTTCTCA  1
12078                    (  327) CTTTGTACGAACGTTTTGCCA  1
268368                   (   25) AGTCTGATGAATGAGATTGGA  1
11921                    (    2) CTCTAATTGACTTATCTTGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 9.90539 E= 3.9e+002
  -178     98  -1035     78
 -1035  -1035    -59    165
   -78   -160     -1    100
 -1035    120  -1035     78
   102  -1035     -1    -22
   102  -1035     41    -80
   121  -1035   -159     19
  -178     72    -59     52
    21  -1035    158  -1035
   180   -160  -1035  -1035
   154     -2  -1035  -1035
   102    -61  -1035     19
 -1035  -1035    187    -80
   121   -160  -1035     19
  -178  -1035    122     52
   -20    -61    -59     78
 -1035  -1035     73    119
 -1035   -160   -159    165
  -178     -2    141   -180
 -1035    185    -59  -1035
   180  -1035   -159  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 13 E= 3.9e+002
 0.076923  0.461538  0.000000  0.461538
 0.000000  0.000000  0.153846  0.846154
 0.153846  0.076923  0.230769  0.538462
 0.000000  0.538462  0.000000  0.461538
 0.538462  0.000000  0.230769  0.230769
 0.538462  0.000000  0.307692  0.153846
 0.615385  0.000000  0.076923  0.307692
 0.076923  0.384615  0.153846  0.384615
 0.307692  0.000000  0.692308  0.000000
 0.923077  0.076923  0.000000  0.000000
 0.769231  0.230769  0.000000  0.000000
 0.538462  0.153846  0.000000  0.307692
 0.000000  0.000000  0.846154  0.153846
 0.615385  0.076923  0.000000  0.307692
 0.076923  0.000000  0.538462  0.384615
 0.230769  0.153846  0.153846  0.461538
 0.000000  0.000000  0.384615  0.615385
 0.000000  0.076923  0.076923  0.846154
 0.076923  0.230769  0.615385  0.076923
 0.000000  0.846154  0.153846  0.000000
 0.923077  0.000000  0.076923  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CT]T[TG][CT][AGT][AG][AT][CT][GA]A[AC][AT]G[AT][GT][TA][TG]T[GC]CA
--------------------------------------------------------------------------------

Time 5.40 secs.

********************************************************************************

********************************************************************************
MOTIF  3 MEME    width =  12  sites =   7  llr = 87  E-value = 1.7e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:16:39:4:
pos.-specific     C  1::a::1::9:a
probability       G  :a::9477:1::
matrix            T  9:::::1:1:6:

         bits    2.1  * *       *
                 1.9  ***       *
                 1.7  ***       *
                 1.5  ****    * *
Relative         1.3 *****  *** *
Entropy          1.1 ****** *** *
(17.9 bits)      0.8 ************
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGACGAGGACTC
consensus                 G A  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             -----  --------            ------------
12206                        34  5.00e-08 TAGCAGCATT TGACGAGGACTC ACCAGAGCCC
11636                       362  3.49e-07 GCACAAACGT TGACGAGAACAC CACCTTTCAG
11921                       242  3.99e-07 TAGTTGGACA TGACGGGAACAC TCGCAACCTG
6360                        417  4.93e-07 TCGGATTGCA TGACGACGACTC CTTGTTGAGC
12014                       215  1.02e-06 CAGCAGACGA TGACGGGGTCTC CCAAGTAAGC
3404                        472  3.99e-06 CCTCCAAAGA CGACGAGGAGAC GATCACCATA
268368                      385  5.04e-06 AAGTAGGAAG TGACAGTGACTC TTCAAACCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12206                               5e-08  33_[+3]_455
11636                             3.5e-07  361_[+3]_127
11921                               4e-07  241_[+3]_247
6360                              4.9e-07  416_[+3]_72
12014                               1e-06  214_[+3]_274
3404                                4e-06  471_[+3]_17
268368                              5e-06  384_[+3]_104
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=7
12206                    (   34) TGACGAGGACTC  1
11636                    (  362) TGACGAGAACAC  1
11921                    (  242) TGACGGGAACAC  1
6360                     (  417) TGACGACGACTC  1
12014                    (  215) TGACGGGGTCTC  1
3404                     (  472) CGACGAGGAGAC  1
268368                   (  385) TGACAGTGACTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 10.1389 E= 1.7e+003
  -945    -71   -945    167
  -945   -945    211   -945
   191   -945   -945   -945
  -945    209   -945   -945
   -89   -945    189   -945
   111   -945     89   -945
  -945    -71    162    -91
    11   -945    162   -945
   169   -945   -945    -91
  -945    187    -70   -945
    69   -945   -945    109
  -945    209   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 1.7e+003
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.571429  0.000000  0.428571  0.000000
 0.000000  0.142857  0.714286  0.142857
 0.285714  0.000000  0.714286  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.857143  0.142857  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TGACG[AG]G[GA]AC[TA]C
--------------------------------------------------------------------------------

Time 8.09 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10194                            2.79e-02  282_[+1(1.63e-05)]_202
10917                            2.06e-06  356_[+1(9.44e-08)]_56_\
    [+2(7.73e-07)]_51
11636                            2.44e-08  252_[+2(1.98e-07)]_51_\
    [+1(1.06e-05)]_21_[+3(3.49e-07)]_127
11921                            3.92e-08  1_[+2(1.82e-06)]_219_[+3(3.99e-07)]_\
    16_[+1(1.69e-06)]_215
12014                            1.87e-03  214_[+3(1.02e-06)]_274
12078                            7.46e-05  173_[+1(2.39e-06)]_137_\
    [+2(1.57e-06)]_153
12206                            9.04e-06  33_[+3(5.00e-08)]_188_\
    [+1(6.04e-06)]_52_[+1(6.07e-05)]_183
264602                           3.60e-05  226_[+2(1.07e-07)]_12_\
    [+1(2.83e-05)]_98_[+2(6.67e-05)]_106
268368                           3.76e-06  24_[+2(1.69e-06)]_238_\
    [+1(2.24e-05)]_85_[+3(5.04e-06)]_104
269871                           1.81e-04  279_[+1(2.24e-05)]_93_\
    [+2(6.53e-07)]_91
3404                             4.23e-08  310_[+2(1.98e-07)]_45_\
    [+1(1.69e-06)]_79_[+3(3.99e-06)]_17
34097                            3.69e-04  35_[+2(6.00e-07)]_396_\
    [+1(9.26e-05)]_32
5296                             4.98e-07  7_[+2(4.89e-08)]_377_[+1(9.27e-07)]_\
    79
6360                             2.45e-05  24_[+3(5.51e-05)]_289_\
    [+1(7.34e-06)]_75_[+3(4.93e-07)]_72
7199                             6.83e-08  211_[+1(1.34e-06)]_90_\
    [+2(1.59e-08)]_162
7844                             3.25e-06  14_[+1(1.31e-07)]_424_\
    [+2(1.07e-06)]_25
8807                             6.13e-02  53_[+1(9.64e-06)]_431
8918                             6.43e-05  55_[+2(1.07e-06)]_187_\
    [+1(3.29e-06)]_37_[+2(1.07e-06)]_163
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************