********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/435/435.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
20710                    1.0000    500  2284                     1.0000    500
23115                    1.0000    500  23719                    1.0000    500
23816                    1.0000    500  25050                    1.0000    500
25540                    1.0000    500  32525                    1.0000    500
3822                     1.0000    500  4985                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/435/435.seqs.fa -oc motifs/435 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.257 C 0.250 G 0.240 T 0.253
Background letter frequencies (from dataset with add-one prior applied):
A 0.257 C 0.250 G 0.240 T 0.253
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   8  llr = 97  E-value = 1.7e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :3a::81:9:38
pos.-specific     C  a1:9::9::6::
probability       G  :6:1a3:a:383
matrix            T  ::::::::11::

         bits    2.1 * * *  *
                 1.9 * * *  *
                 1.6 * * *  *
                 1.4 * *** ***
Relative         1.2 * ******* **
Entropy          1.0 * ******* **
(17.4 bits)      0.8 ********* **
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGACGACGACGA
consensus             A   G   GAG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
23816                       288  5.65e-08 TCGATCGCAA CGACGACGACGA CAACAGTACA
2284                         84  5.65e-08 CTGCGGTGGA CGACGACGACGA TGGTGCAGTG
25050                       476  1.07e-06 ACTACAACAA CAACGACGACAA CAACTACTAC
23719                       108  1.07e-06 TTGTGAGTTG CGACGGCGACGG ACGTTTTTGT
25540                       460  1.51e-06 ACACTTCCCC CGACGACGATGG CCGTTGGGTA
4985                        311  2.51e-06 ACCGGAGAGG CAACGACGAGAA GCCTCGGTGT
23115                       459  3.54e-06 CAACATCAAA CCAGGACGACGA ACAAAACAAA
3822                        453  1.41e-05 CAGCCGTAAT CGACGGAGTGGA TAGATAATCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23816                             5.7e-08  287_[+1]_201
2284                              5.7e-08  83_[+1]_405
25050                             1.1e-06  475_[+1]_13
23719                             1.1e-06  107_[+1]_381
25540                             1.5e-06  459_[+1]_29
4985                              2.5e-06  310_[+1]_178
23115                             3.5e-06  458_[+1]_30
3822                              1.4e-05  452_[+1]_36
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=8
23816                    (  288) CGACGACGACGA  1
2284                     (   84) CGACGACGACGA  1
25050                    (  476) CAACGACGACAA  1
23719                    (  108) CGACGGCGACGG  1
25540                    (  460) CGACGACGATGG  1
4985                     (  311) CAACGACGAGAA  1
23115                    (  459) CCAGGACGACGA  1
3822                     (  453) CGACGGAGTGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 10.5766 E= 1.7e-001
  -965    200   -965   -965
    -4   -100    138   -965
   196   -965   -965   -965
  -965    181    -94   -965
  -965   -965    206   -965
   154   -965      6   -965
  -104    181   -965   -965
  -965   -965    206   -965
   177   -965   -965   -101
  -965    132      6   -101
    -4   -965    164   -965
   154   -965      6   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 1.7e-001
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.125000  0.625000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.000000  0.625000  0.250000  0.125000
 0.250000  0.000000  0.750000  0.000000
 0.750000  0.000000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[GA]ACG[AG]CGA[CG][GA][AG]
--------------------------------------------------------------------------------

Time  0.97 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   9  llr = 128  E-value = 2.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :21::212312:6:122:1::
pos.-specific     C  :23:11:11:::2::1:::::
probability       G  2433641643:a1a948:1a7
matrix            T  81273281168:1::2:a8:3

         bits    2.1            * *   * *
                 1.9            * *   * *
                 1.6            * **  * *
                 1.4            * **  * *
Relative         1.2 *         ** ** ** *
Entropy          1.0 *  *  *   ** ** *****
(20.6 bits)      0.8 *  *  *   ** ** *****
                 0.6 *  ** *  *** ** *****
                 0.4 *  ** ** ****** *****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TGCTGGTGGTTGAGGGGTTGG
consensus            GAGGTA AAGA C  AA   T
sequence              CT  T         T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
25540                       189  2.95e-12 GGCTTGGATG TGGTGGTGGTTGAGGTGTTGG ATACGATGCA
20710                       377  4.01e-08 TCAAGTGGGG GACGGTGGATTGAGGGGTTGG CAGACGTGTA
23115                       177  4.44e-08 ATATATTATT GACTGATAGGTGAGGTGTTGT CCGCCATCCT
25050                       212  6.56e-08 CTTTGGTGTT TGTGTTTGGTAGTGGAGTTGG TGTGAGTGTG
32525                       186  1.24e-07 ATGGGTGTCT TGGTGGAAATTGAGGGATGGG AGATGCGGTA
23816                        30  2.41e-07 AGGTGTCTAT TGTTTGTGGGAGCGAGATTGG ACGCGACGGG
23719                       148  2.81e-07 ACGTGCAACG TCCTCCTCATTGAGGAGTTGT TGTTGTTCGT
2284                        360  8.47e-07 GGGGACAGCC TTGGTGTGTGTGGGGGGTAGT TTCGAGAAAA
4985                        219  1.76e-06 GGGCTAGTGC TCATGATTCATGCGGCGTTGG ACTTCGTACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25540                               3e-12  188_[+2]_291
20710                               4e-08  376_[+2]_103
23115                             4.4e-08  176_[+2]_303
25050                             6.6e-08  211_[+2]_268
32525                             1.2e-07  185_[+2]_294
23816                             2.4e-07  29_[+2]_450
23719                             2.8e-07  147_[+2]_332
2284                              8.5e-07  359_[+2]_120
4985                              1.8e-06  218_[+2]_261
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=9
25540                    (  189) TGGTGGTGGTTGAGGTGTTGG  1
20710                    (  377) GACGGTGGATTGAGGGGTTGG  1
23115                    (  177) GACTGATAGGTGAGGTGTTGT  1
25050                    (  212) TGTGTTTGGTAGTGGAGTTGG  1
32525                    (  186) TGGTGGAAATTGAGGGATGGG  1
23816                    (   30) TGTTTGTGGGAGCGAGATTGG  1
23719                    (  148) TCCTCCTCATTGAGGAGTTGT  1
2284                     (  360) TTGGTGTGTGTGGGGGGTAGT  1
4985                     (  219) TCATGATTCATGCGGCGTTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 9.19073 E= 2.4e-001
  -982   -982    -11    162
   -21    -17     89   -118
  -121     41     47    -19
  -982   -982     47    140
  -982   -117    121     40
   -21   -117     89    -19
  -121   -982   -111    162
   -21   -117    121   -118
    37   -117     89   -118
  -121   -982     47    114
   -21   -982   -982    162
  -982   -982    206   -982
   111    -17   -111   -118
  -982   -982    206   -982
  -121   -982    189   -982
   -21   -117     89    -19
   -21   -982    170   -982
  -982   -982   -982    198
  -121   -982   -111    162
  -982   -982    206   -982
  -982   -982    147     40
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 2.4e-001
 0.000000  0.000000  0.222222  0.777778
 0.222222  0.222222  0.444444  0.111111
 0.111111  0.333333  0.333333  0.222222
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.111111  0.555556  0.333333
 0.222222  0.111111  0.444444  0.222222
 0.111111  0.000000  0.111111  0.777778
 0.222222  0.111111  0.555556  0.111111
 0.333333  0.111111  0.444444  0.111111
 0.111111  0.000000  0.333333  0.555556
 0.222222  0.000000  0.000000  0.777778
 0.000000  0.000000  1.000000  0.000000
 0.555556  0.222222  0.111111  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.222222  0.111111  0.444444  0.222222
 0.222222  0.000000  0.777778  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.000000  0.111111  0.777778
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TG][GAC][CGT][TG][GT][GAT]T[GA][GA][TG][TA]G[AC]GG[GAT][GA]TTG[GT]
--------------------------------------------------------------------------------

Time  1.95 secs.
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =  10  llr = 106  E-value = 1.5e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :6::5::56::1
pos.-specific     C  a:19:3a2:a::
probability       G  :3115::2::a5
matrix            T  :18::7:14::4

         bits    2.1 *     *  **
                 1.9 *     *  **
                 1.6 *     *  **
                 1.4 *  *  *  **
Relative         1.2 *  *  *  **
Entropy          1.0 * ***** ***
(15.4 bits)      0.8 * ***** ***
                 0.6 ******* ****
                 0.4 ******* ****
                 0.2 ************
                 0.0 ------------

Multilevel           CATCATCAACGG
consensus             G  GC CT  T
sequence                    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
3822                        221  1.83e-07 TTCTCGCTTT CATCGTCAACGT CCATCATTGC
25540                       396  9.57e-07 CGATCGGAGG CATCACCAACGG TAAAACCGAG
2284                        112  1.61e-06 AGTGTGTTGC CATCGCCATCGG CGGCGCTGTC
32525                       126  1.91e-06 CTTGATACAC CATCATCCACGT TGGAGACATG
23816                       437  4.23e-06 CGGCGTCAAG CATCATCTACGT GAAACACCGT
25050                        57  5.40e-06 TGACTTGCAC CGTCGTCGTCGT TGTTGTCGTT
23719                       430  5.40e-06 GTATGTCTTG CATGGTCAACGG CAGGATAATC
4985                        415  1.17e-05 CTGCTTCACC CTTCGTCGTCGG CACGTCTCTA
23115                       199  2.58e-05 AGGTGTTGTC CGCCATCCTCGG TACGACACGA
20710                       229  4.74e-05 CCGAGCATTC CGGCACCAACGA TTGTGTTTGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3822                              1.8e-07  220_[+3]_268
25540                             9.6e-07  395_[+3]_93
2284                              1.6e-06  111_[+3]_377
32525                             1.9e-06  125_[+3]_363
23816                             4.2e-06  436_[+3]_52
25050                             5.4e-06  56_[+3]_432
23719                             5.4e-06  429_[+3]_59
4985                              1.2e-05  414_[+3]_74
23115                             2.6e-05  198_[+3]_290
20710                             4.7e-05  228_[+3]_260
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
3822                     (  221) CATCGTCAACGT  1
25540                    (  396) CATCACCAACGG  1
2284                     (  112) CATCGCCATCGG  1
32525                    (  126) CATCATCCACGT  1
23816                    (  437) CATCATCTACGT  1
25050                    (   57) CGTCGTCGTCGT  1
23719                    (  430) CATGGTCAACGG  1
4985                     (  415) CTTCGTCGTCGG  1
23115                    (  199) CGCCATCCTCGG  1
20710                    (  229) CGGCACCAACGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 9.18275 E= 1.5e+000
  -997    200   -997   -997
   122   -997     32   -134
  -997   -132   -126    166
  -997    185   -126   -997
    96   -997    106   -997
  -997     26   -997    147
  -997    200   -997   -997
    96    -32    -26   -134
   122   -997   -997     66
  -997    200   -997   -997
  -997   -997    206   -997
  -136   -997    106     66
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 1.5e+000
 0.000000  1.000000  0.000000  0.000000
 0.600000  0.000000  0.300000  0.100000
 0.000000  0.100000  0.100000  0.800000
 0.000000  0.900000  0.100000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.300000  0.000000  0.700000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.200000  0.200000  0.100000
 0.600000  0.000000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.100000  0.000000  0.500000  0.400000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[AG]TC[AG][TC]C[ACG][AT]CG[GT]
--------------------------------------------------------------------------------

Time  2.80 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20710                            3.82e-05  228_[+3(4.74e-05)]_136_\
    [+2(4.01e-08)]_103
2284                             3.10e-09  83_[+1(5.65e-08)]_16_[+3(1.61e-06)]_\
    236_[+2(8.47e-07)]_120
23115                            1.17e-07  176_[+2(4.44e-08)]_1_[+3(2.58e-05)]_\
    248_[+1(3.54e-06)]_30
23719                            5.10e-08  32_[+1(8.54e-05)]_63_[+1(1.07e-06)]_\
    28_[+2(2.81e-07)]_261_[+3(5.40e-06)]_59
23816                            2.36e-09  29_[+2(2.41e-07)]_237_\
    [+1(5.65e-08)]_137_[+3(4.23e-06)]_52
25050                            1.34e-08  56_[+3(5.40e-06)]_143_\
    [+2(6.56e-08)]_149_[+1(5.70e-05)]_82_[+1(1.07e-06)]_13
25540                            3.23e-13  188_[+2(2.95e-12)]_186_\
    [+3(9.57e-07)]_52_[+1(1.51e-06)]_29
32525                            7.19e-06  125_[+3(1.91e-06)]_48_\
    [+2(1.24e-07)]_294
3822                             6.69e-05  220_[+3(1.83e-07)]_220_\
    [+1(1.41e-05)]_36
4985                             1.18e-06  218_[+2(1.76e-06)]_48_\
    [+3(2.88e-05)]_11_[+1(2.51e-06)]_92_[+3(1.17e-05)]_74
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************