********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/283/283.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42807                    1.0000    500  48554                    1.0000    500
43363                    1.0000    500  43364                    1.0000    500
43685                    1.0000    500  49084                    1.0000    500
49511                    1.0000    500  16581                    1.0000    500
44556                    1.0000    500  48555                    1.0000    500
50479                    1.0000    500  50522                    1.0000    500
45862                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/283/283.seqs.fa -oc motifs/283 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.260 C 0.237 G 0.222 T 0.280
Background letter frequencies (from dataset with add-one prior applied):
A 0.260 C 0.237 G 0.222 T 0.280
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  13  llr = 139  E-value = 6.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :112:141a2211:2a
pos.-specific     C  2525a11::4234:1:
probability       G  :21::858:1222a7:
matrix            T  8373::11:3543:::

         bits    2.2     *        *
                 2.0     *   *    * *
                 1.7     *   *    * *
                 1.5     *   *    * *
Relative         1.3 *   ** **    * *
Entropy          1.1 *   ** **    * *
(15.4 bits)      0.9 *   ** **    ***
                 0.7 * * ** **    ***
                 0.4 * *******    ***
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCTCCGGGACTTCGGA
consensus             T T  A  TCCT A
sequence                A     A GG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
48555                        81  1.04e-08 ATTTTGATTT TCTCCGGGATCTCGGA AGGAAATAAA
49511                       131  2.39e-07 ACCGTAGCGT TCTACGAGAATGTGGA ACTCCTCCGA
49084                       211  4.61e-07 ATTGCCCTCT TGTTCGGGACGCTGGA GTGTGTAAAA
48554                        76  8.45e-07 CTTTTATTTC CTTCCGAGATCCCGGA GAAAATCAAA
45862                       444  1.20e-06 TCTGTCCGTC TCCTCGAGAATGGGGA GACAGACGAC
44556                       268  1.66e-06 GCCGTTGACT CCTCCGGGACCCGGAA TCCTAAGCAT
42807                       283  4.42e-06 TCGATGTCAT TATACGGGAGTTGGGA GCTGTGCAAG
50522                       209  5.27e-06 TAAGTGATTT TCACCCGGACTGCGGA CATACGTTCC
16581                        16  1.28e-05 TTTCGCTTGA TTTCCGGTACATTGAA GCCGATGTCA
43685                       224  1.60e-05 GAGGCACTGA TTGTCGTGACACCGGA GTTCCCACCG
43364                       365  2.26e-05 TGCACGACGT TCTTCACGATGTTGGA TTTGATGTCC
43363                       337  2.58e-05 CTTGCCAGAC TTTCCGAGATTAAGCA AGGAGGTACA
50479                       219  6.18e-05 CAGTCTTTCG TGCACGAAAATTCGAA CCCTTCTTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48555                               1e-08  80_[+1]_404
49511                             2.4e-07  130_[+1]_354
49084                             4.6e-07  210_[+1]_274
48554                             8.4e-07  75_[+1]_409
45862                             1.2e-06  443_[+1]_41
44556                             1.7e-06  267_[+1]_217
42807                             4.4e-06  282_[+1]_202
50522                             5.3e-06  208_[+1]_276
16581                             1.3e-05  15_[+1]_469
43685                             1.6e-05  223_[+1]_261
43364                             2.3e-05  364_[+1]_120
43363                             2.6e-05  336_[+1]_148
50479                             6.2e-05  218_[+1]_266
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=13
48555                    (   81) TCTCCGGGATCTCGGA  1
49511                    (  131) TCTACGAGAATGTGGA  1
49084                    (  211) TGTTCGGGACGCTGGA  1
48554                    (   76) CTTCCGAGATCCCGGA  1
45862                    (  444) TCCTCGAGAATGGGGA  1
44556                    (  268) CCTCCGGGACCCGGAA  1
42807                    (  283) TATACGGGAGTTGGGA  1
50522                    (  209) TCACCCGGACTGCGGA  1
16581                    (   16) TTTCCGGTACATTGAA  1
43685                    (  224) TTGTCGTGACACCGGA  1
43364                    (  365) TCTTCACGATGTTGGA  1
43363                    (  337) TTTCCGAGATTAAGCA  1
50479                    (  219) TGCACGAAAATTCGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 8.91886 E= 6.6e-001
 -1035   -63 -1035   159
  -175    96   -53    14
  -175   -63  -153   130
   -17    96 -1035    14
 -1035   207 -1035 -1035
  -175  -162   193 -1035
    56  -162   105  -186
  -175 -1035   193  -186
   194 -1035 -1035 -1035
   -17    70  -153    14
   -76    -4   -53    72
  -175    37     5    46
  -175    70     5    14
 -1035 -1035   217 -1035
   -17  -162   164 -1035
   194 -1035 -1035 -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 6.6e-001
 0.000000  0.153846  0.000000  0.846154
 0.076923  0.461538  0.153846  0.307692
 0.076923  0.153846  0.076923  0.692308
 0.230769  0.461538  0.000000  0.307692
 0.000000  1.000000  0.000000  0.000000
 0.076923  0.076923  0.846154  0.000000
 0.384615  0.076923  0.461538  0.076923
 0.076923  0.000000  0.846154  0.076923
 1.000000  0.000000  0.000000  0.000000
 0.230769  0.384615  0.076923  0.307692
 0.153846  0.230769  0.153846  0.461538
 0.076923  0.307692  0.230769  0.384615
 0.076923  0.384615  0.230769  0.307692
 0.000000  0.000000  1.000000  0.000000
 0.230769  0.076923  0.692308  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[CT]T[CTA]CG[GA]GA[CTA][TC][TCG][CTG]G[GA]A
--------------------------------------------------------------------------------




Time  1.48 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =  13  llr = 142  E-value = 1.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :5a:3375a55853a:
pos.-specific     C  62:822::::3142:5
probability       G  42:15533:52213:5
matrix            T  :2:11::2:::::2::

         bits    2.2
                 2.0   *     *     *
                 1.7   *     *     *
                 1.5   *     *     *
Relative         1.3   **    *     *
Entropy          1.1 * **  * ** *  **
(15.8 bits)      0.9 * **  * ** *  **
                 0.7 * ** ******** **
                 0.4 * ** ******** **
                 0.2 ************* **
                 0.0 ----------------

Multilevel           CAACGGAAAAAAAAAG
consensus            G   AAGG GC CG C
sequence                          C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
48554                       130  5.87e-08 CCTGCCGAAA GAACAGAAAGAACAAG CCTCTTTAAC
16581                        40  1.50e-07 AAGCCGATGT CAACGGAGAGGACGAG TATGCTCCAG
48555                       259  2.10e-07 GTTTACGTTC CCACGGAGAAAAACAG GTCCCAGTTT
50522                        80  4.49e-07 AATCTTTTGA GAACGGAAAGAAGGAC GCAAACTTGA
49511                        58  1.41e-06 GTTTCAACCG CAACTGGAAGAACAAG GTGTCACGAT
44556                       102  2.93e-06 AGAAGCCAAG GCACAAGAAAAAAAAG TCCGGGATTG
49084                       127  3.88e-06 ACTGCAAATG CTACGAAGAACAATAG GACATTTGAC
43364                       311  8.77e-06 TTCCCACTGG CAACCAAGAACGCGAC GCTCTACGCC
43685                       153  1.01e-05 TCGCTCACGT CGACCAATAAAAACAC GACCGATGTG
42807                       455  1.25e-05 AGAGACTTGT CAAGGCGAAACAAAAC GCCACACGAC
45862                       204  1.42e-05 GGAAAGGGTG GTACAGAAAACGATAC GAACCCATGT
50479                        68  1.61e-05 CAGAGAGAAC CGATAGATAGAAAGAG TTGTTTGTGC
43363                       369  2.95e-05 TACACTTACG GAACGCGAAGGCCCAC ACCTCATCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48554                             5.9e-08  129_[+2]_355
16581                             1.5e-07  39_[+2]_445
48555                             2.1e-07  258_[+2]_226
50522                             4.5e-07  79_[+2]_405
49511                             1.4e-06  57_[+2]_427
44556                             2.9e-06  101_[+2]_383
49084                             3.9e-06  126_[+2]_358
43364                             8.8e-06  310_[+2]_174
43685                               1e-05  152_[+2]_332
42807                             1.2e-05  454_[+2]_30
45862                             1.4e-05  203_[+2]_281
50479                             1.6e-05  67_[+2]_417
43363                               3e-05  368_[+2]_116
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=13
48554                    (  130) GAACAGAAAGAACAAG  1
16581                    (   40) CAACGGAGAGGACGAG  1
48555                    (  259) CCACGGAGAAAAACAG  1
50522                    (   80) GAACGGAAAGAAGGAC  1
49511                    (   58) CAACTGGAAGAACAAG  1
44556                    (  102) GCACAAGAAAAAAAAG  1
49084                    (  127) CTACGAAGAACAATAG  1
43364                    (  311) CAACCAAGAACGCGAC  1
43685                    (  153) CGACCAATAAAAACAC  1
42807                    (  455) CAAGGCGAAACAAAAC  1
45862                    (  204) GTACAGAAAACGATAC  1
50479                    (   68) CGATAGATAGAAAGAG  1
43363                    (  369) GAACGCGAAGGCCCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.45029 E= 1.6e+000
 -1035   137    79 -1035
   105   -63   -53   -86
   194 -1035 -1035 -1035
 -1035   183  -153  -186
    24   -63   105  -186
    24   -63   127 -1035
   141 -1035    47 -1035
   105 -1035    47   -86
   194 -1035 -1035 -1035
   105 -1035   105 -1035
   105    37   -53 -1035
   156  -162   -53 -1035
   105    70  -153 -1035
    24    -4    47   -86
   194 -1035 -1035 -1035
 -1035    96   127 -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 1.6e+000
 0.000000  0.615385  0.384615  0.000000
 0.538462  0.153846  0.153846  0.153846
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.846154  0.076923  0.076923
 0.307692  0.153846  0.461538  0.076923
 0.307692  0.153846  0.538462  0.000000
 0.692308  0.000000  0.307692  0.000000
 0.538462  0.000000  0.307692  0.153846
 1.000000  0.000000  0.000000  0.000000
 0.538462  0.000000  0.461538  0.000000
 0.538462  0.307692  0.153846  0.000000
 0.769231  0.076923  0.153846  0.000000
 0.538462  0.384615  0.076923  0.000000
 0.307692  0.230769  0.307692  0.153846
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.461538  0.538462  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CG]AAC[GA][GA][AG][AG]A[AG][AC]A[AC][AGC]A[GC]
--------------------------------------------------------------------------------




Time  3.10 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  18  sites =   5  llr = 84  E-value = 4.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::82a::882::822:2:
pos.-specific     C  88:6:a:2288a2628:8
probability       G  22:2::a::::::22:82
matrix            T  ::2:::::::2:::42::

         bits    2.2      **    *
                 2.0     ***    *
                 1.7     ***    *
                 1.5     ***    *
Relative         1.3 *** *********  ***
Entropy          1.1 *** *********  ***
(24.1 bits)      0.9 *** *********  ***
                 0.7 ************** ***
                 0.4 ************** ***
                 0.2 ************** ***
                 0.0 ------------------

Multilevel           CCACACGAACCCACTCGC
consensus            GGTA   CCAT CAATAG
sequence                G         GC
                                   G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
43685                       265  7.00e-11 TGCCACCTTT CCAAACGAACCCACTCGC CTCCCTTCTT
50522                        42  1.07e-08 AGCCTTTTTT CGACACGAAACCAGCCGC CGGTCATTTG
45862                        23  1.47e-08 GGCCAAGACG GCACACGAACCCCCGCAC AACCAGAAGT
16581                       140  3.60e-08 CAAAAAGGCA CCTGACGCACCCAAACGC TTCAAGTCCG
42807                       472  3.60e-08 AAACAAAACG CCACACGACCTCACTTGG GGCTGGAAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43685                               7e-11  264_[+3]_218
50522                             1.1e-08  41_[+3]_441
45862                             1.5e-08  22_[+3]_460
16581                             3.6e-08  139_[+3]_343
42807                             3.6e-08  471_[+3]_11
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=18 seqs=5
43685                    (  265) CCAAACGAACCCACTCGC  1
50522                    (   42) CGACACGAAACCAGCCGC  1
45862                    (   23) GCACACGAACCCCCGCAC  1
16581                    (  140) CCTGACGCACCCAAACGC  1
42807                    (  472) CCACACGACCTCACTTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 6279 bayes= 10.545 E= 4.5e+002
  -897   175   -15  -897
  -897   175   -15  -897
   162  -897  -897   -48
   -38   134   -15  -897
   194  -897  -897  -897
  -897   207  -897  -897
  -897  -897   217  -897
   162   -25  -897  -897
   162   -25  -897  -897
   -38   175  -897  -897
  -897   175  -897   -48
  -897   207  -897  -897
   162   -25  -897  -897
   -38   134   -15  -897
   -38   -25   -15    51
  -897   175  -897   -48
   -38  -897   184  -897
  -897   175   -15  -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 5 E= 4.5e+002
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.200000  0.600000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.200000  0.600000  0.200000  0.000000
 0.200000  0.200000  0.200000  0.400000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.800000  0.200000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][CG][AT][CAG]ACG[AC][AC][CA][CT]C[AC][CAG][TACG][CT][GA][CG]
--------------------------------------------------------------------------------




Time  4.47 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42807                            6.05e-08  282_[+1(4.42e-06)]_156_\
    [+2(1.25e-05)]_1_[+3(3.60e-08)]_11
48554                            2.47e-07  75_[+1(8.45e-07)]_38_[+2(5.87e-08)]_\
    355
43363                            5.56e-03  336_[+1(2.58e-05)]_16_\
    [+2(2.95e-05)]_116
43364                            7.01e-04  310_[+2(8.77e-06)]_38_\
    [+1(2.26e-05)]_120
43685                            5.16e-10  152_[+2(1.01e-05)]_55_\
    [+1(1.60e-05)]_25_[+3(7.00e-11)]_218
49084                            3.06e-05  126_[+2(3.88e-06)]_68_\
    [+1(4.61e-07)]_274
49511                            2.54e-06  57_[+2(1.41e-06)]_57_[+1(2.39e-07)]_\
    354
16581                            2.77e-09  15_[+1(1.28e-05)]_8_[+2(1.50e-07)]_\
    84_[+3(3.60e-08)]_343
44556                            7.40e-06  101_[+2(2.93e-06)]_150_\
    [+1(1.66e-06)]_11_[+3(8.50e-05)]_188
48555                            2.89e-08  80_[+1(1.04e-08)]_162_\
    [+2(2.10e-07)]_226
50479                            7.82e-03  67_[+2(1.61e-05)]_135_\
    [+1(6.18e-05)]_266
50522                            1.09e-09  41_[+3(1.07e-08)]_20_[+2(4.49e-07)]_\
    113_[+1(5.27e-06)]_276
45862                            9.09e-09  22_[+3(1.47e-08)]_163_\
    [+2(1.42e-05)]_224_[+1(1.20e-06)]_41
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
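
The motif diagrams in the summary above can be sanity-checked: the spacer lengths plus the motif widths should add up to the sequence length, which is 500 for every sequence in the training set. The Python sketch below is not part of the MEME output. The names `diagram_length`, `SITE_TOKEN` and `MOTIF_WIDTHS` are illustrative assumptions, the widths are copied from the MOTIF header lines of this file, and a diagram that was wrapped with a trailing backslash is assumed to have been joined into a single string first.

```python
import re

# Motif widths copied from the "MOTIF n ... width = w" header lines of this file.
MOTIF_WIDTHS = {1: 16, 2: 16, 3: 18}

# One motif occurrence, e.g. "[+1]" or "[+2(1.25e-05)]" (strand, motif number,
# optional p-value in parentheses).
SITE_TOKEN = re.compile(r"\[[+-](\d+)(?:\([^)]*\))?\]")


def diagram_length(diagram, widths=MOTIF_WIDTHS):
    """Return the sequence length implied by a MEME motif diagram string."""
    total = 0
    for token in diagram.split("_"):
        if token.isdigit():
            total += int(token)  # spacer: that many positions without a site
            continue
        match = SITE_TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised diagram token: {token!r}")
        total += widths[int(match.group(1))]  # motif site: add its width
    return total


if __name__ == "__main__":
    examples = {
        "48555, motif 1 only": "80_[+1]_404",
        "42807, combined": "282_[+1(4.42e-06)]_156_[+2(1.25e-05)]_1_[+3(3.60e-08)]_11",
    }
    for label, diagram in examples.items():
        print(label, diagram_length(diagram))
```

A result other than the length listed in the TRAINING SET table would point to a transcription error in the diagram or to a wrong motif width.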