********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/29/29.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47039                    1.0000    500  47316                    1.0000    500
43576                    1.0000    500  30615                    1.0000    500
41064                    1.0000    500  50384                    1.0000    500
44223                    1.0000    500  50629                    1.0000    500
45773                    1.0000    500  27659                    1.0000    500
17372                    1.0000    500  44386                    1.0000    500
50029                    1.0000    500  37949                    1.0000    500
43287                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/29/29.seqs.fa -oc motifs/29 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.263 C 0.243 G 0.218 T 0.276
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.243 G 0.218 T 0.276
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =   7  llr = 104  E-value = 3.1e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::1:::::::::6:
pos.-specific     C  1:::6a:917::16:1
probability       G  :a141::111:a34:9
matrix            T  9:961:a:71a:6:4:

         bits    2.2  *         *
                 2.0  *   *     *
                 1.8  *   **   **
                 1.5  *   ***  **   *
Relative         1.3 ***  ***  **   *
Entropy          1.1 **** ***  ** * *
(21.5 bits)      0.9 **** ******* ***
                 0.7 **** ***********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TGTTCCTCTCTGTCAG
consensus               G        GGT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
43287                         9  1.71e-09   AGCTGGAT TGTTCCTCTCTGGCAG TCGAAATTCC
17372                       396  1.70e-08 CGAGTGAATT TGTTCCTCTCTGTCAC ACACTCAGTT
44223                       400  2.86e-08 GGCGTGTTTG TGTGCCTCTGTGTGTG TGTGTTTTTG
30615                       162  1.13e-07 GATTTTTTTC TGGGTCTCTCTGTCTG ATTAGAAAAG
50384                        73  1.63e-07 CATGGTCGTT TGTTGCTCCCTGGGTG TTACTGTTAG
37949                         5  2.42e-07       GTGG CGTTACTCGCTGTCAG ATCAGGATCC
45773                        93  3.92e-07 TCAACAACCC TGTGCCTGTTTGCGAG GTCGCCACCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43287                             1.7e-09  8_[+1]_476
17372                             1.7e-08  395_[+1]_89
44223                             2.9e-08  399_[+1]_85
30615                             1.1e-07  161_[+1]_323
50384                             1.6e-07  72_[+1]_412
37949                             2.4e-07  4_[+1]_480
45773                             3.9e-07  92_[+1]_392
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=7
43287                    (    9) TGTTCCTCTCTGGCAG  1
17372                    (  396) TGTTCCTCTCTGTCAC  1
44223                    (  400) TGTGCCTCTGTGTGTG  1
30615                    (  162) TGGGTCTCTCTGTCTG  1
50384                    (   73) TGTTGCTCCCTGGGTG  1
37949                    (    5) CGTTACTCGCTGTCAG  1
45773                    (   93) TGTGCCTGTTTGCGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 9.86371 E= 3.1e+000
  -945    -76   -945    163
  -945   -945    220   -945
  -945   -945    -61    163
  -945   -945     97    105
   -88    123    -61    -95
  -945    204   -945   -945
  -945   -945   -945    186
  -945    182    -61   -945
  -945    -76    -61    137
  -945    156    -61    -95
  -945   -945   -945    186
  -945   -945    220   -945
  -945    -76     39    105
  -945    123     97   -945
   112   -945   -945     63
  -945    -76    197   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 3.1e+000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.000000  0.428571  0.571429
 0.142857  0.571429  0.142857  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.857143  0.142857  0.000000
 0.000000  0.142857  0.142857  0.714286
 0.000000  0.714286  0.142857  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.142857  0.285714  0.571429
 0.000000  0.571429  0.428571  0.000000
 0.571429  0.000000  0.000000  0.428571
 0.000000  0.142857  0.857143  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TGT[TG]CCTCTCTG[TG][CG][AT]G
--------------------------------------------------------------------------------




Time 2.61 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  12  llr = 123  E-value = 5.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3a:::63421:8
pos.-specific     C  ::5:128:::2:
probability       G  3:::92:18:83
matrix            T  3:5a:1:5:9::

         bits    2.2
                 2.0  *
                 1.8  * **
                 1.5  * **   ***
Relative         1.3  * **   ***
Entropy          1.1  * ** * ****
(14.7 bits)      0.9  **** * ****
                 0.7  **** ******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AACTGACTGTGA
consensus            G T   AA   G
sequence             T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47039                        16  5.10e-08 GTCAACCACT GACTGACTGTGA ACTCGCAGCA
50384                       306  3.98e-07 AGCGATGGGA AACTGACAGTGA TGCATCCATT
47316                       337  3.98e-07 TAGGGCGAAA TACTGACTGTGA ACAACCCCAG
37949                       386  2.73e-06 TGTAACTGGA AATTGGCTGTGA GAGATGTTGA
50029                       320  3.16e-06 TTTGGCAGTT AATTGCCTGTGA CTTCCTCCAA
44386                        21  1.10e-05 CGGGAGTTTG GACTGACAATGG TGTCGCGTTG
43576                       100  1.10e-05 GGGAAGAGGT TATTGACTGAGA TGTAAAGGAA
50629                       469  2.07e-05 GAAGAGCAGA AACTGTAAGTGA GGACTTCGTC
17372                        69  3.26e-05 ATGGAATTGC TATTGCCGGTGG AGTTTCTCGG
41064                       408  3.26e-05 GAGCCTTACG GACTGGAAGTCA TCTCTCTTTT
45773                       306  3.91e-05 GAGTACAGGA TATTGAAAATGG ATCGATGTGG
30615                        88  4.11e-05 CGAACCAAAA GATTCACTGTCA ATTGCTTACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47039                             5.1e-08  15_[+2]_473
50384                               4e-07  305_[+2]_183
47316                               4e-07  336_[+2]_152
37949                             2.7e-06  385_[+2]_103
50029                             3.2e-06  319_[+2]_169
44386                             1.1e-05  20_[+2]_468
43576                             1.1e-05  99_[+2]_389
50629                             2.1e-05  468_[+2]_20
17372                             3.3e-05  68_[+2]_420
41064                             3.3e-05  407_[+2]_81
45773                             3.9e-05  305_[+2]_183
30615                             4.1e-05  87_[+2]_401
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=12
47039                    (   16) GACTGACTGTGA  1
50384                    (  306) AACTGACAGTGA  1
47316                    (  337) TACTGACTGTGA  1
37949                    (  386) AATTGGCTGTGA  1
50029                    (  320) AATTGCCTGTGA  1
44386                    (   21) GACTGACAATGG  1
43576                    (  100) TATTGACTGAGA  1
50629                    (  469) AACTGTAAGTGA  1
17372                    (   69) TATTGCCGGTGG  1
41064                    (  408) GACTGGAAGTCA  1
45773                    (  306) TATTGAAAATGG  1
30615                    (   88) GATTCACTGTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 9.70135 E= 5.8e+001
    34  -1023     61     27
   192  -1023  -1023  -1023
 -1023    104  -1023     86
 -1023  -1023  -1023    186
 -1023   -154    207  -1023
   115    -54    -39   -172
    -8    163  -1023  -1023
    66  -1023   -139     86
   -66  -1023    193  -1023
  -166  -1023  -1023    173
 -1023    -54    193  -1023
   151  -1023     20  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 5.8e+001
 0.333333  0.000000  0.333333  0.333333
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.083333  0.916667  0.000000
 0.583333  0.166667  0.166667  0.083333
 0.250000  0.750000  0.000000  0.000000
 0.416667  0.000000  0.083333  0.500000
 0.166667  0.000000  0.833333  0.000000
 0.083333  0.000000  0.000000  0.916667
 0.000000  0.166667  0.833333  0.000000
 0.750000  0.000000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AGT]A[CT]TGA[CA][TA]GTG[AG]
--------------------------------------------------------------------------------




Time 5.05 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =  10  llr = 143  E-value = 9.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::2::62:56734256:4:48
pos.-specific     C  5:6183:64:24:22:92:62
probability       G  4228:1811::3663:14a::
matrix            T  18:12::3:41::::4:::::

         bits    2.2                   *
                 2.0                   *
                 1.8                   *
                 1.5                 * *
Relative         1.3    ** *         * * *
Entropy          1.1  * ** *     *   * ***
(20.7 bits)      0.9  * ** *  ** *  ** ***
                 0.7 *********** ** ** ***
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CTCGCAGCAAACGGAACAGCA
consensus            GGA TCATCTCAAAGT G AC
sequence               G        G CC  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
45773                       124  1.52e-10 CACCATGGTC GTCGCCGCCAACAGGACGGCA TAAACCACGA
50629                       393  3.77e-09 GACCAAGTCA CTCGCAGTCAAAACAACAGCA ATAGAACGCT
47039                        29  6.48e-09 TGACTGTGAA CTCGCAGCAACAGGCACGGCC CGCAGAATCT
30615                        61  1.37e-08 GCTTGCTCTC GGGGCAGCAAACAGAACCGAA CCAAAAGATT
50029                        40  1.55e-07 GGGTAATATT CTAGCAATCTACGAGTCGGCA TCCTAACGGT
43287                       318  3.70e-07 AAAATCGTAA GTGGCAAGCTACGGGTCAGCC GGAAACCCTG
27659                        42  3.98e-07 GCGATACCAG CGCCTAGCAAAAGCATCAGCA CGTCCCGGGT
41064                       177  4.27e-07 ATTTTGTTGT GTCGTGGCAAAGAGAAGCGAA CCCCAGCGAA
17372                        25  4.92e-07 TGAAACAGAA TTCGCCGCGTTGGGATCGGAA TCTGAATTCT
50384                       389  2.09e-06 TCTTCTCGTA CTATCCGTATCGGACACAGAA CGTACACACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45773                             1.5e-10  123_[+3]_356
50629                             3.8e-09  392_[+3]_87
47039                             6.5e-09  28_[+3]_451
30615                             1.4e-08  60_[+3]_419
50029                             1.6e-07  39_[+3]_440
43287                             3.7e-07  317_[+3]_162
27659                               4e-07  41_[+3]_438
41064                             4.3e-07  176_[+3]_303
17372                             4.9e-07  24_[+3]_455
50384                             2.1e-06  388_[+3]_91
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=10
45773                    (  124) GTCGCCGCCAACAGGACGGCA  1
50629                    (  393) CTCGCAGTCAAAACAACAGCA  1
47039                    (   29) CTCGCAGCAACAGGCACGGCC  1
30615                    (   61) GGGGCAGCAAACAGAACCGAA  1
50029                    (   40) CTAGCAATCTACGAGTCGGCA  1
43287                    (  318) GTGGCAAGCTACGGGTCAGCC  1
27659                    (   42) CGCCTAGCAAAAGCATCAGCA  1
41064                    (  177) GTCGTGGCAAAGAGAAGCGAA  1
17372                    (   25) TTCGCCGCGTTGGGATCGGAA  1
50384                    (  389) CTATCCGTATCGGACACAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 10.4342 E= 9.5e+001
  -997    104     87   -146
  -997   -997    -12    154
   -40    131    -12   -997
  -997   -128    187   -146
  -997    172   -997    -46
   119     31   -112   -997
   -40   -997    187   -997
  -997    131   -112     12
    92     72   -112   -997
   119   -997   -997     54
   141    -28   -997   -146
    19     72     46   -997
    60   -997    146   -997
   -40    -28    146   -997
    92    -28     46   -997
   119   -997   -997     54
  -997    189   -112   -997
    60    -28     87   -997
  -997   -997    220   -997
    60    131   -997   -997
   160    -28   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 9.5e+001
 0.000000  0.500000  0.400000  0.100000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.600000  0.200000  0.000000
 0.000000  0.100000  0.800000  0.100000
 0.000000  0.800000  0.000000  0.200000
 0.600000  0.300000  0.100000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.600000  0.100000  0.300000
 0.500000  0.400000  0.100000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.700000  0.200000  0.000000  0.100000
 0.300000  0.400000  0.300000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.200000  0.200000  0.600000  0.000000
 0.500000  0.200000  0.300000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.900000  0.100000  0.000000
 0.400000  0.200000  0.400000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.600000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][TG][CAG]G[CT][AC][GA][CT][AC][AT][AC][CAG][GA][GAC][AGC][AT]C[AGC]G[CA][AC]
--------------------------------------------------------------------------------




Time 7.09 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47039                            1.58e-08  15_[+2(5.10e-08)]_1_[+3(6.48e-09)]_\
    451
47316                            1.67e-03  336_[+2(3.98e-07)]_152
43576                            2.68e-02  99_[+2(1.10e-05)]_389
30615                            2.57e-09  60_[+3(1.37e-08)]_6_[+2(4.11e-05)]_\
    62_[+1(1.13e-07)]_323
41064                            1.67e-04  176_[+3(4.27e-07)]_210_\
    [+2(3.26e-05)]_81
50384                            5.19e-09  72_[+1(1.63e-07)]_105_\
    [+2(9.45e-05)]_100_[+2(3.98e-07)]_71_[+3(2.09e-06)]_91
44223                            4.11e-04  399_[+1(2.86e-08)]_85
50629                            2.97e-06  392_[+3(3.77e-09)]_55_\
    [+2(2.07e-05)]_20
45773                            1.18e-10  92_[+1(3.92e-07)]_15_[+3(1.52e-10)]_\
    161_[+2(3.91e-05)]_183
27659                            3.34e-03  41_[+3(3.98e-07)]_438
17372                            9.78e-09  24_[+3(4.92e-07)]_23_[+2(3.26e-05)]_\
    315_[+1(1.70e-08)]_89
44386                            5.96e-03  20_[+2(1.10e-05)]_468
50029                            4.64e-06  39_[+3(1.55e-07)]_259_\
    [+2(3.16e-06)]_169
37949                            1.58e-05  4_[+1(2.42e-07)]_140_[+2(9.12e-05)]_\
    93_[+2(7.30e-05)]_108_[+2(2.73e-06)]_103
43287                            1.79e-08  8_[+1(1.71e-09)]_293_[+3(3.70e-07)]_\
    162
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************