********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/31/31.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43008                    1.0000    500  46568                    1.0000    500
46635                    1.0000    500  47110                    1.0000    500
47180                    1.0000    500  21982                    1.0000    500
22274                    1.0000    500  48628                    1.0000    500
48788                    1.0000    500  18008                    1.0000    500
43434                    1.0000    500  50293                    1.0000    500
43803                    1.0000    500  44272                    1.0000    500
34392                    1.0000    500  35090                    1.0000    500
12459                    1.0000    500  20574                    1.0000    500
45653                    1.0000    500  49386                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/31/31.seqs.fa -oc motifs/31 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10000    N=              20
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.260 C 0.230 G 0.239 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.260 C 0.230 G 0.239 T 0.271
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  12  llr = 162  E-value = 2.0e-006
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:61::8365:388
pos.-specific     C  1::9:::1:7::51:1
probability       G  :a:127192115372:
matrix            T  9:::339:::3:3::1

         bits    2.1  *
                 1.9  **
                 1.7  ***   *
                 1.5 ****  **
Relative         1.3 ****  ***     *
Entropy          1.1 ****  ***  *  **
(19.4 bits)      0.8 **** ***** * ***
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TGACAGTGACAACGAA
consensus                TT   ATGGA
sequence                         T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
47110                        77  1.64e-09 TATCTATGTC TGACAGTGACAGTGAA TGATGACCTG
50293                       179  8.40e-09 TGATAGCCAG TGACAGTGAATGCGAA TGAGAACTAA
12459                        34  1.02e-08 ATATGTGATG TGACAGTGACAAGAAA TCTTACCTAT
43008                        11  7.18e-08 TGTCGTGTTC TGACAGTGGCAACGGA GTGGGACTGA
46635                       247  9.86e-08 GTAATCAAGG TGACGGTGACTACGGA TACGAGTGAA
34392                       225  1.56e-07 TCTAACGTCA TGACATTGACAGGCAA ATTAAATTTG
43434                       231  3.24e-07 TAGGATTTGC TGACTTTGACTGTAAA CAAAGACCCC
18008                       255  3.52e-07 TTCTCCTAAT TGACAGTGAAAACAAT TCCATTTTCC
49386                       421  6.08e-07 TGCGAAAACC CGACTTTGACAGTGAA GCAGCTAATG
43803                       404  8.76e-07 GATAGAAGAC TGACGATGACAACGAC GACGAATCCC
20574                        92  4.58e-06 GGGTTAGATC TGACAGTCGGGACGAA TTTCATGAGA
46568                       213  4.80e-06 TTGTCCGATT TGAGTGGGAATGGGAA ATGTACAGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47110                             1.6e-09  76_[+1]_408
50293                             8.4e-09  178_[+1]_306
12459                               1e-08  33_[+1]_451
43008                             7.2e-08  10_[+1]_474
46635                             9.9e-08  246_[+1]_238
34392                             1.6e-07  224_[+1]_260
43434                             3.2e-07  230_[+1]_254
18008                             3.5e-07  254_[+1]_230
49386                             6.1e-07  420_[+1]_64
43803                             8.8e-07  403_[+1]_81
20574                             4.6e-06  91_[+1]_393
46568                             4.8e-06  212_[+1]_272
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=12
47110                    (   77) TGACAGTGACAGTGAA  1
50293                    (  179) TGACAGTGAATGCGAA  1
12459                    (   34) TGACAGTGACAAGAAA  1
43008                    (   11) TGACAGTGGCAACGGA  1
46635                    (  247) TGACGGTGACTACGGA  1
34392                    (  225) TGACATTGACAGGCAA  1
43434                    (  231) TGACTTTGACTGTAAA  1
18008                    (  255) TGACAGTGAAAACAAT  1
49386                    (  421) CGACTTTGACAGTGAA  1
43803                    (  404) TGACGATGACAACGAC  1
20574                    (   92) TGACAGTCGGGACGAA  1
46568                    (  213) TGAGTGGGAATGGGAA  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 10.105 E= 2.0e-006
 -1023   -146  -1023    176
 -1023  -1023    206  -1023
   194  -1023  -1023  -1023
 -1023    200   -152  -1023
   116  -1023    -52    -12
  -164  -1023    148    -12
 -1023  -1023   -152    176
 -1023   -146    194  -1023
   168  -1023    -52  -1023
    -6    154   -152  -1023
   116  -1023   -152     30
    94  -1023    106  -1023
 -1023    112      6    -12
    -6   -146    148  -1023
   168  -1023    -52  -1023
   168   -146  -1023   -170
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 2.0e-006
 0.000000  0.083333  0.000000  0.916667
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.916667  0.083333  0.000000
 0.583333  0.000000  0.166667  0.250000
 0.083333  0.000000  0.666667  0.250000
 0.000000  0.000000  0.083333  0.916667
 0.000000  0.083333  0.916667  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.250000  0.666667  0.083333  0.000000
 0.583333  0.000000  0.083333  0.333333
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.250000  0.083333  0.666667  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.833333  0.083333  0.000000  0.083333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TGAC[AT][GT]TGA[CA][AT][AG][CGT][GA]AA
--------------------------------------------------------------------------------

Time  3.71 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  13  llr = 132  E-value = 2.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  2:::7:4:1:95
pos.-specific     C  15:a:a1:7a:2
probability       G  222:3::9::12
matrix            T  538:::512::1

         bits    2.1    * *   *
                 1.9    * *   *
                 1.7    * * * *
                 1.5    * * * **
Relative         1.3   ** * * **
Entropy          1.1   **** * **
(14.6 bits)      0.8   **** ****
                 0.6  **********
                 0.4  ***********
                 0.2 ************
                 0.0 ------------

Multilevel           TCTCACTGCCAA
consensus            AT  G A T  G
sequence             G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47110                        45  5.57e-07 CGAATCGCGC TCTCACAGCCAG TCTCTTAATC
46635                       311  1.77e-06 ACCATATTCC TGTCACAGCCAA ACGGATTCGG
43434                       303  2.48e-06 AGGTGGGTAG ACTCACTGTCAA ACGGCAGCGT
45653                       110  2.99e-06 AAGCCTGGCT TCTCACTGCCAT CGTTGATATC
20574                       386  7.36e-06 AGCTCGTGTT GCGCGCTGCCAA GCGCGAACAA
48628                       247  7.36e-06 GTTAAAACAG CCTCACTGCCAC GGCAACTCCG
48788                        82  8.80e-06 GCAACAGTTA TTTCGCAGCCAC TTTCAGAATA
49386                       373  9.63e-06 TTTGTAGGTA TCTCACTGACAG TAACTCTGAC
34392                         3  1.36e-05         AC ATTCGCTGTCAA CACTTGTTGA
47180                        82  1.72e-05 ACTTCAAGAC GTTCGCAGTCAA TTGTATGCAC
50293                       460  2.31e-05 CAGAGGGGTA GTTCACTTCCAA CGTTTTTTCT
18008                       351  2.46e-05 ACGGTGGGGA TCGCACCGCCAG AACGGGACTG
21982                       227  3.81e-05 AACCCACATC AGTCACAGCCGA CGGCCGACAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47110                             5.6e-07  44_[+2]_444
46635                             1.8e-06  310_[+2]_178
43434                             2.5e-06  302_[+2]_186
45653                               3e-06  109_[+2]_379
20574                             7.4e-06  385_[+2]_103
48628                             7.4e-06  246_[+2]_242
48788                             8.8e-06  81_[+2]_407
49386                             9.6e-06  372_[+2]_116
34392                             1.4e-05  2_[+2]_486
47180                             1.7e-05  81_[+2]_407
50293                             2.3e-05  459_[+2]_29
18008                             2.5e-05  350_[+2]_138
21982                             3.8e-05  226_[+2]_262
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=13
47110                    (   45) TCTCACAGCCAG  1
46635                    (  311) TGTCACAGCCAA  1
43434                    (  303) ACTCACTGTCAA  1
45653                    (  110) TCTCACTGCCAT  1
20574                    (  386) GCGCGCTGCCAA  1
48628                    (  247) CCTCACTGCCAC  1
48788                    (   82) TTTCGCAGCCAC  1
49386                    (  373) TCTCACTGACAG  1
34392                    (    3) ATTCGCTGTCAA  1
47180                    (   82) GTTCGCAGTCAA  1
50293                    (  460) GTTCACTTCCAA  1
18008                    (  351) TCGCACCGCCAG  1
21982                    (  227) AGTCACAGCCGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9780 bayes= 10.0844 E= 2.7e+002
   -17   -157     -5     77
 -1035    123    -64     18
 -1035  -1035    -64    164
 -1035    212  -1035  -1035
   141  -1035     36  -1035
 -1035    212  -1035  -1035
    56   -157  -1035     99
 -1035  -1035    195   -181
  -176    159  -1035    -23
 -1035    212  -1035  -1035
   183  -1035   -163  -1035
   105    -58     -5   -181
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 2.7e+002
 0.230769  0.076923  0.230769  0.461538
 0.000000  0.538462  0.153846  0.307692
 0.000000  0.000000  0.153846  0.846154
 0.000000  1.000000  0.000000  0.000000
 0.692308  0.000000  0.307692  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.384615  0.076923  0.000000  0.538462
 0.000000  0.000000  0.923077  0.076923
 0.076923  0.692308  0.000000  0.230769
 0.000000  1.000000  0.000000  0.000000
 0.923077  0.000000  0.076923  0.000000
 0.538462  0.153846  0.230769  0.076923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TAG][CT]TC[AG]C[TA]G[CT]CA[AG]
--------------------------------------------------------------------------------

Time  7.04 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   5  llr = 81  E-value = 5.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::6:2::28:::6:a2
pos.-specific     C  82:::4a::aa822:8
probability       G  :8::8::8:::2:8::
matrix            T  2:4a:6::2:::2:::

         bits    2.1       *  **
                 1.9    *  *  **   *
                 1.7    *  *  **   *
                 1.5    *  *  ***  *
Relative         1.3 ** ** ****** ***
Entropy          1.1 ** ********* ***
(23.4 bits)      0.8 ************ ***
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CGATGTCGACCCAGAC
consensus            TCT AC AT  GCC A
sequence                         T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
21982                       170  9.24e-10 GCAAAGAGCT CGATGTCGACCCTGAC TTGGCAAGGC
47180                        58  1.86e-08 TTAGCCGAGG CGATGCCGTCCGAGAC TTCAAGACGT
50293                       484  2.58e-08 TTTTTTCTTT CCTTGTCGACCCACAC G
22274                         1  3.26e-08          . TGTTGTCAACCCAGAC CACAATGTAC
43434                       473  5.93e-08 AATGGCGCAT CGATACCGACCCCGAA CTGGCAGCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21982                             9.2e-10  169_[+3]_315
47180                             1.9e-08  57_[+3]_427
50293                             2.6e-08  483_[+3]_1
22274                             3.3e-08  [+3]_484
43434                             5.9e-08  472_[+3]_12
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=5
21982                    (  170) CGATGTCGACCCTGAC  1
47180                    (   58) CGATGCCGTCCGAGAC  1
50293                    (  484) CCTTGTCGACCCACAC  1
22274                    (    1) TGTTGTCAACCCAGAC  1
43434                    (  473) CGATACCGACCCCGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 11.1728 E= 5.4e+002
  -897    180   -897    -44
  -897    -20    174   -897
   120   -897   -897     56
  -897   -897   -897    188
   -38   -897    174   -897
  -897     80   -897    114
  -897    212   -897   -897
   -38   -897    174   -897
   162   -897   -897    -44
  -897    212   -897   -897
  -897    212   -897   -897
  -897    180    -26   -897
   120    -20   -897    -44
  -897    -20    174   -897
   194   -897   -897   -897
   -38    180   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 5.4e+002
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.200000  0.800000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.600000  0.200000  0.000000  0.200000
 0.000000  0.200000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CT][GC][AT]T[GA][TC]C[GA][AT]CC[CG][ACT][GC]A[CA]
--------------------------------------------------------------------------------

Time 10.33 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43008                            1.13e-03  10_[+1(7.18e-08)]_474
46568                            2.12e-02  212_[+1(4.80e-06)]_272
46635                            2.65e-06  225_[+1(9.47e-05)]_5_[+1(9.86e-08)]_\
    48_[+2(1.77e-06)]_178
47110                            2.98e-08  44_[+2(5.57e-07)]_20_[+1(1.64e-09)]_\
    408
47180                            1.48e-06  57_[+3(1.86e-08)]_8_[+2(1.72e-05)]_\
    407
21982                            7.45e-07  169_[+3(9.24e-10)]_41_\
    [+2(3.81e-05)]_262
22274                            3.39e-04  [+3(3.26e-08)]_484
48628                            1.42e-02  246_[+2(7.36e-06)]_242
48788                            4.57e-02  81_[+2(8.80e-06)]_407
18008                            1.03e-04  254_[+1(3.52e-07)]_80_\
    [+2(2.46e-05)]_60_[+1(7.26e-05)]_62
43434                            1.99e-09  230_[+1(3.24e-07)]_56_\
    [+2(2.48e-06)]_158_[+3(5.93e-08)]_12
50293                            2.44e-10  178_[+1(8.40e-09)]_265_\
    [+2(2.31e-05)]_12_[+3(2.58e-08)]_1
43803                            1.23e-03  403_[+1(8.76e-07)]_81
44272                            4.47e-01  500
34392                            1.97e-05  2_[+2(1.36e-05)]_210_[+1(1.56e-07)]_\
    260
35090                            6.59e-01  500
12459                            9.40e-05  33_[+1(1.02e-08)]_52_[+1(1.40e-06)]_\
    383
20574                            3.53e-05  91_[+1(4.58e-06)]_278_\
    [+2(7.36e-06)]_103
45653                            3.86e-02  109_[+2(2.99e-06)]_379
49386                            9.08e-05  372_[+2(9.63e-06)]_36_\
    [+1(6.08e-07)]_64
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************