********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/180/180.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47550                    1.0000    500  22418                    1.0000    500
15820                    1.0000    500  1494                     1.0000    500
33435                    1.0000    500  11197                    1.0000    500
8310                     1.0000    500  44899                    1.0000    500
12155                    1.0000    500  35670                    1.0000    500
38769                    1.0000    500  50556                    1.0000    500
43069                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/180/180.seqs.fa -oc motifs/180 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.259 G 0.216 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.259 G 0.216 T 0.263
********************************************************************************

********************************************************************************
MOTIF  1 MEME    width =  12  sites =   9  llr = 103  E-value = 4.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  31:61:::1:3:
pos.-specific     C  :7a21:::9a2:
probability       G  :::::a4:::4a
matrix            T  72:28:6a::::

         bits    2.2      *     *
                 2.0   *  * * * *
                 1.8   *  * * * *
                 1.5   *  * *** *
Relative         1.3   *  * *** *
Entropy          1.1 * *  ***** *
(16.4 bits)      0.9 * * ****** *
                 0.7 *** ****** *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCCATGTTCCGG
consensus            AT C  G   A
sequence                T      C
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
12155                       352  5.66e-08 ACCAACAAGG TCCATGTTCCGG TTCCATCCGT
8310                        393  5.67e-07 CTTGCTGGAA TCCATGGTCCCG GCGTTTTCCT
33435                       215  1.14e-06 TTCTTTCCTG TCCCTGGTCCAG CTGAACGTGA
11197                       439  1.59e-06 GGAGCGGGCA ACCTTGGTCCGG ACACGGTCGC
47550                       192  4.99e-06 AATCTTTTTC TACTTGTTCCGG GTTTGGTTAA
38769                       368  5.40e-06 ACATCGGTCC ACCATGGTACGG TTGCAACCTT
35670                       282  5.40e-06 CTTACAGTTA TTCCTGTTCCCG TTCACCAGCT
44899                       153  6.88e-06 CCTTCATGCG ACCAAGTTCCAG TCAACACCCG
50556                       170  9.16e-06 AGCGAATCCG TTCACGTTCCAG TATTCCTGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12155                             5.7e-08  351_[+1]_137
8310                              5.7e-07  392_[+1]_96
33435                             1.1e-06  214_[+1]_274
11197                             1.6e-06  438_[+1]_50
47550                               5e-06  191_[+1]_297
38769                             5.4e-06  367_[+1]_121
35670                             5.4e-06  281_[+1]_207
44899                             6.9e-06  152_[+1]_336
50556                             9.2e-06  169_[+1]_319
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
12155                    (  352) TCCATGTTCCGG  1
8310                     (  393) TCCATGGTCCCG  1
33435                    (  215) TCCCTGGTCCAG  1
11197                    (  439) ACCTTGGTCCGG  1
47550                    (  192) TACTTGTTCCGG  1
38769                    (  368) ACCATGGTACGG  1
35670                    (  282) TTCCTGTTCCCG  1
44899                    (  153) ACCAAGTTCCAG  1
50556                    (  170) TTCACGTTCCAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.59664 E= 4.5e+001
    35   -982   -982    134
  -123    136   -982    -24
  -982    195   -982   -982
   109    -22   -982    -24
  -123   -122   -982    156
  -982   -982    221   -982
  -982   -982    104    108
  -982   -982   -982    192
  -123    178   -982   -982
  -982    195   -982   -982
    35    -22    104   -982
  -982   -982    221   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 4.5e+001
 0.333333  0.000000  0.000000  0.666667
 0.111111  0.666667  0.000000  0.222222
 0.000000  1.000000  0.000000  0.000000
 0.555556  0.222222  0.000000  0.222222
 0.111111  0.111111  0.000000  0.777778
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.444444  0.555556
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.888889  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.222222  0.444444  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------
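
The finite entries of the motif 1 log-odds matrix look consistent with the letter-probability matrix and the background letter frequencies reported in the command line summary. The Python sketch below is only an illustration under one assumption: that each finite score is approximately 100 * log2(p / background). The names BACKGROUND, REPORTED and log_odds_score are hypothetical and are not part of MEME. Zero-probability entries (reported as -982) are skipped, since log2(0) is undefined; how MEME arrives at that value is not modelled here.

```python
import math

# Background letter frequencies copied from the command line summary above.
BACKGROUND = {"A": 0.261, "C": 0.259, "G": 0.216, "T": 0.263}

# (letter, probability, reported log-odds score) triples taken from the
# motif 1 probability and scoring matrices above.
REPORTED = [
    ("A", 0.333333, 35),
    ("T", 0.666667, 134),
    ("C", 1.000000, 195),
    ("G", 1.000000, 221),
    ("T", 1.000000, 192),
    ("C", 0.888889, 178),
]


def log_odds_score(probability, letter):
    """Assumed relation: 100 * log2(probability / background frequency)."""
    return 100.0 * math.log2(probability / BACKGROUND[letter])


def max_deviation(entries):
    """Largest absolute gap between the assumed formula and the reported score."""
    return max(abs(log_odds_score(p, letter) - score)
               for letter, p, score in entries)


if __name__ == "__main__":
    for letter, p, score in REPORTED:
        print(letter, p, score, round(log_odds_score(p, letter), 1))
    print("max deviation:", round(max_deviation(REPORTED), 2))
```

Small gaps between the computed and reported values are expected, because the background frequencies are printed to only three decimal places.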
--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TA][CT]C[ACT]TG[TG]TCC[GAC]G
--------------------------------------------------------------------------------

Time  1.35 secs.

********************************************************************************

********************************************************************************
MOTIF  2 MEME    width =  16  sites =   6  llr = 88  E-value = 2.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :32:38:::7:8::::
pos.-specific     C  a52::2:8::7:8a:a
probability       G  ::375:3:a3:22:a:
matrix            T  :2332:72::3:::::

         bits    2.2         *     *
                 2.0 *       *    ***
                 1.8 *       *    ***
                 1.5 *       *    ***
Relative         1.3 *    * **  *****
Entropy          1.1 *  * ***********
(21.1 bits)      0.9 *  * ***********
                 0.7 *  *************
                 0.4 ** *************
                 0.2 ** *************
                 0.0 ----------------

Multilevel           CCGGGATCGACACCGC
consensus             ATTA G  GT
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
11197                       469  1.43e-08 GCCTCCTTGC CCTGGATCGACGCCGC TCGTGTCGAG
22418                       346  3.63e-08 GAAGTGTCTC CACTGATCGACACCGC CAGCCATTCA
33435                       131  6.15e-08 CATCTAGAAT CTAGAATCGACACCGC TCCAATTAAT
35670                       439  9.91e-08 GGGACCTAAT CCTGGATTGGTACCGC ATGTGATACC
15820                       442  2.97e-07 AATTGTCTTC CAGGAAGCGGTAGCGC AGTCAACGAT
1494                        405  3.38e-07 CCTTACGAGA CCGTTCGCGACACCGC CATTACCCAG
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11197                             1.4e-08  468_[+2]_16
22418                             3.6e-08  345_[+2]_139
33435                             6.1e-08  130_[+2]_354
35670                             9.9e-08  438_[+2]_46
15820                               3e-07  441_[+2]_43
1494                              3.4e-07  404_[+2]_80
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=6
11197                    (  469) CCTGGATCGACGCCGC  1
22418                    (  346) CACTGATCGACACCGC  1
33435                    (  131) CTAGAATCGACACCGC  1
35670                    (  439) CCTGGATTGGTACCGC  1
15820                    (  442) CAGGAAGCGGTAGCGC  1
1494                     (  405) CCGTTCGCGACACCGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 10.4838 E= 2.6e+002
  -923    195   -923   -923
    35     95   -923    -66
   -65    -64     62     34
  -923   -923    162     34
    35   -923    121    -66
   167    -64   -923   -923
  -923   -923     62    134
  -923    168   -923    -66
  -923   -923    221   -923
   135   -923     62   -923
  -923    136   -923     34
   167   -923    -37   -923
  -923    168    -37   -923
  -923    195   -923   -923
  -923   -923    221   -923
  -923    195   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 2.6e+002
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.500000  0.000000  0.166667
 0.166667  0.166667  0.333333  0.333333
 0.000000  0.000000  0.666667  0.333333
 0.333333  0.000000  0.500000  0.166667
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[CA][GT][GT][GA]A[TG]CG[AG][CT]ACCGC
--------------------------------------------------------------------------------

Time  2.79 secs.

********************************************************************************

********************************************************************************
MOTIF  3 MEME    width =  15  sites =   4  llr = 65  E-value = 1.9e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :83:3:aa:38::a:
pos.-specific     C  a:5::a::a:3a::a
probability       G  :::a8::::3::8::
matrix            T  :33::::::5::3::

         bits    2.2    *
                 2.0 *  * ****  * **
                 1.8 *  * ****  * **
                 1.5 *  * ****  * **
Relative         1.3 *  ******  ****
Entropy          1.1 ** ****** *****
(23.6 bits)      0.9 ** ****** *****
                 0.7 ** ****** *****
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CACGGCAACTACGAC
consensus             TA A    AC T
sequence               T      G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
15820                       472  4.69e-09 AACGATGCAG CAAGGCAACTACGAC CCAAAGCTCT
8310                        257  8.23e-09 CCTTGAGATA CACGGCAACTCCGAC GCATCCCACG
1494                        254  4.18e-08 GTATCATCAT CATGACAACGACGAC GACGCGCTGG
33435                       331  7.47e-08 TACACCGCGG CTCGGCAACAACTAC TCACCTTTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
15820                             4.7e-09  471_[+3]_14
8310                              8.2e-09  256_[+3]_229
1494                              4.2e-08  253_[+3]_232
33435                             7.5e-08  330_[+3]_155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=4
15820                    (  472) CAAGGCAACTACGAC  1
8310                     (  257) CACGGCAACTCCGAC  1
1494                     (  254) CATGACAACGACGAC  1
33435                    (  331) CTCGGCAACAACTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6318 bayes= 10.6243 E= 1.9e+003
  -865    195   -865   -865
   152   -865   -865     -8
    -6     95   -865     -8
  -865   -865    221   -865
    -6   -865    179   -865
  -865    195   -865   -865
   193   -865   -865   -865
   193   -865   -865   -865
  -865    195   -865   -865
    -6   -865     21     92
   152     -5   -865   -865
  -865    195   -865   -865
  -865   -865    179     -8
   193   -865   -865   -865
  -865    195   -865   -865
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 4 E= 1.9e+003
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.250000  0.500000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.250000  0.500000
 0.750000  0.250000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[AT][CAT]G[GA]CAAC[TAG][AC]C[GT]AC
--------------------------------------------------------------------------------

Time  4.37 secs.
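
The motif 3 regular expression can be applied directly with a general-purpose regex engine. The Python sketch below checks it against the four motif 3 sites listed in this file and locates one site inside its flanking sequence. The names MOTIF3_REGEX, MOTIF3_SITES, sites_matching and find_site are hypothetical helpers introduced for illustration; only the pattern and the sequences come from the output above.

```python
import re

# Pattern copied from the "Motif 3 regular expression" section.
MOTIF3_REGEX = re.compile(r"C[AT][CAT]G[GA]CAAC[TAG][AC]C[GT]AC")

# Site strings copied from the "Motif 3 sites" table, keyed by sequence name.
MOTIF3_SITES = {
    "15820": "CAAGGCAACTACGAC",
    "8310": "CACGGCAACTCCGAC",
    "1494": "CATGACAACGACGAC",
    "33435": "CTCGGCAACAACTAC",
}


def sites_matching(pattern, sites):
    """Map each sequence name to whether its site fully matches the pattern."""
    return {name: pattern.fullmatch(site) is not None
            for name, site in sites.items()}


def find_site(pattern, sequence):
    """Return the 0-based offset of the first match in sequence, or -1."""
    match = pattern.search(sequence)
    return match.start() if match else -1


if __name__ == "__main__":
    print(sites_matching(MOTIF3_REGEX, MOTIF3_SITES))
    # Left flank + site + right flank for sequence 15820, as listed above.
    window = "AACGATGCAG" + "CAAGGCAACTACGAC" + "CCAAAGCTCT"
    print(find_site(MOTIF3_REGEX, window))
```

A regular expression of this kind only records which letters were seen at each position, so it is a coarser filter than the position-specific scoring matrix.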
********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47550                            5.59e-02  191_[+1(4.99e-06)]_297
22418                            4.34e-04  345_[+2(3.63e-08)]_139
15820                            8.26e-08  441_[+2(2.97e-07)]_14_\
    [+3(4.69e-09)]_14
1494                             5.40e-07  253_[+3(4.18e-08)]_136_\
    [+2(3.38e-07)]_80
33435                            2.56e-10  130_[+2(6.15e-08)]_68_\
    [+1(1.14e-06)]_104_[+3(7.47e-08)]_155
11197                            2.18e-07  438_[+1(1.59e-06)]_18_\
    [+2(1.43e-08)]_16
8310                             7.20e-08  256_[+3(8.23e-09)]_121_\
    [+1(5.67e-07)]_96
44899                            3.73e-02  152_[+1(6.88e-06)]_336
12155                            1.76e-04  351_[+1(5.66e-08)]_137
35670                            8.25e-06  281_[+1(5.40e-06)]_145_\
    [+2(9.91e-08)]_46
38769                            2.11e-02  367_[+1(5.40e-06)]_121
50556                            3.43e-02  169_[+1(9.16e-06)]_319
43069                            9.79e-01  500
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
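
In the block diagrams, the bare numbers read as spacer lengths and each bracketed entry as one motif occurrence, so spacers plus motif widths should add up to the sequence length (500 in this training set). The Python sketch below checks that reading for a few combined diagrams. MOTIF_WIDTHS is filled in from the motif headers above; TOKEN and diagram_length are hypothetical names, and the parsing rule is an assumption based on this file rather than a description of MEME's own code. A diagram split over two lines with a trailing backslash is assumed to have been joined first.

```python
import re

# Motif widths copied from the three "MOTIF n MEME width = ..." headers.
MOTIF_WIDTHS = {1: 12, 2: 16, 3: 15}

# A token is either a bracketed motif such as [+2(3.63e-08)] or [+3],
# or a bare spacer length such as 345.
TOKEN = re.compile(r"\[[+-](\d+)(?:\([^)]*\))?\]|(\d+)")


def diagram_length(diagram):
    """Sum spacer lengths and motif widths found in one diagram string."""
    total = 0
    for match in TOKEN.finditer(diagram):
        if match.group(1) is not None:
            total += MOTIF_WIDTHS[int(match.group(1))]  # motif occurrence
        else:
            total += int(match.group(2))                # spacer
    return total


if __name__ == "__main__":
    diagrams = {
        "22418": "345_[+2(3.63e-08)]_139",
        "33435": "130_[+2(6.15e-08)]_68_[+1(1.14e-06)]_104_[+3(7.47e-08)]_155",
        "43069": "500",
    }
    for name, diagram in diagrams.items():
        print(name, diagram_length(diagram))
```

The same function also accepts the per-motif diagrams, which omit the p-value in parentheses, for example 471_[+3]_14.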