********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/266/266.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1083                     1.0000    500  1134                     1.0000    500
11854                    1.0000    500  264816                   1.0000    500
31242                    1.0000    500  3149                     1.0000    500
38194                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/266/266.seqs.fa -oc motifs/266 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.241 C 0.242 G 0.236 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.241 C 0.242 G 0.236 T 0.281
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   6  llr = 78  E-value = 4.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :7::a::823:8
pos.-specific     C  ::7::::::3::
probability       G  a:3a:aa28382
matrix            T  :3::::::::2:

         bits    2.1 *  ****
                 1.9 *  ****
                 1.7 *  ****
                 1.5 *  ****** **
Relative         1.3 *  ****** **
Entropy          1.0 ********* **
(18.7 bits)      0.8 ********* **
                 0.6 ********* **
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GACGAGGAGAGA
consensus             TG      C
sequence                      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
38194                       276  1.01e-07 CCGACACCGA GACGAGGAGAGA AAGATTCACG
31242                       307  2.00e-07 TGAAGTAGCA GAGGAGGAGCGA GGGAGCGAGG
264816                       11  2.00e-07 GGAGGGGGAG GAGGAGGAGAGA GGCTGTTTGA
3149                        142  1.38e-06 CTGGGTGAAG GTCGAGGAGCGG CTCTATCCTC
1083                        291  1.38e-06 TGTGGCGTGT GTCGAGGGGGGA CACTTTAGAT
11854                       482  2.90e-06 TGCCATTCCC GACGAGGAAGTA ATGGATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38194                               1e-07  275_[+1]_213
31242                               2e-07  306_[+1]_182
264816                              2e-07  10_[+1]_478
3149                              1.4e-06  141_[+1]_347
1083                              1.4e-06  290_[+1]_198
11854                             2.9e-06  481_[+1]_7
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=6
38194                    (  276) GACGAGGAGAGA  1
31242                    (  307) GAGGAGGAGCGA  1
264816                   (   11) GAGGAGGAGAGA  1
3149                     (  142) GTCGAGGAGCGG  1
1083                     (  291) GTCGAGGGGGGA  1
11854                    (  482) GACGAGGAAGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 9.60169 E= 4.4e-001
  -923   -923    208   -923
   146   -923   -923     25
  -923    146     50   -923
  -923   -923    208   -923
   205   -923   -923   -923
  -923   -923    208   -923
  -923   -923    208   -923
   179   -923    -50   -923
   -53   -923    182   -923
    46     46     50   -923
  -923   -923    182    -75
   179   -923    -50   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 4.4e-001
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.333333  0.333333  0.333333  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.833333  0.000000  0.166667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[AT][CG]GAGGAG[ACG]GA
--------------------------------------------------------------------------------




Time  0.53 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   6  llr = 107  E-value = 1.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::382:22::22a::3:22:7
pos.-specific     C  2a52:78:a887:38777383
probability       G  :::::::2::::::2::25::
matrix            T  8:2:83:7:2:2:7::3::2:

         bits    2.1  *      *   *
                 1.9  *      *   *
                 1.7  *      *   *
                 1.5  * *  * *** * *    *
Relative         1.3 ** ** * *** * *    *
Entropy          1.0 ** **** *** *****  **
(25.7 bits)      0.8 ** **** ********** **
                 0.6 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TCCATCCTCCCCATCCCCGCA
consensus              A  T       C AT C C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
1083                        423  7.49e-11 CCATCACCCG TCCATCCACCCCATCACCGCC ACAACTTCGT
38194                       236  4.08e-10 TGTCGGAAAG TCCATTCGCCCCACCCCCACA AACCACAGAC
3149                        155  1.96e-09 GAGGAGCGGC TCTATCCTCCCCATCCTCCTC TCCTCACCAC
264816                      382  2.11e-09 CTCCCAACAA TCCAACCTCCCCACGCCAGCA CTGGTTGTGG
11854                       426  1.40e-08 TCCGATGCGG CCAATTCTCCAAATCCTCCCA TGGCATCCCT
1134                        248  6.03e-08 CGCATAACTC TCACTCATCTCTATCACGGCA CACTGTAGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1083                              7.5e-11  422_[+2]_57
38194                             4.1e-10  235_[+2]_244
3149                                2e-09  154_[+2]_325
264816                            2.1e-09  381_[+2]_98
11854                             1.4e-08  425_[+2]_54
1134                                6e-08  247_[+2]_232
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
1083                     (  423) TCCATCCACCCCATCACCGCC  1
38194                    (  236) TCCATTCGCCCCACCCCCACA  1
3149                     (  155) TCTATCCTCCCCATCCTCCTC  1
264816                   (  382) TCCAACCTCCCCACGCCAGCA  1
11854                    (  426) CCAATTCTCCAAATCCTCCCA  1
1134                     (  248) TCACTCATCTCTATCACGGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 9.57485 E= 1.7e+001
  -923    -54   -923    157
  -923    205   -923   -923
    46    105   -923    -75
   179    -54   -923   -923
   -53   -923   -923    157
  -923    146   -923     25
   -53    178   -923   -923
   -53   -923    -50    124
  -923    205   -923   -923
  -923    178   -923    -75
   -53    178   -923   -923
   -53    146   -923    -75
   205   -923   -923   -923
  -923     46   -923    124
  -923    178    -50   -923
    46    146   -923   -923
  -923    146   -923     25
   -53    146    -50   -923
   -53     46    108   -923
  -923    178   -923    -75
   146     46   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.7e+001
 0.000000  0.166667  0.000000  0.833333
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.500000  0.000000  0.166667
 0.833333  0.166667  0.000000  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.666667  0.000000  0.333333
 0.166667  0.833333  0.000000  0.000000
 0.166667  0.000000  0.166667  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.166667  0.833333  0.000000  0.000000
 0.166667  0.666667  0.000000  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.833333  0.166667  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.166667  0.666667  0.166667  0.000000
 0.166667  0.333333  0.500000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.666667  0.333333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TC[CA]AT[CT]CTCCCCA[TC]C[CA][CT]C[GC]C[AC]
--------------------------------------------------------------------------------




Time  0.95 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  20  sites =   6  llr = 97  E-value = 3.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  22::72:2::8:223:8:23
pos.-specific     C  ::::2::3:3:82::::5:7
probability       G  8:a8:5a23722:8382:8:
matrix            T  :8:223:37:::7:32:5::

         bits    2.1   *   *
                 1.9   *   *
                 1.7   *   *
                 1.5 * **  *   ** * ** *
Relative         1.3 ****  *  *** * ** *
Entropy          1.0 ****  * **** * *****
(23.3 bits)      0.8 ***** * **** * *****
                 0.6 ******* ****** *****
                 0.4 ******* ************
                 0.2 ******* ************
                 0.0 --------------------

Multilevel           GTGGAGGCTGACTGAGACGC
consensus                 T TGC    G  T A
sequence                           T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
11854                       369  2.24e-10 TGTATTACCA GTGGAGGCTGACTGGTACGC GCTGCTCCTC
31242                        39  1.05e-08 GTAAGATGCC GTGGAGGAGGAGCGTGATGC TTATTGCTCT
1134                         15  1.47e-08 AGAAGGTTGG GTGGATGTTGACTAGGACAA TGGCAGCAAC
38194                       387  3.36e-08 GATGAAGGAT GTGGAAGGTCACAGAGGTGC ACCTACCTCT
3149                         57  3.85e-08 TGGAGAGTAA GAGGTTGCGGGCTGAGACGC TGATACTTCC
1083                        124  9.82e-08 TGTGTTCATG ATGTCGGTTCACTGTGATGA ATTCTCATAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11854                             2.2e-10  368_[+3]_112
31242                             1.1e-08  38_[+3]_442
1134                              1.5e-08  14_[+3]_466
38194                             3.4e-08  386_[+3]_94
3149                              3.8e-08  56_[+3]_424
1083                              9.8e-08  123_[+3]_357
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=6
11854                    (  369) GTGGAGGCTGACTGGTACGC  1
31242                    (   39) GTGGAGGAGGAGCGTGATGC  1
1134                     (   15) GTGGATGTTGACTAGGACAA  1
38194                    (  387) GTGGAAGGTCACAGAGGTGC  1
3149                     (   57) GAGGTTGCGGGCTGAGACGC  1
1083                     (  124) ATGTCGGTTCACTGTGATGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3367 bayes= 9.57786 E= 3.0e+001
   -53   -923    182   -923
   -53   -923   -923    157
  -923   -923    208   -923
  -923   -923    182    -75
   146    -54   -923    -75
   -53   -923    108     25
  -923   -923    208   -923
   -53     46    -50     25
  -923   -923     50    124
  -923     46    150   -923
   179   -923    -50   -923
  -923    178    -50   -923
   -53    -54   -923    124
   -53   -923    182   -923
    46   -923     50     25
  -923   -923    182    -75
   179   -923    -50   -923
  -923    105   -923     83
   -53   -923    182   -923
    46    146   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 3.0e+001
 0.166667  0.000000  0.833333  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.666667  0.166667  0.000000  0.166667
 0.166667  0.000000  0.500000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.333333  0.166667  0.333333
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.333333  0.666667  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.166667  0.166667  0.000000  0.666667
 0.166667  0.000000  0.833333  0.000000
 0.333333  0.000000  0.333333  0.333333
 0.000000  0.000000  0.833333  0.166667
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.166667  0.000000  0.833333  0.000000
 0.333333  0.666667  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GTGGA[GT]G[CT][TG][GC]ACTG[AGT]GA[CT]G[CA]
--------------------------------------------------------------------------------




Time  1.38 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1083                             7.20e-13  123_[+3(9.82e-08)]_147_\
    [+1(1.38e-06)]_120_[+2(7.49e-11)]_57
1134                             4.87e-08  14_[+3(1.47e-08)]_213_\
    [+2(6.03e-08)]_232
11854                            6.50e-13  111_[+2(1.46e-05)]_236_\
    [+3(2.24e-10)]_37_[+2(1.40e-08)]_35_[+1(2.90e-06)]_7
264816                           1.43e-09  10_[+1(2.00e-07)]_359_\
    [+2(2.11e-09)]_98
31242                            2.67e-08  38_[+3(1.05e-08)]_248_\
    [+1(2.00e-07)]_182
3149                             6.44e-12  56_[+3(3.85e-08)]_65_[+1(1.38e-06)]_\
    1_[+2(1.96e-09)]_145_[+3(3.04e-06)]_160
38194                            1.10e-13  235_[+2(4.08e-10)]_19_\
    [+1(1.01e-07)]_99_[+3(3.36e-08)]_94
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
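
Note on reading the block diagrams: in the tables above, a diagram such as 275_[+1]_213 appears alongside a reported Start of 276 in a 500-letter sequence, which suggests that the plain numbers are spacer lengths and each bracket is one motif occurrence. The Python sketch below turns a diagram string into 1-based start positions under that reading. It is an illustration only, not part of MEME: the names parse_diagram and MOTIF_WIDTHS are hypothetical, and the widths are copied from the three MOTIF header lines of this file.

```python
import re

# Assumption: motif widths taken from the "MOTIF n ... width = w" lines above.
MOTIF_WIDTHS = {1: 12, 2: 21, 3: 20}

# A motif occurrence looks like "[+3(9.82e-08)]" or, without a p-value, "[+1]".
_SITE = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Parse a MEME-style block diagram string.

    Returns (sites, total_length), where sites is a list of
    (motif_number, strand, start, p_value) tuples with 1-based starts
    and p_value is None when the diagram does not carry one.
    """
    # The summary table wraps long diagrams with a trailing backslash;
    # drop those markers and any whitespace before tokenising.
    cleaned = "".join(diagram.replace("\\", "").split())
    sites = []
    consumed = 0  # number of sequence letters accounted for so far
    for token in cleaned.split("_"):
        if not token:
            continue
        match = _SITE.fullmatch(token)
        if match:
            strand, number, p_text = match.groups()
            motif = int(number)
            p_value = float(p_text) if p_text is not None else None
            sites.append((motif, strand, consumed + 1, p_value))
            consumed += widths[motif]
        elif token.isdigit():
            consumed += int(token)  # spacer between occurrences
        else:
            raise ValueError(f"unrecognised diagram token: {token!r}")
    return sites, consumed


if __name__ == "__main__":
    combined = "123_[+3(9.82e-08)]_147_\\ [+1(1.38e-06)]_120_[+2(7.49e-11)]_57"
    for site in parse_diagram(combined)[0]:
        print(site)
```

For sequence 1083 the recovered starts can be compared against the Start column of the three per-motif site tables, and the returned total against the Length column of the training set.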