********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/442/442.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9649                     1.0000    500  13590                    1.0000    500
39281                    1.0000    500  49637                    1.0000    500
50370                    1.0000    500  44066                    1.0000    500
35058                    1.0000    500  45734                    1.0000    500
35508                    1.0000    500  45954                    1.0000    500
12589                    1.0000    500  39092                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/442/442.seqs.fa -oc motifs/442 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.257 C 0.248 G 0.230 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.257 C 0.248 G 0.230 T 0.265
********************************************************************************


********************************************************************************
MOTIF  1 MEME   width =  18  sites =   5  llr = 90  E-value = 9.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  4:826::a:2:a:6:6:8
pos.-specific     C  ::2::82:a:::8:::::
probability       G  6a:82:8:::a:2:a:6:
matrix            T  ::::22:::8:::4:442

         bits    2.1  *        *   *
                 1.9  *     ** **  *
                 1.7  *     ** **  *
                 1.5  *     ** **  *
Relative         1.3  *** ******** *  *
Entropy          1.1 **** *************
(26.0 bits)      0.8 **** *************
                 0.6 ******************
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           GGAGACGACTGACAGAGA
consensus            A CAGTC  A  GT TTT
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
35508                       213  4.26e-11 CGATTCGTGG GGAGACGACTGACAGTGA AAAATCGTTC
35058                       451  1.47e-09 TGGCGTCGTT GGAGGCGACTGACAGAGT TCTCTCCACC
39281                       242  5.99e-09 CTCACCTTTG AGCGACCACTGACAGTGA GTGCAGGACG
44066                        59  7.19e-09 CCTTGCTCCG GGAAATGACTGACTGATA GACTTTGAAG
45734                       379  1.91e-08 TCGATTCCAC AGAGTCGACAGAGTGATA ACCAAAAGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35508                             4.3e-11  212_[+1]_270
35058                             1.5e-09  450_[+1]_32
39281                               6e-09  241_[+1]_241
44066                             7.2e-09  58_[+1]_424
45734                             1.9e-08  378_[+1]_104
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=5
35508                    (  213) GGAGACGACTGACAGTGA  1
35058                    (  451) GGAGGCGACTGACAGAGT  1
39281                    (  242) AGCGACCACTGACAGTGA  1
44066                    (   59) GGAAATGACTGACTGATA  1
45734                    (  379) AGAGTCGACAGAGTGATA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 5796 bayes= 10.4294 E= 9.2e+000
    64   -897    138   -897
  -897   -897    212   -897
   164    -31   -897   -897
   -36   -897    180   -897
   122   -897    -20    -41
  -897    169   -897    -41
  -897    -31    180   -897
   196   -897   -897   -897
  -897    201   -897   -897
   -36   -897   -897    159
  -897   -897    212   -897
   196   -897   -897   -897
  -897    169    -20   -897
   122   -897   -897     59
  -897   -897    212   -897
   122   -897   -897     59
  -897   -897    138     59
   164   -897   -897    -41
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 5 E= 9.2e+000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.600000  0.000000  0.200000  0.200000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.200000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.000000  0.600000  0.400000
 0.800000  0.000000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA]G[AC][GA][AGT][CT][GC]AC[TA]GA[CG][AT]G[AT][GT][AT]
--------------------------------------------------------------------------------

Time  1.25 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME   width =  21  sites =  11  llr = 143  E-value = 2.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::215:121:1::5::4:11
pos.-specific     C  3::335:582:3a63:52836
probability       G  1844::81:::1:3::4211:
matrix            T  62626124:7a5:12a23153

         bits    2.1
                 1.9           * *  *
                 1.7           * *  *
                 1.5  *    *   * *  *
Relative         1.3  *    * * * *  *
Entropy          1.1  **   * * * *  *  *
(18.8 bits)      0.8  **   * *** ** *  *
                 0.6 *** *** *** ***** * *
                 0.4 *** *** ********* ***
                 0.2 *** ************* ***
                 0.0 ---------------------

Multilevel           TGTGTAGCCTTTCCATCACTC
consensus            C GCCC T   C GC GT CT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
45734                       471  1.91e-09 ACGGGGAGTT TGTGCCGCCTTCCCCTGACCC TCGTTCACC
39281                       474  1.12e-08 TCTCACTTTA CGTTTCGCCTTCCGATCCCTC AGCGAA
35058                       189  4.90e-08 GAAAAAAACT TGGATCGCCTTTCCATCGTCC GCTCGCTAGT
45954                       253  5.49e-08 ATGTTGCTAT TGGCAAGACTTTCCATCTCTC GCTTACCCTG
12589                       459  1.28e-07 CTTGACGGTC CGTCTCGTCTTCCTCTGCCTC GGCATATATC
9649                         41  3.32e-07 CCAAATTCAT TGTATATTATTTCGATCACTT AAACCATCAG
39092                        66  7.10e-07 CAACACGCTG TGTGTCGCAATTCCTTCTCTA GTGTCATCCG
44066                        36  1.51e-06 AGAAACTGCT TGTCTTTTCTTGCCCTTGCTC CGGGAAATGA
35508                       167  2.12e-06 AAGGCTATTT GTGGTAGCCCTTCGATGTCGC AAATCCCGCG
49637                       455  3.53e-06 TGTATGTAAA TGTTCAGTCTTTCCTTTAGAT AAAAGATCGA
13590                       434  6.90e-06 TCGAAGGTAT CTGGCAGGCCTACCATGACCT CGCTGTAAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45734                             1.9e-09  470_[+2]_9
39281                             1.1e-08  473_[+2]_6
35058                             4.9e-08  188_[+2]_291
45954                             5.5e-08  252_[+2]_227
12589                             1.3e-07  458_[+2]_21
9649                              3.3e-07  40_[+2]_439
39092                             7.1e-07  65_[+2]_414
44066                             1.5e-06  35_[+2]_444
35508                             2.1e-06  166_[+2]_313
49637                             3.5e-06  454_[+2]_25
13590                             6.9e-06  433_[+2]_46
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=11
45734                    (  471) TGTGCCGCCTTCCCCTGACCC  1
39281                    (  474) CGTTTCGCCTTCCGATCCCTC  1
35058                    (  189) TGGATCGCCTTTCCATCGTCC  1
45954                    (  253) TGGCAAGACTTTCCATCTCTC  1
12589                    (  459) CGTCTCGTCTTCCTCTGCCTC  1
9649                     (   41) TGTATATTATTTCGATCACTT  1
39092                    (   66) TGTGTCGCAATTCCTTCTCTA  1
44066                    (   36) TGTCTTTTCTTGCCCTTGCTC  1
35508                    (  167) GTGGTAGCCCTTCGATGTCGC  1
49637                    (  455) TGTTCAGTCTTTCCTTTAGAT  1
13590                    (  434) CTGGCAGGCCTACCATGACCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 9.38536 E= 2.6e+001
 -1010     14   -134    126
 -1010  -1010    183    -54
 -1010  -1010     66    126
   -50     14     66    -54
  -150     14  -1010    126
    82     87  -1010   -154
 -1010  -1010    183    -54
  -150     87   -134     45
   -50    172  -1010  -1010
  -150    -45  -1010    145
 -1010  -1010  -1010    191
  -150     14   -134    104
 -1010    201  -1010  -1010
 -1010    136     25   -154
   109     14  -1010    -54
 -1010  -1010  -1010    191
 -1010     87     66    -54
    50    -45    -34      4
 -1010    172   -134   -154
  -150     14   -134    104
  -150    136  -1010      4
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 11 E= 2.6e+001
 0.000000  0.272727  0.090909  0.636364
 0.000000  0.000000  0.818182  0.181818
 0.000000  0.000000  0.363636  0.636364
 0.181818  0.272727  0.363636  0.181818
 0.090909  0.272727  0.000000  0.636364
 0.454545  0.454545  0.000000  0.090909
 0.000000  0.000000  0.818182  0.181818
 0.090909  0.454545  0.090909  0.363636
 0.181818  0.818182  0.000000  0.000000
 0.090909  0.181818  0.000000  0.727273
 0.000000  0.000000  0.000000  1.000000
 0.090909  0.272727  0.090909  0.545455
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.636364  0.272727  0.090909
 0.545455  0.272727  0.000000  0.181818
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.454545  0.363636  0.181818
 0.363636  0.181818  0.181818  0.272727
 0.000000  0.818182  0.090909  0.090909
 0.090909  0.272727  0.090909  0.545455
 0.090909  0.636364  0.000000  0.272727
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TC]G[TG][GC][TC][AC]G[CT]CTT[TC]C[CG][AC]T[CG][AT]C[TC][CT]
--------------------------------------------------------------------------------

Time  2.52 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME   width =  12  sites =   8  llr = 91  E-value = 3.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::813::38:1
pos.-specific     C  a:8:93:3831:
probability       G  :::1::a:::9:
matrix            T  :a31:5:8:::9

         bits    2.1       *
                 1.9 **    *
                 1.7 **    *
                 1.5 **  * *   *
Relative         1.3 *** * * * **
Entropy          1.1 *** * ******
(16.5 bits)      0.8 ***** ******
                 0.6 ***** ******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CTCACTGTCAGT
consensus              T  A CAC
sequence                  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
39092                       175  6.52e-08 GAAACACACA CTCACTGTCAGT GAACCGACTT
39281                       392  6.52e-08 GTAAATTGTC CTCACTGTCAGT TTCCTCGAAC
49637                        45  2.29e-06 AGGATGCATA CTCAAAGTCAGT GAGCGTAAAA
50370                       467  3.92e-06 CAGTCTAAAC CTCACTGCCACT ACTATCCACT
45734                        69  4.68e-06 TCGATTGGTT CTCGCCGTCCGT CATTCCGAAT
9649                        362  7.13e-06 AATGAAACCG CTCTCTGTCAGA CAACATACCA
45954                       481  9.05e-06 CCTTTGCGTG CTTACAGTACGT TGACCAAC
35508                       100  9.05e-06 AAGTGATTTC CTTACCGCAAGT TGTAGTATTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39092                             6.5e-08  174_[+3]_314
39281                             6.5e-08  391_[+3]_97
49637                             2.3e-06  44_[+3]_444
50370                             3.9e-06  466_[+3]_22
45734                             4.7e-06  68_[+3]_420
9649                              7.1e-06  361_[+3]_127
45954                             9.1e-06  480_[+3]_8
35508                             9.1e-06  99_[+3]_389
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
39092                    (  175) CTCACTGTCAGT  1
39281                    (  392) CTCACTGTCAGT  1
49637                    (   45) CTCAAAGTCAGT  1
50370                    (  467) CTCACTGCCACT  1
45734                    (   69) CTCGCCGTCCGT  1
9649                     (  362) CTCTCTGTCAGA  1
45954                    (  481) CTTACAGTACGT  1
35508                    (  100) CTTACCGCAAGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.51668 E= 3.1e+002
  -965    201   -965   -965
  -965   -965   -965    191
  -965    160   -965     -9
   154   -965    -88   -108
  -104    182   -965   -965
    -4      1   -965     91
  -965   -965    212   -965
  -965      1   -965    150
    -4    160   -965   -965
   154      1   -965   -965
  -965    -99    193   -965
  -104   -965   -965    172
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 3.1e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.000000  0.250000
 0.750000  0.000000  0.125000  0.125000
 0.125000  0.875000  0.000000  0.000000
 0.250000  0.250000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.250000  0.750000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.125000  0.000000  0.000000  0.875000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CT[CT]AC[TAC]G[TC][CA][AC]GT
--------------------------------------------------------------------------------

Time  3.88 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9649                             5.35e-05  40_[+2(3.32e-07)]_300_\
    [+3(7.13e-06)]_127
13590                            1.15e-02  433_[+2(6.90e-06)]_46
39281                            3.26e-13  241_[+1(5.99e-09)]_132_\
    [+3(6.52e-08)]_70_[+2(1.12e-08)]_6
49637                            1.09e-04  44_[+3(2.29e-06)]_398_\
    [+2(3.53e-06)]_25
50370                            2.04e-02  466_[+3(3.92e-06)]_22
44066                            4.12e-07  35_[+2(1.51e-06)]_2_[+1(7.19e-09)]_\
    338_[+2(4.57e-05)]_65
35058                            3.78e-09  188_[+2(4.90e-08)]_241_\
    [+1(1.47e-09)]_32
45734                            1.03e-11  68_[+3(4.68e-06)]_298_\
    [+1(1.91e-08)]_74_[+2(1.91e-09)]_9
35508                            4.46e-11  99_[+3(9.05e-06)]_55_[+2(2.12e-06)]_\
    25_[+1(4.26e-11)]_270
45954                            1.19e-06  252_[+2(5.49e-08)]_207_\
    [+3(9.05e-06)]_8
12589                            6.14e-04  458_[+2(1.28e-07)]_21
39092                            8.50e-08  65_[+2(7.10e-07)]_14_[+3(5.97e-05)]_\
    62_[+3(6.52e-08)]_135_[+1(6.30e-05)]_161
--------------------------------------------------------------------------------
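
Note on reading the block diagrams above: each diagram alternates spacer lengths with bracketed motif occurrences, so spacers plus motif widths should add up to the sequence length (500 for every sequence in TRAINING SET). The Python sketch below checks that arithmetic. It is an illustration only and not part of MEME; the function name `diagram_length` and the `MOTIF_WIDTHS` table are hypothetical helpers, with the widths copied from the three MOTIF header lines of this report.

```python
import re

# Motif widths copied from the MOTIF header lines of this report (assumption:
# these three motifs are the only ones that appear in the diagrams).
MOTIF_WIDTHS = {1: 18, 2: 21, 3: 12}


def diagram_length(diagram: str) -> int:
    """Return the sequence length implied by a MEME block diagram string."""
    # A trailing backslash marks a continuation line; drop it and the
    # whitespace that follows so the diagram becomes a single string.
    flat = re.sub(r"\\\s*", "", diagram).strip()
    total = 0
    for token in flat.split("_"):
        if not token:
            continue
        # A motif occurrence looks like [+1] or [+2(3.32e-07)].
        match = re.fullmatch(r"\[[+-](\d+)(?:\([^)]*\))?\]", token)
        if match:
            total += MOTIF_WIDTHS[int(match.group(1))]
        else:
            # Anything else is a spacer length in letters.
            total += int(token)
    return total


print(diagram_length("212_[+1]_270"))  # 212 + 18 + 270 = 500
```

The same function accepts the two-line combined diagrams if the continuation line is passed along with its backslash and newline.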
********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
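
Note on the two matrices printed for each motif: the nonzero entries of each log-odds matrix in this report appear consistent with 100 * log2(p / b), rounded to an integer, where p is the entry of the letter-probability matrix and b is the background frequency of that letter. The Python sketch below shows the conversion under that assumption. It is not MEME's own code; `log_odds_row` is a hypothetical helper, and it does not try to reproduce the large negative constants (-897, -1010, -965) that the report prints where the probability is zero.

```python
import math

# Background letter frequencies from COMMAND LINE SUMMARY, in A, C, G, T order.
BACKGROUND = (0.257, 0.248, 0.230, 0.265)


def log_odds_row(probs, background=BACKGROUND):
    """Convert one letter-probability row to scaled log-odds scores.

    Zero probabilities come back as None, because the constant printed in
    the report for those entries is not derived by this sketch.
    """
    return [
        None if p == 0 else round(100 * math.log2(p / b))
        for p, b in zip(probs, background)
    ]


# First row of the Motif 1 letter-probability matrix.
print(log_odds_row((0.4, 0.0, 0.6, 0.0)))  # [64, None, 138, None]
```

The printed values match the first row of the Motif 1 log-odds matrix (64 and 138). Because the background frequencies are rounded to three decimals in the report, a difference of one unit on other rows would not be surprising.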