********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/119/119.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11664                    1.0000    500  11695                    1.0000    500
21210                    1.0000    500  23217                    1.0000    500
23860                    1.0000    500  23942                    1.0000    500
24540                    1.0000    500  25655                    1.0000    500
25686                    1.0000    500  25687                    1.0000    500
25688                    1.0000    500  25693                    1.0000    500
25852                    1.0000    500  264762                   1.0000    500
268262                   1.0000    500  270065                   1.0000    500
6567                     1.0000    500  8885                     1.0000    500
bd1760                   1.0000    500
********************************************************************************


********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/119/119.seqs.fa -oc motifs/119 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.251 C 0.234 G 0.248 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.251 C 0.234 G 0.248 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 11 llr = 151 E-value = 1.7e-005
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1932984391a6159:
pos.-specific     C  6:58:254:9:493:a
probability       G  2:::1:231::::21:
matrix            T  112::::1::::::::

         bits    2.1           *    *
                 1.9           *    *
                 1.7          ** *  *
                 1.5  * **   *** * **
Relative         1.3  * ***  *** * **
Entropy          1.0  * ***  ***** **
(19.8 bits)      0.8  * ***  ***** **
                 0.6 ******* ********
                 0.4 ******* ********
                 0.2 ****************
                 0.0 ----------------

Multilevel           CACCAACCACAACAAC
consensus              A   AA   C C
sequence                    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
8885                        481  3.08e-09 TCTCTCTCTA CAACAACAACAACAAC AACA
11664                       481  3.08e-09 TCTCTCTCTA CAACAACAACAACAAC AACA
25687                       447  1.62e-08 ACGCATGCCA CACCAAGCACACCCAC AGACGGCCCC
25686                       450  1.62e-08 TCCGGTGCCA CACCAAGCACACCCAC AGACGCCCAA
25688                       457  4.64e-08 TAACGTCGTG CACCACACACACCCAC GCCTCCCGTC
23942                       482  1.68e-07 AGAACAGCGC AACAAACGACAACAAC GCA
21210                       418  7.78e-07 AACAGAGCCG CATCGCAGACAACAAC GTCGTACACA
23217                       387  1.01e-06 TACCTTCACG TTCCAACTACAACAAC ATGAGAATGC
25655                        74  1.30e-06 AGAGCCGAGT GATCAACGACAACGGC ATCACCGATC
25693                       388  1.38e-06 TGAGCGCGCA GAACAAACAAACCGAC GACGGTGACG
264762                       55  1.94e-06 TTGAATGACA CACAAAAAGCAAAAAC TGGAGGCCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8885                              3.1e-09  480_[+1]_4
11664                             3.1e-09  480_[+1]_4
25687                             1.6e-08  446_[+1]_38
25686                             1.6e-08  449_[+1]_35
25688                             4.6e-08  456_[+1]_28
23942                             1.7e-07  481_[+1]_3
21210                             7.8e-07  417_[+1]_67
23217                               1e-06  386_[+1]_98
25655                             1.3e-06  73_[+1]_411
25693                             1.4e-06  387_[+1]_97
264762                            1.9e-06  54_[+1]_430
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=11
8885                     (  481) CAACAACAACAACAAC  1
11664                    (  481) CAACAACAACAACAAC  1
25687                    (  447) CACCAAGCACACCCAC  1
25686                    (  450) CACCAAGCACACCCAC  1
25688                    (  457) CACCACACACACCCAC  1
23942                    (  482) AACAAACGACAACAAC  1
21210                    (  418) CATCGCAGACAACAAC  1
23217                    (  387) TTCCAACTACAACAAC  1
25655                    (   74) GATCAACGACAACGGC  1
25693                    (  388) GAACAAACAAACCGAC  1
264762                   (   55) CACAAAAAGCAAAAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9215 bayes= 10.736 E= 1.7e-005
  -146    144    -45   -155
   185  -1010  -1010   -155
    12    122  -1010    -55
   -47    180  -1010  -1010
   185  -1010   -144  -1010
   170    -36  -1010  -1010
    53     96    -45  -1010
    12     64     14   -155
   185  -1010   -144  -1010
  -146    196  -1010  -1010
   199  -1010  -1010  -1010
   134     64  -1010  -1010
  -146    196  -1010  -1010
   112     22    -45  -1010
   185  -1010   -144  -1010
 -1010    209  -1010  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 1.7e-005
 0.090909  0.636364  0.181818  0.090909
 0.909091  0.000000  0.000000  0.090909
 0.272727  0.545455  0.000000  0.181818
 0.181818  0.818182  0.000000  0.000000
 0.909091  0.000000  0.090909  0.000000
 0.818182  0.181818  0.000000  0.000000
 0.363636  0.454545  0.181818  0.000000
 0.272727  0.363636  0.272727  0.090909
 0.909091  0.000000  0.090909  0.000000
 0.090909  0.909091  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.636364  0.363636  0.000000  0.000000
 0.090909  0.909091  0.000000  0.000000
 0.545455  0.272727  0.181818  0.000000
 0.909091  0.000000  0.090909  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
CA[CA]CAA[CA][CAG]ACA[AC]C[AC]AC
--------------------------------------------------------------------------------

Time  4.12 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 5 llr = 114 E-value = 1.5e-003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  8::a::a2:a22:4:2:::::
pos.-specific     C  2:::::::::::::2:4:2::
probability       G  :aa:aa:8a:48a688:882a
matrix            T  ::::::::::4:::::62:8:

         bits    2.1  ****** **  *       *
                 1.9  ****** **  *       *
                 1.7  ****** **  *       *
                 1.5  ****** **  *       *
Relative         1.3 ********** ** ** ****
Entropy          1.0 ********** **********
(32.8 bits)      0.8 ********** **********
                 0.6 ********** **********
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AGGAGGAGGAGGGGGGTGGTG
consensus            C      A  TA ACACTCG
sequence                       A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
8885                        323  9.10e-13 GTTTTGGAGG AGGAGGAGGATGGGGGCGGTG TGGGTGCCAT
11664                       323  9.10e-13 GTTTTGGAGG AGGAGGAGGATGGGGGCGGTG TGGGTGCCAT
25852                       103  2.15e-11 TGAAGAACCG AGGAGGAGGAAGGAGGTGCTG ACGACGAGAG
25693                       254  1.01e-10 GTCGGAGCAG CGGAGGAGGAGGGGCATGGTG GGGTTGAGCA
25655                       237  3.32e-10 GGATAAGGAG AGGAGGAAGAGAGAGGTTGGG ATGTTCATGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8885                              9.1e-13  322_[+2]_157
11664                             9.1e-13  322_[+2]_157
25852                             2.2e-11  102_[+2]_377
25693                               1e-10  253_[+2]_226
25655                             3.3e-10  236_[+2]_243
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=5
8885                     (  323) AGGAGGAGGATGGGGGCGGTG  1
11664                    (  323) AGGAGGAGGATGGGGGCGGTG  1
25852                    (  103) AGGAGGAGGAAGGAGGTGCTG  1
25693                    (  254) CGGAGGAGGAGGGGCATGGTG  1
25655                    (  237) AGGAGGAAGAGAGAGGTTGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 11.0838 E= 1.5e-003
   167    -23   -897   -897
  -897   -897    201   -897
  -897   -897    201   -897
   199   -897   -897   -897
  -897   -897    201   -897
  -897   -897    201   -897
   199   -897   -897   -897
   -33   -897    169   -897
  -897   -897    201   -897
   199   -897   -897   -897
   -33   -897     69     58
   -33   -897    169   -897
  -897   -897    201   -897
    67   -897    127   -897
  -897    -23    169   -897
   -33   -897    169   -897
  -897     77   -897    117
  -897   -897    169    -41
  -897    -23    169   -897
  -897   -897    -31    158
  -897   -897    201   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 1.5e-003
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.400000  0.400000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[AC]GGAGGA[GA]GA[GTA][GA]G[GA][GC][GA][TC][GT][GC][TG]G
--------------------------------------------------------------------------------

Time  8.07 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 10 llr = 155 E-value = 5.8e-003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  26::32::::1:::4::115
pos.-specific     C  84652841a3925a:81855
probability       G  ::23::1::::24::::1::
matrix            T  ::225:59:7:61:629:4:

         bits    2.1         *    *
                 1.9         *    *
                 1.7         * *  *
                 1.5        ** *  *  *
Relative         1.3 *    * ** *  * ***
Entropy          1.0 **   * ****  ***** *
(22.3 bits)      0.8 **   * ****  ***** *
                 0.6 **** ***************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CACCTCTTCTCTCCTCTCCA
consensus            ACGGAAC  C CG AT  TC
sequence               TTC      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
268262                      150  6.54e-13 TCACTTTTTA CACCTCTTCTCTCCTCTCCC CAAGAAGTTC
24540                        10  3.84e-09  TGCAACCGA CCGCCCCTCCCTCCTCTCCA GCGGTGGCAT
6567                        292  1.65e-08 GCCTACATGG CACGTATTCTCTTCACTCCA GCAAGCCGAG
8885                        451  4.56e-08 TACAAAGGGG ACCGTCCTCCCCGCACTCTC TCTCTCTCTA
21210                       218  4.56e-08 AGCAAAGCAA CACCCCTCCTCTGCTTTCTA CTTTGCAATA
11664                       451  4.56e-08 TACAAAGGGG ACCGTCCTCCCCGCACTCTC TCTCTCTCTA
bd1760                      314  1.34e-07 TTAATGAGGT CATCAATTCTCTCCTTTCAC ATGTAATTCT
23860                       265  1.34e-07 CGCTACACAA CATCACCTCTCTCCACCGCA CAAACACCTT
23217                       467  1.34e-07 CCGAACCAAT CACTTCGTCTAGGCTCTCCA ATCATACCAT
25686                       197  1.65e-07 CAGCGCCACA CCGTACTTCTCGCCTCTATC ACTTCTGCTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
268262                            6.5e-13  149_[+3]_331
24540                             3.8e-09  9_[+3]_471
6567                              1.6e-08  291_[+3]_189
8885                              4.6e-08  450_[+3]_30
21210                             4.6e-08  217_[+3]_263
11664                             4.6e-08  450_[+3]_30
bd1760                            1.3e-07  313_[+3]_167
23860                             1.3e-07  264_[+3]_216
23217                             1.3e-07  466_[+3]_14
25686                             1.6e-07  196_[+3]_284
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=10
268262                   (  150) CACCTCTTCTCTCCTCTCCC  1
24540                    (   10) CCGCCCCTCCCTCCTCTCCA  1
6567                     (  292) CACGTATTCTCTTCACTCCA  1
8885                     (  451) ACCGTCCTCCCCGCACTCTC  1
21210                    (  218) CACCCCTCCTCTGCTTTCTA  1
11664                    (  451) ACCGTCCTCCCCGCACTCTC  1
bd1760                   (  314) CATCAATTCTCTCCTTTCAC  1
23860                    (  265) CATCACCTCTCTCCACCGCA  1
23217                    (  467) CACTTCGTCTAGGCTCTCCA  1
25686                    (  197) CCGTACTTCTCGCCTCTATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 9139 bayes= 10.0861 E= 5.8e-003
   -33    177   -997   -997
   125     77   -997   -997
  -997    136    -31    -42
  -997    109     28    -42
    25    -23   -997     91
   -33    177   -997   -997
  -997     77   -131     91
  -997   -122   -997    175
  -997    209   -997   -997
  -997     36   -997    139
  -133    194   -997   -997
  -997    -23    -31    117
  -997    109     69   -141
  -997    209   -997   -997
    67   -997   -997    117
  -997    177   -997    -42
  -997   -122   -997    175
  -133    177   -131   -997
  -133    109   -997     58
    99    109   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 10 E= 5.8e-003
 0.200000  0.800000  0.000000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.000000  0.600000  0.200000  0.200000
 0.000000  0.500000  0.300000  0.200000
 0.300000  0.200000  0.000000  0.500000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.400000  0.100000  0.500000
 0.000000  0.100000  0.000000  0.900000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.300000  0.000000  0.700000
 0.100000  0.900000  0.000000  0.000000
 0.000000  0.200000  0.200000  0.600000
 0.000000  0.500000  0.400000  0.100000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.100000  0.000000  0.900000
 0.100000  0.800000  0.100000  0.000000
 0.100000  0.500000  0.000000  0.400000
 0.500000  0.500000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[CA][AC][CGT][CGT][TAC][CA][TC]TC[TC]C[TCG][CG]C[TA][CT]TC[CT][AC]
--------------------------------------------------------------------------------

Time 12.15 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11664                            1.56e-17  322_[+2(9.10e-13)]_107_\
    [+3(4.56e-08)]_10_[+1(3.08e-09)]_4
11695                            4.50e-01  500
21210                            2.35e-07  217_[+3(4.56e-08)]_180_\
    [+1(7.78e-07)]_67
23217                            1.71e-06  386_[+1(1.01e-06)]_64_\
    [+3(1.34e-07)]_14
23860                            1.61e-03  264_[+3(1.34e-07)]_216
23942                            4.81e-04  446_[+1(7.37e-06)]_19_\
    [+1(1.68e-07)]_3
24540                            1.54e-04  9_[+3(3.84e-09)]_471
25655                            1.75e-08  73_[+1(1.30e-06)]_147_\
    [+2(3.32e-10)]_243
25686                            3.30e-08  196_[+3(1.65e-07)]_233_\
    [+1(1.62e-08)]_35
25687                            1.02e-04  446_[+1(1.62e-08)]_38
25688                            2.88e-04  456_[+1(4.64e-08)]_28
25693                            9.90e-09  187_[+2(4.95e-07)]_45_\
    [+2(1.01e-10)]_113_[+1(1.38e-06)]_97
25852                            6.53e-07  102_[+2(2.15e-11)]_377
264762                           1.94e-03  54_[+1(1.94e-06)]_430
268262                           6.61e-10  149_[+3(6.54e-13)]_248_\
    [+3(7.63e-05)]_63
270065                           3.63e-01  500
6567                             5.76e-04  291_[+3(1.65e-08)]_189
8885                             1.56e-17  322_[+2(9.10e-13)]_107_\
    [+3(4.56e-08)]_10_[+1(3.08e-09)]_4
bd1760                           2.56e-04  313_[+3(1.34e-07)]_48_\
    [+3(4.51e-05)]_61_[+3(8.80e-05)]_18
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************