********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/311/311.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
2081                     1.0000    500  20831                    1.0000    500
22312                    1.0000    500  22887                    1.0000    500
24112                    1.0000    500  24479                    1.0000    500
24718                    1.0000    500  25786                    1.0000    500
25915                    1.0000    500  268286                   1.0000    500
2797                     1.0000    500  37790                    1.0000    500
38651                    1.0000    500  6616                     1.0000    500
7540                     1.0000    500  8820                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/311/311.seqs.fa -oc motifs/311 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.269 C 0.227 G 0.233 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.269 C 0.227 G 0.233 T 0.270
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 6 llr = 109 E-value = 4.1e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :2:::2::2:5:23::23:::
pos.-specific     C  873272a5882387:a5:88:
probability       G  ::3:32:3::37::a::7::a
matrix            T  2238:5:2:2::::::3:22:

         bits    2.1       *       **    *
                 1.9       *       **    *
                 1.7       *       **    *
                 1.5 *     * **  * **  ***
Relative         1.3 *  ** * ** ** **  ***
Entropy          1.1 *  ** * ** ***** ****
(26.2 bits)      0.9 ** ** * ** ***** ****
                 0.6 ** ** ***************
                 0.4 ***** ***************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCCTCTCCCCAGCCGCCGCCG
consensus              G G  G  GC A  TA
sequence               T

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                      Site
-------------             ----- ---------            ---------------------
25786                       388  1.02e-11 GGGCGAGCGT CCCTCCCGCCGGCCGCCGCCG CCGCCCGTCC
8820                        272  2.82e-10 GGGGTAATAA CCGTGTCCCCACCAGCAGCCG CGAACACCGG
22312                       471  1.83e-09 ACACGCCAGG CTTTCTCTCCACCCGCTACCG ACTGACACA
37790                        69  3.81e-09 GGCACAACGA CCGCCACCACAGCCGCTACCG GCGGGGTGGT
24718                        39  1.54e-08 GTGCAGTGTC TCCTCGCCCTCGCCGCCGTCG TGATCGTGGT
268286                      344  1.82e-08 ATGCCAGAGA CATTGTCGCCGGAAGCCGCTG AAAGGGCTAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25786                               1e-11  387_[+1]_92
8820                              2.8e-10  271_[+1]_208
22312                             1.8e-09  470_[+1]_9
37790                             3.8e-09  68_[+1]_411
24718                             1.5e-08  38_[+1]_441
268286                            1.8e-08  343_[+1]_136
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
25786                    (  388) CCCTCCCGCCGGCCGCCGCCG  1
8820                     (  272) CCGTGTCCCCACCAGCAGCCG  1
22312                    (  471) CTTTCTCTCCACCCGCTACCG  1
37790                    (   69) CCGCCACCACAGCCGCTACCG  1
24718                    (   39) TCCTCGCCCTCGCCGCCGTCG  1
268286                   (  344) CATTGTCGCCGGAAGCCGCTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 11.4209 E= 4.1e+000
  -923    187   -923    -70
   -69    155   -923    -70
  -923     55     51     30
  -923    -45   -923    162
  -923    155     51   -923
   -69    -45    -48     89
  -923    214   -923   -923
  -923    114     51    -70
   -69    187   -923   -923
  -923    187   -923    -70
    89    -45     51   -923
  -923     55    151   -923
   -69    187   -923   -923
    31    155   -923   -923
  -923   -923    210   -923
  -923    214   -923   -923
   -69    114   -923     30
    31   -923    151   -923
  -923    187   -923    -70
  -923    187   -923    -70
  -923   -923    210   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 4.1e+000
 0.000000  0.833333  0.000000  0.166667
 0.166667  0.666667  0.000000  0.166667
 0.000000  0.333333  0.333333  0.333333
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.666667  0.333333  0.000000
 0.166667  0.166667  0.166667  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.333333  0.166667
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.500000  0.166667  0.333333  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.500000  0.000000  0.333333
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CC[CGT]T[CG]TC[CG]CC[AG][GC]C[CA]GC[CT][GA]CCG
--------------------------------------------------------------------------------

Time 2.21 secs.

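Note on how the two Motif 1 matrices relate. The log-odds entries appear consistent with 100 * log2(p / background), where p is the site count smoothed by a pseudocount of b * background (b= 0.01 in the command line summary). This formula is an assumption inferred from the numbers printed in this report, not something the report states; the function name and argument names below are illustrative only. A minimal sketch:

```python
import math

def log_odds(count, nsites, bg, b=0.01):
    """Scaled log-odds for one matrix cell.

    Assumption: the letter count is smoothed with a pseudocount of
    b * bg before dividing by the background frequency bg.
    """
    p = (count + b * bg) / (nsites + b)
    return 100 * math.log2(p / bg)

# Background frequencies as printed in the report (rounded to 3 decimals).
background = {"A": 0.269, "C": 0.227, "G": 0.233, "T": 0.270}

# Motif 1, position 1: C occurs in 5 of the 6 sites, T in 1, A and G in none.
counts = {"A": 0, "C": 5, "G": 0, "T": 1}
scores = {base: log_odds(counts[base], 6, background[base]) for base in "ACGT"}

for base in "ACGT":
    print(base, round(scores[base], 1))
```

Because the background frequencies are only printed to three decimals, the recomputed values can differ from the reported row (-923 187 -923 -70) by about one unit. Under the same assumption, the floor value for an unobserved letter depends only on the number of sites, which would explain why it is -923 for Motif 1 (6 sites), -945 for Motif 2 (7 sites) and -982 for Motif 3 (9 sites).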
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 20 sites = 7 llr = 120 E-value = 6.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1a:36439a19::97:46:6
pos.-specific     C  6:43:631:914a::a::a:
probability       G  ::6:4::::::::11:61:4
matrix            T  3::4::4::::6::1::3::

         bits    2.1             *  *  *
                 1.9  *      *   *  *  *
                 1.7  *      *   *  *  *
                 1.5  *      **  *  *  *
Relative         1.3  *     **** ** *  *
Entropy          1.1  ** ** ******* ** **
(24.7 bits)      0.9  ** ** ********** **
                 0.6 *** ** *************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CAGTACTAACATCAACGACA
consensus            T CAGAA    C    AT G
sequence                C  C

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            --------------------
24479                       477  1.75e-10 CGGATACAAA CACCACTAACATCAACAACA TATC
25915                       446  1.90e-09 CACGAAGCAT CAGAAACAACATCAACATCA CATACAATAA
25786                       471  1.90e-09 CAGACATCCG CAGTAACAACATCATCGACA CAGCAACCAA
37790                         8  3.41e-09    ATGGATC AACTGCAAACACCAACGACG ACAACGATAC
2081                        333  1.80e-08 GCTACGCACG TAGTACAAACATCAGCGGCG TCTGTTGCAG
7540                        211  4.20e-08 GTTACGTTCT CAGAGATCACACCGACGTCG TCATACGATT
20831                       474  5.37e-08 ACGTTCGGTG TACCGCTAAACCCAACAACA GACAGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24479                             1.7e-10  476_[+2]_4
25915                             1.9e-09  445_[+2]_35
25786                             1.9e-09  470_[+2]_10
37790                             3.4e-09  7_[+2]_473
2081                              1.8e-08  332_[+2]_148
7540                              4.2e-08  210_[+2]_270
20831                             5.4e-08  473_[+2]_7
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=7
24479                    (  477) CACCACTAACATCAACAACA  1
25915                    (  446) CAGAAACAACATCAACATCA  1
25786                    (  471) CAGTAACAACATCATCGACA  1
37790                    (    8) AACTGCAAACACCAACGACG  1
2081                     (  333) TAGTACAAACATCAGCGGCG  1
7540                     (  211) CAGAGATCACACCGACGTCG  1
20831                    (  474) TACCGCTAAACCCAACAACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7696 bayes= 9.94496 E= 6.4e-001
   -91    133   -945      8
   189   -945   -945   -945
  -945     91    129   -945
     9     33   -945     66
   109   -945     88   -945
    67    133   -945   -945
     9     33   -945     66
   167    -67   -945   -945
   189   -945   -945   -945
   -91    191   -945   -945
   167    -67   -945   -945
  -945     91   -945    108
  -945    214   -945   -945
   167   -945    -71   -945
   141   -945    -71    -92
  -945    214   -945   -945
    67   -945    129   -945
   109   -945    -71      8
  -945    214   -945   -945
   109   -945     88   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 6.4e-001
 0.142857  0.571429  0.000000  0.285714
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.428571  0.571429  0.000000
 0.285714  0.285714  0.000000  0.428571
 0.571429  0.000000  0.428571  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.285714  0.285714  0.000000  0.428571
 0.857143  0.142857  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.000000  0.428571  0.000000  0.571429
 0.000000  1.000000  0.000000  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.714286  0.000000  0.142857  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.000000  0.571429  0.000000
 0.571429  0.000000  0.142857  0.285714
 0.000000  1.000000  0.000000  0.000000
 0.571429  0.000000  0.428571  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CT]A[GC][TAC][AG][CA][TAC]AACA[TC]CAAC[GA][AT]C[AG]
--------------------------------------------------------------------------------

Time 4.57 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 9 llr = 123 E-value = 1.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::1112:1:77::76:
pos.-specific     C  1::::::::31:::::
probability       G  183:94:8a:18a32:
matrix            T  8269:3a1::12::2a

         bits    2.1         *   *
                 1.9       * *   *  *
                 1.7       * *   *  *
                 1.5     * * *   *  *
Relative         1.3  * ** * *  **  *
Entropy          1.1  * ** **** *** *
(19.7 bits)      0.9 ** ** **** *** *
                 0.6 ***** **** *** *
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TGTTGGTGGAAGGAAT
consensus             TG  T   C T GG
sequence                  A        T

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ----------------
22887                       161  1.08e-08 GAAGCGTTCA TGTTGTTGGAAGGGGT GGTGGGTGAA
25915                       314  4.65e-08 GAGTTCTTTT TGGTGATGGAATGAAT CCCATCTTCA
25786                        17  9.99e-08 AGACGGATGT TGGTGGTTGAAGGAGT TGTGCATATG
8820                         70  2.51e-07 CCAAGTGACG CGTTAGTGGAAGGAAT GATTGAAGCG
2081                        128  2.68e-07 AACACGACGA TGGAGGTGGAGGGAAT GTTGTAAGGG
38651                       318  2.91e-07 TTCGTTTGTC TTTTGATGGCAGGGTT AGTTTTACGC
22312                       307  2.91e-07 GTGTCCTGGT GGTTGTTGGCATGAAT CGATTGTTAG
268286                      299  6.95e-07 GCATTTAGCA TTTTGGTGGCTGGGTT CGTCTTTGCT
2797                        149  1.11e-06 AAGATTGTGG TGATGTTAGACGGAAT TGTGACAGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22887                             1.1e-08  160_[+3]_324
25915                             4.6e-08  313_[+3]_171
25786                               1e-07  16_[+3]_468
8820                              2.5e-07  69_[+3]_415
2081                              2.7e-07  127_[+3]_357
38651                             2.9e-07  317_[+3]_167
22312                             2.9e-07  306_[+3]_178
268286                              7e-07  298_[+3]_186
2797                              1.1e-06  148_[+3]_336
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=9
22887                    (  161) TGTTGTTGGAAGGGGT  1
25915                    (  314) TGGTGATGGAATGAAT  1
25786                    (   17) TGGTGGTTGAAGGAGT  1
8820                     (   70) CGTTAGTGGAAGGAAT  1
2081                     (  128) TGGAGGTGGAGGGAAT  1
38651                    (  318) TTTTGATGGCAGGGTT  1
22312                    (  307) GGTTGTTGGCATGAAT  1
268286                   (  299) TTTTGGTGGCTGGGTT  1
2797                     (  149) TGATGTTAGACGGAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 9.88469 E= 1.2e+001
  -982   -103   -107    152
  -982   -982    174    -28
  -127   -982     51    104
  -127   -982   -982    172
  -127   -982    193   -982
   -27   -982     93     30
  -982   -982   -982    189
  -127   -982    174   -128
  -982   -982    210   -982
   131     55   -982   -982
   131   -103   -107   -128
  -982   -982    174    -28
  -982   -982    210   -982
   131   -982     51   -982
   105   -982     -7    -28
  -982   -982   -982    189
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 1.2e+001
 0.000000  0.111111  0.111111  0.777778
 0.000000  0.000000  0.777778  0.222222
 0.111111  0.000000  0.333333  0.555556
 0.111111  0.000000  0.000000  0.888889
 0.111111  0.000000  0.888889  0.000000
 0.222222  0.000000  0.444444  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.000000  0.777778  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.666667  0.111111  0.111111  0.111111
 0.000000  0.000000  0.777778  0.222222
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.555556  0.000000  0.222222  0.222222
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[GT][TG]TG[GTA]TGG[AC]A[GT]G[AG][AGT]T
--------------------------------------------------------------------------------

Time 6.95 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
2081                             1.06e-07  127_[+3(2.68e-07)]_189_\
    [+2(1.80e-08)]_148
20831                            4.15e-05  125_[+1(6.60e-05)]_327_\
    [+2(5.37e-08)]_7
22312                            1.69e-08  306_[+3(2.91e-07)]_148_\
    [+1(1.83e-09)]_9
22887                            1.49e-05  160_[+3(1.08e-08)]_324
24112                            4.35e-01  500
24479                            1.05e-06  476_[+2(1.75e-10)]_4
24718                            5.57e-05  38_[+1(1.54e-08)]_441
25786                            2.11e-16  16_[+3(9.99e-08)]_114_\
    [+2(1.54e-05)]_221_[+1(1.02e-11)]_62_[+2(1.90e-09)]_10
25915                            2.39e-09  313_[+3(4.65e-08)]_116_\
    [+2(1.90e-09)]_35
268286                           4.73e-07  298_[+3(6.95e-07)]_29_\
    [+1(1.82e-08)]_136
2797                             6.95e-03  148_[+3(1.11e-06)]_336
37790                            2.70e-10  7_[+2(3.41e-09)]_41_[+1(3.81e-09)]_\
    173_[+2(7.38e-05)]_15_[+1(5.11e-05)]_182
38651                            9.90e-04  317_[+3(2.91e-07)]_167
6616                             2.26e-01  500
7540                             3.37e-04  210_[+2(4.20e-08)]_270
8820                             1.80e-09  69_[+3(2.51e-07)]_165_\
    [+3(3.23e-05)]_5_[+1(2.82e-10)]_208
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
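
Note on reading the motif diagrams. In the block diagrams above, plain numbers appear to be spacer lengths and each bracketed token a motif occurrence (strand sign, motif number, optional p-value), so spacers plus motif widths should add up to the sequence length of 500 given in the training set table. The sketch below checks that reading; the helper name `diagram_length` and the `MOTIF_WIDTHS` table are illustrative and not part of MEME.

```python
import re

# Motif widths as reported in the three MOTIF header lines of this report.
MOTIF_WIDTHS = {1: 21, 2: 20, 3: 16}

def diagram_length(diagram, widths=MOTIF_WIDTHS):
    """Sum the spacers and motif widths in a diagram such as '387_[+1]_92'."""
    # Join wrapped diagrams: drop the trailing backslash and any whitespace.
    diagram = re.sub(r"[\\\s]", "", diagram)
    total = 0
    for token in diagram.split("_"):
        # A motif token looks like [+2] or [+2(1.80e-08)].
        match = re.fullmatch(r"\[[+-](\d+)(?:\([^)]*\))?\]", token)
        if match:
            total += widths[int(match.group(1))]
        else:
            total += int(token)  # spacer length in bases
    return total

print(diagram_length("387_[+1]_92"))
print(diagram_length(r"127_[+3(2.68e-07)]_189_\ [+2(1.80e-08)]_148"))
```

For every diagram in this report the total comes to 500, which matches the listed sequence lengths. The spacer before a motif is also one less than the 1-based Start column in the sites tables (for example 387 versus 388 for sequence 25786, Motif 1).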