********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/499/499.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31652                    1.0000    500  36356                    1.0000    500
47930                    1.0000    500  38634                    1.0000    500
16517                    1.0000    500  33768                    1.0000    500
44549                    1.0000    500  34672                    1.0000    500
45215                    1.0000    500  43934                    1.0000    500
49050                    1.0000    500  49327                    1.0000    500
47230                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/499/499.seqs.fa -oc motifs/499 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.244 C 0.240 G 0.240 T 0.276
Background letter frequencies (from dataset with add-one prior applied):
A 0.244 C 0.240 G 0.240 T 0.276
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   9  llr = 132  E-value = 1.7e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  12:9492::64:::11:23a
pos.-specific     C  :29:6:1:a22:424224::
probability       G  9:11:14::::1:841417:
matrix            T  :6::::2a:2396::632::

         bits    2.1         *          *
                 1.9        **          *
                 1.6 * *    **          *
                 1.4 * ** * **  *       *
Relative         1.2 * ** * **  * *    **
Entropy          1.0 * **** **  ***    **
(21.2 bits)      0.8 * **** **  ***    **
                 0.6 * **** *** ****   **
                 0.4 ****** ******** * **
                 0.2 ********************
                 0.0 --------------------

Multilevel           GTCACAGTCAATTGCTGCGA
consensus             A  A A  CT CCGCTAA
sequence              C    T  TC     CT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
43934                       431  2.52e-11 CTCATTCACT GTCACAATCAATTGGTGCGA CCAAAAATCT
45215                       414  5.38e-09 AGAGGAACTA GTCAAAGTCTCTCGCTGCAA GGCATTGCAT
31652                        53  7.16e-09 TCTTCCTCGT GTCACAGTCATTTGCGTAGA ACAAATTACA
49327                       343  1.33e-08 GTTCGCAGCA GACACAATCAATCCGTCCGA CACATCCATC
49050                       452  1.91e-08 TGCAGTCAGA GTCACAGTCCATTGCATTGA GACCGCCGTA
47230                        41  2.63e-07 ACGTGATGGA GCCAAGCTCATTCGGCTCGA CTGGCTAGTC
47930                       176  7.16e-07 CGTGTTCGTT ATCACATTCCATCGACGAAA TGCCGCAACG
34672                       413  8.72e-07 CATCGACGAC GACGAATTCACTTCGTCGGA CGCGACGATC
44549                       139  1.05e-06 GATGCAGGCT GCGAAAGTCTTGTGCTGTAA GTGGTGACTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43934                             2.5e-11  430_[+1]_50
45215                             5.4e-09  413_[+1]_67
31652                             7.2e-09  52_[+1]_428
49327                             1.3e-08  342_[+1]_138
49050                             1.9e-08  451_[+1]_29
47230                             2.6e-07  40_[+1]_440
47930                             7.2e-07  175_[+1]_305
34672                             8.7e-07  412_[+1]_68
44549                               1e-06  138_[+1]_342
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=9
43934                    (  431) GTCACAATCAATTGGTGCGA  1
45215                    (  414) GTCAAAGTCTCTCGCTGCAA  1
31652                    (   53) GTCACAGTCATTTGCGTAGA  1
49327                    (  343) GACACAATCAATCCGTCCGA  1
49050                    (  452) GTCACAGTCCATTGCATTGA  1
47230                    (   41) GCCAAGCTCATTCGGCTCGA  1
47930                    (  176) ATCACATTCCATCGACGAAA  1
34672                    (  413) GACGAATTCACTTCGTCGGA  1
44549                    (  139) GCGAAAGTCTTGTGCTGTAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6253 bayes= 9.57282 E= 1.7e+000
  -113   -982    189   -982
   -13    -11   -982    101
  -982    189   -111   -982
   187   -982   -111   -982
    87    121   -982   -982
   187   -982   -111   -982
   -13   -111     89    -31
  -982   -982   -982    185
  -982    206   -982   -982
   119    -11   -982    -31
    87    -11   -982     27
  -982   -982   -111    168
  -982     89   -982    101
  -982    -11    170   -982
  -113     89     89   -982
  -113    -11   -111    101
  -982    -11     89     27
   -13     89   -111    -31
    45   -982    148   -982
   204   -982   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 9 E= 1.7e+000
 0.111111  0.000000  0.888889  0.000000
 0.222222  0.222222  0.000000  0.555556
 0.000000  0.888889  0.111111  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.444444  0.555556  0.000000  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.222222  0.111111  0.444444  0.222222
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.555556  0.222222  0.000000  0.222222
 0.444444  0.222222  0.000000  0.333333
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.444444  0.000000  0.555556
 0.000000  0.222222  0.777778  0.000000
 0.111111  0.444444  0.444444  0.000000
 0.111111  0.222222  0.111111  0.555556
 0.000000  0.222222  0.444444  0.333333
 0.222222  0.444444  0.111111  0.222222
 0.333333  0.000000  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[TAC]CA[CA]A[GAT]TC[ACT][ATC]T[TC][GC][CG][TC][GTC][CAT][GA]A
--------------------------------------------------------------------------------




Time  1.56 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   8  llr = 94  E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::61:::::::
pos.-specific     C  1::1:81::1:1
probability       G  9a:3::::1:5:
matrix            T  ::a:939a9959

         bits    2.1  *
                 1.9  **    *
                 1.6  **    *
                 1.4 ***    *
Relative         1.2 *** ****** *
Entropy          1.0 *** ********
(16.9 bits)      0.8 ************
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GGTATCTTTTGT
consensus               G T    T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
49327                       424  9.88e-08 TTCAGCGAGT GGTATCTTTTGT GAAGACTCTT
31652                       185  1.31e-06 GTGTTGAATT GGTATCTTTCGT TGTCGTTCCT
34672                        50  1.90e-06 TGTATAGTAC GGTATCCTTTTT TCCCAAGTCT
43934                       221  2.01e-06 TCGTGGTCGT CGTATCTTTTTT GGTTGGTTCT
49050                        29  2.25e-06 ACAAACTCGG GGTGTTTTTTTT GCATTGGAAC
44549                       161  2.68e-06 TGCTGTAAGT GGTGACTTTTGT ATTGCCGGTA
33768                        58  3.37e-06 TGGTAGTCTG GGTCTTTTTTGT GATTGGGAAC
45215                       318  8.09e-06 GAATGAGCGT GGTATCTTGTTC CAGCCGTGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49327                             9.9e-08  423_[+2]_65
31652                             1.3e-06  184_[+2]_304
34672                             1.9e-06  49_[+2]_439
43934                               2e-06  220_[+2]_268
49050                             2.3e-06  28_[+2]_460
44549                             2.7e-06  160_[+2]_328
33768                             3.4e-06  57_[+2]_431
45215                             8.1e-06  317_[+2]_171
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=8
49327                    (  424) GGTATCTTTTGT  1
31652                    (  185) GGTATCTTTCGT  1
34672                    (   50) GGTATCCTTTTT  1
43934                    (  221) CGTATCTTTTTT  1
49050                    (   29) GGTGTTTTTTTT  1
44549                    (  161) GGTGACTTTTGT  1
33768                    (   58) GGTCTTTTTTGT  1
45215                    (  318) GGTATCTTGTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.63231 E= 1.5e+002
  -965    -94    187   -965
  -965   -965    206   -965
  -965   -965   -965    185
   136    -94      6   -965
   -96   -965   -965    166
  -965    164   -965    -14
  -965    -94   -965    166
  -965   -965   -965    185
  -965   -965    -94    166
  -965    -94   -965    166
  -965   -965    106     85
  -965    -94   -965    166
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 1.5e+002
 0.000000  0.125000  0.875000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.625000  0.125000  0.250000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.125000  0.000000  0.875000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GGT[AG]T[CT]TTTT[GT]T
--------------------------------------------------------------------------------




Time  3.13 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =  13  llr = 120  E-value = 1.9e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :118::::282:
pos.-specific     C  2:5:92::426:
probability       G  1241::a:3:22
matrix            T  771118:a2::8

         bits    2.1       *
                 1.9       **
                 1.6     * **
                 1.4     * ** *
Relative         1.2    ** ** *
Entropy          1.0    ***** * *
(13.3 bits)      0.8 ** ***** * *
                 0.6 ** ***** ***
                 0.4 ******** ***
                 0.2 ************
                 0.0 ------------

Multilevel           TTCACTGTCACT
consensus            CGG  C  G GG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
43934                       386  8.08e-07 GAACAGAGCG TTCACCGTCACT GTCGGTTTCT
47930                       232  1.52e-06 TCAGAAGAAG TTGACTGTTACT GTTATGTCTG
34672                       124  1.94e-06 CCCTATTCCA TTCACTGTCAAT AGAAAGAGAT
31652                       164  2.07e-06 ACATTTCTGC TTGACTGTGACG TGTTGAATTG
44549                       405  5.73e-06 CAAAATTTTA TTGACTGTCAGG AACTCCTCCC
49050                        78  9.27e-06 GTTTACTGCC TGCACTGTCCCT TAGTTTCCGC
49327                       283  2.13e-05 ATTTTCGATC CTCACCGTCAAT CCGAGTCCTA
16517                       381  2.92e-05 GCAACAACAA CTAACTGTAACT GTAACACCAG
47230                       195  3.59e-05 GGTATGTGGT CTGTCTGTGACT GTGTAATTGG
36356                        24  5.39e-05 CGTCTCCATA TGGACCGTGAGG ATCATTGTCT
45215                       197  1.06e-04 AGCTGGTCGT TTCGTTGTGACT CATTGAGAGT
33768                         5  1.54e-04       GTTA TACACTGTTCGT ACTGTGCTAC
38634                        62  1.62e-04 ACCGGATCAT GGTACTGTAACT ATAAATTACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43934                             8.1e-07  385_[+3]_103
47930                             1.5e-06  231_[+3]_257
34672                             1.9e-06  123_[+3]_365
31652                             2.1e-06  163_[+3]_325
44549                             5.7e-06  404_[+3]_84
49050                             9.3e-06  77_[+3]_411
49327                             2.1e-05  282_[+3]_206
16517                             2.9e-05  380_[+3]_108
47230                             3.6e-05  194_[+3]_294
36356                             5.4e-05  23_[+3]_465
45215                             0.00011  196_[+3]_292
33768                             0.00015  4_[+3]_484
38634                             0.00016  61_[+3]_427
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=13
43934                    (  386) TTCACCGTCACT  1
47930                    (  232) TTGACTGTTACT  1
34672                    (  124) TTCACTGTCAAT  1
31652                    (  164) TTGACTGTGACG  1
44549                    (  405) TTGACTGTCAGG  1
49050                    (   78) TGCACTGTCCCT  1
49327                    (  283) CTCACCGTCAAT  1
16517                    (  381) CTAACTGTAACT  1
47230                    (  195) CTGTCTGTGACT  1
36356                    (   24) TGGACCGTGAGG  1
45215                    (  197) TTCGTTGTGACT  1
33768                    (    5) TACACTGTTCGT  1
38634                    (   62) GGTACTGTAACT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 8.93074 E= 1.9e+003
 -1035     -6   -164    132
  -166  -1035     -5    132
  -166     94     68   -184
   180  -1035   -164   -184
 -1035    194  -1035   -184
 -1035     -6  -1035    148
 -1035  -1035    206  -1035
 -1035  -1035  -1035    185
   -66     68     36    -84
   180    -64  -1035  -1035
   -66    136     -5  -1035
 -1035  -1035     -5    148
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 1.9e+003
 0.000000  0.230769  0.076923  0.692308
 0.076923  0.000000  0.230769  0.692308
 0.076923  0.461538  0.384615  0.076923
 0.846154  0.000000  0.076923  0.076923
 0.000000  0.923077  0.000000  0.076923
 0.000000  0.230769  0.000000  0.769231
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.153846  0.384615  0.307692  0.153846
 0.846154  0.153846  0.000000  0.000000
 0.153846  0.615385  0.230769  0.000000
 0.000000  0.000000  0.230769  0.769231
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TC][TG][CG]AC[TC]GT[CG]A[CG][TG]
--------------------------------------------------------------------------------




Time  4.54 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31652                            8.63e-10  52_[+1(7.16e-09)]_91_[+3(2.07e-06)]_\
    9_[+2(1.31e-06)]_304
36356                            1.35e-01  23_[+3(5.39e-05)]_465
47930                            2.50e-05  175_[+1(7.16e-07)]_36_\
    [+3(1.52e-06)]_108_[+3(7.38e-05)]_137
38634                            1.15e-01  500
16517                            5.01e-02  118_[+3(9.51e-05)]_250_\
    [+3(2.92e-05)]_108
33768                            4.24e-03  57_[+2(3.37e-06)]_431
44549                            4.13e-07  138_[+1(1.05e-06)]_2_[+2(2.68e-06)]_\
    232_[+3(5.73e-06)]_84
34672                            9.54e-08  49_[+2(1.90e-06)]_62_[+3(1.94e-06)]_\
    277_[+1(8.72e-07)]_68
45215                            1.30e-07  317_[+2(8.09e-06)]_84_\
    [+1(5.38e-09)]_67
43934                            2.73e-12  220_[+2(2.01e-06)]_153_\
    [+3(8.08e-07)]_33_[+1(2.52e-11)]_50
49050                            1.41e-08  28_[+2(2.25e-06)]_37_[+3(9.27e-06)]_\
    362_[+1(1.91e-08)]_29
49327                            1.21e-09  282_[+3(2.13e-05)]_48_\
    [+1(1.33e-08)]_61_[+2(9.88e-08)]_65
47230                            1.63e-04  40_[+1(2.63e-07)]_134_\
    [+3(3.59e-05)]_294
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************