********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/496/496.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42565                    1.0000    500  32176                    1.0000    500
43043                    1.0000    500  43099                    1.0000    500
9617                     1.0000    500  46399                    1.0000    500
2394                     1.0000    500  48752                    1.0000    500
32847                    1.0000    500  49980                    1.0000    500
10583                    1.0000    500  11041                    1.0000    500
11319                    1.0000    500  54394                    1.0000    500
35579                    1.0000    500  32119                    1.0000    500
50627                    1.0000    500  48138                    1.0000    500
47545                    1.0000    500  34343                    1.0000    500
********************************************************************************


********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/496/496.seqs.fa -oc motifs/496 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 20 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 10000 N= 20 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.270 C 0.258 G 0.221 T 0.251 Background letter frequencies (from dataset with add-one prior applied): A 0.269 C 0.258 G 0.221 T 0.251 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 21 sites = 9 llr = 136 E-value = 4.8e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :1:3::31::::7:1123::: pos.-specific C :9714461121a1::3:1:11 probability G a:3612:1::2::266:::46 matrix T ::::4317987:283:86a43 bits 2.2 * 2.0 * * * 1.7 * * * 1.5 ** * * * Relative 1.3 ** ** * * * * Entropy 1.1 *** ** * * * * (21.8 bits) 0.9 *** **** * * * 0.7 ***** *************** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel GCCGCCCTTTTCATGGTTTGG consensus GATTA CG TGTCAA TT sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 46399 331 3.90e-10 
CTGTGAATAC GCGGCCATTTTCATGCTTTTT CGGAAAGGGG
35579             43  5.99e-10 CTTTCCTTTT GCCGCTATTTCCATGGTTTTG ACGGCCATTT
34343             10  4.43e-09  ATCTTATTT GCGGTTATTTGCAGTGTTTGG GCCCAAAACA
49980            300  3.00e-08 AGCCCGCACC GCCGTGCTCCTCCTGGTTTGG ATGCTCTGCT
48138            298  6.67e-08 CAGCACACTC GACCCGCTTTTCATGATTTGG GGAAATCGAA
10583             27  1.36e-07 CAGACAGAGC GCCGTCCATTGCATAGTCTTT CATAAGCAGA
48752             13  1.82e-07 TGACGGTAGA GCGAGCCCTTTCTTTCTATTG TTCGTCGTGG
54394            443  4.02e-07 TAACTGTGTG GCCACCTGTTTCATGCAATGC AGTTATTCTC
42565            151  6.74e-07 GATTCTCGAT GCCATTCTTCTCTGTGAATCT CGCGCAAAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46399                    3.9e-10           330_[+1]_149
35579                    6e-10             42_[+1]_437
34343                    4.4e-09           9_[+1]_470
49980                    3e-08             299_[+1]_180
48138                    6.7e-08           297_[+1]_182
10583                    1.4e-07           26_[+1]_453
48752                    1.8e-07           12_[+1]_467
54394                    4e-07             442_[+1]_37
42565                    6.7e-07           150_[+1]_329
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
46399                    (  331) GCGGCCATTTTCATGCTTTTT  1
35579                    (   43) GCCGCTATTTCCATGGTTTTG  1
34343                    (   10) GCGGTTATTTGCAGTGTTTGG  1
49980                    (  300) GCCGTGCTCCTCCTGGTTTGG  1
48138                    (  298) GACCCGCTTTTCATGATTTGG  1
10583                    (   27) GCCGTCCATTGCATAGTCTTT  1
48752                    (   13) GCGAGCCCTTTCTTTCTATTG  1
54394                    (  443) GCCACCTGTTTCATGCAATGC  1
42565                    (  151) GCCATTCTTCTCTGTGAATCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9600 bayes= 10.192 E= 4.8e+002
  -982  -982   218  -982
  -128   178  -982  -982
  -982   137    59  -982
    31  -122   133  -982
  -982    78   -99    82
  -982    78     1    41
    31   110  -982  -118
  -128  -122   -99   141
  -982  -122  -982   182
  -982   -22  -982   163
  -982  -122     1   141
  -982   195  -982  -982
   131  -122  -982   -18
  -982  -982     1   163
  -128  -982   133    41
  -128    37   133  -982
   -28  -982  -982   163
    31  -122  -982   114
  -982  -982  -982   199
  -982  -122   101    82
  -982  -122   133    41
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 4.8e+002
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.888889  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.333333  0.111111  0.555556  0.000000
 0.000000  0.444444  0.111111  0.444444
 0.000000  0.444444  0.222222  0.333333
 0.333333  0.555556  0.000000  0.111111
 0.111111  0.111111  0.111111  0.666667
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.111111  0.222222  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.111111  0.000000  0.222222
 0.000000  0.000000  0.222222  0.777778
 0.111111  0.000000  0.555556  0.333333
 0.111111  0.333333  0.555556  0.000000
 0.222222  0.000000  0.000000  0.777778
 0.333333  0.111111  0.000000  0.555556
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.111111  0.444444  0.444444
 0.000000  0.111111  0.555556  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
GC[CG][GA][CT][CTG][CA]TT[TC][TG]C[AT][TG][GT][GC][TA][TA]T[GT][GT]
-------------------------------------------------------------------------------- Time 3.63 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 12 llr = 158 E-value = 1.4e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::6:1::1:::734311:3:3 pos.-specific C :7:832:3:632351:283:5 probability G 5:113::4:18:21:83::9: matrix T 533138a3a3:23:6143412 bits 2.2 2.0 * * 1.7 * * * 1.5 * * * Relative 1.3 ** * * * * Entropy 1.1 ** * ** * * * * * (19.1 bits) 0.9 ** * ** * * * * * 0.7 **** ** **** *** * * 0.4 **** ** **** *** **** 0.2 ************ ******** 0.0 --------------------- Multilevel GCACGTTGTCGATCTGTCTGC consensus TTT T C TC AAA GTC A sequence C T C A -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 2394 288 4.71e-13 CAAAGTCGTT GCACGTTGTCGATCTGGCTGC AATCTAACGA 46399 184 2.99e-08 CCGTCACGAC GTACGTTCTCGCTATGCCTGC TGCGCACGCC 50627 231 3.38e-08 GTTTTTCTAC GCACGCTGTCGAGCAGTTTGC TCTTTACTGC 43099 265 1.04e-07 GCACGCACGC GCTCATTGTTGAAGTGTCCGC ACACACAAGC 43043 477 4.28e-07 GGTCTCACCA TCACTTTATCGAACTGTCATA ACC 54394 49 5.95e-07 CACGGGGTCG GCACTCTCTCGACACGGTCGC AGAATATATC 35579 388 7.54e-07 ATCGGAAATG TCACTTTTTTGTACAGACCGT GAAAAAATGG 48138 271 8.14e-07 GGACAAATTA GCTCGTTTTCCTTCTATCAGC ACACTCGACC 34343 398 1.10e-06 CACCAATAAG TTGCTTTGTTCCCCTGTCAGA ACTGTAGCTA 48752 361 1.10e-06 AATACCGGGC 
TCAGCTTGTTGATATTGCTGT TGTTGGTCTT
42565            350  2.03e-06 AGAAACAATT TTTTCTTCTCCACAAGGCCGA AATCGTGGTG
11041            392  2.31e-06 AATCTTCGGA TTTCCTTTTGGAGAAGCTTGA CAGCCTTTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
2394                     4.7e-13           287_[+2]_192
46399                    3e-08             183_[+2]_296
50627                    3.4e-08           230_[+2]_249
43099                    1e-07             264_[+2]_215
43043                    4.3e-07           476_[+2]_3
54394                    6e-07             48_[+2]_431
35579                    7.5e-07           387_[+2]_92
48138                    8.1e-07           270_[+2]_209
34343                    1.1e-06           397_[+2]_82
48752                    1.1e-06           360_[+2]_119
42565                    2e-06             349_[+2]_130
11041                    2.3e-06           391_[+2]_88
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=12
2394                     (  288) GCACGTTGTCGATCTGGCTGC  1
46399                    (  184) GTACGTTCTCGCTATGCCTGC  1
50627                    (  231) GCACGCTGTCGAGCAGTTTGC  1
43099                    (  265) GCTCATTGTTGAAGTGTCCGC  1
43043                    (  477) TCACTTTATCGAACTGTCATA  1
54394                    (   49) GCACTCTCTCGACACGGTCGC  1
35579                    (  388) TCACTTTTTTGTACAGACCGT  1
48138                    (  271) GCTCGTTTTCCTTCTATCAGC  1
34343                    (  398) TTGCTTTGTTCCCCTGTCAGA  1
48752                    (  361) TCAGCTTGTTGATATTGCTGT  1
42565                    (  350) TTTTCTTCTCCACAAGGCCGA  1
11041                    (  392) TTTCCTTTTGGAGAAGCTTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9600 bayes= 10.09 E= 1.4e+002
 -1023 -1023   118    99
 -1023   137 -1023    41
   111 -1023  -140    41
 -1023   169  -140  -159
  -169    -5    59    41
 -1023   -63 -1023   173
 -1023 -1023 -1023   199
  -169    -5    92    -1
 -1023 -1023 -1023   199
 -1023   117  -140    41
 -1023    -5   176 -1023
   131   -63 -1023   -59
   -11    -5   -41    41
    63    95  -140 -1023
    31  -163 -1023   121
  -169 -1023   192  -159
  -169   -63    59    73
 -1023   154 -1023    -1
   -11    37 -1023    73
 -1023 -1023   205  -159
    31    95 -1023   -59
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 12 E= 1.4e+002
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.666667  0.000000  0.333333
 0.583333  0.000000  0.083333  0.333333
 0.000000  0.833333  0.083333  0.083333
 0.083333  0.250000  0.333333  0.333333
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.083333  0.250000  0.416667  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.583333  0.083333  0.333333
 0.000000  0.250000  0.750000  0.000000
 0.666667  0.166667  0.000000  0.166667
 0.250000  0.250000  0.166667  0.333333
 0.416667  0.500000  0.083333  0.000000
 0.333333  0.083333  0.000000  0.583333
 0.083333  0.000000  0.833333  0.083333
 0.083333  0.166667  0.333333  0.416667
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.333333  0.000000  0.416667
 0.000000  0.000000  0.916667  0.083333
 0.333333  0.500000  0.000000  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[GT][CT][AT]C[GTC]TT[GCT]T[CT][GC]A[TAC][CA][TA]G[TG][CT][TCA]G[CA]
--------------------------------------------------------------------------------

Time 7.12 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 9 llr = 119 E-value = 2.0e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :1:3::::::22a:3: pos.-specific C 9::::8323:73:1:: probability G :222:27:::12:::3 matrix T 1784a::87a:2:977 bits 2.2 2.0 * * * 1.7 * * * 1.5 * * * ** Relative 1.3 * * ** * * ** Entropy 1.1 * * ****** **** (19.0 bits) 0.9 *** ****** **** 0.7 *** ******* **** 0.4 *********** **** 0.2 *********** **** 0.0 ---------------- Multilevel CTTTTCGTTTCCATTT consensus GGA GCCC AA AG sequence G G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 46399 478 1.02e-09 AACACGGAGT CTTATCGTTTCCATTT TAGCACA 48752 198 2.58e-07 GATCCGCCTT CTTTTGGTTTATATTG TTCGGACCAA 11319 116 2.80e-07 GCGCGAACTC TGTTTCGTTTCGATTT ATAGAGTACG 32176 246 3.12e-07 TTTCTCAATG CTTATCCTTTCCACTT TTTCGACCGC 10583 205 5.78e-07 GGAAAGATAA CTTTTGGTTTATATAG ATCTAAACAT 32847 113 6.76e-07 GTATCAAAAT CTGATCGTCTGGATTT ACATTGGAAC 43099 26 7.84e-07 AAACGGATCA CATTTCCTCTCCATAT ATTGCTATGA 50627 172 1.13e-06 AACTTATAAA CTTGTCCCCTCAATAT AGGGATATTG 49980 237 1.46e-06 CCTAGTCTCG CGGGTCGCTTCAATTG ACGTGAATGG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46399                    1e-09             477_[+3]_7
48752                    2.6e-07           197_[+3]_287
11319                    2.8e-07           115_[+3]_369
32176                    3.1e-07           245_[+3]_239
10583                    5.8e-07           204_[+3]_280
32847                    6.8e-07           112_[+3]_372
43099                    7.8e-07           25_[+3]_459
50627                    1.1e-06           171_[+3]_313
49980                    1.5e-06           236_[+3]_248
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=9
46399                    (  478) CTTATCGTTTCCATTT  1
48752                    (  198) CTTTTGGTTTATATTG  1
11319                    (  116) TGTTTCGTTTCGATTT  1
32176                    (  246) CTTATCCTTTCCACTT  1
10583                    (  205) CTTTTGGTTTATATAG  1
32847                    (  113) CTGATCGTCTGGATTT  1
43099                    (   26) CATTTCCTCTCCATAT  1
50627                    (  172) CTTGTCCCCTCAATAT  1
49980                    (  237) CGGGTCGCTTCAATTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 10.2069 E= 2.0e+002
  -982   178  -982  -118
  -128  -982     1   141
  -982  -982     1   163
    31  -982     1    82
  -982  -982  -982   199
  -982   159     1  -982
  -982    37   159  -982
  -982   -22  -982   163
  -982    37  -982   141
  -982  -982  -982   199
   -28   137   -99  -982
   -28    37     1   -18
   189  -982  -982  -982
  -982  -122  -982   182
    31  -982  -982   141
  -982  -982    59   141
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 2.0e+002
 0.000000  0.888889  0.000000  0.111111
 0.111111  0.000000  0.222222  0.666667
 0.000000  0.000000  0.222222  0.777778
 0.333333  0.000000  0.222222  0.444444
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.777778  0.222222  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.222222  0.666667  0.111111  0.000000
 0.222222  0.333333  0.222222  0.222222
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.000000  0.333333  0.666667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
C[TG][TG][TAG]T[CG][GC][TC][TC]T[CA][CAGT]AT[TA][TG]
--------------------------------------------------------------------------------

Time 10.97 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42565                    1.37e-05          150_[+1(6.74e-07)]_178_\
                                           [+2(2.03e-06)]_130
32176                    9.37e-04          245_[+3(3.12e-07)]_239
43043                    3.00e-03          476_[+2(4.28e-07)]_3
43099                    5.63e-07          25_[+3(7.84e-07)]_223_\
                                           [+2(1.04e-07)]_215
9617                     8.70e-01          500
46399                    1.18e-15          183_[+2(2.99e-08)]_126_\
                                           [+1(3.90e-10)]_126_[+3(1.02e-09)]_7
2394                     9.63e-09          287_[+2(4.71e-13)]_192
48752                    2.08e-09          12_[+1(1.82e-07)]_164_\
                                           [+3(2.58e-07)]_147_[+2(1.10e-06)]_119
32847                    3.64e-03          112_[+3(6.76e-07)]_372
49980                    1.07e-06          236_[+3(1.46e-06)]_47_\
                                           [+1(3.00e-08)]_56_[+1(5.53e-05)]_103
10583                    2.27e-06          26_[+1(1.36e-07)]_157_\
                                           [+3(5.78e-07)]_280
11041                    5.31e-03          391_[+2(2.31e-06)]_88
11319                    3.28e-04          115_[+3(2.80e-07)]_77_\
                                           [+1(7.17e-05)]_271
54394                    2.51e-06          48_[+2(5.95e-07)]_73_[+1(7.98e-05)]_\
                                           279_[+1(4.02e-07)]_37
35579                    1.36e-08          42_[+1(5.99e-10)]_324_\
                                           [+2(7.54e-07)]_92
32119                    7.91e-01          500
50627                    1.40e-06          171_[+3(1.13e-06)]_43_\
                                           [+2(3.38e-08)]_45_[+2(7.15e-05)]_183
48138                    9.35e-07          270_[+2(8.14e-07)]_6_[+1(6.67e-08)]_\
                                           182
47545                    3.98e-01          500
34343                    2.41e-07          9_[+1(4.43e-09)]_367_[+2(1.10e-06)]_\
                                           82
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
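
The regular expression reported for each motif is a compact summary of the probability matrix, so it need not match every site that contributed to the motif. The following is a minimal Python sketch, not part of the MEME output, that checks which of the Motif 3 sites are matched in full by the Motif 3 regular expression. It assumes the pattern and the sites are copied by hand from the "Motif 3 regular expression" and "Motif 3 in BLOCKS format" sections; the names MOTIF3_REGEX, MOTIF3_SITES and sites_matching_regex are illustrative only.

```python
import re

# Pattern copied from the "Motif 3 regular expression" section of this report.
MOTIF3_REGEX = re.compile(r"C[TG][TG][TAG]T[CG][GC][TC][TC]T[CA][CAGT]AT[TA][TG]")

# (sequence name, 1-based start, site) copied from "Motif 3 in BLOCKS format".
MOTIF3_SITES = [
    ("46399", 478, "CTTATCGTTTCCATTT"),
    ("48752", 198, "CTTTTGGTTTATATTG"),
    ("11319", 116, "TGTTTCGTTTCGATTT"),
    ("32176", 246, "CTTATCCTTTCCACTT"),
    ("10583", 205, "CTTTTGGTTTATATAG"),
    ("32847", 113, "CTGATCGTCTGGATTT"),
    ("43099", 26, "CATTTCCTCTCCATAT"),
    ("50627", 172, "CTTGTCCCCTCAATAT"),
    ("49980", 237, "CGGGTCGCTTCAATTG"),
]


def sites_matching_regex(pattern, sites):
    """Return (name, start) for every site that the pattern matches in full."""
    return [(name, start) for name, start, site in sites if pattern.fullmatch(site)]


matched = sites_matching_regex(MOTIF3_REGEX, MOTIF3_SITES)
print(f"{len(matched)} of {len(MOTIF3_SITES)} sites match the regular expression")
for name, start in matched:
    print(name, start)
```

Sites that fail to match differ from the pattern at a position where their letter is rare in the probability matrix; scanning with the log-odds matrix (for example through MAST) scores such sites instead of rejecting them outright.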