********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/232/232.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8950                     1.0000    500  8926                     1.0000    500
8112                     1.0000    500  43233                    1.0000    500
42116                    1.0000    500  37519                    1.0000    500
38448                    1.0000    500  38525                    1.0000    500
39274                    1.0000    500  55112                    1.0000    500
50238                    1.0000    500  42361                    1.0000    500
45443                    1.0000    500  45510                    1.0000    500
12265                    1.0000    500  33457                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/232/232.seqs.fa -oc motifs/232 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.249 C 0.261 G 0.246 T 0.244
Background letter frequencies (from dataset with add-one prior applied):
A 0.249 C 0.261 G 0.246 T 0.245
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 10 llr = 140 E-value = 1.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  32:2::36::2:12:3:41::
pos.-specific     C  6::4:112:3:9::94:631:
probability       G  :5::8332158:6::2a:373
matrix            T  13a4263:92:13811::327

         bits    2.0   *             *
                 1.8   *             *
                 1.6   *     *       *
                 1.4   *     *  *  * *
Relative         1.2   * *   * ** ** *   *
Entropy          1.0   * *   * ** ** **  *
(20.1 bits)      0.8   * **  * ***** ** **
                 0.6 *** ** ******** ** **
                 0.4 ****** ******** ** **
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CGTCGTAATGGCGTCCGCCGT
consensus            AT TTGGC CA TA A AGTG
sequence              A A  TG T     G  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
42116                       428  2.03e-10 TTGGTACGGC CGTTTTGATGGCGTCCGCCGT TGCCGCATTC
37519                       394  2.45e-09 ATCACATATG CTTAGTAATTGCGTCAGCTGT ACAGTCTAGC
42361                       323  1.16e-08 TCACGACAAC CGTCGGTATCGCTTCCGCCTT TCAGCCAGCG
45443                       101  2.38e-08 TCGACTACGT CATCGTTCTGACGTCAGCCGT CGTGGATGTC
33457                       394  4.17e-08 ATCGGCACTC CATTGTGATGGCTTCTGATGG CCGACGGAAT
12265                       365  4.26e-07 ATGGTAAAAC AGTCGCACTCGCTACCGCTGT TGTCGTGTCG
43233                       241  4.93e-07 AGGTAGTAGT AGTAGTAATGGCGACGGAGCG AGCGACGAAA
38525                       195  9.63e-07 ACGTGCGTTG CTTTGGCGTTGCGTTGGAGGT TGGAGGCGTG
50238                       190  1.82e-06 TTTTTTCGCA AGTCGTTGGCGTGTCCGAGTT GCAGCATGAT
45510                       274  4.21e-06 GCATTCTAGT TTTTTGGATGACATCAGCAGG CGATTTTTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42116                               2e-10  427_[+1]_52
37519                             2.4e-09  393_[+1]_86
42361                             1.2e-08  322_[+1]_157
45443                             2.4e-08  100_[+1]_379
33457                             4.2e-08  393_[+1]_86
12265                             4.3e-07  364_[+1]_115
43233                             4.9e-07  240_[+1]_239
38525                             9.6e-07  194_[+1]_285
50238                             1.8e-06  189_[+1]_290
45510                             4.2e-06  273_[+1]_206
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=10
42116                    (  428) CGTTTTGATGGCGTCCGCCGT  1
37519                    (  394) CTTAGTAATTGCGTCAGCTGT  1
42361                    (  323) CGTCGGTATCGCTTCCGCCTT  1
45443                    (  101) CATCGTTCTGACGTCAGCCGT  1
33457                    (  394) CATTGTGATGGCTTCTGATGG  1
12265                    (  365) AGTCGCACTCGCTACCGCTGT  1
43233                    (  241) AGTAGTAATGGCGACGGAGCG  1
38525                    (  195) CTTTGGCGTTGCGTTGGAGGT  1
50238                    (  190) AGTCGTTGGCGTGTCCGAGTT  1
45510                    (  274) TTTTTGGATGACATCAGCAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 10.5274 E= 1.4e+002
    27    120   -997   -129
   -32   -997    102     29
  -997   -997   -997    203
   -32     62   -997     71
  -997   -997    170    -29
  -997   -138     29    129
    27   -138     29     29
   127    -38    -30   -997
  -997   -997   -130    188
  -997     20    102    -29
   -32   -997    170   -997
  -997    179   -997   -129
  -131   -997    129     29
   -32   -997   -997    171
  -997    179   -997   -129
    27     62    -30   -129
  -997   -997    202   -997
    68    120   -997   -997
  -131     20     29     29
  -997   -138    151    -29
  -997   -997     29    152
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 1.4e+002
 0.300000  0.600000  0.000000  0.100000
 0.200000  0.000000  0.500000  0.300000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.400000  0.000000  0.400000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.100000  0.300000  0.600000
 0.300000  0.100000  0.300000  0.300000
 0.600000  0.200000  0.200000  0.000000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.300000  0.500000  0.200000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.900000  0.000000  0.100000
 0.100000  0.000000  0.600000  0.300000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.900000  0.000000  0.100000
 0.300000  0.400000  0.200000  0.100000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.600000  0.000000  0.000000
 0.100000  0.300000  0.300000  0.300000
 0.000000  0.100000  0.700000  0.200000
 0.000000  0.000000  0.300000  0.700000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CA][GTA]T[CTA][GT][TG][AGT][ACG]T[GCT][GA]C[GT][TA]C[CAG]G[CA][CGT][GT][TG]
--------------------------------------------------------------------------------

Time 2.38 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 10 llr = 108 E-value = 1.1e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  2a31::::236:
pos.-specific     C  5:::a78a:44:
probability       G  1::::3::81:a
matrix            T  2:79::2::2::

         bits    2.0  *  *  *   *
                 1.8  *  *  *   *
                 1.6  * **  *   *
                 1.4  * **  *   *
Relative         1.2  **** ***  *
Entropy          1.0  ******** **
(15.6 bits)      0.8  ******** **
                 0.6  ******** **
                 0.4  ******** **
                 0.2 ************
                 0.0 ------------

Multilevel           CATTCCCCGCAG
consensus            A A  GT AAC
sequence             T        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
55112                       103  7.12e-08 GCTGTCCGCC CATTCCCCGCAG TCCTCCGGAT
38448                       463  4.19e-07 ATAACCCTAC CATTCCCCGACG ACGATGATCT
12265                        68  6.26e-07 TGCTCTCTTA CAATCCCCGCAG TATCATCAGT
39274                         5  4.97e-06       ACAC AATTCCCCGGAG ACGCACCCGA
33457                       142  8.93e-06 TCTTGTTTCG CAATCCTCGTAG TCAGAGTCGA
8112                        107  8.93e-06 GAATACTCGG GATTCGCCGCCG GCTTTGGAGT
38525                       394  1.12e-05 ATGTATGATC CAATCGCCACAG CGGTTTCACA
42361                       365  1.27e-05 CACCTTCTCG TATTCGCCAAAG TCCACGGACA
42116                        55  1.34e-05 CAATCTGCCG AATTCCTCGTCG CCAATGTAAG
8926                        241  1.93e-05 CGTTTTCGTC TATACCCCGACG TGTAGATCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
55112                             7.1e-08  102_[+2]_386
38448                             4.2e-07  462_[+2]_26
12265                             6.3e-07  67_[+2]_421
39274                               5e-06  4_[+2]_484
33457                             8.9e-06  141_[+2]_347
8112                              8.9e-06  106_[+2]_382
38525                             1.1e-05  393_[+2]_95
42361                             1.3e-05  364_[+2]_124
42116                             1.3e-05  54_[+2]_434
8926                              1.9e-05  240_[+2]_248
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=10
55112                    (  103) CATTCCCCGCAG  1
38448                    (  463) CATTCCCCGACG  1
12265                    (   68) CAATCCCCGCAG  1
39274                    (    5) AATTCCCCGGAG  1
33457                    (  142) CAATCCTCGTAG  1
8112                     (  107) GATTCGCCGCCG  1
38525                    (  394) CAATCGCCACAG  1
42361                    (  365) TATTCGCCAAAG  1
42116                    (   55) AATTCCTCGTCG  1
8926                     (  241) TATACCCCGACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 9.86175 E= 1.1e+003
   -32     94   -130    -29
   200   -997   -997   -997
    27   -997   -997    152
  -131   -997   -997    188
  -997    194   -997   -997
  -997    142     29   -997
  -997    162   -997    -29
  -997    194   -997   -997
   -32   -997    170   -997
    27     62   -130    -29
   127     62   -997   -997
  -997   -997    202   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 1.1e+003
 0.200000  0.500000  0.100000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.300000  0.000000  0.000000  0.700000
 0.100000  0.000000  0.000000  0.900000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.700000  0.300000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.300000  0.400000  0.100000  0.200000
 0.600000  0.400000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CAT]A[TA]TC[CG][CT]C[GA][CAT][AC]G
--------------------------------------------------------------------------------

Time 4.63 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 10 llr = 121 E-value = 2.2e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::1::1:25214:::
pos.-specific     C  ::3:2::42246:1:a
probability       G  2:77:4:52::::2::
matrix            T  8a:28691434367a:

         bits    2.0  *            **
                 1.8  *            **
                 1.6  *    *       **
                 1.4  *    *       **
Relative         1.2 *** * *       **
Entropy          1.0 *** ***     * **
(17.5 bits)      0.8 *******     ****
                 0.6 ******** * *****
                 0.4 ******** *******
                 0.2 ******** *******
                 0.0 ----------------

Multilevel           TTGGTTTGTACCTTTC
consensus            G CTCG CATTTAG
sequence                     CCA
                             G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
42361                       210  6.41e-09 CCCTTAGTTG TTGGTTTCTTCCTTTC TCCTTGGTCG
39274                       315  8.75e-08 CTACGATGCT TTGGTGTCCTCCTTTC ATTCCGCTTC
38448                       484  2.53e-07 GACGATGATC TTGGTGTGGATCAGTC G
43233                       443  6.98e-07 TCGAGTGTCC TTGGTGTCACTTATTC ACACATACAC
38525                       240  1.13e-06 CACCCGTGAC GTGGTTTGAAATATTC AAACAAGAAA
8112                        263  1.13e-06 GGTACACTCC TTCTTTTGTATATTTC ATGATCGCTG
33457                       332  4.51e-06 ACAAATCGTC TTCGTTTTCACCTCTC TAGCTAGAGC
8950                        467  4.51e-06 TGTCCATTAC TTGTCGTGTCATTTTC TTTGCAGTGG
45510                        88  4.81e-06 CTGGCACGTG TTCATTACTATCTTTC TAGTCGCGCC
8926                        372  5.43e-06 TTTTTTGGTG GTGGCTTGGTCCAGTC TCCCATCTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42361                             6.4e-09  209_[+3]_275
39274                             8.8e-08  314_[+3]_170
38448                             2.5e-07  483_[+3]_1
43233                               7e-07  442_[+3]_42
38525                             1.1e-06  239_[+3]_245
8112                              1.1e-06  262_[+3]_222
33457                             4.5e-06  331_[+3]_153
8950                              4.5e-06  466_[+3]_18
45510                             4.8e-06  87_[+3]_397
8926                              5.4e-06  371_[+3]_113
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=10
42361                    (  210) TTGGTTTCTTCCTTTC  1
39274                    (  315) TTGGTGTCCTCCTTTC  1
38448                    (  484) TTGGTGTGGATCAGTC  1
43233                    (  443) TTGGTGTCACTTATTC  1
38525                    (  240) GTGGTTTGAAATATTC  1
8112                     (  263) TTCTTTTGTATATTTC  1
33457                    (  332) TTCGTTTTCACCTCTC  1
8950                     (  467) TTGTCGTGTCATTTTC  1
45510                    (   88) TTCATTACTATCTTTC  1
8926                     (  372) GTGGCTTGGTCCAGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 11.0732 E= 2.2e+003
  -997   -997    -30    171
  -997   -997   -997    203
  -997     20    151   -997
  -131   -997    151    -29
  -997    -38   -997    171
  -997   -997     70    129
  -131   -997   -997    188
  -997     62    102   -129
   -32    -38    -30     71
   100    -38   -997     29
   -32     62   -997     71
  -131    120   -997     29
    68   -997   -997    129
  -997   -138    -30    152
  -997   -997   -997    203
  -997    194   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 2.2e+003
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.300000  0.700000  0.000000
 0.100000  0.000000  0.700000  0.200000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  0.400000  0.600000
 0.100000  0.000000  0.000000  0.900000
 0.000000  0.400000  0.500000  0.100000
 0.200000  0.200000  0.200000  0.400000
 0.500000  0.200000  0.000000  0.300000
 0.200000  0.400000  0.000000  0.400000
 0.100000  0.600000  0.000000  0.300000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.100000  0.200000  0.700000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TG]T[GC][GT][TC][TG]T[GC][TACG][ATC][CTA][CT][TA][TG]TC
--------------------------------------------------------------------------------

Time 6.77 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8950                             2.58e-02  466_[+3(4.51e-06)]_18
8926                             1.18e-03  240_[+2(1.93e-05)]_119_\
    [+3(5.43e-06)]_113
8112                             3.79e-05  106_[+2(8.93e-06)]_144_\
    [+3(1.13e-06)]_222
43233                            4.67e-06  240_[+1(4.93e-07)]_181_\
    [+3(6.98e-07)]_42
42116                            8.68e-08  54_[+2(1.34e-05)]_361_\
    [+1(2.03e-10)]_52
37519                            1.49e-05  393_[+1(2.45e-09)]_86
38448                            1.42e-06  462_[+2(4.19e-07)]_9_[+3(2.53e-07)]_\
    1
38525                            3.17e-07  194_[+1(9.63e-07)]_24_\
    [+3(1.13e-06)]_138_[+2(1.12e-05)]_95
39274                            7.43e-06  4_[+2(4.97e-06)]_298_[+3(8.75e-08)]_\
    170
55112                            3.99e-04  102_[+2(7.12e-08)]_386
50238                            1.33e-02  189_[+1(1.82e-06)]_290
42361                            5.10e-11  209_[+3(6.41e-09)]_97_\
    [+1(1.16e-08)]_21_[+2(1.27e-05)]_124
45443                            1.30e-04  100_[+1(2.38e-08)]_379
45510                            3.14e-04  87_[+3(4.81e-06)]_170_\
    [+1(4.21e-06)]_206
12265                            5.97e-06  67_[+2(6.26e-07)]_285_\
    [+1(4.26e-07)]_115
33457                            5.22e-08  141_[+2(8.93e-06)]_178_\
    [+3(4.51e-06)]_46_[+1(4.17e-08)]_86
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************