********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/452/452.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37902                    1.0000    500  40473                    1.0000    290
33459                    1.0000    500  34728                    1.0000    500
27135                    1.0000    500  37222                    1.0000    500
32340                    1.0000    500  39576                    1.0000    500
36473                    1.0000    500  34943                    1.0000    500
37299                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/452/452.seqs.fa -oc motifs/452 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5290    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.291 C 0.242 G 0.205 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.291 C 0.242 G 0.205 T 0.262
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   6  llr = 111  E-value = 1.1e-001
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 8:3:::a:::::8:3:::::: pos.-specific C :82:5::2a223:23:28::: probability G ::22:8:2:::72::5:27:: matrix T 223852:7:88::8358:3aa bits 2.3 2.1 * 1.8 * * ** 1.6 ** * ** Relative 1.4 * * ** **** * ** ** Entropy 1.1 ** * ** ****** ****** (26.8 bits) 0.9 ** **** ****** ****** 0.7 ** *********** ****** 0.5 ** *********** ****** 0.2 ** ****************** 0.0 --------------------- Multilevel ACATCGATCTTGATAGTCGTT consensus T T C CT T sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 39576 243 7.43e-12 GATCAAACAA ACATTGATCTTGATATTCGTT GAGAATGAGT 37902 390 1.10e-11 CGTTTTCCTG ACTTCGATCTTCATCGTCGTT TTGTGTTCTT 34728 390 1.37e-09 GTTTTCCTCA ACTGTGATCTCCATCGTCGTT TTGTGTTCTT 36473 476 3.85e-09 ATCAAACGGA ACGTTGATCTTGACATTGTTT GAGT 27135 144 1.22e-08 TTTGGTCTGG ACATCTACCCTGGTTTTCGTT CGCTCTCAAT 32340 169 2.44e-08 CCCGCCTGTT TTCTCGAGCTTGATTGCCTTT CGATACAAGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 39576 7.4e-12 242_[+1]_237 37902 1.1e-11 389_[+1]_90 34728 1.4e-09 389_[+1]_90 36473 3.8e-09 475_[+1]_4 27135 1.2e-08 143_[+1]_336 32340 2.4e-08 168_[+1]_311 -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
39576                    (  243) ACATTGATCTTGATATTCGTT  1
37902                    (  390) ACTTCGATCTTCATCGTCGTT  1
34728                    (  390) ACTGTGATCTCCATCGTCGTT  1
36473                    (  476) ACGTTGATCTTGACATTGTTT  1
27135                    (  144) ACATCTACCCTGGTTTTCGTT  1
32340                    (  169) TTCTCGAGCTTGATTGCCTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5070 bayes= 10.169 E= 1.1e-001
   152   -923   -923    -65
  -923    178   -923    -65
    20    -54    -30     35
  -923   -923    -30    167
  -923    105   -923     93
  -923   -923    202    -65
   178   -923   -923   -923
  -923    -54    -30    135
  -923    204   -923   -923
  -923    -54   -923    167
  -923    -54   -923    167
  -923     46    170   -923
   152   -923    -30   -923
  -923    -54   -923    167
    20     46   -923     35
  -923   -923    128     93
  -923    -54   -923    167
  -923    178    -30   -923
  -923   -923    170     35
  -923   -923   -923    193
  -923   -923   -923    193
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.1e-001
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.833333  0.000000  0.166667
 0.333333  0.166667  0.166667  0.333333
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  0.833333  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.166667  0.166667  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.333333  0.666667  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.166667
0.000000 0.833333 0.333333 0.333333 0.000000 0.333333 0.000000 0.000000 0.500000 0.500000 0.000000 0.166667 0.000000 0.833333 0.000000 0.833333 0.166667 0.000000 0.000000 0.000000 0.666667 0.333333 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- AC[AT]T[CT]GATCTT[GC]AT[ACT][GT]TC[GT]TT -------------------------------------------------------------------------------- Time 1.01 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 17 sites = 4 llr = 78 E-value = 3.8e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::::::5a:::3a:::: pos.-specific C :5::8::::::5::8:: probability G :55:3a::a::3:::a: matrix T a:5a::5::aa::a3:a bits 2.3 * * * 2.1 * * * 1.8 * * * **** ** ** 1.6 * * * **** ** ** Relative 1.4 * *** **** ** ** Entropy 1.1 ****** **** ***** (28.2 bits) 0.9 *********** ***** 0.7 *********** ***** 0.5 ***************** 0.2 ***************** 0.0 ----------------- Multilevel TCGTCGAAGTTCATCGT consensus GT G T A T sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ----------------- 34728 365 3.24e-10 AGACTCATAA 
TCTTCGTAGTTCATCGT TTTCCTCAAC
37902                       366  3.24e-10 AGACTCATAA TCTTCGTAGTTCATCGT TTTCCTGACT
36473                       327  4.89e-10 AGGAATGAGT TGGTCGAAGTTGATCGT GCGAAATCAC
39576                        77  4.24e-09 AAGAATAAAT TGGTGGAAGTTAATTGT GCAAAATCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34728                             3.2e-10  364_[+2]_119
37902                             3.2e-10  365_[+2]_118
36473                             4.9e-10  326_[+2]_157
39576                             4.2e-09  76_[+2]_407
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=17 seqs=4
34728                    (  365) TCTTCGTAGTTCATCGT  1
37902                    (  366) TCTTCGTAGTTCATCGT  1
36473                    (  327) TGGTCGAAGTTGATCGT  1
39576                    (   77) TGGTGGAAGTTAATTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 5114 bayes= 10.3191 E= 3.8e+000
  -865   -865   -865    193
  -865    104    128   -865
  -865   -865    128     93
  -865   -865   -865    193
  -865    163     28   -865
  -865   -865    228   -865
    78   -865   -865     93
   178   -865   -865   -865
  -865   -865    228   -865
  -865   -865   -865    193
  -865   -865   -865    193
   -22    104     28   -865
   178   -865   -865   -865
  -865   -865   -865    193
  -865    163   -865     -7
  -865   -865    228   -865
  -865   -865   -865    193
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 4 E= 3.8e+000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.500000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[CG][GT]T[CG]G[AT]AGTT[CAG]AT[CT]GT
--------------------------------------------------------------------------------

Time  2.12 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 15 sites = 5 llr = 82 E-value = 4.3e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :a6:a:a6::::aa: pos.-specific C a:46:2:::::a::4 probability G :::4:4::a:::::: matrix T :::::4:4:aa:::6 bits 2.3 * 2.1 * * * 1.8 ** * * ****** 1.6 ** * * ****** Relative 1.4 ** * * ****** Entropy 1.1 ** ** * ****** (23.7 bits) 0.9 ***** ********* 0.7 *************** 0.5 *************** 0.2 *************** 0.0 --------------- Multilevel CAACAGAAGTTCAAT consensus CG T T C sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 34728 252 7.49e-09 GACAACGTCT CACCAGAAGTTCAAT CGCCTACTGC 37902 253 7.49e-09 GACAACGTCT CACCAGAAGTTCAAT CGCCTACTGC 39576 28 3.97e-08 CGTGAAGAAG CAAGATATGTTCAAT GTGGGAATGA 36473 278 5.79e-08 CGTGAAGAAG CAAGATATGTTCAAC ATGGGAACGA 37299 249 6.20e-08 GCTCCAACGA CAACACAAGTTCAAC AAATACGCCG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 34728 7.5e-09 251_[+3]_234 37902 7.5e-09 252_[+3]_233 39576 4e-08 27_[+3]_458 36473 5.8e-08 277_[+3]_208 37299 
6.2e-08  248_[+3]_237
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=5
34728                    (  252) CACCAGAAGTTCAAT  1
37902                    (  253) CACCAGAAGTTCAAT  1
39576                    (   28) CAAGATATGTTCAAT  1
36473                    (  278) CAAGATATGTTCAAC  1
37299                    (  249) CAACACAAGTTCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 5136 bayes= 9.43682 E= 4.3e+000
  -897    204   -897   -897
   178   -897   -897   -897
   104     72   -897   -897
  -897    131     96   -897
   178   -897   -897   -897
  -897    -27     96     61
   178   -897   -897   -897
   104   -897   -897     61
  -897   -897    228   -897
  -897   -897   -897    193
  -897   -897   -897    193
  -897    204   -897   -897
   178   -897   -897   -897
   178   -897   -897   -897
  -897     72   -897    119
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 4.3e+000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.000000  0.600000  0.400000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.200000  0.400000  0.400000
 1.000000  0.000000  0.000000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.400000  0.000000  0.600000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CA[AC][CG]A[GTC]A[AT]GTTCAA[TC]
--------------------------------------------------------------------------------

Time  3.16 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37902                            3.52e-18  252_[+3(7.49e-09)]_98_\
    [+2(3.24e-10)]_7_[+1(1.10e-11)]_90
40473                            1.00e+00  290
33459                            6.92e-01  500
34728                            3.55e-16  251_[+3(7.49e-09)]_98_\
    [+2(3.24e-10)]_8_[+1(1.37e-09)]_90
27135                            6.84e-05  143_[+1(1.22e-08)]_336
37222                            4.14e-01  500
32340                            1.30e-04  168_[+1(2.44e-08)]_90_\
    [+1(5.90e-05)]_200
39576                            1.39e-16  27_[+3(3.97e-08)]_34_[+2(4.24e-09)]_\
    149_[+1(7.43e-12)]_237
36473                            9.81e-15  81_[+1(5.76e-05)]_175_\
    [+3(5.79e-08)]_34_[+2(4.89e-10)]_132_[+1(3.85e-09)]_4
34943                            6.83e-01  500
37299                            3.05e-04  248_[+3(6.20e-08)]_237
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************