********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/235/235.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47106                    1.0000    500  47705                    1.0000    500
48049                    1.0000    500  49934                    1.0000    500
44571                    1.0000    500  45051                    1.0000    500
45608                    1.0000    500  45681                    1.0000    500
36079                    1.0000    500  50593                    1.0000    500
45680                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/235/235.seqs.fa -oc motifs/235 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.245 G 0.238 T 0.249
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.245 G 0.238 T 0.249
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  13  sites =   4  llr = 64  E-value = 7.5e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::a:::::::::
pos.-specific     C  8::::::::8:::
probability       G  :aa:5aa:a38a:
matrix            T  3:::5::a::3:a

         bits    2.1  **  ****  **
                 1.9  *** ****  **
                 1.7  *** ****  **
                 1.4  *** ****  **
Relative         1.2 **** ********
Entropy          1.0 *************
(23.1 bits)      0.8 *************
                 0.6 *************
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           CGGAGGGTGCGGT
consensus            T   T    GT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            -------------
45051                       400  2.10e-08 AACATTTCTT CGGATGGTGCGGT CTCACAAGTT
49934                        62  2.10e-08 GTGTGTGTGC CGGATGGTGCGGT TTGATTGGGT
45608                       366  5.22e-08 TTTGAAACTC CGGAGGGTGCTGT TCACAAAACG
47106                       477  1.16e-07 CCGCAACGTG TGGAGGGTGGGGT GCCACCCTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45051                             2.1e-08  399_[+1]_88
49934                             2.1e-08  61_[+1]_426
45608                             5.2e-08  365_[+1]_122
47106                             1.2e-07  476_[+1]_11
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=13 seqs=4
45051                    (  400) CGGATGGTGCGGT  1
49934                    (   62) CGGATGGTGCGGT  1
45608                    (  366) CGGAGGGTGCTGT  1
47106                    (  477) TGGAGGGTGGGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 5368 bayes= 11.1265 E= 7.5e+000
  -865    161   -865      0
  -865   -865    207   -865
  -865   -865    207   -865
   190   -865   -865   -865
  -865   -865    107    100
  -865   -865    207   -865
  -865   -865    207   -865
  -865   -865   -865    200
  -865   -865    207   -865
  -865    161      7   -865
  -865   -865    165      0
  -865   -865    207   -865
  -865   -865   -865    200
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 4 E= 7.5e+000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CT]GGA[GT]GGTG[CG][GT]GT
--------------------------------------------------------------------------------




Time 1.15 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   8  llr = 120  E-value = 1.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::3411114313:1::
pos.-specific     C  11aa31a:463191551:::6
probability       G  :9::51:8133:::3164:33
matrix            T  9:::38::1:48:5:3:6981

         bits    2.1   **  *
                 1.9   **  *
                 1.7   **  *
                 1.4 ****  *     *     *
Relative         1.2 ****  **    *     **
Entropy          1.0 **** ***   **    ***
(21.7 bits)      0.8 **** ***   **    ****
                 0.6 ******** * ***  *****
                 0.4 ******** * **** *****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TGCCGTCGACTTCTCCGTTTC
consensus                C  ACGC  AATAG GG
sequence                 T     G   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
50593                       381  1.06e-09 CGCATCGGGC TGCCTTCGTCGTCTCCATTTC GGATATTCTC
45051                       126  2.30e-09 AGCTAGGTGA TGCCGGCGCACTCTCCGTTTC AGCCAATGTT
47106                       380  8.17e-09 ATCTGACCAG TGCCGTCAACGTCAACGTTGG GGGGATGAGG
44571                       447  2.24e-08 AGACTACGGT TGCCGTCGCCCTCTCGCGTTT CCGTCCGTCC
47705                       463  2.14e-07 ACCGTGGAAA CGCCTTCGCCTTCCAAGGTGC GCGAATTGCG
45681                       209  2.29e-07 GGGACAGATA TGCCCTCGAGTCCACTGGATG TATGTGCCAA
36079                       405  2.79e-07 ATAACGACAG TGCCGCCGACTAATGTATTTC CCTCTATTGG
49934                       331  2.79e-07 GCACCATTCA TCCCCTCAGGATCAGCGTTTC GCAAAACCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50593                             1.1e-09  380_[+2]_99
45051                             2.3e-09  125_[+2]_354
47106                             8.2e-09  379_[+2]_100
44571                             2.2e-08  446_[+2]_33
47705                             2.1e-07  462_[+2]_17
45681                             2.3e-07  208_[+2]_271
36079                             2.8e-07  404_[+2]_75
49934                             2.8e-07  330_[+2]_149
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
50593                    (  381) TGCCTTCGTCGTCTCCATTTC  1
45051                    (  126) TGCCGGCGCACTCTCCGTTTC  1
47106                    (  380) TGCCGTCAACGTCAACGTTGG  1
44571                    (  447) TGCCGTCGCCCTCTCGCGTTT  1
47705                    (  463) CGCCTTCGCCTTCCAAGGTGC  1
45681                    (  209) TGCCCTCGAGTCCACTGGATG  1
36079                    (  405) TGCCGCCGACTAATGTATTTC  1
49934                    (  331) TCCCCTCAGGATCAGCGTTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.102 E= 1.4e+002
  -965    -97   -965    181
  -965    -97    188   -965
  -965    203   -965   -965
  -965    203   -965   -965
  -965      3    107      0
  -965    -97    -93    159
  -965    203   -965   -965
   -10   -965    165   -965
    49     61    -93    -99
  -110    135      7   -965
  -110      3      7     59
  -110    -97   -965    159
  -110    183   -965   -965
    49    -97   -965    100
   -10    103      7   -965
  -110    103    -93      0
   -10    -97    139   -965
  -965   -965     66    133
  -110   -965   -965    181
  -965   -965      7    159
  -965    135      7    -99
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 1.4e+002
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.125000  0.875000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.125000  0.125000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.375000  0.375000  0.125000  0.125000
 0.125000  0.625000  0.250000  0.000000
 0.125000  0.250000  0.250000  0.375000
 0.125000  0.125000  0.000000  0.750000
 0.125000  0.875000  0.000000  0.000000
 0.375000  0.125000  0.000000  0.500000
 0.250000  0.500000  0.250000  0.000000
 0.125000  0.500000  0.125000  0.250000
 0.250000  0.125000  0.625000  0.000000
 0.000000  0.000000  0.375000  0.625000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.625000  0.250000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TGCC[GCT]TC[GA][AC][CG][TCG]TC[TA][CAG][CT][GA][TG]T[TG][CG]
--------------------------------------------------------------------------------




Time 2.28 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  15  sites =   5  llr = 74  E-value = 2.0e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::22:::2:::::::
pos.-specific     C  :2:::::6::2:::6
probability       G  a2:8a4a::a262:4
matrix            T  :68::6:2a:648a:

         bits    2.1 *   * * **   *
                 1.9 *   * * **   *
                 1.7 *   * * **   *
                 1.4 *   * * **   *
Relative         1.2 * *** * **  **
Entropy          1.0 * ***** ** ****
(21.3 bits)      0.8 * ***** ** ****
                 0.6 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GTTGGTGCTGTGTTC
consensus             CAA G A  CTG G
sequence              G     T  G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
45051                        65  9.60e-09 GGCGTCCGTC GTTGGTGCTGTGGTC ATTTTACGTG
47106                       319  3.59e-08 TGCTTGTCAT GTTGGGGCTGCTTTC TCATCCAACA
49934                       220  1.14e-07 TCGCGCGGTG GCAGGTGCTGTGTTG GTCACGAACG
45681                        63  1.32e-07 CCTAAACCGA GGTGGGGTTGTGTTG CTATGTTGTC
48049                       118  3.96e-07 AGACTTTACG GTTAGTGATGGTTTC TCGTTTGGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45051                             9.6e-09  64_[+3]_421
47106                             3.6e-08  318_[+3]_167
49934                             1.1e-07  219_[+3]_266
45681                             1.3e-07  62_[+3]_423
48049                               4e-07  117_[+3]_368
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=5
45051                    (   65) GTTGGTGCTGTGGTC  1
47106                    (  319) GTTGGGGCTGCTTTC  1
49934                    (  220) GCAGGTGCTGTGTTG  1
45681                    (   63) GGTGGGGTTGTGTTG  1
48049                    (  118) GTTAGTGATGGTTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 5346 bayes= 10.3127 E= 2.0e+003
  -897   -897    207   -897
  -897    -29    -25    127
   -42   -897   -897    168
   -42   -897    175   -897
  -897   -897    207   -897
  -897   -897     75    127
  -897   -897    207   -897
   -42    129   -897    -32
  -897   -897   -897    200
  -897   -897    207   -897
  -897    -29    -25    127
  -897   -897    133     68
  -897   -897    -25    168
  -897   -897   -897    200
  -897    129     75   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 2.0e+003
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.200000  0.200000  0.600000
 0.200000  0.000000  0.000000  0.800000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.600000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.200000  0.200000  0.600000
 0.000000  0.000000  0.600000  0.400000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.400000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[TCG][TA][GA]G[TG]G[CAT]TG[TCG][GT][TG]T[CG]
--------------------------------------------------------------------------------




Time 3.39 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47106                            2.27e-12  318_[+3(3.59e-08)]_46_\
    [+2(8.17e-09)]_76_[+1(1.16e-07)]_11
47705                            3.30e-03  462_[+2(2.14e-07)]_17
48049                            4.04e-03  117_[+3(3.96e-07)]_368
49934                            3.70e-11  61_[+1(2.10e-08)]_145_\
    [+3(1.14e-07)]_96_[+2(2.79e-07)]_149
44571                            3.01e-04  446_[+2(2.24e-08)]_33
45051                            3.91e-14  64_[+3(9.60e-09)]_46_[+2(2.30e-09)]_\
    253_[+1(2.10e-08)]_88
45608                            1.08e-03  71_[+1(2.51e-05)]_281_\
    [+1(5.22e-08)]_122
45681                            9.81e-07  62_[+3(1.32e-07)]_131_\
    [+2(2.29e-07)]_271
36079                            4.97e-03  404_[+2(2.79e-07)]_75
50593                            2.45e-05  380_[+2(1.06e-09)]_99
45680                            9.99e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
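
The "Relative Entropy (23.1 bits)" figure reported for Motif 1 can be cross-checked from the letter-probability matrix and the background frequencies printed above. The sketch below is not part of MEME; it assumes the reported value is the sum over motif columns of p * log2(p / background), and it uses the background frequencies as rounded to three decimals in this report, so small differences from MEME's internal value are possible. The names BACKGROUND, MOTIF_1 and relative_entropy_bits are hypothetical and exist only for this illustration.

```python
import math

# Background letter frequencies as printed in the COMMAND LINE SUMMARY
# (rounded to three decimals in the report; an approximation).
BACKGROUND = {"A": 0.268, "C": 0.245, "G": 0.238, "T": 0.249}

# Motif 1 letter-probability matrix, one row per motif position,
# columns in alphabet order A, C, G, T (copied from the report).
MOTIF_1 = [
    (0.00, 0.75, 0.00, 0.25),
    (0.00, 0.00, 1.00, 0.00),
    (0.00, 0.00, 1.00, 0.00),
    (1.00, 0.00, 0.00, 0.00),
    (0.00, 0.00, 0.50, 0.50),
    (0.00, 0.00, 1.00, 0.00),
    (0.00, 0.00, 1.00, 0.00),
    (0.00, 0.00, 0.00, 1.00),
    (0.00, 0.00, 1.00, 0.00),
    (0.00, 0.75, 0.25, 0.00),
    (0.00, 0.00, 0.75, 0.25),
    (0.00, 0.00, 1.00, 0.00),
    (0.00, 0.00, 0.00, 1.00),
]


def relative_entropy_bits(matrix, background, alphabet="ACGT"):
    """Sum of p * log2(p / q) over all positions and letters.

    Zero-probability entries contribute nothing (the usual 0 * log 0 = 0
    convention). This is an assumed reading of the report's figure, not
    MEME's own code.
    """
    total = 0.0
    for row in matrix:
        for letter, p in zip(alphabet, row):
            if p > 0.0:
                total += p * math.log2(p / background[letter])
    return total


if __name__ == "__main__":
    bits = relative_entropy_bits(MOTIF_1, BACKGROUND)
    print(f"Motif 1 relative entropy: {bits:.1f} bits")
```

Under these assumptions the result agrees with the reported 23.1 bits to one decimal place. The same function can be applied to the Motif 2 and Motif 3 matrices to compare against their reported 21.7 and 21.3 bits.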