********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/219/219.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47968                    1.0000    500  48692                    1.0000    500
49659                    1.0000    500  1971                     1.0000    500
50397                    1.0000    500  43954                    1.0000    500
10354                    1.0000    500  47283                    1.0000    500
50079                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/219/219.seqs.fa -oc motifs/219 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.243 C 0.262 G 0.240 T 0.255
Background letter frequencies (from dataset with add-one prior applied):
A 0.243 C 0.262 G 0.240 T 0.255
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 14 sites = 5 llr = 71 E-value = 2.2e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :2::::48:::44:
pos.-specific     C  :4:::4:2a:226:
probability       G  a4a:a66::a82:a
matrix            T  :::a:::::::2::

         bits    2.1 * ***    *   *
                 1.9 * ***   **   *
                 1.6 * ***   **   *
                 1.4 * ***   **   *
Relative         1.2 * ***  ****  *
Entropy          1.0 * ********* **
(20.5 bits)      0.8 * ********* **
                 0.6 * ********* **
                 0.4 *********** **
                 0.2 *********** **
                 0.0 --------------

Multilevel           GCGTGGGACGGACG
consensus             G   CAC  CCA
sequence              A         G
                                T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
50079                       192  4.54e-08 CGTAGAGACC GGGTGGGACGGTCG ACATGCAGAC
50397                       220  5.77e-08 TCAGAGAAAT GCGTGCGACGGAAG GACCCCACGG
47968                       169  1.90e-07 AGCTGTGCTA GCGTGGGACGCACG GCACACTCGT
10354                        50  4.50e-07 AAGGCCCTGC GAGTGCAACGGGAG CTACAGAACG
49659                       183  5.07e-07 GCCCACCAGT GGGTGGACCGGCCG GCCATCAGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50079                             4.5e-08  191_[+1]_295
50397                             5.8e-08  219_[+1]_267
47968                             1.9e-07  168_[+1]_318
10354                             4.5e-07  49_[+1]_437
49659                             5.1e-07  182_[+1]_304
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=14 seqs=5
50079                    (  192) GGGTGGGACGGTCG  1
50397                    (  220) GCGTGCGACGGAAG  1
47968                    (  169) GCGTGGGACGCACG  1
10354                    (   50) GAGTGCAACGGGAG  1
49659                    (  183) GGGTGGACCGGCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 4383 bayes= 10.0259 E= 2.2e+002
  -897   -897    206   -897
   -28     61     73   -897
  -897   -897    206   -897
  -897   -897   -897    197
  -897   -897    206   -897
  -897     61    132   -897
    72   -897    132   -897
   172    -39   -897   -897
  -897    193   -897   -897
  -897   -897    206   -897
  -897    -39    173   -897
    72    -39    -26    -35
    72    120   -897   -897
  -897   -897    206   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 5 E= 2.2e+002
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.400000  0.400000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
G[CGA]GTG[GC][GA][AC]CG[GC][ACGT][CA]G
--------------------------------------------------------------------------------

Time 0.85 secs.
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 20 sites = 4 llr = 77 E-value = 2.5e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a:a5:3::3::::3a588::
pos.-specific     C  :::583::3::3:8:5::::
probability       G  :5:::5aa5:35::::338:
matrix            T  :5::3::::a83a:::::3a

         bits    2.1 * *   ** *  * *    *
                 1.9 * *   ** *  * *    *
                 1.6 * *   ** *  * *    *
                 1.4 * *   ** *  * *    *
Relative         1.2 * * * ** ** *** ****
Entropy          1.0 ***** ** ** ********
(27.8 bits)      0.8 ***** ** ** ********
                 0.6 *********** ********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           AGAACGGGGTTGTCAAAAGT
consensus             T CTA  A GC A CGGT
sequence                  C  C  T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
49659                       116  3.09e-11 TCAAGTCATG AGAACGGGGTTCTCACAAGT CGGTGTCTTT
10354                       378  3.34e-10 ACAGGAGCTG ATACTGGGGTTTTCACAAGT CATCTTCAAA
1971                        289  2.85e-09 ACAACGAAAT ATAACAGGCTTGTCAAGATT GCGTGTGATC
50397                       297  5.22e-09 TTTCCGAACA AGACCCGGATGGTAAAAGGT GTCGCTCGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49659                             3.1e-11  115_[+2]_365
10354                             3.3e-10  377_[+2]_103
1971                              2.9e-09  288_[+2]_192
50397                             5.2e-09  296_[+2]_184
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=4
49659                    (  116) AGAACGGGGTTCTCACAAGT  1
10354                    (  378) ATACTGGGGTTTTCACAAGT  1
1971                     (  289) ATAACAGGCTTGTCAAGATT  1
50397                    (  297) AGACCCGGATGGTAAAAGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4329 bayes= 10.816 E= 2.5e+003
   204   -865   -865   -865
  -865   -865    106     97
   204   -865   -865   -865
   104     93   -865   -865
  -865    152   -865     -3
     4     -6    106   -865
  -865   -865    205   -865
  -865   -865    205   -865
     4     -6    106   -865
  -865   -865   -865    197
  -865   -865      6    155
  -865     -6    106     -3
  -865   -865   -865    197
     4    152   -865   -865
   204   -865   -865   -865
   104     93   -865   -865
   162   -865      6   -865
   162   -865      6   -865
  -865   -865    164     -3
  -865   -865   -865    197
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 2.5e+003
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.250000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.250000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.750000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
A[GT]A[AC][CT][GAC]GG[GAC]T[TG][GCT]T[CA]A[AC][AG][AG][GT]T
--------------------------------------------------------------------------------

Time 1.72 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 5 llr = 76 E-value = 4.0e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :4a::8:::::8::a
pos.-specific     C  :2:8::a2a4::6::
probability       G  a4:2a::4:6:246:
matrix            T  :::::2:4::a::4:

         bits    2.1 * * *     *   *
                 1.9 * * * * * *   *
                 1.6 * * * * * *   *
                 1.4 * * * * * *   *
Relative         1.2 * ***** * **  *
Entropy          1.0 * ***** *******
(22.0 bits)      0.8 * ***** *******
                 0.6 * ***** *******
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GAACGACGCGTACGA
consensus             G G T T C GGT
sequence              C     C
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
50397                       103  2.75e-08 CCCTAGCGTG GGACGACTCCTACTA GTACTATTAC
47283                       377  2.99e-08 CGTTGACACC GAACGACCCGTAGGA ACCTGGTGTG
43954                       120  4.34e-08 CGAGAAATGT GGAGGACGCGTACGA TTCCTCGACG
10354                        76  1.28e-07 ACAGAACGAA GCACGACTCGTGCGA ACGGGGATGC
48692                       470  1.85e-07 TCACGAACGG GAACGTCGCCTAGTA TTGTACACGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50397                             2.8e-08  102_[+3]_383
47283                               3e-08  376_[+3]_109
43954                             4.3e-08  119_[+3]_366
10354                             1.3e-07  75_[+3]_410
48692                             1.9e-07  469_[+3]_16
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=5
50397                    (  103) GGACGACTCCTACTA  1
47283                    (  377) GAACGACCCGTAGGA  1
43954                    (  120) GGAGGACGCGTACGA  1
10354                    (   76) GCACGACTCGTGCGA  1
48692                    (  470) GAACGTCGCCTAGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4374 bayes= 10.023 E= 4.0e+002
  -897   -897    206   -897
    72    -39     73   -897
   204   -897   -897   -897
  -897    161    -26   -897
  -897   -897    206   -897
   172   -897   -897    -35
  -897    193   -897   -897
  -897    -39     73     65
  -897    193   -897   -897
  -897     61    132   -897
  -897   -897   -897    197
   172   -897    -26   -897
  -897    120     73   -897
  -897   -897    132     65
   204   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 4.0e+002
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.200000  0.400000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.400000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.000000  0.600000  0.400000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
G[AGC]A[CG]G[AT]C[GTC]C[GC]T[AG][CG][GT]A
--------------------------------------------------------------------------------

Time 2.46 secs.
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47968                            4.63e-03  168_[+1(1.90e-07)]_318
48692                            1.54e-03  469_[+3(1.85e-07)]_16
49659                            1.29e-09  115_[+2(3.09e-11)]_47_\
    [+1(5.07e-07)]_304
1971                             1.14e-04  288_[+2(2.85e-09)]_192
50397                            5.98e-13  102_[+3(2.75e-08)]_102_\
    [+1(5.77e-08)]_63_[+2(5.22e-09)]_184
43954                            1.41e-03  119_[+3(4.34e-08)]_156_\
    [+3(6.82e-05)]_195
10354                            1.32e-12  49_[+1(4.50e-07)]_12_[+3(1.28e-07)]_\
    287_[+2(3.34e-10)]_103
47283                            5.28e-04  376_[+3(2.99e-08)]_109
50079                            1.00e-03  191_[+1(4.54e-08)]_295
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************