********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/402/402.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37441                    1.0000    500  21543                    1.0000    500
38016                    1.0000    500  39662                    1.0000    500
23988                    1.0000    500  5271                     1.0000    500
35318                    1.0000    500  46134                    1.0000    500
12788                    1.0000    500  40354                    1.0000    500
48099                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/402/402.seqs.fa -oc motifs/402 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.264 C 0.242 G 0.223 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.264 C 0.242 G 0.223 T 0.271
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 19 sites = 5 llr = 88 E-value = 2.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :44::8::226:::8a::6
pos.-specific     C  :224:2a24::88:2:::4
probability       G  a442a::82:422a::a8:
matrix            T  :::4::::28:::::::2:

         bits    2.2 *   *        *  *
                 1.9 *   * *      * **
                 1.7 *   * *      * **
                 1.5 *   * **     * **
Relative         1.3 *   ****   *******
Entropy          1.1 *   **** **********
(25.4 bits)      0.9 *   **** **********
                 0.6 *   **** **********
                 0.4 ******** **********
                 0.2 ******** **********
                 0.0 -------------------

Multilevel           GAACGACGCTACCGAAGGA
consensus             GGT C CAAGGG C  TC
sequence              CCG    G
                             T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
35318                        89  5.26e-10 TGTGGGGTCT GAGGGACGGTGCCGAAGGA TTGGTTTGGA
23988                       188  7.00e-10 TTATTTTTCA GGACGACGCTACGGAAGGC AGCGAAGAAG
40354                       281  6.25e-09 CAAATGCTCT GCATGCCGTTACCGAAGGC GCGACTCTGG
46134                       301  1.24e-08 ACCTGACCGT GAGTGACGATGGCGAAGTA GTTTACAGTT
21543                        11  2.35e-08 AATGCCGATA GGCCGACCCAACCGCAGGA TTTCTTCCCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35318                             5.3e-10  88_[+1]_393
23988                               7e-10  187_[+1]_294
40354                             6.2e-09  280_[+1]_201
46134                             1.2e-08  300_[+1]_181
21543                             2.4e-08  10_[+1]_471
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=5
35318                    (   89) GAGGGACGGTGCCGAAGGA  1
23988                    (  188) GGACGACGCTACGGAAGGC  1
40354                    (  281) GCATGCCGTTACCGAAGGC  1
46134                    (  301) GAGTGACGATGGCGAAGTA  1
21543                    (   11) GGCCGACCCAACCGCAGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 5302 bayes= 10.9931 E= 2.3e+001
  -897   -897    216   -897
    60    -28     84   -897
    60    -28     84   -897
  -897     72    -16     56
  -897   -897    216   -897
   160    -28   -897   -897
  -897    204   -897   -897
  -897    -28    184   -897
   -40     72    -16    -44
   -40   -897   -897    156
   118   -897     84   -897
  -897    172    -16   -897
  -897    172    -16   -897
  -897   -897    216   -897
   160    -28   -897   -897
   192   -897   -897   -897
  -897   -897    216   -897
  -897   -897    184    -44
   118     72   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 5 E= 2.3e+001
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.200000  0.400000  0.000000
 0.400000  0.200000  0.400000  0.000000
 0.000000  0.400000  0.200000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.400000  0.200000  0.200000
 0.200000  0.000000  0.000000  0.800000
 0.600000  0.000000  0.400000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.600000  0.400000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[AGC][AGC][CTG]G[AC]C[GC][CAGT][TA][AG][CG][CG]G[AC]AG[GT][AC]
--------------------------------------------------------------------------------

Time 1.19 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 6 llr = 88 E-value = 2.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :aa83::3:8a73::2
pos.-specific     C  8::2:8:222:::5a:
probability       G  2:::52a28::2:3:8
matrix            T  ::::2::3:::272::

         bits    2.2       *
                 1.9  **   *   *   *
                 1.7  **   *   *   *
                 1.5 ***  ** * *   **
Relative         1.3 **** ** ***   **
Entropy          1.1 **** ** *** * **
(21.3 bits)      0.9 **** ** *** * **
                 0.6 ******* ********
                 0.4 ******* ********
                 0.2 ******* ********
                 0.0 ----------------

Multilevel           CAAAGCGAGAAATCCG
consensus                A  T    AG
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
46134                       251  4.16e-10 AAGCAAAGCG CAAAGCGAGAAATCCG GTTTTCTCCG
37441                       302  4.31e-08 AGAACGACCT CAAATCGTGAATTCCG CGAAGACTGA
12788                       213  7.01e-08 ACTTCAACAG CAAAACGTGCAAAGCG GATATCCAGT
35318                       248  1.96e-07 CTGTATGTAA CAAAACGACAAAATCG CTCAGGACAT
5271                        308  2.39e-07 TCGAACGTCC CAAAGGGCGAAATGCA CAAATAGTTG
38016                       390  3.84e-07 TCACGCAACG GAACGCGGGAAGTCCG AAATAGTGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46134                             4.2e-10  250_[+2]_234
37441                             4.3e-08  301_[+2]_183
12788                               7e-08  212_[+2]_272
35318                               2e-07  247_[+2]_237
5271                              2.4e-07  307_[+2]_177
38016                             3.8e-07  389_[+2]_95
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=6
46134                    (  251) CAAAGCGAGAAATCCG  1
37441                    (  302) CAAATCGTGAATTCCG  1
12788                    (  213) CAAAACGTGCAAAGCG  1
35318                    (  248) CAAAACGACAAAATCG  1
5271                     (  308) CAAAGGGCGAAATGCA  1
38016                    (  390) GAACGCGGGAAGTCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5335 bayes= 10.2426 E= 2.5e+001
  -923    178    -42   -923
   192   -923   -923   -923
   192   -923   -923   -923
   165    -54   -923   -923
    33   -923    117    -70
  -923    178    -42   -923
  -923   -923    216   -923
    33    -54    -42     30
  -923    -54    190   -923
   165    -54   -923   -923
   192   -923   -923   -923
   133   -923    -42    -70
    33   -923   -923    130
  -923    104     58    -70
  -923    204   -923   -923
   -66   -923    190   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 2.5e+001
 0.000000  0.833333  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.333333  0.000000  0.500000  0.166667
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.166667  0.166667  0.333333
 0.000000  0.166667  0.833333  0.000000
 0.833333  0.166667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.000000  0.166667  0.166667
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.500000  0.333333  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CAAA[GA]CG[AT]GAAA[TA][CG]CG
--------------------------------------------------------------------------------

Time 2.30 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 8 llr = 120 E-value = 3.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1:::4:::41:::31:1::1
pos.-specific     C  ::344:39::3:9:1:1:83
probability       G  ::::33:15::1:5313:::
matrix            T  9a86:88:198913595a36

         bits    2.2
                 1.9  *               *
                 1.7  *               *
                 1.5  *     *    *    *
Relative         1.3 **     * * **  * **
Entropy          1.1 **** *** ****  * **
(21.6 bits)      0.9 **** *** ****  * **
                 0.6 **** ********  * ***
                 0.4 ************** * ***
                 0.2 ********************
                 0.0 --------------------

Multilevel           TTTTATTCGTTTCGTTTTCT
consensus              CCCGC A C  AG G TC
sequence                 G        T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
35318                       356  1.67e-10 GTGGTGCCAT TTTTCTTCGTTTCGTTATCT CACCCTTTCT
21543                       405  1.06e-08 TTGTTCCGTG TTCCATTCGTCTCGCTTTCT CGTTCCGTTT
37441                       238  2.40e-08 GAGATATTTC ATTTATTCATTTCAGTGTCT CATCACAACG
48099                       475  2.65e-08 GCCGAGGTTA TTTCGTCCATTTCTATTTCT ACCGAA
5271                        409  7.04e-08 CATGCGGGAT TTTTGGTCAATTCATTTTCC GGCATGGTAG
46134                       268  8.96e-08 AGAAATCCGG TTTTCTCCGTTGCGTTTTTA TCAACCTGAC
40354                       389  3.13e-07 GCCACTCTCG TTTTCTTGTTCTCGTTCTTT TTTTTGGTCT
38016                       312  1.03e-06 GAGGCCCTGT TTCCAGTCGTTTTTGGGTCC CTGTCGTGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35318                             1.7e-10  355_[+3]_125
21543                             1.1e-08  404_[+3]_76
37441                             2.4e-08  237_[+3]_243
48099                             2.7e-08  474_[+3]_6
5271                                7e-08  408_[+3]_72
46134                               9e-08  267_[+3]_213
40354                             3.1e-07  388_[+3]_92
38016                               1e-06  311_[+3]_169
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=8
35318                    (  356) TTTTCTTCGTTTCGTTATCT  1
21543                    (  405) TTCCATTCGTCTCGCTTTCT  1
37441                    (  238) ATTTATTCATTTCAGTGTCT  1
48099                    (  475) TTTCGTCCATTTCTATTTCT  1
5271                     (  409) TTTTGGTCAATTCATTTTCC  1
46134                    (  268) TTTTCTCCGTTGCGTTTTTA  1
40354                    (  389) TTTTCTTGTTCTCGTTCTTT  1
38016                    (  312) TTCCAGTCGTTTTTGGGTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5291 bayes= 9.36714 E= 3.6e+001
  -108   -965   -965    169
  -965   -965   -965    188
  -965      5   -965    147
  -965     63   -965    121
    50     63     17   -965
  -965   -965     17    147
  -965      5   -965    147
  -965    185    -83   -965
    50   -965    117   -111
  -108   -965   -965    169
  -965      5   -965    147
  -965   -965    -83    169
  -965    185   -965   -111
    -8   -965    117    -11
  -108    -95     17     88
  -965   -965    -83    169
  -108    -95     17     88
  -965   -965   -965    188
  -965    163   -965    -11
  -108      5   -965    121
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 3.6e+001
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.375000  0.000000  0.625000
 0.375000  0.375000  0.250000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.875000  0.125000  0.000000
 0.375000  0.000000  0.500000  0.125000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.875000  0.000000  0.125000
 0.250000  0.000000  0.500000  0.250000
 0.125000  0.125000  0.250000  0.500000
 0.000000  0.000000  0.125000  0.875000
 0.125000  0.125000  0.250000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.000000  0.250000
 0.125000  0.250000  0.000000  0.625000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TT[TC][TC][ACG][TG][TC]C[GA]T[TC]TC[GAT][TG]T[TG]T[CT][TC]
--------------------------------------------------------------------------------

Time 3.39 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37441                            2.67e-08  237_[+3(2.40e-08)]_44_\
    [+2(4.31e-08)]_183
21543                            7.04e-09  10_[+1(2.35e-08)]_375_\
    [+3(1.06e-08)]_76
38016                            4.13e-06  311_[+3(1.03e-06)]_58_\
    [+2(3.84e-07)]_95
39662                            8.66e-01  500
23988                            3.35e-05  187_[+1(7.00e-10)]_294
5271                             5.90e-07  307_[+2(2.39e-07)]_85_\
    [+3(7.04e-08)]_72
35318                            1.69e-15  88_[+1(5.26e-10)]_140_\
    [+2(1.96e-07)]_92_[+3(1.67e-10)]_125
46134                            3.86e-14  250_[+2(4.16e-10)]_1_[+3(8.96e-08)]_\
    13_[+1(1.24e-08)]_181
12788                            8.53e-04  212_[+2(7.01e-08)]_272
40354                            9.23e-08  280_[+1(6.25e-09)]_89_\
    [+3(3.13e-07)]_92
48099                            2.54e-04  474_[+3(2.65e-08)]_6
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************