********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/491/491.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10275                    1.0000    500  11106                    1.0000    500
11415                    1.0000    500  11431                    1.0000    500
11913                    1.0000    500  23377                    1.0000    500
23486                    1.0000    500  24257                    1.0000    500
268614                   1.0000    500  6307                     1.0000    500
7576                     1.0000    500  bd759                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/491/491.seqs.fa -oc motifs/491 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.262 C 0.239 G 0.237 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.262 C 0.239 G 0.237 T 0.262
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   9  llr = 106  E-value = 3.5e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::93:a29::a:
pos.-specific     C  16:2a:8:1a:2
probability       G  931:::::1:::
matrix            T  :1:4:::18::8

         bits    2.1     *    *
                 1.9     **   **
                 1.7 *   **   **
                 1.5 * * ** * **
Relative         1.2 * * **** ***
Entropy          1.0 * * ********
(17.0 bits)      0.8 * * ********
                 0.6 *** ********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GCATCACATCAT
consensus             G A  A    C
sequence                C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
10275                       382  3.21e-07 AGCCTGCGAA GGAACACATCAT ATCACTCAAT
11913                       385  4.40e-07 TTTGCCCGAC GCATCACATCAC AACAATTGGT
24257                        76  5.12e-07 GCATTCCTCG GCATCAAATCAT ATGACGGCAC
23377                       338  1.14e-06 AGAAGATAGA GGAACACATCAC CGAGTGCAGT
7576                        114  1.41e-06 TGAACACGGT GCATCACTTCAT TGACGATGTT
bd759                       317  2.32e-06 CCCATCTTTC GCACCACACCAT GGTTAAATTC
11431                       179  2.32e-06 TTCTTCGCAG GCACCACAGCAT CCTTTACCTT
11415                       445  2.69e-06 CAGCATCATA CGATCACATCAT CCATTCATTT
23486                       145  1.53e-05 TGTAAGTCAG GTGACAAATCAT CCAACTACCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10275                             3.2e-07  381_[+1]_107
11913                             4.4e-07  384_[+1]_104
24257                             5.1e-07  75_[+1]_413
23377                             1.1e-06  337_[+1]_151
7576                              1.4e-06  113_[+1]_375
bd759                             2.3e-06  316_[+1]_172
11431                             2.3e-06  178_[+1]_310
11415                             2.7e-06  444_[+1]_44
23486                             1.5e-05  144_[+1]_344
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
10275                    (  382) GGAACACATCAT  1
11913                    (  385) GCATCACATCAC  1
24257                    (   76) GCATCAAATCAT  1
23377                    (  338) GGAACACATCAC  1
7576                     (  114) GCATCACTTCAT  1
bd759                    (  317) GCACCACACCAT  1
11431                    (  179) GCACCACAGCAT  1
11415                    (  445) CGATCACATCAT  1
23486                    (  145) GTGACAAATCAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.48101 E= 3.5e-001
  -982   -110    191   -982
  -982    122     49   -124
   176   -982   -109   -982
    35    -10   -982     76
  -982    207   -982   -982
   193   -982   -982   -982
   -24    170   -982   -982
   176   -982   -982   -124
  -982   -110   -109    157
  -982    207   -982   -982
   193   -982   -982   -982
  -982    -10   -982    157
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 3.5e-001
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.555556  0.333333  0.111111
 0.888889  0.000000  0.111111  0.000000
 0.333333  0.222222  0.000000  0.444444
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.000000  0.111111  0.111111  0.777778
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.222222  0.000000  0.777778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[CG]A[TAC]CA[CA]ATCA[TC]
--------------------------------------------------------------------------------

Time  1.42 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  14  sites =   4  llr = 67  E-value = 1.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a8:a:::a:5:3:
pos.-specific     C  :::a:::::::33:
probability       G  a::::aaa:a58:a
matrix            T  ::3:::::::::5:

         bits    2.1 *  * *** *   *
                 1.9 ** *******   *
                 1.7 ** *******   *
                 1.5 ** *******   *
Relative         1.2 ** ******* * *
Entropy          1.0 ************ *
(24.2 bits)      0.8 ************ *
                 0.6 ************ *
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           GAACAGGGAGAGTG
consensus              T       GCA
sequence                         C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
6307                        278  2.94e-09 CGAAGCACGA GAACAGGGAGGGTG CAAAGAGAAG
7576                         40  6.18e-09 ATTAGGGGTG GAACAGGGAGAGTG TACAGCGGCA
10275                       126  1.18e-08 GGGAAGGAAA GAACAGGGAGGGAG AAGCCAGAGA
23486                       177  6.91e-08 ACTATTCTGA GATCAGGGAGACCG ATCCGTTCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6307                              2.9e-09  277_[+2]_209
7576                              6.2e-09  39_[+2]_447
10275                             1.2e-08  125_[+2]_361
23486                             6.9e-08  176_[+2]_310
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=4
6307                     (  278) GAACAGGGAGGGTG  1
7576                     (   40) GAACAGGGAGAGTG  1
10275                    (  126) GAACAGGGAGGGAG  1
23486                    (  177) GATCAGGGAGACCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 5844 bayes= 11.2491 E= 1.7e+001
  -865   -865    207   -865
   193   -865   -865   -865
   151   -865   -865     -7
  -865    206   -865   -865
   193   -865   -865   -865
  -865   -865    207   -865
  -865   -865    207   -865
  -865   -865    207   -865
   193   -865   -865   -865
  -865   -865    207   -865
    93   -865    108   -865
  -865      7    166   -865
    -7      7   -865     93
  -865   -865    207   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 4 E= 1.7e+001
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.250000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GA[AT]CAGGGAG[AG][GC][TAC]G
--------------------------------------------------------------------------------

Time  2.61 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   8  llr = 95  E-value = 5.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1::::39:1a4:
pos.-specific     C  :5:::114::6:
probability       G  ::a:46::9::a
matrix            T  95:a6::6::::

         bits    2.1   *        *
                 1.9   **     * *
                 1.7   **     * *
                 1.5 * **  * ** *
Relative         1.2 * **  * ** *
Entropy          1.0 ***** ******
(17.2 bits)      0.8 ************
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCGTTGATGACG
consensus             T  GA C  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
7576                          5  1.22e-07       TTGG TTGTTGATGACG GGTGTACAAT
24257                        30  4.78e-07 TGGTTGTATG TTGTTGATGAAG ATACGCTGAT
268614                       37  5.26e-07 CGCATGTCCC TCGTGGACGACG AGGGCTCGTT
23377                       176  2.37e-06 AGTACCTCCA TCGTTGATAACG TTGCAAAGAG
11431                        32  2.56e-06 AATTTTGTGT TTGTTCACGACG TCATCGTATC
6307                        261  3.46e-06 GTTGTGCAAG TTGTGAACGAAG CACGAGAACA
23486                        61  3.46e-06 ACGACGGCAA TCGTTGCTGAAG GAAGGAGGAG
10275                       203  6.29e-06 CAAAAAAAAC ACGTGAATGACG AAGACGCAGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7576                              1.2e-07  4_[+3]_484
24257                             4.8e-07  29_[+3]_459
268614                            5.3e-07  36_[+3]_452
23377                             2.4e-06  175_[+3]_313
11431                             2.6e-06  31_[+3]_457
6307                              3.5e-06  260_[+3]_228
23486                             3.5e-06  60_[+3]_428
10275                             6.3e-06  202_[+3]_286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
7576                     (    5) TTGTTGATGACG  1
24257                    (   30) TTGTTGATGAAG  1
268614                   (   37) TCGTGGACGACG  1
23377                    (  176) TCGTTGATAACG  1
11431                    (   32) TTGTTCACGACG  1
6307                     (  261) TTGTGAACGAAG  1
23486                    (   61) TCGTTGCTGAAG  1
10275                    (  203) ACGTGAATGACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.51668 E= 5.2e+001
  -107   -965   -965    174
  -965    107   -965     93
  -965   -965    208   -965
  -965   -965   -965    193
  -965   -965     66    125
    -7    -93    140   -965
   174    -93   -965   -965
  -965     65   -965    125
  -107   -965    188   -965
   193   -965   -965   -965
    51    139   -965   -965
  -965   -965    208   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 5.2e+001
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.375000  0.625000
 0.250000  0.125000  0.625000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.000000  0.375000  0.000000  0.625000
 0.125000  0.000000  0.875000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.375000  0.625000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[CT]GT[TG][GA]A[TC]GA[CA]G
--------------------------------------------------------------------------------

Time  3.94 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10275                            1.06e-09  125_[+2(1.18e-08)]_63_\
    [+3(6.29e-06)]_167_[+1(3.21e-07)]_107
11106                            8.51e-01  500
11415                            2.50e-02  444_[+1(2.69e-06)]_13_\
    [+1(8.68e-05)]_19
11431                            1.10e-04  31_[+3(2.56e-06)]_135_\
    [+1(2.32e-06)]_310
11913                            4.12e-03  384_[+1(4.40e-07)]_104
23377                            6.24e-05  175_[+3(2.37e-06)]_150_\
    [+1(1.14e-06)]_151
23486                            1.08e-07  60_[+3(3.46e-06)]_72_[+1(1.53e-05)]_\
    20_[+2(6.91e-08)]_310
24257                            3.93e-06  29_[+3(4.78e-07)]_34_[+1(5.12e-07)]_\
    413
268614                           1.03e-03  36_[+3(5.26e-07)]_452
6307                             8.36e-08  260_[+3(3.46e-06)]_5_[+2(2.94e-09)]_\
    209
7576                             5.87e-11  4_[+3(1.22e-07)]_23_[+2(6.18e-09)]_\
    60_[+1(1.41e-06)]_87_[+3(4.97e-05)]_276
bd759                            2.91e-02  316_[+1(2.32e-06)]_172
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************