********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/500/500.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10545                    1.0000    500  11155                    1.0000    500
11156                    1.0000    500  11969                    1.0000    500
12074                    1.0000    500  21840                    1.0000    500
23369                    1.0000    500  24274                    1.0000    500
25634                    1.0000    500  264590                   1.0000    500
3067                     1.0000    311  34559                    1.0000    500
4936                     1.0000    500  8616                     1.0000    500
9647                     1.0000    500  bd1237                   1.0000    500
bd1788                   1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/500/500.seqs.fa -oc motifs/500 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8311    N=              17

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.232 G 0.224 T 0.274
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.232 G 0.224 T 0.274
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 19 sites = 7 llr = 117 E-value = 9.1e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a:33:7:61:9914::3:9
pos.-specific     C  :a6:91a419:16:a1:a1
probability       G  ::1111::31::3::17::
matrix            T  :::6::::4:1::6:7:::

         bits    2.2  *    *       *  *
                 1.9 **    *       *  *
                 1.7 **    *       *  *
                 1.5 **  * *  *    *  *
Relative         1.3 **  * *  ***  * ***
Entropy          1.1 **  * ** ***  * ***
(24.1 bits)      0.9 **  **** *** ******
                 0.6 *** **** **********
                 0.4 ******** **********
                 0.2 *******************
                 0.0 -------------------

Multilevel           ACCTCACATCAACTCTGCA
consensus              AA   CG   GA  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
11156                       206  1.12e-11 ATGTGGCGAA ACCTCACCTCAACTCTGCA CAGACAGCTT
264590                      342  2.78e-10 ATTTGGAGAA ACCTCGCCTCAACTCTGCA TAGACAGCCT
34559                       471  4.42e-09 TTTTTCAACC ACAACACATCAAGTCTACA ATCACATCAC
4936                        275  3.11e-08 CGGGGTCTGC ACCAGACACCAAAACTGCA TAGATCGAGG
bd1788                      345  3.56e-08 ACGACCTGAT ACCGCACCGGTACACTGCA TCATAGAAAA
11969                        51  4.06e-08 CAACAACAAA ACGTCACAGCACGACGGCA ACGGTGAGCT
3067                        263  1.55e-07 CAGCAGACAA ACATCCCAACAACTCCACC ACTCCCTCAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11156                             1.1e-11  205_[+1]_276
264590                            2.8e-10  341_[+1]_140
34559                             4.4e-09  470_[+1]_11
4936                              3.1e-08  274_[+1]_207
bd1788                            3.6e-08  344_[+1]_137
11969                             4.1e-08  50_[+1]_431
3067                              1.5e-07  262_[+1]_30
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=7
11156                    (  206) ACCTCACCTCAACTCTGCA  1
264590                   (  342) ACCTCGCCTCAACTCTGCA  1
34559                    (  471) ACAACACATCAAGTCTACA  1
4936                     (  275) ACCAGACACCAAAACTGCA  1
bd1788                   (  345) ACCGCACCGGTACACTGCA  1
11969                    (   51) ACGTCACAGCACGACGGCA  1
3067                     (  263) ACATCCCAACAACTCCACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 8005 bayes= 10.0018 E= 9.1e-001
   188   -945   -945   -945
  -945    211   -945   -945
     8    130    -65   -945
     8   -945    -65    106
  -945    188    -65   -945
   140    -70    -65   -945
  -945    211   -945   -945
   108     89   -945   -945
   -92    -70     35     65
  -945    188    -65   -945
   166   -945   -945    -94
   166    -70   -945   -945
   -92    130     35   -945
    66   -945   -945    106
  -945    211   -945   -945
  -945    -70    -65    138
     8   -945    167   -945
  -945    211   -945   -945
   166    -70   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 7 E= 9.1e-001
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.285714  0.571429  0.142857  0.000000
 0.285714  0.000000  0.142857  0.571429
 0.000000  0.857143  0.142857  0.000000
 0.714286  0.142857  0.142857  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.571429  0.428571  0.000000  0.000000
 0.142857  0.142857  0.285714  0.428571
 0.000000  0.857143  0.142857  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.857143  0.142857  0.000000  0.000000
 0.142857  0.571429  0.285714  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.142857  0.142857  0.714286
 0.285714  0.000000  0.714286  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.857143  0.142857  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
AC[CA][TA]CAC[AC][TG]CAA[CG][TA]CT[GA]CA
--------------------------------------------------------------------------------

Time 2.86 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 7 llr = 101 E-value = 1.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :11::aa3a1:69339
pos.-specific     C  :11:a::6::9116::
probability       G  a4:9:::::911::71
matrix            T  :371:::1:::1:1::

         bits    2.2 *   *
                 1.9 *   *** *
                 1.7 *   *** *
                 1.5 *  **** ***
Relative         1.3 *  **** *** * **
Entropy          1.1 *  **** *** * **
(20.9 bits)      0.9 * ***** *** * **
                 0.6 * ********* ****
                 0.4 * ********* ****
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGTGCAACAGCAACGA
consensus             T     A     AA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
bd1788                      323  8.36e-09 TAGCTATGCT GTAGCAACAGCAACGA CCTGATACCG
24274                        17  1.10e-08 ATCAGCATAA GCTGCAACAGCCACGA TGATGAGGCA
11969                        31  7.45e-08 AATGGGATAT GACGCAACAGCAACAA CAAAACGTCA
8616                        442  9.00e-08 ACTGAACACC GGTGCAAAAGCTCCGA TTCATTTTAG
25634                       417  2.53e-07 TTCTGCTCGG GTTTCAATAGCAAAGA TATTGTACGT
11155                        89  3.28e-07 TCGACATAGA GGTGCAACAAGAATGA GGAAAGCGGC
bd1237                      295  4.22e-07 AGACGGAGAA GGTGCAAAAGCGAAAG ATACCAAACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd1788                            8.4e-09  322_[+2]_162
24274                             1.1e-08  16_[+2]_468
11969                             7.5e-08  30_[+2]_454
8616                                9e-08  441_[+2]_43
25634                             2.5e-07  416_[+2]_68
11155                             3.3e-07  88_[+2]_396
bd1237                            4.2e-07  294_[+2]_190
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=7
bd1788                   (  323) GTAGCAACAGCAACGA  1
24274                    (   17) GCTGCAACAGCCACGA  1
11969                    (   31) GACGCAACAGCAACAA  1
8616                     (  442) GGTGCAAAAGCTCCGA  1
25634                    (  417) GTTTCAATAGCAAAGA  1
11155                    (   89) GGTGCAACAAGAATGA  1
bd1237                   (  295) GGTGCAAAAGCGAAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8056 bayes= 10.7734 E= 1.4e+001
  -945   -945    216   -945
   -92    -70     94      6
   -92    -70   -945    138
  -945   -945    194    -94
  -945    211   -945   -945
   188   -945   -945   -945
   188   -945   -945   -945
     8    130   -945    -94
   188   -945   -945   -945
   -92   -945    194   -945
  -945    188    -65   -945
   108    -70    -65    -94
   166    -70   -945   -945
     8    130   -945    -94
     8   -945    167   -945
   166   -945    -65   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 1.4e+001
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.142857  0.428571  0.285714
 0.142857  0.142857  0.000000  0.714286
 0.000000  0.000000  0.857143  0.142857
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.571429  0.000000  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.857143  0.142857  0.000000
 0.571429  0.142857  0.142857  0.142857
 0.857143  0.142857  0.000000  0.000000
 0.285714  0.571429  0.000000  0.142857
 0.285714  0.000000  0.714286  0.000000
 0.857143  0.000000  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[GT]TGCAA[CA]AGCAA[CA][GA]A
--------------------------------------------------------------------------------

Time 5.38 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 5 llr = 98 E-value = 3.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a8::a::8::a:::::2::2
pos.-specific     C  ::28:a6:48:2426a::a:
probability       G  :222::42:::::8::2::8
matrix            T  ::6:::::62:86:4:6a::

         bits    2.2      *         *  *
                 1.9 *   **    *    * **
                 1.7 *   **    *    * **
                 1.5 *   **    *  * * **
Relative         1.3 ** *** * *** * * ***
Entropy          1.1 ** ************* ***
(28.4 bits)      0.9 ** ************* ***
                 0.6 ********************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           AATCACCATCATTGCCTTCG
consensus             GCG  GGCT CCCT A  A
sequence               G             G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
264590                      448  1.12e-12 GTCAAAGGAG AATCACCATCATTGCCTTCG TTGCATACGC
11156                       454  1.12e-12 GTCAAAGGAG AATCACCATCATTGCCTTCG TTGCATACAC
bd1237                       67  2.35e-09 ATCTCATCGC AAGCACCACCATCGTCATCA CCGCTCTCAT
21840                       105  3.08e-09 CGCTCTTGGA AGCCACGACCACCGCCGTCG GGTTAGGAAG
11155                        62  6.68e-09 CAACGATGAC AATGACGGTTATTCTCTTCG ACATAGAGGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
264590                            1.1e-12  447_[+3]_33
11156                             1.1e-12  453_[+3]_27
bd1237                            2.3e-09  66_[+3]_414
21840                             3.1e-09  104_[+3]_376
11155                             6.7e-09  61_[+3]_419
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=5
264590                   (  448) AATCACCATCATTGCCTTCG  1
11156                    (  454) AATCACCATCATTGCCTTCG  1
bd1237                   (   67) AAGCACCACCATCGTCATCA  1
21840                    (  105) AGCCACGACCACCGCCGTCG  1
11155                    (   62) AATGACGGTTATTCTCTTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7988 bayes= 10.0748 E= 3.0e+001
   188   -897   -897   -897
   156   -897    -16   -897
  -897    -21    -16    113
  -897    178    -16   -897
   188   -897   -897   -897
  -897    211   -897   -897
  -897    137     84   -897
   156   -897    -16   -897
  -897     79   -897    113
  -897    178   -897    -45
   188   -897   -897   -897
  -897    -21   -897    155
  -897     79   -897    113
  -897    -21    184   -897
  -897    137   -897     55
  -897    211   -897   -897
   -44   -897    -16    113
  -897   -897   -897    187
  -897    211   -897   -897
   -44   -897    184   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 3.0e+001
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.200000  0.200000  0.600000
 0.000000  0.800000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.600000  0.400000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.800000  0.000000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
A[AG][TCG][CG]AC[CG][AG][TC][CT]A[TC][TC][GC][CT]C[TAG]TC[GA]
--------------------------------------------------------------------------------

Time 7.83 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10545                            5.94e-01  500
11155                            1.18e-08  61_[+3(6.68e-09)]_7_[+2(3.28e-07)]_\
    396
11156                            2.21e-15  205_[+1(1.12e-11)]_129_\
    [+3(3.00e-08)]_80_[+3(1.12e-12)]_27
11969                            9.64e-08  30_[+2(7.45e-08)]_4_[+1(4.06e-08)]_\
    431
12074                            4.27e-01  500
21840                            2.01e-05  104_[+3(3.08e-09)]_376
23369                            7.73e-01  500
24274                            2.53e-04  16_[+2(1.10e-08)]_468
25634                            4.52e-03  416_[+2(2.53e-07)]_68
264590                           3.72e-14  341_[+1(2.78e-10)]_87_\
    [+3(1.12e-12)]_12_[+3(2.72e-05)]_1
3067                             1.85e-04  262_[+1(1.55e-07)]_30
34559                            8.39e-05  330_[+1(1.51e-05)]_121_\
    [+1(4.42e-09)]_11
4936                             8.77e-05  274_[+1(3.11e-08)]_207
8616                             5.93e-04  441_[+2(9.00e-08)]_43
9647                             7.22e-01  500
bd1237                           1.74e-08  66_[+3(2.35e-09)]_208_\
    [+2(4.22e-07)]_190
bd1788                           1.51e-08  322_[+2(8.36e-09)]_6_[+1(3.56e-08)]_\
    137
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************