********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/444/444.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47352                    1.0000    500  47694                    1.0000    500
38108                    1.0000    500  3388                     1.0000    500
48494                    1.0000    500  39045                    1.0000    500
39403                    1.0000    500  48855                    1.0000    500
49025                    1.0000    500  49842                    1.0000    500
33128                    1.0000    500  38934                    1.0000    500
31875                    1.0000    500  48581                    1.0000    500
39796                    1.0000    500  33110                    1.0000    500
37575                    1.0000    500  35923                    1.0000    500
48994                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/444/444.seqs.fa -oc motifs/444 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.232 G 0.215 T 0.285
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.232 G 0.215 T 0.285
********************************************************************************


********************************************************************************
MOTIF 1 MEME    width = 21  sites = 9  llr = 144  E-value = 1.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :4::923383762621::::9
pos.-specific     C  :626::47:1:3::1:82111
probability       G  ::84:81:26318171:::7:
matrix            T  a:::1:1::::::3:82892:

         bits    2.2
                 2.0
                 1.8 *
                 1.6 *
Relative         1.3 * * **      *   * * *
Entropy          1.1 ****** ** * *   *** *
(23.0 bits)      0.9 ****** ** * * *******
                 0.7 ****** ****** *******
                 0.4 ****** **************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TCGCAGCCAGAAGAGTCTTGA
consensus             ACG AAAGAGCATA TC T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
35923                       160  3.98e-12 ATCTCTTCGT TAGGAGCCAGACGAGTCTTGA TCACCTCGTG
31875                       159  3.98e-12 ATCTCTTCGT TAGGAGCCAGACGAGTCTTGA TCACCTCGTG
39796                       201  3.18e-08 CGAAAGTCAA TCGCAGTCAGAAGAAGCTCGA ATTTGCCGAT
47694                       466  3.44e-08 GCAACTTGTA TCGCAGCAGGGAGAGTCCTTC ACGCCCAGCG
3388                        413  3.72e-08 GCAACAGACT TCGCAAGCAGAAGTCTTTTGA ATTTCAAGCC
37575                       416  4.34e-08 GAGGCTCCAA TCCGAGAAAAGCGGATCTTGA TTGCAATGCG
49842                        31  8.92e-08 TAATAACATT TAGCTGACGCAAGTGTCTTTA CTTTGACACT
48994                       453  1.90e-07 AGTTCTGCAA TACCAACCAAGGAAGTTTTGA GGCCAACGCC
33128                        25  2.53e-07 AGAGGGTCTT TCGGAGAAAAAAATGACCTCA TTCATAACCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35923                               4e-12  159_[+1]_320
31875                               4e-12  158_[+1]_321
39796                             3.2e-08  200_[+1]_279
47694                             3.4e-08  465_[+1]_14
3388                              3.7e-08  412_[+1]_67
37575                             4.3e-08  415_[+1]_64
49842                             8.9e-08  30_[+1]_449
48994                             1.9e-07  452_[+1]_27
33128                             2.5e-07  24_[+1]_455
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
35923                    (  160) TAGGAGCCAGACGAGTCTTGA  1
31875                    (  159) TAGGAGCCAGACGAGTCTTGA  1
39796                    (  201) TCGCAGTCAGAAGAAGCTCGA  1
47694                    (  466) TCGCAGCAGGGAGAGTCCTTC  1
3388                     (  413) TCGCAAGCAGAAGTCTTTTGA  1
37575                    (  416) TCCGAGAAAAGCGGATCTTGA  1
49842                    (   31) TAGCTGACGCAAGTGTCTTTA  1
48994                    (  453) TACCAACCAAGGAAGTTTTGA  1
33128                    (   25) TCGGAGAAAAAAATGACCTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 10.1179 E= 1.6e+000
  -982   -982   -982    181
    73    126   -982   -982
  -982     -6    185   -982
  -982    126    105   -982
   173   -982   -982   -136
   -27   -982    185   -982
    31     94    -95   -136
    31    152   -982   -982
   154   -982      5   -982
    31   -106    137   -982
   131   -982     63   -982
   105     52    -95   -982
   -27   -982    185   -982
   105   -982    -95     23
   -27   -106    163   -982
  -127   -982    -95    145
  -982    175   -982    -36
  -982     -6   -982    145
  -982   -106   -982    164
  -982   -106    163    -36
   173   -106   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 1.6e+000
 0.000000  0.000000  0.000000  1.000000
 0.444444  0.555556  0.000000  0.000000
 0.000000  0.222222  0.777778  0.000000
 0.000000  0.555556  0.444444  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.222222  0.000000  0.777778  0.000000
 0.333333  0.444444  0.111111  0.111111
 0.333333  0.666667  0.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.333333  0.111111  0.555556  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.555556  0.333333  0.111111  0.000000
 0.222222  0.000000  0.777778  0.000000
 0.555556  0.000000  0.111111  0.333333
 0.222222  0.111111  0.666667  0.000000
 0.111111  0.000000  0.111111  0.777778
 0.000000  0.777778  0.000000  0.222222
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.111111  0.666667  0.222222
 0.888889  0.111111  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[CA][GC][CG]A[GA][CA][CA][AG][GA][AG][AC][GA][AT][GA]T[CT][TC]T[GT]A
--------------------------------------------------------------------------------

Time 3.00 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME    width = 20  sites = 4  llr = 88  E-value = 3.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::5::3::a:8:aa::::88
pos.-specific     C  3::a:::::53:::a338:3
probability       G  8::::8:a:::a:::8833:
matrix            T  :a5:a:a::5::::::::::

         bits    2.2    *   *   *  *
                 2.0    *   **  ****
                 1.8  * ** ***  ****
                 1.6  * ** ***  ****
Relative         1.3 ** ******  *******
Entropy          1.1 ** ****** **********
(31.8 bits)      0.9 ********************
                 0.7 ********************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GTACTGTGACAGAACGGCAA
consensus            C T  A   TC    CCGGC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
35923                       283  1.36e-12 GTATTGCCTG GTTCTGTGACAGAACGGCAA TAAGGCAACC
31875                       282  1.36e-12 GTATTGCCTG GTTCTGTGACAGAACGGCAA TAAGGCAACC
38108                        47  3.08e-10 GATTTGCTAT GTACTGTGATCGAACCGGGA ATGTGGATCA
33110                       380  4.60e-10 TAGCAAGTGA CTACTATGATAGAACGCCAC TGTCGCCATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35923                             1.4e-12  282_[+2]_198
31875                             1.4e-12  281_[+2]_199
38108                             3.1e-10  46_[+2]_434
33110                             4.6e-10  379_[+2]_101
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=4
35923                    (  283) GTTCTGTGACAGAACGGCAA  1
31875                    (  282) GTTCTGTGACAGAACGGCAA  1
38108                    (   47) GTACTGTGATCGAACCGGGA  1
33110                    (  380) CTACTATGATAGAACGCCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 9139 bayes= 11.1572 E= 3.9e+000
  -865     11    180   -865
  -865   -865   -865    181
    90   -865   -865     81
  -865    211   -865   -865
  -865   -865   -865    181
   -10   -865    180   -865
  -865   -865   -865    181
  -865   -865    221   -865
   190   -865   -865   -865
  -865    111   -865     81
   148     11   -865   -865
  -865   -865    221   -865
   190   -865   -865   -865
   190   -865   -865   -865
  -865    211   -865   -865
  -865     11    180   -865
  -865     11    180   -865
  -865    169     22   -865
   148   -865     22   -865
   148     11   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 3.9e+000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.750000  0.250000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC]T[AT]CT[GA]TGA[CT][AC]GAAC[GC][GC][CG][AG][AC]
--------------------------------------------------------------------------------

Time 6.31 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME    width = 21  sites = 11  llr = 155  E-value = 1.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  2263196413::924::::85
pos.-specific     C  :2:57::2533:1:::1a:::
probability       G  2:::214:11:7:1:19:214
matrix            T  6643:::54473:769::812

         bits    2.2                  *
                 2.0                  *
                 1.8                 **
                 1.6      *      *   **
Relative         1.3      *     **  ***
Entropy          1.1     ***   ***  *****
(20.4 bits)      0.9   * ***   *** ******
                 0.7 *** ***   **********
                 0.4 ********  ***********
                 0.2 ********* ***********
                 0.0 ---------------------

Multilevel           TTACCAATCTTGATTTGCTAA
consensus              TA  GATACT  A     G
sequence                T     C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
35923                       230  6.57e-11 CGTTGTCAGA TTAACAGTCTTGATTTGCTAA TCACTTTGCT
47694                        64  2.51e-10 ACGACTCGTC TTATCAAACCTGATATGCTAG GAACGAAATA
31875                       229  1.49e-09 CATTGTCAGA TTAACAGTCTTGAATTGCTAA TCACTTTGCT
37575                       319  1.23e-07 TGCAGAAACC TTACCAGATACGATTTCCGAA AAGGCGGGTA
38934                       130  1.74e-07 CATGATAAAC TCTCCAAATCTTATTTGCTTG GCAACTGTAC
38108                        26  1.74e-07 AGTCTGACTA GTATCGGTCTCGATTTGCTAT GTACTGTGAT
49842                       225  4.49e-07 GTGAAAAAAG ACTCCAATCGTTATATGCGAA ATCTCGGTCA
33128                       220  6.84e-07 TGTCTCTTTT TTAAGAATTATGAGTTGCTGT ATAAAACCAG
3388                        301  6.84e-07 GTACGCAAGT TTTCAAACGATGCTTTGCTAG AGGAAAATGG
48855                        98  1.01e-06 CTGTGTCTTC GAACCAACACCGATAGGCTAG TATTTAATAA
49025                       298  1.81e-06 TTTACATTCT AATTGAAATTTTAAATGCTAA ATCAAACACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35923                             6.6e-11  229_[+3]_250
47694                             2.5e-10  63_[+3]_416
31875                             1.5e-09  228_[+3]_251
37575                             1.2e-07  318_[+3]_161
38934                             1.7e-07  129_[+3]_350
38108                             1.7e-07  25_[+3]_454
49842                             4.5e-07  224_[+3]_255
33128                             6.8e-07  219_[+3]_260
3388                              6.8e-07  300_[+3]_179
48855                               1e-06  97_[+3]_382
49025                             1.8e-06  297_[+3]_182
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=11
35923                    (  230) TTAACAGTCTTGATTTGCTAA  1
47694                    (   64) TTATCAAACCTGATATGCTAG  1
31875                    (  229) TTAACAGTCTTGAATTGCTAA  1
37575                    (  319) TTACCAGATACGATTTCCGAA  1
38934                    (  130) TCTCCAAATCTTATTTGCTTG  1
38108                    (   26) GTATCGGTCTCGATTTGCTAT  1
49842                    (  225) ACTCCAATCGTTATATGCGAA  1
33128                    (  220) TTAAGAATTATGAGTTGCTGT  1
3388                     (  301) TTTCAAACGATGCTTTGCTAG  1
48855                    (   98) GAACCAACACCGATAGGCTAG  1
49025                    (  298) AATTGAAATTTTAAATGCTAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 9.24555 E= 1.3e+001
   -56  -1010    -24    116
   -56    -35  -1010    116
   125  -1010  -1010     35
     3     97  -1010     -6
  -156    165    -24  -1010
   176  -1010   -124  -1010
   125  -1010     76  -1010
    44    -35  -1010     67
  -156     97   -124     35
     3     24   -124     35
 -1010     24  -1010    135
 -1010  -1010    176     -6
   176   -135  -1010  -1010
   -56  -1010   -124    135
    44  -1010  -1010    116
 -1010  -1010   -124    167
 -1010   -135    208  -1010
 -1010    211  -1010  -1010
 -1010  -1010    -24    152
   161  -1010   -124   -165
    76  -1010     76    -65
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 11 E= 1.3e+001
 0.181818  0.000000  0.181818  0.636364
 0.181818  0.181818  0.000000  0.636364
 0.636364  0.000000  0.000000  0.363636
 0.272727  0.454545  0.000000  0.272727
 0.090909  0.727273  0.181818  0.000000
 0.909091  0.000000  0.090909  0.000000
 0.636364  0.000000  0.363636  0.000000
 0.363636  0.181818  0.000000  0.454545
 0.090909  0.454545  0.090909  0.363636
 0.272727  0.272727  0.090909  0.363636
 0.000000  0.272727  0.000000  0.727273
 0.000000  0.000000  0.727273  0.272727
 0.909091  0.090909  0.000000  0.000000
 0.181818  0.000000  0.090909  0.727273
 0.363636  0.000000  0.000000  0.636364
 0.000000  0.000000  0.090909  0.909091
 0.000000  0.090909  0.909091  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.181818  0.818182
 0.818182  0.000000  0.090909  0.090909
 0.454545  0.000000  0.363636  0.181818
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TT[AT][CAT]CA[AG][TA][CT][TAC][TC][GT]AT[TA]TGCTA[AG]
--------------------------------------------------------------------------------

Time 9.69 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47352                            6.07e-01  500
47694                            3.67e-10  63_[+3(2.51e-10)]_381_\
                                           [+1(3.44e-08)]_14
38108                            9.86e-10  25_[+3(1.74e-07)]_[+2(3.08e-10)]_\
                                           434
3388                             1.16e-06  73_[+1(1.93e-05)]_206_\
                                           [+3(6.84e-07)]_91_[+1(3.72e-08)]_67
48494                            7.83e-01  500
39045                            2.98e-01  500
39403                            8.81e-01  500
48855                            3.48e-03  97_[+3(1.01e-06)]_382
49025                            2.07e-02  297_[+3(1.81e-06)]_182
49842                            1.43e-06  30_[+1(8.92e-08)]_173_\
                                           [+3(4.49e-07)]_255
33128                            4.67e-06  24_[+1(2.53e-07)]_174_\
                                           [+3(6.84e-07)]_260
38934                            2.13e-03  129_[+3(1.74e-07)]_350
31875                            1.41e-21  158_[+1(3.98e-12)]_49_\
                                           [+3(1.49e-09)]_32_[+2(1.36e-12)]_199
48581                            2.83e-01  500
39796                            1.10e-04  200_[+1(3.18e-08)]_279
33110                            5.09e-06  379_[+2(4.60e-10)]_101
37575                            3.10e-08  318_[+3(1.23e-07)]_76_\
                                           [+1(4.34e-08)]_64
35923                            6.95e-23  159_[+1(3.98e-12)]_49_\
                                           [+3(6.57e-11)]_32_[+2(1.36e-12)]_198
48994                            1.41e-03  452_[+1(1.90e-07)]_27
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
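
Reader's note (not part of the MEME output): the "Relative Entropy" figure in
each Motif Description block can be cross-checked against the
letter-probability matrix and the background letter frequencies printed above.
The sketch below is a minimal illustration in Python, assuming the figure is
the sum over motif columns of p * log2(p / background) with zero-probability
letters skipped. The names BACKGROUND, MOTIF1_ROWS, parse_matrix and
column_relative_entropy are hypothetical helpers introduced here, not part of
MEME. Under that assumption the total for Motif 1 should come out close to the
23.0 bits shown in its description.

```python
import math

# Background letter frequencies copied from the report, in A, C, G, T order.
BACKGROUND = [0.268, 0.232, 0.215, 0.285]

# Rows copied from the "Motif 1 position-specific probability matrix" block.
MOTIF1_ROWS = """\
0.000000 0.000000 0.000000 1.000000
0.444444 0.555556 0.000000 0.000000
0.000000 0.222222 0.777778 0.000000
0.000000 0.555556 0.444444 0.000000
0.888889 0.000000 0.000000 0.111111
0.222222 0.000000 0.777778 0.000000
0.333333 0.444444 0.111111 0.111111
0.333333 0.666667 0.000000 0.000000
0.777778 0.000000 0.222222 0.000000
0.333333 0.111111 0.555556 0.000000
0.666667 0.000000 0.333333 0.000000
0.555556 0.333333 0.111111 0.000000
0.222222 0.000000 0.777778 0.000000
0.555556 0.000000 0.111111 0.333333
0.222222 0.111111 0.666667 0.000000
0.111111 0.000000 0.111111 0.777778
0.000000 0.777778 0.000000 0.222222
0.000000 0.222222 0.000000 0.777778
0.000000 0.111111 0.000000 0.888889
0.000000 0.111111 0.666667 0.222222
0.888889 0.111111 0.000000 0.000000
"""


def parse_matrix(text):
    """Parse whitespace-separated rows into a list of lists of floats."""
    return [[float(x) for x in line.split()]
            for line in text.splitlines() if line.strip()]


def column_relative_entropy(probs, background):
    """Relative entropy of one motif column versus the background, in bits."""
    return sum(p * math.log2(p / b)
               for p, b in zip(probs, background) if p > 0)


matrix = parse_matrix(MOTIF1_ROWS)
per_column = [column_relative_entropy(row, BACKGROUND) for row in matrix]
total = sum(per_column)

# Print one line per motif position, then the total over all positions.
for position, bits in enumerate(per_column, start=1):
    print(f"position {position:2d}: {bits:.2f} bits")
print(f"total over {len(matrix)} positions: {total:.2f} bits")
```

The same two functions can be pointed at the Motif 2 or Motif 3 matrix by
replacing MOTIF1_ROWS with the rows of the corresponding block.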