********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/154/154.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42881                    1.0000    500  43072                    1.0000    500
47402                    1.0000    500  4846                     1.0000    500
33077                    1.0000    500  49245                    1.0000    500
55126                    1.0000    500  44297                    1.0000    500
34061                    1.0000    500  45841                    1.0000    500
42663                    1.0000    500  32253                    1.0000    500
43225                    1.0000    500  36699                    1.0000    500
49818                    1.0000    500  45461                    1.0000    500
41125                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/154/154.seqs.fa -oc motifs/154 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 17 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8500 N= 17 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.266 C 0.244 G 0.229 T 0.260 Background letter frequencies (from dataset with add-one prior applied): A 0.266 C 0.244 G 0.229 T 0.260 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 18 sites = 17 llr = 188 E-value = 2.9e-003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :21::2611:::22:1:: pos.-specific C 12:862:21451:51518 probability G 1::23121:144:424:1 matrix T 969:141685158:7191 bits 2.1 1.9 1.7 * * 1.5 * * Relative 1.3 * ** * * Entropy 1.1 * ** * ** (16.0 bits) 0.9 * *** * ** * ** 0.6 ***** ********* ** 0.4 ***** ************ 0.2 ***** ************ 0.0 ------------------ Multilevel TTTCCTATTTCTTCTCTC consensus C GAGC CGG G G sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------ 41125 285 4.26e-10 TGATGGCAAT TTTCCTATTTCTTGTGTC CTATGACAGC 45461 285 4.26e-10 
TGATGGCAAT TTTCCTATTTCTTGTGTC CTATGACAGC 47402 430 4.54e-08 ACCGTTCGGA TTTCCCATTGGGTCTCTC AATTCATTCC 55126 7 8.99e-07 GCATTC TTTCGTATTCGTTCCATC TGGCTTTTCC 43072 432 1.21e-06 ATCGATAAAA TATGCCATCTGTTCTCTC TTGCGAACCG 34061 284 2.14e-06 TGTGCTCCAA TTTCCGGATCGTTATCTC GTATCGACGT 4846 345 2.14e-06 AGCCCTAAAT TCTCCTGTTTGGAAGCTC GTACTAGTAC 43225 431 2.56e-06 CAACCCATTT TCTCGCATTCTCTCTCTC GAGCACAACC 49818 231 4.31e-06 GACATCTCGA TTTCTTTGTTCTTCTCTC GGAGGCGCAG 42663 467 4.31e-06 ATACTGTAAA TTTCCAATACCTTGTTTT TGCGGCAACG 44297 250 5.08e-06 TTTGCGTTGG TATGGTATTCCTTCTGCC GAAGAAGCCC 33077 6 5.08e-06 CGGTA TCTCCAGCTTCGTATGTT TCCGTCTACG 36699 449 7.00e-06 ATTGCTTCAA TCTCGCACATCGTGGGTC TCCTACCTTG 42881 9 1.58e-05 TGTGTATG TTTGGTTTTCTGTGCGTC GGGCGGGCTG 45841 25 2.08e-05 TCTCCCGGAA TTTCCAACCCCGACGCTG TTGTTTGCGA 49245 175 2.23e-05 CGAGTACAGC GATCCAGTTTGGACTTTC GAAACCTGCC 32253 311 5.59e-05 GATTCTCCTG CTACCGACTTGTTGTCTG AAGCGACATT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 41125 4.3e-10 284_[+1]_198 45461 4.3e-10 284_[+1]_198 47402 4.5e-08 429_[+1]_53 55126 9e-07 6_[+1]_476 43072 1.2e-06 431_[+1]_51 34061 2.1e-06 283_[+1]_199 4846 2.1e-06 344_[+1]_138 43225 2.6e-06 430_[+1]_52 49818 4.3e-06 230_[+1]_252 42663 4.3e-06 466_[+1]_16 44297 5.1e-06 249_[+1]_233 33077 5.1e-06 5_[+1]_477 36699 7e-06 448_[+1]_34 42881 1.6e-05 8_[+1]_474 45841 2.1e-05 24_[+1]_458 49245 2.2e-05 174_[+1]_308 32253 5.6e-05 310_[+1]_172 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 
width=18 seqs=17 41125 ( 285) TTTCCTATTTCTTGTGTC 1 45461 ( 285) TTTCCTATTTCTTGTGTC 1 47402 ( 430) TTTCCCATTGGGTCTCTC 1 55126 ( 7) TTTCGTATTCGTTCCATC 1 43072 ( 432) TATGCCATCTGTTCTCTC 1 34061 ( 284) TTTCCGGATCGTTATCTC 1 4846 ( 345) TCTCCTGTTTGGAAGCTC 1 43225 ( 431) TCTCGCATTCTCTCTCTC 1 49818 ( 231) TTTCTTTGTTCTTCTCTC 1 42663 ( 467) TTTCCAATACCTTGTTTT 1 44297 ( 250) TATGGTATTCCTTCTGCC 1 33077 ( 6) TCTCCAGCTTCGTATGTT 1 36699 ( 449) TCTCGCACATCGTGGGTC 1 42881 ( 9) TTTGGTTTTCTGTGCGTC 1 45841 ( 25) TTTCCAACCCCGACGCTG 1 49245 ( 175) GATCCAGTTTGGACTTTC 1 32253 ( 311) CTACCGACTTGTTGTCTG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 18 n= 8211 bayes= 8.91289 E= 2.9e-003 -1073 -205 -196 176 -59 -5 -1073 118 -218 -1073 -1073 185 -1073 175 -37 -1073 -1073 140 36 -214 -18 -5 -96 66 128 -1073 4 -114 -218 -5 -196 131 -118 -105 -1073 155 -1073 75 -196 102 -1073 95 85 -114 -1073 -205 85 102 -59 -1073 -1073 166 -59 95 62 -1073 -1073 -105 -37 144 -218 95 62 -114 -1073 -205 -1073 185 -1073 165 -96 -114 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 18 nsites= 17 E= 2.9e-003 0.000000 0.058824 0.058824 0.882353 0.176471 0.235294 0.000000 0.588235 0.058824 0.000000 0.000000 0.941176 0.000000 0.823529 0.176471 0.000000 0.000000 0.647059 0.294118 0.058824 0.235294 0.235294 0.117647 0.411765 0.647059 0.000000 0.235294 0.117647 0.058824 0.235294 0.058824 0.647059 0.117647 0.117647 0.000000 0.764706 0.000000 0.411765 0.058824 0.529412 0.000000 
0.470588 0.411765 0.117647 0.000000 0.058824 0.411765 0.529412 0.176471 0.000000 0.000000 0.823529 0.176471 0.470588 0.352941 0.000000 0.000000 0.117647 0.176471 0.705882 0.058824 0.470588 0.352941 0.117647 0.000000 0.058824 0.000000 0.941176 0.000000 0.764706 0.117647 0.117647 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- T[TC]TC[CG][TAC][AG][TC]T[TC][CG][TG]T[CG]T[CG]TC -------------------------------------------------------------------------------- Time 2.51 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 6 llr = 114 E-value = 3.4e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :7::::::::::2::2232:: pos.-specific C ::8:2::55::a3::285:37 probability G :22a8::5533:37a:::8:3 matrix T a2:::aa::77:23:7:2:7: bits 2.1 * * * 1.9 * * ** * * 1.7 * * ** * * 1.5 * ***** * * * Relative 1.3 * ***** * * * * Entropy 1.1 * ********** ** * *** (27.5 bits) 0.9 * ********** ** * *** 0.6 ************ **** *** 0.4 ************ ******** 0.2 ********************* 0.0 --------------------- Multilevel TACGGTTCCTTCCGGTCCGTC consensus GGGG GT A CG sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- 
--------- --------------------- 41125 209 5.55e-12 GAGTTTCGTT TACGGTTCGGTCCGGTCCGTC CCACTTGACG 45461 209 5.55e-12 GAGTTTCGTT TACGGTTCGGTCCGGTCCGTC CCACTTGACG 4846 302 1.41e-09 CAGAAAACCC TAGGGTTGGTGCAGGTCAGTG GTTGCTCGAC 45841 244 1.93e-09 CCTGACAGCT TTCGGTTGCTTCTTGCCCGTC GCCGAGATAC 43225 55 3.61e-09 AGTCACATGC TACGCTTCCTTCGTGTCTGCG ACCGGATCCG 36699 332 1.80e-08 ATAAGCTCTT TGCGGTTGCTGCGGGAAAACC GAATTAAAGG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 41125 5.5e-12 208_[+2]_271 45461 5.5e-12 208_[+2]_271 4846 1.4e-09 301_[+2]_178 45841 1.9e-09 243_[+2]_236 43225 3.6e-09 54_[+2]_425 36699 1.8e-08 331_[+2]_148 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=6 41125 ( 209) TACGGTTCGGTCCGGTCCGTC 1 45461 ( 209) TACGGTTCGGTCCGGTCCGTC 1 4846 ( 302) TAGGGTTGGTGCAGGTCAGTG 1 45841 ( 244) TTCGGTTGCTTCTTGCCCGTC 1 43225 ( 55) TACGCTTCCTTCGTGTCTGCG 1 36699 ( 332) TGCGGTTGCTGCGGGAAAACC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 10.8561 E= 3.4e-001 -923 -923 -923 194 132 -923 -46 -64 -923 177 -46 -923 -923 -923 213 -923 -923 -55 186 -923 -923 -923 -923 194 -923 -923 -923 194 -923 103 113 -923 -923 103 113 -923 -923 -923 54 136 -923 -923 54 136 -923 
203 -923 -923 -68 45 54 -64 -923 -923 154 36 -923 -923 213 -923 -68 -55 -923 136 -68 177 -923 -923 32 103 -923 -64 -68 -923 186 -923 -923 45 -923 136 -923 145 54 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 3.4e-001 0.000000 0.000000 0.000000 1.000000 0.666667 0.000000 0.166667 0.166667 0.000000 0.833333 0.166667 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.166667 0.833333 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.500000 0.500000 0.000000 0.000000 0.500000 0.500000 0.000000 0.000000 0.000000 0.333333 0.666667 0.000000 0.000000 0.333333 0.666667 0.000000 1.000000 0.000000 0.000000 0.166667 0.333333 0.333333 0.166667 0.000000 0.000000 0.666667 0.333333 0.000000 0.000000 1.000000 0.000000 0.166667 0.166667 0.000000 0.666667 0.166667 0.833333 0.000000 0.000000 0.333333 0.500000 0.000000 0.166667 0.166667 0.000000 0.833333 0.000000 0.000000 0.333333 0.000000 0.666667 0.000000 0.666667 0.333333 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- TACGGTT[CG][CG][TG][TG]C[CG][GT]GTC[CA]G[TC][CG] -------------------------------------------------------------------------------- Time 4.96 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 6 llr = 112 E-value = 1.4e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::8::2:a7::5::::333:: pos.-specific C 88:a223:323:a::5::2:: probability G :22::37::5:5::8:::2:a matrix T 2:::83:::37::a25773a: bits 2.1 * * * 1.9 * * ** ** 1.7 * * ** ** 1.5 * * * *** ** Relative 1.3 ***** ** *** ** Entropy 1.1 ***** *** ******** ** (26.8 bits) 0.9 ***** *** ******** ** 0.6 ***** ************ ** 0.4 ***** ************ ** 0.2 ***** ************ ** 0.0 --------------------- Multilevel CCACTGGAAGTACTGCTTATG consensus TC CTCG TAAT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 41125 230 1.34e-10 CCGGTCCGTC CCACTTGACGTACTGTATATG TGATTGAAAC 45461 230 1.34e-10 CCGGTCCGTC CCACTTGACGTACTGTATATG TGATTGAAAC 49818 129 7.90e-10 TCGACTTGCA TCACTGGAATCGCTGTTTTTG CAAATTAGGA 42663 41 4.05e-09 CCTTCTCTCA CCACTACAACTACTGCTACTG CTACGACGAC 33077 112 4.39e-09 CCATCTCGGT CCGCCGCAATTGCTGCTTTTG TCTCGCCAGC 44297 163 1.32e-08 TAACTTCAGA CGACTCGAAGCGCTTCTAGTG TAATTCGGAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION 
P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 41125 1.3e-10 229_[+3]_250 45461 1.3e-10 229_[+3]_250 49818 7.9e-10 128_[+3]_351 42663 4e-09 40_[+3]_439 33077 4.4e-09 111_[+3]_368 44297 1.3e-08 162_[+3]_317 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=6 41125 ( 230) CCACTTGACGTACTGTATATG 1 45461 ( 230) CCACTTGACGTACTGTATATG 1 49818 ( 129) TCACTGGAATCGCTGTTTTTG 1 42663 ( 41) CCACTACAACTACTGCTACTG 1 33077 ( 112) CCGCCGCAATTGCTGCTTTTG 1 44297 ( 163) CGACTCGAAGCGCTTCTAGTG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 10.067 E= 1.4e+001 -923 177 -923 -64 -923 177 -46 -923 164 -923 -46 -923 -923 203 -923 -923 -923 -55 -923 168 -68 -55 54 36 -923 45 154 -923 191 -923 -923 -923 132 45 -923 -923 -923 -55 113 36 -923 45 -923 136 91 -923 113 -923 -923 203 -923 -923 -923 -923 -923 194 -923 -923 186 -64 -923 103 -923 94 32 -923 -923 136 32 -923 -923 136 32 -55 -46 36 -923 -923 -923 194 -923 -923 213 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.4e+001 0.000000 0.833333 0.000000 0.166667 0.000000 0.833333 0.166667 0.000000 0.833333 0.000000 0.166667 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.166667 0.000000 0.833333 
0.166667 0.166667 0.333333 0.333333 0.000000 0.333333 0.666667 0.000000 1.000000 0.000000 0.000000 0.000000 0.666667 0.333333 0.000000 0.000000 0.000000 0.166667 0.500000 0.333333 0.000000 0.333333 0.000000 0.666667 0.500000 0.000000 0.500000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.833333 0.166667 0.000000 0.500000 0.000000 0.500000 0.333333 0.000000 0.000000 0.666667 0.333333 0.000000 0.000000 0.666667 0.333333 0.166667 0.166667 0.333333 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- CCACT[GT][GC]A[AC][GT][TC][AG]CTG[CT][TA][TA][AT]TG -------------------------------------------------------------------------------- Time 7.39 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 42881 6.24e-02 8_[+1(1.58e-05)]_474 43072 8.78e-03 431_[+1(1.21e-06)]_51 47402 8.48e-05 404_[+1(5.91e-05)]_7_[+1(4.54e-08)]_\ 53 4846 1.42e-07 301_[+2(1.41e-09)]_22_\ [+1(2.14e-06)]_138 33077 4.04e-07 5_[+1(5.08e-06)]_28_[+1(7.37e-05)]_\ 42_[+3(4.39e-09)]_342_[+3(5.19e-05)]_5 49245 7.01e-02 174_[+1(2.23e-05)]_308 55126 6.52e-03 6_[+1(8.99e-07)]_476 44297 3.38e-07 162_[+3(1.32e-08)]_66_\ [+1(5.08e-06)]_233 34061 
2.68e-02  283_[+1(2.14e-06)]_199
45841                            1.00e-06  24_[+1(2.08e-05)]_201_\
    [+2(1.93e-09)]_236
42663                            3.39e-07  40_[+3(4.05e-09)]_405_\
    [+1(4.31e-06)]_16
32253                            6.71e-02  310_[+1(5.59e-05)]_172
43225                            2.51e-07  54_[+2(3.61e-09)]_355_\
    [+1(2.56e-06)]_52
36699                            3.06e-06  331_[+2(1.80e-08)]_96_\
    [+1(7.00e-06)]_34
49818                            1.60e-07  128_[+3(7.90e-10)]_81_\
    [+1(4.31e-06)]_252
45461                            4.90e-20  208_[+2(5.55e-12)]_[+3(1.34e-10)]_\
    34_[+1(4.26e-10)]_198
41125                            4.90e-20  208_[+2(5.55e-12)]_[+3(1.34e-10)]_\
    34_[+1(4.26e-10)]_198
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************