********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/195/195.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9444                     1.0000    500  42590                    1.0000    500
42721                    1.0000    500  32065                    1.0000    500
32390                    1.0000    500  36403                    1.0000    500
46838                    1.0000    500  13675                    1.0000    500
47206                    1.0000    500  49442                    1.0000    500
31160                    1.0000    500  44858                    1.0000    500
46021                    1.0000    500  50634                    1.0000    500
35135                    1.0000    500  49354                    1.0000    500
50344                    1.0000    500  45254                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/195/195.seqs.fa -oc motifs/195 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.259 C 0.244 G 0.223 T 0.274
Background letter frequencies (from dataset with add-one prior applied):
A 0.259 C 0.244 G 0.223 T 0.274
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   9  llr = 108  E-value = 7.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::1::::9a::
pos.-specific     C  :6:21:931:a:
probability       G  ::829317:::a
matrix            T  a424:7::::::

         bits    2.2            *
                 1.9 *        ***
                 1.7 *   *    ***
                 1.5 *   * * ****
Relative         1.3 * * * ******
Entropy          1.1 * * ********
(17.4 bits)      0.9 *** ********
                 0.6 *** ********
                 0.4 *** ********
                 0.2 ************
                 0.0 ------------

Multilevel           TCGTGTCGAACG
consensus             TTC G C
sequence                G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
50344                         4  1.86e-07        TTG TCGTGGCGAACG ATTGTGGTCT
49354                         6  1.86e-07      CTTTG TCGTGGCGAACG ATTGTGGTCT
50634                       133  1.86e-07 TGTTATCATT TCGGGTCGAACG CCTCTATAGC
31160                        96  2.84e-07 CTCGTTCGCT TCGTGTCCAACG CCGGATATCA
44858                        25  3.75e-07 CGGTGTGTGT TTGGGTCGAACG GGGACGGCAT
45254                       362  2.14e-06 TTGTGTACCT TTTCGTCGAACG TTTGCCTCGC
46838                         7  4.24e-06     CGCTTT TTTCGTCCAACG AGATCCCACC
35135                        27  8.42e-06 CTCCGACTTT TTGTGTGGCACG GACGATGCCC
32390                       283  1.03e-05 CGCATGGCAC TCGACGCCAACG GTAGGAATCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50344                             1.9e-07  3_[+1]_485
49354                             1.9e-07  5_[+1]_483
50634                             1.9e-07  132_[+1]_356
31160                             2.8e-07  95_[+1]_393
44858                             3.7e-07  24_[+1]_464
45254                             2.1e-06  361_[+1]_127
46838                             4.2e-06  6_[+1]_482
35135                             8.4e-06  26_[+1]_462
32390                               1e-05  282_[+1]_206
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
50344                    (    4) TCGTGGCGAACG  1
49354                    (    6) TCGTGGCGAACG  1
50634                    (  133) TCGGGTCGAACG  1
31160                    (   96) TCGTGTCCAACG  1
44858                    (   25) TTGGGTCGAACG  1
45254                    (  362) TTTCGTCGAACG  1
46838                    (    7) TTTCGTCCAACG  1
35135                    (   27) TTGTGTGGCACG  1
32390                    (  283) TCGACGCCAACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 10.7809 E= 7.6e-001
  -982   -982   -982    187
  -982    119   -982     70
  -982   -982    180    -30
  -122    -13     -1     70
  -982   -113    199   -982
  -982   -982     58    128
  -982    187   -101   -982
  -982     45    158   -982
   178   -113   -982   -982
   195   -982   -982   -982
  -982    204   -982   -982
  -982   -982    216   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 7.6e-001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.555556  0.000000  0.444444
 0.000000  0.000000  0.777778  0.222222
 0.111111  0.222222  0.222222  0.444444
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.888889  0.111111  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.888889  0.111111  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[CT][GT][TCG]G[TG]C[GC]AACG
--------------------------------------------------------------------------------

Time  2.78 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  17  llr = 159  E-value = 4.3e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a9411:231:4
pos.-specific     C  9:15429::8:6
probability       G  1:::42:61:a:
matrix            T  :::1251261::

         bits    2.2           *
                 1.9  *        *
                 1.7  *        *
                 1.5 ***   *   *
Relative         1.3 ***   *   *
Entropy          1.1 ***   *  ***
(13.5 bits)      0.9 ***   *  ***
                 0.6 ****  ******
                 0.4 ****  ******
                 0.2 ************
                 0.0 ------------

Multilevel           CAACCTCGTCGC
consensus               AGG AA  A
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47206                       447  8.78e-07 TTGGTGCTAG CAACTTCGTCGA ACAGACTGAA
50634                       290  2.38e-06 TGAAATCGTT CAAACGCGTCGA CCCCCTTCAT
50344                        65  3.20e-06 ACAGAATGAC CAACCGCATCGC AATTGGATAG
49354                        67  3.20e-06 ACAGAATGAC CAACCGCATCGC AATTGGATAG
44858                       411  4.20e-06 GACTCTATTT CAAATTCATCGC GAACGGCCGT
42590                       226  9.48e-06 TCTCGTCGTC CAACGTCTACGA TTAACAGCGT
35135                       245  1.55e-05 TTATAACGGT CAAACTTGTCGA CTTGCGTGCT
46838                       410  1.71e-05 TCCTCTGCGC CAAAGTCGATGC AGTCCAACTC
45254                       231  2.22e-05 CACAGCTCTA CAATCCCGTCGC CTGACCCATT
13675                        80  2.22e-05 CATTTTCAAA GAACGCCGTCGA AGGCCGGCAG
49442                        20  2.41e-05 ACTAGCGTTT GAACCTCTTCGC TTCTTGGCGG
32390                         4  2.41e-05        TTA CACAGTCGACGA GGGTCTCGCC
36403                       369  2.87e-05 CGCTCGCTAC CAAATTCGATGC ATCTGAGTTT
9444                        428  3.60e-05 GTGCGATAGA CACCTTCTTCGC GAAGAACTGC
42721                       373  4.94e-05 ATTTTTTTGC CAAAGGCAGCGC CCATTGTTGC
32065                       151  1.05e-04 CTGGGCTCGA CAACGACGTAGA TGCGGGTTCC
31160                       417  1.38e-04 ATTGTTGTGC CAACACTGACGC CACTCTTTCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47206                             8.8e-07  446_[+2]_42
50634                             2.4e-06  289_[+2]_199
50344                             3.2e-06  64_[+2]_424
49354                             3.2e-06  66_[+2]_422
44858                             4.2e-06  410_[+2]_78
42590                             9.5e-06  225_[+2]_263
35135                             1.6e-05  244_[+2]_244
46838                             1.7e-05  409_[+2]_79
45254                             2.2e-05  230_[+2]_258
13675                             2.2e-05  79_[+2]_409
49442                             2.4e-05  19_[+2]_469
32390                             2.4e-05  3_[+2]_485
36403                             2.9e-05  368_[+2]_120
9444                              3.6e-05  427_[+2]_61
42721                             4.9e-05  372_[+2]_116
32065                              0.0001  150_[+2]_338
31160                             0.00014  416_[+2]_72
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=17
47206                    (  447) CAACTTCGTCGA  1
50634                    (  290) CAAACGCGTCGA  1
50344                    (   65) CAACCGCATCGC  1
49354                    (   67) CAACCGCATCGC  1
44858                    (  411) CAAATTCATCGC  1
42590                    (  226) CAACGTCTACGA  1
35135                    (  245) CAAACTTGTCGA  1
46838                    (  410) CAAAGTCGATGC  1
45254                    (  231) CAATCCCGTCGC  1
13675                    (   80) GAACGCCGTCGA  1
49442                    (   20) GAACCTCTTCGC  1
32390                    (    4) CACAGTCGACGA  1
36403                    (  369) CAAATTCGATGC  1
9444                     (  428) CACCTTCTTCGC  1
42721                    (  373) CAAAGGCAGCGC  1
32065                    (  151) CAACGACGTAGA  1
31160                    (  417) CAACACTGACGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 9.08304 E= 4.3e-001
 -1073    186    -92  -1073
   195  -1073  -1073  -1073
   177   -105  -1073  -1073
    67    112  -1073   -222
  -213     53     66    -22
  -213    -47      8     95
 -1073    186  -1073   -122
   -14  -1073    140    -64
    18  -1073   -192    124
  -213    176  -1073   -122
 -1073  -1073    216  -1073
    67    127  -1073  -1073
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 17 E= 4.3e-001
 0.000000  0.882353  0.117647  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.882353  0.117647  0.000000  0.000000
 0.411765  0.529412  0.000000  0.058824
 0.058824  0.352941  0.352941  0.235294
 0.058824  0.176471  0.235294  0.529412
 0.000000  0.882353  0.000000  0.117647
 0.235294  0.000000  0.588235  0.176471
 0.294118  0.000000  0.058824  0.647059
 0.058824  0.823529  0.000000  0.117647
 0.000000  0.000000  1.000000  0.000000
 0.411765  0.588235  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CAA[CA][CGT][TG]C[GA][TA]CG[CA]
--------------------------------------------------------------------------------

Time  5.43 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   9  llr = 140  E-value = 4.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::1:1:22:3:19:42:271:
pos.-specific     C  1::63::3646::a2::4::1
probability       G  ::134::34::9::38:::32
matrix            T  9a811a81:24:1:::a3367

         bits    2.2
                 1.9  *   *       *  *
                 1.7  *   *     * *  *
                 1.5  *   *     ***  *
Relative         1.3 **   *     *** **
Entropy          1.1 **   ** *  *** ** *
(22.5 bits)      0.9 ***  ** * **** ** *
                 0.6 **** ** * **** ** ***
                 0.4 **** ** *************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TTTCGTTCCCCGACAGTCATT
consensus               GC AGGAT   GA TTGG
sequence                    A T    C  A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
35135                       453  8.46e-11 CGAGCCGCGG TTTCGTTAGACGACAGTCATT TGTCATCTCA
42590                       129  8.34e-10 ACGTACCGGT TTTCGTTCGTCGACCGTTATT GTTAGTTGGT
50344                       435  4.10e-08 ATCGCAATTC TTTGCTTGCCTGACGATTTGG ACATAGGACG
49354                       437  4.10e-08 ATCGCAATTC TTTGCTTGCCTGACGATTTGG ACATAGGACG
31160                       434  4.10e-08 GACGCCACTC TTTCTTACCCCGACCGTATTT ACGCATTCCA
32065                       222  5.69e-08 CACTGTTGCA TTACATTAGTTGACAGTCATT CGTTCCATAG
42721                       464  1.03e-07 TACATCATTC TTTTGTTTCACAACAGTCAGT GGAATATCGA
45254                       274  1.35e-07 TCGTCCACGC TTGCCTACCACGACGGTAATC TTTTCCTCCC
49442                       250  1.75e-07 TCGTCGTCTC CTTGGTTGGCTGTCAGTCAAT GGCGGAATCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35135                             8.5e-11  452_[+3]_27
42590                             8.3e-10  128_[+3]_351
50344                             4.1e-08  434_[+3]_45
49354                             4.1e-08  436_[+3]_43
31160                             4.1e-08  433_[+3]_46
32065                             5.7e-08  221_[+3]_258
42721                               1e-07  463_[+3]_16
45254                             1.4e-07  273_[+3]_206
49442                             1.7e-07  249_[+3]_230
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=9
35135                    (  453) TTTCGTTAGACGACAGTCATT  1
42590                    (  129) TTTCGTTCGTCGACCGTTATT  1
50344                    (  435) TTTGCTTGCCTGACGATTTGG  1
49354                    (  437) TTTGCTTGCCTGACGATTTGG  1
31160                    (  434) TTTCTTACCCCGACCGTATTT  1
32065                    (  222) TTACATTAGTTGACAGTCATT  1
42721                    (  464) TTTTGTTTCACAACAGTCAGT  1
45254                    (  274) TTGCCTACCACGACGGTAATC  1
49442                    (  250) CTTGGTTGGCTGTCAGTCAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 10.0398 E= 4.9e+000
  -982   -113   -982    170
  -982   -982   -982    187
  -122   -982   -101    150
  -982    119     58   -130
  -122     45     99   -130
  -982   -982   -982    187
   -22   -982   -982    150
   -22     45     58   -130
  -982    119     99   -982
    36     87   -982    -30
  -982    119   -982     70
  -122   -982    199   -982
   178   -982   -982   -130
  -982    204   -982   -982
    78    -13     58   -982
   -22   -982    180   -982
  -982   -982   -982    187
   -22     87   -982     28
   136   -982   -982     28
  -122   -982     58    102
  -982   -113     -1    128
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 4.9e+000
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.000000  0.111111  0.777778
 0.000000  0.555556  0.333333  0.111111
 0.111111  0.333333  0.444444  0.111111
 0.000000  0.000000  0.000000  1.000000
 0.222222  0.000000  0.000000  0.777778
 0.222222  0.333333  0.333333  0.111111
 0.000000  0.555556  0.444444  0.000000
 0.333333  0.444444  0.000000  0.222222
 0.000000  0.555556  0.000000  0.444444
 0.111111  0.000000  0.888889  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.000000  1.000000  0.000000  0.000000
 0.444444  0.222222  0.333333  0.000000
 0.222222  0.000000  0.777778  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.222222  0.444444  0.000000  0.333333
 0.666667  0.000000  0.000000  0.333333
 0.111111  0.000000  0.333333  0.555556
 0.000000  0.111111  0.222222  0.666667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TTT[CG][GC]T[TA][CGA][CG][CAT][CT]GAC[AGC][GA]T[CTA][AT][TG][TG]
--------------------------------------------------------------------------------

Time  8.23 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9444                             1.10e-01  427_[+2(3.60e-05)]_61
42590                            1.70e-07  128_[+3(8.34e-10)]_76_\
    [+2(9.48e-06)]_263
42721                            7.60e-05  372_[+2(4.94e-05)]_79_\
    [+3(1.03e-07)]_16
32065                            7.17e-05  221_[+3(5.69e-08)]_258
32390                            3.31e-03  3_[+2(2.41e-05)]_267_[+1(1.03e-05)]_\
    206
36403                            7.84e-02  368_[+2(2.87e-05)]_120
46838                            7.05e-04  6_[+1(4.24e-06)]_391_[+2(1.71e-05)]_\
    79
13675                            4.47e-02  79_[+2(2.22e-05)]_409
47206                            7.99e-03  52_[+2(2.41e-05)]_382_\
    [+2(8.78e-07)]_42
49442                            8.10e-05  19_[+2(2.41e-05)]_218_\
    [+3(1.75e-07)]_230
31160                            4.93e-08  95_[+1(2.84e-07)]_326_\
    [+3(4.10e-08)]_46
44858                            3.48e-05  24_[+1(3.75e-07)]_374_\
    [+2(4.20e-06)]_78
46021                            8.73e-01  500
50634                            1.75e-06  132_[+1(1.86e-07)]_145_\
    [+2(2.38e-06)]_199
35135                            5.10e-10  26_[+1(8.42e-06)]_206_\
    [+2(1.55e-05)]_196_[+3(8.46e-11)]_27
49354                            1.07e-09  5_[+1(1.86e-07)]_49_[+2(3.20e-06)]_\
    358_[+3(4.10e-08)]_43
50344                            1.07e-09  3_[+1(1.86e-07)]_49_[+2(3.20e-06)]_\
    358_[+3(4.10e-08)]_45
45254                            1.79e-07  230_[+2(2.22e-05)]_31_\
    [+3(1.35e-07)]_67_[+1(2.14e-06)]_127
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************