********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/332/332.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8859                     1.0000    500  43236                    1.0000    500
7721                     1.0000    500  13400                    1.0000    500
37636                    1.0000    500  48801                    1.0000    500
23830                    1.0000    500  16674                    1.0000    500
44752                    1.0000    500  34317                    1.0000    500
45943                    1.0000    500  12452                    1.0000    500
32874                    1.0000    500  40895                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/332/332.seqs.fa -oc motifs/332 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.282 C 0.226 G 0.222 T 0.270
Background letter frequencies (from dataset with add-one prior applied):
A 0.282 C 0.226 G 0.222 T 0.270
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 15 sites = 7 llr = 103 E-value = 9.1e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::3::::7::
pos.-specific     C  11:39:1:::a1::1
probability       G  :::1:a::37:9319
matrix            T  99a61:9773:::9:

         bits    2.2      *    *
                 2.0   *  *    *
                 1.7   *  *    *
                 1.5   * **    **  *
Relative         1.3 *** ***  *** **
Entropy          1.1 *** ***********
(21.3 bits)      0.9 *** ***********
                 0.7 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           TTTTCGTTTGCGATG
consensus               C   AGT  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------
44752                       182  3.41e-09 CGGTCACATC TTTTCGTTTGCGGTG CCTCGTCGTT
13400                        52  7.78e-09 CAACAAAGGC TTTCCGTTGGCGATG TCACTGAACG
7721                        426  7.78e-09 CAACAAAGGC TTTCCGTTGGCGATG TCACTGAACG
34317                       446  8.03e-08 TCGTCACGGA TTTTTGTATGCGATG GAGTTCAAAA
40895                       389  2.78e-07 ACATGAACAG CTTTCGCTTTCGATG GCATGAGAAG
43236                        56  3.48e-07 CCCGACATGT TTTGCGTTTTCGGGG ACCTGCCAAT
23830                        55  8.95e-07 CGCTTCGACG TCTTCGTATGCCATC GCGTAAAAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44752                             3.4e-09  181_[+1]_304
13400                             7.8e-09  51_[+1]_434
7721                              7.8e-09  425_[+1]_60
34317                               8e-08  445_[+1]_40
40895                             2.8e-07  388_[+1]_97
43236                             3.5e-07  55_[+1]_430
23830                             8.9e-07  54_[+1]_431
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=7
44752                    (  182) TTTTCGTTTGCGGTG  1
13400                    (   52) TTTCCGTTGGCGATG  1
7721                     (  426) TTTCCGTTGGCGATG  1
34317                    (  446) TTTTTGTATGCGATG  1
40895                    (  389) CTTTCGCTTTCGATG  1
43236                    (   56) TTTGCGTTTTCGGGG  1
23830                    (   55) TCTTCGTATGCCATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6804 bayes= 10.5296 E= 9.1e-002
  -945    -66   -945    167
  -945    -66   -945    167
  -945   -945   -945    189
  -945     34    -63    108
  -945    192   -945    -92
  -945   -945    217   -945
  -945    -66   -945    167
     2   -945   -945    140
  -945   -945     37    140
  -945   -945    169      8
  -945    214   -945   -945
  -945    -66    195   -945
   134   -945     37   -945
  -945   -945    -63    167
  -945    -66    195   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 9.1e-002
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.285714  0.142857  0.571429
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.285714  0.000000  0.000000  0.714286
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.000000  0.714286  0.285714
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.142857  0.857143  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TTT[TC]CGT[TA][TG][GT]CG[AG]TG
--------------------------------------------------------------------------------

Time 1.70 secs.

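The Motif 1 log-odds entries appear to be 100 times the base-2 log of each letter's site frequency over its background frequency. The sketch below checks that reading against the first matrix row. The helper name `log_odds_score` and the pseudocount scheme (the `b= 0.01` value from the command line summary, spread across letters by background frequency) are assumptions made for illustration, not taken from the MEME source. Because the background frequencies are printed to only three decimals, a recomputed score can differ from the printed one by one unit.

```python
import math

# Background frequencies as printed in the command line summary (rounded).
BACKGROUND = {"A": 0.282, "C": 0.226, "G": 0.222, "T": 0.270}

def log_odds_score(count, nsites, letter, b=0.01):
    """Hypothetical helper: 100 * log2(p / bg), with an assumed pseudocount.

    The pseudocount b is distributed by background frequency; this is an
    assumption that happens to reproduce the printed values closely.
    """
    bg = BACKGROUND[letter]
    p = (count + b * bg) / (nsites + b)
    return round(100 * math.log2(p / bg))

# Letter counts at position 1 of Motif 1, read off the 7 listed sites.
counts = {"A": 0, "C": 1, "G": 0, "T": 6}
row = [log_odds_score(counts[letter], 7, letter) for letter in "ACGT"]
print(row)  # compare with the first row of the Motif 1 log-odds matrix
```

Under the same assumption, a letter that never occurs scores 100 * log2(b / (nsites + b)) regardless of background, which would explain why the floor value differs between Motif 1 (7 sites) and Motifs 2 and 3 (14 sites).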
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 14 llr = 134 E-value = 8.5e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  6224a49a63::
pos.-specific     C  1:3::6:::2:a
probability       G  1646::1:34a:
matrix            T  111:::::11::

         bits    2.2           **
                 2.0           **
                 1.7     *  *  **
                 1.5     *  *  **
Relative         1.3     * **  **
Entropy          1.1    *****  **
(13.8 bits)      0.9  * *****  **
                 0.7  * ****** **
                 0.4 ** ****** **
                 0.2 ************
                 0.0 ------------

Multilevel           AGGGACAAAGGC
consensus             ACA A  GA
sequence               A      C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
48801                        84  2.63e-07 ACGAAGGATG AGGGACAAAAGC AGTAGCCGGC
13400                        40  5.39e-07 GCGTGCCTTG AGCAACAAAGGC TTTCCGTTGG
7721                        414  5.39e-07 GCGTGCCTTG AGCAACAAAGGC TTTCCGTTGG
40895                       137  1.10e-06 AGCTTTTGCG AAGGACAAAGGC AAAGGTTTTT
16674                       215  4.00e-06 TCCCGGCCAC AGCGAAAAACGC TTGAGTACGT
45943                       471  8.63e-06 CCTTTTCGGC TGGAACAAAAGC GCGAACAGTG
43236                       349  1.34e-05 TTCCCTGGCG CGGAACAAGAGC CTTGCTTCTC
34317                       106  1.64e-05 AGAATGATCG AGTGACAAGAGC TAGATAGGAA
23830                       108  1.96e-05 CTCACGACAG AGAGAAGAAGGC AAACAAATCG
12452                       276  5.31e-05 GGACTTGGTC TTGGACAAGCGC CTTCCTGTCG
8859                         86  6.44e-05 CGGTACACCC AAAGAAGAAGGC ACGTCAGATT
32874                       264  6.88e-05 AAAAATAAAT ATGAACAAGTGC ATGTGGAAGA
44752                        94  9.64e-05 CCCCGCTCTT CGCAAAAATGGC AACGGCGCTC
37636                       248  1.28e-04 CAAGTCGAAC GAAGAAAAACGC AAACCTCACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48801                             2.6e-07  83_[+2]_405
13400                             5.4e-07  39_[+2]_449
7721                              5.4e-07  413_[+2]_75
40895                             1.1e-06  136_[+2]_352
16674                               4e-06  214_[+2]_274
45943                             8.6e-06  470_[+2]_18
43236                             1.3e-05  348_[+2]_140
34317                             1.6e-05  105_[+2]_383
23830                               2e-05  107_[+2]_381
12452                             5.3e-05  275_[+2]_213
8859                              6.4e-05  85_[+2]_403
32874                             6.9e-05  263_[+2]_225
44752                             9.6e-05  93_[+2]_395
37636                             0.00013  247_[+2]_241
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=14
48801                    (   84) AGGGACAAAAGC  1
13400                    (   40) AGCAACAAAGGC  1
7721                     (  414) AGCAACAAAGGC  1
40895                    (  137) AAGGACAAAGGC  1
16674                    (  215) AGCGAAAAACGC  1
45943                    (  471) TGGAACAAAAGC  1
43236                    (  349) CGGAACAAGAGC  1
34317                    (  106) AGTGACAAGAGC  1
23830                    (  108) AGAGAAGAAGGC  1
12452                    (  276) TTGGACAAGCGC  1
8859                     (   86) AAAGAAGAAGGC  1
32874                    (  264) ATGAACAAGTGC  1
44752                    (   94) CGCAAAAATGGC  1
37636                    (  248) GAAGAAAAACGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.53747 E= 8.5e-001
   119    -66   -163    -92
   -40  -1045    154    -92
   -40     34     95   -191
    60  -1045    137  -1045
   182  -1045  -1045  -1045
    34    151  -1045  -1045
   160  -1045    -63  -1045
   182  -1045  -1045  -1045
   119  -1045     37   -191
     2     -8     95   -191
 -1045  -1045    217  -1045
 -1045    214  -1045  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 8.5e-001
 0.642857  0.142857  0.071429  0.142857
 0.214286  0.000000  0.642857  0.142857
 0.214286  0.285714  0.428571  0.071429
 0.428571  0.000000  0.571429  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.357143  0.642857  0.000000  0.000000
 0.857143  0.000000  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.642857  0.000000  0.285714  0.071429
 0.285714  0.214286  0.428571  0.071429
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
A[GA][GCA][GA]A[CA]AA[AG][GAC]GC
--------------------------------------------------------------------------------

Time 3.81 secs.

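The Motif 2 regular expression can be recovered from its letter-probability matrix. The sketch below assumes a simple rule: at each position keep every letter with probability of at least 0.2, most probable first, and bracket the position when more than one letter survives. The function name `pspm_to_regex` and the 0.2 threshold are illustrative assumptions; the rule reproduces the string printed for this motif, but it is not claimed to be MEME's exact procedure.

```python
# Motif 2 letter-probability matrix, one row per position, columns A C G T
# (values copied from the matrix printed above).
PSPM = [
    [0.642857, 0.142857, 0.071429, 0.142857],
    [0.214286, 0.000000, 0.642857, 0.142857],
    [0.214286, 0.285714, 0.428571, 0.071429],
    [0.428571, 0.000000, 0.571429, 0.000000],
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.357143, 0.642857, 0.000000, 0.000000],
    [0.857143, 0.000000, 0.142857, 0.000000],
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.642857, 0.000000, 0.285714, 0.071429],
    [0.285714, 0.214286, 0.428571, 0.071429],
    [0.000000, 0.000000, 1.000000, 0.000000],
    [0.000000, 1.000000, 0.000000, 0.000000],
]

def pspm_to_regex(matrix, alphabet="ACGT", threshold=0.2):
    """Hypothetical helper: build a character-class regex from a PSPM."""
    parts = []
    for row in matrix:
        # Keep letters at or above the threshold, most probable first.
        # sorted() is stable, so ties stay in alphabet order.
        kept = sorted(
            (pair for pair in zip(alphabet, row) if pair[1] >= threshold),
            key=lambda pair: -pair[1],
        )
        letters = "".join(letter for letter, _ in kept)
        parts.append(letters if len(letters) == 1 else f"[{letters}]")
    return "".join(parts)

print(pspm_to_regex(PSPM))
```

The resulting string is an ordinary regular expression, so it can be passed to `re.finditer` to scan other sequences for candidate sites; such matches carry no p-value and are much cruder than a MAST search.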
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 14 llr = 136 E-value = 5.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::1:11:::156
pos.-specific     C  :661832::95:
probability       G  2:3:11:1::::
matrix            T  84:9:489a1:4

         bits    2.2
                 2.0         *
                 1.7         *
                 1.5    *   **
Relative         1.3 *  *   ***
Entropy          1.1 ** ** *****
(14.0 bits)      0.9 ***** ******
                 0.7 ***** ******
                 0.4 ***** ******
                 0.2 ************
                 0.0 ------------

Multilevel           TCCTCTTTTCAA
consensus            GTG  CC   CT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
32874                        92  5.33e-07 GGAGAGTTGG TCCTCTTTTCAT GGTTAGCCAA
23830                       245  5.33e-07 CTTCCTCGAA TTCTCTTTTCCA GCATGAAATC
34317                       323  8.74e-07 TCAATTCGGT TTCTCCTTTCCA CCCTGTCAAT
43236                       369  2.71e-06 GCCTTGCTTC TCCTCGTTTCAT GTCACGTAGA
8859                         61  3.56e-06 GTAGCGACAG TTGTCCTTTCCA AGACGGTACA
45943                       420  6.11e-06 TGAAAGTCTT TTCTCATTTCCT TCTTTTCTCC
12452                       147  9.26e-06 CGACATCGTC TCGTCTCTTCAT GGTGTGAAAC
13400                       411  9.85e-06 GTAAGTTGAC GCCTGTTTTCCA AACAAGATTT
37636                        43  2.67e-05 TTGACGAAGT GTGTCATTTCCA TCCGGAGATA
7721                         46  3.75e-05 GATAGTTTAT TCCTCCCTTTAA ACTGCTGCTG
44752                       199  4.58e-05 TTGCGGTGCC TCGTCGTTTACA CAGCGTCTTG
48801                       215  5.96e-05 ATGAAGAATT GCCTCTCGTCAA CGCCGGAGAA
40895                       447  7.07e-05 GCAAGACACC TCCCGCTTTCAT AATATCACGG
16674                       288  8.38e-05 CATATGTTAA TCATATTTTCAA TGTCCTACAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32874                             5.3e-07  91_[+3]_397
23830                             5.3e-07  244_[+3]_244
34317                             8.7e-07  322_[+3]_166
43236                             2.7e-06  368_[+3]_120
8859                              3.6e-06  60_[+3]_428
45943                             6.1e-06  419_[+3]_69
12452                             9.3e-06  146_[+3]_342
13400                             9.9e-06  410_[+3]_78
37636                             2.7e-05  42_[+3]_446
7721                              3.7e-05  45_[+3]_443
44752                             4.6e-05  198_[+3]_290
48801                               6e-05  214_[+3]_274
40895                             7.1e-05  446_[+3]_42
16674                             8.4e-05  287_[+3]_201
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=14
32874                    (   92) TCCTCTTTTCAT  1
23830                    (  245) TTCTCTTTTCCA  1
34317                    (  323) TTCTCCTTTCCA  1
43236                    (  369) TCCTCGTTTCAT  1
8859                     (   61) TTGTCCTTTCCA  1
45943                    (  420) TTCTCATTTCCT  1
12452                    (  147) TCGTCTCTTCAT  1
13400                    (  411) GCCTGTTTTCCA  1
37636                    (   43) GTGTCATTTCCA  1
7721                     (   46) TCCTCCCTTTAA  1
44752                    (  199) TCGTCGTTTACA  1
48801                    (  215) GCCTCTCGTCAA  1
40895                    (  447) TCCCGCTTTCAT  1
16674                    (  288) TCATATTTTCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 8.93074 E= 5.4e-001
 -1045  -1045     -5    154
 -1045    151  -1045     40
  -198    151     37  -1045
 -1045   -166  -1045    178
  -198    180    -63  -1045
   -98     34    -63     67
 -1045     -8  -1045    154
 -1045  -1045   -163    178
 -1045  -1045  -1045    189
  -198    192  -1045   -191
    82    114  -1045  -1045
   119  -1045  -1045     40
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 5.4e-001
 0.000000  0.000000  0.214286  0.785714
 0.000000  0.642857  0.000000  0.357143
 0.071429  0.642857  0.285714  0.000000
 0.000000  0.071429  0.000000  0.928571
 0.071429  0.785714  0.142857  0.000000
 0.142857  0.285714  0.142857  0.428571
 0.000000  0.214286  0.000000  0.785714
 0.000000  0.000000  0.071429  0.928571
 0.000000  0.000000  0.000000  1.000000
 0.071429  0.857143  0.000000  0.071429
 0.500000  0.500000  0.000000  0.000000
 0.642857  0.000000  0.000000  0.357143
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TG][CT][CG]TC[TC][TC]TTC[AC][AT]
--------------------------------------------------------------------------------

Time 5.40 secs.

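The "Relative Entropy (14.0 bits)" figure in the Motif 3 description can be approximated from the letter-probability matrix and the background frequencies. The sketch below sums p * log2(p / background) over every position and letter, skipping zero probabilities. The function name `relative_entropy_bits` is hypothetical, and the calculation ignores any small-sample handling MEME may apply, so agreement with the printed value is only expected to within rounding.

```python
import math

# Background frequencies for A, C, G, T as printed in the command line summary.
BACKGROUND = [0.282, 0.226, 0.222, 0.270]

# Motif 3 letter-probability matrix, one row per position, columns A C G T.
PSPM = [
    [0.000000, 0.000000, 0.214286, 0.785714],
    [0.000000, 0.642857, 0.000000, 0.357143],
    [0.071429, 0.642857, 0.285714, 0.000000],
    [0.000000, 0.071429, 0.000000, 0.928571],
    [0.071429, 0.785714, 0.142857, 0.000000],
    [0.142857, 0.285714, 0.142857, 0.428571],
    [0.000000, 0.214286, 0.000000, 0.785714],
    [0.000000, 0.000000, 0.071429, 0.928571],
    [0.000000, 0.000000, 0.000000, 1.000000],
    [0.071429, 0.857143, 0.000000, 0.071429],
    [0.500000, 0.500000, 0.000000, 0.000000],
    [0.642857, 0.000000, 0.000000, 0.357143],
]

def relative_entropy_bits(matrix, background=BACKGROUND):
    """Hypothetical helper: total relative entropy of a motif, in bits."""
    total = 0.0
    for row in matrix:
        for p, bg in zip(row, background):
            if p > 0:  # a zero-probability letter contributes nothing
                total += p * math.log2(p / bg)
    return total

print(f"{relative_entropy_bits(PSPM):.2f} bits")
```

The per-position terms of the same sum are what the star columns in the description plot, which is why position 9 (T in all 14 sites) is the tallest column and position 6 the shortest.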
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8859                             2.83e-03  60_[+3(3.56e-06)]_13_[+2(6.44e-05)]_\
    403
43236                            3.34e-07  55_[+1(3.48e-07)]_278_\
    [+2(1.34e-05)]_8_[+3(2.71e-06)]_120
7721                             6.00e-09  45_[+3(3.75e-05)]_356_\
    [+2(5.39e-07)]_[+1(7.78e-09)]_60
13400                            1.76e-09  39_[+2(5.39e-07)]_[+1(7.78e-09)]_\
    344_[+3(9.85e-06)]_78
37636                            7.41e-03  42_[+3(2.67e-05)]_446
48801                            2.39e-04  83_[+2(2.63e-07)]_119_\
    [+3(5.96e-05)]_274
23830                            2.54e-07  54_[+1(8.95e-07)]_38_[+2(1.96e-05)]_\
    125_[+3(5.33e-07)]_244
16674                            9.46e-04  214_[+2(4.00e-06)]_61_\
    [+3(8.38e-05)]_201
44752                            3.80e-07  93_[+2(9.64e-05)]_76_[+1(3.41e-09)]_\
    2_[+3(4.58e-05)]_290
34317                            3.75e-08  105_[+2(1.64e-05)]_205_\
    [+3(8.74e-07)]_13_[+1(9.87e-05)]_83_[+1(8.03e-08)]_40
45943                            3.00e-04  419_[+3(6.11e-06)]_39_\
    [+2(8.63e-06)]_18
12452                            3.01e-03  146_[+3(9.26e-06)]_117_\
    [+2(5.31e-05)]_213
32874                            5.86e-04  91_[+3(5.33e-07)]_160_\
    [+2(6.88e-05)]_225
40895                            5.36e-07  136_[+2(1.10e-06)]_240_\
    [+1(2.78e-07)]_43_[+3(7.07e-05)]_42
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
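A combined block diagram encodes, left to right, spacer lengths and motif occurrences separated by underscores, so site start positions can be recovered by accumulating lengths. The sketch below parses one diagram under that reading. The function name `parse_diagram` and the `MOTIF_WIDTHS` table (widths copied from the three MOTIF header lines) are illustrative assumptions, and a diagram that is wrapped with a trailing backslash must be joined into one string before parsing.

```python
import re

# Motif widths taken from the MOTIF header lines of this report.
MOTIF_WIDTHS = {1: 15, 2: 12, 3: 12}

# One occurrence token, e.g. "[+1(3.48e-07)]": strand, motif number, p-value.
SITE_TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")

def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Hypothetical helper: return (sites, sequence_length).

    Each site is (motif, strand, start, p_value) with a 1-based start.
    """
    position = 0
    sites = []
    for token in diagram.split("_"):
        match = SITE_TOKEN.fullmatch(token)
        if match:
            strand = match.group(1)
            motif = int(match.group(2))
            p_value = float(match.group(3))
            sites.append((motif, strand, position + 1, p_value))
            position += widths[motif]
        elif token:
            position += int(token)  # plain number: spacer length in bases
    return sites, position

# Diagram for sequence 43236, with its continuation line joined.
sites, length = parse_diagram(
    "55_[+1(3.48e-07)]_278_[+2(1.34e-05)]_8_[+3(2.71e-06)]_120"
)
for motif, strand, start, p_value in sites:
    print(motif, strand, start, p_value)
print("sequence length:", length)
```

For sequence 43236 the recovered starts can be compared with the Start column of each motif's sites table, and the accumulated length with the training-set length of 500.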