********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/56/56.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
41508                    1.0000    500  45119                    1.0000    500
27447                    1.0000    500  46297                    1.0000    500
54552                    1.0000    500  32251                    1.0000    500
34031                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/56/56.seqs.fa -oc motifs/56 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.228 G 0.244 T 0.255
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.228 G 0.244 T 0.255
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  13  sites =   7  llr = 85  E-value = 9.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  6:::4:4:1::::
pos.-specific     C  4:3a:91:::::9
probability       G  :11:::19::9::
matrix            T  :96:61319a1a1

         bits    2.1    *
                 1.9    *     * *
                 1.7    *     * *
                 1.5  * * * * ****
Relative         1.3  * * * ******
Entropy          1.1 ** * * ******
(17.4 bits)      0.9 ** *** ******
                 0.6 ****** ******
                 0.4 ****** ******
                 0.2 *************
                 0.0 -------------

Multilevel           ATTCTCAGTTGTC
consensus            C C A T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            -------------
32251                       425  1.42e-07 TTGCAAGTTT CTTCACTGTTGTC ACTTTCACTC
34031                       415  2.12e-07 ATCTCGTCCT CTCCTCTGTTGTC TGCCAAACAA
41508                       220  3.66e-07 CTGCCCAAGC CTCCTCCGTTGTC AGGCCGAAAA
54552                       458  4.61e-07 CGGATCGTTC ATTCTCATTTGTC GAGACGACCT
46297                       468  2.25e-06 GGCCTCGGGT ATTCACGGATGTC GGTCAGACAC
45119                       356  3.24e-06 TAAAACTGCC ATGCACAGTTGTT ACCTGAGAGC
27447                       291  1.11e-05 AGATACGGGG AGTCTTAGTTTTC AGTTACGCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32251                             1.4e-07  424_[+1]_63
34031                             2.1e-07  414_[+1]_73
41508                             3.7e-07  219_[+1]_268
54552                             4.6e-07  457_[+1]_30
46297                             2.3e-06  467_[+1]_20
45119                             3.2e-06  355_[+1]_132
27447                             1.1e-05  290_[+1]_197
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=13 seqs=7
32251                    (  425) CTTCACTGTTGTC  1
34031                    (  415) CTCCTCTGTTGTC  1
41508                    (  220) CTCCTCCGTTGTC  1
54552                    (  458) ATTCTCATTTGTC  1
46297                    (  468) ATTCACGGATGTC  1
45119                    (  356) ATGCACAGTTGTT  1
27447                    (  291) AGTCTTAGTTTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 3416 bayes= 9.53451 E= 9.6e-001
   107     91   -945   -945
  -945   -945    -77    175
  -945     32    -77    116
  -945    213   -945   -945
    65   -945   -945    116
  -945    191   -945    -84
    65    -67    -77     16
  -945   -945    181    -84
   -93   -945   -945    175
  -945   -945   -945    197
  -945   -945    181    -84
  -945   -945   -945    197
  -945    191   -945    -84
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 7 E= 9.6e-001
 0.571429  0.428571  0.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.285714  0.142857  0.571429
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  0.857143  0.000000  0.142857
 0.428571  0.142857  0.142857  0.285714
 0.000000  0.000000  0.857143  0.142857
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.857143  0.000000  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AC]T[TC]C[TA]C[AT]GTTGTC
--------------------------------------------------------------------------------




Time  0.50 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  19  sites =   4  llr = 73  E-value = 1.2e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3:58::83:a8::3:88:
pos.-specific     C  :8:::8a:3a::8:58:3a
probability       G  8::533::5:::3a333::
matrix            T  3:a::::3:::3:::::::

         bits    2.1       *  *   *    *
                 1.9   *   *  **  *    *
                 1.7   *   *  **  *    *
                 1.5   *   *  **  *    *
Relative         1.3 ***  **  ** ** *  *
Entropy          1.1 *** **** ***** ****
(26.2 bits)      0.9 ******** ***** ****
                 0.6 ******** **********
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           GCTAACCAGCAACGCCAAC
consensus            TA GGG TA  TG AGGC
sequence                     C     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            -------------------
46297                       179  1.00e-10 CTCTACAAGA GCTGACCACCAACGACAAC ACTACCAACA
27447                       101  2.45e-09 CCCAATCGGT GATAACCAGCAACGCCGCC TCCCGCGAAC
34031                       445  2.57e-09 CAAGTTCCAG TCTAACCTGCAAGGCCAAC AAAACCACAT
45119                        16  1.93e-08 TTGTGTCAAA GCTGGGCAACATCGGGAAC CGTCCCACTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46297                               1e-10  178_[+2]_303
27447                             2.5e-09  100_[+2]_381
34031                             2.6e-09  444_[+2]_37
45119                             1.9e-08  15_[+2]_466
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=4
46297                    (  179) GCTGACCACCAACGACAAC  1
27447                    (  101) GATAACCAGCAACGCCGCC  1
34031                    (  445) TCTAACCTGCAAGGCCAAC  1
45119                    (   16) GCTGGGCAACATCGGGAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 3374 bayes= 9.71853 E= 1.2e+003
  -865   -865    162     -3
   -13    172   -865   -865
  -865   -865   -865    197
    87   -865    103   -865
   146   -865      3   -865
  -865    172      3   -865
  -865    213   -865   -865
   146   -865   -865     -3
   -13     13    103   -865
  -865    213   -865   -865
   187   -865   -865   -865
   146   -865   -865     -3
  -865    172      3   -865
  -865   -865    203   -865
   -13    113      3   -865
  -865    172      3   -865
   146   -865      3   -865
   146     13   -865   -865
  -865    213   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 1.2e+003
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.500000  0.000000  0.500000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.250000  0.250000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GT][CA]T[AG][AG][CG]C[AT][GAC]CA[AT][CG]G[CAG][CG][AG][AC]C
--------------------------------------------------------------------------------




Time  1.04 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   7  llr = 76  E-value = 1.6e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  3aaa::47a:17
pos.-specific     C  6:::663::::1
probability       G  1:::3433:a9:
matrix            T  ::::1::::::1

         bits    2.1          *
                 1.9  ***    **
                 1.7  ***    **
                 1.5  ***    ***
Relative         1.3  ***    ***
Entropy          1.1  *** * ****
(15.7 bits)      0.9  *** * *****
                 0.6 ****** *****
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CAAACCAAAGGA
consensus            A   GGCG
sequence                   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
46297                       136  3.74e-07 TGTTTCCGTT CAAACGCAAGGA GCTAATGCGG
32251                       118  6.07e-07 TAATGCCAGT CAAACCAGAGGA AGATCTTGTA
27447                       231  2.10e-06 CAAAAGTCTG CAAACGGGAGGA GCCGTGCTCG
54552                        60  3.96e-06 AATTTAAAAA AAAAGCGAAGGA TGATCTCACT
34031                       311  7.23e-06 ATGAGCTCTA CAAACGCAAGAA CGCGAATGAT
45119                       457  1.83e-05 GATTCTTTGG GAAAGCAAAGGC CATTGAGGTC
41508                       433  2.13e-05 GGATAAAATC AAAATCAAAGGT ACTGGCATTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46297                             3.7e-07  135_[+3]_353
32251                             6.1e-07  117_[+3]_371
27447                             2.1e-06  230_[+3]_258
54552                               4e-06  59_[+3]_429
34031                             7.2e-06  310_[+3]_178
45119                             1.8e-05  456_[+3]_32
41508                             2.1e-05  432_[+3]_56
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=7
46297                    (  136) CAAACGCAAGGA  1
32251                    (  118) CAAACCAGAGGA  1
27447                    (  231) CAAACGGGAGGA  1
54552                    (   60) AAAAGCGAAGGA  1
34031                    (  311) CAAACGCAAGAA  1
45119                    (  457) GAAAGCAAAGGC  1
41508                    (  433) AAAATCAAAGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 8.93074 E= 1.6e+003
     7    132    -77   -945
   187   -945   -945   -945
   187   -945   -945   -945
   187   -945   -945   -945
  -945    132     23    -84
  -945    132     81   -945
    65     32     23   -945
   139   -945     23   -945
   187   -945   -945   -945
  -945   -945    203   -945
   -93   -945    181   -945
   139    -67   -945    -84
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 1.6e+003
 0.285714  0.571429  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.571429  0.285714  0.142857
 0.000000  0.571429  0.428571  0.000000
 0.428571  0.285714  0.285714  0.000000
 0.714286  0.000000  0.285714  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.714286  0.142857  0.000000  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CA]AAA[CG][CG][ACG][AG]AGGA
--------------------------------------------------------------------------------




Time  1.51 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41508                            1.55e-04  219_[+1(3.66e-07)]_200_\
    [+3(2.13e-05)]_56
45119                            3.70e-08  15_[+2(1.93e-08)]_321_\
    [+1(3.24e-06)]_88_[+3(1.83e-05)]_32
27447                            2.34e-09  100_[+2(2.45e-09)]_111_\
    [+3(2.10e-06)]_48_[+1(1.11e-05)]_197
46297                            5.39e-12  135_[+3(3.74e-07)]_31_\
    [+2(1.00e-10)]_270_[+1(2.25e-06)]_20
54552                            2.68e-05  59_[+3(3.96e-06)]_264_\
    [+1(2.45e-05)]_109_[+1(4.61e-07)]_30
32251                            3.54e-06  117_[+3(6.07e-07)]_295_\
    [+1(1.42e-07)]_63
34031                            1.96e-10  310_[+3(7.23e-06)]_92_\
    [+1(2.12e-07)]_17_[+2(2.57e-09)]_37
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************