********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/296/296.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37077                    1.0000    500  47790                    1.0000    500
43349                    1.0000    500  55070                    1.0000    500
25856                    1.0000    500  10972                    1.0000    500
11402                    1.0000    500  34681                    1.0000    500
46174                    1.0000    500  42838                    1.0000    500
48652                    1.0000    500  49944                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/296/296.seqs.fa -oc motifs/296 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.235 G 0.238 T 0.266
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.236 G 0.238 T 0.266
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 12 llr = 118 E-value = 1.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :88982:79a:5
pos.-specific     C  3:2::43:::5:
probability       G  82:1:283::15
matrix            T  ::::33::1:4:

         bits    2.1
                 1.9          *
                 1.7          *
                 1.5    *    **
Relative         1.3 ****  * **
Entropy          1.0 ***** **** *
(14.1 bits)      0.8 ***** **** *
                 0.6 ***** ******
                 0.4 ***** ******
                 0.2 ************
                 0.0 ------------

Multilevel           GAAAACGAAACA
consensus            C   TTCG  TG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
48652                        58  4.65e-07 CCAGACTATC GAAAACGGAACA GTCAAAATCA
46174                       435  1.24e-06 AGAACATCAA GAAAATGAAATA GTGGAATCGT
25856                        78  2.88e-06 GAGATGGTGT GACAACGAAACA GCTTCCACGT
47790                       468  7.01e-06 TGCTGAAATT GAAATTGAAATA GAAAAGCCGT
43349                       111  9.67e-06 TTTGGTAGTG GGAAAAGAAACG CACAAAGGGA
37077                       181  1.07e-05 CATACATTGA GAAGACGAAATG CTGTTGATAG
34681                        58  1.18e-05 ACCAAGTATA GAAAAGGAAAGG CACAACGGTA
55070                        66  1.40e-05 TGAGACGGCA GAAATCCGAACG CGGAGAAATG
49944                       104  2.82e-05 TGCTAGAGAG GGAAACCGAATA GAAGGTAGAT
10972                       190  3.44e-05 TTTCGTATGT CAAATAGGAACG CGAAAAATCT
11402                       489  5.60e-05 TAAGCTATAA CAAAAGGATACA
42838                       437  6.22e-05 GAGGATTTCT CACAATCAAATG GAAAATTACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48652                             4.6e-07  57_[+1]_431
46174                             1.2e-06  434_[+1]_54
25856                             2.9e-06  77_[+1]_411
47790                               7e-06  467_[+1]_21
43349                             9.7e-06  110_[+1]_378
37077                             1.1e-05  180_[+1]_308
34681                             1.2e-05  57_[+1]_431
55070                             1.4e-05  65_[+1]_423
49944                             2.8e-05  103_[+1]_385
10972                             3.4e-05  189_[+1]_299
11402                             5.6e-05  488_[+1]
42838                             6.2e-05  436_[+1]_52
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=12
48652                    (   58) GAAAACGGAACA  1
46174                    (  435) GAAAATGAAATA  1
25856                    (   78) GACAACGAAACA  1
47790                    (  468) GAAATTGAAATA  1
43349                    (  111) GGAAAAGAAACG  1
37077                    (  181) GAAGACGAAATG  1
34681                    (   58) GAAAAGGAAAGG  1
55070                    (   66) GAAATCCGAACG  1
49944                    (  104) GGAAACCGAATA  1
10972                    (  190) CAAATAGGAACG  1
11402                    (  489) CAAAAGGATACA  1
42838                    (  437) CACAATCAAATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.37898 E= 1.4e+001
 -1023      9    166  -1023
   167  -1023    -51  -1023
   167    -50  -1023  -1023
   181  -1023   -151  -1023
   152  -1023  -1023     -9
   -65     82    -51     -9
 -1023      9    166  -1023
   135  -1023     49  -1023
   181  -1023  -1023   -167
   194  -1023  -1023  -1023
 -1023    109   -151     65
    94  -1023    107  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 1.4e+001
 0.000000  0.250000  0.750000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.916667  0.000000  0.083333  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.166667  0.416667  0.166667  0.250000
 0.000000  0.250000  0.750000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.916667  0.000000  0.000000  0.083333
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.083333  0.416667
 0.500000  0.000000  0.500000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GC]AAA[AT][CT][GC][AG]AA[CT][AG]
--------------------------------------------------------------------------------

Time 1.50 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 8 llr = 105 E-value = 8.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3:6::6:31113:a39
pos.-specific     C  343a:3::9956a:51
probability       G  561:a161::31::3:
matrix            T  ::::::46::1:::::

         bits    2.1    **       *
                 1.9    **       **
                 1.7    **       **
                 1.5    **   **  ** *
Relative         1.3    **   **  ** *
Entropy          1.0  * ** * **  ** *
(19.0 bits)      0.8  * ** * ** *** *
                 0.6 ********** *****
                 0.4 ********** *****
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGACGAGTCCCCCACA
consensus            ACC  CTA  GA  A
sequence             C             G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ----------------
47790                       379  1.50e-08 GAAGCCTACT CCACGAGTCCCCCAAA TTATCTGATT
43349                       442  4.83e-08 GGACTTTCTG GGACGAGTCCTACACA CTCGTCGTCA
48652                        37  5.71e-08 TCAACGAGTC ACACGAGTCCGCCAGA CTATCGAAAA
10972                       262  1.02e-07 ATGTAACAGA GGCCGCTTCCCCCAAA GGGAGGAGCA
55070                       210  5.38e-07 TGTGCAAACC AGACGAGACCCACACC ATCAATCCAC
25856                       145  7.50e-07 ACCGTCAACC GCACGATTACCGCACA TCACACCTCA
37077                        10  3.34e-06  TTCGCATTC CGCCGCTGCCACCAGA AGAAGAGCCG
11402                       212  3.50e-06 ACTGTGAGGC GGGCGGGACAGCCACA TGCCAGTAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47790                             1.5e-08  378_[+2]_106
43349                             4.8e-08  441_[+2]_43
48652                             5.7e-08  36_[+2]_448
10972                               1e-07  261_[+2]_223
55070                             5.4e-07  209_[+2]_275
25856                             7.5e-07  144_[+2]_340
37077                             3.3e-06  9_[+2]_475
11402                             3.5e-06  211_[+2]_273
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=8
47790                    (  379) CCACGAGTCCCCCAAA  1
43349                    (  442) GGACGAGTCCTACACA  1
48652                    (   37) ACACGAGTCCGCCAGA  1
10972                    (  262) GGCCGCTTCCCCCAAA  1
55070                    (  210) AGACGAGACCCACACC  1
25856                    (  145) GCACGATTACCGCACA  1
37077                    (   10) CGCCGCTGCCACCAGA  1
11402                    (  212) GGGCGGGACAGCCACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 10.2426 E= 8.7e+001
    -6      9    107   -965
  -965     67    139   -965
   126      9    -93   -965
  -965    208   -965   -965
  -965   -965    207   -965
   126      9    -93   -965
  -965   -965    139     50
    -6   -965    -93    123
  -106    189   -965   -965
  -106    189   -965   -965
  -106    109      7   -109
    -6    141    -93   -965
  -965    208   -965   -965
   194   -965   -965   -965
    -6    109      7   -965
   174    -91   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 8.7e+001
 0.250000  0.250000  0.500000  0.000000
 0.000000  0.375000  0.625000  0.000000
 0.625000  0.250000  0.125000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.625000  0.250000  0.125000  0.000000
 0.000000  0.000000  0.625000  0.375000
 0.250000  0.000000  0.125000  0.625000
 0.125000  0.875000  0.000000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.125000  0.500000  0.250000  0.125000
 0.250000  0.625000  0.125000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.500000  0.250000  0.000000
 0.875000  0.125000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GAC][GC][AC]CG[AC][GT][TA]CC[CG][CA]CA[CAG]A
--------------------------------------------------------------------------------

Time 2.95 secs.

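The regular expression reported for Motif 2 can be used for a quick scan of a sequence. The following is only a minimal sketch: the helper name `find_sites` is hypothetical and not part of MEME, and the test window is the flank-site-flank text listed for sequence 47790 in the Motif 2 sites table. The regular expression is a simplification of the probability matrix, so it can miss sites that MEME itself reports (for example the 11402 site, whose third letter is G).

```python
import re

# Motif 2 regular expression, copied from the MEME output.
MOTIF2_REGEX = r"[GAC][GC][AC]CG[AC][GT][TA]CC[CG][CA]CA[CAG]A"


def find_sites(sequence, regex=MOTIF2_REGEX):
    """Return (1-based start, matched text) for every match, including overlaps.

    Hypothetical helper for illustration; only the given strand is scanned,
    matching the 'strands: +' setting of this run.
    """
    # A zero-width lookahead lets overlapping matches be reported.
    pattern = re.compile(f"(?=({regex}))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Left flank + site + right flank for sequence 47790 (Motif 2 sites table).
    window = "GAAGCCTACT" + "CCACGAGTCCCCCAAA" + "TTATCTGATT"
    print(find_sites(window))
    # The 11402 site reported by MEME is not matched by the simplified regex.
    print(find_sites("GGGCGGGACAGCCACA"))
```

Positions returned by the sketch are relative to the string passed in, not to the full 500-letter sequence.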
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 9 llr = 130 E-value = 2.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::1717::18:::23:261:1
pos.-specific     C  224:92::11321:1a3141:
probability       G  :::1::a:317:64::43421
matrix            T  8842:1:a4::8336::::78

         bits    2.1       *        *
                 1.9       **       *
                 1.7     * **       *
                 1.5     * **       *
Relative         1.3 **  * **  **   *
Entropy          1.0 **  * ** ***   *    *
(20.9 bits)      0.8 **  **** ***   *   **
                 0.6 ******** **** ** ****
                 0.4 ******** ************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TTCACAGTTAGTGGTCGACTT
consensus            CCTT C  G CCTTA CGGG
sequence                          A  A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
49944                       146  1.34e-08 CGGAATTCCA TTCACAGTAGGTGTTCGGGTT CGTGCACGCA
47790                       338  2.08e-08 TTCTCTTTTA TTTTCAGTTAGCGTACAGCTT CACAGTAAAC
34681                       259  3.14e-08 CCGGCAGCGA TTCACCGTCAGTGATCGACCT TATCCTATTG
42838                       278  5.59e-08 AGGTGGATCG CTAACAGTGAGTGAACCAGGT AGTTCTAGGC
48652                       162  1.04e-07 CTTCGAAGAA TTTACAGTTACTCGACGACGA ACACGTGTCT
10972                        88  1.13e-07 ATTAAATTAT TTTACTGTTAGTTGTCGAATG CGGATCTGAC
11402                       325  1.23e-07 TGTTCCGCAT TCCAAAGTGAGCTTTCCAGTT TCCCTCTAAA
25856                       214  1.34e-07 TTTGCGAACA CTCGCAGTTCCTGGTCCGGTT TGAGTTTTCC
55070                       125  1.07e-06 GAGAAAGGTT TCTTCCGTGACTTGCCACCTT TCTCACAAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49944                             1.3e-08  145_[+3]_334
47790                             2.1e-08  337_[+3]_142
34681                             3.1e-08  258_[+3]_221
42838                             5.6e-08  277_[+3]_202
48652                               1e-07  161_[+3]_318
10972                             1.1e-07  87_[+3]_392
11402                             1.2e-07  324_[+3]_155
25856                             1.3e-07  213_[+3]_266
55070                             1.1e-06  124_[+3]_355
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=9
49944                    (  146) TTCACAGTAGGTGTTCGGGTT  1
47790                    (  338) TTTTCAGTTAGCGTACAGCTT  1
34681                    (  259) TTCACCGTCAGTGATCGACCT  1
42838                    (  278) CTAACAGTGAGTGAACCAGGT  1
48652                    (  162) TTTACAGTTACTCGACGACGA  1
10972                    (   88) TTTACTGTTAGTTGTCGAATG  1
11402                    (  325) TCCAAAGTGAGCTTTCCAGTT  1
25856                    (  214) CTCGCAGTTCCTGGTCCGGTT  1
55070                    (  125) TCTTCCGTGACTTGCCACCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 9.45417 E= 2.5e+002
  -982     -8   -982    155
  -982     -8   -982    155
  -123     92   -982     74
   135   -982   -110    -26
  -123    192   -982   -982
   135     -8   -982   -126
  -982   -982    207   -982
  -982   -982   -982    191
  -123   -108     49     74
   157   -108   -110   -982
  -982     50    149   -982
  -982     -8   -982    155
  -982   -108    122     33
   -23   -982     90     33
    35   -108   -982    106
  -982    208   -982   -982
   -23     50     90   -982
   109   -108     49   -982
  -123     92     90   -982
  -982   -108    -10    133
  -123   -982   -110    155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 2.5e+002
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.222222  0.000000  0.777778
 0.111111  0.444444  0.000000  0.444444
 0.666667  0.000000  0.111111  0.222222
 0.111111  0.888889  0.000000  0.000000
 0.666667  0.222222  0.000000  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.111111  0.333333  0.444444
 0.777778  0.111111  0.111111  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.111111  0.555556  0.333333
 0.222222  0.000000  0.444444  0.333333
 0.333333  0.111111  0.000000  0.555556
 0.000000  1.000000  0.000000  0.000000
 0.222222  0.333333  0.444444  0.000000
 0.555556  0.111111  0.333333  0.000000
 0.111111  0.444444  0.444444  0.000000
 0.000000  0.111111  0.222222  0.666667
 0.111111  0.000000  0.111111  0.777778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TC][TC][CT][AT]C[AC]GT[TG]A[GC][TC][GT][GTA][TA]C[GCA][AG][CG][TG]T
--------------------------------------------------------------------------------

Time 4.64 secs.

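The log-odds and letter-probability matrices describe the same motif. For nonzero probabilities the reported scores are consistent with 100 × log2(p / background), rounded to an integer. The sketch below checks that relation on a few Motif 3 entries. The function name `log_odds_score` is hypothetical, the background values are the three-decimal figures printed in the command line summary, and entries with probability zero (reported as -982 here) are deliberately left out because their value depends on internal handling not shown in this file.

```python
import math

# Background letter frequencies as printed in the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.261, "C": 0.236, "G": 0.238, "T": 0.266}


def log_odds_score(probability, letter, background=BACKGROUND):
    """Approximate a log-odds matrix entry as round(100 * log2(p / background)).

    Hypothetical helper; only meaningful for probability > 0.
    """
    return round(100 * math.log2(probability / background[letter]))


# (motif position, letter, probability, score reported in the log-odds matrix)
MOTIF3_ENTRIES = [
    (1, "T", 0.777778, 155),
    (1, "C", 0.222222, -8),
    (7, "G", 1.000000, 207),
    (8, "T", 1.000000, 191),
    (16, "C", 1.000000, 208),
]

if __name__ == "__main__":
    for position, letter, p, reported in MOTIF3_ENTRIES:
        estimate = log_odds_score(p, letter)
        print(position, letter, estimate, reported)
```

Small differences of one unit between the estimate and the reported score are expected, since the printed background frequencies are rounded.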
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37077                            6.25e-04  9_[+2(3.34e-06)]_155_[+1(1.07e-05)]_\
    308
47790                            1.12e-10  337_[+3(2.08e-08)]_20_\
    [+2(1.50e-08)]_73_[+1(7.01e-06)]_21
43349                            1.45e-05  110_[+1(9.67e-06)]_7_[+1(9.66e-05)]_\
    300_[+2(4.83e-08)]_43
55070                            2.18e-07  65_[+1(1.40e-05)]_47_[+3(1.07e-06)]_\
    64_[+2(5.38e-07)]_275
25856                            1.04e-08  77_[+1(2.88e-06)]_55_[+2(7.50e-07)]_\
    53_[+3(1.34e-07)]_266
10972                            1.38e-08  87_[+3(1.13e-07)]_81_[+1(3.44e-05)]_\
    60_[+2(1.02e-07)]_223
11402                            5.83e-07  211_[+2(3.50e-06)]_97_\
    [+3(1.23e-07)]_143_[+1(5.60e-05)]
34681                            1.00e-05  57_[+1(1.18e-05)]_189_\
    [+3(3.14e-08)]_221
46174                            3.52e-04  320_[+2(1.41e-05)]_98_\
    [+1(1.24e-06)]_54
42838                            3.02e-05  277_[+3(5.59e-08)]_138_\
    [+1(6.22e-05)]_52
48652                            1.40e-10  36_[+2(5.71e-08)]_5_[+1(4.65e-07)]_\
    92_[+3(1.04e-07)]_318
49944                            1.26e-05  103_[+1(2.82e-05)]_30_\
    [+3(1.34e-08)]_334
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
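Each combined block diagram alternates spacer lengths with motif occurrences of the form `[+motif(p-value)]`, so site start positions can be recovered by walking the diagram and adding up spacers and motif widths. The following is a minimal sketch under stated assumptions: `parse_diagram` and `MOTIF_WIDTHS` are hypothetical names, the widths are taken from the three MOTIF header lines of this run, and the parser only handles the diagram syntax that appears in this file.

```python
import re

# Motif widths from the MOTIF header lines of this run (12, 16 and 21).
MOTIF_WIDTHS = {1: 12, 2: 16, 3: 21}

# One motif occurrence, e.g. "[+2(5.71e-08)]".
SITE_TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, sequence_length) for one combined block diagram.

    Each site is (motif number, strand, 1-based start, p-value).
    Hypothetical helper for illustration only.
    """
    # Drop the "\" line continuations and any whitespace used for wrapping.
    flat = re.sub(r"[\\\s]", "", diagram)
    position = 1
    sites = []
    for token in flat.split("_"):
        if not token:
            continue
        match = SITE_TOKEN.fullmatch(token)
        if match:
            strand, motif, pvalue = match.group(1), int(match.group(2)), match.group(3)
            sites.append((motif, strand, position, float(pvalue)))
            position += widths[motif]
        else:
            # A plain number is a spacer of that many letters.
            position += int(token)
    return sites, position - 1


if __name__ == "__main__":
    # Diagram for sequence 48652, copied from the summary table.
    diagram = "36_[+2(5.71e-08)]_5_[+1(4.65e-07)]_\\\n    92_[+3(1.04e-07)]_318"
    sites, length = parse_diagram(diagram)
    for site in sites:
        print(site)
    print("sequence length:", length)
```

For sequence 48652 the recovered starts agree with the Start column of the per-motif sites tables, and the total length agrees with the 500 listed in the training set.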