********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/344/344.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11243                    1.0000    500  11253                    1.0000    500
11725                    1.0000    500  23218                    1.0000    500
25550                    1.0000    500  264091                   1.0000    500
7320                     1.0000    500  7630                     1.0000    500
7851                     1.0000    500  8233                     1.0000    500
8234                     1.0000    500  8838                     1.0000    500
8994                     1.0000    500  919                      1.0000    500
9882                     1.0000    500  bd1076                   1.0000    500
bd153                    1.0000    500  bd816                    1.0000    500
bd933                    1.0000    500  bd943                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/344/344.seqs.fa -oc motifs/344 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10000    N=              20
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.233 G 0.230 T 0.272
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.233 G 0.230 T 0.272
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  11  llr = 140  E-value = 7.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :173a6:455213:3:
pos.-specific     C  1:::::2:5::42:::
probability       G  89:7:286:48257:a
matrix            T  1:3::2:::2:4:37:

         bits    2.1                *
                 1.9     *          *
                 1.7  *  *          *
                 1.5  *  * *   *    *
Relative         1.3 ** ** *   *  * *
Entropy          1.1 ***** *** *  ***
(18.3 bits)      0.8 ***** *** *  ***
                 0.6 ********* * ****
                 0.4 *********** ****
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGAGAAGGAAGCGGTG
consensus              TA   ACG TATA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
9882                         26  5.56e-08 TGGCTAGGCA GGTGAGGGAGGCGGTG AATCGTTTTA
23218                       200  8.02e-08 TGGCGTAGGT GGAGATGGAGGCGTTG ACGATGCAGA
bd816                       259  2.54e-07 AAGAAAGTCC CGAGAAGGATGTGGTG CTACGGGCTG
bd153                        11  3.48e-07 GAGAGCAACT GGAAAAGGAAGGAGAG GCAGAGCAGC
11243                        57  4.62e-07 ACTCCTCGTG GGAAAAGGAGGACGTG GCGTTGATGC
11725                        57  6.05e-07 GTGGGAGCAA GGTAAAGGAAGCATTG GTCCACCCCT
8994                        441  7.20e-07 TATATGCGAG GGAGAGGACAAGGGTG CTTACTTCCA
8234                        391  9.39e-07 CCTATCTTCT GGAGAACACAGTCGAG CGCTTCCAAC
25550                        43  1.40e-06 AACATTCCTT TGAGAAGACTGCAGTG GACTTGTAAT
bd933                        95  1.50e-06 TGATGAGAGC GGTGATGACGATGGTG GTGCTTGCGA
8233                         63  3.82e-06 TGTTCGGATG GAAGAACGCAGTGTAG GCTAACCCTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9882                              5.6e-08  25_[+1]_459
23218                               8e-08  199_[+1]_285
bd816                             2.5e-07  258_[+1]_226
bd153                             3.5e-07  10_[+1]_474
11243                             4.6e-07  56_[+1]_428
11725                               6e-07  56_[+1]_428
8994                              7.2e-07  440_[+1]_44
8234                              9.4e-07  390_[+1]_94
25550                             1.4e-06  42_[+1]_442
bd933                             1.5e-06  94_[+1]_390
8233                              3.8e-06  62_[+1]_422
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=11
9882                     (   26) GGTGAGGGAGGCGGTG  1
23218                    (  200) GGAGATGGAGGCGTTG  1
bd816                    (  259) CGAGAAGGATGTGGTG  1
bd153                    (   11) GGAAAAGGAAGGAGAG  1
11243                    (   57) GGAAAAGGAGGACGTG  1
11725                    (   57) GGTAAAGGAAGCATTG  1
8994                     (  441) GGAGAGGACAAGGGTG  1
8234                     (  391) GGAGAACACAGTCGAG  1
25550                    (   43) TGAGAAGACTGCAGTG  1
bd933                    (   95) GGTGATGACGATGGTG  1
8233                     (   63) GAAGAACGCAGTGTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 10.1382 E= 7.9e+000
 -1010   -135    183   -158
  -155  -1010    198  -1010
   145  -1010  -1010      1
     4  -1010    166  -1010
   191  -1010  -1010  -1010
   126  -1010    -34    -58
 -1010    -35    183  -1010
    45  -1010    147  -1010
   104     97  -1010  -1010
    77  -1010     66    -58
   -55  -1010    183  -1010
  -155     64    -34     42
     4    -35    125  -1010
 -1010  -1010    166      1
     4  -1010  -1010    142
 -1010  -1010    212  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 7.9e+000
 0.000000  0.090909  0.818182  0.090909
 0.090909  0.000000  0.909091  0.000000
 0.727273  0.000000  0.000000  0.272727
 0.272727  0.000000  0.727273  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.636364  0.000000  0.181818  0.181818
 0.000000  0.181818  0.818182  0.000000
 0.363636  0.000000  0.636364  0.000000
 0.545455  0.454545  0.000000  0.000000
 0.454545  0.000000  0.363636  0.181818
 0.181818  0.000000  0.818182  0.000000
 0.090909  0.363636  0.181818  0.363636
 0.272727  0.181818  0.545455  0.000000
 0.000000  0.000000  0.727273  0.272727
 0.272727  0.000000  0.000000  0.727273
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GG[AT][GA]AAG[GA][AC][AG]G[CT][GA][GT][TA]G
--------------------------------------------------------------------------------

Time  3.72 secs.

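The log-odds entries for Motif 1 appear to be the letter probabilities divided by the background frequencies, taken as log base 2, multiplied by 100 and rounded. The sketch below checks that reading against a few entries; `log_odds` is a hypothetical helper, not part of MEME, and MEME's own values also reflect pseudocounts, so zero-probability cells (printed as -1010 here) and some small entries can differ by a unit.

```python
import math

# Background frequencies copied from the "Background letter frequencies" line.
BACKGROUND = {"A": 0.266, "C": 0.233, "G": 0.230, "T": 0.272}


def log_odds(prob, letter):
    """Hypothetical helper: round(100 * log2(prob / background))."""
    if prob <= 0.0:
        # MEME prints a large negative score for these cells instead.
        return None
    return round(100 * math.log2(prob / BACKGROUND[letter]))


# Position 16 (G = 1.000000), position 5 (A = 1.000000), position 1 (G = 0.818182).
print(log_odds(1.0, "G"))       # 212
print(log_odds(1.0, "A"))       # 191
print(log_odds(0.818182, "G"))  # 183
```

The three printed values agree with the corresponding entries of the Motif 1 log-odds matrix.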
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   8  llr = 126  E-value = 9.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a3a:5963991848:5:98:
pos.-specific     C  :6:5::36::913:1391:9
probability       G  :1:45:11:::13::1::3:
matrix            T  :::1:1::11::13911::1

         bits    2.1
                 1.9 * *
                 1.7 * *
                 1.5 * *       *     ** *
Relative         1.3 * *  *  ***   * ** *
Entropy          1.1 * * **  ***  ** ****
(22.6 bits)      0.8 *** ** ***** ** ****
                 0.6 ************ ** ****
                 0.4 ************ ** ****
                 0.2 ************ *******
                 0.0 --------------------

Multilevel           ACACAAACAACAAATACAAC
consensus             A GG CA    CT C  G
sequence                         G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
bd153                       136  1.28e-10 CCTCGTTCTC ACACAAACAACACTTACAAC CAATCCTGCA
bd1076                       67  4.77e-10 GTCATCTAAC AAAGGACCAACAGATACAAC TCCTGCGAGT
11725                       346  4.61e-09 AGTGCTGACG ACACGAACAACACATCTAGC TCATTAATTG
11253                       472  2.13e-08 GATGAAAATG AGAGGAAGAACATATCCAAC TGTTACAAG
9882                        410  1.01e-07 TCTCAAATGC ACACGTCAAACCAATTCAAC AACGACACCC
bd933                       396  1.23e-07 ATGAAAGAAC ACATAAACTACGAATACCAC CGACATTCAT
8994                         50  1.41e-07 CCAACTCGTA ACACAAGCAAAAGTTACAAT CAATATCAAC
8838                        236  3.29e-07 ACTGTTCTTC AAAGAAAAATCAAACGCAGC TGTTGTCGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd153                             1.3e-10  135_[+2]_345
bd1076                            4.8e-10  66_[+2]_414
11725                             4.6e-09  345_[+2]_135
11253                             2.1e-08  471_[+2]_9
9882                                1e-07  409_[+2]_71
bd933                             1.2e-07  395_[+2]_85
8994                              1.4e-07  49_[+2]_431
8838                              3.3e-07  235_[+2]_245
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=8
bd153                    (  136) ACACAAACAACACTTACAAC  1
bd1076                   (   67) AAAGGACCAACAGATACAAC  1
11725                    (  346) ACACGAACAACACATCTAGC  1
11253                    (  472) AGAGGAAGAACATATCCAAC  1
9882                     (  410) ACACGTCAAACCAATTCAAC  1
bd933                    (  396) ACATAAACTACGAATACCAC  1
8994                     (   50) ACACAAGCAAAAGTTACAAT  1
8838                     (  236) AAAGAAAAATCAAACGCAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 9620 bayes= 10.9681 E= 9.4e+001
   191   -965   -965   -965
    -9    142    -88   -965
   191   -965   -965   -965
  -965    110     71   -112
    91   -965    112   -965
   172   -965   -965   -112
   123     10    -88   -965
    -9    142    -88   -965
   172   -965   -965   -112
   172   -965   -965   -112
  -109    191   -965   -965
   149    -89    -88   -965
    50     10     12   -112
   149   -965   -965    -12
  -965    -89   -965    169
    91     10    -88   -112
  -965    191   -965   -112
   172    -89   -965   -965
   149   -965     12   -965
  -965    191   -965   -112
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 9.4e+001
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.625000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.375000  0.125000
 0.500000  0.000000  0.500000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.625000  0.250000  0.125000  0.000000
 0.250000  0.625000  0.125000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.875000  0.000000  0.000000  0.125000
 0.125000  0.875000  0.000000  0.000000
 0.750000  0.125000  0.125000  0.000000
 0.375000  0.250000  0.250000  0.125000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.125000  0.000000  0.875000
 0.500000  0.250000  0.125000  0.125000
 0.000000  0.875000  0.000000  0.125000
 0.875000  0.125000  0.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.875000  0.000000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
A[CA]A[CG][AG]A[AC][CA]AACA[ACG][AT]T[AC]CA[AG]C
--------------------------------------------------------------------------------

Time  7.28 secs.

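The Motif 2 regular expression can be used for a quick scan of a sequence on the given strand. The following is a minimal sketch, not MEME's or MAST's own search: `find_sites` is a hypothetical helper, and because the expression is a simplification of the probability matrix it may miss sites that MEME reports or match sites that score poorly.

```python
import re

# Regular expression copied from the "Motif 2 regular expression" section.
MOTIF2_REGEX = "A[CA]A[CG][AG]A[AC][CA]AACA[ACG][AT]T[AC]CA[AG]C"


def find_sites(sequence, regex=MOTIF2_REGEX):
    """Return 1-based start positions of all matches, overlapping ones included."""
    # Wrapping the motif in a lookahead lets finditer report overlapping hits.
    pattern = re.compile("(?=(" + regex + "))")
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


# The bd153 site with its two 10-letter flanks, as listed in the sites table.
fragment = "CCTCGTTCTC" + "ACACAAACAACACTTACAAC" + "CAATCCTGCA"
print(find_sites(fragment))  # [11]
```

Positions are relative to the fragment passed in, so the hit at 11 corresponds to start 136 in the full bd153 sequence.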
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   7  llr = 120  E-value = 1.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::1::4:::731:::3::
pos.-specific     C  1:34:113:3:733:::::::
probability       G  9414:37161:3::931a::a
matrix            T  :661a416:6a::4:79:7a:

         bits    2.1                  *  *
                 1.9     *     *      * **
                 1.7     *     *      * **
                 1.5 *   *     *   *  * **
Relative         1.3 *   *     **  * ** **
Entropy          1.1 **  *   * *** *******
(24.6 bits)      0.8 **  * * * *** *******
                 0.6 ***** ******* *******
                 0.4 ***** ***************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GTTCTTGTGTTCATGTTGTTG
consensus             GCG G CAC GCA G  A
sequence                          C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
919                          44  1.51e-10 AGTTGCTTTG GTTCTAGTGCTCATGTTGTTG TTGCTTCAAT
11243                       113  2.48e-10 CTTGGACCGA GTTGTTGTACTCATGTTGATG GTGGCCGAAA
8838                        381  4.54e-09 TGCACTGTGT GTTCTTGGGGTGAAGTTGTTG CCTTTGGCGT
bd933                       178  7.99e-09 TGAATCGATT GGCGTCTTGTTCATGGTGTTG GATTGAGTGA
23218                       111  2.08e-08 AGTGATGGGA CGGGTGGCGTTGACGTTGTTG ATGATGTTGA
7320                         55  2.23e-08 ACTCCGTATC GGTCTGGCATTCCCATTGATG CCACCCCCGA
11725                       241  9.21e-08 GGATATGAAG GTCTTTCTATTCCAGGGGTTG GATATTGATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
919                               1.5e-10  43_[+3]_436
11243                             2.5e-10  112_[+3]_367
8838                              4.5e-09  380_[+3]_99
bd933                               8e-09  177_[+3]_302
23218                             2.1e-08  110_[+3]_369
7320                              2.2e-08  54_[+3]_425
11725                             9.2e-08  240_[+3]_239
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=7
919                      (   44) GTTCTAGTGCTCATGTTGTTG  1
11243                    (  113) GTTGTTGTACTCATGTTGATG  1
8838                     (  381) GTTCTTGGGGTGAAGTTGTTG  1
bd933                    (  178) GGCGTCTTGTTCATGGTGTTG  1
23218                    (  111) CGGGTGGCGTTGACGTTGTTG  1
7320                     (   55) GGTCTGGCATTCCCATTGATG  1
11725                    (  241) GTCTTTCTATTCCAGGGGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9600 bayes= 11.0265 E= 1.2e+002
  -945    -70    190   -945
  -945   -945     90    107
  -945     30    -69    107
  -945     88     90    -92
  -945   -945   -945    188
   -89    -70     31     66
  -945    -70    163    -92
  -945     30    -69    107
    69   -945    131   -945
  -945     30    -69    107
  -945   -945   -945    188
  -945    162     31   -945
   142     30   -945   -945
    10     30   -945     66
   -89   -945    190   -945
  -945   -945     31    139
  -945   -945    -69    166
  -945   -945    212   -945
    10   -945   -945    139
  -945   -945   -945    188
  -945   -945    212   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 1.2e+002
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.285714  0.142857  0.571429
 0.000000  0.428571  0.428571  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.142857  0.285714  0.428571
 0.000000  0.142857  0.714286  0.142857
 0.000000  0.285714  0.142857  0.571429
 0.428571  0.000000  0.571429  0.000000
 0.000000  0.285714  0.142857  0.571429
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.714286  0.285714  0.000000
 0.714286  0.285714  0.000000  0.000000
 0.285714  0.285714  0.000000  0.428571
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.000000  1.000000  0.000000
 0.285714  0.000000  0.000000  0.714286
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[TG][TC][CG]T[TG]G[TC][GA][TC]T[CG][AC][TAC]G[TG]TG[TA]TG
--------------------------------------------------------------------------------

Time 10.83 secs.

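The "Relative Entropy" figure in the Motif 3 description can be approximated from the letter-probability matrix and the background frequencies by summing p * log2(p / b) over every position and letter. This is a sketch under that assumption, with `relative_entropy` as a hypothetical helper; the result should land close to the reported 24.6 bits, though MEME's own rounding may differ slightly.

```python
import math

# Background frequencies (A, C, G, T) from the command line summary.
BACKGROUND = (0.266, 0.233, 0.230, 0.272)

# Motif 3 letter-probability matrix: one row per position, columns A C G T.
MOTIF3 = [
    (0.000000, 0.142857, 0.857143, 0.000000),
    (0.000000, 0.000000, 0.428571, 0.571429),
    (0.000000, 0.285714, 0.142857, 0.571429),
    (0.000000, 0.428571, 0.428571, 0.142857),
    (0.000000, 0.000000, 0.000000, 1.000000),
    (0.142857, 0.142857, 0.285714, 0.428571),
    (0.000000, 0.142857, 0.714286, 0.142857),
    (0.000000, 0.285714, 0.142857, 0.571429),
    (0.428571, 0.000000, 0.571429, 0.000000),
    (0.000000, 0.285714, 0.142857, 0.571429),
    (0.000000, 0.000000, 0.000000, 1.000000),
    (0.000000, 0.714286, 0.285714, 0.000000),
    (0.714286, 0.285714, 0.000000, 0.000000),
    (0.285714, 0.285714, 0.000000, 0.428571),
    (0.142857, 0.000000, 0.857143, 0.000000),
    (0.000000, 0.000000, 0.285714, 0.714286),
    (0.000000, 0.000000, 0.142857, 0.857143),
    (0.000000, 0.000000, 1.000000, 0.000000),
    (0.285714, 0.000000, 0.000000, 0.714286),
    (0.000000, 0.000000, 0.000000, 1.000000),
    (0.000000, 0.000000, 1.000000, 0.000000),
]


def relative_entropy(matrix, background):
    """Sum of p * log2(p / b) over all positions and letters, in bits."""
    total = 0.0
    for row in matrix:
        for p, b in zip(row, background):
            if p > 0.0:  # zero-probability letters contribute nothing
                total += p * math.log2(p / b)
    return total


# Compare the printed value with "(24.6 bits)" in the Motif 3 description.
print(round(relative_entropy(MOTIF3, BACKGROUND), 2))
```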
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11243                            7.96e-09  56_[+1(4.62e-07)]_40_[+3(2.48e-10)]_\
    367
11253                            1.16e-04  471_[+2(2.13e-08)]_9
11725                            1.49e-11  [+3(8.69e-05)]_35_[+1(6.05e-07)]_\
    168_[+3(9.21e-08)]_84_[+2(4.61e-09)]_135
23218                            7.56e-08  110_[+3(2.08e-08)]_68_\
    [+1(8.02e-08)]_285
25550                            4.75e-03  42_[+1(1.40e-06)]_442
264091                           4.51e-01  500
7320                             2.76e-04  54_[+3(2.23e-08)]_425
7630                             7.59e-01  500
7851                             3.84e-01  500
8233                             1.11e-02  62_[+1(3.82e-06)]_422
8234                             4.83e-03  390_[+1(9.39e-07)]_94
8838                             3.98e-08  173_[+3(1.17e-05)]_41_\
    [+2(3.29e-07)]_125_[+3(4.54e-09)]_99
8994                             1.45e-06  49_[+2(1.41e-07)]_371_\
    [+1(7.20e-07)]_44
919                              2.25e-06  43_[+3(1.51e-10)]_436
9882                             8.31e-08  25_[+1(5.56e-08)]_368_\
    [+2(1.01e-07)]_71
bd1076                           9.04e-06  66_[+2(4.77e-10)]_414
bd153                            2.15e-09  10_[+1(3.48e-07)]_109_\
    [+2(1.28e-10)]_345
bd816                            2.07e-03  258_[+1(2.54e-07)]_226
bd933                            7.69e-11  94_[+1(1.50e-06)]_67_[+3(7.99e-09)]_\
    197_[+2(1.23e-07)]_85
bd943                            1.35e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
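A combined block diagram alternates spacer lengths with bracketed motif occurrences, so start positions can be recovered by walking the string and adding up spacers and motif widths. The parser below is a minimal sketch of that reading; `parse_diagram` and `WIDTHS` are hypothetical names, the widths are taken from the MOTIF header lines, and the handling of the trailing-backslash line wrap is an assumption based on how the diagrams appear in this file.

```python
import re

# Motif widths from the "MOTIF n MEME width = ..." header lines.
WIDTHS = {1: 16, 2: 20, 3: 21}


def parse_diagram(diagram, widths):
    """Return ([(motif, strand, start, p_value), ...], total_length)."""
    # Rejoin diagrams that were wrapped with a trailing backslash.
    text = re.sub(r"\\\s*", "", diagram.strip())
    sites = []
    position = 0  # number of letters consumed so far
    for token in text.split("_"):
        hit = re.fullmatch(r"\[([+-])(\d+)\(([^)]+)\)\]", token)
        if hit:
            motif = int(hit.group(2))
            sites.append((motif, hit.group(1), position + 1, float(hit.group(3))))
            position += widths[motif]
        elif token:
            position += int(token)  # spacer: letters not covered by a motif
    return sites, position


# Diagram for sequence 11243, including its wrapped continuation line.
sites, length = parse_diagram(
    "56_[+1(4.62e-07)]_40_[+3(2.48e-10)]_\\\n    367", WIDTHS
)
print(sites)   # [(1, '+', 57, 4.62e-07), (3, '+', 113, 2.48e-10)]
print(length)  # 500
```

The recovered starts (57 and 113) match the Start column for sequence 11243 in the Motif 1 and Motif 3 site tables, and the total matches the sequence length of 500.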