********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/117/117.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11352                    1.0000    500  12132                    1.0000    500
24755                    1.0000    500  263830                   1.0000    500
263834                   1.0000    500  264694                   1.0000    500
264753                   1.0000    500  264757                   1.0000    500
270031                   1.0000    500  36689                    1.0000    500
36702                    1.0000    500  38597                    1.0000    500
38760                    1.0000    500  8614                     1.0000    500
869                      1.0000    500  9277                     1.0000    500
bd1756                   1.0000     90  bd1839                   1.0000    500
bd1840                   1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/117/117.seqs.fa -oc motifs/117 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 19 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 9090 N= 19 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.279 C 0.235 G 0.218 T 0.268 Background letter frequencies (from dataset with add-one prior applied): A 0.279 C 0.235 G 0.218 T 0.268 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 21 sites = 12 llr = 198 E-value = 2.0e-013 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 314a:8:72:87:96:82361 pos.-specific C ::5::25:8:1:2:25::3:: probability G :91:a:2::a:28:33:534: matrix T 8::::133::12:1:3231:9 bits 2.2 * * 2.0 * * 1.8 * ** * 1.5 * ** * ** * Relative 1.3 * ** ** ** * Entropy 1.1 ** ** *** ** * ** (23.8 bits) 0.9 ** *** **** ** * ** 0.7 ****************** ** 0.4 ****************** ** 0.2 ********************* 0.0 --------------------- Multilevel TGCAGACACGAAGAACAGCAT consensus A A TT GG TGG sequence T A -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 264757 79 
6.87e-10 CGGAGTTGTA TGCAGACACGATGAACTGCAT CCACTACTCA 263834 79 6.87e-10 CGGAGTTGTA TGCAGACACGATGAACTGCAT CCACTACTCA 264753 219 7.78e-10 GTGGTGCAAC TGAAGACTCGAAGAGTATGGT GACCAAGTGA 263830 219 7.78e-10 GTGGTGCAAC TGAAGACTCGAAGAGTATGGT GACCAAGTGA 12132 460 1.82e-09 CCGTTGAGAT TGCAGCGACGAAGAAGATGGT GCACCTCCAC 38597 296 7.05e-09 ACGTACAATA TGCAGATTAGAACAACAGCAT TTTCACATCA 36702 296 7.05e-09 ACGTACAATA TGCAGATTAGAACAACAGCAT TTTCACATCA 38760 396 1.58e-08 GGTTACTCCA AGAAGATACGAAGACGAAAAT GCAAACGCCG 36689 396 1.58e-08 GGTTACTCCA AGAAGATACGAAGACGAAAAT GCAAACGCCG 264694 31 1.74e-07 CCAAGATCTT TAGAGAGACGAGGAGTAGAGT CTAAGAGCTC 869 262 1.84e-07 GTTCCGAATC AGAAGTCACGTGGAACAGTGT GGAAGTGAAC bd1840 208 2.16e-07 GTGGTTGAGA TGCAGCCACGCAGTACATGAA GCCCCACCAG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 264757 6.9e-10 78_[+1]_401 263834 6.9e-10 78_[+1]_401 264753 7.8e-10 218_[+1]_261 263830 7.8e-10 218_[+1]_261 12132 1.8e-09 459_[+1]_20 38597 7.1e-09 295_[+1]_184 36702 7.1e-09 295_[+1]_184 38760 1.6e-08 395_[+1]_84 36689 1.6e-08 395_[+1]_84 264694 1.7e-07 30_[+1]_449 869 1.8e-07 261_[+1]_218 bd1840 2.2e-07 207_[+1]_272 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=12 264757 ( 79) TGCAGACACGATGAACTGCAT 1 263834 ( 79) TGCAGACACGATGAACTGCAT 1 264753 ( 219) TGAAGACTCGAAGAGTATGGT 1 263830 ( 219) TGAAGACTCGAAGAGTATGGT 1 12132 ( 460) TGCAGCGACGAAGAAGATGGT 1 38597 ( 296) TGCAGATTAGAACAACAGCAT 1 36702 ( 296) TGCAGATTAGAACAACAGCAT 1 38760 ( 396) 
AGAAGATACGAAGACGAAAAT 1 36689 ( 396) AGAAGATACGAAGACGAAAAT 1 264694 ( 31) TAGAGAGACGAGGAGTAGAGT 1 869 ( 262) AGAAGTCACGTGGAACAGTGT 1 bd1840 ( 208) TGCAGCCACGCAGTACATGAA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 8710 bayes= 9.15994 E= 2.0e-013 -16 -1023 -1023 149 -174 -1023 207 -1023 58 109 -139 -1023 184 -1023 -1023 -1023 -1023 -1023 219 -1023 143 -50 -1023 -168 -1023 109 -39 32 126 -1023 -1023 32 -74 182 -1023 -1023 -1023 -1023 219 -1023 158 -150 -1023 -168 126 -1023 -39 -68 -1023 -50 193 -1023 172 -1023 -1023 -168 107 -50 19 -1023 -1023 109 19 -10 158 -1023 -1023 -68 -74 -1023 119 32 -16 50 61 -168 107 -1023 93 -1023 -174 -1023 -1023 177 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 12 E= 2.0e-013 0.250000 0.000000 0.000000 0.750000 0.083333 0.000000 0.916667 0.000000 0.416667 0.500000 0.083333 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.750000 0.166667 0.000000 0.083333 0.000000 0.500000 0.166667 0.333333 0.666667 0.000000 0.000000 0.333333 0.166667 0.833333 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.833333 0.083333 0.000000 0.083333 0.666667 0.000000 0.166667 0.166667 0.000000 0.166667 0.833333 0.000000 0.916667 0.000000 0.000000 0.083333 0.583333 0.166667 0.250000 0.000000 0.000000 0.500000 0.250000 0.250000 0.833333 0.000000 0.000000 0.166667 0.166667 0.000000 0.500000 0.333333 0.250000 0.333333 0.333333 0.083333 0.583333 0.000000 0.416667 
0.000000 0.083333 0.000000 0.000000 0.916667 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [TA]G[CA]AGA[CT][AT]CGAAGA[AG][CGT]A[GT][CGA][AG]T -------------------------------------------------------------------------------- Time 3.72 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 10 llr = 172 E-value = 2.1e-008 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :2a2::747:a::7:3:14:1 pos.-specific C :6:::5:1:::::1::8:::1 probability G 5::2:53:36:492:::54:8 matrix T 52:6a::5:4:61:a7242a: bits 2.2 2.0 * * * 1.8 * * * * * * 1.5 * * * * * * Relative 1.3 * * * * * * ** Entropy 1.1 * * *** ***** *** ** (24.8 bits) 0.9 * * *** ********* ** 0.7 ******* ********** ** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel GCATTCATAGATGATTCGATG consensus TA A GGAGT G G ATTG sequence T G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 38760 332 1.99e-11 ATATTAAAAT TCATTCATAGATGATTCTATG ACGTTCTTAC 36689 332 1.99e-11 ATATTAAAAT TCATTCATAGATGATTCTATG ACGTTCTTAC 38597 441 9.50e-10 CACTTGCTGT GTATTGAAATAGGATTCTGTG GACTTGACAT 36702 441 9.50e-10 
CACTTGCTGT GTATTGAAATAGGATTCTGTG GACTTGACAT 264757 221 3.04e-09 TTCTCTGTCT TCAATCAAATATGGTTCGATG TATCCTGTTA 263834 221 3.04e-09 TTCTCTGTCT TCAATCAAATATGGTTCGATG TATCCTGTTA 264753 100 2.64e-08 AGACATCCAA GAAGTGGTGGATGATATGGTG CTCAAGATGA 263830 100 2.64e-08 AGACATCCAA GAAGTGGTGGATGATATGGTG CTCAAGATGA 11352 113 1.16e-07 AGTAGTCAAT GCATTCACGGAGGATACATTC ATCGCAACAG 264694 114 1.25e-07 GACGCCGGTA TCATTGGTAGAGTCTTCGTTA CTTGTTAATC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 38760 2e-11 331_[+2]_148 36689 2e-11 331_[+2]_148 38597 9.5e-10 440_[+2]_39 36702 9.5e-10 440_[+2]_39 264757 3e-09 220_[+2]_259 263834 3e-09 220_[+2]_259 264753 2.6e-08 99_[+2]_380 263830 2.6e-08 99_[+2]_380 11352 1.2e-07 112_[+2]_367 264694 1.3e-07 113_[+2]_366 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=10 38760 ( 332) TCATTCATAGATGATTCTATG 1 36689 ( 332) TCATTCATAGATGATTCTATG 1 38597 ( 441) GTATTGAAATAGGATTCTGTG 1 36702 ( 441) GTATTGAAATAGGATTCTGTG 1 264757 ( 221) TCAATCAAATATGGTTCGATG 1 263834 ( 221) TCAATCAAATATGGTTCGATG 1 264753 ( 100) GAAGTGGTGGATGATATGGTG 1 263830 ( 100) GAAGTGGTGGATGATATGGTG 1 11352 ( 113) GCATTCACGGAGGATACATTC 1 264694 ( 114) TCATTGGTAGAGTCTTCGTTA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix 
-------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 8710 bayes= 10.0167 E= 2.1e-008 -997 -997 119 90 -48 135 -997 -42 184 -997 -997 -997 -48 -997 -13 116 -997 -997 -997 190 -997 109 119 -997 133 -997 46 -997 52 -123 -997 90 133 -997 46 -997 -997 -997 146 58 184 -997 -997 -997 -997 -997 87 116 -997 -997 204 -142 133 -123 -13 -997 -997 -997 -997 190 11 -997 -997 139 -997 176 -997 -42 -148 -997 119 58 52 -997 87 -42 -997 -997 -997 190 -148 -123 187 -997 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 2.1e-008 0.000000 0.000000 0.500000 0.500000 0.200000 0.600000 0.000000 0.200000 1.000000 0.000000 0.000000 0.000000 0.200000 0.000000 0.200000 0.600000 0.000000 0.000000 0.000000 1.000000 0.000000 0.500000 0.500000 0.000000 0.700000 0.000000 0.300000 0.000000 0.400000 0.100000 0.000000 0.500000 0.700000 0.000000 0.300000 0.000000 0.000000 0.000000 0.600000 0.400000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.400000 0.600000 0.000000 0.000000 0.900000 0.100000 0.700000 0.100000 0.200000 0.000000 0.000000 0.000000 0.000000 1.000000 0.300000 0.000000 0.000000 0.700000 0.000000 0.800000 0.000000 0.200000 0.100000 0.000000 0.500000 0.400000 0.400000 0.000000 0.400000 0.200000 0.000000 0.000000 0.000000 1.000000 0.100000 0.100000 0.800000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [GT][CAT]A[TAG]T[CG][AG][TA][AG][GT]A[TG]G[AG]T[TA][CT][GT][AGT]TG 
-------------------------------------------------------------------------------- Time 7.47 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 10 llr = 168 E-value = 4.5e-007 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 35:58:97:2:a82287:a82 pos.-specific C 7:a225:39:8::81::7::6 probability G :::2:21:1:2:2:32:3::: matrix T :5:1:3:::8::::4:3::22 bits 2.2 * 2.0 * 1.8 * * * 1.5 * * * * Relative 1.3 * * * ** * ** Entropy 1.1 * * * ******** * *** (24.2 bits) 0.9 *** * ******** ***** 0.7 *** ********** ****** 0.4 *** ********** ****** 0.2 ********************* 0.0 --------------------- Multilevel CACAACAACTCAACTAACAAC consensus AT CCT C AG GAGGTG TA sequence G G A T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 264753 289 3.32e-10 TCATCGCTCT CTCAATACCTCAGCTAACAAC GTTCATTGAT 263830 289 3.32e-10 TCATCGCTCT CTCAATACCTCAGCTAACAAC GTTCATTGAT 38760 129 3.39e-09 CAATCAATAA AACGAGAACTCAACGAACATC CATCACATGA 36689 129 3.39e-09 CAATCAATAA AACGAGAACTCAACGAACATC CATCACATGA 38597 266 5.75e-09 GATACAAGAG CACACCAACTGAACAAACAAA CGTACAATAT 36702 266 5.75e-09 GATACAAGAG CACACCAACTGAACAAACAAA CGTACAATAT 264757 146 3.93e-08 TCAGATGGCA CTCCACAACTCAAATGTGAAT CACTTCATGG 263834 146 3.93e-08 TCAGATGGCA CTCCACAACTCAAATGTGAAT CACTTCATGG 8614 132 6.55e-08 TCATCTATCT ATCTACACGACAACGAACAAC AAACATCAGG 24755 67 
6.95e-08 TGATATGATA CACAATGACACAACCATGAAC ACATTTGAGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 264753 3.3e-10 288_[+3]_191 263830 3.3e-10 288_[+3]_191 38760 3.4e-09 128_[+3]_351 36689 3.4e-09 128_[+3]_351 38597 5.7e-09 265_[+3]_214 36702 5.7e-09 265_[+3]_214 264757 3.9e-08 145_[+3]_334 263834 3.9e-08 145_[+3]_334 8614 6.6e-08 131_[+3]_348 24755 7e-08 66_[+3]_413 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=10 264753 ( 289) CTCAATACCTCAGCTAACAAC 1 263830 ( 289) CTCAATACCTCAGCTAACAAC 1 38760 ( 129) AACGAGAACTCAACGAACATC 1 36689 ( 129) AACGAGAACTCAACGAACATC 1 38597 ( 266) CACACCAACTGAACAAACAAA 1 36702 ( 266) CACACCAACTGAACAAACAAA 1 264757 ( 146) CTCCACAACTCAAATGTGAAT 1 263834 ( 146) CTCCACAACTCAAATGTGAAT 1 8614 ( 132) ATCTACACGACAACGAACAAC 1 24755 ( 67) CACAATGACACAACCATGAAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 8710 bayes= 10.0167 E= 4.5e-007 11 157 -997 -997 84 -997 -997 90 -997 209 -997 -997 84 -23 -13 -142 152 -23 -997 -997 -997 109 -13 16 169 -997 -113 -997 133 35 -997 -997 -997 193 -113 -997 -48 -997 -997 158 -997 176 -13 -997 184 -997 -997 -997 152 -997 -13 -997 -48 176 -997 -997 -48 -123 46 58 152 -997 -13 -997 133 -997 -997 
16 -997 157 46 -997 184 -997 -997 -997 152 -997 -997 -42 -48 135 -997 -42 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 4.5e-007 0.300000 0.700000 0.000000 0.000000 0.500000 0.000000 0.000000 0.500000 0.000000 1.000000 0.000000 0.000000 0.500000 0.200000 0.200000 0.100000 0.800000 0.200000 0.000000 0.000000 0.000000 0.500000 0.200000 0.300000 0.900000 0.000000 0.100000 0.000000 0.700000 0.300000 0.000000 0.000000 0.000000 0.900000 0.100000 0.000000 0.200000 0.000000 0.000000 0.800000 0.000000 0.800000 0.200000 0.000000 1.000000 0.000000 0.000000 0.000000 0.800000 0.000000 0.200000 0.000000 0.200000 0.800000 0.000000 0.000000 0.200000 0.100000 0.300000 0.400000 0.800000 0.000000 0.200000 0.000000 0.700000 0.000000 0.000000 0.300000 0.000000 0.700000 0.300000 0.000000 1.000000 0.000000 0.000000 0.000000 0.800000 0.000000 0.000000 0.200000 0.200000 0.600000 0.000000 0.200000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [CA][AT]C[ACG][AC][CTG]A[AC]C[TA][CG]A[AG][CA][TGA][AG][AT][CG]A[AT][CAT] -------------------------------------------------------------------------------- Time 11.10 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11352                            4.25e-05  112_[+2(1.16e-07)]_367
12132                            2.18e-05  459_[+1(1.82e-09)]_20
24755                            8.30e-04  66_[+3(6.95e-08)]_413
263830                           6.90e-16  99_[+2(2.64e-08)]_98_[+1(7.78e-10)]_\
    49_[+3(3.32e-10)]_191
263834                           7.34e-15  78_[+1(6.87e-10)]_46_[+3(3.93e-08)]_\
    30_[+2(7.94e-05)]_3_[+2(3.04e-09)]_259
264694                           5.73e-07  30_[+1(1.74e-07)]_62_[+2(1.25e-07)]_\
    366
264753                           6.90e-16  99_[+2(2.64e-08)]_98_[+1(7.78e-10)]_\
    49_[+3(3.32e-10)]_191
264757                           7.34e-15  78_[+1(6.87e-10)]_46_[+3(3.93e-08)]_\
    30_[+2(7.94e-05)]_3_[+2(3.04e-09)]_259
270031                           5.18e-01  500
36689                            1.17e-16  105_[+3(4.34e-05)]_2_[+3(3.39e-09)]_\
    182_[+2(1.99e-11)]_43_[+1(1.58e-08)]_84
36702                            3.58e-15  43_[+3(6.14e-05)]_201_\
    [+3(5.75e-09)]_9_[+1(7.05e-09)]_124_[+2(9.50e-10)]_39
38597                            3.58e-15  43_[+3(6.14e-05)]_201_\
    [+3(5.75e-09)]_9_[+1(7.05e-09)]_124_[+2(9.50e-10)]_39
38760                            1.17e-16  105_[+3(4.34e-05)]_2_[+3(3.39e-09)]_\
    182_[+2(1.99e-11)]_43_[+1(1.58e-08)]_84
8614                             3.40e-04  131_[+3(6.55e-08)]_348
869                              1.09e-03  261_[+1(1.84e-07)]_218
9277                             8.81e-01  500
bd1756                           6.76e-01  90
bd1839                           7.14e-01  500
bd1840                           8.32e-04  207_[+1(2.16e-07)]_272
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************