********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/197/197.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10752                    1.0000    500  11196                    1.0000    500
1938                     1.0000    500  24863                    1.0000    500
263716                   1.0000    500  269276                   1.0000    500
2812                     1.0000    500  33056                    1.0000    500
4249                     1.0000    500  4760                     1.0000    500
5087                     1.0000    500  6236                     1.0000    500
9462                     1.0000    500  9650                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/197/197.seqs.fa -oc motifs/197 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.241 C 0.249 G 0.242 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.241 C 0.249 G 0.242 T 0.268
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  14  llr = 154  E-value = 1.2e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1::121:8:1263181
pos.-specific     C  6145579:994379:9
probability       G  3:12:111::41::1:
matrix            T  19513:111:1:::11

         bits    2.1
                 1.8
                 1.6         **   *
                 1.4  *      **   *
Relative         1.2  *    * **  ** *
Entropy          1.0  *    ****  ****
(15.9 bits)      0.8  *   *****  ****
                 0.6  **  ***** *****
                 0.4 *** ****** *****
                 0.2 ****************
                 0.0 ----------------

Multilevel           CTTCCCCACCCACCAC
consensus            G CGT     GCA
sequence                 A     A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
9462                        407  1.49e-08 CCTACATTTC CTCCCCCACCAAACAC CACTAACGCA
9650                        376  2.51e-08 GCTTTGAAGC CTTGACCACCGACCAC ACACAAAGTA
10752                       166  2.51e-08 GCTTTGAAGC CTTGACCACCGACCAC ACACAAAGTA
24863                       481  4.34e-07 CGGTGGTAGC GTTGTGCACCCACCAC CACA
1938                        408  1.95e-06 CACGTCGAGA CTTCCCCACCTCCCAT CCAACACACA
269276                      380  3.84e-06 ACGCTTTTTC GTTCTGCACCAGACAC CACAGAATCA
11196                         9  4.88e-06   TGATGCTT GTCTACCACCAACAAC TCTCACATTT
5087                        200  5.61e-06 TTCAACTATT CTCACACACCGCCCGC CGCCGGCCGC
263716                      432  7.16e-06 CGCAGACTCA CTCTCACGCCCCACAC TCACAGAACT
4249                        475  8.39e-06 CAACCTCACT CTTCCCCTCCCCACAA TACCAAAGCA
2812                         77  8.39e-06 AGTTGAAGTT GTGCCCCATCGGCCAC CGCCATTGAT
4760                        448  9.01e-06 ACACTCTCCT CCTCCCGGCCGACCAC AACACAGCTT
33056                       437  1.37e-05 CCACTACTCG ATCCTCTACCCACCTC TCCTCTGGTT
6236                        471  3.82e-05 AGTTATATCA TTCATCCACACACCTC CGTATCGACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9462                              1.5e-08  406_[+1]_78
9650                              2.5e-08  375_[+1]_109
10752                             2.5e-08  165_[+1]_319
24863                             4.3e-07  480_[+1]_4
1938                              1.9e-06  407_[+1]_77
269276                            3.8e-06  379_[+1]_105
11196                             4.9e-06  8_[+1]_476
5087                              5.6e-06  199_[+1]_285
263716                            7.2e-06  431_[+1]_53
4249                              8.4e-06  474_[+1]_10
2812                              8.4e-06  76_[+1]_408
4760                                9e-06  447_[+1]_37
33056                             1.4e-05  436_[+1]_48
6236                              3.8e-05  470_[+1]_14
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=14
9462                     (  407) CTCCCCCACCAAACAC  1
9650                     (  376) CTTGACCACCGACCAC  1
10752                    (  166) CTTGACCACCGACCAC  1
24863                    (  481) GTTGTGCACCCACCAC  1
1938                     (  408) CTTCCCCACCTCCCAT  1
269276                   (  380) GTTCTGCACCAGACAC  1
11196                    (    9) GTCTACCACCAACAAC  1
5087                     (  200) CTCACACACCGCCCGC  1
263716                   (  432) CTCTCACGCCCCACAC  1
4249                     (  475) CTTCCCCTCCCCACAA  1
2812                     (   77) GTGCCCCATCGGCCAC  1
4760                     (  448) CCTCCCGGCCGACCAC  1
33056                    (  437) ATCCTCTACCCACCTC  1
6236                     (  471) TTCATCCACACACCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 8.91886 E= 1.2e-001
  -175    120     24   -190
 -1045   -180  -1045    179
 -1045     78   -176     90
   -76    100    -17    -91
   -17    100  -1045      9
   -76    152    -76  -1045
 -1045    178   -176   -190
   170  -1045    -76   -190
 -1045    190  -1045   -190
  -175    190  -1045  -1045
   -17     52     56   -190
   124     20    -76  -1045
    24    152  -1045  -1045
  -175    190  -1045  -1045
   170  -1045   -176    -91
  -175    178  -1045   -190
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 14 E= 1.2e-001
 0.071429  0.571429  0.285714  0.071429
 0.000000  0.071429  0.000000  0.928571
 0.000000  0.428571  0.071429  0.500000
 0.142857  0.500000  0.214286  0.142857
 0.214286  0.500000  0.000000  0.285714
 0.142857  0.714286  0.142857  0.000000
 0.000000  0.857143  0.071429  0.071429
 0.785714  0.000000  0.142857  0.071429
 0.000000  0.928571  0.000000  0.071429
 0.071429  0.928571  0.000000  0.000000
 0.214286  0.357143  0.357143  0.071429
 0.571429  0.285714  0.142857  0.000000
 0.285714  0.714286  0.000000  0.000000
 0.071429  0.928571  0.000000  0.000000
 0.785714  0.000000  0.071429  0.142857
 0.071429  0.857143  0.000000  0.071429
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG]T[TC][CG][CTA]CCACC[CGA][AC][CA]CAC
--------------------------------------------------------------------------------




Time  1.65 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  15  sites =  14  llr = 151  E-value = 3.0e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  111:121::6::6:3
pos.-specific     C  9::3::11:21::::
probability       G  :6649617a:69:97
matrix            T  :2241162:23141:

         bits    2.1         *
                 1.8         *
                 1.6         *  *
                 1.4 *       *  * *
Relative         1.2 *   *   *  * **
Entropy          1.0 *   *   *  ****
(15.5 bits)      0.8 *** ** ** *****
                 0.6 *** ** ********
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CGGGGGTGGAGGAGG
consensus             TTT A T CT T A
sequence                C     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
269276                      213  1.43e-08 TGAAGAGAAG CGGTGGTGGTGGAGG CGGCAGCTGT
9462                         50  3.93e-07 GTGCCATGGC CGACGGCGGAGGAGG CAGTGCACCG
4760                        191  4.47e-07 CTGCGTTTTA CGGTGGTCGAGGAGA ATAGTCTACG
6236                        215  1.13e-06 AGGGTGGTGT AGTGGGTGGCGGAGG TTGTTCGGCG
11196                       361  1.13e-06 GCGACGATGA CGGCGGTGGTGGTTG GTCCTTCATG
9650                        340  2.08e-06 GCCATTCTCC CTGGGTTGGATGTGG CTTATTCCTT
10752                       130  2.08e-06 GCCATTCTCC CTGGGTTGGATGTGG CTTATTCCTT
24863                       122  4.82e-06 CAGAGTTGTA CTGGGAGGGAGGAGA TCACCAAGCC
33056                        52  5.25e-06 TATTTTGAAT CGGTGATTGATGATG CACTCTGGTG
4249                        236  8.64e-06 CCTTCGTCCG AAGCGGTTGTGGAGG GCAACGGAGG
1938                         93  1.09e-05 GGCATGATGA CGATGACTGAGGTGG GAAGTAGATG
5087                        307  1.37e-05 CTCATCGGAT CGTCGGAGGAGTAGA CTAGTAATCG
2812                         42  1.37e-05 AGCCTCCACG CGTTAGTGGCTGTGG CTGCCTCTGA
263716                      348  1.16e-04 GTCCTCACGT CAGGTGAGGCCGAGA CTGCCAGCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
269276                            1.4e-08  212_[+2]_273
9462                              3.9e-07  49_[+2]_436
4760                              4.5e-07  190_[+2]_295
6236                              1.1e-06  214_[+2]_271
11196                             1.1e-06  360_[+2]_125
9650                              2.1e-06  339_[+2]_146
10752                             2.1e-06  129_[+2]_356
24863                             4.8e-06  121_[+2]_364
33056                             5.3e-06  51_[+2]_434
4249                              8.6e-06  235_[+2]_250
1938                              1.1e-05  92_[+2]_393
5087                              1.4e-05  306_[+2]_179
2812                              1.4e-05  41_[+2]_444
263716                            0.00012  347_[+2]_138
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=14
269276                   (  213) CGGTGGTGGTGGAGG  1
9462                     (   50) CGACGGCGGAGGAGG  1
4760                     (  191) CGGTGGTCGAGGAGA  1
6236                     (  215) AGTGGGTGGCGGAGG  1
11196                    (  361) CGGCGGTGGTGGTTG  1
9650                     (  340) CTGGGTTGGATGTGG  1
10752                    (  130) CTGGGTTGGATGTGG  1
24863                    (  122) CTGGGAGGGAGGAGA  1
33056                    (   52) CGGTGATTGATGATG  1
4249                     (  236) AAGCGGTTGTGGAGG  1
1938                     (   93) CGATGACTGAGGTGG  1
5087                     (  307) CGTCGGAGGAGTAGA  1
2812                     (   42) CGTTAGTGGCTGTGG  1
263716                   (  348) CAGGTGAGGCCGAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6804 bayes= 8.92184 E= 3.0e-001
   -76    178  -1045  -1045
   -76  -1045    141    -32
   -76  -1045    141    -32
 -1045     20     56     42
  -175  -1045    183   -190
   -17  -1045    141    -91
   -76    -80   -176    126
 -1045   -180    156    -32
 -1045  -1045    205  -1045
   124    -22  -1045    -32
 -1045   -180    141      9
 -1045  -1045    194   -190
   141  -1045  -1045     42
 -1045  -1045    183    -91
    24  -1045    156  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 14 E= 3.0e-001
 0.142857  0.857143  0.000000  0.000000
 0.142857  0.000000  0.642857  0.214286
 0.142857  0.000000  0.642857  0.214286
 0.000000  0.285714  0.357143  0.357143
 0.071429  0.000000  0.857143  0.071429
 0.214286  0.000000  0.642857  0.142857
 0.142857  0.142857  0.071429  0.642857
 0.000000  0.071429  0.714286  0.214286
 0.000000  0.000000  1.000000  0.000000
 0.571429  0.214286  0.000000  0.214286
 0.000000  0.071429  0.642857  0.285714
 0.000000  0.000000  0.928571  0.071429
 0.642857  0.000000  0.000000  0.357143
 0.000000  0.000000  0.857143  0.142857
 0.285714  0.000000  0.714286  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[GT][GT][GTC]G[GA]T[GT]G[ACT][GT]G[AT]G[GA]
--------------------------------------------------------------------------------




Time  3.24 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  17  sites =   4  llr = 83  E-value = 6.5e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a::aa::5a38:a::
pos.-specific     C  8::::::::::53:::3
probability       G  3a:aa::aa5:::a:a8
matrix            T  :::::::::::3:::::

         bits    2.1  ******** *  ***
                 1.8  ******** *  ***
                 1.6  ******** *  ***
                 1.4  ******** *  ***
Relative         1.2 ********* * *****
Entropy          1.0 *********** *****
(29.8 bits)      0.8 *********** *****
                 0.6 *********** *****
                 0.4 *****************
                 0.2 *****************
                 0.0 -----------------

Multilevel           CGAGGAAGGAACAGAGG
consensus            G        G AC   C
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            -----------------
9650                        312  6.90e-11 CGGCTATCTA CGAGGAAGGAACAGAGG AGCCATTCTC
10752                       102  6.90e-11 CGGCTATCTA CGAGGAAGGAACAGAGG AGCCATTCTC
24863                       435  6.22e-10 ACGATTGCCC GGAGGAAGGGAAAGAGG CCGCCGCAGT
9462                        175  1.49e-09 TGTTCTAGGG CGAGGAAGGGATCGAGC TCCATGGCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9650                              6.9e-11  311_[+3]_172
10752                             6.9e-11  101_[+3]_382
24863                             6.2e-10  434_[+3]_49
9462                              1.5e-09  174_[+3]_309
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=17 seqs=4
9650                     (  312) CGAGGAAGGAACAGAGG  1
10752                    (  102) CGAGGAAGGAACAGAGG  1
24863                    (  435) GGAGGAAGGGAAAGAGG  1
9462                     (  175) CGAGGAAGGGATCGAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 6776 bayes= 11.4627 E= 6.5e-001
  -865    159      5   -865
  -865   -865    205   -865
   205   -865   -865   -865
  -865   -865    205   -865
  -865   -865    205   -865
   205   -865   -865   -865
   205   -865   -865   -865
  -865   -865    205   -865
  -865   -865    205   -865
   105   -865    105   -865
   205   -865   -865   -865
     5    100   -865    -10
   163      0   -865   -865
  -865   -865    205   -865
   205   -865   -865   -865
  -865   -865    205   -865
  -865      0    163   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 4 E= 6.5e-001
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.500000  0.000000  0.250000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CG]GAGGAAGG[AG]A[CAT][AC]GAG[GC]
--------------------------------------------------------------------------------




Time  4.68 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10752                            2.73e-13  43_[+1(1.69e-05)]_42_[+3(6.90e-11)]_\
    11_[+2(2.08e-06)]_21_[+1(2.51e-08)]_319
11196                            3.56e-05  8_[+1(4.88e-06)]_281_[+2(2.23e-05)]_\
    40_[+2(1.13e-06)]_125
1938                             1.29e-04  20_[+2(5.86e-05)]_57_[+2(1.09e-05)]_\
    1_[+2(5.86e-05)]_284_[+1(1.95e-06)]_77
24863                            6.92e-11  121_[+2(4.82e-06)]_298_\
    [+3(6.22e-10)]_29_[+1(4.34e-07)]_4
263716                           8.10e-03  431_[+1(7.16e-06)]_53
269276                           5.24e-07  212_[+2(1.43e-08)]_152_\
    [+1(3.84e-06)]_105
2812                             6.17e-04  41_[+2(1.37e-05)]_20_[+1(8.39e-06)]_\
    408
33056                            8.26e-04  51_[+2(5.25e-06)]_370_\
    [+1(1.37e-05)]_48
4249                             9.04e-04  235_[+2(8.64e-06)]_224_\
    [+1(8.39e-06)]_10
4760                             4.83e-05  190_[+2(4.47e-07)]_242_\
    [+1(9.01e-06)]_37
5087                             8.60e-04  199_[+1(5.61e-06)]_91_\
    [+2(1.37e-05)]_179
6236                             4.32e-04  214_[+2(1.13e-06)]_241_\
    [+1(3.82e-05)]_14
9462                             6.30e-13  49_[+2(3.93e-07)]_110_\
    [+3(1.49e-09)]_215_[+1(1.49e-08)]_78
9650                             2.73e-13  253_[+1(1.69e-05)]_42_\
    [+3(6.90e-11)]_11_[+2(2.08e-06)]_21_[+1(2.51e-08)]_109
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************