********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/118/118.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
12063                    1.0000    500  12095                    1.0000    500
18351                    1.0000    500  24384                    1.0000    500
24404                    1.0000    500  25853                    1.0000    500
25854                    1.0000    500  25857                    1.0000    500
25866                    1.0000    500  26467                    1.0000    500
264907                   1.0000    500  38815                    1.0000    500
5892                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/118/118.seqs.fa -oc motifs/118 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.251 G 0.220 T 0.256
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.251 G 0.220 T 0.256
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 13 llr = 147 E-value = 3.7e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :2::::::21:2111:
pos.-specific     C  2:8:341:5:13:111
probability       G  22226::a1:539:55
matrix            T  6718169:2942:844

         bits    2.2        *
                 2.0        *
                 1.7        *    *
                 1.5       ** *  *
Relative         1.3    *  ** *  *
Entropy          1.1   ** *** *  **
(16.4 bits)      0.9  ******* ** ** *
                 0.7 ******** ** ** *
                 0.4 ******** ** ****
                 0.2 *********** ****
                 0.0 ----------------

Multilevel           TTCTGTTGCTGCGTGG
consensus            C   CC  A TG  TT
sequence                     T  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                        Site
-------------             ----- ---------            ----------------
24384                       201  2.19e-08 GGAACATGAG TTCTGTTGCTGAGTTT GAGCGTTGAC
25853                       414  8.34e-08 TGTTGACAAT TTGTGTTGCTTGGTGT CGATCACACA
18351                       187  9.53e-08 AGTGTGATGA CTCTGCTGATGGGTTG AGGTCACATC
12063                       109  3.74e-07 TGGCGAGGAA GGCTCTTGCTGCGTGG TTGTCTGTCG
38815                       155  4.67e-07 AGCGGATGTC TTCGGTTGGTTGGTGG CGCTCGGCTA
24404                       282  1.18e-06 ATGACCACTC TTCTCTTGTAGTGTGG TACAATCACC
5892                        142  2.46e-06 TGAGATATTG TTTTGTTGCTTTGTAG TAGAGTAGAT
12095                       259  4.41e-06 AATGATCACT CTCTCCTGCTGAATGG TAAGCAAAAA
25854                        33  5.57e-06 GTTCGGGAGC TGCTGCTGTTTCGATG GCCGGTGTGG
264907                      403  6.50e-06 GTCGTGTTCG TACTGCCGTTGCGTTT GTATCGTACT
25866                       388  7.00e-06 CTGAGATACA CACTCCTGCTGCGTCT GAAACAGCTG
26467                       102  2.32e-05 AAACGTTTCG TTCGTTTGATTTGTTC TAAGTAGTGA
25857                       237  3.46e-05 CAACCAAGCG GTGTGTTGATCGGCGT TCGAGGTCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24384                             2.2e-08  200_[+1]_284
25853                             8.3e-08  413_[+1]_71
18351                             9.5e-08  186_[+1]_298
12063                             3.7e-07  108_[+1]_376
38815                             4.7e-07  154_[+1]_330
24404                             1.2e-06  281_[+1]_203
5892                              2.5e-06  141_[+1]_343
12095                             4.4e-06  258_[+1]_226
25854                             5.6e-06  32_[+1]_452
264907                            6.5e-06  402_[+1]_82
25866                               7e-06  387_[+1]_97
26467                             2.3e-05  101_[+1]_383
25857                             3.5e-05  236_[+1]_248
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=13
24384                    (  201) TTCTGTTGCTGAGTTT  1
25853                    (  414) TTGTGTTGCTTGGTGT  1
18351                    (  187) CTCTGCTGATGGGTTG  1
12063                    (  109) GGCTCTTGCTGCGTGG  1
38815                    (  155) TTCGGTTGGTTGGTGG  1
24404                    (  282) TTCTCTTGTAGTGTGG  1
5892                     (  142) TTTTGTTGCTTTGTAG  1
12095                    (  259) CTCTCCTGCTGAATGG  1
25854                    (   33) TGCTGCTGTTTCGATG  1
264907                   (  403) TACTGCCGTTGCGTTT  1
25866                    (  388) CACTCCTGCTGCGTCT  1
26467                    (  102) TTCGTTTGATTTGTTC  1
25857                    (  237) GTGTGTTGATCGGCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 8.91886 E= 3.7e-003
 -1035    -12    -52    127
   -83  -1035    -52    144
 -1035    161    -52   -173
 -1035  -1035    -52    173
 -1035     29    148   -173
 -1035     61  -1035    127
 -1035   -171  -1035    185
 -1035  -1035    218  -1035
   -24     88   -151    -15
  -183  -1035  -1035    185
 -1035   -171    129     59
   -83     29     48    -15
  -183  -1035    207  -1035
  -183   -171  -1035    173
  -183   -171    107     59
 -1035   -171    129     59
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 3.7e-003
 0.000000  0.230769  0.153846  0.615385
 0.153846  0.000000  0.153846  0.692308
 0.000000  0.769231  0.153846  0.076923
 0.000000  0.000000  0.153846  0.846154
 0.000000  0.307692  0.615385  0.076923
 0.000000  0.384615  0.000000  0.615385
 0.000000  0.076923  0.000000  0.923077
 0.000000  0.000000  1.000000  0.000000
 0.230769  0.461538  0.076923  0.230769
 0.076923  0.000000  0.000000  0.923077
 0.000000  0.076923  0.538462  0.384615
 0.153846  0.307692  0.307692  0.230769
 0.076923  0.000000  0.923077  0.000000
 0.076923  0.076923  0.000000  0.846154
 0.076923  0.076923  0.461538  0.384615
 0.000000  0.076923  0.538462  0.384615
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TC]TCT[GC][TC]TG[CAT]T[GT][CGT]GT[GT][GT]
--------------------------------------------------------------------------------

Time 1.94 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 13 llr = 125 E-value = 1.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :5711923193:
pos.-specific     C  a1:98:868:29
probability       G  :23::::1:111
matrix            T  :3::21::1:4:

         bits    2.2
                 2.0 *
                 1.7 *
                 1.5 *  * *   * *
Relative         1.3 *  * ** ** *
Entropy          1.1 * ***** ** *
(13.9 bits)      0.9 * ***** ** *
                 0.7 * ******** *
                 0.4 * ******** *
                 0.2 ************
                 0.0 ------------

Multilevel           CAACCACCCATC
consensus             TG    A  A
sequence                       C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                      Site
-------------             ----- ---------            ------------
25853                       438  8.92e-08 GTCGATCACA CAACCACCCATC ATCGGCTCTC
12095                       372  1.57e-06 ACTCTGCAGA CGACCACCCACC CGAGATGACG
25866                       483  4.26e-06 GGCGACACAC CTGCCACACACC CCCACC
26467                       220  8.09e-06 AGGAATCATT CAACAACCCAAC ACCAGCCCAC
5892                        237  1.22e-05 GATTCAACCA CAACCACACATG TACTTTCTCT
25854                       479  1.22e-05 ACTGCTAATA CTACTACACAAC AATAGCAACA
38815                       347  1.59e-05 TGCCGACAGA CAACCACGCAGC AAGAGCCTTC
264907                      474  1.59e-05 TCTCCACCTC CAAACACACATC ACCGCACCAG
18351                       473  1.59e-05 TGGACACAAT CTGCCACCTATC CAGCTACCAA
25857                       351  2.17e-05 GCCAAGCAAG CTGCCTCCCAAC ACAAACGAAA
12063                       304  2.54e-05 TAGCCGAGCG CCACCAACCAAC CGAAGACATC
24384                       451  4.26e-05 TCGAACGCCC CAACCAACCGCC CCTGCCACCT
24404                        19  9.45e-05 TCCAGTCTGT CGGCTACCAATC GACCTCATCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25853                             8.9e-08  437_[+2]_51
12095                             1.6e-06  371_[+2]_117
25866                             4.3e-06  482_[+2]_6
26467                             8.1e-06  219_[+2]_269
5892                              1.2e-05  236_[+2]_252
25854                             1.2e-05  478_[+2]_10
38815                             1.6e-05  346_[+2]_142
264907                            1.6e-05  473_[+2]_15
18351                             1.6e-05  472_[+2]_16
25857                             2.2e-05  350_[+2]_138
12063                             2.5e-05  303_[+2]_185
24384                             4.3e-05  450_[+2]_38
24404                             9.4e-05  18_[+2]_470
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=13
25853                    (  438) CAACCACCCATC  1
12095                    (  372) CGACCACCCACC  1
25866                    (  483) CTGCCACACACC  1
26467                    (  220) CAACAACCCAAC  1
5892                     (  237) CAACCACACATG  1
25854                    (  479) CTACTACACAAC  1
38815                    (  347) CAACCACGCAGC  1
264907                   (  474) CAAACACACATC  1
18351                    (  473) CTGCCACCTATC  1
25857                    (  351) CTGCCTCCCAAC  1
12063                    (  304) CCACCAACCAAC  1
24384                    (  451) CAACCAACCGCC  1
24404                    (   19) CGGCTACCAATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 8.93074 E= 1.2e+000
 -1035    199  -1035  -1035
    76   -171    -52     27
   134  -1035     48  -1035
  -183    188  -1035  -1035
  -183    161  -1035    -73
   176  -1035  -1035   -173
   -83    175  -1035  -1035
    17    129   -151  -1035
  -183    175  -1035   -173
   176  -1035   -151  -1035
    17    -12   -151     59
 -1035    188   -151  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 1.2e+000
 0.000000  1.000000  0.000000  0.000000
 0.461538  0.076923  0.153846  0.307692
 0.692308  0.000000  0.307692  0.000000
 0.076923  0.923077  0.000000  0.000000
 0.076923  0.769231  0.000000  0.153846
 0.923077  0.000000  0.000000  0.076923
 0.153846  0.846154  0.000000  0.000000
 0.307692  0.615385  0.076923  0.000000
 0.076923  0.846154  0.000000  0.076923
 0.923077  0.000000  0.076923  0.000000
 0.307692  0.230769  0.076923  0.384615
 0.000000  0.923077  0.076923  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[AT][AG]CCAC[CA]CA[TAC]C
--------------------------------------------------------------------------------

Time 3.85 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 13 llr = 126 E-value = 1.4e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a:21:25:::::
pos.-specific     C  :84:28:89:44
probability       G  :12:2::::2::
matrix            T  :1395:521866

         bits    2.2
                 2.0 *
                 1.7 *
                 1.5 *  *    *
Relative         1.3 ** * * ***
Entropy          1.1 ** * * *****
(14.0 bits)      0.9 ** * *******
                 0.7 ** *********
                 0.4 ** *********
                 0.2 ************
                 0.0 ------------

Multilevel           ACCTTCTCCTTT
consensus              T C A   CC
sequence                 G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                      Site
-------------             ----- ---------            ------------
24404                       430  7.66e-08 TAAGTAGTGA ACCTTCTCCTTT TTTAACTCCT
25857                        13  6.30e-07 GTCAGCGACA ACCTTCACCTTC ATGGTCTCCC
12063                       333  2.35e-06 ATCCGCTGAG ACCTCCTCCTCT CCACACCCAT
26467                       265  4.24e-06 AGCTCCTCCT ACCTCCTCCTCC TTCTATCCCC
38815                       392  6.99e-06 CCACCACATC ACGTGCACCTTC ACCTGCAGTC
12095                       405  6.99e-06 TCGATCAATG ACGTGCACCTCT TCCTGCTTTG
24384                       348  1.30e-05 ACTTACCTTG ATTTTCTCCTTT ATCATTTATT
264907                      455  2.03e-05 ATGCTCATAC ACCATCACCTCT CCACCTCCAA
5892                        250  2.21e-05 CCACACATGT ACTTTCTCTTCT GCTATCTCCT
25853                       240  2.21e-05 ACTCACCCGC ACATTCACCGTC TTGGTGTGAA
25866                        63  3.26e-05 ACAAACAATA ACATGAACCTTT GCTCAAGTCC
18351                       395  6.58e-05 GTGAACATAA AGTTTCTTCTTC CTTAGCATGA
25854                       335  1.54e-04 AATGATGCTC ACTTCATTCGTT GCCTCGTTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24404                             7.7e-08  429_[+3]_59
25857                             6.3e-07  12_[+3]_476
12063                             2.3e-06  332_[+3]_156
26467                             4.2e-06  264_[+3]_224
38815                               7e-06  391_[+3]_97
12095                               7e-06  404_[+3]_84
24384                             1.3e-05  347_[+3]_141
264907                              2e-05  454_[+3]_34
5892                              2.2e-05  249_[+3]_239
25853                             2.2e-05  239_[+3]_249
25866                             3.3e-05  62_[+3]_426
18351                             6.6e-05  394_[+3]_94
25854                             0.00015  334_[+3]_154
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=13
24404                    (  430) ACCTTCTCCTTT  1
25857                    (   13) ACCTTCACCTTC  1
12063                    (  333) ACCTCCTCCTCT  1
26467                    (  265) ACCTCCTCCTCC  1
38815                    (  392) ACGTGCACCTTC  1
12095                    (  405) ACGTGCACCTCT  1
24384                    (  348) ATTTTCTCCTTT  1
264907                   (  455) ACCATCACCTCT  1
5892                     (  250) ACTTTCTCTTCT  1
25853                    (  240) ACATTCACCGTC  1
25866                    (   63) ACATGAACCTTT  1
18351                    (  395) AGTTTCTTCTTC  1
25854                    (  335) ACTTCATTCGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 8.93074 E= 1.4e+000
   187  -1035  -1035  -1035
 -1035    175   -151   -173
   -83     61    -52     27
  -183  -1035  -1035    185
 -1035    -12      7    107
   -83    175  -1035  -1035
    76  -1035  -1035    107
 -1035    175  -1035    -73
 -1035    188  -1035   -173
 -1035  -1035    -52    173
 -1035     61  -1035    127
 -1035     61  -1035    127
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 1.4e+000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.846154  0.076923  0.076923
 0.153846  0.384615  0.153846  0.307692
 0.076923  0.000000  0.000000  0.923077
 0.000000  0.230769  0.230769  0.538462
 0.153846  0.846154  0.000000  0.000000
 0.461538  0.000000  0.000000  0.538462
 0.000000  0.846154  0.000000  0.153846
 0.000000  0.923077  0.000000  0.076923
 0.000000  0.000000  0.153846  0.846154
 0.000000  0.384615  0.000000  0.615385
 0.000000  0.384615  0.000000  0.615385
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AC[CT]T[TCG]C[TA]CCT[TC][TC]
--------------------------------------------------------------------------------

Time 5.77 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12063                            5.56e-07  108_[+1(3.74e-07)]_179_\
    [+2(2.54e-05)]_17_[+3(2.35e-06)]_156
12095                            1.12e-06  104_[+1(3.09e-05)]_138_\
    [+1(4.41e-06)]_97_[+2(1.57e-06)]_21_[+3(6.99e-06)]_84
18351                            2.12e-06  186_[+1(9.53e-08)]_192_\
    [+3(6.58e-05)]_66_[+2(1.59e-05)]_3_[+2(8.33e-05)]_1
24384                            3.19e-07  200_[+1(2.19e-08)]_131_\
    [+3(1.30e-05)]_91_[+2(4.26e-05)]_38
24404                            2.29e-07  18_[+2(9.45e-05)]_251_\
    [+1(1.18e-06)]_132_[+3(7.66e-08)]_59
25853                            6.27e-09  239_[+3(2.21e-05)]_162_\
    [+1(8.34e-08)]_8_[+2(8.92e-08)]_51
25854                            1.25e-04  32_[+1(5.57e-06)]_430_\
    [+2(1.22e-05)]_10
25857                            8.52e-06  12_[+3(6.30e-07)]_212_\
    [+1(3.46e-05)]_98_[+2(2.17e-05)]_138
25866                            1.62e-05  62_[+3(3.26e-05)]_313_\
    [+1(7.00e-06)]_79_[+2(4.26e-06)]_6
26467                            1.36e-05  101_[+1(2.32e-05)]_102_\
    [+2(8.09e-06)]_33_[+3(4.24e-06)]_224
264907                           3.19e-05  188_[+2(8.90e-05)]_202_\
    [+1(6.50e-06)]_36_[+3(2.03e-05)]_7_[+2(1.59e-05)]_15
38815                            1.20e-06  154_[+1(4.67e-07)]_176_\
    [+2(1.59e-05)]_33_[+3(6.99e-06)]_97
5892                             1.15e-05  141_[+1(2.46e-06)]_79_\
    [+2(1.22e-05)]_1_[+3(2.21e-05)]_239
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************