********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/488/488.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11466                    1.0000    500  12024                    1.0000    500
20766                    1.0000    500  21001                    1.0000    500
21451                    1.0000    500  22046                    1.0000    500
2220                     1.0000    500  23431                    1.0000    500
23543                    1.0000    500  23669                    1.0000    500
23851                    1.0000    500  23927                    1.0000    500
24499                    1.0000    500  25106                    1.0000    500
25107                    1.0000    500  25590                    1.0000    500
25685                    1.0000    500  263887                   1.0000    500
264466                   1.0000    500  264646                   1.0000    500
268311                   1.0000    500  29327                    1.0000    500
34280                    1.0000    500  bd1801                   1.0000    500
bd738                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/488/488.seqs.fa -oc motifs/488 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       25    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           12500    N=              25
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.282 C 0.240 G 0.219 T 0.259
Background letter frequencies (from dataset with add-one prior applied):
A 0.282 C 0.240 G 0.219 T 0.259
********************************************************************************


********************************************************************************
MOTIF 1 MEME    width = 21    sites = 4    llr = 96    E-value = 1.5e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3:a::5:8::3:::aa:a:8:
pos.-specific     C  :::::::::::::3::a:a:3
probability       G  8a:a3:8:a::a:8:::::38
matrix            T  ::::8533:a8:a::::::::

         bits    2.2  * *    *  *
                 2.0  * *    ** **   * *
                 1.8  ***    ** ** *****
                 1.5  ***    ** ** *****
Relative         1.3 ****  * ** ******** *
Entropy          1.1 ***** ***************
(34.7 bits)      0.9 *********************
                 0.7 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GGAGTAGAGTTGTGAACACAG
consensus            A   GTTT  A  C     GC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
bd738                        58  1.79e-13 AAACCAGACA GGAGTTGAGTTGTGAACACAG CAAATCGTTC
bd1801                      168  1.79e-13 AAACCAGACA GGAGTTGAGTTGTGAACACAG CAAATCGTTC
268311                      168  3.58e-11 CCCATCTTAT GGAGTATAGTAGTGAACACAC AACCTGCAGT
264646                      155  7.15e-11 GTTCTTCAGC AGAGGAGTGTTGTCAACACGG AGTTTGCGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd738                             1.8e-13  57_[+1]_422
bd1801                            1.8e-13  167_[+1]_312
268311                            3.6e-11  167_[+1]_312
264646                            7.1e-11  154_[+1]_325
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=4
bd738                    (   58) GGAGTTGAGTTGTGAACACAG  1
bd1801                   (  168) GGAGTTGAGTTGTGAACACAG  1
268311                   (  168) GGAGTATAGTAGTGAACACAC  1
264646                   (  155) AGAGGAGTGTTGTCAACACGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 12000 bayes= 11.5503 E= 1.5e-001
   -17   -865    178   -865
  -865   -865    219   -865
   182   -865   -865   -865
  -865   -865    219   -865
  -865   -865     19    153
    82   -865   -865     95
  -865   -865    178     -5
   141   -865   -865     -5
  -865   -865    219   -865
  -865   -865   -865    195
   -17   -865   -865    153
  -865   -865    219   -865
  -865   -865   -865    195
  -865      6    178   -865
   182   -865   -865   -865
   182   -865   -865   -865
  -865    206   -865   -865
   182   -865   -865   -865
  -865    206   -865   -865
   141   -865     19   -865
  -865      6    178   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 1.5e-001
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  0.750000  0.250000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.250000  0.750000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[GA]GAG[TG][AT][GT][AT]GT[TA]GT[GC]AACAC[AG][GC]
--------------------------------------------------------------------------------

Time 4.70 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME    width = 21    sites = 18    llr = 217    E-value = 1.2e-002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  633:74:732::4:21:13::
pos.-specific     C  213924934729316427616
probability       G  :24:1:11::113::3:2::2
matrix            T  2411:2::317::92281192

         bits    2.2
                 2.0
                 1.8            *       *
                 1.5    *  *    * *     *
Relative         1.3    *  *    * *  *  *
Entropy          1.1    *  *    * *  *  *
(17.4 bits)      0.9    *  ** *** *  * **
                 0.7    ** ** *** ** *****
                 0.4 *  ************ *****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           ATGCAACACCTCATCCTCCTC
consensus            CAA CC CA C C AGC A G
sequence             TGC  T  T   G  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
12024                       466  3.37e-09 ACATCAAACG TAGCAACAACTCATCGTCATC AAACACATCA
25107                       256  7.80e-09 AGACGAGCGA ATACCACACCTCCTCTTCATC TTCACCGCCA
11466                       170  4.48e-08 ACCGACACTC ATCCATCACCTCATCGTACTG ATAATAAACA
25106                       399  1.73e-07 CAACGCTCCC ATTCACCATCTCATCTCCATC AACAGTCATT
264466                      250  1.94e-07 TGGAAGGAAC ATCCATCCATTCGTCGTCATC GCATCGTGTT
263887                      369  2.17e-07 TCGTTGAGAG AAGTCACATCTCGTTCTCCTC CTACCGTCAC
23543                       318  5.05e-07 GTCTCAAAAC CAACACCACTTCCTTCCCCTC TTCTTTAGCA
23851                       370  9.03e-07 CCAGACGAAC AAACACCCTCTCATATTCCCC TTCAGACGAT
22046                       391  1.09e-06 CACGTTCCTG AGCCGAGCCCGCCTCCTCCTC TGCACCAACT
bd738                       421  1.69e-06 AGGCCAAACT TTGCATCAAATCGTCGTTCTG ATCATTTGGA
24499                       466  1.84e-06 ACGGACAGAC AAGCGACGCCCCCTCCTCATT GCTTGCCTTA
25685                       241  2.18e-06 CATTGATACT TGGCACCCCCCCCCTCTGCTC TACCGGTTCT
34280                       470  2.78e-06 GTTTCTTCAT CTACACCACAGCATACCCCTG CACATTAACA
25590                       324  3.01e-06 CGGTGAGAGC CTGCAAGAACCCGTCGTACTT CAAGGCCCAA
20766                       444  3.80e-06 TCAAGCACAA AACCCACCTCTCCTATCGCTG GCTTCTTCGG
264646                      397  5.91e-06 TTACCGACAC CCCCACCAACTCATCATGTTC CTACTCAGGC
2220                        479  8.94e-06 CCGTCAACGA TGATACCATACCATCATCATC T
23669                       277  1.16e-05 GCTTCAAAAA AGGCCTCAACTGGCACTCCTT TATCGTATGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12024                             3.4e-09  465_[+2]_14
25107                             7.8e-09  255_[+2]_224
11466                             4.5e-08  169_[+2]_310
25106                             1.7e-07  398_[+2]_81
264466                            1.9e-07  249_[+2]_230
263887                            2.2e-07  368_[+2]_111
23543                             5.1e-07  317_[+2]_162
23851                               9e-07  369_[+2]_110
22046                             1.1e-06  390_[+2]_89
bd738                             1.7e-06  420_[+2]_59
24499                             1.8e-06  465_[+2]_14
25685                             2.2e-06  240_[+2]_239
34280                             2.8e-06  469_[+2]_10
25590                               3e-06  323_[+2]_156
20766                             3.8e-06  443_[+2]_36
264646                            5.9e-06  396_[+2]_83
2220                              8.9e-06  478_[+2]_1
23669                             1.2e-05  276_[+2]_203
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=18
12024                    (  466) TAGCAACAACTCATCGTCATC  1
25107                    (  256) ATACCACACCTCCTCTTCATC  1
11466                    (  170) ATCCATCACCTCATCGTACTG  1
25106                    (  399) ATTCACCATCTCATCTCCATC  1
264466                   (  250) ATCCATCCATTCGTCGTCATC  1
263887                   (  369) AAGTCACATCTCGTTCTCCTC  1
23543                    (  318) CAACACCACTTCCTTCCCCTC  1
23851                    (  370) AAACACCCTCTCATATTCCCC  1
22046                    (  391) AGCCGAGCCCGCCTCCTCCTC  1
bd738                    (  421) TTGCATCAAATCGTCGTTCTG  1
24499                    (  466) AAGCGACGCCCCCTCCTCATT  1
25685                    (  241) TGGCACCCCCCCCCTCTGCTC  1
34280                    (  470) CTACACCACAGCATACCCCTG  1
25590                    (  324) CTGCAAGAACCCGTCGTACTT  1
20766                    (  444) AACCCACCTCTCCTATCGCTG  1
264646                   (  397) CCCCACCAACTCATCATGTTC  1
2220                     (  479) TGATACCATACCATCATCATC  1
23669                    (  277) AGGCCTCAACTGGCACTCCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 12000 bayes= 9.51315 E= 1.2e-002
    98    -11  -1081    -22
    24   -211      2     59
    -2     21     83   -222
 -1081    189  -1081   -122
   124    -11    -98  -1081
    46     70  -1081    -22
 -1081    189    -98  -1081
   124     21   -197  -1081
    24     70  -1081     10
   -76    159  -1081   -122
 -1081    -11    -98    136
 -1081    198   -197  -1081
    46     47     34  -1081
 -1081   -111  -1081    178
   -35    135  -1081    -63
  -134     70     34    -22
 -1081    -11  -1081    159
  -134    147    -39   -222
    24    135  -1081   -222
 -1081   -211  -1081    187
 -1081    135      2    -63
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 18 E= 1.2e-002
 0.555556  0.222222  0.000000  0.222222
 0.333333  0.055556  0.222222  0.388889
 0.277778  0.277778  0.388889  0.055556
 0.000000  0.888889  0.000000  0.111111
 0.666667  0.222222  0.111111  0.000000
 0.388889  0.388889  0.000000  0.222222
 0.000000  0.888889  0.111111  0.000000
 0.666667  0.277778  0.055556  0.000000
 0.333333  0.388889  0.000000  0.277778
 0.166667  0.722222  0.000000  0.111111
 0.000000  0.222222  0.111111  0.666667
 0.000000  0.944444  0.055556  0.000000
 0.388889  0.333333  0.277778  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.222222  0.611111  0.000000  0.166667
 0.111111  0.388889  0.277778  0.222222
 0.000000  0.222222  0.000000  0.777778
 0.111111  0.666667  0.166667  0.055556
 0.333333  0.611111  0.000000  0.055556
 0.000000  0.055556  0.000000  0.944444
 0.000000  0.611111  0.222222  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[ACT][TAG][GAC]C[AC][ACT]C[AC][CAT]C[TC]C[ACG]T[CA][CGT][TC]C[CA]T[CG]
--------------------------------------------------------------------------------

Time 9.35 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME    width = 12    sites = 8    llr = 106    E-value = 2.0e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::15:1:::::a
pos.-specific     C  :::4::a4::1:
probability       G  a:91:9:55:9:
matrix            T  :a::a::15a::

         bits    2.2 *
                 2.0 **  * *  *
                 1.8 **  * *  * *
                 1.5 *** ***  ***
Relative         1.3 *** ***  ***
Entropy          1.1 *** *** ****
(19.1 bits)      0.9 *** *** ****
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GTGATGCGGTGA
consensus               C   CT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
bd738                       122  1.48e-07 GACGTTTTGA GTGCTGCGTTGA ATGTCGGAGA
bd1801                      232  1.48e-07 GACGTTTTGA GTGCTGCGTTGA ATGTCGGAGA
21451                       340  2.69e-07 CCAAACAGTT GTGATGCCTTGA GCACGGTGAA
25106                        13  3.10e-07 GTGATGCCTC GTGCTGCCTTGA CAACTGCGGA
25590                       308  3.38e-07 GCTCTGGGCT GTGGTGCGGTGA GAGCCTGCAA
23669                        21  6.55e-07 GAGAATGATA GTGATGCGGTCA CTCAATTACA
23543                        51  1.39e-06 GAGGTATAGA GTAATGCCGTGA ACGTGAAGGT
2220                        157  2.46e-06 AGCACTTGGT GTGATACTGTGA GTCGCTTCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd738                             1.5e-07  121_[+3]_367
bd1801                            1.5e-07  231_[+3]_257
21451                             2.7e-07  339_[+3]_149
25106                             3.1e-07  12_[+3]_476
25590                             3.4e-07  307_[+3]_181
23669                             6.5e-07  20_[+3]_468
23543                             1.4e-06  50_[+3]_438
2220                              2.5e-06  156_[+3]_332
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
bd738                    (  122) GTGCTGCGTTGA  1
bd1801                   (  232) GTGCTGCGTTGA  1
21451                    (  340) GTGATGCCTTGA  1
25106                    (   13) GTGCTGCCTTGA  1
25590                    (  308) GTGGTGCGGTGA  1
23669                    (   21) GTGATGCGGTCA  1
23543                    (   51) GTAATGCCGTGA  1
2220                     (  157) GTGATACTGTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 12225 bayes= 11.3139 E= 2.0e-001
  -965   -965    219   -965
  -965   -965   -965    195
  -117   -965    200   -965
    82     64    -81   -965
  -965   -965   -965    195
  -117   -965    200   -965
  -965    206   -965   -965
  -965     64    119   -105
  -965   -965    119     95
  -965   -965   -965    195
  -965    -94    200   -965
   182   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 2.0e-001
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.000000  0.875000  0.000000
 0.500000  0.375000  0.125000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.000000  0.875000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.375000  0.500000  0.125000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.125000  0.875000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
GTG[AC]TGC[GC][GT]TGA
--------------------------------------------------------------------------------

Time 13.96 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11466                            5.89e-04  169_[+2(4.48e-08)]_310
12024                            3.42e-05  465_[+2(3.37e-09)]_14
20766                            3.53e-03  443_[+2(3.80e-06)]_36
21001                            7.01e-01  500
21451                            3.83e-03  339_[+3(2.69e-07)]_149
22046                            6.68e-03  390_[+2(1.09e-06)]_89
2220                             4.46e-04  156_[+3(2.46e-06)]_310_\
    [+2(8.94e-06)]_1
23431                            6.84e-01  500
23543                            3.42e-06  50_[+3(1.39e-06)]_255_\
    [+2(5.05e-07)]_162
23669                            1.19e-04  20_[+3(6.55e-07)]_244_\
    [+2(1.16e-05)]_203
23851                            4.40e-03  369_[+2(9.03e-07)]_110
23927                            8.44e-01  500
24499                            5.64e-03  465_[+2(1.84e-06)]_14
25106                            1.14e-06  12_[+3(3.10e-07)]_374_\
    [+2(1.73e-07)]_81
25107                            6.94e-05  255_[+2(7.80e-09)]_224
25590                            1.06e-05  307_[+3(3.38e-07)]_4_[+2(3.01e-06)]_\
    156
25685                            9.47e-03  240_[+2(2.18e-06)]_239
263887                           8.59e-04  368_[+2(2.17e-07)]_1_[+2(2.49e-05)]_\
    89
264466                           1.43e-03  249_[+2(1.94e-07)]_198_\
    [+2(9.12e-05)]_11
264646                           5.82e-09  154_[+1(7.15e-11)]_221_\
    [+2(5.91e-06)]_83
268311                           1.23e-06  167_[+1(3.58e-11)]_312
29327                            7.10e-02  500
34280                            6.16e-03  469_[+2(2.78e-06)]_10
bd1801                           3.50e-12  167_[+1(1.79e-13)]_43_\
    [+3(1.48e-07)]_166_[+3(7.53e-05)]_79
bd738                            4.19e-15  57_[+1(1.79e-13)]_43_[+3(1.48e-07)]_\
    166_[+3(7.53e-05)]_109_[+2(1.69e-06)]_59
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************