********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/499/499.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11191                    1.0000    500  11209                    1.0000    500
24575                    1.0000    500  25098                    1.0000    500
264838                   1.0000    500  264847                   1.0000    500
30457                    1.0000    500  4082                     1.0000    500
8654                     1.0000    500  9284                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/499/499.seqs.fa -oc motifs/499 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.285 C 0.248 G 0.213 T 0.254
Background letter frequencies (from dataset with add-one prior applied):
A 0.285 C 0.248 G 0.213 T 0.254
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  15  sites =   8  llr = 107  E-value = 1.3e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3:3::a9:16:66:1
pos.-specific     C  8a:4a:1a::a:3a:
probability       G  ::14::::63:4::6
matrix            T  ::63::::31::1:3

         bits    2.2
                 2.0  *  *  *  *  *
                 1.8  *  ** *  *  *
                 1.6  *  ** *  *  *
Relative         1.3  *  ****  *  *
Entropy          1.1 **  ****  ** *
(19.4 bits)      0.9 **  ***** ** **
                 0.7 *** ***********
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CCTCCAACGACAACG
consensus            A AG    TG GC T
sequence                T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
8654                        214  3.13e-09 CTAACTAACC CCTGCAACGACGACG ACAAGCTCGG
11191                       348  1.00e-07 CGTTAGAGTG CCTTCAACGACATCG CATCGGTGCA
264847                      190  1.52e-07 TCCATCTCTC ACTTCAACGGCAACG ATCATCGCGA
24575                       301  3.44e-07 AGTAGAGAAG CCACCAACTGCGACG TCCATCACAA
30457                       232  4.94e-07 CAGAACGCTC ACAGCAACGACAACT TGATCCAGAT
4082                        431  6.87e-07 ACACCGTCGA CCTGCACCTACACCG ACTAACGATC
25098                       234  8.63e-07 GACTGCCCAA CCTCCAACAACAACA CGAGACATCC
9284                        283  1.96e-06 CGCCTCTTCC CCGCCAACGTCGCCT GTCTTACAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8654                              3.1e-09  213_[+1]_272
11191                               1e-07  347_[+1]_138
264847                            1.5e-07  189_[+1]_296
24575                             3.4e-07  300_[+1]_185
30457                             4.9e-07  231_[+1]_254
4082                              6.9e-07  430_[+1]_55
25098                             8.6e-07  233_[+1]_252
9284                                2e-06  282_[+1]_203
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=8
8654                     (  214) CCTGCAACGACGACG  1
11191                    (  348) CCTTCAACGACATCG  1
264847                   (  190) ACTTCAACGGCAACG  1
24575                    (  301) CCACCAACTGCGACG  1
30457                    (  232) ACAGCAACGACAACT  1
4082                     (  431) CCTGCACCTACACCG  1
25098                    (  234) CCTCCAACAACAACA  1
9284                     (  283) CCGCCAACGTCGCCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4860 bayes= 9.24436 E= 1.3e-001
   -19    160   -965   -965
  -965    201   -965   -965
   -19   -965    -77    130
  -965     60     81     -3
  -965    201   -965   -965
   181   -965   -965   -965
   162    -99   -965   -965
  -965    201   -965   -965
  -118   -965    155     -3
   113   -965     23   -102
  -965    201   -965   -965
   113   -965     81   -965
   113      1   -965   -102
  -965    201   -965   -965
  -118   -965    155     -3
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 8 E= 1.3e-001
 0.250000  0.750000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.125000  0.625000
 0.000000  0.375000  0.375000  0.250000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.000000  0.625000  0.250000
 0.625000  0.000000  0.250000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.625000  0.000000  0.375000  0.000000
 0.625000  0.250000  0.000000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.000000  0.625000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CA]C[TA][CGT]CAAC[GT][AG]C[AG][AC]C[GT]
--------------------------------------------------------------------------------




Time  0.93 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  14  sites =   8  llr = 100  E-value = 5.4e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  48:41:1:a:::51
pos.-specific     C  6:a111:3:::::9
probability       G  :1::3:98::495:
matrix            T  :1:559:::a61::

         bits    2.2
                 2.0   *      *
                 1.8   *     **
                 1.6   *   * ** *
Relative         1.3   *  ***** * *
Entropy          1.1   *  *********
(18.1 bits)      0.9 ***  *********
                 0.7 ***  *********
                 0.4 **** *********
                 0.2 **************
                 0.0 --------------

Multilevel           CACTTTGGATTGAC
consensus            A  AG  C  G G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
8654                        486  2.70e-09 ACTCACTACA CACTTTGGATTGGC C
264847                      129  6.30e-09 ACAACGCACC CACTTTGGATTGAC AATGTTTCAC
264838                      486  6.30e-09 ACAACGCATC CACTTTGGATTGAC C
11209                       204  6.43e-07 CGGTAGCCTT AGCATTGGATGGGC CTCAAAAAGG
4082                        269  3.36e-06 ATTACATTCT CACTGTAGATTGGA TTCCCACTCC
24575                       185  3.50e-06 TCCCTTTTGT ATCCGTGGATGGAC ACGGTCATTA
25098                        60  5.10e-06 GAACACAATC CACACTGCATGTGC CATCGCAAAT
30457                       410  7.19e-06 CGATAGTATC AACAACGCATTGAC TATCCGTGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8654                              2.7e-09  485_[+2]_1
264847                            6.3e-09  128_[+2]_358
264838                            6.3e-09  485_[+2]_1
11209                             6.4e-07  203_[+2]_283
4082                              3.4e-06  268_[+2]_218
24575                             3.5e-06  184_[+2]_302
25098                             5.1e-06  59_[+2]_427
30457                             7.2e-06  409_[+2]_77
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=8
8654                     (  486) CACTTTGGATTGGC  1
264847                   (  129) CACTTTGGATTGAC  1
264838                   (  486) CACTTTGGATTGAC  1
11209                    (  204) AGCATTGGATGGGC  1
4082                     (  269) CACTGTAGATTGGA  1
24575                    (  185) ATCCGTGGATGGAC  1
25098                    (   60) CACACTGCATGTGC  1
30457                    (  410) AACAACGCATTGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 4870 bayes= 9.98525 E= 5.4e+000
    40    133   -965   -965
   140   -965    -77   -102
  -965    201   -965   -965
    40    -99   -965     97
  -118    -99     23     97
  -965    -99   -965    178
  -118   -965    204   -965
  -965      1    181   -965
   181   -965   -965   -965
  -965   -965   -965    197
  -965   -965     81    130
  -965   -965    204   -102
    81   -965    123   -965
  -118    182   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 8 E= 5.4e+000
 0.375000  0.625000  0.000000  0.000000
 0.750000  0.000000  0.125000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.375000  0.125000  0.000000  0.500000
 0.125000  0.125000  0.250000  0.500000
 0.000000  0.125000  0.000000  0.875000
 0.125000  0.000000  0.875000  0.000000
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.375000  0.625000
 0.000000  0.000000  0.875000  0.125000
 0.500000  0.000000  0.500000  0.000000
 0.125000  0.875000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CA]AC[TA][TG]TG[GC]AT[TG]G[AG]C
--------------------------------------------------------------------------------




Time  1.77 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  15  sites =   5  llr = 78  E-value = 1.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::6a:6:::a2:8:
pos.-specific     C  4::::24a::::2::
probability       G  6:a4:8::8a:6826
matrix            T  :a::::::2::2::4

         bits    2.2   *      *
                 2.0  **    * *
                 1.8  ** *  * **
                 1.6  ** ** **** *
Relative         1.3  ** ** **** *
Entropy          1.1 ****** **** ***
(22.6 bits)      0.9 *********** ***
                 0.7 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GTGAAGACGGAGGAG
consensus            C  G CC T  ACGT
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
8654                         84  5.02e-10 GTGAAGAATG GTGAAGACGGAGGAG ACCAGATGAT
11191                       171  6.58e-09 AATAGTGACT GTGAAGACGGAGGGG GGCCGAAGTG
11209                       105  5.13e-08 GCTGGTAAGC CTGGAGCCGGAGCAG CCCCCTGAAA
30457                         5  1.52e-07       TTCG GTGGACCCGGATGAT GTAGAAGGAA
25098                       108  1.71e-07 CGCAAACTCT CTGAAGACTGAAGAT ATGGAAGAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8654                                5e-10  83_[+3]_402
11191                             6.6e-09  170_[+3]_315
11209                             5.1e-08  104_[+3]_381
30457                             1.5e-07  4_[+3]_481
25098                             1.7e-07  107_[+3]_378
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=5
8654                     (   84) GTGAAGACGGAGGAG  1
11191                    (  171) GTGAAGACGGAGGGG  1
11209                    (  105) CTGGAGCCGGAGCAG  1
30457                    (    5) GTGGACCCGGATGAT  1
25098                    (  108) CTGAAGACTGAAGAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4860 bayes= 10.8675 E= 1.8e+001
  -897     69    149   -897
  -897   -897   -897    197
  -897   -897    223   -897
   107   -897     91   -897
   181   -897   -897   -897
  -897    -31    191   -897
   107     69   -897   -897
  -897    201   -897   -897
  -897   -897    191    -35
  -897   -897    223   -897
   181   -897   -897   -897
   -51   -897    149    -35
  -897    -31    191   -897
   149   -897     -9   -897
  -897   -897    149     65
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 1.8e+001
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.600000  0.000000  0.400000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.600000  0.200000
 0.000000  0.200000  0.800000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  0.600000  0.400000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC]TG[AG]A[GC][AC]C[GT]GA[GAT][GC][AG][GT]
--------------------------------------------------------------------------------




Time  2.69 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11191                            7.68e-09  170_[+3(6.58e-09)]_162_\
    [+1(1.00e-07)]_138
11209                            7.79e-07  104_[+3(5.13e-08)]_84_\
    [+2(6.43e-07)]_283
24575                            2.94e-05  184_[+2(3.50e-06)]_102_\
    [+1(3.44e-07)]_185
25098                            2.53e-08  59_[+2(5.10e-06)]_34_[+3(1.71e-07)]_\
    111_[+1(8.63e-07)]_252
264838                           2.16e-04  485_[+2(6.30e-09)]_1
264847                           3.59e-08  128_[+2(6.30e-09)]_47_\
    [+1(1.52e-07)]_281_[+2(6.30e-09)]_1
30457                            1.86e-08  4_[+3(1.52e-07)]_212_[+1(4.94e-07)]_\
    49_[+3(4.22e-05)]_99_[+2(7.19e-06)]_32_[+1(4.31e-05)]_30
4082                             2.75e-05  268_[+2(3.36e-06)]_148_\
    [+1(6.87e-07)]_55
8654                             4.54e-16  83_[+3(5.02e-10)]_115_\
    [+1(3.13e-09)]_62_[+2(2.43e-05)]_139_[+1(9.74e-05)]_27_[+2(2.70e-09)]_1
9284                             7.09e-03  282_[+1(1.96e-06)]_203
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************