********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/105/105.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47781                    1.0000    500  48833                    1.0000    500
44500                    1.0000    500  34011                    1.0000    500
11888                    1.0000    500  35688                    1.0000    500
38585                    1.0000    500  39107                    1.0000    500
33783                    1.0000    500  49222                    1.0000    500
49699                    1.0000    500  38803                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/105/105.seqs.fa -oc motifs/105 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.226 G 0.217 T 0.287
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.226 G 0.217 T 0.287
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  19  sites =   7  llr = 107  E-value = 8.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::13:73:::4:::3:1::
pos.-specific     C  :146::::a9::::131::
probability       G  391:::::::4:::36113
matrix            T  7:31a37a:11aaa31697

         bits    2.2         *
                 2.0         *
                 1.8     *  **  ***
                 1.5  *  *  *** ***
Relative         1.3  *  *  *** ***   *
Entropy          1.1 **  ** *** ***   **
(22.0 bits)      0.9 **  ****** ***   **
                 0.7 ** *********** * **
                 0.4 ** *********** * **
                 0.2 ************** ****
                 0.0 -------------------

Multilevel           TGCCTATTCCATTTAGTTT
consensus            G TA TA   G   GC  G
sequence                           T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
44500                       453  2.46e-09 TGTTATTGCT TGCATATTCCATTTACTTT ATTACGAAGG
48833                       217  3.81e-09 AAAAAACATG TGTCTAATCCGTTTAGTTG AATTTCGATT
35688                       438  5.70e-09 AAGTGTCAAA TGTCTATTCCGTTTCGATT TCTATATTAC
49699                       454  5.89e-08 TGGTCGCGAT TGCCTTTTCCATTTTTGTT TTTTACCGAC
49222                       179  7.53e-08 TGAGAGGGGC GGCCTTTTCCGTTTTGCGT ACCAAATTCC
38803                       194  3.35e-07 TCCGATCACG GGATTAATCCTTTTGCTTT CTTATAGAAA
39107                        55  3.70e-07 GAGTAAACTG TCGATATTCTATTTGGTTG GAACACTGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44500                             2.5e-09  452_[+1]_29
48833                             3.8e-09  216_[+1]_265
35688                             5.7e-09  437_[+1]_44
49699                             5.9e-08  453_[+1]_28
49222                             7.5e-08  178_[+1]_303
38803                             3.3e-07  193_[+1]_288
39107                             3.7e-07  54_[+1]_427
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=7
44500                    (  453) TGCATATTCCATTTACTTT  1
48833                    (  217) TGTCTAATCCGTTTAGTTG  1
35688                    (  438) TGTCTATTCCGTTTCGATT  1
49699                    (  454) TGCCTTTTCCATTTTTGTT  1
49222                    (  179) GGCCTTTTCCGTTTTGCGT  1
38803                    (  194) GGATTAATCCTTTTGCTTT  1
39107                    (   55) TCGATATTCTATTTGGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 5784 bayes= 9.53243 E= 8.7e+001
  -945   -945     40    132
  -945    -66    198   -945
   -92     93    -60      0
     8    134   -945   -100
  -945   -945   -945    180
   140   -945   -945      0
     8   -945   -945    132
  -945   -945   -945    180
  -945    215   -945   -945
  -945    192   -945   -100
    66   -945     98   -100
  -945   -945   -945    180
  -945   -945   -945    180
  -945   -945   -945    180
     8    -66     40      0
  -945     34    140   -100
   -92    -66    -60     99
  -945   -945    -60    158
  -945   -945     40    132
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 7 E= 8.7e+001
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.142857  0.857143  0.000000
 0.142857  0.428571  0.142857  0.285714
 0.285714  0.571429  0.000000  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.714286  0.000000  0.000000  0.285714
 0.285714  0.000000  0.000000  0.714286
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.428571  0.000000  0.428571  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.285714  0.142857  0.285714  0.285714
 0.000000  0.285714  0.571429  0.142857
 0.142857  0.142857  0.142857  0.571429
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.000000  0.285714  0.714286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TG]G[CT][CA]T[AT][TA]TCC[AG]TTT[AGT][GC]TT[TG]
--------------------------------------------------------------------------------

Time  1.63 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =   5  llr = 75  E-value = 9.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::a4a2::a26:4:28
pos.-specific     C  :::4::4::2::2822
probability       G  aa:::82a:62a226:
matrix            T  :::2::4:::2:2:::

         bits    2.2 **     *   *
                 2.0 *** *  **  *
                 1.8 *** *  **  *
                 1.5 *** *  **  * *
Relative         1.3 *** ** **  * * *
Entropy          1.1 *** ** **  * * *
(21.6 bits)      0.9 *** ** **  * * *
                 0.7 *** ** ***** ***
                 0.4 ************ ***
                 0.2 ************ ***
                 0.0 ----------------

Multilevel           GGAAAGCGAGAGACGA
consensus               C AT  AG CGAC
sequence                T  G  CT G C
                                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
34011                       412  6.87e-10 TAACCTTGAA GGAAAGTGAGAGACGA GAATTTTTTT
44500                       361  2.43e-08 ATTTGATACG GGATAGCGAGAGACAA ATACTTAACC
35688                        33  1.10e-07 GAGGGAATGT GGAAAGGGACGGCCGA CAAACCTCTG
38803                       257  3.01e-07 GCCTTGTTGT GGACAGTGAATGTCGC TCTCCCATCG
33783                       346  3.01e-07 GCCAGGATAT GGACAACGAGAGGGCA TATGTGAAGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34011                             6.9e-10  411_[+2]_73
44500                             2.4e-08  360_[+2]_124
35688                             1.1e-07  32_[+2]_452
38803                               3e-07  256_[+2]_228
33783                               3e-07  345_[+2]_139
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=5
34011                    (  412) GGAAAGTGAGAGACGA  1
44500                    (  361) GGATAGCGAGAGACAA  1
35688                    (   33) GGAAAGGGACGGCCGA  1
38803                    (  257) GGACAGTGAATGTCGC  1
33783                    (  346) GGACAACGAGAGGGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 10.4354 E= 9.0e+002
  -897   -897    220   -897
  -897   -897    220   -897
   188   -897   -897   -897
    56     83   -897    -52
   188   -897   -897   -897
   -44   -897    188   -897
  -897     83    -12     48
  -897   -897    220   -897
   188   -897   -897   -897
   -44    -17    147   -897
   114   -897    -12    -52
  -897   -897    220   -897
    56    -17    -12    -52
  -897    182    -12   -897
   -44    -17    147   -897
   156    -17   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 9.0e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.400000  0.000000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.400000  0.200000  0.400000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.200000  0.600000  0.000000
 0.600000  0.000000  0.200000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.000000  0.800000  0.200000  0.000000
 0.200000  0.200000  0.600000  0.000000
 0.800000  0.200000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GGA[ACT]A[GA][CTG]GA[GAC][AGT]G[ACGT][CG][GAC][AC]
--------------------------------------------------------------------------------

Time  3.24 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  19  sites =   5  llr = 89  E-value = 2.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::8:::24:222:8:a:::
pos.-specific     C  a:::244:62::::a::::
probability       G  :a:::244:2:2:2::a::
matrix            T  ::2a84:24486a::::aa

         bits    2.2 **            * *
                 2.0 **            ***
                 1.8 ** *        * *****
                 1.5 ** *        * *****
Relative         1.3 ** *        *******
Entropy          1.1 *****   * * *******
(25.6 bits)      0.9 *****   * * *******
                 0.7 ***** * * * *******
                 0.4 ********* *********
                 0.2 ********* *********
                 0.0 -------------------

Multilevel           CGATTCCACTTTTACAGTT
consensus              T CTGGTAAA G
sequence                  GAT C G
                              G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
33783                       305  7.04e-10 GAAAGATGTT CGATTTCGCTTTTGCAGTT GATGTAAAAA
47781                       248  2.01e-09 TCTTACAGCG CGATTGCATCTTTACAGTT AGAACAATCT
48833                       353  3.36e-09 AAGGTATTTA CGATTCGATGTATACAGTT AGAACGGCTT
38585                       357  7.32e-09 AGGAAAACAA CGTTTCGTCATTTACAGTT AATCGTGGCC
35688                       183  2.48e-08 CGGTCAAAAC CGATCTAGCTAGTACAGTT TGTAACCCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33783                               7e-10  304_[+3]_177
47781                               2e-09  247_[+3]_234
48833                             3.4e-09  352_[+3]_129
38585                             7.3e-09  356_[+3]_125
35688                             2.5e-08  182_[+3]_299
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=5
33783                    (  305) CGATTTCGCTTTTGCAGTT  1
47781                    (  248) CGATTGCATCTTTACAGTT  1
48833                    (  353) CGATTCGATGTATACAGTT  1
38585                    (  357) CGTTTCGTCATTTACAGTT  1
35688                    (  183) CGATCTAGCTAGTACAGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 5784 bayes= 10.4264 E= 2.3e+002
  -897    215   -897   -897
  -897   -897    220   -897
   156   -897   -897    -52
  -897   -897   -897    180
  -897    -17   -897    148
  -897     83    -12     48
   -44     83     88   -897
    56   -897     88    -52
  -897    141   -897     48
   -44    -17    -12     48
   -44   -897   -897    148
   -44   -897    -12    106
  -897   -897   -897    180
   156   -897    -12   -897
  -897    215   -897   -897
   188   -897   -897   -897
  -897   -897    220   -897
  -897   -897   -897    180
  -897   -897   -897    180
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 5 E= 2.3e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.400000  0.200000  0.400000
 0.200000  0.400000  0.400000  0.000000
 0.400000  0.000000  0.400000  0.200000
 0.000000  0.600000  0.000000  0.400000
 0.200000  0.200000  0.200000  0.400000
 0.200000  0.000000  0.000000  0.800000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.000000  0.000000  1.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CG[AT]T[TC][CTG][CGA][AGT][CT][TACG][TA][TAG]T[AG]CAGTT
--------------------------------------------------------------------------------

Time  4.92 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47781                            2.57e-05  247_[+3(2.01e-09)]_234
48833                            1.06e-09  216_[+1(3.81e-09)]_117_\
    [+3(3.36e-09)]_129
44500                            3.89e-09  360_[+2(2.43e-08)]_76_\
    [+1(2.46e-09)]_29
34011                            3.67e-05  411_[+2(6.87e-10)]_73
11888                            8.89e-01  500
35688                            1.07e-12  32_[+2(1.10e-07)]_134_\
    [+3(2.48e-08)]_236_[+1(5.70e-09)]_44
38585                            9.13e-05  356_[+3(7.32e-09)]_125
39107                            2.65e-03  54_[+1(3.70e-07)]_427
33783                            2.01e-09  304_[+3(7.04e-10)]_22_\
    [+2(3.01e-07)]_139
49222                            4.91e-05  178_[+1(7.53e-08)]_270_\
    [+3(6.46e-05)]_14
49699                            7.65e-04  453_[+1(5.89e-08)]_28
38803                            3.37e-06  193_[+1(3.35e-07)]_44_\
    [+2(3.01e-07)]_228
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************