********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/191/191.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31662                    1.0000    500  9070                     1.0000    500
24978                    1.0000    500  13386                    1.0000    500
36899                    1.0000    500  38362                    1.0000    500
14760                    1.0000    500  29608                    1.0000    500
39209                    1.0000    500  43489                    1.0000    500
9706                     1.0000    500  44310                    1.0000    500
19586                    1.0000    500  11965                    1.0000    500
35363                    1.0000    500  35594                    1.0000    500
12921                    1.0000    500  46062                    1.0000    500
47703                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/191/191.seqs.fa -oc motifs/191 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.246 G 0.227 T 0.255
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.246 G 0.227 T 0.255
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 14 sites = 8 llr = 102 E-value = 8.1e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1::6::1:99:1:
pos.-specific     C  :3::14:4a1:341
probability       G  a:aa3:a::::339
matrix            T  :6:::6:5::153:

         bits    2.1 * **  *
                 1.9 * **  * *
                 1.7 * **  * *
                 1.5 * **  * *    *
Relative         1.3 * **  * ***  *
Entropy          1.1 * ** ** ***  *
(18.5 bits)      0.9 * ** ** ***  *
                 0.6 ***********  *
                 0.4 ************ *
                 0.2 **************
                 0.0 --------------

Multilevel           GTGGATGTCAATCG
consensus             C  GC C   CG
sequence                        GT

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            --------------
46062                       259  3.10e-09 TACCCTGGTA GTGGATGTCAATCG GTAGATACCA
9706                        423  2.89e-07 GAGTGGCGAG GCGGACGTCAAGCG GAGAAACGTG
38362                       470  2.89e-07 GGTGCTGTTG GTGGGTGCCAACGG TCTATTCCTG
43489                       449  5.28e-07 ACCATTTGTT GTGGATGCCCATTG TGGAAGCTGC
14760                       420  5.28e-07 TTCGTTTCGC GTGGATGTCATGCG AAACGAACGC
47703                        47  7.71e-07 TTCGCTACCC GCGGCTGTCAATTG CGGCACGCCT
35363                        99  1.33e-06 ACTGACTGTA GTGGGCGCCAACAG TAAAGTTGGC
35594                        74  6.54e-06 AGTATTCTTC GAGGACGACAATGC CGACGCATTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46062                             3.1e-09  258_[+1]_228
9706                              2.9e-07  422_[+1]_64
38362                             2.9e-07  469_[+1]_17
43489                             5.3e-07  448_[+1]_38
14760                             5.3e-07  419_[+1]_67
47703                             7.7e-07  46_[+1]_440
35363                             1.3e-06  98_[+1]_388
35594                             6.5e-06  73_[+1]_413
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=14 seqs=8
46062                    (  259) GTGGATGTCAATCG  1
9706                     (  423) GCGGACGTCAAGCG  1
38362                    (  470) GTGGGTGCCAACGG  1
43489                    (  449) GTGGATGCCCATTG  1
14760                    (  420) GTGGATGTCATGCG  1
47703                    (   47) GCGGCTGTCAATTG  1
35363                    (   99) GTGGGCGCCAACAG  1
35594                    (   74) GAGGACGACAATGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 9253 bayes= 10.1745 E= 8.1e+001
  -965  -965   214  -965
  -112     2  -965   129
  -965  -965   214  -965
  -965  -965   214  -965
   120   -98    14  -965
  -965    61  -965   129
  -965  -965   214  -965
  -112    61  -965    97
  -965   202  -965  -965
   168   -98  -965  -965
   168  -965  -965  -102
  -965     2    14    97
  -112    61    14    -3
  -965   -98   195  -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 8 E= 8.1e+001
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.250000  0.000000  0.625000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.625000  0.125000  0.250000  0.000000
 0.000000  0.375000  0.000000  0.625000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.375000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.000000  0.250000  0.250000  0.500000
 0.125000  0.375000  0.250000  0.250000
 0.000000  0.125000  0.875000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
G[TC]GG[AG][TC]G[TC]CAA[TCG][CGT]G
--------------------------------------------------------------------------------

Time 3.45 secs.
********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 12 sites = 11 llr = 120 E-value = 8.5e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a:6:1:3:884
pos.-specific     C  ::a3:::291::
probability       G  a::::13:1:23
matrix            T  :::1a875:1:4

         bits    2.1 *
                 1.9 *** *
                 1.7 *** *
                 1.5 *** *   *
Relative         1.3 *** *   * *
Entropy          1.1 *** *** ***
(15.8 bits)      0.9 *** *** ***
                 0.6 ******* ***
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GACATTTTCAAA
consensus               C  GA   T
sequence                        G

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
47703                       164  4.78e-07 TCCGGGACAA GACCTTTTCAAA ATCCTGCTCA
39209                       236  4.78e-07 ATTATCCGAC GACATTTACAAT TATGAGATCG
13386                        24  1.16e-06 CAAAAGATAA GACATTTCCAAG AACGTCGACA
9706                        101  1.22e-06 AGTCGTATAG GACATTTTCAGT CGGTCGGTCT
11965                       128  1.80e-06 TAATTGCATT GACCTTGTCAAA AAGTCCGGTT
31662                       255  3.64e-06 CAAAAAATTT GACATTTTGAAT CCAATCCTCG
36899                       159  4.50e-06 GTCCGTTCAC GACATTTTCTAG CTTTTGGCGT
9070                         43  8.56e-06 TTGCATGTCC GACATATACAAA GTGACAGTTG
35594                       286  1.29e-05 TGATCGATGC GACCTTGACAGT GAAATTATGC
43489                       166  1.75e-05 TACCGAACGG GACTTTTTCCAA CTGGCTCTTC
12921                       357  1.87e-05 CTTGTTGGTC GACATGGCCAAG CACAACGCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47703                             4.8e-07  163_[+2]_325
39209                             4.8e-07  235_[+2]_253
13386                             1.2e-06  23_[+2]_465
9706                              1.2e-06  100_[+2]_388
11965                             1.8e-06  127_[+2]_361
31662                             3.6e-06  254_[+2]_234
36899                             4.5e-06  158_[+2]_330
9070                              8.6e-06  42_[+2]_446
35594                             1.3e-05  285_[+2]_203
43489                             1.8e-05  165_[+2]_323
12921                             1.9e-05  356_[+2]_132
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=11
47703                    (  164) GACCTTTTCAAA  1
39209                    (  236) GACATTTACAAT  1
13386                    (   24) GACATTTCCAAG  1
9706                     (  101) GACATTTTCAGT  1
11965                    (  128) GACCTTGTCAAA  1
31662                    (  255) GACATTTTGAAT  1
36899                    (  159) GACATTTTCTAG  1
9070                     (   43) GACATATACAAA  1
35594                    (  286) GACCTTGACAGT  1
43489                    (  166) GACTTTTTCCAA  1
12921                    (  357) GACATGGCCAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 10.0759 E= 8.5e+001
 -1010 -1010   214 -1010
   188 -1010 -1010 -1010
 -1010   202 -1010 -1010
   122    15 -1010  -148
 -1010 -1010 -1010   197
  -158 -1010  -132   168
 -1010 -1010    27   151
     0   -44 -1010   110
 -1010   188  -132 -1010
   159  -144 -1010  -148
   159 -1010   -32 -1010
    42 -1010    27    51
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 8.5e+001
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.636364  0.272727  0.000000  0.090909
 0.000000  0.000000  0.000000  1.000000
 0.090909  0.000000  0.090909  0.818182
 0.000000  0.000000  0.272727  0.727273
 0.272727  0.181818  0.000000  0.545455
 0.000000  0.909091  0.090909  0.000000
 0.818182  0.090909  0.000000  0.090909
 0.818182  0.000000  0.181818  0.000000
 0.363636  0.000000  0.272727  0.363636
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
GAC[AC]TT[TG][TA]CAA[ATG]
--------------------------------------------------------------------------------

Time 6.63 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 12 llr = 129 E-value = 1.4e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1:88::99a2:2
pos.-specific     C  812235:::738
probability       G  19::8411::::
matrix            T  :::::1:::27:

         bits    2.1
                 1.9         *
                 1.7  *      *
                 1.5  *    ***
Relative         1.3 ***** ***  *
Entropy          1.1 ***** *** **
(15.5 bits)      0.9 ***** *** **
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGAAGCAAACTC
consensus                CG    C
sequence

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
39209                       104  4.20e-07 ATCCTCAGAT CGAACGAAACTC CCAAGACTAC
13386                        84  4.94e-07 GATAGCGATC CGAAGCAAATTC GATCAGATTT
46062                       170  6.42e-07 TATTTCGCTA CGAAGGAAATTC CAGTCGTAAA
9070                        430  1.04e-06 CCACTGTGAG CGAAGCAAACTA CACGAATTAA
38362                       200  1.93e-06 GACCCACTAG CGAAGCAGACTC GACACCAACG
36899                       412  2.22e-06 CCTCCCTCCC GGAAGGAAACTC ACAGAATTCA
12921                       164  5.03e-06 GATCCATTTC CCAAGCAAACCC CCTGGAACCG
19586                       423  6.39e-06 CGAGACGACG CGACGTAAACTC TAGTCGCGTT
47703                       317  1.25e-05 TTTTCGGTCA CGAACCGAACCC GGTCTTGCTT
29608                       304  1.92e-05 ACGCACGAGC AGACGGAAACCC GGTAGATGCT
9706                         72  2.05e-05 AATATATTTA CGCAGGAAAATA TCATCAAAGT
35363                       334  2.17e-05 TAAGCCGTGC CGCACCAAAACC TAACATAAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39209                             4.2e-07  103_[+3]_385
13386                             4.9e-07  83_[+3]_405
46062                             6.4e-07  169_[+3]_319
9070                                1e-06  429_[+3]_59
38362                             1.9e-06  199_[+3]_289
36899                             2.2e-06  411_[+3]_77
12921                               5e-06  163_[+3]_325
19586                             6.4e-06  422_[+3]_66
47703                             1.2e-05  316_[+3]_172
29608                             1.9e-05  303_[+3]_185
9706                              2.1e-05  71_[+3]_417
35363                             2.2e-05  333_[+3]_155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=12
39209                    (  104) CGAACGAAACTC  1
13386                    (   84) CGAAGCAAATTC  1
46062                    (  170) CGAAGGAAATTC  1
9070                     (  430) CGAAGCAAACTA  1
38362                    (  200) CGAAGCAGACTC  1
36899                    (  412) GGAAGGAAACTC  1
12921                    (  164) CCAAGCAAACCC  1
19586                    (  423) CGACGTAAACTC  1
47703                    (  317) CGAACCGAACCC  1
29608                    (  304) AGACGGAAACCC  1
9706                     (   72) CGCAGGAAAATA  1
35363                    (  334) CGCACCAAAACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 10.0427 E= 1.4e+001
  -171   176  -144 -1023
 -1023  -156   201 -1023
   161   -56 -1023 -1023
   161   -56 -1023 -1023
 -1023     2   172 -1023
 -1023   102    88  -161
   175 -1023  -144 -1023
   175 -1023  -144 -1023
   188 -1023 -1023 -1023
   -71   144 -1023   -61
 -1023    44 -1023   139
   -71   176 -1023 -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 1.4e+001
 0.083333  0.833333  0.083333  0.000000
 0.000000  0.083333  0.916667  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.500000  0.416667  0.083333
 0.916667  0.000000  0.083333  0.000000
 0.916667  0.000000  0.083333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.666667  0.000000  0.166667
 0.000000  0.333333  0.000000  0.666667
 0.166667  0.833333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
CGAA[GC][CG]AAAC[TC]C
--------------------------------------------------------------------------------

Time 9.68 secs.
********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31662                            3.87e-03  254_[+2(3.64e-06)]_234
9070                             6.78e-05  42_[+2(8.56e-06)]_375_\
                                           [+3(1.04e-06)]_59
24978                            9.74e-01  500
13386                            1.72e-05  23_[+2(1.16e-06)]_48_[+3(4.94e-07)]_\
                                           405
36899                            1.20e-04  158_[+2(4.50e-06)]_241_\
                                           [+3(2.22e-06)]_77
38362                            8.11e-06  199_[+3(1.93e-06)]_258_\
                                           [+1(2.89e-07)]_17
14760                            3.35e-03  419_[+1(5.28e-07)]_67
29608                            5.56e-02  303_[+3(1.92e-05)]_185
39209                            9.79e-07  103_[+3(4.20e-07)]_120_\
                                           [+2(4.78e-07)]_81_[+3(4.10e-05)]_160
43489                            9.94e-05  165_[+2(1.75e-05)]_271_\
                                           [+1(5.28e-07)]_38
9706                             2.02e-07  71_[+3(2.05e-05)]_17_[+2(1.22e-06)]_\
                                           310_[+1(2.89e-07)]_64
44310                            8.59e-01  500
19586                            4.25e-02  422_[+3(6.39e-06)]_66
11965                            2.70e-03  127_[+2(1.80e-06)]_361
35363                            4.87e-04  98_[+1(1.33e-06)]_221_\
                                           [+3(2.17e-05)]_155
35594                            7.13e-04  73_[+1(6.54e-06)]_198_\
                                           [+2(1.29e-05)]_203
12921                            1.61e-04  163_[+3(5.03e-06)]_181_\
                                           [+2(1.87e-05)]_132
46062                            6.94e-08  169_[+3(6.42e-07)]_77_\
                                           [+1(3.10e-09)]_63_[+3(5.82e-05)]_153
47703                            1.33e-07  46_[+1(7.71e-07)]_103_\
                                           [+2(4.78e-07)]_141_[+3(1.25e-05)]_172
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************