********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/493/493.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42714                    1.0000    500  28818                    1.0000    500
21868                    1.0000    500  23053                    1.0000    500
49741                    1.0000    500  33493                    1.0000    500
44792                    1.0000    500  46207                    1.0000    500
50610                    1.0000    500  47223                    1.0000    500
36426                    1.0000    500  43650                    1.0000    500
44921                    1.0000    500  40404                    1.0000    500
50093                    1.0000    500  45718                    1.0000    500
49236                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/493/493.seqs.fa -oc motifs/493 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.286 C 0.239 G 0.214 T 0.261
Background letter frequencies (from dataset with add-one prior applied):
A 0.286 C 0.239 G 0.214 T 0.261
********************************************************************************


********************************************************************************
MOTIF 1 MEME    width = 21  sites = 11  llr = 159  E-value = 1.5e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  111::1:1:::2:24:1:::1
pos.-specific     C  1:2:5:35328258:411536
probability       G  812::5::4:::5:33:9::3
matrix            T  :85a547448261:448:57:

         bits    2.2
                 2.0    *
                 1.8    *             *
                 1.6    *             *
Relative         1.3 *  *     **  *   *
Entropy          1.1 ** ** *  **  *  ****
(20.8 bits)      0.9 ** ** *  **  *  *****
                 0.7 ** ***** *****  *****
                 0.4 ** ******************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GTTTCGTCGTCTCCACTGCTC
consensus                TTCTT   G TT  TCG
sequence                     C     GG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
23053                         5  5.99e-11       CTCC GTGTCGTCTTCTGCTCTGCTC TGGATCCAGC
50610                       222  4.80e-10 TGGTTCTGTA GTTTTGTCGTCCCCTGTGTTC GCCTCGCATG
28818                       421  5.52e-08 CAGCTCCGGT GTCTCGTCGCTCCCGTTGCTC CTTTTATCAT
40404                       405  6.06e-08 ACACGACAAT ATCTTGTCCTCTCCTCTGTTG TCTGCCAAAC
46207                       215  1.44e-07 TGATAAGGGC GTTTCGCACTCTGCATTCTTC AAGATCTATA
33493                       243  1.84e-07 TACCCGGAAG CTTTCTTCGTCAGATCTGCTG ATGTTTTTAT
44921                       405  1.99e-07 GATAGGCACG GTTTCTTTCTCTTCATTGCCA TCGGTAACGT
47223                       121  2.15e-07 TGCCTTGCGT GGATTTTTTTCTCCGCTGTTG AGATCTGGTA
44792                        17  2.70e-07 TTACTGGTGC GTTTTGCCTCCAGCAGAGTTC CAAATCTATT
49236                       321  5.14e-07 TTTCTATCAT GTGTTTCTGTTTGAATTGCCC TTTCGGAAGT
50093                       269  8.65e-07 GACAGTGCAG GATTCATTTTCTCCGGCGCCC CCACCTTCCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23053                               6e-11  4_[+1]_475
50610                             4.8e-10  221_[+1]_258
28818                             5.5e-08  420_[+1]_59
40404                             6.1e-08  404_[+1]_75
46207                             1.4e-07  214_[+1]_265
33493                             1.8e-07  242_[+1]_237
44921                               2e-07  404_[+1]_75
47223                             2.1e-07  120_[+1]_359
44792                             2.7e-07  16_[+1]_463
49236                             5.1e-07  320_[+1]_159
50093                             8.6e-07  268_[+1]_211
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=11
23053                    (    5) GTGTCGTCTTCTGCTCTGCTC  1
50610                    (  222) GTTTTGTCGTCCCCTGTGTTC  1
28818                    (  421) GTCTCGTCGCTCCCGTTGCTC  1
40404                    (  405) ATCTTGTCCTCTCCTCTGTTG  1
46207                    (  215) GTTTCGCACTCTGCATTCTTC  1
33493                    (  243) CTTTCTTCGTCAGATCTGCTG  1
44921                    (  405) GTTTCTTTCTCTTCATTGCCA  1
47223                    (  121) GGATTTTTTTCTCCGCTGTTG  1
44792                    (   17) GTTTTGCCTCCAGCAGAGTTC  1
49236                    (  321) GTGTTTCTGTTTGAATTGCCC  1
50093                    (  269) GATTCATTTTCTCCGGCGCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 9.08481 E= 1.5e+000
  -165   -139    193  -1010
  -165  -1010   -124    165
  -165    -39    -24    106
 -1010  -1010  -1010    193
 -1010    119  -1010     80
  -165  -1010    135     48
 -1010     19  -1010    148
  -165    119  -1010     48
 -1010     19     76     48
 -1010    -39  -1010    165
 -1010    178  -1010    -52
   -65    -39  -1010    128
 -1010     93    108   -152
   -65    178  -1010  -1010
    35  -1010     35     48
 -1010     61     35     48
  -165   -139  -1010    165
 -1010   -139    208  -1010
 -1010    119  -1010     80
 -1010     19  -1010    148
  -165    141     35  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 11 E= 1.5e+000
 0.090909  0.090909  0.818182  0.000000
 0.090909  0.000000  0.090909  0.818182
 0.090909  0.181818  0.181818  0.545455
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.545455  0.000000  0.454545
 0.090909  0.000000  0.545455  0.363636
 0.000000  0.272727  0.000000  0.727273
 0.090909  0.545455  0.000000  0.363636
 0.000000  0.272727  0.363636  0.363636
 0.000000  0.181818  0.000000  0.818182
 0.000000  0.818182  0.000000  0.181818
 0.181818  0.181818  0.000000  0.636364
 0.000000  0.454545  0.454545  0.090909
 0.181818  0.818182  0.000000  0.000000
 0.363636  0.000000  0.272727  0.363636
 0.000000  0.363636  0.272727  0.363636
 0.090909  0.090909  0.000000  0.818182
 0.000000  0.090909  0.909091  0.000000
 0.000000  0.545455  0.000000  0.454545
 0.000000  0.272727  0.000000  0.727273
 0.090909  0.636364  0.272727  0.000000
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GTTT[CT][GT][TC][CT][GTC]TCT[CG]C[ATG][CTG]TG[CT][TC][CG]
--------------------------------------------------------------------------------

Time 2.53 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME    width = 12  sites = 14  llr = 134  E-value = 5.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  94a:1:4:24:4
pos.-specific     C  14:11a:9:1a3
probability       G  :::6::::11:4
matrix            T  :3:29:6165::

         bits    2.2
                 2.0      *    *
                 1.8   *  * *  *
                 1.6 * *  * *  *
Relative         1.3 * *  * *  *
Entropy          1.1 * * ** *  *
(13.9 bits)      0.9 * ******  *
                 0.7 * ******* *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AAAGTCTCTTCA
consensus             C T  A AA G
sequence              T         C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
50093                       182  3.26e-07 TAATTCGACA ATAGTCTCTTCG TGGCCTCCAG
28818                       339  1.36e-06 CAAAAGAGCA ACAGTCTCTACC GTCTGCCTAT
46207                       277  2.21e-06 TTAAGTAGAA ATAGTCTCTACC TAGCTAGCTA
44921                       256  5.20e-06 AACGCCACGC AAATTCTCTTCC CTGTCAAACC
23053                       154  6.26e-06 ATCGGCATTC ACAGTCACATCA AAGAAAAACC
45718                       179  7.22e-06 CCAGACATTC ACACTCACTTCG CGAACGTTCC
36426                       158  1.27e-05 ACATCACCAG ATATTCTCTACC GTTACGCACT
50610                       127  1.90e-05 TCCTTGAAGA AAAGACTCTTCG ACATCGTTTG
49236                       271  2.79e-05 AGAGTTGAAT ATAGTCATTTCG AAATGTGATG
42714                       156  3.99e-05 AGACTGGATT ACAGTCACAGCA GCAATACAGC
44792                       283  4.49e-05 GGATACTGCT AAACTCTCGACA TGGAATATTG
21868                       160  4.81e-05 CGGAGGAAAT AAAGTCACACCA ATCAGTCCGA
43650                        27  6.21e-05 TCACAAATTG CAAGTCTCGACG TGAAGTGGAC
47223                       182  6.21e-05 ACTGCAACAG ACATCCACTTCA CAATCTTTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50093                             3.3e-07  181_[+2]_307
28818                             1.4e-06  338_[+2]_150
46207                             2.2e-06  276_[+2]_212
44921                             5.2e-06  255_[+2]_233
23053                             6.3e-06  153_[+2]_335
45718                             7.2e-06  178_[+2]_310
36426                             1.3e-05  157_[+2]_331
50610                             1.9e-05  126_[+2]_362
49236                             2.8e-05  270_[+2]_218
42714                               4e-05  155_[+2]_333
44792                             4.5e-05  282_[+2]_206
21868                             4.8e-05  159_[+2]_329
43650                             6.2e-05  26_[+2]_462
47223                             6.2e-05  181_[+2]_307
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=14
50093                    (  182) ATAGTCTCTTCG  1
28818                    (  339) ACAGTCTCTACC  1
46207                    (  277) ATAGTCTCTACC  1
44921                    (  256) AAATTCTCTTCC  1
23053                    (  154) ACAGTCACATCA  1
45718                    (  179) ACACTCACTTCG  1
36426                    (  158) ATATTCTCTACC  1
50610                    (  127) AAAGACTCTTCG  1
49236                    (  271) ATAGTCATTTCG  1
42714                    (  156) ACAGTCACAGCA  1
44792                    (  283) AAACTCTCGACA  1
21868                    (  160) AAAGTCACACCA  1
43650                    (   27) CAAGTCTCGACG  1
47223                    (  182) ACATCCACTTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.81792 E= 5.9e+002
   170   -174  -1045  -1045
    32     58  -1045     13
   181  -1045  -1045  -1045
 -1045    -74    158    -29
  -200   -174  -1045    171
 -1045    207  -1045  -1045
    59  -1045  -1045    113
 -1045    196  -1045   -187
   -41  -1045    -59    130
    32   -174   -158     94
 -1045    207  -1045  -1045
    32     26     74  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 5.9e+002
 0.928571  0.071429  0.000000  0.000000
 0.357143  0.357143  0.000000  0.285714
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.142857  0.642857  0.214286
 0.071429  0.071429  0.000000  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  0.928571  0.000000  0.071429
 0.214286  0.000000  0.142857  0.642857
 0.357143  0.071429  0.071429  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.357143  0.285714  0.357143  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
A[ACT]A[GT]TC[TA]C[TA][TA]C[AGC]
--------------------------------------------------------------------------------

Time 5.00 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME    width = 21  sites = 4  llr = 84  E-value = 9.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::83a:3a5:aa:::38::3:
pos.-specific     C  a5::::8::a::5a:::3:33
probability       G  :335:a::5:::5:a5::a58
matrix            T  :3:3:::::::::::338:::

         bits    2.2      *        *   *
                 2.0 *    *   *   **   *
                 1.8 *   ** * *** **   *
                 1.6 *   ** * *** **   *
Relative         1.3 *   ** * *** **   * *
Entropy          1.1 * * *********** *** *
(30.4 bits)      0.9 * * *********** *** *
                 0.7 *** *********** *****
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCAGAGCAACAACCGGATGGG
consensus             GGA  A G   G  ATC AC
sequence              T T           T   C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
28818                       283  3.78e-12 ACGATAACAC CCAAAGCAGCAACCGGATGGG GATGGTATCG
46207                        28  7.90e-11 CAAATCAGTT CGAGAGCAGCAAGCGAATGGC ATTGAAAACC
47223                       447  5.75e-10 TCTCTATTAG CTAGAGAAACAACCGGTTGAG AAAAGCTGGG
21868                       362  7.09e-10 AACACCTGCT CCGTAGCAACAAGCGTACGCG GGAAGAAGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
28818                             3.8e-12  282_[+3]_197
46207                             7.9e-11  27_[+3]_452
47223                             5.7e-10  446_[+3]_33
21868                             7.1e-10  361_[+3]_118
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=4
28818                    (  283) CCAAAGCAGCAACCGGATGGG  1
46207                    (   28) CGAGAGCAGCAAGCGAATGGC  1
47223                    (  447) CTAGAGAAACAACCGGTTGAG  1
21868                    (  362) CCGTAGCAACAAGCGTACGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 10.9936 E= 9.7e+002
  -865    206   -865   -865
  -865    107     22     -6
   139   -865     22   -865
   -19   -865    122     -6
   181   -865   -865   -865
  -865   -865    222   -865
   -19    165   -865   -865
   181   -865   -865   -865
    81   -865    122   -865
  -865    206   -865   -865
   181   -865   -865   -865
   181   -865   -865   -865
  -865    107    122   -865
  -865    206   -865   -865
  -865   -865    222   -865
   -19   -865    122     -6
   139   -865   -865     -6
  -865      7   -865    152
  -865   -865    222   -865
   -19      7    122   -865
  -865      7    180   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 9.7e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.750000  0.000000  0.250000  0.000000
 0.250000  0.000000  0.500000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.250000  0.500000  0.000000
 0.000000  0.250000  0.750000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[CGT][AG][GAT]AG[CA]A[AG]CAA[CG]CG[GAT][AT][TC]G[GAC][GC]
--------------------------------------------------------------------------------

Time 8.08 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42714                            5.81e-02  155_[+2(3.99e-05)]_333
28818                            2.43e-14  282_[+3(3.78e-12)]_35_\
    [+2(1.36e-06)]_70_[+1(5.52e-08)]_59
21868                            4.08e-07  159_[+2(4.81e-05)]_190_\
    [+3(7.09e-10)]_118
23053                            2.08e-08  4_[+1(5.99e-11)]_128_[+2(6.26e-06)]_\
    335
49741                            8.23e-01  500
33493                            2.05e-04  242_[+1(1.84e-07)]_237
44792                            1.52e-04  16_[+1(2.70e-07)]_245_\
    [+2(4.49e-05)]_206
46207                            1.68e-12  27_[+3(7.90e-11)]_166_\
    [+1(1.44e-07)]_41_[+2(2.21e-06)]_212
50610                            2.36e-07  126_[+2(1.90e-05)]_83_\
    [+1(4.80e-10)]_258
47223                            3.53e-10  120_[+1(2.15e-07)]_40_\
    [+2(6.21e-05)]_253_[+3(5.75e-10)]_33
36426                            3.60e-02  157_[+2(1.27e-05)]_331
43650                            1.88e-01  26_[+2(6.21e-05)]_462
44921                            2.12e-05  255_[+2(5.20e-06)]_137_\
    [+1(1.99e-07)]_75
40404                            2.22e-04  404_[+1(6.06e-08)]_75
50093                            7.01e-06  181_[+2(3.26e-07)]_75_\
    [+1(8.65e-07)]_211
45718                            7.80e-03  178_[+2(7.22e-06)]_310
49236                            2.58e-04  270_[+2(2.79e-05)]_38_\
    [+1(5.14e-07)]_159
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
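
The log-odds entries reported above look consistent with 100 * log2(p / background), rounded to an integer, where p is the matching entry of the letter-probability matrix. The sketch below checks that reading against the first row of Motif 1. The formula is inferred from the numbers in this file, not taken from the MEME source, and the helper name `log_odds_row` is hypothetical. Zero-probability letters (reported here as -1010, -1045 or -865) and occasional differences of one unit presumably come from the prior MEME applies, which this sketch does not model.

```python
import math

# Background letter frequencies as printed in this file (order: A, C, G, T).
BACKGROUND = [0.286, 0.239, 0.214, 0.261]


def log_odds_row(probs, background=BACKGROUND):
    """Return round(100 * log2(p / bg)) per letter, or None where p == 0.

    Assumption: this mirrors how the scores in this file relate to the
    probabilities; MEME's own pseudocount handling is not reproduced.
    """
    scores = []
    for p, bg in zip(probs, background):
        if p == 0.0:
            # log2(0) is undefined without a pseudocount, so leave a gap.
            scores.append(None)
        else:
            scores.append(round(100 * math.log2(p / bg)))
    return scores


# First row of the Motif 1 letter-probability matrix and of its log-odds matrix,
# copied from this file.
pspm_row = [0.090909, 0.090909, 0.818182, 0.000000]
pssm_row = [-165, -139, 193, -1010]

computed = log_odds_row(pspm_row)
for letter, got, reported in zip("ACGT", computed, pssm_row):
    print(letter, "computed:", got, "reported:", reported)
```

Comparing within a tolerance of one unit, rather than for exact equality, is the safer check, because the printed background frequencies are rounded to three decimals.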