********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/394/394.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42811                    1.0000    500  2761                     1.0000    500
46341                    1.0000    500  36569                    1.0000    500
13294                    1.0000    500  47143                    1.0000    500
14143                    1.0000    500  54756                    1.0000    500
14322                    1.0000    500  38270                    1.0000    500
14850                    1.0000    500  15102                    1.0000    500
15217                    1.0000    500  15479                    1.0000    500
18128                    1.0000    500  40124                    1.0000    500
7679                     1.0000    500  55114                    1.0000    500
43898                    1.0000    500  44226                    1.0000    500
44584                    1.0000    500  44719                    1.0000    500
26634                    1.0000    500  35739                    1.0000    500
12331                    1.0000    500  46970                    1.0000    500
43219                    1.0000    500  44727                    1.0000    500
49651                    1.0000    500  30609                    1.0000    500
49913                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/394/394.seqs.fa -oc motifs/394 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       31    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           15500    N=              31

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.256 C 0.243 G 0.237 T 0.264
Background letter frequencies (from dataset with add-one prior applied):
A 0.256 C 0.243 G 0.237 T 0.264
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  16  llr = 168  E-value = 5.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::1::::111
pos.-specific     C  2:531:1:238:
probability       G  2:5:3a::8:19
matrix            T  6a:86:9a:71:

         bits    2.1      *
                 1.9  *   * *
                 1.7  *   * *
                 1.5  *   ****  *
Relative         1.2  *   ****  *
Entropy          1.0  *** ****  *
(15.1 bits)      0.8  *** *******
                 0.6 **** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTCTTGTTGTCG
consensus              GCG    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44226                       389  3.91e-07 ATCGTTGGAC TTGTTGTTGCCG CTATCAACGA
12331                       412  7.71e-07 TACTGGTATC GTGTTGTTGTCG AGGACAGGAA
40124                        91  7.71e-07 ACAAACTTGT TTCCTGTTGTCG ATTTCGAGGT
46341                       288  7.71e-07 GCGCAATTTG GTGTTGTTGTCG TGTCTGTATA
13294                        82  1.63e-06 GACAGCGACA TTGTTGTTGTGG ACAAGCCACC
14850                       211  4.41e-06 ACCGTCTACG TTCTGGCTGTCG ACAAACACCC
49651                       120  7.17e-06 GGCGCAAACT TTCTGGTTGACG GAGAACGCAA
36569                       372  7.17e-06 CAATTTCTTG CTCTTGTTGTCA GCGACGCTTT
15479                       308  7.29e-06 GGTCGCATCA TTCTGGTTCCCG CTTTGAGCGC
30609                       292  7.91e-06 GGCAATTAGC TTCCAGTTGTCG TGCGAGAGAA
43898                       194  1.06e-05 AATAGGAATC GTGTTGTTCCCG CCAACGAAAA
47143                       459  1.29e-05 GCAGCTGTCT CTGTTGTTGTTG TTATTGTCCA
46970                       109  1.85e-05 GTTGGCAATC TTGTGGCTCTCG CTCTTTGCTT
15217                       343  2.08e-05 GACAGCGTCA TTGCCGTTGTGG GGATGGGCGT
26634                       487  2.48e-05 GTCAAGGTCC CTCTCGTTGTCA AC
15102                       358  2.48e-05 ATGGATTCTA TTCCTGTTGCAG AAGCAGCAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44226                             3.9e-07  388_[+1]_100
12331                             7.7e-07  411_[+1]_77
40124                             7.7e-07  90_[+1]_398
46341                             7.7e-07  287_[+1]_201
13294                             1.6e-06  81_[+1]_407
14850                             4.4e-06  210_[+1]_278
49651                             7.2e-06  119_[+1]_369
36569                             7.2e-06  371_[+1]_117
15479                             7.3e-06  307_[+1]_181
30609                             7.9e-06  291_[+1]_197
43898                             1.1e-05  193_[+1]_295
47143                             1.3e-05  458_[+1]_30
46970                             1.8e-05  108_[+1]_380
15217                             2.1e-05  342_[+1]_146
26634                             2.5e-05  486_[+1]_2
15102                             2.5e-05  357_[+1]_131
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=16
44226                    (  389) TTGTTGTTGCCG  1
12331                    (  412) GTGTTGTTGTCG  1
40124                    (   91) TTCCTGTTGTCG  1
46341                    (  288) GTGTTGTTGTCG  1
13294                    (   82) TTGTTGTTGTGG  1
14850                    (  211) TTCTGGCTGTCG  1
49651                    (  120) TTCTGGTTGACG  1
36569                    (  372) CTCTTGTTGTCA  1
15479                    (  308) TTCTGGTTCCCG  1
30609                    (  292) TTCCAGTTGTCG  1
43898                    (  194) GTGTTGTTCCCG  1
47143                    (  459) CTGTTGTTGTTG  1
46970                    (  109) TTGTGGCTCTCG  1
15217                    (  343) TTGCCGTTGTGG  1
26634                    (  487) CTCTCGTTGTCA  1
15102                    (  358) TTCCTGTTGCAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 15159 bayes= 10.6239 E= 5.2e+000
 -1064    -37    -34    124
 -1064  -1064  -1064    192
 -1064    104    108  -1064
 -1064      4  -1064    151
  -203    -96      8    109
 -1064  -1064    208  -1064
 -1064    -96  -1064    173
 -1064  -1064  -1064    192
 -1064    -37    178  -1064
  -203      4  -1064    138
  -203    163    -92   -208
  -103  -1064    188  -1064
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 5.2e+000
 0.000000  0.187500  0.187500  0.625000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.062500  0.125000  0.250000  0.562500
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.187500  0.812500  0.000000
 0.062500  0.250000  0.000000  0.687500
 0.062500  0.750000  0.125000  0.062500
 0.125000  0.000000  0.875000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TT[CG][TC][TG]GTTG[TC]CG
--------------------------------------------------------------------------------
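The two Motif 1 matrices above appear to be related by a simple rule: each non-zero log-odds entry looks like 100 * log2(p / background), rounded to an integer, using the background letter frequencies printed in the command line summary. The sketch below checks that reading against the first two matrix rows. The function name `log_odds_row` is hypothetical, and the large negative scores that MEME prints for zero probabilities (-1064 here) are not reproduced; they are returned as None instead. This is an observation drawn from the numbers in this file, not a description of MEME's internal code.

```python
import math

# Background frequencies (A, C, G, T) as printed in this file.
BACKGROUND = (0.256, 0.243, 0.237, 0.264)

def log_odds_row(probs, background=BACKGROUND):
    """Return round(100 * log2(p / bg)) per letter; None where p is zero.

    Assumption: zero-probability entries are left as None because the
    file's value for them (-1064) depends on details not shown here.
    """
    row = []
    for p, bg in zip(probs, background):
        row.append(round(100 * math.log2(p / bg)) if p > 0 else None)
    return row

# First two rows of the Motif 1 letter-probability matrix.
first = log_odds_row((0.0, 0.1875, 0.1875, 0.625))
second = log_odds_row((0.0, 0.0, 0.0, 1.0))
print(first)   # compare with the file's first row: -1064 -37 -34 124
print(second)  # compare with the file's second row: -1064 -1064 -1064 192
```
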




Time 8.25 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   7  llr = 124  E-value = 1.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  39971:::4:4::::::::7:
pos.-specific     C  11::::1::1:17:6a:99::
probability       G  6:::::9a4446::1:7::33
matrix            T  ::139a::14133a3:311:7

         bits    2.1        *       *
                 1.9      * *     * *
                 1.7      * *     * *
                 1.5  **  ***     * * **
Relative         1.2  ** ****    ** ****
Entropy          1.0  *******    ** ******
(25.7 bits)      0.8  *******    ** ******
                 0.6 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GAAATTGGAGAGCTCCGCCAT
consensus            A  T    GTGTT T T  GG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
40124                       365  3.62e-11 AAAGGCTGTC GAATTTGGGGGGCTTCGCCAT TCCACGCAAT
44584                        15  2.17e-10 TCACAGAACC GAAATTGGGTTCCTCCGCCAT GGTGAGAGGA
15479                       405  1.72e-09 AGACCGACGG GAAATTGGTGATCTGCGCCAG AATGTGGTCC
18128                       450  6.82e-09 GTCGTGGCAG AAAATTCGAGAGCTTCGTCAT ATTGAAGATC
43219                       446  7.43e-09 CCAAGTAGGG AAAAATGGACGTCTCCTCCAT TTTTGTCGGC
38270                       100  1.89e-08 CTAGAGGGGT GATTTTGGATAGTTCCTCCGG AAATGTCCAT
14322                       301  2.35e-08 CCCAGGAGAA CCAATTGGGTGGTTCCGCTGT ACATTCTTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40124                             3.6e-11  364_[+2]_115
44584                             2.2e-10  14_[+2]_465
15479                             1.7e-09  404_[+2]_75
18128                             6.8e-09  449_[+2]_30
43219                             7.4e-09  445_[+2]_34
38270                             1.9e-08  99_[+2]_380
14322                             2.4e-08  300_[+2]_179
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=7
40124                    (  365) GAATTTGGGGGGCTTCGCCAT  1
44584                    (   15) GAAATTGGGTTCCTCCGCCAT  1
15479                    (  405) GAAATTGGTGATCTGCGCCAG  1
18128                    (  450) AAAATTCGAGAGCTTCGTCAT  1
43219                    (  446) AAAAATGGACGTCTCCTCCAT  1
38270                    (  100) GATTTTGGATAGTTCCTCCGG  1
14322                    (  301) CCAATTGGGTGGTTCCGCTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 14880 bayes= 11.659 E= 1.1e+002
    16    -76    127   -945
   174    -76   -945   -945
   174   -945   -945    -88
   148   -945   -945     11
   -84   -945   -945    170
  -945   -945   -945    192
  -945    -76    185   -945
  -945   -945    208   -945
    74   -945     85    -88
  -945    -76     85     70
    74   -945     85    -88
  -945    -76    127     11
  -945    156   -945     11
  -945   -945   -945    192
  -945    123    -73     11
  -945    204   -945   -945
  -945   -945    159     11
  -945    182   -945    -88
  -945    182   -945    -88
   148   -945     27   -945
  -945   -945     27    143
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 1.1e+002
 0.285714  0.142857  0.571429  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.714286  0.000000  0.000000  0.285714
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.000000  0.428571  0.142857
 0.000000  0.142857  0.428571  0.428571
 0.428571  0.000000  0.428571  0.142857
 0.000000  0.142857  0.571429  0.285714
 0.000000  0.714286  0.000000  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.571429  0.142857  0.285714
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.714286  0.285714
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.857143  0.000000  0.142857
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.000000  0.285714  0.714286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GA]AA[AT]TTGG[AG][GT][AG][GT][CT]T[CT]C[GT]CC[AG][TG]
--------------------------------------------------------------------------------




Time 16.10 secs.

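The Motif 2 regular expression above can be used directly to scan DNA strings. The sketch below compiles it with Python's `re` module and runs it on two strings taken from the Motif 2 sites table. The helper name `find_sites` is hypothetical. One thing the example makes visible is that the expression seems to keep only the more frequent letters at each position, so it does not match every site that MEME itself reported.

```python
import re

# Regular expression copied from the "Motif 2 regular expression" section.
MOTIF2_REGEX = re.compile(
    r"[GA]AA[AT]TTGG[AG][GT][AG][GT][CT]T[CT]C[GT]CC[AG][TG]"
)

def find_sites(sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in MOTIF2_REGEX.finditer(sequence)]

# Site reported for sequence 40124, with its two 10-base flanks.
fragment = "AAAGGCTGTC" + "GAATTTGGGGGGCTTCGCCAT" + "TCCACGCAAT"
hits = find_sites(fragment)
print(hits)

# Site reported for sequence 44584: position 11 is T, which the
# expression's [AG] class at that position does not allow.
missed = find_sites("GAAATTGGGTTCCTCCGCCAT")
print(missed)
```

For scanning where weaker sites matter, scoring with the position-specific scoring matrix is likely a better fit than the regular expression.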
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =  15  llr = 172  E-value = 5.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :4aa288:59:68295
pos.-specific     C  11::71152:51:713
probability       G  95::1113213311::
matrix            T  :::::::21:2:1::1

         bits    2.1
                 1.9   **
                 1.7 * **
                 1.5 * **     *    *
Relative         1.2 * **     *    *
Entropy          1.0 * ** **  *  * *
(16.6 bits)      0.8 *******  *  ***
                 0.6 ******** *******
                 0.4 ******** *******
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGAACAACAACAACAA
consensus             A  A  GC GG A C
sequence                    TG T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
14143                       191  9.48e-10 CGCCACGTCC GGAACAACAACAACAC CAAGCACGCG
55114                       170  1.82e-07 AAAAGCACAG GAAACAAGTATAACAA GTTTTTCATG
15479                       142  3.32e-07 GAAGAATCCC GGAAAAGCAAGAACAA TTACGCCAAG
44226                       285  6.40e-07 CAGAATAATA GGAACAACGAGGAGAC TTACCCAGCA
14322                        45  7.12e-07 AGGAATTCCT GAAACGAGAATAACAA AAGAAGGTCT
26634                        65  1.07e-06 GGATCATTGC GAAACAACAACATGAA GACCTGTATA
12331                       430  1.72e-06 GTCGAGGACA GGAACCAGGAGCACAA ATCCTTAGGT
2761                        152  1.88e-06 AGTTGTTCTT GGAACAATCACCACCA ACGTCCTACA
54756                       438  2.25e-06 AGACCTCCGT GGAAAAAGCACAAAAT ATGCCGCCCG
44719                       266  4.39e-06 TCATTATCGA CAAACAACAATAAAAC TGCCTTTCAC
46970                       430  5.53e-06 ACATACGCGA GAAAGCACAGCGACAA CAGTAGCACA
14850                       266  5.53e-06 GTACCATGCA GCAACAGCAACGGCAA ATCGAGAGTA
13294                       386  5.53e-06 ATAGAGAACC GGAAGAATTAGAAAAC GCAACTCGGC
46341                       141  9.83e-06 CGACGAACAC GAAACACGCGCGACAC GGTATCGGAC
47143                       209  2.22e-05 TCTTTCGGCG GGAAAAATGACAGCCT CTGTCCGACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14143                             9.5e-10  190_[+3]_294
55114                             1.8e-07  169_[+3]_315
15479                             3.3e-07  141_[+3]_343
44226                             6.4e-07  284_[+3]_200
14322                             7.1e-07  44_[+3]_440
26634                             1.1e-06  64_[+3]_420
12331                             1.7e-06  429_[+3]_55
2761                              1.9e-06  151_[+3]_333
54756                             2.3e-06  437_[+3]_47
44719                             4.4e-06  265_[+3]_219
46970                             5.5e-06  429_[+3]_55
14850                             5.5e-06  265_[+3]_219
13294                             5.5e-06  385_[+3]_99
46341                             9.8e-06  140_[+3]_344
47143                             2.2e-05  208_[+3]_276
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=15
14143                    (  191) GGAACAACAACAACAC  1
55114                    (  170) GAAACAAGTATAACAA  1
15479                    (  142) GGAAAAGCAAGAACAA  1
44226                    (  285) GGAACAACGAGGAGAC  1
14322                    (   45) GAAACGAGAATAACAA  1
26634                    (   65) GAAACAACAACATGAA  1
12331                    (  430) GGAACCAGGAGCACAA  1
2761                     (  152) GGAACAATCACCACCA  1
54756                    (  438) GGAAAAAGCACAAAAT  1
44719                    (  266) CAAACAACAATAAAAC  1
46970                    (  430) GAAAGCACAGCGACAA  1
14850                    (  266) GCAACAGCAACGGCAA  1
13294                    (  386) GGAAGAATTAGAAAAC  1
46341                    (  141) GAAACACGCGCGACAC  1
47143                    (  209) GGAAAAATGACAGCCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 15035 bayes= 9.89267 E= 5.4e+002
 -1055   -186    198  -1055
    64   -186    117  -1055
   196  -1055  -1055  -1055
   196  -1055  -1055  -1055
   -36    146    -83  -1055
   164    -86   -183  -1055
   164   -186    -83  -1055
 -1055     94     49    -40
    87    -28    -24    -98
   176  -1055    -83  -1055
 -1055    113     17    -40
   123    -86     17  -1055
   164  -1055    -83   -198
   -36    146    -83  -1055
   176    -86  -1055  -1055
   106     46  -1055    -98
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 15 E= 5.4e+002
 0.000000  0.066667  0.933333  0.000000
 0.400000  0.066667  0.533333  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.666667  0.133333  0.000000
 0.800000  0.133333  0.066667  0.000000
 0.800000  0.066667  0.133333  0.000000
 0.000000  0.466667  0.333333  0.200000
 0.466667  0.200000  0.200000  0.133333
 0.866667  0.000000  0.133333  0.000000
 0.000000  0.533333  0.266667  0.200000
 0.600000  0.133333  0.266667  0.000000
 0.800000  0.000000  0.133333  0.066667
 0.200000  0.666667  0.133333  0.000000
 0.866667  0.133333  0.000000  0.000000
 0.533333  0.333333  0.000000  0.133333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[GA]AA[CA]AA[CGT][ACG]A[CGT][AG]A[CA]A[AC]
--------------------------------------------------------------------------------




Time 24.12 secs.

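The "Relative Entropy" figure in each motif description (16.6 bits for Motif 3) can be approximated from the letter-probability matrix by summing p * log2(p / background) over every position and letter. The sketch below does this for Motif 3 using the background frequencies printed in the command line summary. The function name `relative_entropy_bits` is hypothetical, and the result is expected to land near the reported value rather than match it digit for digit, since MEME's own calculation may differ in rounding or in the prior it applies.

```python
import math

# Background frequencies (A, C, G, T) as printed in this file.
BACKGROUND = (0.256, 0.243, 0.237, 0.264)

# Motif 3 letter-probability matrix, one row per position (A, C, G, T).
MOTIF3_PSPM = [
    (0.000000, 0.066667, 0.933333, 0.000000),
    (0.400000, 0.066667, 0.533333, 0.000000),
    (1.000000, 0.000000, 0.000000, 0.000000),
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.200000, 0.666667, 0.133333, 0.000000),
    (0.800000, 0.133333, 0.066667, 0.000000),
    (0.800000, 0.066667, 0.133333, 0.000000),
    (0.000000, 0.466667, 0.333333, 0.200000),
    (0.466667, 0.200000, 0.200000, 0.133333),
    (0.866667, 0.000000, 0.133333, 0.000000),
    (0.000000, 0.533333, 0.266667, 0.200000),
    (0.600000, 0.133333, 0.266667, 0.000000),
    (0.800000, 0.000000, 0.133333, 0.066667),
    (0.200000, 0.666667, 0.133333, 0.000000),
    (0.866667, 0.133333, 0.000000, 0.000000),
    (0.533333, 0.333333, 0.000000, 0.133333),
]

def relative_entropy_bits(pspm, background=BACKGROUND):
    """Sum p * log2(p / bg) over all positions; zero entries contribute 0."""
    total = 0.0
    for row in pspm:
        for p, bg in zip(row, background):
            if p > 0:
                total += p * math.log2(p / bg)
    return total

bits = relative_entropy_bits(MOTIF3_PSPM)
print(f"{bits:.2f} bits (file reports 16.6)")
```
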
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42811                            8.59e-01  500
2761                             9.61e-03  151_[+3(1.88e-06)]_333
46341                            2.72e-05  140_[+3(9.83e-06)]_131_\
    [+1(7.71e-07)]_85_[+1(2.08e-05)]_104
36569                            3.67e-02  371_[+1(7.17e-06)]_117
13294                            2.15e-05  81_[+1(1.63e-06)]_292_\
    [+3(5.53e-06)]_99
47143                            2.06e-03  208_[+3(2.22e-05)]_234_\
    [+1(1.29e-05)]_30
14143                            4.29e-05  190_[+3(9.48e-10)]_294
54756                            1.99e-03  437_[+3(2.25e-06)]_47
14322                            5.21e-07  44_[+3(7.12e-07)]_240_\
    [+2(2.35e-08)]_179
38270                            1.02e-04  99_[+2(1.89e-08)]_380
14850                            2.77e-04  210_[+1(4.41e-06)]_43_\
    [+3(5.53e-06)]_219
15102                            6.34e-02  357_[+1(2.48e-05)]_131
15217                            4.46e-02  342_[+1(2.08e-05)]_146
15479                            2.05e-10  141_[+3(3.32e-07)]_150_\
    [+1(7.29e-06)]_85_[+2(1.72e-09)]_75
18128                            1.18e-05  449_[+2(6.82e-09)]_30
40124                            4.66e-11  90_[+1(7.71e-07)]_71_[+3(3.08e-05)]_\
    175_[+2(3.62e-11)]_115
7679                             2.39e-01  500
55114                            3.28e-03  169_[+3(1.82e-07)]_315
43898                            1.32e-02  193_[+1(1.06e-05)]_295
44226                            7.87e-06  284_[+3(6.40e-07)]_88_\
    [+1(3.91e-07)]_100
44584                            3.35e-06  14_[+2(2.17e-10)]_465
44719                            3.47e-02  265_[+3(4.39e-06)]_219
26634                            4.49e-04  64_[+3(1.07e-06)]_406_\
    [+1(2.48e-05)]_2
35739                            5.51e-01  500
12331                            2.90e-05  282_[+1(4.98e-06)]_117_\
    [+1(7.71e-07)]_6_[+3(1.72e-06)]_55
46970                            5.60e-04  108_[+1(1.85e-05)]_309_\
    [+3(5.53e-06)]_55
43219                            4.45e-05  445_[+2(7.43e-09)]_34
44727                            9.42e-01  500
49651                            1.83e-02  119_[+1(7.17e-06)]_369
30609                            2.18e-02  291_[+1(7.91e-06)]_197
49913                            5.42e-02  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
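The combined block diagrams in the summary encode each site as a gap length, a bracketed motif with strand, motif number, and p-value, and then the next gap. Walking a diagram from left to right recovers the 1-based start of every site and the total sequence length. The sketch below does this for sequence 15479, using the motif widths reported in this file (12, 21, and 16). The names `parse_diagram`, `TOKEN`, and `WIDTHS` are hypothetical, and the parser is only an illustration of the notation as it appears here, not a full reader for MEME output.

```python
import re

# Motif widths as reported in the MOTIF 1, 2, and 3 header lines.
WIDTHS = {1: 12, 2: 21, 3: 16}

# Either a bracketed site such as [+3(3.32e-07)] or a bare gap length.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")

def parse_diagram(diagram, widths=WIDTHS):
    """Return ([(motif, 1-based start, p-value), ...], total length).

    Underscores, whitespace, and the trailing backslash that marks a
    continuation line are skipped because they match neither token.
    """
    sites = []
    pos = 0  # number of bases consumed so far
    for m in TOKEN.finditer(diagram):
        if m.group(4) is not None:
            pos += int(m.group(4))          # gap between sites
        else:
            motif = int(m.group(2))
            sites.append((motif, pos + 1, float(m.group(3))))
            pos += widths[motif]            # site occupies the motif width
    return sites, pos

# Diagram for sequence 15479, including its continuation line.
diagram_15479 = "141_[+3(3.32e-07)]_150_\\\n    [+1(7.29e-06)]_85_[+2(1.72e-09)]_75"
sites, total = parse_diagram(diagram_15479)
print(sites)
print(total)
```

The recovered starts (142, 308, 405) agree with the per-motif site tables for 15479, and the total matches the sequence length of 500 listed in the training set.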