********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/424/424.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42828                    1.0000    500  47815                    1.0000    500
43760                    1.0000    500  5543                     1.0000    500
6817                     1.0000    500  35233                    1.0000    500
46381                    1.0000    500  48422                    1.0000    500
36830                    1.0000    500  37932                    1.0000    500
50473                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/424/424.seqs.fa -oc motifs/424 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.255 C 0.271 G 0.218 T 0.255
Background letter frequencies (from dataset with add-one prior applied):
A 0.255 C 0.271 G 0.218 T 0.255
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 20 sites = 7 llr = 109 E-value = 1.6e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  39a:1919::41374:a63:
pos.-specific     C  71:a:17:764:7:1a:3:9
probability       G  ::::6::11416:33::111
matrix            T  ::::3:1:1::3::1:::6:

         bits    2.2
                 2.0   **           **
                 1.8   **           **
                 1.5   **           **
Relative         1.3  *** * *       **  *
Entropy          1.1 **** * * *  ** **  *
(22.4 bits)      0.9 **** * ***  ** **  *
                 0.7 ********** *** *****
                 0.4 ************** *****
                 0.2 ********************
                 0.0 --------------------

Multilevel           CAACGACACCAGCAACAATC
consensus            A   T    GCTAGG  CA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
35233                       394  3.32e-12 AGAACGACAA CAACGACACCAGCAACAATC TCCCTTTCCT
5543                        236  1.13e-08 CCCGCAACAC CAACTACATCAACAACAATC ACGACATGGA
6817                        341  3.19e-08 ACTTCTGCTA CAACGACACGCTCGTCACGC CACCGTATCC
46381                       269  7.66e-08 CGGTCAAATC CAACTACGCGCGCGCCAGTC ATTTCGAAAA
42828                        85  1.12e-07 TCACAATCAG AAACGCAACGCGAAGCAAAC AGTGTAGTCG
47815                       465  1.26e-07 GCAGTAGCAA CAACAACAGCAGCAGCACAG TCGTTGTTTT
43760                       292  2.09e-07 AGGCCTGCGA ACACGATACCGTAAACAATC TATCGCCACG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35233                             3.3e-12  393_[+1]_87
5543                              1.1e-08  235_[+1]_245
6817                              3.2e-08  340_[+1]_140
46381                             7.7e-08  268_[+1]_212
42828                             1.1e-07  84_[+1]_396
47815                             1.3e-07  464_[+1]_16
43760                             2.1e-07  291_[+1]_189
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=7
35233                    (  394) CAACGACACCAGCAACAATC  1
5543                     (  236) CAACTACATCAACAACAATC  1
6817                     (  341) CAACGACACGCTCGTCACGC  1
46381                    (  269) CAACTACGCGCGCGCCAGTC  1
42828                    (   85) AAACGCAACGCGAAGCAAAC  1
47815                    (  465) CAACAACAGCAGCAGCACAG  1
43760                    (  292) ACACGATACCGTAAACAATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5291 bayes= 9.40372 E= 1.6e+003
    16    140   -945   -945
   175    -92   -945   -945
   197   -945   -945   -945
  -945    188   -945   -945
   -84   -945    139     16
   175    -92   -945   -945
   -84    140   -945    -84
   175   -945    -61   -945
  -945    140    -61    -84
  -945    107     97   -945
    75     66    -61   -945
   -84   -945    139     16
    16    140   -945   -945
   148   -945     39   -945
    75    -92     39    -84
  -945    188   -945   -945
   197   -945   -945   -945
   116      8    -61   -945
    16   -945    -61    116
  -945    166    -61   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 1.6e+003
 0.285714  0.714286  0.000000  0.000000
 0.857143  0.142857  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.000000  0.571429  0.285714
 0.857143  0.142857  0.000000  0.000000
 0.142857  0.714286  0.000000  0.142857
 0.857143  0.000000  0.142857  0.000000
 0.000000  0.714286  0.142857  0.142857
 0.000000  0.571429  0.428571  0.000000
 0.428571  0.428571  0.142857  0.000000
 0.142857  0.000000  0.571429  0.285714
 0.285714  0.714286  0.000000  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.428571  0.142857  0.285714  0.142857
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.571429  0.285714  0.142857  0.000000
 0.285714  0.000000  0.142857  0.571429
 0.000000  0.857143  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CA]AAC[GT]ACAC[CG][AC][GT][CA][AG][AG]CA[AC][TA]C
--------------------------------------------------------------------------------

Time  1.02 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 19 sites = 7 llr = 113 E-value = 4.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  97:4::11::1:a7::4:4
pos.-specific     C  ::3:a1::::9:::1::9:
probability       G  133::94131:6:31::16
matrix            T  ::46::4779:4::7a6::

         bits    2.2
                 2.0     *       *  *
                 1.8     *       *  *
                 1.5     **      *  *
Relative         1.3 *   **   ** *  * *
Entropy          1.1 **  **  ****** * **
(23.3 bits)      0.9 ** *** ************
                 0.7 ** ****************
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           AATTCGGTTTCGAATTTCG
consensus             GCA  T G  T G  A A
sequence               G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
50473                       406  6.93e-10 GAAGCTGTTC AAGTCGTTGTCGAATTTCA AAACAGCCCA
36830                       148  1.83e-09 GCATAGTTCT AGCACGTTTTCTAATTTCG ATTGATTTTC
48422                       354  4.29e-09 ACTCCGTGCA AATTCGGTTTCTAGGTTCG CCGACGAGGC
42828                       463  4.78e-09 GCTCACTTGC AAGTCCTTTTCGAATTTCA CTCTCCGCAT
37932                       166  3.22e-08 TAATAGTCCA AATACGGTTGAGAATTACA GTCAATGCCA
43760                        86  1.41e-07 CCGACGCGCA AGTACGAATTCTAATTAGG CATATCGAAA
5543                         51  2.22e-07 ATTGGAAGCC GACTCGGGGTCGAGCTACG CACGCATGGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50473                             6.9e-10  405_[+2]_76
36830                             1.8e-09  147_[+2]_334
48422                             4.3e-09  353_[+2]_128
42828                             4.8e-09  462_[+2]_19
37932                             3.2e-08  165_[+2]_316
43760                             1.4e-07  85_[+2]_396
5543                              2.2e-07  50_[+2]_431
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=7
50473                    (  406) AAGTCGTTGTCGAATTTCA  1
36830                    (  148) AGCACGTTTTCTAATTTCG  1
48422                    (  354) AATTCGGTTTCTAGGTTCG  1
42828                    (  463) AAGTCCTTTTCGAATTTCA  1
37932                    (  166) AATACGGTTGAGAATTACA  1
43760                    (   86) AGTACGAATTCTAATTAGG  1
5543                     (   51) GACTCGGGGTCGAGCTACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 5302 bayes= 9.40672 E= 4.2e+000
   175   -945    -61   -945
   148   -945     39   -945
  -945      8     39     75
    75   -945   -945    116
  -945    188   -945   -945
  -945    -92    197   -945
   -84   -945     97     75
   -84   -945    -61    148
  -945   -945     39    148
  -945   -945    -61    175
   -84    166   -945   -945
  -945   -945    139     75
   197   -945   -945   -945
   148   -945     39   -945
  -945    -92    -61    148
  -945   -945   -945    197
    75   -945   -945    116
  -945    166    -61   -945
    75   -945    139   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 7 E= 4.2e+000
 0.857143  0.000000  0.142857  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.285714  0.285714  0.428571
 0.428571  0.000000  0.000000  0.571429
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.142857  0.000000  0.428571  0.428571
 0.142857  0.000000  0.142857  0.714286
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.000000  0.142857  0.857143
 0.142857  0.857143  0.000000  0.000000
 0.000000  0.000000  0.571429  0.428571
 1.000000  0.000000  0.000000  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.142857  0.142857  0.714286
 0.000000  0.000000  0.000000  1.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  0.857143  0.142857  0.000000
 0.428571  0.000000  0.571429  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
A[AG][TCG][TA]CG[GT]T[TG]TC[GT]A[AG]TT[TA]C[GA]
--------------------------------------------------------------------------------

Time  2.19 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 9 llr = 96 E-value = 1.6e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :a::8:21:1a1
pos.-specific     C  1:a:28::16:9
probability       G  6::4:2:962::
matrix            T  3::6::8:31::

         bits    2.2
                 2.0  **       *
                 1.8  **    *  *
                 1.5  **    *  *
Relative         1.3  **    *  **
Entropy          1.1  *******  **
(15.3 bits)      0.9  *******  **
                 0.7 ********* **
                 0.4 ********* **
                 0.2 ************
                 0.0 ------------

Multilevel           GACTACTGGCAC
consensus            T  GCGA TG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
35233                        14  1.12e-07 TTTTGACGGT GACTACTGGCAC CCGGACCCGC
46381                       335  1.57e-06 GGTGCGTATC GACGAGTGTCAC TTCCCAGCGG
48422                       206  1.92e-06 TCGCGGTAGC GACTAGTGGGAC GGGCGATCGA
5543                        124  2.77e-06 CCCCATTCTA TACGCCTGGCAC GGCCGTTTCA
43760                       414  3.49e-06 CACACGTTCG GACTACTGTTAC ATCGATCGGT
36830                       275  1.36e-05 TTTTTCAAAT GACTACTATGAC TATCCAACTT
42828                       235  1.52e-05 TGGCTTTACT TACGACAGCCAC ACCGTGGAAG
37932                       389  1.89e-05 TGCAGTCACT TACGCCTGGCAA TATTCCATCC
6817                        312  2.89e-05 GAGGCATCAG CACTACAGGAAC TATCCCCACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35233                             1.1e-07  13_[+3]_475
46381                             1.6e-06  334_[+3]_154
48422                             1.9e-06  205_[+3]_283
5543                              2.8e-06  123_[+3]_365
43760                             3.5e-06  413_[+3]_75
36830                             1.4e-05  274_[+3]_214
42828                             1.5e-05  234_[+3]_254
37932                             1.9e-05  388_[+3]_100
6817                              2.9e-05  311_[+3]_177
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
35233                    (   14) GACTACTGGCAC  1
46381                    (  335) GACGAGTGTCAC  1
48422                    (  206) GACTAGTGGGAC  1
5543                     (  124) TACGCCTGGCAC  1
43760                    (  414) GACTACTGTTAC  1
36830                    (  275) GACTACTATGAC  1
42828                    (  235) TACGACAGCCAC  1
37932                    (  389) TACGCCTGGCAA  1
6817                     (  312) CACTACAGGAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 9.3553 E= 1.6e+003
  -982   -128    135     38
   197   -982   -982   -982
  -982    188   -982   -982
  -982   -982    102    112
   161    -29   -982   -982
  -982    152      3   -982
   -20   -982   -982    161
  -120   -982    202   -982
  -982   -128    135     38
  -120    103      3   -120
   197   -982   -982   -982
  -120    171   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 1.6e+003
 0.000000  0.111111  0.555556  0.333333
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.444444  0.555556
 0.777778  0.222222  0.000000  0.000000
 0.000000  0.777778  0.222222  0.000000
 0.222222  0.000000  0.000000  0.777778
 0.111111  0.000000  0.888889  0.000000
 0.000000  0.111111  0.555556  0.333333
 0.111111  0.555556  0.222222  0.111111
 1.000000  0.000000  0.000000  0.000000
 0.111111  0.888889  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GT]AC[TG][AC][CG][TA]G[GT][CG]AC
--------------------------------------------------------------------------------

Time  3.20 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42828                            3.80e-10  84_[+1(1.12e-07)]_130_\
    [+3(1.52e-05)]_216_[+2(4.78e-09)]_19
47815                            1.37e-03  141_[+1(3.19e-05)]_303_\
    [+1(1.26e-07)]_16
43760                            4.00e-09  85_[+2(1.41e-07)]_187_\
    [+1(2.09e-07)]_102_[+3(3.49e-06)]_75
5543                             3.30e-10  50_[+2(2.22e-07)]_54_[+3(2.77e-06)]_\
    100_[+1(1.13e-08)]_245
6817                             2.89e-05  311_[+3(2.89e-05)]_17_\
    [+1(3.19e-08)]_140
35233                            3.74e-11  13_[+3(1.12e-07)]_368_\
    [+1(3.32e-12)]_87
46381                            3.63e-06  268_[+1(7.66e-08)]_46_\
    [+3(1.57e-06)]_154
48422                            2.93e-07  205_[+3(1.92e-06)]_136_\
    [+2(4.29e-09)]_128
36830                            7.47e-07  147_[+2(1.83e-09)]_108_\
    [+3(1.36e-05)]_214
37932                            1.82e-05  165_[+2(3.22e-08)]_204_\
    [+3(1.89e-05)]_100
50473                            2.12e-05  156_[+2(4.89e-05)]_230_\
    [+2(6.93e-10)]_76
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************