********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
******************************************************************************** ******************************************************************************** TRAINING SET ******************************************************************************** DATAFILE= motifs/161/161.seqs.fa ALPHABET= ACGT Sequence name Weight Length Sequence name Weight Length ------------- ------ ------ ------------- ------ ------ 1381 1.0000 500 22041 1.0000 500 24910 1.0000 500 268687 1.0000 500 2909 1.0000 500 33055 1.0000 500 8429 1.0000 500 8679 1.0000 500 bd1062 1.0000 500 ******************************************************************************** ******************************************************************************** COMMAND LINE SUMMARY ******************************************************************************** This information can also be useful in the event you wish to report a problem with the MEME software. command: meme motifs/161/161.seqs.fa -oc motifs/161 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 9 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 4500 N= 9 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.251 C 0.249 G 0.240 T 0.260 Background letter frequencies (from dataset with add-one prior applied): A 0.251 C 0.249 G 0.240 T 0.260 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 9 llr = 109 E-value = 1.8e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 
Description -------------------------------------------------------------------------------- Simplified A :6:1::61::a2:321 pos.-specific C ::4:18229:::::3: probability G 83:69:2218:1a719 matrix T 2163:2:4:2:7::3: bits 2.1 * * 1.9 * * 1.6 * * * * 1.4 * * * * * Relative 1.2 * ** *** * * Entropy 1.0 * * ** *** ** * (17.4 bits) 0.8 * * ** ****** * 0.6 ******* ****** * 0.4 ******* ****** * 0.2 ************** * 0.0 ---------------- Multilevel GATGGCATCGATGGCG consensus TGCT TCC T A AT sequence GG A -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 22041 118 4.32e-09 TGCATGTGTG GACGGCAGCGATGGCG ATAAGCAACT 24910 192 6.06e-09 AGCCTAGTTT GATGGCCTCGATGGCG TATTCTACCT 8679 17 4.19e-07 CACATCACAT GGTGGTGTCGATGATG ACAACGTAGC 2909 292 1.44e-06 CGACATCACG GACTGCACCGATGATA CTTGCTGCAA bd1062 40 1.57e-06 GATAGACGGT TTTGGCGGCGATGGTG AAATGCTGCC 8429 350 3.60e-06 TGGCGTTTAT GGTTCCATCGAAGGGG ACATTTGTTT 33055 288 3.60e-06 ATCTCTATTG GGCAGCAACGAGGGCG ACATCATCAG 1381 22 4.39e-06 TGTAACGTCA GACGGCCCGTATGGAG CGTCCTCTTC 268687 40 9.82e-06 CTCAGCAAGT TATTGTATCTAAGAAG CAGGAGTCAG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 22041 4.3e-09 117_[+1]_367 24910 6.1e-09 191_[+1]_293 8679 4.2e-07 16_[+1]_468 2909 1.4e-06 291_[+1]_193 bd1062 1.6e-06 39_[+1]_445 8429 3.6e-06 349_[+1]_135 33055 3.6e-06 287_[+1]_197 1381 4.4e-06 21_[+1]_463 268687 9.8e-06 39_[+1]_445 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=9 22041 ( 118) GACGGCAGCGATGGCG 1 24910 ( 192) GATGGCCTCGATGGCG 1 8679 ( 17) GGTGGTGTCGATGATG 1 2909 ( 292) GACTGCACCGATGATA 1 bd1062 ( 40) TTTGGCGGCGATGGTG 1 8429 ( 350) GGTTCCATCGAAGGGG 1 33055 ( 288) GGCAGCAACGAGGGCG 1 1381 ( 22) GACGGCCCGTATGGAG 1 268687 ( 40) TATTGTATCTAAGAAG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 4365 bayes= 9.76818 E= 1.8e+000 -982 -982 170 -23 114 -982 47 -123 -982 84 -982 109 -117 -982 121 36 -982 -116 189 -982 -982 164 -982 -23 114 -16 -11 -982 -117 -16 -11 77 -982 184 -111 -982 -982 -982 170 -23 199 -982 -982 -982 -18 -982 -111 136 -982 -982 206 -982 41 -982 147 -982 -18 42 -111 36 -117 -982 189 -982 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 1.8e+000 0.000000 0.000000 0.777778 0.222222 0.555556 0.000000 0.333333 0.111111 0.000000 0.444444 0.000000 0.555556 0.111111 0.000000 0.555556 0.333333 0.000000 0.111111 0.888889 0.000000 0.000000 0.777778 0.000000 0.222222 0.555556 0.222222 0.222222 0.000000 0.111111 0.222222 0.222222 0.444444 0.000000 0.888889 0.111111 0.000000 0.000000 0.000000 0.777778 0.222222 1.000000 0.000000 0.000000 0.000000 0.222222 0.000000 0.111111 
0.666667 0.000000 0.000000 1.000000 0.000000 0.333333 0.000000 0.666667 0.000000 0.222222 0.333333 0.111111 0.333333 0.111111 0.000000 0.888889 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [GT][AG][TC][GT]G[CT][ACG][TCG]C[GT]A[TA]G[GA][CTA]G -------------------------------------------------------------------------------- Time 0.91 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 20 sites = 5 llr = 95 E-value = 1.3e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::2228:a::2a24242::2 pos.-specific C aa88828:6a6:868:6a:8 probability G ::::::2:2:::::::::8: matrix T ::::::::2:2::::62:2: bits 2.1 ** * * * * 1.9 ** * * * * 1.6 ** * * * * 1.4 ** * * * * Relative 1.2 ******** * ** * *** Entropy 1.0 ******** * ***** *** (27.5 bits) 0.8 ******** * ***** *** 0.6 ******************** 0.4 ******************** 0.2 ******************** 0.0 -------------------- Multilevel CCCCCACACCCACCCTCCGC consensus AAACG G A AAAAA TA sequence T T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 24910 465 1.16e-10 AATTCAAACA CCCCCACACCCAACCTTCGC CACACAACAC 1381 467 4.88e-10 CGTAGTACTG 
CCACCACACCCACACACCTC CCTCCTCTCA 33055 81 6.05e-10 AAGGCCTCTC CCCAAACACCAACCCTCCGC CTTCTTGACC 2909 103 1.14e-09 CACACTGGCG CCCCCCCAGCCACCAACCGC AGGATGAGCG 268687 377 7.58e-09 CCCTCAACTC CCCCCAGATCTACACTACGA TGAATCTAGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 24910 1.2e-10 464_[+2]_16 1381 4.9e-10 466_[+2]_14 33055 6e-10 80_[+2]_400 2909 1.1e-09 102_[+2]_378 268687 7.6e-09 376_[+2]_104 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=20 seqs=5 24910 ( 465) CCCCCACACCCAACCTTCGC 1 1381 ( 467) CCACCACACCCACACACCTC 1 33055 ( 81) CCCAAACACCAACCCTCCGC 1 2909 ( 103) CCCCCCCAGCCACCAACCGC 1 268687 ( 377) CCCCCAGATCTACACTACGA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 4329 bayes= 10.008 E= 1.3e+001 -897 201 -897 -897 -897 201 -897 -897 -33 168 -897 -897 -33 168 -897 -897 -33 168 -897 -897 167 -31 -897 -897 -897 168 -26 -897 199 -897 -897 -897 -897 127 -26 -38 -897 201 -897 -897 -33 127 -897 -38 199 -897 -897 -897 -33 168 -897 -897 67 127 -897 -897 -33 168 -897 -897 67 -897 -897 120 -33 127 -897 -38 -897 201 -897 -897 -897 -897 173 -38 -33 168 -897 -897 -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 1.3e+001 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 0.800000 0.200000 0.000000 0.000000 0.000000 0.800000 0.200000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.600000 0.200000 0.200000 0.000000 1.000000 0.000000 0.000000 0.200000 0.600000 0.000000 0.200000 1.000000 0.000000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 0.400000 0.600000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 0.400000 0.000000 0.000000 0.600000 0.200000 0.600000 0.000000 0.200000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.800000 0.200000 0.200000 0.800000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- CC[CA][CA][CA][AC][CG]A[CGT]C[CAT]A[CA][CA][CA][TA][CAT]C[GT][CA] -------------------------------------------------------------------------------- Time 1.89 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 20 sites = 8 llr = 113 E-value = 1.8e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :4:1:43:491413198414 pos.-specific C :::::5::3:13::31:33: probability G 96:6a13a::31986:3446 matrix T 1:a3::5:4153::::::3: bits 2.1 * * 1.9 * * * 1.6 * * * 1.4 * * * * * * * Relative 1.2 * * * * * ** ** Entropy 1.0 *** * * * ** ** * (20.4 bits) 0.8 ***** * * ***** * 0.6 ****** * * ***** * 0.4 ********** ****** * 0.2 *********** ******** 0.0 -------------------- Multilevel GGTGGCTGAATAGGGAAAGG consensus A T AA T GC AC GGCA sequence G C T CT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 2909 454 2.01e-09 GGTCTAACTT GATGGCTGTATCGGGAAGAG GCAACAAACA bd1062 303 1.38e-08 CTTGGCTTGT TGTGGCTGTATAGGGAAACA AATATAGCGG 33055 44 2.86e-08 GATAATAAGA GGTGGCTGAATTGAAAAGTG GATAAATAAG 8429 313 2.15e-07 TATATTAATT GATTGAGGCAGAGGCAACCG AGGTCTCTGG 8679 179 3.40e-07 ACACATCGTG GGTGGCAGCACCAGGAGAGG GGGAAGGATG 268687 91 4.52e-07 AGAGAGCAAT GATAGAAGAAAAGAGAAGGA CAGCTCTGGA 1381 314 5.16e-07 CGACGGGCGG GGTGGAGGAAGTGGCCACTA TGAAAGCAGC 22041 283 6.26e-07 GAAGGCGGAG GGTTGGTGTTTGGGGAGAGG CCTTATTTAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
-------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 2909 2e-09 453_[+3]_27 bd1062 1.4e-08 302_[+3]_178 33055 2.9e-08 43_[+3]_437 8429 2.1e-07 312_[+3]_168 8679 3.4e-07 178_[+3]_302 268687 4.5e-07 90_[+3]_390 1381 5.2e-07 313_[+3]_167 22041 6.3e-07 282_[+3]_198 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=20 seqs=8 2909 ( 454) GATGGCTGTATCGGGAAGAG 1 bd1062 ( 303) TGTGGCTGTATAGGGAAACA 1 33055 ( 44) GGTGGCTGAATTGAAAAGTG 1 8429 ( 313) GATTGAGGCAGAGGCAACCG 1 8679 ( 179) GGTGGCAGCACCAGGAGAGG 1 268687 ( 91) GATAGAAGAAAAGAGAAGGA 1 1381 ( 314) GGTGGAGGAAGTGGCCACTA 1 22041 ( 283) GGTTGGTGTTTGGGGAGAGG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 4329 bayes= 9.07715 E= 1.8e+001 -965 -965 186 -106 58 -965 138 -965 -965 -965 -965 194 -100 -965 138 -6 -965 -965 206 -965 58 101 -94 -965 -1 -965 6 94 -965 -965 206 -965 58 1 -965 53 180 -965 -965 -106 -100 -99 6 94 58 1 -94 -6 -100 -965 186 -965 -1 -965 164 -965 -100 1 138 -965 180 -99 -965 -965 158 -965 6 -965 58 1 64 -965 -100 1 64 -6 58 -965 138 -965 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 1.8e+001 
0.000000 0.000000 0.875000 0.125000 0.375000 0.000000 0.625000 0.000000 0.000000 0.000000 0.000000 1.000000 0.125000 0.000000 0.625000 0.250000 0.000000 0.000000 1.000000 0.000000 0.375000 0.500000 0.125000 0.000000 0.250000 0.000000 0.250000 0.500000 0.000000 0.000000 1.000000 0.000000 0.375000 0.250000 0.000000 0.375000 0.875000 0.000000 0.000000 0.125000 0.125000 0.125000 0.250000 0.500000 0.375000 0.250000 0.125000 0.250000 0.125000 0.000000 0.875000 0.000000 0.250000 0.000000 0.750000 0.000000 0.125000 0.250000 0.625000 0.000000 0.875000 0.125000 0.000000 0.000000 0.750000 0.000000 0.250000 0.000000 0.375000 0.250000 0.375000 0.000000 0.125000 0.250000 0.375000 0.250000 0.375000 0.000000 0.625000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- G[GA]T[GT]G[CA][TAG]G[ATC]A[TG][ACT]G[GA][GC]A[AG][AGC][GCT][GA] -------------------------------------------------------------------------------- Time 2.85 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1381                             5.85e-11  21_[+1(4.39e-06)]_276_\
    [+3(5.16e-07)]_133_[+2(4.88e-10)]_14
22041                            5.23e-08  117_[+1(4.32e-09)]_149_\
    [+3(6.26e-07)]_198
24910                            4.57e-11  36_[+2(2.76e-05)]_78_[+1(3.85e-06)]_\
    41_[+1(6.06e-09)]_257_[+2(1.16e-10)]_16
268687                           1.41e-09  39_[+1(9.82e-06)]_35_[+3(4.52e-07)]_\
    266_[+2(7.58e-09)]_104
2909                             2.48e-13  102_[+2(1.14e-09)]_169_\
    [+1(1.44e-06)]_146_[+3(2.01e-09)]_27
33055                            3.94e-12  43_[+3(2.86e-08)]_17_[+2(6.05e-10)]_\
    187_[+1(3.60e-06)]_197
8429                             1.62e-05  312_[+3(2.15e-07)]_17_\
    [+1(3.60e-06)]_135
8679                             5.75e-07  16_[+1(4.19e-07)]_146_\
    [+3(3.40e-07)]_302
bd1062                           1.60e-08  39_[+1(1.57e-06)]_247_\
    [+3(1.38e-08)]_156_[+2(2.18e-05)]_2
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************