********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/478/478.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10778                    1.0000    500  21172                    1.0000    500
23755                    1.0000    500  24636                    1.0000    500
262141                   1.0000    500  264718                   1.0000    500
268362                   1.0000    500  268825                   1.0000    500
31652                    1.0000    500  31722                    1.0000    500
5728                     1.0000    500  8269                     1.0000    500
8564                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/478/478.seqs.fa -oc motifs/478 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.257 C 0.242 G 0.234 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.257 C 0.242 G 0.234 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 9 llr = 117 E-value = 1.0e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1:11:::16136:::
pos.-specific     C  124:111:1:33:1::
probability       G  9269::9a226149:a
matrix            T  :4::89::62:2::a:

         bits    2.1        *       *
                 1.9        *      **
                 1.7 *  *  **     ***
                 1.5 *  * ***     ***
Relative         1.3 *  * ***     ***
Entropy          1.0 * ** ***    ****
(18.7 bits)      0.8 * ******    ****
                 0.6 * ****** ** ****
                 0.4 * ****** ** ****
                 0.2 *********** ****
                 0.0 ----------------

Multilevel           GTGGTTGGTAGAAGTG
consensus             CC     GGCCG
sequence              G       T T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
8564                         31  1.78e-09 CATTGTCACT GTGGTTGGTAGTAGTG ATAGCAATAG
21172                        26  1.31e-08 CTTCTTTGTT GTCGTTGGTGGCGGTG GTGGCGAAGT
23755                       250  1.34e-07 CACGAGAGCA CGGGTTGGTAGCAGTG TAGCAGTGCA
268825                      337  3.58e-07 TGCCACCGGT GGCGTCGGTACAGGTG CGGGTGTCAA
10778                        57  5.95e-07 TGAGTCGCTG GAGGTTGGAAGGAGTG GGTGTGCCAT
31722                       149  8.78e-07 CGATACGGGG GTCGATGGGGCCGGTG ACGATCCAAA
31652                       253  1.02e-06 TGCAGCCCCA GTGGTTGGGAAAACTG AGTTTGTTCC
5728                         18  2.25e-06 AGAGGAACCG GCGGCTGGCTGTGGTG GAACGTGAGG
268362                      408  5.01e-06 CCCCAATCTT GCCATTCGTTCAAGTG CAGAATTGGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8564                              1.8e-09  30_[+1]_454
21172                             1.3e-08  25_[+1]_459
23755                             1.3e-07  249_[+1]_235
268825                            3.6e-07  336_[+1]_148
10778                               6e-07  56_[+1]_428
31722                             8.8e-07  148_[+1]_336
31652                               1e-06  252_[+1]_232
5728                              2.3e-06  17_[+1]_467
268362                              5e-06  407_[+1]_77
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=9
8564                     (   31) GTGGTTGGTAGTAGTG  1
21172                    (   26) GTCGTTGGTGGCGGTG  1
23755                    (  250) CGGGTTGGTAGCAGTG  1
268825                   (  337) GGCGTCGGTACAGGTG  1
10778                    (   57) GAGGTTGGAAGGAGTG  1
31722                    (  149) GTCGATGGGGCCGGTG  1
31652                    (  253) GTGGTTGGGAAAACTG  1
5728                     (   18) GCGGCTGGCTGTGGTG  1
268362                   (  408) GCCATTCGTTCAAGTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.58478 E= 1.0e-001
  -982   -112    192   -982
  -121    -12     -7     73
  -982     88    125   -982
  -121   -982    192   -982
  -121   -112   -982    154
  -982   -112   -982    173
  -982   -112    192   -982
  -982   -982    209   -982
  -121   -112     -7    106
   111   -982     -7    -26
  -121     46    125   -982
    37     46   -107    -26
   111   -982     92   -982
  -982   -112    192   -982
  -982   -982   -982    190
  -982   -982    209   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 1.0e-001
 0.000000  0.111111  0.888889  0.000000
 0.111111  0.222222  0.222222  0.444444
 0.000000  0.444444  0.555556  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.111111  0.111111  0.000000  0.777778
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.111111  0.222222  0.555556
 0.555556  0.000000  0.222222  0.222222
 0.111111  0.333333  0.555556  0.000000
 0.333333  0.333333  0.111111  0.222222
 0.555556  0.000000  0.444444  0.000000
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[TCG][GC]GTTGG[TG][AGT][GC][ACT][AG]GTG
--------------------------------------------------------------------------------

Time 1.45 secs.

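The "Relative Entropy (18.7 bits)" figure in the Motif 1 description can be cross-checked against the letter-probability matrix and the background frequencies reported above. The following is a minimal sketch, assuming the figure is the sum over columns of p * log2(p / background); the names `BACKGROUND`, `MOTIF1_PROBS` and `column_bits` are illustrative and are not part of MEME.

```python
import math

# Background letter frequencies copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.257, "C": 0.242, "G": 0.234, "T": 0.267}

# Motif 1 letter-probability matrix; columns are A, C, G, T.
MOTIF1_PROBS = [
    [0.000000, 0.111111, 0.888889, 0.000000],
    [0.111111, 0.222222, 0.222222, 0.444444],
    [0.000000, 0.444444, 0.555556, 0.000000],
    [0.111111, 0.000000, 0.888889, 0.000000],
    [0.111111, 0.111111, 0.000000, 0.777778],
    [0.000000, 0.111111, 0.000000, 0.888889],
    [0.000000, 0.111111, 0.888889, 0.000000],
    [0.000000, 0.000000, 1.000000, 0.000000],
    [0.111111, 0.111111, 0.222222, 0.555556],
    [0.555556, 0.000000, 0.222222, 0.222222],
    [0.111111, 0.333333, 0.555556, 0.000000],
    [0.333333, 0.333333, 0.111111, 0.222222],
    [0.555556, 0.000000, 0.444444, 0.000000],
    [0.000000, 0.111111, 0.888889, 0.000000],
    [0.000000, 0.000000, 0.000000, 1.000000],
    [0.000000, 0.000000, 1.000000, 0.000000],
]


def column_bits(row, background=BACKGROUND):
    """Relative entropy (in bits) of one motif column against the background."""
    return sum(
        p * math.log2(p / background[letter])
        for letter, p in zip("ACGT", row)
        if p > 0.0  # a zero-probability letter contributes nothing
    )


per_column = [column_bits(row) for row in MOTIF1_PROBS]
total_bits = sum(per_column)
print(f"total relative entropy: {total_bits:.1f} bits")
```

The per-column values in `per_column` are also what the star diagram summarises: fully conserved columns such as position 8 (always G) reach the top of the scale, while mixed columns such as position 12 stay near zero.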
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 18 sites = 9 llr = 132 E-value = 9.6e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :29:631449:38277:3
pos.-specific     C  a8:a42644:a3:82:a7
probability       G  :::::2:11::32::3::
matrix            T  ::1::23::1::::1:::

         bits    2.1 *  *      *     *
                 1.9 *  *      *     *
                 1.7 *  *      *     *
                 1.5 * **     **     *
Relative         1.3 ****     ** **  *
Entropy          1.0 *****    ** ** ***
(21.1 bits)      0.8 *****    ** ******
                 0.6 ***** ***** ******
                 0.4 ***** ************
                 0.2 ***** ************
                 0.0 ------------------

Multilevel           CCACAACAAACAACAACC
consensus             A  CCTCC  CGACG A
sequence                  G     G
                          T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
21172                       405  6.92e-09 AAGCCCACAC CCACCCCAAACAACAACA ACGGTCAAAA
24636                       244  1.84e-08 AACTAACAAA CAACAATCAACCACAACC TCGCTATTTG
264718                      281  2.06e-08 TTCGCCACAA CCACAGCACACGGCAGCC GAGCACAACC
262141                       46  3.80e-08 TACCGCCTTT CCACCACACACGAACACC AATATCTTTG
5728                        313  7.64e-08 AGTCACCAGA CCACCCACCACCACCACC GTCGTCGTGC
8269                        150  1.53e-07 ACATACCGCT CCACCGCAGACGAAAGCC GAGCGAAGAG
268362                      365  2.56e-07 GCAGATCAAC CCACAATGCACCACTGCC CACTGATTGC
268825                      467  3.54e-07 GGTATACCTG CATCATCCAACAACAACA ACGTCATCAA
8564                        473  5.01e-07 CACTAGATTA CCACATTCATCAGCAACA AGCACACAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21172                             6.9e-09  404_[+2]_78
24636                             1.8e-08  243_[+2]_239
264718                            2.1e-08  280_[+2]_202
262141                            3.8e-08  45_[+2]_437
5728                              7.6e-08  312_[+2]_170
8269                              1.5e-07  149_[+2]_333
268362                            2.6e-07  364_[+2]_118
268825                            3.5e-07  466_[+2]_16
8564                                5e-07  472_[+2]_10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=9
21172                    (  405) CCACCCCAAACAACAACA  1
24636                    (  244) CAACAATCAACCACAACC  1
264718                   (  281) CCACAGCACACGGCAGCC  1
262141                   (   46) CCACCACACACGAACACC  1
5728                     (  313) CCACCCACCACCACCACC  1
8269                     (  150) CCACCGCAGACGAAAGCC  1
268362                   (  365) CCACAATGCACCACTGCC  1
268825                   (  467) CATCATCCAACAACAACA  1
8564                     (  473) CCACATTCATCAGCAACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 6279 bayes= 10.2932 E= 9.6e-003
  -982    205   -982   -982
   -21    168   -982   -982
   179   -982   -982   -126
  -982    205   -982   -982
   111     88   -982   -982
    37    -12     -7    -26
  -121    120   -982     32
    79     88   -107   -982
    79     88   -107   -982
   179   -982   -982   -126
  -982    205   -982   -982
    37     46     51   -982
   160   -982     -7   -982
   -21    168   -982   -982
   137    -12   -982   -126
   137   -982     51   -982
  -982    205   -982   -982
    37    146   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 9 E= 9.6e-003
 0.000000  1.000000  0.000000  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.000000  1.000000  0.000000  0.000000
 0.555556  0.444444  0.000000  0.000000
 0.333333  0.222222  0.222222  0.222222
 0.111111  0.555556  0.000000  0.333333
 0.444444  0.444444  0.111111  0.000000
 0.444444  0.444444  0.111111  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.333333  0.333333  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.666667  0.222222  0.000000  0.111111
 0.666667  0.000000  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[CA]AC[AC][ACGT][CT][AC][AC]AC[ACG][AG][CA][AC][AG]C[CA]
--------------------------------------------------------------------------------

Time 3.00 secs.

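The motif regular expression is a simplification of the probability matrix. In this output it appears to list only the more frequent letters at each position, so rare letters seen in a single site are dropped. The sketch below, which uses only the Motif 2 regular expression and the nine Motif 2 sites listed above, checks how many of those sites the expression actually matches; `MOTIF2_SITES` and `matching_sites` are illustrative names, not MEME identifiers.

```python
import re

# Motif 2 regular expression, copied from the output above.
MOTIF2_REGEX = re.compile(
    r"C[CA]AC[AC][ACGT][CT][AC][AC]AC[ACG][AG][CA][AC][AG]C[CA]"
)

# Motif 2 sites, keyed by sequence name, copied from the BLOCKS section.
MOTIF2_SITES = {
    "21172": "CCACCCCAAACAACAACA",
    "24636": "CAACAATCAACCACAACC",
    "264718": "CCACAGCACACGGCAGCC",
    "262141": "CCACCACACACGAACACC",
    "5728": "CCACCCACCACCACCACC",
    "8269": "CCACCGCAGACGAAAGCC",
    "268362": "CCACAATGCACCACTGCC",
    "268825": "CATCATCCAACAACAACA",
    "8564": "CCACATTCATCAGCAACA",
}


def matching_sites(pattern, sites):
    """Return the names of the sites that the pattern matches in full."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


matched = matching_sites(MOTIF2_REGEX, MOTIF2_SITES)
print(f"{len(matched)} of {len(MOTIF2_SITES)} sites match: {matched}")
```

Only four of the nine sites match. Each of the other five carries one letter that occurs in a single site at its position (for example the T at position 3 of the 268825 site). For scanning new sequences, the log-odds matrix is therefore likely a better tool than the regular expression.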
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 7 llr = 103 E-value = 3.5e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :4:a6a3a:413:147
pos.-specific     C  1::::::::::::3:1
probability       G  9:a:3:7:a494::6:
matrix            T  :6::1::::1:3a6:1

         bits    2.1   *     *
                 1.9   ** * **   *
                 1.7   ** * **   *
                 1.5 * ** * ** * *
Relative         1.3 * ** **** * *
Entropy          1.0 **** **** * * *
(21.2 bits)      0.8 **** **** * * **
                 0.6 *********** ****
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTGAAAGAGAGGTTGA
consensus             A  G A  G A CA
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
31652                       107  1.88e-08 GCTCCTTTCC GTGAAAAAGAGTTTGA ACTTCGTGCT
24636                       112  2.51e-08 AAACAAACAA GAGAAAGAGAGTTCAA AATAACAGAG
268362                      440  3.18e-08 TGGTAAAGCG GAGAAAGAGGGGTAAA CGAAATGCAG
268825                       76  3.53e-08 GTATTTCAGA GTGAAAAAGGGGTCAA AATACTTGTA
8564                         77  1.28e-07 CTCATGGTGA GAGAGAGAGAGATTGC TTGCGACGCA
10778                       206  3.49e-07 ACTGCACTGA GTGAGAGAGGAGTTGT TTGTTTGTTG
5728                        286  5.84e-07 GTCGATGGGG CTGATAGAGTGATTGA GAGTCACCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31652                             1.9e-08  106_[+3]_378
24636                             2.5e-08  111_[+3]_373
268362                            3.2e-08  439_[+3]_45
268825                            3.5e-08  75_[+3]_409
8564                              1.3e-07  76_[+3]_408
10778                             3.5e-07  205_[+3]_279
5728                              5.8e-07  285_[+3]_199
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=7
31652                    (  107) GTGAAAAAGAGTTTGA  1
24636                    (  112) GAGAAAGAGAGTTCAA  1
268362                   (  440) GAGAAAGAGGGGTAAA  1
268825                   (   76) GTGAAAAAGGGGTCAA  1
8564                     (   77) GAGAGAGAGAGATTGC  1
10778                    (  206) GTGAGAGAGGAGTTGT  1
5728                     (  286) CTGATAGAGTGATTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 10.4196 E= 3.5e+000
  -945    -76    187   -945
    74   -945   -945    110
  -945   -945    209   -945
   196   -945   -945   -945
   115   -945     29    -90
   196   -945   -945   -945
    15   -945    161   -945
   196   -945   -945   -945
  -945   -945    209   -945
    74   -945     87    -90
   -85   -945    187   -945
    15   -945     87     10
  -945   -945   -945    190
   -85     24   -945    110
    74   -945    129   -945
   147    -76   -945    -90
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 3.5e+000
 0.000000  0.142857  0.857143  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.571429  0.000000  0.285714  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.000000  0.714286  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.000000  0.428571  0.142857
 0.142857  0.000000  0.857143  0.000000
 0.285714  0.000000  0.428571  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.285714  0.000000  0.571429
 0.428571  0.000000  0.571429  0.000000
 0.714286  0.142857  0.000000  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[TA]GA[AG]A[GA]AG[AG]G[GAT]T[TC][GA]A
--------------------------------------------------------------------------------

Time 4.35 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10778                            7.77e-06  56_[+1(5.95e-07)]_133_\
    [+3(3.49e-07)]_279
21172                            2.94e-09  25_[+1(1.31e-08)]_363_\
    [+2(6.92e-09)]_78
23755                            1.07e-03  249_[+1(1.34e-07)]_235
24636                            1.79e-08  111_[+3(2.51e-08)]_116_\
    [+2(1.84e-08)]_221_[+2(8.56e-05)]
262141                           9.23e-04  45_[+2(3.80e-08)]_437
264718                           9.31e-05  280_[+2(2.06e-08)]_202
268362                           1.70e-09  79_[+3(8.67e-05)]_269_\
    [+2(2.56e-07)]_25_[+1(5.01e-06)]_16_[+3(3.18e-08)]_45
268825                           2.19e-10  75_[+3(3.53e-08)]_245_\
    [+1(3.58e-07)]_114_[+2(3.54e-07)]_16
31652                            4.27e-07  106_[+3(1.88e-08)]_130_\
    [+1(1.02e-06)]_232
31722                            3.64e-03  148_[+1(8.78e-07)]_336
5728                             3.92e-09  17_[+1(2.25e-06)]_252_\
    [+3(5.84e-07)]_11_[+2(7.64e-08)]_170
8269                             1.64e-03  149_[+2(1.53e-07)]_333
8564                             7.07e-12  30_[+1(1.78e-09)]_30_[+3(1.28e-07)]_\
    354_[+2(7.16e-07)]_8_[+2(5.01e-07)]_10
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
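The combined block diagrams encode each sequence as alternating spacer lengths and motif occurrences of the form [strand motif (p-value)]. The following is a minimal sketch of turning one diagram into 1-based start positions, assuming the motif widths from the three MOTIF headers (16, 18 and 16); `MOTIF_WIDTHS`, `TOKEN` and `parse_diagram` are hypothetical helpers written for this illustration and are not part of the MEME Suite.

```python
import re

# Motif widths taken from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 16, 2: 18, 3: 16}

# Either a motif occurrence such as [+1(1.78e-09)] or a spacer length such as 30.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths):
    """Return ([(motif, strand, start, p_value), ...], total_length)."""
    # Drop the line-continuation backslashes and any whitespace from wrapping.
    text = "".join(diagram.replace("\\", "").split())
    position = 1  # 1-based, as in the "Start" column of the site tables
    hits = []
    for match in TOKEN.finditer(text):
        if match.group(4) is not None:
            position += int(match.group(4))  # spacer: skip that many bases
        else:
            motif = int(match.group(2))
            hits.append((motif, match.group(1), position, float(match.group(3))))
            position += widths[motif]  # occurrence: skip the motif width
    return hits, position - 1


# Diagram for sequence 8564, with its wrapped continuation line rejoined.
DIAGRAM_8564 = (
    "30_[+1(1.78e-09)]_30_[+3(1.28e-07)]_"
    "354_[+2(7.16e-07)]_8_[+2(5.01e-07)]_10"
)
hits, length = parse_diagram(DIAGRAM_8564, MOTIF_WIDTHS)
for motif, strand, start, p_value in hits:
    print(f"motif {motif} ({strand}) starts at {start}, p = {p_value:.2e}")
print("sequence length:", length)
```

For sequence 8564 the recovered starts (31, 77 and 473) agree with the Start column of the per-motif site tables, and the spacers and motif widths add up to the 500-base sequence length listed in the training set. The extra Motif 2 occurrence at 447 appears only in the combined diagram.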