********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/398/398.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10953                    1.0000    500  11373                    1.0000    500
14923                    1.0000    500  17073                    1.0000    500
21898                    1.0000    500  22007                    1.0000    500
25393                    1.0000    500  269557                   1.0000    500
270060                   1.0000    500  33067                    1.0000    500
5256                     1.0000    500  7709                     1.0000    500
9110                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/398/398.seqs.fa -oc motifs/398 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.249 C 0.231 G 0.252 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.249 C 0.231 G 0.252 T 0.267
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  13  llr = 130  E-value = 1.4e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :95:44:97:52
pos.-specific     C  a:3854a:1838
probability       G  :111::::::::
matrix            T  ::2122:1222:

         bits    2.1 *     *
                 1.9 *     *
                 1.7 **    **
                 1.5 **    ** * *
Relative         1.3 ** *  ** * *
Entropy          1.1 ** *  ** * *
(14.5 bits)      0.8 ** *  **** *
                 0.6 ** ** ******
                 0.4 ** *********
                 0.2 ************
                 0.0 ------------

Multilevel           CAACCACAACAC
consensus              C AC  T C
sequence                  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
22007                       459  1.76e-07 ATCCTCGGAA CAACAACAACAC ATCGTTCGTT
9110                        418  3.83e-07 TAAGTTTGCT CAACCTCAACAC TGTTTGTTCG
25393                       178  1.74e-06 GGTTTTGGTC CAACTCCAACCC CAAAAGTCCA
33067                       268  6.24e-06 GGAGAGAAGC CAACACCTACAC AAATTCAGAA
270060                      102  7.62e-06 GCGGCACCGA CACCAACAACCA GCGTAGGATA
11373                       478  8.48e-06 CAAAGCCAAA CAATCCCAACCC TCGCTTGTGC
17073                       466  1.02e-05 TTCACTTCGC CGCCAACAACAC AACGACGAAC
10953                       484  1.02e-05 CACAGCCCTC CACCAACATCTC TCTAA
7709                        379  1.28e-05 TTCGACAGGG CATCCCCAATCC GAAGCAGCAA
21898                       488  1.38e-05 GACGCTCGCT CACCCTCATCTC A
5256                         75  2.02e-05 TCTTCCATAC CAGCCCCACCAC CGCCATACAT
269557                      489  4.05e-05 GACATGCATC CATCCTCATTAC
14923                       311  6.62e-05 TCAACGTACA CAAGTACAACAA TTGTGACTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22007                             1.8e-07  458_[+1]_30
9110                              3.8e-07  417_[+1]_71
25393                             1.7e-06  177_[+1]_311
33067                             6.2e-06  267_[+1]_221
270060                            7.6e-06  101_[+1]_387
11373                             8.5e-06  477_[+1]_11
17073                               1e-05  465_[+1]_23
10953                               1e-05  483_[+1]_5
7709                              1.3e-05  378_[+1]_110
21898                             1.4e-05  487_[+1]_1
5256                                2e-05  74_[+1]_414
269557                              4e-05  488_[+1]
14923                             6.6e-05  310_[+1]_178
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=13
22007                    (  459) CAACAACAACAC  1
9110                     (  418) CAACCTCAACAC  1
25393                    (  178) CAACTCCAACCC  1
33067                    (  268) CAACACCTACAC  1
270060                   (  102) CACCAACAACCA  1
11373                    (  478) CAATCCCAACCC  1
17073                    (  466) CGCCAACAACAC  1
10953                    (  484) CACCAACATCTC  1
7709                     (  379) CATCCCCAATCC  1
21898                    (  488) CACCCTCATCTC  1
5256                     (   75) CAGCCCCACCAC  1
269557                   (  489) CATCCTCATTAC  1
14923                    (  311) CAAGTACAACAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.46216 E= 1.4e-002
 -1035    211  -1035  -1035
   189  -1035   -171  -1035
    89     41   -171    -80
 -1035    187   -171   -179
    63    100  -1035    -80
    63     73  -1035    -21
 -1035    211  -1035  -1035
   189  -1035  -1035   -179
   147   -159  -1035    -21
 -1035    187  -1035    -80
   111     41  -1035    -80
   -69    187  -1035  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 1.4e-002
 0.000000  1.000000  0.000000  0.000000
 0.923077  0.000000  0.076923  0.000000
 0.461538  0.307692  0.076923  0.153846
 0.000000  0.846154  0.076923  0.076923
 0.384615  0.461538  0.000000  0.153846
 0.384615  0.384615  0.000000  0.230769
 0.000000  1.000000  0.000000  0.000000
 0.923077  0.000000  0.000000  0.076923
 0.692308  0.076923  0.000000  0.230769
 0.000000  0.846154  0.000000  0.153846
 0.538462  0.307692  0.000000  0.153846
 0.153846  0.846154  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CA[AC]C[CA][ACT]CA[AT]C[AC]C
--------------------------------------------------------------------------------




Time  1.75 secs.

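The "(14.5 bits)" figure in the Motif 1 description can be approximately reproduced from the letter-probability matrix and the background frequencies listed in the command line summary. The sketch below is not part of the MEME output; it assumes the printed three-decimal background values are precise enough, and the names `BACKGROUND`, `MOTIF1`, `column_information`, and `motif_information` are hypothetical helpers introduced only for illustration.

```python
import math

# Background frequencies as printed in the COMMAND LINE SUMMARY
# (rounded to three decimals in the report, so results are approximate).
BACKGROUND = {"A": 0.249, "C": 0.231, "G": 0.252, "T": 0.267}

# Motif 1 letter-probability matrix: one row per position, columns A C G T.
MOTIF1 = [
    [0.000000, 1.000000, 0.000000, 0.000000],
    [0.923077, 0.000000, 0.076923, 0.000000],
    [0.461538, 0.307692, 0.076923, 0.153846],
    [0.000000, 0.846154, 0.076923, 0.076923],
    [0.384615, 0.461538, 0.000000, 0.153846],
    [0.384615, 0.384615, 0.000000, 0.230769],
    [0.000000, 1.000000, 0.000000, 0.000000],
    [0.923077, 0.000000, 0.000000, 0.076923],
    [0.692308, 0.076923, 0.000000, 0.230769],
    [0.000000, 0.846154, 0.000000, 0.153846],
    [0.538462, 0.307692, 0.000000, 0.153846],
    [0.153846, 0.846154, 0.000000, 0.000000],
]


def column_information(row, background=BACKGROUND):
    """Relative entropy (in bits) of one matrix row against the background."""
    total = 0.0
    for p, letter in zip(row, "ACGT"):
        if p > 0.0:  # a zero probability contributes nothing to the sum
            total += p * math.log2(p / background[letter])
    return total


def motif_information(matrix):
    """Sum of the per-position relative entropies."""
    return sum(column_information(row) for row in matrix)


per_column = [column_information(row) for row in MOTIF1]
total_bits = motif_information(MOTIF1)
print(f"total relative entropy: {total_bits:.2f} bits")
```

The per-position values in `per_column` are also what the star diagram summarizes: positions 1 and 7 (pure C) carry the most information, position 3 the least.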
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =  13  llr = 146  E-value = 2.8e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :275::6262112::2
pos.-specific     C  1::::32::2:4::6:
probability       G  8825a42816944a42
matrix            T  1:1::3::31:24::7

         bits    2.1
                 1.9     *        *
                 1.7     *     *  *
                 1.5     *     *  *
Relative         1.3 **  *  *  *  *
Entropy          1.1 ** **  *  *  **
(16.2 bits)      0.8 *****  *  *  **
                 0.6 ***** *** *  ***
                 0.4 *********** ****
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGAAGGAGAGGCGGCT
consensus              GG CG T  GT G
sequence                  T      A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
10953                       107  6.75e-09 GAACGAGGGA GGAAGGAGAGGGGGGT TTGGAGTACG
270060                      157  1.64e-07 CAAGTCAAGG GGAGGGAGAGGCTGGA AGAGGATTCC
22007                         7  2.14e-07     CGCGAG GGAGGCGGAGGGAGGT GCGCTGTCCC
269557                      323  3.57e-07 TATGAGAACT GGGAGCGGAGGCAGCT GCTAGACTGA
9110                        233  4.03e-07 CCGTTGAAAG GGAGGTAGTGGCTGCA CAAAAGATGG
5256                        250  6.40e-07 TTGGATCTAT GGAGGGAGAAGTGGGT AACGACATGG
17073                       196  1.47e-06 AAGAAGCGAA GGAAGCCAAGGGTGGT TGCCGCGTCG
25393                        54  1.62e-06 CTAGTTTTCG GGGAGTGGAGGGGGCG AAGAAGGGAG
33067                       359  6.35e-06 CTGGTATCTA CAAAGGAGACGCGGCT GCTACTGCAA
11373                       116  1.36e-05 TAACTGTAAT TGAGGTAGTAGTGGCT TCTCCTTCGT
7709                         86  1.97e-05 AAAAATTGTT GGAAGCAGTTGAAGCG GCCGTTGTGC
14923                        77  3.04e-05 CTACATCTGT GGTGGTCGTGACTGCT GCGATGAGGT
21898                       205  4.28e-05 TGCCTCCTTT GAGAGGAAGCGGTGCT ATTTGAGATA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10953                             6.7e-09  106_[+2]_378
270060                            1.6e-07  156_[+2]_328
22007                             2.1e-07  6_[+2]_478
269557                            3.6e-07  322_[+2]_162
9110                                4e-07  232_[+2]_252
5256                              6.4e-07  249_[+2]_235
17073                             1.5e-06  195_[+2]_289
25393                             1.6e-06  53_[+2]_431
33067                             6.3e-06  358_[+2]_126
11373                             1.4e-05  115_[+2]_369
7709                                2e-05  85_[+2]_399
14923                               3e-05  76_[+2]_408
21898                             4.3e-05  204_[+2]_280
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=13
10953                    (  107) GGAAGGAGAGGGGGGT  1
270060                   (  157) GGAGGGAGAGGCTGGA  1
22007                    (    7) GGAGGCGGAGGGAGGT  1
269557                   (  323) GGGAGCGGAGGCAGCT  1
9110                     (  233) GGAGGTAGTGGCTGCA  1
5256                     (  250) GGAGGGAGAAGTGGGT  1
17073                    (  196) GGAAGCCAAGGGTGGT  1
25393                    (   54) GGGAGTGGAGGGGGCG  1
33067                    (  359) CAAAGGAGACGCGGCT  1
11373                    (  116) TGAGGTAGTAGTGGCT  1
7709                     (   86) GGAAGCAGTTGAAGCG  1
14923                    (   77) GGTGGTCGTGACTGCT  1
21898                    (  205) GAGAGGAAGCGGTGCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.45029 E= 2.8e-001
 -1035   -159    174   -179
   -69  -1035    174  -1035
   147  -1035    -13   -179
   111  -1035     87  -1035
 -1035  -1035    199  -1035
 -1035     41     61     20
   131    -59    -13  -1035
   -69  -1035    174  -1035
   131  -1035   -171     20
   -69    -59    129   -179
  -169  -1035    187  -1035
  -169     73     61    -80
   -11  -1035     61     52
 -1035  -1035    199  -1035
 -1035    141     61  -1035
   -69  -1035    -71    137
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 2.8e-001
 0.000000  0.076923  0.846154  0.076923
 0.153846  0.000000  0.846154  0.000000
 0.692308  0.000000  0.230769  0.076923
 0.538462  0.000000  0.461538  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.307692  0.384615  0.307692
 0.615385  0.153846  0.230769  0.000000
 0.153846  0.000000  0.846154  0.000000
 0.615385  0.000000  0.076923  0.307692
 0.153846  0.153846  0.615385  0.076923
 0.076923  0.000000  0.923077  0.000000
 0.076923  0.384615  0.384615  0.153846
 0.230769  0.000000  0.384615  0.384615
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.615385  0.384615  0.000000
 0.153846  0.000000  0.153846  0.692308
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GG[AG][AG]G[GCT][AG]G[AT]GG[CG][GTA]G[CG]T
--------------------------------------------------------------------------------




Time  3.21 secs.

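The regular expression reported for Motif 2 can be used directly for a quick, exact-match scan of a sequence. This is only a rough sketch: the regular expression keeps just the more frequent letters at each position, so it is stricter than the probability matrix and will miss some of the sites MEME itself reports. The function name `find_matches` and the test fragment (the 10953 site with its printed flanks) are illustrative choices, not part of the MEME output.

```python
import re

# Motif 2 regular expression, copied from the report above.
MOTIF2_REGEX = re.compile(r"GG[AG][AG]G[GCT][AG]G[AT]GG[CG][GTA]G[CG]T")


def find_matches(sequence):
    """Return 1-based start positions of non-overlapping matches."""
    return [m.start() + 1 for m in MOTIF2_REGEX.finditer(sequence.upper())]


# Left flank + site + right flank of sequence 10953, as listed in the
# "sites sorted by position p-value" table for Motif 2.
fragment = "GAACGAGGGA" + "GGAAGGAGAGGGGGGT" + "TTGGAGTACG"
print(find_matches(fragment))

# The 33067 site starts with C, which the regular expression does not allow,
# so a plain regex scan does not recover it.
print(find_matches("CAAAGGAGACGCGGCT"))
```

Positions returned here are relative to the fragment passed in, not to the full 500-letter training sequence.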
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   9  llr = 135  E-value = 1.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  92121:47::2333:24141:
pos.-specific     C  :79:6a6219:7:394:921a
probability       G  :1:13:::1:6:1:1:3::6:
matrix            T  1::7:::1812:63:32:32:

         bits    2.1      *              *
                 1.9      *              *
                 1.7   *  *   *    *  *  *
                 1.5 * *  *   *    *  *  *
Relative         1.3 * *  *   *    *  *  *
Entropy          1.1 * *  **  * *  *  *  *
(21.6 bits)      0.8 ***  ***** *  *  *  *
                 0.6 ************* *  *  *
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           ACCTCCCATCGCTACCACAGC
consensus             A AG AC  AAAC TG TT
sequence                       T  T AT C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
25393                       297  1.73e-09 ACACTTCGCA ACCACCCATCGCAACCGCAAC AACAGTTCAT
11373                       453  3.70e-09 AACTCCTTCA ACCTCCAATCGAATCCAAAGC CAAACAATCC
5256                         39  4.18e-09 TCAGTCTCAT AACTCCCCTCACTACTACTGC TCGCTTCTTC
270060                      394  4.71e-09 ATACGCCTCC ACCTCCCATCGATCGAGCAGC CCCCGAAACG
33067                       398  6.26e-08 CAAAGGCCAC TCCTGCCACCACTACCGCTGC ATCTACATCG
269557                      102  9.42e-08 ACTTGACTTG ACCTGCCTTTGCTTCTACTTC TTTTGCCAGT
14923                       412  4.09e-07 AGACAACATT ACCGACACTCTCGCCTACCGC CCCCTCTCCT
10953                       446  4.09e-07 CCTCCCAAAC AACACCAAGCTCTCCCTCCCC ACAGCCACAC
9110                        167  5.84e-07 CGTAGGACGA AGATGCAATCGAATCATCATC AGCTTGTGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25393                             1.7e-09  296_[+3]_183
11373                             3.7e-09  452_[+3]_27
5256                              4.2e-09  38_[+3]_441
270060                            4.7e-09  393_[+3]_86
33067                             6.3e-08  397_[+3]_82
269557                            9.4e-08  101_[+3]_378
14923                             4.1e-07  411_[+3]_68
10953                             4.1e-07  445_[+3]_34
9110                              5.8e-07  166_[+3]_313
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=9
25393                    (  297) ACCACCCATCGCAACCGCAAC  1
11373                    (  453) ACCTCCAATCGAATCCAAAGC  1
5256                     (   39) AACTCCCCTCACTACTACTGC  1
270060                   (  394) ACCTCCCATCGATCGAGCAGC  1
33067                    (  398) TCCTGCCACCACTACCGCTGC  1
269557                   (  102) ACCTGCCTTTGCTTCTACTTC  1
14923                    (  412) ACCGACACTCTCGCCTACCGC  1
10953                    (  446) AACACCAAGCTCTCCCTCCCC  1
9110                     (  167) AGATGCAATCGAATCATCATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6240 bayes= 9.56981 E= 1.2e+000
   184   -982   -982   -126
   -16    153   -118   -982
  -116    194   -982   -982
   -16   -982   -118    132
  -116    126     40   -982
  -982    211   -982   -982
    84    126   -982   -982
   142     -6   -982   -126
  -982   -106   -118    154
  -982    194   -982   -126
   -16   -982    114    -27
    42    153   -982   -982
    42   -982   -118    105
    42     53   -982     32
  -982    194   -118   -982
   -16     94   -982     32
    84   -982     40    -27
  -116    194   -982   -982
    84     -6   -982     32
  -116   -106    114    -27
  -982    211   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 1.2e+000
 0.888889  0.000000  0.000000  0.111111
 0.222222  0.666667  0.111111  0.000000
 0.111111  0.888889  0.000000  0.000000
 0.222222  0.000000  0.111111  0.666667
 0.111111  0.555556  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.444444  0.555556  0.000000  0.000000
 0.666667  0.222222  0.000000  0.111111
 0.000000  0.111111  0.111111  0.777778
 0.000000  0.888889  0.000000  0.111111
 0.222222  0.000000  0.555556  0.222222
 0.333333  0.666667  0.000000  0.000000
 0.333333  0.000000  0.111111  0.555556
 0.333333  0.333333  0.000000  0.333333
 0.000000  0.888889  0.111111  0.000000
 0.222222  0.444444  0.000000  0.333333
 0.444444  0.000000  0.333333  0.222222
 0.111111  0.888889  0.000000  0.000000
 0.444444  0.222222  0.000000  0.333333
 0.111111  0.111111  0.555556  0.222222
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
A[CA]C[TA][CG]C[CA][AC]TC[GAT][CA][TA][ACT]C[CTA][AGT]C[ATC][GT]C
--------------------------------------------------------------------------------




Time  4.58 secs.

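The non-floor entries of the position-specific scoring matrices appear consistent with 100 times the base-2 log ratio of the motif probability to the background frequency, rounded to an integer. The sketch below checks that relationship for one row of Motif 3. It is an assumption-laden illustration rather than MEME's own computation: it uses the three-decimal background values printed in this report, and it returns `None` for zero probabilities because the large negative floor values (-982, -1035) depend on pseudocount handling that is not spelled out in this file. The name `log_odds` is hypothetical.

```python
import math

# Background frequencies in A, C, G, T order, as printed in the report.
BACKGROUND = [0.249, 0.231, 0.252, 0.267]


def log_odds(prob_row, background=BACKGROUND):
    """Return round(100 * log2(p / bg)) per letter, or None where p is zero."""
    scores = []
    for p, bg in zip(prob_row, background):
        if p > 0.0:
            scores.append(round(100 * math.log2(p / bg)))
        else:
            scores.append(None)  # floor value not reconstructed in this sketch
    return scores


# First row of the Motif 3 probability matrix; the matching row of the
# scoring matrix in the report is 184, -982, -982, -126.
print(log_odds([0.888889, 0.000000, 0.000000, 0.111111]))
```

Because the background values are rounded in the report, a recomputed score may occasionally differ from the printed one by a unit.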
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10953                            1.20e-09  106_[+2(6.75e-09)]_323_\
    [+3(4.09e-07)]_17_[+1(1.02e-05)]_5
11373                            1.48e-08  115_[+2(1.36e-05)]_321_\
    [+3(3.70e-09)]_4_[+1(8.48e-06)]_11
14923                            1.36e-05  76_[+2(3.04e-05)]_218_\
    [+1(6.62e-05)]_89_[+3(4.09e-07)]_68
17073                            1.07e-04  195_[+2(1.47e-06)]_254_\
    [+1(1.02e-05)]_23
21898                            2.06e-03  204_[+2(4.28e-05)]_267_\
    [+1(1.38e-05)]_1
22007                            3.35e-07  6_[+2(2.14e-07)]_117_[+2(1.45e-05)]_\
    303_[+1(1.76e-07)]_30
25393                            2.38e-10  53_[+2(1.62e-06)]_108_\
    [+1(1.74e-06)]_107_[+3(1.73e-09)]_183
269557                           4.28e-08  101_[+3(9.42e-08)]_200_\
    [+2(3.57e-07)]_150_[+1(4.05e-05)]
270060                           2.82e-10  101_[+1(7.62e-06)]_43_\
    [+2(1.64e-07)]_221_[+3(4.71e-09)]_86
33067                            7.46e-08  267_[+1(6.24e-06)]_79_\
    [+2(6.35e-06)]_23_[+3(6.26e-08)]_82
5256                             2.20e-09  38_[+3(4.18e-09)]_15_[+1(2.02e-05)]_\
    46_[+3(6.58e-05)]_96_[+2(6.40e-07)]_235
7709                             2.68e-03  85_[+2(1.97e-05)]_277_\
    [+1(1.28e-05)]_110
9110                             3.56e-09  166_[+3(5.84e-07)]_45_\
    [+2(4.03e-07)]_169_[+1(3.83e-07)]_71
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
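Each combined block diagram alternates spacer lengths with motif occurrences written as `[strand motif(p-value)]`, joined by underscores, and a trailing backslash marks a wrapped line. A minimal sketch of a parser for that notation follows, assuming the layout seen in this report; `parse_diagram`, `covered_length`, and `MOTIF_WIDTHS` are hypothetical names, and the widths are taken from the three MOTIF header lines above.

```python
import re

# Motif widths from the MOTIF header lines of this report.
MOTIF_WIDTHS = {1: 12, 2: 16, 3: 21}

# One motif occurrence, e.g. "[+2(6.75e-09)]".
SITE = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram):
    """Split a diagram into ('gap', n) and ('site', strand, motif, p) items."""
    # Undo the line wrapping: drop backslashes, newlines and indentation.
    text = diagram.replace("\\", "").replace("\n", "").replace(" ", "")
    items = []
    for token in text.split("_"):
        if not token:
            continue
        match = SITE.fullmatch(token)
        if match:
            strand, motif, pvalue = match.groups()
            items.append(("site", strand, int(motif), float(pvalue)))
        else:
            items.append(("gap", int(token)))
    return items


def covered_length(items, widths=MOTIF_WIDTHS):
    """Total sequence length implied by the gaps and motif widths."""
    return sum(widths[item[2]] if item[0] == "site" else item[1] for item in items)


# Diagram for sequence 10953, including its wrapped second line.
example = "106_[+2(6.75e-09)]_323_\\\n    [+3(4.09e-07)]_17_[+1(1.02e-05)]_5"
items = parse_diagram(example)
print(items)
print(covered_length(items))
```

Summing the gaps and motif widths gives a simple consistency check against the sequence lengths listed in the training set table.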