********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/462/462.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10428                    1.0000    500  11059                    1.0000    500
11266                    1.0000    500  11722                    1.0000    500
18775                    1.0000    500  21774                    1.0000    500
22669                    1.0000    500  23376                    1.0000    500
23948                    1.0000    500  24013                    1.0000    500
2524                     1.0000    500  3036                     1.0000    500
7493                     1.0000    500  7950                     1.0000    500
8897                     1.0000    500  9268                     1.0000    500
bd687                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/462/462.seqs.fa -oc motifs/462 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.262 C 0.252 G 0.226 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.262 C 0.252 G 0.226 T 0.260
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  20  sites =   8  llr = 151  E-value = 9.4e-009
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::434:::a:::::6:::5:
pos.-specific     C  48::1:98:1a:58:6a:4a
probability       G  :36:5111:::a::4::a::
matrix            T  6::8:9:1:9::53:4::1:

         bits    2.1            *     *
                 1.9         * **    ** *
                 1.7         * **    ** *
                 1.5      ** ****    ** *
Relative         1.3  *   ** ****    ** *
Entropy          1.1 **** ** ********** *
(27.3 bits)      0.9 **** ************* *
                 0.6 ********************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           TCGTGTCCATCGCCACCGAC
consensus            CGAAA       TTGT  C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value            Site
-------------            ----- ---------            --------------------
8897                       163  8.84e-12 GCTCCACCCA TCGTATCCATCGTCACCGAC CAAGACTGCA
11266                      472  1.12e-10 TCTCCACCCA TCGTATCCATCGTTACCGAC CAAGACCGC
bd687                      374  1.90e-10 ATCCGCAAGT CCATGTCCATCGCCATCGCC TCTGTTGATA
11059                      375  1.90e-10 ATCCGCAAGT CCATGTCCATCGCCATCGCC TCTGTTGATA
11722                      306  2.44e-10 GCTCCACCCA TCGTATCGATCGTCACCGAC CAAGACCGCA
18775                       64  3.65e-09 CGCGTCGGTG TCGAGTCTATCGCTGCCGAC GAAGGGCGAG
7493                       102  1.84e-08 GACGCCGGAT TGAAGTGCACCGCCGCCGCC GCACCGTGAG
24013                       66  1.84e-08 CAGACGAAGG CGGTCGCCATCGTCGTCGTC GTTGAACGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8897                              8.8e-12  162_[+1]_318
11266                             1.1e-10  471_[+1]_9
bd687                             1.9e-10  373_[+1]_107
11059                             1.9e-10  374_[+1]_106
11722                             2.4e-10  305_[+1]_175
18775                             3.6e-09  63_[+1]_417
7493                              1.8e-08  101_[+1]_379
24013                             1.8e-08  65_[+1]_415
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=8
8897                     (  163) TCGTATCCATCGTCACCGAC  1
11266                    (  472) TCGTATCCATCGTTACCGAC  1
bd687                    (  374) CCATGTCCATCGCCATCGCC  1
11059                    (  375) CCATGTCCATCGCCATCGCC  1
11722                    (  306) TCGTATCGATCGTCACCGAC  1
18775                    (   64) TCGAGTCTATCGCTGCCGAC  1
7493                     (  102) TGAAGTGCACCGCCGCCGCC  1
24013                    (   66) CGGTCGCCATCGTCGTCGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 8177 bayes= 9.99594 E= 9.4e-009
  -965     57   -965    127
  -965    157     14   -965
    52   -965    146   -965
    -6   -965   -965    153
    52   -101    114   -965
  -965   -965    -86    175
  -965    179    -86   -965
  -965    157    -86   -105
   193   -965   -965   -965
  -965   -101   -965    175
  -965    198   -965   -965
  -965   -965    214   -965
  -965     98   -965     95
  -965    157   -965     -5
   126   -965     73   -965
  -965    131   -965     53
  -965    198   -965   -965
  -965   -965    214   -965
    93     57   -965   -105
  -965    198   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 9.4e-009
 0.000000  0.375000  0.000000  0.625000
 0.000000  0.750000  0.250000  0.000000
 0.375000  0.000000  0.625000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.375000  0.125000  0.500000  0.000000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.750000  0.125000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.125000  0.000000  0.875000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.750000  0.000000  0.250000
 0.625000  0.000000  0.375000  0.000000
 0.000000  0.625000  0.000000  0.375000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.375000  0.000000  0.125000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[TC][CG][GA][TA][GA]TCCATCG[CT][CT][AG][CT]CG[AC]C
--------------------------------------------------------------------------------




Time  2.35 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  21  sites =   9  llr = 157  E-value = 1.1e-006
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  4317138:1:a7:8a:4:a8:
pos.-specific     C  :79:1::169:142:::3:::
probability       G  6:::37:231:24::a27:29
matrix            T  :::34:27::::1:::3:::1

         bits    2.1                *
                 1.9           *   **  *
                 1.7           *   **  * *
                 1.5   *      **   **  * *
Relative         1.3   *      **  ***  ***
Entropy          1.1 **** **  **  *** ****
(25.2 bits)      0.9 **** *** *** *** ****
                 0.6 **** *********** ****
                 0.4 **** ****************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GCCATGATCCAACAAGAGAAG
consensus            AA TGATGG  GGC  TC G
sequence                             G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value             Site
-------------            ----- ---------            ---------------------
8897                        76  4.47e-13 CTCCTTGTCA GCCATGATCCAACAAGAGAAG ACCAACTACT
11722                      219  4.47e-13 CTCCTTGTCA GCCATGATCCAACAAGAGAAG ACCAACTACT
11266                      367  4.47e-13 CTCCTTGACA GCCATGATCCAACAAGAGAAG ACCAACTACT
bd687                      297  6.36e-09 CGGCCGAGCC AACAGAATGCAAGCAGTCAAG TCTGAAGCAC
11059                      298  6.36e-09 CGGCCGAGCC AACAGAATGCAAGCAGTCAAG TCTGAAGCAC
23948                      379  1.30e-08 AACAGCCATG ACCTTGTGCCACCAAGGGAAG TCAAGCGAGG
7493                       386  5.92e-08 TTGGCTCTCA GCCTCAACACAAGAAGTGAGG TCTGTCCGCG
2524                         8  9.28e-08    TGATTTT GCCTGGTTGCAGTAAGGCAAT TTCTTCTCAT
3036                       327  1.51e-07 CTTCGATAGG AAAAAGAGCGAGGAAGAGAGG GTGGGTGACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8897                              4.5e-13  75_[+2]_404
11722                             4.5e-13  218_[+2]_261
11266                             4.5e-13  366_[+2]_113
bd687                             6.4e-09  296_[+2]_183
11059                             6.4e-09  297_[+2]_182
23948                             1.3e-08  378_[+2]_101
7493                              5.9e-08  385_[+2]_94
2524                              9.3e-08  7_[+2]_472
3036                              1.5e-07  326_[+2]_153
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=9
8897                     (   76) GCCATGATCCAACAAGAGAAG  1
11722                    (  219) GCCATGATCCAACAAGAGAAG  1
11266                    (  367) GCCATGATCCAACAAGAGAAG  1
bd687                    (  297) AACAGAATGCAAGCAGTCAAG  1
11059                    (  298) AACAGAATGCAAGCAGTCAAG  1
23948                    (  379) ACCTTGTGCCACCAAGGGAAG  1
7493                     (  386) GCCTCAACACAAGAAGTGAGG  1
2524                     (    8) GCCTGGTTGCAGTAAGGCAAT  1
3036                     (  327) AAAAAGAGCGAGGAAGAGAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 10.6715 E= 1.1e-006
    76   -982    129   -982
    35    140   -982   -982
  -123    181   -982   -982
   135   -982   -982     36
  -123   -118     56     78
    35   -982    156   -982
   157   -982   -982    -22
  -982   -118     -3    136
  -123    114     56   -982
  -982    181   -103   -982
   193   -982   -982   -982
   135   -118     -3   -982
  -982     82     97   -122
   157    -18   -982   -982
   193   -982   -982   -982
  -982   -982    214   -982
    76   -982     -3     36
  -982     40    156   -982
   193   -982   -982   -982
   157   -982     -3   -982
  -982   -982    197   -122
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 1.1e-006
 0.444444  0.000000  0.555556  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.111111  0.888889  0.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.111111  0.111111  0.333333  0.444444
 0.333333  0.000000  0.666667  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.000000  0.111111  0.222222  0.666667
 0.111111  0.555556  0.333333  0.000000
 0.000000  0.888889  0.111111  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.111111  0.222222  0.000000
 0.000000  0.444444  0.444444  0.111111
 0.777778  0.222222  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.444444  0.000000  0.222222  0.333333
 0.000000  0.333333  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.000000  0.000000  0.888889  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[GA][CA]C[AT][TG][GA][AT][TG][CG]CA[AG][CG][AC]AG[ATG][GC]A[AG]G
--------------------------------------------------------------------------------




Time  4.62 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  21  sites =   9  llr = 152  E-value = 4.6e-005
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1:8299::7::72::1:1:33
pos.-specific     C  79:81:9919a1:272a:961
probability       G  2:1::111:1:227:::1:16
matrix            T  :11:::::2:::6137:81::

         bits    2.1
                 1.9           *     *
                 1.7           *     *
                 1.5  *  **** **     * *
Relative         1.3  * ***** **     * *
Entropy          1.1  ******* **   * ***
(24.3 bits)      0.9 ******** *** ** ***
                 0.6 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCACAACCACCATGCTCTCCG
consensus            G  A    T  GACTC   AA
sequence                         G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value             Site
-------------            ----- ---------            ---------------------
11722                      152  2.88e-13 ACCACAATTA CCACAACCACCATGCTCTCCG ATTTCCAATC
11266                      293  2.88e-13 ATTACAATTA CCACAACCACCATGCTCTCCG CTTTCCGATC
8897                         8  2.55e-12    ACAATTA CCACAACCACCATGCCCTCCG CTTTCCAATC
3036                       479  3.75e-08 GGTAAGCCGG GCAAAACCTCCATGCCCTTGG A
18775                      414  4.31e-08 AAGCAAGGCC ACAAAACGACCGGGTTCTCCA GACTGCCTGA
24013                      357  4.62e-08 GTAGACACCC GCGCCACCACCAACTTCTCCA CACAAACACG
7493                       446  1.46e-07 GGAACGAGTT CCTCAAGCACCCTGCTCGCAC CCTCACCTCC
22669                      460  1.81e-07 TCATACACTG CTACAACCTCCGATCTCACAA TACAACACTC
21774                      409  2.60e-07 AGCCATGCCA CCACAGCCCGCAGCTACTCAG CACACGTCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11722                             2.9e-13  151_[+3]_328
11266                             2.9e-13  292_[+3]_187
8897                              2.5e-12  7_[+3]_472
3036                              3.8e-08  478_[+3]_1
18775                             4.3e-08  413_[+3]_66
24013                             4.6e-08  356_[+3]_123
7493                              1.5e-07  445_[+3]_34
22669                             1.8e-07  459_[+3]_20
21774                             2.6e-07  408_[+3]_71
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=9
11722                    (  152) CCACAACCACCATGCTCTCCG  1
11266                    (  293) CCACAACCACCATGCTCTCCG  1
8897                     (    8) CCACAACCACCATGCCCTCCG  1
3036                     (  479) GCAAAACCTCCATGCCCTTGG  1
18775                    (  414) ACAAAACGACCGGGTTCTCCA  1
24013                    (  357) GCGCCACCACCAACTTCTCCA  1
7493                     (  446) CCTCAAGCACCCTGCTCGCAC  1
22669                    (  460) CTACAACCTCCGATCTCACAA  1
21774                    (  409) CCACAGCCCGCAGCTACTCAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 10.6715 E= 4.6e-005
  -123    140     -3   -982
  -982    181   -982   -122
   157   -982   -103   -122
   -23    162   -982   -982
   176   -118   -982   -982
   176   -982   -103   -982
  -982    181   -103   -982
  -982    181   -103   -982
   135   -118   -982    -22
  -982    181   -103   -982
  -982    198   -982   -982
   135   -118     -3   -982
   -23   -982     -3    110
  -982    -18    156   -122
  -982    140   -982     36
  -123    -18   -982    136
  -982    198   -982   -982
  -123   -982   -103    158
  -982    181   -982   -122
    35    114   -103   -982
    35   -118    129   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 4.6e-005
 0.111111  0.666667  0.222222  0.000000
 0.000000  0.888889  0.000000  0.111111
 0.777778  0.000000  0.111111  0.111111
 0.222222  0.777778  0.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.666667  0.111111  0.000000  0.222222
 0.000000  0.888889  0.111111  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.111111  0.222222  0.000000
 0.222222  0.000000  0.222222  0.555556
 0.000000  0.222222  0.666667  0.111111
 0.000000  0.666667  0.000000  0.333333
 0.111111  0.222222  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.111111  0.000000  0.111111  0.777778
 0.000000  0.888889  0.000000  0.111111
 0.333333  0.555556  0.111111  0.000000
 0.333333  0.111111  0.555556  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[CG]CA[CA]AACC[AT]CC[AG][TAG][GC][CT][TC]CTC[CA][GA]
--------------------------------------------------------------------------------




Time  6.88 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10428                            2.53e-01  500
11059                            1.07e-10  297_[+2(6.36e-09)]_56_\
    [+1(1.90e-10)]_106
11266                            3.15e-24  292_[+3(2.88e-13)]_53_\
    [+2(4.47e-13)]_50_[+1(6.67e-06)]_14_[+1(1.12e-10)]_9
11722                            6.70e-24  151_[+3(2.88e-13)]_46_\
    [+2(4.47e-13)]_26_[+3(2.47e-05)]_19_[+1(2.44e-10)]_175
18775                            7.12e-09  63_[+1(3.65e-09)]_330_\
    [+3(4.31e-08)]_66
21774                            9.70e-04  408_[+3(2.60e-07)]_71
22669                            3.12e-03  459_[+3(1.81e-07)]_20
23376                            7.21e-01  500
23948                            1.28e-04  378_[+2(1.30e-08)]_101
24013                            1.06e-08  65_[+1(1.84e-08)]_271_\
    [+3(4.62e-08)]_123
2524                             3.88e-04  7_[+2(9.28e-08)]_472
3036                             9.86e-08  326_[+2(1.51e-07)]_131_\
    [+3(3.75e-08)]_1
7493                             9.37e-12  101_[+1(1.84e-08)]_264_\
    [+2(5.92e-08)]_39_[+3(1.46e-07)]_34
7950                             1.22e-01  500
8897                             2.22e-24  7_[+3(2.55e-12)]_47_[+2(4.47e-13)]_\
    26_[+3(1.51e-05)]_19_[+1(8.84e-12)]_73_[+1(2.58e-05)]_225
9268                             4.52e-01  500
bd687                            1.07e-10  296_[+2(6.36e-09)]_56_\
    [+1(1.90e-10)]_107
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************