********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/53/53.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
21813                    1.0000    500  24126                    1.0000    500
261527                   1.0000    500  261823                   1.0000    500
270391                   1.0000    500  28667                    1.0000    500
31921                    1.0000    500  39941                    1.0000    500
40978                    1.0000    500  4349                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/53/53.seqs.fa -oc motifs/53 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.267 C 0.251 G 0.232 T 0.250
Background letter frequencies (from dataset with add-one prior applied):
A 0.267 C 0.251 G 0.232 T 0.250
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 9 llr = 147 E-value = 1.2e-005
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :392:392:9826a38:1631
pos.-specific     C  a31692:891:84:7:97138
probability       G  :3:214::1:2:::::12::1
matrix            T  ::::::1::::::::2::33:

         bits    2.1
                 1.9 *            *
                 1.7 *            *
                 1.5 * * * * **   *  *
Relative         1.3 * * * ****** * **
Entropy          1.1 * * * ***********   *
(23.6 bits)      0.8 * * * ************  *
                 0.6 * *** ************* *
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CAACCGACCAACAACACCAAC
consensus             C A A A  GAC AT GTC
sequence              G G C             T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
39941                       271  9.97e-10 ACAAAGTCGC CGACCGACCAACCACTCGTTC CACCGAACAC
21813                       417  9.97e-10 ACAAAGTCGC CGACCGACCAACCACTCGTTC CACCGAACAC
270391                      461  1.79e-09 GGAGCGGCTT CCAGCAACCAGCAAAACCACC TCCTACAAAA
261823                       74  1.79e-09 GGAGCGGCTT CCAGCAACCAGCAAAACCACC TCCTACAAAA
40978                       302  8.28e-09 ACAAACAGAC CACCCGACCAACAACACCAAA CAACAAACCA
261527                      403  3.58e-08 TCCTCCCTCA CAACCGACGCAACACACCAAC AAAGGGTCGG
24126                       477  5.42e-08 ACTCCAAACT CCAACCTCCAACAACAGCTCC ATG
4349                        171  9.58e-08 AAGTGACACA CAAACAAACAAACAAACAAAC ACACTAGTAA
31921                       454  1.35e-07 GCAACCTCCT CGACGCAACAACAACACCCTG TATACTCCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39941                               1e-09  270_[+1]_209
21813                               1e-09  416_[+1]_63
270391                            1.8e-09  460_[+1]_19
261823                            1.8e-09  73_[+1]_406
40978                             8.3e-09  301_[+1]_178
261527                            3.6e-08  402_[+1]_77
24126                             5.4e-08  476_[+1]_3
4349                              9.6e-08  170_[+1]_309
31921                             1.4e-07  453_[+1]_26
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
39941                    (  271) CGACCGACCAACCACTCGTTC  1
21813                    (  417) CGACCGACCAACCACTCGTTC  1
270391                   (  461) CCAGCAACCAGCAAAACCACC  1
261823                   (   74) CCAGCAACCAGCAAAACCACC  1
40978                    (  302) CACCCGACCAACAACACCAAA  1
261527                   (  403) CAACCGACGCAACACACCAAC  1
24126                    (  477) CCAACCTCCAACAACAGCTCC  1
4349                     (  171) CAAACAAACAAACAAACAAAC  1
31921                    (  454) CGACGCAACAACAACACCCTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 9.19073 E= 1.2e-005
  -982    199   -982   -982
    32     41     52   -982
   173   -117   -982   -982
   -27    115     -6   -982
  -982    182   -106   -982
    32    -18     94   -982
   173   -982   -982   -117
   -27    163   -982   -982
  -982    182   -106   -982
   173   -117   -982   -982
   154   -982     -6   -982
   -27    163   -982   -982
   106     82   -982   -982
   190   -982   -982   -982
    32    141   -982   -982
   154   -982   -982    -17
  -982    182   -106   -982
  -126    141     -6   -982
   106   -117   -982     42
    32     41   -982     42
  -126    163   -106   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 1.2e-005
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.333333  0.333333  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.222222  0.555556  0.222222  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.333333  0.222222  0.444444  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.222222  0.777778  0.000000  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.555556  0.444444  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.000000  0.888889  0.111111  0.000000
 0.111111  0.666667  0.222222  0.000000
 0.555556  0.111111  0.000000  0.333333
 0.333333  0.333333  0.000000  0.333333
 0.111111  0.777778  0.111111  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[ACG]A[CAG]C[GAC]A[CA]CA[AG][CA][AC]A[CA][AT]C[CG][AT][ACT]C
--------------------------------------------------------------------------------




Time 0.85 secs.

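The Motif 1 regular expression above can be applied with any standard regex
engine. The following is a minimal sketch in Python and is not part of the MEME
output; the names MOTIF1_SITES and regex_hits are illustrative. It checks which
of the nine Motif 1 sites listed above satisfy the regular expression. Because
the expression keeps only the more frequent letters at each position, it is
stricter than the scoring matrix, and a reported site need not match it.

```python
import re

# Regular expression copied from the "Motif 1 regular expression" block.
MOTIF1_RE = re.compile(
    r"C[ACG]A[CAG]C[GAC]A[CA]CA[AG][CA][AC]A[CA][AT]C[CG][AT][ACT]C"
)

# Site strings copied from "Motif 1 sites sorted by position p-value".
MOTIF1_SITES = {
    "39941": "CGACCGACCAACCACTCGTTC",
    "21813": "CGACCGACCAACCACTCGTTC",
    "270391": "CCAGCAACCAGCAAAACCACC",
    "261823": "CCAGCAACCAGCAAAACCACC",
    "40978": "CACCCGACCAACAACACCAAA",
    "261527": "CAACCGACGCAACACACCAAC",
    "24126": "CCAACCTCCAACAACAGCTCC",
    "4349": "CAAACAAACAAACAAACAAAC",
    "31921": "CGACGCAACAACAACACCCTG",
}


def regex_hits(sites, pattern=MOTIF1_RE):
    """Return the names of sites whose full 21-letter string matches the pattern."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


hits = regex_hits(MOTIF1_SITES)
```

For the sites as listed, only 39941, 21813, 270391 and 261823 match; the other
five each contain at least one letter that the expression leaves out, which is
a reason to prefer the scoring matrix when searching for further sites.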
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 9 llr = 142 E-value = 6.5e-005
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::1::::334:::1:::
pos.-specific     C  4113:8::6:12:41::2:7:
probability       G  :3244:9::9:34:7116:32
matrix            T  667262:a4191212991a:8

         bits    2.1        *          *
                 1.9        *          *
                 1.7       ** *        *
                 1.5       ** **    ** *
Relative         1.3      *** **    ** * *
Entropy          1.1 *   *******    ** ***
(22.8 bits)      0.8 * * *******   *** ***
                 0.6 *** *******  **** ***
                 0.4 *********** *********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TTTGTCGTCGTAGAGTTGTCT
consensus            CGGCGT  T  GACT  C GG
sequence                T       CT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
270391                      388  8.35e-11 CCCTCGTCGT TTTTGCGTCGTCGAGTTGTCT TTGACAAACT
261823                        1  8.35e-11          . TTTTGCGTCGTCGAGTTGTCT TTGACAAACT
39941                       308  1.08e-09 ACACCGCCCT TTGCTCGTCGTGACGTTCTCT CGAGACGACA
21813                       454  1.08e-09 ACACCGCCCT TTGCTCGTCGTGACGTTCTCT CGAGACGACA
31921                       314  4.20e-09 GTGTGGCTCT CGTGGTGTTGTAGCGTTGTCG TCGTCGGAGG
4349                         10  9.74e-08  GTCGTCCGT TTTGTCGTTTTTGATTTGTGG CTTTTTTAAA
28667                       288  3.51e-07 GTCGTTCCAA CCCGTCGTCGCAATGTTGTGT GATAATGTGA
24126                       281  5.84e-07 TGAGAGAATA CGTGTTGTTGTGTCCGTTTGT TTGAGTTCGT
261527                        4  6.91e-07        GCA CGTCGCATTGTATATTGATCT TTACGCCAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
270391                            8.4e-11  387_[+2]_92
261823                            8.4e-11  [+2]_479
39941                             1.1e-09  307_[+2]_172
21813                             1.1e-09  453_[+2]_26
31921                             4.2e-09  313_[+2]_166
4349                              9.7e-08  9_[+2]_470
28667                             3.5e-07  287_[+2]_192
24126                             5.8e-07  280_[+2]_199
261527                            6.9e-07  3_[+2]_476
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=9
270391                   (  388) TTTTGCGTCGTCGAGTTGTCT  1
261823                   (    1) TTTTGCGTCGTCGAGTTGTCT  1
39941                    (  308) TTGCTCGTCGTGACGTTCTCT  1
21813                    (  454) TTGCTCGTCGTGACGTTCTCT  1
31921                    (  314) CGTGGTGTTGTAGCGTTGTCG  1
4349                     (   10) TTTGTCGTTTTTGATTTGTGG  1
28667                    (  288) CCCGTCGTCGCAATGTTGTGT  1
24126                    (  281) CGTGTTGTTGTGTCCGTTTGT  1
261527                   (    4) CGTCGCATTGTATATTGATCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 9.90539 E= 6.5e-005
  -982     82   -982    115
  -982   -117     52    115
  -982   -117     -6    142
  -982     41     94    -17
  -982   -982     94    115
  -982    163   -982    -17
  -126   -982    194   -982
  -982   -982   -982    200
  -982    115   -982     83
  -982   -982    194   -117
  -982   -117   -982    183
    32    -18     52   -117
    32   -982     94    -17
    73     82   -982   -117
  -982   -117    152    -17
  -982   -982   -106    183
  -982   -982   -106    183
  -126    -18    126   -117
  -982   -982   -982    200
  -982    141     52   -982
  -982   -982     -6    164
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 6.5e-005
 0.000000  0.444444  0.000000  0.555556
 0.000000  0.111111  0.333333  0.555556
 0.000000  0.111111  0.222222  0.666667
 0.000000  0.333333  0.444444  0.222222
 0.000000  0.000000  0.444444  0.555556
 0.000000  0.777778  0.000000  0.222222
 0.111111  0.000000  0.888889  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.555556  0.000000  0.444444
 0.000000  0.000000  0.888889  0.111111
 0.000000  0.111111  0.000000  0.888889
 0.333333  0.222222  0.333333  0.111111
 0.333333  0.000000  0.444444  0.222222
 0.444444  0.444444  0.000000  0.111111
 0.000000  0.111111  0.666667  0.222222
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.000000  0.111111  0.888889
 0.111111  0.222222  0.555556  0.111111
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.222222  0.777778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TC][TG][TG][GCT][TG][CT]GT[CT]GT[AGC][GAT][AC][GT]TT[GC]T[CG][TG]
--------------------------------------------------------------------------------




Time 1.66 secs.

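The log-odds and letter-probability matrices above appear to be related by
score = 100 * log2(p / background), with the background letter frequencies
taken from the command line summary. The sketch below works under that
assumption and is not MEME's own code; log_odds_score and BACKGROUND are
illustrative names. MEME applies pseudocounts and uses background values with
more digits than are printed, so a zero probability is reported as a large
negative score (-982 for this motif) rather than being undefined, and
individual entries may differ from this calculation by one unit.

```python
import math

# Background letter frequencies as printed in the command line summary.
BACKGROUND = {"A": 0.267, "C": 0.251, "G": 0.232, "T": 0.250}


def log_odds_score(p, letter, background=BACKGROUND):
    """Approximate a log-odds entry as round(100 * log2(p / background)).

    Returns None for p == 0, where the ratio has no finite logarithm
    (MEME prints a large negative score there because of its pseudocounts).
    """
    if p <= 0.0:
        return None
    return round(100 * math.log2(p / background[letter]))


# Row 7 of the Motif 2 letter-probability matrix (order A, C, G, T).
row7 = {"A": 0.111111, "C": 0.0, "G": 0.888889, "T": 0.0}
scores = {letter: log_odds_score(p, letter) for letter, p in row7.items()}
```

For row 7 this gives -126 for A and 194 for G, which agrees with the
corresponding row of the Motif 2 log-odds matrix.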
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 5 llr = 106 E-value = 9.0e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :444::::::::642::::::
pos.-specific     C  4666:6::48:2:::a::a::
probability       G  6::::::84:a8468:a8:::
matrix            T  ::::a4a222:::::::2:aa

         bits    2.1     * *   *     *  **
                 1.9     * *   *    ** ***
                 1.7     * *   *    ** ***
                 1.5     * *   *    ** ***
Relative         1.3     * ** ***  *******
Entropy          1.1 ******** ************
(30.6 bits)      0.8 ******** ************
                 0.6 ******** ************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GCCCTCTGCCGGAGGCGGCTT
consensus            CAAA T TGT CGAA  T
sequence                     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
39941                       144  5.97e-12 ATGTTGGTCA GACCTTTGCCGGAGGCGGCTT GAATTCGGAA
21813                       290  5.97e-12 ATGTTGGTCA GACCTTTGCCGGAGGCGGCTT GAATTCGGAA
270391                      440  5.99e-11 AGAAACGGGA CCAATCTGGCGGGAGCGGCTT CCAGCAACCA
261823                       53  5.99e-11 AGAAACGGGA CCAATCTGGCGGGAGCGGCTT CCAGCAACCA
28667                        58  2.23e-09 TGTCGGGTGT GCCCTCTTTTGCAGACGTCTT TCCCCCGACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39941                               6e-12  143_[+3]_336
21813                               6e-12  289_[+3]_190
270391                              6e-11  439_[+3]_40
261823                              6e-11  52_[+3]_427
28667                             2.2e-09  57_[+3]_422
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
39941                    (  144) GACCTTTGCCGGAGGCGGCTT  1
21813                    (  290) GACCTTTGCCGGAGGCGGCTT  1
270391                   (  440) CCAATCTGGCGGGAGCGGCTT  1
261823                   (   53) CCAATCTGGCGGGAGCGGCTT  1
28667                    (   58) GCCCTCTTTTGCAGACGTCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 10.1572 E= 9.0e-003
  -897     67    137   -897
    58    126   -897   -897
    58    126   -897   -897
    58    126   -897   -897
  -897   -897   -897    200
  -897    126   -897     68
  -897   -897   -897    200
  -897   -897    178    -32
  -897     67     78    -32
  -897    167   -897    -32
  -897   -897    210   -897
  -897    -33    178   -897
   117   -897     78   -897
    58   -897    137   -897
   -42   -897    178   -897
  -897    199   -897   -897
  -897   -897    210   -897
  -897   -897    178    -32
  -897    199   -897   -897
  -897   -897   -897    200
  -897   -897   -897    200
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 9.0e-003
 0.000000  0.400000  0.600000  0.000000
 0.400000  0.600000  0.000000  0.000000
 0.400000  0.600000  0.000000  0.000000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.400000  0.400000  0.200000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC][CA][CA][CA]T[CT]T[GT][CGT][CT]G[GC][AG][GA][GA]CG[GT]CTT
--------------------------------------------------------------------------------




Time 2.47 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21813                            8.77e-19  289_[+3(5.97e-12)]_106_\
                                           [+1(9.97e-10)]_16_[+2(1.08e-09)]_26
24126                            1.41e-06  280_[+2(5.84e-07)]_175_\
                                           [+1(5.42e-08)]_3
261527                           6.97e-07  3_[+2(6.91e-07)]_74_[+1(5.30e-06)]_\
                                           283_[+1(3.58e-08)]_77
261823                           1.21e-18  [+2(8.35e-11)]_31_[+3(5.99e-11)]_\
                                           [+1(1.79e-09)]_99_[+1(6.55e-05)]_213_[+1(1.13e-05)]_52
270391                           1.21e-18  387_[+2(8.35e-11)]_31_\
                                           [+3(5.99e-11)]_[+1(1.79e-09)]_19
28667                            4.64e-08  57_[+3(2.23e-09)]_209_\
                                           [+2(3.51e-07)]_192
31921                            1.57e-08  313_[+2(4.20e-09)]_52_\
                                           [+1(4.67e-05)]_46_[+1(1.35e-07)]_26
39941                            8.77e-19  143_[+3(5.97e-12)]_106_\
                                           [+1(9.97e-10)]_16_[+2(1.08e-09)]_172
40978                            2.60e-04  248_[+1(5.31e-05)]_32_\
                                           [+1(8.28e-09)]_178
4349                             2.51e-07  9_[+2(9.74e-08)]_140_[+1(9.58e-08)]_\
                                           309
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
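
The combined block diagrams above encode, for each sequence, the gaps between
sites and the motif, strand and p-value of each site. The following is a
minimal sketch in Python for turning one diagram into site start positions; it
is not part of the MEME output, and the names SITE_TOKEN and parse_diagram are
illustrative. It assumes every motif has width 21, as is the case in this
file, and that a diagram continued with a trailing backslash has already been
joined into a single string.

```python
import re

# One site token, e.g. "[+3(5.97e-12)]": strand, motif number, p-value.
SITE_TOKEN = re.compile(r"\[([+-])(\d+)\(([0-9.eE+-]+)\)\]")


def parse_diagram(diagram, width=21):
    """Parse a combined block diagram into (motif, strand, start, p-value) tuples.

    Start positions are 1-based. The second return value is the total
    length implied by the diagram (all gaps plus all site widths).
    """
    position = 0
    sites = []
    for token in diagram.split("_"):
        match = SITE_TOKEN.fullmatch(token)
        if match:
            strand, motif, p_value = match.groups()
            sites.append((int(motif), strand, position + 1, float(p_value)))
            position += width
        elif token:
            # A bare number is a gap of that many letters.
            position += int(token)
    return sites, position


# Diagram for sequence 21813, with the line continuation joined.
sites, total = parse_diagram(
    "289_[+3(5.97e-12)]_106_[+1(9.97e-10)]_16_[+2(1.08e-09)]_26"
)
```

For sequence 21813 this yields start positions 290, 417 and 454 for motifs 3,
1 and 2, matching the per-motif site tables, and an implied length of 500,
matching the training set table.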