********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/249/249.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17199                    1.0000    500  22404                    1.0000    500
50497                    1.0000    500  11759                    1.0000    500
44266                    1.0000    500  38531                    1.0000    500
49878                    1.0000    500  45037                    1.0000    500
48137                    1.0000    500  49018                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/249/249.seqs.fa -oc motifs/249 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.257 C 0.235 G 0.219 T 0.290
Background letter frequencies (from dataset with add-one prior applied):
A 0.257 C 0.235 G 0.219 T 0.290
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   6  llr = 106  E-value = 6.1e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:5:7a28:53:53:2857
pos.-specific     C  aa::522:2:5:28:322::3
probability       G  :::a:82:7::55:338323:
matrix            T  :::::::::25::22::3:2:

         bits    2.2 ** *
                 2.0 ****   *
                 1.8 ****   *
                 1.5 **** * *        *
Relative         1.3 **** * * *   *  * *
Entropy          1.1 ****** * * * *  * * *
(25.5 bits)      0.9 ************ *  * * *
                 0.7 *************** * ***
                 0.4 ***************** ***
                 0.2 ***************** ***
                 0.0 ---------------------

Multilevel           CCAGAGAAGACAGCAAGGAAA
consensus                C     TGA GC T GC
sequence                            G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
49878                       386  1.84e-10 CACGTTCGTA CCAGCGAACACAGCACGGAGA TTTCGTATTT
38531                       245  3.77e-10 ATGAGAAGGC CCAGAGAAGATGACTCGGAGA AAAAGCAACG
44266                       372  6.29e-10 TCGTTCAAGA CCAGCGAAGATAGTGGGTAAA GGGATAACAG
50497                       108  1.29e-08 AGATTTTGTA CCAGCGGAGTTGACAGGAAAC AGTGCTTTGA
49018                       110  1.67e-08 GCGATATCTG CCAGAGCAGACACCGAGCATC GAATTATTTT
11759                       120  4.23e-08 ACAACATCTT CCAGACAAAACGGCAACTGAA AGAGCACTAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49878                             1.8e-10  385_[+1]_94
38531                             3.8e-10  244_[+1]_235
44266                             6.3e-10  371_[+1]_108
50497                             1.3e-08  107_[+1]_372
49018                             1.7e-08  109_[+1]_370
11759                             4.2e-08  119_[+1]_360
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
49878                    (  386) CCAGCGAACACAGCACGGAGA  1
38531                    (  245) CCAGAGAAGATGACTCGGAGA  1
44266                    (  372) CCAGCGAAGATAGTGGGTAAA  1
50497                    (  108) CCAGCGGAGTTGACAGGAAAC  1
49018                    (  110) CCAGAGCAGACACCGAGCATC  1
11759                    (  120) CCAGACAAAACGGCAACTGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 10.09 E= 6.1e+000
  -923    209   -923   -923
  -923    209   -923   -923
   196   -923   -923   -923
  -923   -923    219   -923
    96    109   -923   -923
  -923    -49    193   -923
   138    -49    -39   -923
   196   -923   -923   -923
   -62    -49    161   -923
   170   -923   -923    -80
  -923    109   -923     79
    96   -923    119   -923
    38    -49    119   -923
  -923    183   -923    -80
    96   -923     61    -80
    38     51     61   -923
  -923    -49    193   -923
   -62    -49     61     20
   170   -923    -39   -923
    96   -923     61    -80
   138     51   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 6.1e+000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.666667  0.166667  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.166667  0.666667  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.500000  0.000000  0.500000
 0.500000  0.000000  0.500000  0.000000
 0.333333  0.166667  0.500000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.500000  0.000000  0.333333  0.166667
 0.333333  0.333333  0.333333  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.166667  0.166667  0.333333  0.333333
 0.833333  0.000000  0.166667  0.000000
 0.500000  0.000000  0.333333  0.166667
 0.666667  0.333333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CCAG[AC]GAAGA[CT][AG][GA]C[AG][ACG]G[GT]A[AG][AC]
--------------------------------------------------------------------------------


Time 0.97 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   9  llr = 129  E-value = 3.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :23:a317:317:21:266:
pos.-specific     C  a1:9:21:27911::98:1:
probability       G  :711:24:::::434:::::
matrix            T  ::6::2338::24441:43a

         bits    2.2 *
                 2.0 *   *
                 1.8 *   *              *
                 1.5 *  **     *    *   *
Relative         1.3 *  **     *    **  *
Entropy          1.1 *  **   ***    **  *
(20.6 bits)      0.9 ** **  ****    *** *
                 0.7 ** **  ****** ******
                 0.4 *****  *************
                 0.2 ***** **************
                 0.0 --------------------

Multilevel           CGTCAAGATCCAGTGCCAAT
consensus             AA  CTTCA TTGT ATT
sequence                  G       A
                          T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
38531                       298  4.42e-10 CGTATTGGAT CGACATGATCCATTGCCTAT TAGTCGGTAC
45037                       171  7.00e-09 GCCCCGTTCT CGTCAGAATACAGTGCCTAT TGGCTGTTAG
50497                        45  2.16e-08 ATATGTAAAA CGTCACCATCCATGTCAAAT GTAAATATAA
17199                       480  1.41e-07 AGCTGCTGTA CGTCAAGACACAGAGCAACT G
22404                       297  1.96e-07 CGGCGTTTGA CGACAGTTTCCAGGTTCTTT TCCCCTGGAA
49878                       241  3.57e-07 GACGATAGCT CATCAATATCCTGAACCTTT CTTTCAGAAA
48137                       140  4.69e-07 GGGTCGTGCC CGACATGACCATTGTCCATT CTGCGTCGAA
44266                       339  5.01e-07 CGACCATTTC CCTGACGTTCCACTGCCAAT TGCTCGTTCA
11759                        51  6.45e-07 AGTGTGCCGG CAGCAATTTACCTTTCCAAT AGCTTGCTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38531                             4.4e-10  297_[+2]_183
45037                               7e-09  170_[+2]_310
50497                             2.2e-08  44_[+2]_436
17199                             1.4e-07  479_[+2]_1
22404                               2e-07  296_[+2]_184
49878                             3.6e-07  240_[+2]_240
48137                             4.7e-07  139_[+2]_341
44266                               5e-07  338_[+2]_142
11759                             6.4e-07  50_[+2]_430
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=9
38531                    (  298) CGACATGATCCATTGCCTAT  1
45037                    (  171) CGTCAGAATACAGTGCCTAT  1
50497                    (   45) CGTCACCATCCATGTCAAAT  1
17199                    (  480) CGTCAAGACACAGAGCAACT  1
22404                    (  297) CGACAGTTTCCAGGTTCTTT  1
49878                    (  241) CATCAATATCCTGAACCTTT  1
48137                    (  140) CGACATGACCATTGTCCATT  1
44266                    (  339) CCTGACGTTCCACTGCCAAT  1
11759                    (   51) CAGCAATTTACCTTTCCAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4810 bayes= 9.19374 E= 3.6e-001
  -982    209   -982   -982
   -21   -108    161   -982
    38   -982    -98     94
  -982    192    -98   -982
   196   -982   -982   -982
    38     -8      2    -38
  -121   -108    102     20
   138   -982   -982     20
  -982     -8   -982    142
    38    151   -982   -982
  -121    192   -982   -982
   138   -108   -982    -38
  -982   -108    102     62
   -21   -982     61     62
  -121   -982    102     62
  -982    192   -982   -138
   -21    173   -982   -982
   111   -982   -982     62
   111   -108   -982     20
  -982   -982   -982    178
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 9 E= 3.6e-001
 0.000000  1.000000  0.000000  0.000000
 0.222222  0.111111  0.666667  0.000000
 0.333333  0.000000  0.111111  0.555556
 0.000000  0.888889  0.111111  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.222222  0.222222  0.222222
 0.111111  0.111111  0.444444  0.333333
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.222222  0.000000  0.777778
 0.333333  0.666667  0.000000  0.000000
 0.111111  0.888889  0.000000  0.000000
 0.666667  0.111111  0.000000  0.222222
 0.000000  0.111111  0.444444  0.444444
 0.222222  0.000000  0.333333  0.444444
 0.111111  0.000000  0.444444  0.444444
 0.000000  0.888889  0.000000  0.111111
 0.222222  0.777778  0.000000  0.000000
 0.555556  0.000000  0.000000  0.444444
 0.555556  0.111111  0.000000  0.333333
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[GA][TA]CA[ACGT][GT][AT][TC][CA]C[AT][GT][TGA][GT]C[CA][AT][AT]T
--------------------------------------------------------------------------------


Time 1.90 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   5  llr = 77  E-value = 3.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :82:8::22a:a4::2
pos.-specific     C  a:8::4222:a:6288
probability       G  :2:a22826::::8::
matrix            T  :::::4:4::::::2:

         bits    2.2 *  *      *
                 2.0 *  *     ***
                 1.8 *  *     ***
                 1.5 *  *  *  *** *
Relative         1.3 ***** *  *** ***
Entropy          1.1 ***** *  *******
(22.1 bits)      0.9 ***** *  *******
                 0.7 ***** * ********
                 0.4 ******* ********
                 0.2 ******* ********
                 0.0 ----------------

Multilevel           CACGACGTGACACGCC
consensus             GA GTCAA   ACTA
sequence                  G CC
                            G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
22404                       341  4.87e-09 CTCACGTCAA CACGACGGAACACGCC TAGCTCCAAC
48137                        69  6.93e-09 TCATAAAGTT CACGATGTGACACGTC GTTACATCTT
49878                        28  9.12e-08 ATCCAGATGG CACGGCCAGACAAGCC AGAATCTCCT
17199                       169  9.12e-08 CGGAGGAGTG CACGAGGTCACAACCC AAGACTGGCG
45037                         4  2.32e-07        TCC CGAGATGCGACACGCA GTCAGTCGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22404                             4.9e-09  340_[+3]_144
48137                             6.9e-09  68_[+3]_416
49878                             9.1e-08  27_[+3]_457
17199                             9.1e-08  168_[+3]_316
45037                             2.3e-07  3_[+3]_481
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=5
22404                    (  341) CACGACGGAACACGCC  1
48137                    (   69) CACGATGTGACACGTC  1
49878                    (   28) CACGGCCAGACAAGCC  1
17199                    (  169) CACGAGGTCACAACCC  1
45037                    (    4) CGAGATGCGACACGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 10.1721 E= 3.9e+002
  -897    209   -897   -897
   164   -897    -13   -897
   -36    177   -897   -897
  -897   -897    219   -897
   164   -897    -13   -897
  -897     77    -13     46
  -897    -23    187   -897
   -36    -23    -13     46
   -36    -23    145   -897
   196   -897   -897   -897
  -897    209   -897   -897
   196   -897   -897   -897
    64    135   -897   -897
  -897    -23    187   -897
  -897    177   -897    -53
   -36    177   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 3.9e+002
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.400000  0.200000  0.400000
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.200000  0.200000  0.400000
 0.200000  0.200000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.800000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[AG][CA]G[AG][CTG][GC][TACG][GAC]ACA[CA][GC][CT][CA]
--------------------------------------------------------------------------------


Time 2.78 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17199                            5.56e-07  168_[+3(9.12e-08)]_8_[+2(6.83e-05)]_\
    267_[+2(1.41e-07)]_1
22404                            8.77e-09  296_[+2(1.96e-07)]_24_\
    [+3(4.87e-09)]_144
50497                            6.29e-09  44_[+2(2.16e-08)]_43_[+1(1.29e-08)]_\
    372
11759                            1.10e-06  50_[+2(6.45e-07)]_49_[+1(4.23e-08)]_\
    360
44266                            1.43e-08  146_[+2(4.11e-06)]_172_\
    [+2(5.01e-07)]_13_[+1(6.29e-10)]_108
38531                            7.63e-12  244_[+1(3.77e-10)]_32_\
    [+2(4.42e-10)]_183
49878                            4.33e-13  27_[+3(9.12e-08)]_197_\
    [+2(3.57e-07)]_56_[+2(9.11e-05)]_49_[+1(1.84e-10)]_94
45037                            7.40e-08  3_[+3(2.32e-07)]_151_[+2(7.00e-09)]_\
    310
48137                            2.31e-08  68_[+3(6.93e-09)]_55_[+2(4.69e-07)]_\
    341
49018                            9.51e-05  109_[+1(1.67e-08)]_370
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************