********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/466/466.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11963                    1.0000    500  14370                    1.0000    500
14700                    1.0000    500  16772                    1.0000    500
20719                    1.0000    500  22399                    1.0000    500
261403                   1.0000    500  263713                   1.0000    500
263715                   1.0000    500  269683                   1.0000    500
270299                   1.0000    500  3056                     1.0000    500
31622                    1.0000    500  32430                    1.0000    500
32704                    1.0000    500  32782                    1.0000    500
33696                    1.0000    500  35373                    1.0000    500
5920                     1.0000    500  bd176                    1.0000    500
bd758                    1.0000    500  bd837                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/466/466.seqs.fa -oc motifs/466 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 22 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 11000 N= 22 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.257 C 0.231 G 0.243 T 0.269 Background letter frequencies (from dataset with add-one prior applied): A 0.257 C 0.231 G 0.243 T 0.269 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 14 llr = 176 E-value = 5.5e-006 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 112:6:1713923a16 pos.-specific C :48a:59:94167:63 probability G 34::4::::2::::2: matrix T 6::::5:311:1:::1 bits 2.1 * 1.9 * * 1.7 * * * 1.5 * * * * Relative 1.3 ** * * * ** Entropy 1.1 ******* * ** (18.2 bits) 0.8 * ******* ****** 0.6 ********* ****** 0.4 ********* ****** 0.2 **************** 0.0 ---------------- Multilevel TCCCACCACCACCACA consensus GGA GT T A AA GC sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 35373 355 2.68e-10 CACTCTTCTC TCCCACCACCACCACA GCTCAGCGCC 22399 474 8.19e-08 
CCTCCCTCCT TGCCACCACTACCACC CATCCCAGTC 11963 360 1.18e-07 GGGTACCACC GCCCATCTCCACAACA CCGCAACCTT 14700 379 2.10e-07 GGGACAGGAG TACCACCTCCACCAGA TGCACCAGCT 14370 379 2.10e-07 GGGACAGGAG TACCACCTCCACCAGA TGCACCAGCT 31622 344 2.92e-07 TGTCACTGGG TGACGTCTCAACCACA GTCAACGCAT 32430 324 5.23e-07 ACAGTGTTGC TGCCACCACAAACAAC ACTACATCGA bd176 16 6.84e-07 CCACAACCTC GCCCGTCACGAAAACA AACAAGGCAC 261403 275 1.59e-06 TTTGAGGATT GCCCATAACCACAACC AAGCGAACCC 32704 70 1.84e-06 GCCGAGACCA AGCCGTCATCACCACA AAGCACTGGG 3056 88 1.99e-06 ATTGCAAGTC TCCCGCCACAACAAGT ACCGTCCAGT 5920 439 2.13e-06 ACTCATCCAA TGACATCACGCTCACA ACTTTGATCG 270299 440 2.13e-06 ACTCATCCAA TGACATCACGCTCACA ACTTTGATCG 16772 460 1.02e-05 GACGCCCGCC GCCCGCCAAAAACAAC GAAGGCGGCC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 35373 2.7e-10 354_[+1]_130 22399 8.2e-08 473_[+1]_11 11963 1.2e-07 359_[+1]_125 14700 2.1e-07 378_[+1]_106 14370 2.1e-07 378_[+1]_106 31622 2.9e-07 343_[+1]_141 32430 5.2e-07 323_[+1]_161 bd176 6.8e-07 15_[+1]_469 261403 1.6e-06 274_[+1]_210 32704 1.8e-06 69_[+1]_415 3056 2e-06 87_[+1]_397 5920 2.1e-06 438_[+1]_46 270299 2.1e-06 439_[+1]_45 16772 1e-05 459_[+1]_25 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=14 35373 ( 355) TCCCACCACCACCACA 1 22399 ( 474) TGCCACCACTACCACC 1 11963 ( 360) GCCCATCTCCACAACA 1 14700 ( 379) TACCACCTCCACCAGA 1 14370 ( 379) TACCACCTCCACCAGA 1 31622 ( 344) TGACGTCTCAACCACA 1 32430 ( 324) TGCCACCACAAACAAC 1 bd176 ( 16) 
GCCCGTCACGAAAACA 1 261403 ( 275) GCCCATAACCACAACC 1 32704 ( 70) AGCCGTCATCACCACA 1 3056 ( 88) TCCCGCCACAACAAGT 1 5920 ( 439) TGACATCACGCTCACA 1 270299 ( 440) TGACATCACGCTCACA 1 16772 ( 460) GCCCGCCAAAAACAAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 10670 bayes= 10.7955 E= 5.5e-006 -184 -1045 23 126 -85 89 82 -1045 -26 176 -1045 -1045 -1045 211 -1045 -1045 132 -1045 56 -1045 -1045 111 -1045 89 -184 201 -1045 -1045 147 -1045 -1045 9 -184 189 -1045 -191 15 89 -18 -191 174 -69 -1045 -1045 -26 148 -1045 -91 15 163 -1045 -1045 196 -1045 -1045 -1045 -85 148 -18 -1045 132 31 -1045 -191 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 14 E= 5.5e-006 0.071429 0.000000 0.285714 0.642857 0.142857 0.428571 0.428571 0.000000 0.214286 0.785714 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.642857 0.000000 0.357143 0.000000 0.000000 0.500000 0.000000 0.500000 0.071429 0.928571 0.000000 0.000000 0.714286 0.000000 0.000000 0.285714 0.071429 0.857143 0.000000 0.071429 0.285714 0.428571 0.214286 0.071429 0.857143 0.142857 0.000000 0.000000 0.214286 0.642857 0.000000 0.142857 0.285714 0.714286 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.142857 0.642857 0.214286 0.000000 0.642857 0.285714 0.000000 0.071429 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression 
-------------------------------------------------------------------------------- [TG][CG][CA]C[AG][CT]C[AT]C[CAG]A[CA][CA]A[CG][AC] -------------------------------------------------------------------------------- Time 3.83 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 11 llr = 174 E-value = 1.4e-005 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::::::3:1:::51::::2:: pos.-specific C :9::7::63::7117:26::5 probability G 2:a:2953645:13:884151 matrix T 81:a1131:6534532::754 bits 2.1 * 1.9 ** 1.7 *** * 1.5 *** * Relative 1.3 **** * * *** Entropy 1.1 ****** *** **** * (22.8 bits) 0.8 ****** ***** ****** 0.6 ****** ***** ******* 0.4 ************ ******** 0.2 ********************* 0.0 --------------------- Multilevel TCGTCGGCGTGCATCGGCTTC consensus AGCGTTTGT G GT sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 14700 162 4.85e-11 CTCCTGAAAG TCGTCGGGCTGCATCGGCTTC TGAATCGGGA 14370 162 4.85e-11 CTCCTGAAAG TCGTCGGGCTGCATCGGCTTC TGAATCGGGA 3056 36 4.11e-10 AAGTGCAACG TCGTCGGCAGTCATCGGCTTC TGTGGTTGTC bd837 316 2.24e-09 AGTAGCTTCT TCGTCGACGGTTATCGGCATC GTCGTCTTCC 5920 231 1.35e-08 CCTCTCAGTT TCGTGGACGTGTTGTGGCTGC TTTGTTCAGT 270299 232 1.35e-08 CCTCTCAGTT TCGTGGACGTGTTGTGGCTGC TTTGTTCAGT 31622 213 5.67e-08 TGTAAAACTG GCGTCGTCGTTCTTCTGGAGT GGTAGAGTTG 269683 265 2.59e-07 ATATTCTCTC 
TCGTTGTGGTGCCGCTGGTGT GGTGTTGTAG 33696 250 2.90e-07 AGTAAGTCTG TCGTCGGTCGTCTCCGGGGGT TGGTTCGCGG 16772 311 2.90e-07 GGTTTGGGTT GTGTCGTCGGTCGTCGCCTTT GCTCGCTCGT 263715 64 4.00e-07 GATGCCCGCT TCGTCTGCGTGCAATGCGTTG GTCAGGGCGG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 14700 4.8e-11 161_[+2]_318 14370 4.8e-11 161_[+2]_318 3056 4.1e-10 35_[+2]_444 bd837 2.2e-09 315_[+2]_164 5920 1.3e-08 230_[+2]_249 270299 1.3e-08 231_[+2]_248 31622 5.7e-08 212_[+2]_267 269683 2.6e-07 264_[+2]_215 33696 2.9e-07 249_[+2]_230 16772 2.9e-07 310_[+2]_169 263715 4e-07 63_[+2]_416 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=11 14700 ( 162) TCGTCGGGCTGCATCGGCTTC 1 14370 ( 162) TCGTCGGGCTGCATCGGCTTC 1 3056 ( 36) TCGTCGGCAGTCATCGGCTTC 1 bd837 ( 316) TCGTCGACGGTTATCGGCATC 1 5920 ( 231) TCGTGGACGTGTTGTGGCTGC 1 270299 ( 232) TCGTGGACGTGTTGTGGCTGC 1 31622 ( 213) GCGTCGTCGTTCTTCTGGAGT 1 269683 ( 265) TCGTTGTGGTGCCGCTGGTGT 1 33696 ( 250) TCGTCGGTCGTCTCCGGGGGT 1 16772 ( 311) GTGTCGTCGGTCGTCGCCTTT 1 263715 ( 64) TCGTCTGCGTGCAATGCGTTG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 10560 bayes= 10.9326 E= 1.4e-005 -1010 -1010 -42 160 -1010 197 -1010 -156 -1010 -1010 204 -1010 -1010 
-1010 -1010 189 -1010 165 -42 -156 -1010 -1010 190 -156 9 -1010 90 2 -1010 146 17 -156 -150 24 139 -1010 -1010 -1010 58 124 -1010 -1010 117 76 -1010 165 -1010 2 82 -134 -142 43 -150 -134 17 102 -1010 165 -1010 2 -1010 -1010 175 -57 -1010 -35 175 -1010 -1010 146 58 -1010 -50 -1010 -142 143 -1010 -1010 90 102 -1010 124 -142 43 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 11 E= 1.4e-005 0.000000 0.000000 0.181818 0.818182 0.000000 0.909091 0.000000 0.090909 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.727273 0.181818 0.090909 0.000000 0.000000 0.909091 0.090909 0.272727 0.000000 0.454545 0.272727 0.000000 0.636364 0.272727 0.090909 0.090909 0.272727 0.636364 0.000000 0.000000 0.000000 0.363636 0.636364 0.000000 0.000000 0.545455 0.454545 0.000000 0.727273 0.000000 0.272727 0.454545 0.090909 0.090909 0.363636 0.090909 0.090909 0.272727 0.545455 0.000000 0.727273 0.000000 0.272727 0.000000 0.000000 0.818182 0.181818 0.000000 0.181818 0.818182 0.000000 0.000000 0.636364 0.363636 0.000000 0.181818 0.000000 0.090909 0.727273 0.000000 0.000000 0.454545 0.545455 0.000000 0.545455 0.090909 0.363636 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- TCGTCG[GAT][CG][GC][TG][GT][CT][AT][TG][CT]GG[CG]T[TG][CT] -------------------------------------------------------------------------------- Time 7.52 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 11 llr = 150 E-value = 1.7e-003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :::::::312:::6:: pos.-specific C :1::::221:::2::: probability G :841a1:48:158:33 matrix T a169:982:895:477 bits 2.1 * 1.9 * * 1.7 * * 1.5 * *** * Relative 1.3 ** **** *** * Entropy 1.1 ******* ******** (19.7 bits) 0.8 ******* ******** 0.6 ******* ******** 0.4 ******* ******** 0.2 ******* ******** 0.0 ---------------- Multilevel TGTTGTTGGTTTGATT consensus G A G TGG sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 269683 359 4.32e-10 GTTGTTTGTC TGTTGTTGGTTTGATT CTTCAGAGGC 14700 268 1.09e-08 CATCATTTTG TGGTGTTTGTTTGATT GATTGCGTCT 14370 268 1.09e-08 CATCATTTTG TGGTGTTTGTTTGATT GATTGCGTCT 5920 293 1.79e-08 AACTTGGGGT TGGTGTTAGTTGGTTT TGGTAGCTTC 270299 294 1.79e-08 AACTTGGGGT TGGTGTTAGTTGGTTT TGGTAGCTTC 20719 340 3.86e-07 TACTCCTGCC TGTGGTTGGTTGGAGG TTGGAGGATG 32704 205 5.14e-07 TCGTCGTTTT TGTTGTTGGATTCTTG ACCATCGGAT bd758 142 9.56e-07 CATCTTGTGA TGTTGTCCATTTGAGT TGTTATTAGT 33696 320 1.09e-06 TCCAGCCGTA TTTTGTTCGATTGATG GTTGATCCCT 16772 290 2.25e-06 GGGGGGTTCA TCTTGTCGCTTGGTTT GGGTTGTGTC 22399 295 3.75e-06 ACGAGGAGGG TGTTGGTAGTGGCAGT GGACGTCATT -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 269683 4.3e-10 358_[+3]_126 14700 1.1e-08 267_[+3]_217 14370 1.1e-08 267_[+3]_217 5920 1.8e-08 292_[+3]_192 270299 1.8e-08 293_[+3]_191 20719 3.9e-07 339_[+3]_145 32704 5.1e-07 204_[+3]_280 bd758 9.6e-07 141_[+3]_343 33696 1.1e-06 319_[+3]_165 16772 2.2e-06 289_[+3]_195 22399 3.8e-06 294_[+3]_190 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=16 seqs=11 269683 ( 359) TGTTGTTGGTTTGATT 1 14700 ( 268) TGGTGTTTGTTTGATT 1 14370 ( 268) TGGTGTTTGTTTGATT 1 5920 ( 293) TGGTGTTAGTTGGTTT 1 270299 ( 294) TGGTGTTAGTTGGTTT 1 20719 ( 340) TGTGGTTGGTTGGAGG 1 32704 ( 205) TGTTGTTGGATTCTTG 1 bd758 ( 142) TGTTGTCCATTTGAGT 1 33696 ( 320) TTTTGTTCGATTGATG 1 16772 ( 290) TCTTGTCGCTTGGTTT 1 22399 ( 295) TGTTGGTAGTGGCAGT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 10670 bayes= 10.2758 E= 1.7e-003 -1010 -1010 -1010 189 -1010 -134 175 -156 -1010 -1010 58 124 -1010 -1010 -142 175 -1010 -1010 204 -1010 -1010 -1010 -142 175 -1010 -35 -1010 160 9 -35 58 -57 -150 -134 175 -1010 -50 -1010 -1010 160 -1010 -1010 -142 175 -1010 -1010 90 102 -1010 -35 175 -1010 131 -1010 -1010 43 -1010 -1010 17 143 -1010 -1010 17 143 -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 1.7e-003 0.000000 0.000000 0.000000 1.000000 0.000000 0.090909 0.818182 0.090909 0.000000 0.000000 0.363636 0.636364 0.000000 0.000000 0.090909 0.909091 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.090909 0.909091 0.000000 0.181818 0.000000 0.818182 0.272727 0.181818 0.363636 0.181818 0.090909 0.090909 0.818182 0.000000 0.181818 0.000000 0.000000 0.818182 0.000000 0.000000 0.090909 0.909091 0.000000 0.000000 0.454545 0.545455 0.000000 0.181818 0.818182 0.000000 0.636364 0.000000 0.000000 0.363636 0.000000 0.000000 0.272727 0.727273 0.000000 0.000000 0.272727 0.727273 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- TG[TG]TGTT[GA]GTT[TG]G[AT][TG][TG] -------------------------------------------------------------------------------- Time 11.10 secs. 
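
The Motif 3 regular expression above can be applied with any standard regex
engine. The following is a minimal Python sketch and not part of the MEME
output; the function name find_sites and the constant names are hypothetical.
It scans only the given strand, in line with the "strands: +" setting reported
in the command line summary. It also shows that the regular expression is a
simplification of the probability matrix: position 8 is written [GA], yet the
sites in 14700 and 14370 carry T at that position (probability 0.181818 in the
matrix), so the expression does not match them.

```python
import re

# Regular expression copied from the "Motif 3 regular expression" block.
MOTIF3_REGEX = re.compile(r"TG[TG]TGTT[GA]GTT[TG]G[AT][TG][TG]")


def find_sites(sequence, pattern=MOTIF3_REGEX):
    """Return 1-based start positions of non-overlapping matches.

    Only the given strand is scanned; no reverse complement is tried.
    """
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]


# Left flank + site + right flank, copied from the 269683 row of the
# "Motif 3 sites sorted by position p-value" table.
FLANKED_269683 = "GTTGTTTGTC" + "TGTTGTTGGTTTGATT" + "CTTCAGAGGC"

# Site reported for 14700 and 14370; it has T at motif position 8.
SITE_14700 = "TGGTGTTTGTTTGATT"

if __name__ == "__main__":
    print(find_sites(FLANKED_269683))  # site begins right after the 10-letter flank
    print(find_sites(SITE_14700))      # not matched by the simplified expression
```

A regex scan of this kind is only a coarse filter. Scoring candidate sites
against the position-specific scoring matrix, as MAST does, uses the full
letter probabilities instead of a per-position letter set.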
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11963                            7.17e-04  359_[+1(1.18e-07)]_125
14370                            9.97e-15  161_[+2(4.85e-11)]_85_\
    [+3(1.09e-08)]_95_[+1(2.10e-07)]_106
14700                            9.97e-15  161_[+2(4.85e-11)]_85_\
    [+3(1.09e-08)]_95_[+1(2.10e-07)]_106
16772                            1.81e-07  164_[+3(1.78e-05)]_109_\
    [+3(2.25e-06)]_5_[+2(2.90e-07)]_128_[+1(1.02e-05)]_25
20719                            2.12e-04  317_[+2(4.01e-05)]_1_[+3(3.86e-07)]_\
    145
22399                            9.43e-06  294_[+3(3.75e-06)]_163_\
    [+1(8.19e-08)]_11
261403                           2.31e-02  274_[+1(1.59e-06)]_210
263713                           1.22e-02  302_[+2(3.38e-05)]_138_\
    [+1(3.89e-05)]_23
263715                           4.43e-03  63_[+2(4.00e-07)]_416
269683                           4.41e-09  264_[+2(2.59e-07)]_73_\
    [+3(4.32e-10)]_126
270299                           2.88e-11  231_[+2(1.35e-08)]_41_\
    [+3(1.79e-08)]_130_[+1(2.13e-06)]_45
3056                             2.45e-08  35_[+2(4.11e-10)]_31_[+1(1.99e-06)]_\
    397
31622                            4.11e-07  212_[+2(5.67e-08)]_110_\
    [+1(2.92e-07)]_141
32430                            1.28e-03  323_[+1(5.23e-07)]_161
32704                            2.22e-05  69_[+1(1.84e-06)]_119_\
    [+3(5.14e-07)]_280
32782                            8.92e-01  500
33696                            7.10e-06  249_[+2(2.90e-07)]_49_\
    [+3(1.09e-06)]_165
35373                            9.76e-07  354_[+1(2.68e-10)]_130
5920                             2.88e-11  230_[+2(1.35e-08)]_41_\
    [+3(1.79e-08)]_130_[+1(2.13e-06)]_46
bd176                            6.28e-04  15_[+1(6.84e-07)]_104_\
    [+2(9.52e-05)]_344
bd758                            1.73e-03  141_[+3(9.56e-07)]_343
bd837                            1.77e-05  315_[+2(2.24e-09)]_164
--------------------------------------------------------------------------------

********************************************************************************
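
The combined block diagrams above encode each sequence as alternating spacer
lengths and motif occurrences of the form [strand motif (p-value)]. The sketch
below is an illustration, not MEME code, and assumes that reading of the
notation. It converts one diagram into 1-based motif start positions. The
function name parse_diagram is hypothetical, and the motif widths are copied
from the three MOTIF headers of this file.

```python
import re

# Motif widths as reported in the "MOTIF n MEME width = ..." headers.
MOTIF_WIDTHS = {1: 16, 2: 21, 3: 16}

# Either a motif occurrence such as [+2(4.85e-11)] or a bare spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Parse a motif diagram into (sites, total_length).

    sites is a list of (motif, strand, start, p_value) tuples with
    1-based start positions; total_length is the implied sequence length.
    """
    # Drop the "\" line-continuation marks and any wrapped whitespace.
    diagram = re.sub(r"[\\\s]+", "", diagram)
    position = 1  # 1-based position of the next unconsumed letter
    sites = []
    for match in TOKEN.finditer(diagram):
        if match.group(4) is not None:
            # Spacer: skip that many letters.
            position += int(match.group(4))
        else:
            strand = match.group(1)
            motif = int(match.group(2))
            p_value = float(match.group(3))
            sites.append((motif, strand, position, p_value))
            position += widths[motif]
    return sites, position - 1


if __name__ == "__main__":
    wrapped = "161_[+2(4.85e-11)]_85_\\\n    [+3(1.09e-08)]_95_[+1(2.10e-07)]_106"
    for site in parse_diagram(wrapped)[0]:
        print(site)
```

For the 14370 diagram this yields start positions 162, 268 and 379 for motifs
2, 3 and 1, which agree with the Start column of the per-motif site tables,
and an implied length of 500, which agrees with the training set listing.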
********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************