********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/494/494.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31407                    1.0000    500  37956                    1.0000    500
38149                    1.0000    500  48423                    1.0000    500
25769                    1.0000    500  33543                    1.0000    500
35949                    1.0000    500  35009                    1.0000    500
35154                    1.0000    500  38283                    1.0000    500
44748                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/494/494.seqs.fa -oc motifs/494 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.248 C 0.259 G 0.234 T 0.259
Background letter frequencies (from dataset with add-one prior applied):
A 0.248 C 0.259 G 0.234 T 0.259
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =  11  llr = 141  E-value = 5.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  45322:21::11:::31::3
pos.-specific     C  647::a:::5312652387:
probability       G  :1:7::722:4:::5:2122
matrix            T  :1:18:17853884:55115

         bits    2.1
                 1.9      *
                 1.7      *
                 1.5      *
Relative         1.3     **  *   *
Entropy          1.0 * ***** ** ****  *
(18.5 bits)      0.8 * ******** ****  **
                 0.6 * ******** ***** ***
                 0.4 * ******** ***** ***
                 0.2 ********************
                 0.0 --------------------

Multilevel           CACGTCGTTTGTTCGTTCCT
consensus            ACA      CC  TCAC  A
sequence                       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
25769                       169  1.54e-10 CCGGGTCAAA CACGTCGTTCCTTCGATCCT GGCGGTGGTG
35949                       359  7.48e-10 GCAAAGCATT CCCGTCGTGTGTTCCTTCCT ACTACTAGCT
44748                       236  4.15e-08 TTTCTCCAAC CACGTCATTCATTCGTCCCA CGACGACGCA
35154                       391  8.11e-08 GAATTCTCCA ACCAACGTTCGTTTCTTCCT TTCGACAACG
33543                       367  5.98e-07 GAAAGCTATG CAAATCGTTCTTTCCCACCA TGGTAACTGG
35009                       138  7.05e-07 CCCCTCCAAC ACCGTCTTGTCTTCGTCCGT GGTCGTATTG
31407                       399  8.29e-07 CCGTTTTCCT AAAGTCGATCGTTCCTGGCT CATTCTTTGC
38283                       262  1.05e-06 GAATAGAGTG CGAGTCGGTTGATTGATCCT ACACAATGCA
48423                       370  2.66e-06 GTTTTGCTAC CACGTCGTTTTCTTCCGCTA GCTAGCGGAG
37956                       294  3.67e-06 AAAGAGAGAC CCCTTCATTTTTCTGTTCGG GTCACTCTTG
38149                        32  2.08e-05 CAAAGTTTCG ATCGACGGTTCTCCGACTCG GCACTGCCGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25769                             1.5e-10  168_[+1]_312
35949                             7.5e-10  358_[+1]_122
44748                             4.1e-08  235_[+1]_245
35154                             8.1e-08  390_[+1]_90
33543                               6e-07  366_[+1]_114
35009                             7.1e-07  137_[+1]_343
31407                             8.3e-07  398_[+1]_82
38283                               1e-06  261_[+1]_219
48423                             2.7e-06  369_[+1]_111
37956                             3.7e-06  293_[+1]_187
38149                             2.1e-05  31_[+1]_449
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=11
25769                    (  169) CACGTCGTTCCTTCGATCCT  1
35949                    (  359) CCCGTCGTGTGTTCCTTCCT  1
44748                    (  236) CACGTCATTCATTCGTCCCA  1
35154                    (  391) ACCAACGTTCGTTTCTTCCT  1
33543                    (  367) CAAATCGTTCTTTCCCACCA  1
35009                    (  138) ACCGTCTTGTCTTCGTCCGT  1
31407                    (  399) AAAGTCGATCGTTCCTGGCT  1
38283                    (  262) CGAGTCGGTTGATTGATCCT  1
48423                    (  370) CACGTCGTTTTCTTCCGCTA  1
37956                    (  294) CCCTTCATTTTTCTGTTCGG  1
38149                    (   32) ATCGACGGTTCTCCGACTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5291 bayes= 8.90689 E= 5.7e+001
    55   129 -1010 -1010
    87    49  -136  -151
    14   149 -1010 -1010
   -45 -1010   164  -151
   -45 -1010 -1010   166
 -1010   195 -1010 -1010
   -45 -1010   164  -151
  -144 -1010   -36   149
 -1010 -1010   -36   166
 -1010    81 -1010   107
  -144     7    64     7
  -144  -151 -1010   166
 -1010   -51 -1010   166
 -1010   129 -1010    49
 -1010    81   122 -1010
    14   -51 -1010   107
  -144     7   -36    81
 -1010   166  -136  -151
 -1010   149   -36  -151
    14 -1010   -36   107
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 11 E= 5.7e+001
 0.363636  0.636364  0.000000  0.000000
 0.454545  0.363636  0.090909  0.090909
 0.272727  0.727273  0.000000  0.000000
 0.181818  0.000000  0.727273  0.090909
 0.181818  0.000000  0.000000  0.818182
 0.000000  1.000000  0.000000  0.000000
 0.181818  0.000000  0.727273  0.090909
 0.090909  0.000000  0.181818  0.727273
 0.000000  0.000000  0.181818  0.818182
 0.000000  0.454545  0.000000  0.545455
 0.090909  0.272727  0.363636  0.272727
 0.090909  0.090909  0.000000  0.818182
 0.000000  0.181818  0.000000  0.818182
 0.000000  0.636364  0.000000  0.363636
 0.000000  0.454545  0.545455  0.000000
 0.272727  0.181818  0.000000  0.545455
 0.090909  0.272727  0.181818  0.454545
 0.000000  0.818182  0.090909  0.090909
 0.000000  0.727273  0.181818  0.090909
 0.272727  0.000000  0.181818  0.545455
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CA][AC][CA]GTCGTT[TC][GCT]TT[CT][GC][TA][TC]CC[TA]
--------------------------------------------------------------------------------

Time  1.17 secs.

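The Motif 1 regular expression above can be applied directly to candidate sites. The following is a minimal sketch, not part of the MEME output: the helper name `matches_motif1` is hypothetical, and the two site strings are copied from the Motif 1 sites table. It suggests that the regular expression keeps only the more frequent letters at each position and so is stricter than the scoring matrix: the site from sequence 35949 has the second-best p-value but carries a G at position 9, where the expression allows only T.

```python
import re

# Regular expression copied from the "Motif 1 regular expression" block.
MOTIF1_REGEX = re.compile(
    r"[CA][AC][CA]GTCGTT[TC][GCT]TT[CT][GC][TA][TC]CC[TA]"
)

# Two Motif 1 sites copied from the "sites sorted by position p-value" table.
SITE_25769 = "CACGTCGTTCCTTCGATCCT"
SITE_35949 = "CCCGTCGTGTGTTCCTTCCT"


def matches_motif1(site):
    """Return True if the whole site matches the Motif 1 regular expression.

    Hypothetical helper for illustration; MEME itself scores sites with
    the position-specific scoring matrix, not with this expression.
    """
    return MOTIF1_REGEX.fullmatch(site) is not None


print(matches_motif1(SITE_25769))  # site from sequence 25769
print(matches_motif1(SITE_35949))  # site from sequence 35949 (G at position 9)
```

For searching sequence databases, the scoring matrix (for example via MAST) is the more sensitive choice; the regular expression is only a compact summary.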
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  18  sites =   6  llr = 93  E-value = 7.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::2:2:2:2:32::2:::
pos.-specific     C  2:8:::82:252::7::2
probability       G  85:882:782:3a::8a:
matrix            T  :5:2:8:2:723:a22:8

         bits    2.1             *   *
                 1.9             **  *
                 1.7             **  *
                 1.5 *  **   *   ** **
Relative         1.3 * ***** *   ** ***
Entropy          1.0 ******* *   ** ***
(22.4 bits)      0.8 *********   ** ***
                 0.6 **********  ******
                 0.4 *********** ******
                 0.2 *********** ******
                 0.0 ------------------

Multilevel           GGCGGTCGGTCGGTCGGT
consensus             T        AT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
38283                       114  1.20e-10 GGAAGGCTTG GTCGGTCGGTCAGTCGGT TGGAGACGGA
35154                       219  1.95e-09 CGGGATCGTT GGCGGTCCGTCCGTCGGT CCACGAGAGA
25769                       254  4.43e-08 ATATATAGGT GTAGGTAGGTAGGTAGGT AGGCAATTAT
35949                       326  7.29e-08 CGCGTTGGAT GGCGATCGGCATGTCTGT CCTTGGCAAA
33543                       395  1.81e-07 CATGGTAACT GGCTGGCTGTCTGTCGGC CTTCATGACT
38149                       358  3.60e-07 TGAGGATTCA CTCGGTCGAGTGGTTGGT TGGTTGTTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38283                             1.2e-10  113_[+2]_369
35154                             1.9e-09  218_[+2]_264
25769                             4.4e-08  253_[+2]_229
35949                             7.3e-08  325_[+2]_157
33543                             1.8e-07  394_[+2]_88
38149                             3.6e-07  357_[+2]_125
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=6
38283                    (  114) GTCGGTCGGTCAGTCGGT  1
35154                    (  219) GGCGGTCCGTCCGTCGGT  1
25769                    (  254) GTAGGTAGGTAGGTAGGT  1
35949                    (  326) GGCGATCGGCATGTCTGT  1
33543                    (  395) GGCTGGCTGTCTGTCGGC  1
38149                    (  358) CTCGGTCGAGTGGTTGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 5313 bayes= 10.2366 E= 7.9e+001
  -923   -64   183  -923
  -923  -923   110    95
   -57   168  -923  -923
  -923  -923   183   -63
   -57  -923   183  -923
  -923  -923   -49   168
   -57   168  -923  -923
  -923   -64   151   -63
   -57  -923   183  -923
  -923   -64   -49   136
    43    95  -923   -63
   -57   -64    51    36
  -923  -923   209  -923
  -923  -923  -923   195
   -57   136  -923   -63
  -923  -923   183   -63
  -923  -923   209  -923
  -923   -64  -923   168
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 6 E= 7.9e+001
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.166667  0.666667  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.166667  0.166667  0.666667
 0.333333  0.500000  0.000000  0.166667
 0.166667  0.166667  0.333333  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.666667  0.000000  0.166667
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.166667  0.000000  0.833333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[GT]CGGTCGGT[CA][GT]GTCGGT
--------------------------------------------------------------------------------

Time  2.44 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   6  llr = 100  E-value = 2.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::32:778:2732852:a::
pos.-specific     C  ::23::::2353:5:3:::8:
probability       G  837:8a32:52:73222a:2a
matrix            T  2723:::2:22:::::7::::

         bits    2.1      *           ** *
                 1.9      *           ** *
                 1.7      *           ** *
                 1.5 *   **        *  ** *
Relative         1.3 *   **  *     *  ****
Entropy          1.0 **  *** *  ** *  ****
(24.2 bits)      0.8 *** *****  ** *  ****
                 0.6 *** ****** **********
                 0.4 ********** **********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GTGAGGAAAGCAGCAATGACG
consensus             G C  G  C CAG C
sequence                T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
48423                         9  6.84e-10   AACGATTC GTGAGGGAATGAGGAATGACG GAGGAATCGC
33543                       260  2.20e-09 ATTTTTGCCC GTGCGGATACCAGAAGTGACG TCATTCGGCA
35949                       213  8.63e-09 GATGACGAAA TTCCGGAAAGTCGCAATGACG TGGTGTTTTC
38149                       110  1.68e-08 GACGCACTAC GGGTAGGAAGCCGGACGGACG GATAAGTAGC
25769                       120  2.82e-08 TTGTGTTCCC GTTAGGAGAGAAACGATGACG CAGGTTGGTT
35154                        95  4.51e-08 TCCAAAAGTG GGGTGGAACCCAACACAGAGG CTTTCTCTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48423                             6.8e-10  8_[+3]_471
33543                             2.2e-09  259_[+3]_220
35949                             8.6e-09  212_[+3]_267
38149                             1.7e-08  109_[+3]_370
25769                             2.8e-08  119_[+3]_360
35154                             4.5e-08  94_[+3]_385
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
48423                    (    9) GTGAGGGAATGAGGAATGACG  1
33543                    (  260) GTGCGGATACCAGAAGTGACG  1
35949                    (  213) TTCCGGAAAGTCGCAATGACG  1
38149                    (  110) GGGTAGGAAGCCGGACGGACG  1
25769                    (  120) GTTAGGAGAGAAACGATGACG  1
35154                    (   95) GGGTGGAACCCAACACAGAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.2276 E= 2.0e+002
  -923  -923   183   -63
  -923  -923    51   136
  -923   -64   151   -63
    43    36  -923    36
   -57  -923   183  -923
  -923  -923   209  -923
   143  -923    51  -923
   143  -923   -49   -63
   175   -64  -923  -923
  -923    36   110   -63
   -57    95   -49   -63
   143    36  -923  -923
    43  -923   151  -923
   -57    95    51  -923
   175  -923   -49  -923
   101    36   -49  -923
   -57  -923   -49   136
  -923  -923   209  -923
   201  -923  -923  -923
  -923   168   -49  -923
  -923  -923   209  -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.0e+002
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.166667  0.666667  0.166667
 0.333333  0.333333  0.000000  0.333333
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.666667  0.000000  0.166667  0.166667
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.333333  0.500000  0.166667
 0.166667  0.500000  0.166667  0.166667
 0.666667  0.333333  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.166667  0.500000  0.333333  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.500000  0.333333  0.166667  0.000000
 0.166667  0.000000  0.166667  0.666667
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[TG]G[ACT]GG[AG]AA[GC]C[AC][GA][CG]A[AC]TGACG
--------------------------------------------------------------------------------

Time  3.59 secs.

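The log-odds and letter-probability matrices above describe the same motif in two forms. The sketch below is an illustration rather than MEME's own code: it assumes each log-odds entry is 100 times the base-2 logarithm of the letter probability divided by the background frequency. The function name `log_odds_score` is hypothetical. With the background frequencies printed in the command line summary, this reproduces the nonzero Motif 3 entries to within about one unit; the small differences are expected because the printed frequencies are rounded to three decimals.

```python
import math

# Background letter frequencies copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.248, "C": 0.259, "G": 0.234, "T": 0.259}


def log_odds_score(prob, letter, background=BACKGROUND):
    """Return 100 * log2(prob / background[letter]).

    Assumed relationship between the two matrices (hypothetical helper).
    Returns None for a zero probability, where the log is undefined; the
    file prints a large negative value (-923 for Motif 3) in that case.
    """
    if prob <= 0.0:
        return None
    return 100.0 * math.log2(prob / background[letter])


# First row of the Motif 3 letter-probability matrix, in A, C, G, T order.
first_row = {"A": 0.0, "C": 0.0, "G": 0.833333, "T": 0.166667}
for letter, prob in first_row.items():
    print(letter, log_odds_score(prob, letter))
```

The corresponding row printed in the log-odds matrix is `-923 -923 183 -63`, which can be compared against the values this sketch prints.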
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31407                            3.76e-03  398_[+1(8.29e-07)]_82
37956                            7.15e-03  293_[+1(3.67e-06)]_187
38149                            4.73e-09  31_[+1(2.08e-05)]_58_[+3(1.68e-08)]_\
    227_[+2(3.60e-07)]_125
48423                            8.31e-08  8_[+3(6.84e-10)]_340_[+1(2.66e-06)]_\
    111
25769                            1.66e-14  119_[+3(2.82e-08)]_28_\
    [+1(1.54e-10)]_65_[+2(4.43e-08)]_229
33543                            1.38e-11  259_[+3(2.20e-09)]_86_\
    [+1(5.98e-07)]_8_[+2(1.81e-07)]_88
35949                            3.89e-14  77_[+3(5.96e-05)]_114_\
    [+3(8.63e-09)]_92_[+2(7.29e-08)]_15_[+1(7.48e-10)]_122
35009                            6.06e-03  137_[+1(7.05e-07)]_343
35154                            5.08e-13  53_[+2(4.16e-05)]_23_[+3(4.51e-08)]_\
    103_[+2(1.95e-09)]_154_[+1(8.11e-08)]_90
38283                            2.78e-09  113_[+2(1.20e-10)]_130_\
    [+1(1.05e-06)]_219
44748                            4.95e-04  235_[+1(4.15e-08)]_245
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
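The motif diagrams in this report encode site positions as alternating gap lengths and bracketed motif occurrences. The sketch below shows one way to decode them; it is an illustration, not a MEME utility. The names `parse_diagram`, `TOKEN` and `MOTIF_WIDTHS` are hypothetical, the widths are taken from the three MOTIF header lines, and the input is assumed to have had any continuation lines (those ending in a backslash) joined into a single string first.

```python
import re

# Motif widths copied from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 20, 2: 18, 3: 21}

# A token is either a bracketed motif such as [+1] or [+3(2.82e-08)],
# or a bare integer giving the number of bases between motifs.
TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Decode a motif diagram into (motif, strand, start) tuples.

    Hypothetical helper: returns the list of sites with 1-based start
    positions, plus the total sequence length implied by the diagram.
    """
    position = 0  # number of bases consumed so far
    sites = []
    for match in TOKEN.finditer(diagram):
        strand, motif, _pvalue, gap = match.groups()
        if gap is not None:
            position += int(gap)
        else:
            motif_id = int(motif)
            sites.append((motif_id, strand, position + 1))
            position += widths[motif_id]
    return sites, position


# Combined diagram for sequence 25769, with its continuation line joined.
sites, length = parse_diagram(
    "119_[+3(2.82e-08)]_28_[+1(1.54e-10)]_65_[+2(4.43e-08)]_229"
)
print(sites, length)
```

For sequence 25769 the decoded start positions can be checked against the Start column of the three per-motif site tables, and the implied length against the Length column of the training set.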