********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/477/477.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1566                     1.0000    500  23878                    1.0000    500
25537                    1.0000    500  31075                    1.0000    500
34476                    1.0000    500  35404                    1.0000    500
586                      1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/477/477.seqs.fa -oc motifs/477 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.238 G 0.229 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.238 G 0.229 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 7 llr = 99 E-value = 2.0e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
-------------------------------------------------------------------------------- Simplified A 14:::63::4:93:a: pos.-specific C ::::11::11::6::: probability G 9:3a93:1914::a:9 matrix T :67:::79:3611::1 bits 2.1 * * 1.9 * ** 1.7 * ** 1.5 * ** * *** Relative 1.3 * ** ** * *** Entropy 1.1 * *** *** ** *** (20.3 bits) 0.9 ***** *** ** *** 0.6 ********* ****** 0.4 ********* ****** 0.2 **************** 0.0 ---------------- Multilevel GTTGGATTGATACGAG consensus AG GA TG A sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 586 196 1.50e-09 AGATGATCCA GTTGGATTGTGACGAG GGAGTACTTG 35404 196 1.50e-09 AGATGATCCA GTTGGATTGTGACGAG GGAGTACTTG 25537 248 4.37e-08 TGACGAGGAG GAGGGGTTGATAAGAG GCCACCGCCA 23878 356 1.31e-07 AGGCGATGCT GTTGGGTTGGTTCGAG AGGTAGGGCT 34476 205 6.60e-07 CGAGACTGCA GATGCAATGAGACGAT GCTTTAGGAG 31075 196 1.04e-06 GACATCTGTC GTTGGCTTCCTATGAG CAAGGACATT 1566 56 1.46e-06 TATTAGAATG AAGGGAAGGATAAGAG TTCATAAAAG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 586 1.5e-09 195_[+1]_289 35404 1.5e-09 195_[+1]_289 25537 4.4e-08 247_[+1]_237 23878 1.3e-07 355_[+1]_129 34476 6.6e-07 204_[+1]_280 31075 1e-06 195_[+1]_289 1566 1.5e-06 55_[+1]_429 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format 
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=7
586                      (  196) GTTGGATTGTGACGAG  1
35404                    (  196) GTTGGATTGTGACGAG  1
25537                    (  248) GAGGGGTTGATAAGAG  1
23878                    (  356) GTTGGGTTGGTTCGAG  1
34476                    (  205) GATGCAATGAGACGAT  1
31075                    (  196) GTTGGCTTCCTATGAG  1
1566                     (   56) AAGGGAAGGATAAGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 3395 bayes= 8.91886 E= 2.0e-002
   -89   -945    190   -945
    69   -945   -945    110
  -945   -945     32    142
  -945   -945    212   -945
  -945    -74    190   -945
   110    -74     32   -945
    10   -945   -945    142
  -945   -945    -68    168
  -945    -74    190   -945
    69    -74    -68     10
  -945   -945     90    110
   169   -945   -945    -90
    10    126   -945    -90
  -945   -945    212   -945
   191   -945   -945   -945
  -945   -945    190    -90
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 2.0e-002
 0.142857  0.000000  0.857143  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.571429  0.142857  0.285714  0.000000
 0.285714  0.000000  0.000000  0.714286
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.142857  0.857143  0.000000
 0.428571  0.142857  0.142857  0.285714
 0.000000  0.000000  0.428571  0.571429
 0.857143  0.000000  0.000000  0.142857
 0.285714  0.571429  0.000000  0.142857
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.857143  0.142857
--------------------------------------------------------------------------------
-------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- G[TA][TG]GG[AG][TA]TG[AT][TG]A[CA]GAG -------------------------------------------------------------------------------- Time 0.50 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 15 sites = 6 llr = 90 E-value = 1.1e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :8::::::::2:::: pos.-specific C 3::::522:8::22: probability G 72:aa:58722:88: matrix T ::a::53:3:7a::a bits 2.1 ** 1.9 *** * * 1.7 *** * * 1.5 *** * * **** Relative 1.3 ***** * * **** Entropy 1.1 ****** *** **** (21.7 bits) 0.9 ****** *** **** 0.6 *************** 0.4 *************** 0.2 *************** 0.0 --------------- Multilevel GATGGCGGGCTTGGT consensus C TT T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 586 54 5.79e-10 CGAGACGACG GATGGCGGGCTTGGT GATTCCCTGG 35404 54 5.79e-10 CGAGACGACG GATGGCGGGCTTGGT GATTCCCTGG 25537 219 9.26e-08 AATATCAATT GATGGCTGTCTTGCT TGCATGACGA 1566 260 1.14e-07 TGATTCTGAT CATGGTGGTGTTGGT ATTGTACACC 34476 250 2.99e-07 TCAAATGCAA CATGGTTCGCATGGT GGTTGAGATT 23878 447 5.86e-07 CGCGTCGGTC GGTGGTCGGCGTCGT CCATTTGATA -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
586                               5.8e-10  53_[+2]_432
35404                             5.8e-10  53_[+2]_432
25537                             9.3e-08  218_[+2]_267
1566                              1.1e-07  259_[+2]_226
34476                               3e-07  249_[+2]_236
23878                             5.9e-07  446_[+2]_39
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=6
586                      (   54) GATGGCGGGCTTGGT  1
35404                    (   54) GATGGCGGGCTTGGT  1
25537                    (  219) GATGGCTGTCTTGCT  1
1566                     (  260) CATGGTGGTGTTGGT  1
34476                    (  250) CATGGTTCGCATGGT  1
23878                    (  447) GGTGGTCGGCGTCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 3402 bayes= 8.96375 E= 1.1e-002
  -923     48    154   -923
   165   -923    -46   -923
  -923   -923   -923    190
  -923   -923    212   -923
  -923   -923    212   -923
  -923    107   -923     90
  -923    -51    112     32
  -923    -51    186   -923
  -923   -923    154     32
  -923    180    -46   -923
   -67   -923    -46    132
  -923   -923   -923    190
  -923    -51    186   -923
  -923    -51    186   -923
  -923   -923   -923    190
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 1.1e-002
 0.000000  0.333333  0.666667  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.000000  0.000000  1.000000
0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.500000 0.000000 0.500000 0.000000 0.166667 0.500000 0.333333 0.000000 0.166667 0.833333 0.000000 0.000000 0.000000 0.666667 0.333333 0.000000 0.833333 0.166667 0.000000 0.166667 0.000000 0.166667 0.666667 0.000000 0.000000 0.000000 1.000000 0.000000 0.166667 0.833333 0.000000 0.000000 0.166667 0.833333 0.000000 0.000000 0.000000 0.000000 1.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [GC]ATGG[CT][GT]G[GT]CTTGGT -------------------------------------------------------------------------------- Time 0.89 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 5 llr = 102 E-value = 7.1e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::4a8::a:26:a::6::a:: pos.-specific C :a::::::46:::8::26:a: probability G 8:6:28a:22:2:26444::a matrix T 2::::2::4:48::4:4:::: bits 2.1 * * ** 1.9 * * ** * *** 1.7 * * ** * *** 1.5 * * ** * *** Relative 1.3 ** ***** *** *** Entropy 1.1 ******** ***** **** (29.5 bits) 0.9 ******** ****** **** 0.6 ******** ******* **** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel GCGAAGGACCATACGAGCACG consensus T A GT TATG GTGTG sequence GG C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position 
p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                      Site
-------------             ----- ---------            ---------------------
586                          87  8.52e-12 GGATGTTGTC GCGAAGGACCTTACGATGACG CTTGGCGCCT
35404                        87  8.52e-12 GGATGTTGTC GCGAAGGACCTTACGATGACG CTTGGCGCCT
31075                       249  5.88e-10 AAGAGCTTCA GCAAAGGATGATAGTAGCACG ACGATTGAAG
25537                       123  7.26e-10 CATGCGAAAG GCAAGGGAGCAGACGGGCACG TCGGCGTACC
1566                        339  2.59e-09 CCCAGAACCT TCGAATGATAATACTGCCACG TGGATTCAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
586                               8.5e-12  86_[+3]_393
35404                             8.5e-12  86_[+3]_393
31075                             5.9e-10  248_[+3]_231
25537                             7.3e-10  122_[+3]_357
1566                              2.6e-09  338_[+3]_141
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
586                      (   87) GCGAAGGACCTTACGATGACG  1
35404                    (   87) GCGAAGGACCTTACGATGACG  1
31075                    (  249) GCAAAGGATGATAGTAGCACG  1
25537                    (  123) GCAAGGGAGCAGACGGGCACG  1
1566                     (  339) TCGAATGATAATACTGCCACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 9.64205 E= 7.1e-002
  -897   -897    180    -42
  -897    207   -897   -897
    59   -897    139   -897
   191   -897   -897   -897
   159   -897    -20   -897
  -897   -897    180    -42
  -897   -897    212   -897
   191   -897   -897   -897
  -897     75    -20     58
   -41    133    -20   -897
   117   -897   -897     58
  -897   -897    -20    158
   191   -897   -897   -897
  -897    175    -20   -897
  -897   -897    139     58
   117   -897     80   -897
  -897    -25     80     58
  -897    133     80   -897
   191   -897   -897   -897
  -897    207   -897   -897
  -897   -897    212   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 7.1e-002
 0.000000  0.000000  0.800000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.400000  0.200000  0.400000
 0.200000  0.600000  0.200000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.000000  0.200000  0.800000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  0.600000  0.400000
 0.600000  0.000000  0.400000  0.000000
 0.000000  0.200000  0.400000  0.400000
 0.000000  0.600000  0.400000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GT]C[GA]A[AG][GT]GA[CTG][CAG][AT][TG]A[CG][GT][AG][GTC][CG]ACG
--------------------------------------------------------------------------------

Time 1.30 secs.
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1566                             2.45e-11  55_[+1(1.46e-06)]_188_\
    [+2(1.14e-07)]_64_[+3(2.59e-09)]_141
23878                            4.23e-07  355_[+1(1.31e-07)]_75_\
    [+2(5.86e-07)]_39
25537                            2.23e-13  122_[+3(7.26e-10)]_75_\
    [+2(9.26e-08)]_14_[+1(4.37e-08)]_237
31075                            4.95e-09  195_[+1(1.04e-06)]_37_\
    [+3(5.88e-10)]_231
34476                            6.23e-06  204_[+1(6.60e-07)]_29_\
    [+2(2.99e-07)]_236
35404                            1.03e-18  53_[+2(5.79e-10)]_18_[+3(8.52e-12)]_\
    88_[+1(1.50e-09)]_289
586                              1.03e-18  53_[+2(5.79e-10)]_18_[+3(8.52e-12)]_\
    88_[+1(1.50e-09)]_289
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************