********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/90/90.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17166                    1.0000    500  42826                    1.0000    500
37595                    1.0000    500  47471                    1.0000    500
1764                     1.0000    500  33277                    1.0000    500
44168                    1.0000    500  44275                    1.0000    500
44906                    1.0000    500  11527                    1.0000    500
11743                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/90/90.seqs.fa -oc motifs/90 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.235 G 0.221 T 0.272
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.235 G 0.221 T 0.272
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 15 sites = 10 llr = 114 E-value = 1.5e+001
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :1:33::1::3::82 pos.-specific C 1::43::::922:27 probability G 79:12:324:38::1 matrix T 2:a22a77612:a:: bits 2.2 2.0 * * * 1.7 ** * * 1.5 ** * * ** Relative 1.3 ** * * *** Entropy 1.1 ** ** ** *** (16.5 bits) 0.9 *** ***** **** 0.7 *** ***** **** 0.4 *** ***** **** 0.2 **** ***** **** 0.0 --------------- Multilevel GGTCATTTTCAGTAC consensus T AC GGG GC CA sequence TG C T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 44906 35 7.23e-08 AGATATTTTT GGTCGTTTGCTGTAC AGTGATTTCG 44275 274 7.23e-08 TTACAGTTAA GGTCCTGTTCTGTAC CGGTTTTCAC 42826 424 6.70e-07 AGCTATTTGC GGTTATTTGCGGTAA TACCGCATTT 1764 314 7.54e-07 CGGCCTACAT TGTACTGTTCAGTAC TCTGCTCACG 37595 72 1.39e-06 TGAAAAGCCA GGTATTTTGCCCTAC CTCCCCAGGC 11743 159 4.26e-06 GCAAAAATCG GGTTGTTTTCCGTCA GGCGTGCGAC 44168 473 4.26e-06 CGTCTGGACC CGTCATTATCAGTAC CACCAGTACC 33277 37 9.78e-06 GGGAGAACCT GATACTTGTCGCTAC GAAGTTGAAG 47471 271 1.11e-05 GTTTGCCGGC GGTGTTGTTCGGTCG TGGAGCACGG 17166 188 1.11e-05 AGTAGATCGC TGTCATTGGTAGTAC CAGTCGGCGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 44906 7.2e-08 34_[+1]_451 44275 7.2e-08 273_[+1]_212 42826 6.7e-07 423_[+1]_62 1764 7.5e-07 313_[+1]_172 37595 1.4e-06 71_[+1]_414 11743 4.3e-06 
158_[+1]_327 44168 4.3e-06 472_[+1]_13 33277 9.8e-06 36_[+1]_449 47471 1.1e-05 270_[+1]_215 17166 1.1e-05 187_[+1]_298 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=15 seqs=10 44906 ( 35) GGTCGTTTGCTGTAC 1 44275 ( 274) GGTCCTGTTCTGTAC 1 42826 ( 424) GGTTATTTGCGGTAA 1 1764 ( 314) TGTACTGTTCAGTAC 1 37595 ( 72) GGTATTTTGCCCTAC 1 11743 ( 159) GGTTGTTTTCCGTCA 1 44168 ( 473) CGTCATTATCAGTAC 1 33277 ( 37) GATACTTGTCGCTAC 1 47471 ( 271) GGTGTTGTTCGGTCG 1 17166 ( 188) TGTCATTGGTAGTAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 15 n= 5346 bayes= 9.31159 E= 1.5e+001 -997 -123 166 -44 -144 -997 203 -997 -997 -997 -997 188 14 77 -114 -44 14 35 -14 -44 -997 -997 -997 188 -997 -997 44 136 -144 -997 -14 136 -997 -997 86 114 -997 194 -997 -144 14 -23 44 -44 -997 -23 186 -997 -997 -997 -997 188 156 -23 -997 -997 -44 157 -114 -997 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 15 nsites= 10 E= 1.5e+001 0.000000 0.100000 0.700000 0.200000 0.100000 0.000000 0.900000 0.000000 0.000000 0.000000 0.000000 1.000000 0.300000 0.400000 0.100000 0.200000 0.300000 0.300000 0.200000 0.200000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.300000 0.700000 0.100000 0.000000 0.200000 0.700000 0.000000 0.000000 
0.400000 0.600000 0.000000 0.900000 0.000000 0.100000 0.300000 0.200000 0.300000 0.200000 0.000000 0.200000 0.800000 0.000000 0.000000 0.000000 0.000000 1.000000 0.800000 0.200000 0.000000 0.000000 0.200000 0.700000 0.100000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [GT]GT[CAT][ACGT]T[TG][TG][TG]C[AGCT][GC]T[AC][CA] -------------------------------------------------------------------------------- Time 1.05 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 20 sites = 5 llr = 89 E-value = 2.6e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::::::4::24:222:2a:: pos.-specific C 2::2842::::82::::::: probability G :88:2428a8::648a8:a4 matrix T 8228:222::62:4:::::6 bits 2.2 * * * 2.0 * * ** 1.7 * * ** 1.5 * * ** Relative 1.3 ** * *** * ***** Entropy 1.1 ***** *** * ****** (25.7 bits) 0.9 ***** ***** ****** 0.7 ****** ****** ****** 0.4 ****** ************* 0.2 ****** ************* 0.0 -------------------- Multilevel TGGTCCAGGGTCGGGGGAGT consensus CTTCGGCT AATATA A G sequence TG CA T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 44168 61 1.00e-09 CGGCCCCTCA 
TGGTGGTGGGACGAGGGAGG ATTGTGGGAC 17166 428 2.90e-09 CGGTAGAAGA TTGTCTCGGGTCCTGGGAGT TACCAATCCA 47471 307 3.16e-09 TTCTATTTTG TGGTCCATGGTCAGGGAAGT CAGTTGGAAG 1764 55 5.67e-09 CCATTTCCAC CGGTCGGGGATTGTGGGAGT ATTCCACTGT 11527 158 6.72e-09 TGTTCGCCGT TGTCCCAGGGACGGAGGAGG CGAAGGCTTC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 44168 1e-09 60_[+2]_420 17166 2.9e-09 427_[+2]_53 47471 3.2e-09 306_[+2]_174 1764 5.7e-09 54_[+2]_426 11527 6.7e-09 157_[+2]_323 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=20 seqs=5 44168 ( 61) TGGTGGTGGGACGAGGGAGG 1 17166 ( 428) TTGTCTCGGGTCCTGGGAGT 1 47471 ( 307) TGGTCCATGGTCAGGGAAGT 1 1764 ( 55) CGGTCGGGGATTGTGGGAGT 1 11527 ( 158) TGTCCCAGGGACGGAGGAGG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 5291 bayes= 10.2978 E= 2.6e+002 -897 -23 -897 155 -897 -897 185 -44 -897 -897 185 -44 -897 -23 -897 155 -897 176 -14 -897 -897 77 86 -44 56 -23 -14 -44 -897 -897 185 -44 -897 -897 218 -897 -44 -897 185 -897 56 -897 -897 114 -897 176 -897 -44 -44 -23 144 -897 -44 -897 86 55 -44 -897 185 -897 -897 -897 218 -897 -44 -897 185 -897 188 -897 -897 -897 -897 -897 218 -897 -897 -897 86 114 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 2.6e+002
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.400000  0.400000  0.200000
 0.400000  0.200000  0.200000  0.200000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.200000  0.600000  0.000000
 0.200000  0.000000  0.400000  0.400000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.400000  0.600000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TC][GT][GT][TC][CG][CGT][ACGT][GT]G[GA][TA][CT][GAC][GTA][GA]G[GA]AG[TG]
--------------------------------------------------------------------------------

Time 2.27 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 11 llr = 123 E-value = 6.7e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 812121:22::52611 pos.-specific C 1416318::4:18::2 probability G 15::1::8::a3:427 matrix T :173582:86:1::7: bits 2.2 * 2.0 * 1.7 * 1.5 * * Relative 1.3 ** * * Entropy 1.1 * ****** ** * (16.1 bits) 0.9 * ** ****** **** 0.7 * ** ****** **** 0.4 **** *********** 0.2 **************** 0.0 ---------------- Multilevel AGTCTTCGTTGACATG consensus C TC C G G sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 1764 413 1.05e-08 CTGGCATGAC ACTCCTCGTCGGCATG TTGGCGGGAG 33277 262 2.43e-07 AGCCTAAGAA AGTTTTCGTCGGCGGG AACATAGATG 17166 34 6.03e-07 GATTCCGTAG AATCTTCGACGACGTG AATTCTATCA 44906 138 1.01e-06 CTATGATTCC ATTCATCGTTGTCATG TCGCTTGACC 44275 246 3.32e-06 TAAATGTAAG ACTTTTCGTTGAAAGC ATTTACAGTT 11527 403 3.94e-06 GAAAGTAGCA CGTCTTCGTTGCCAAG TCTGGATGAA 44168 175 4.65e-06 GAGTTGGGAA AGCCGTCGTTGACGTA CTTTCCGGGT 42826 157 4.65e-06 CTGGGAGACA GGTCCCTGTTGACATG ATCTTGCTGA 11743 195 5.46e-06 GTAGTTTTCC ACAACTTGTTGACGTG CCAAAATTCT 47471 195 8.03e-06 TGCGACGAGG AGACTTCATTGGAATC GCAAGGGTTC 37595 48 2.55e-05 GGCGTCGGCG ACTTAACAACGACATG AAAAGCCAGG -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1764                              1.1e-08  412_[+3]_72
33277                             2.4e-07  261_[+3]_223
17166                               6e-07  33_[+3]_451
44906                               1e-06  137_[+3]_347
44275                             3.3e-06  245_[+3]_239
11527                             3.9e-06  402_[+3]_82
44168                             4.6e-06  174_[+3]_310
42826                             4.6e-06  156_[+3]_328
11743                             5.5e-06  194_[+3]_290
47471                               8e-06  194_[+3]_290
37595                             2.5e-05  47_[+3]_437
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=11
1764                     (  413) ACTCCTCGTCGGCATG  1
33277                    (  262) AGTTTTCGTCGGCGGG  1
17166                    (   34) AATCTTCGACGACGTG  1
44906                    (  138) ATTCATCGTTGTCATG  1
44275                    (  246) ACTTTTCGTTGAAAGC  1
11527                    (  403) CGTCTTCGTTGCCAAG  1
44168                    (  175) AGCCGTCGTTGACGTA  1
42826                    (  157) GGTCCCTGTTGACATG  1
11743                    (  195) ACAACTTGTTGACGTG  1
47471                    (  195) AGACTTCATTGGAATC  1
37595                    (   48) ACTTAACAACGACATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5335 bayes= 8.91886 E= 6.7e+002
   159  -137  -128 -1010
  -158    63   104  -158
   -58  -137 -1010   142
  -158   144 -1010     0
   -58    21  -128    74
  -158  -137 -1010   159
 -1010   180 -1010   -58
   -58 -1010   189 -1010
   -58 -1010 -1010   159
 -1010    63 -1010   122
 -1010 -1010   218 -1010
   101  -137    30  -158
   -58   180 -1010 -1010
   123 -1010    72 -1010
  -158 -1010   -28   142
  -158   -37   172 -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 6.7e+002
 0.818182  0.090909  0.090909  0.000000
 0.090909  0.363636  0.454545  0.090909
 0.181818  0.090909  0.000000  0.727273
 0.090909  0.636364  0.000000  0.272727
 0.181818  0.272727  0.090909  0.454545
 0.090909  0.090909  0.000000  0.818182
 0.000000  0.818182  0.000000  0.181818
 0.181818  0.000000  0.818182  0.000000
 0.181818  0.000000  0.000000  0.818182
 0.000000  0.363636  0.000000  0.636364
 0.000000  0.000000  1.000000  0.000000
 0.545455  0.090909  0.272727  0.090909
 0.181818  0.818182  0.000000  0.000000
 0.636364  0.000000  0.363636  0.000000
 0.090909  0.000000  0.181818  0.727273
 0.090909  0.181818  0.727273  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
A[GC]T[CT][TC]TCGT[TC]G[AG]C[AG]TG
--------------------------------------------------------------------------------

Time 3.33 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17166                            8.50e-10  33_[+3(6.03e-07)]_138_\
    [+1(1.11e-05)]_225_[+2(2.90e-09)]_53
42826                            8.16e-05  156_[+3(4.65e-06)]_251_\
    [+1(6.70e-07)]_62
37595                            3.40e-04  47_[+3(2.55e-05)]_8_[+1(1.39e-06)]_\
    414
47471                            1.01e-08  194_[+3(8.03e-06)]_60_\
    [+1(1.11e-05)]_21_[+2(3.16e-09)]_174
1764                             2.93e-12  54_[+2(5.67e-09)]_239_\
    [+1(7.54e-07)]_84_[+3(1.05e-08)]_72
33277                            4.11e-05  36_[+1(9.78e-06)]_210_\
    [+3(2.43e-07)]_223
44168                            8.70e-10  60_[+2(1.00e-09)]_94_[+3(4.65e-06)]_\
    282_[+1(4.26e-06)]_13
44275                            8.71e-06  245_[+3(3.32e-06)]_12_\
    [+1(7.23e-08)]_212
44906                            1.90e-06  34_[+1(7.23e-08)]_88_[+3(1.01e-06)]_\
    195_[+1(1.47e-05)]_137
11527                            5.79e-07  157_[+2(6.72e-09)]_225_\
    [+3(3.94e-06)]_82
11743                            2.89e-04  158_[+1(4.26e-06)]_21_\
    [+3(5.46e-06)]_290
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************