********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/158/158.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
38038                    1.0000    500  48012                    1.0000    500
40962                    1.0000    500  26802                    1.0000    500
54360                    1.0000    500  26896                    1.0000    500
34971                    1.0000    500  45387                    1.0000    500
36992                    1.0000    500  50283                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/158/158.seqs.fa -oc motifs/158 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.278 C 0.232 G 0.199 T 0.291
Background letter frequencies (from dataset with add-one prior applied):
A 0.278 C 0.232 G 0.199 T 0.291
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   6  llr = 124  E-value = 2.3e-006
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::2525:2a22::8:::
pos.-specific     C  :5:32::5:::5:::3a::2a
probability       G  :33:8a7:85a3::87:::5:
matrix            T  a277::2::::::8:::2a3:

         bits    2.3      *    *
                 2.1      *    *     *   *
                 1.9 *    *    * *   * * *
                 1.6 *   **  * * * * * * *
Relative         1.4 *   **  * * * *** * *
Entropy          1.2 *   **  *** ******* *
(29.8 bits)      0.9 * ********* ******* *
                 0.7 *********************
                 0.5 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TCTTGGGAGAGCATGGCATGC
consensus             GGC   C G G   C   T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
45387                       311  6.41e-13 CTCTGAAATG TCTTGGGAGAGCATGGCATGC TCTATTTCTT
40962                       311  6.41e-13 CTCTGAAATG TCTTGGGAGAGCATGGCATGC TCTATTTCTT
38038                       311  6.41e-13 CTCTGAAATG TCTTGGGAGAGCATGGCATGC TCTATTTCTT
54360                       417  1.84e-09 AATTTAAACA TGTTCGGCAGGGAAGGCATTC CCGTCCGCAA
48012                       442  1.93e-09 GGTCGTGCCG TGGCGGTCGGGGATACCATTC CGGTCGTGCA
26802                       173  5.77e-09 TCATCAGATG TTGCGGACGGGAATGCCTTCC CTGCCGAACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45387                             6.4e-13  310_[+1]_169
40962                             6.4e-13  310_[+1]_169
38038                             6.4e-13  310_[+1]_169
54360                             1.8e-09  416_[+1]_63
48012                             1.9e-09  441_[+1]_38
26802                             5.8e-09  172_[+1]_307
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
45387                    (  311) TCTTGGGAGAGCATGGCATGC  1
40962                    (  311) TCTTGGGAGAGCATGGCATGC  1
38038                    (  311) TCTTGGGAGAGCATGGCATGC  1
54360                    (  417) TGTTCGGCAGGGAAGGCATTC  1
48012                    (  442) TGGCGGTCGGGGATACCATTC  1
26802                    (  173) TTGCGGACGGGAATGCCTTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 10.09 E= 2.3e-006
  -923   -923   -923    178
  -923    110     74    -80
  -923   -923     74    120
  -923     52   -923    120
  -923    -48    206   -923
  -923   -923    233   -923
   -74   -923    174    -80
    85    110   -923   -923
   -74   -923    206   -923
    85   -923    133   -923
  -923   -923    233   -923
   -74    110     74   -923
   185   -923   -923   -923
   -74   -923   -923    152
   -74   -923    206   -923
  -923     52    174   -923
  -923    210   -923   -923
   158   -923   -923    -80
  -923   -923   -923    178
  -923    -48    133     20
  -923    210   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.3e-006
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.333333  0.166667
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.000000  0.666667  0.166667
 0.500000  0.500000  0.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.500000  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.500000  0.333333
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
T[CG][TG][TC]GGG[AC]G[AG]G[CG]ATG[GC]CAT[GT]C
--------------------------------------------------------------------------------


Time  0.97 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   6  llr = 120  E-value = 1.7e-006
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :8:::5::::::a:::87::
pos.-specific     C  22:88:2:785::::72:::
probability       G  8:a::2:53:5a:823:2:a
matrix            T  :::22385:2:::28::2a:

         bits    2.3   *        *       *
                 2.1   *        *       *
                 1.9   *        **     **
                 1.6 * *        ***    **
Relative         1.4 * ***    * ***    **
Entropy          1.2 ***** *********** **
(28.8 bits)      0.9 ***** *********** **
                 0.7 ***** **************
                 0.5 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GAGCCATGCCCGAGTCAATG
consensus                 T TG G    G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
45387                       415  2.17e-12 GACCCGAACT GAGCCATTCCCGAGTCAATG TAGTGTCCAT
40962                       415  2.17e-12 GACCCGAACT GAGCCATTCCCGAGTCAATG TAGTGTCCAT
38038                       415  2.17e-12 GACCCGAACT GAGCCATTCCCGAGTCAATG TAGTGTCCAT
54360                        50  1.21e-09 AATTGAAGCC CAGCCTTGGCGGAGGCAGTG GTGGATTTCT
26802                        83  7.87e-09 GCCTTCTTCG GCGCCTTGGCGGATTGCTTG GAGGGTGTCT
48012                       244  9.16e-09 TGCTCTGTAC GAGTTGCGCTGGAGTGAATG ATCGACAGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45387                             2.2e-12  414_[+2]_66
40962                             2.2e-12  414_[+2]_66
38038                             2.2e-12  414_[+2]_66
54360                             1.2e-09  49_[+2]_431
26802                             7.9e-09  82_[+2]_398
48012                             9.2e-09  243_[+2]_237
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=6
45387                    (  415) GAGCCATTCCCGAGTCAATG  1
40962                    (  415) GAGCCATTCCCGAGTCAATG  1
38038                    (  415) GAGCCATTCCCGAGTCAATG  1
54360                    (   50) CAGCCTTGGCGGAGGCAGTG  1
26802                    (   83) GCGCCTTGGCGGATTGCTTG  1
48012                    (  244) GAGTTGCGCTGGAGTGAATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4810 bayes= 10.093 E= 1.7e-006
  -923    -48    206   -923
   158    -48   -923   -923
  -923   -923    233   -923
  -923    184   -923    -80
  -923    184   -923    -80
    85   -923    -26     20
  -923    -48   -923    152
  -923   -923    133     78
  -923    152     74   -923
  -923    184   -923    -80
  -923    110    133   -923
  -923   -923    233   -923
   185   -923   -923   -923
  -923   -923    206    -80
  -923   -923    -26    152
  -923    152     74   -923
   158    -48   -923   -923
   126   -923    -26    -80
  -923   -923   -923    178
  -923   -923    233   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 1.7e-006
 0.000000  0.166667  0.833333  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.833333  0.000000  0.166667
 0.500000  0.000000  0.166667  0.333333
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.666667  0.333333  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.666667  0.000000  0.166667  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
GAGCC[AT]T[GT][CG]C[CG]GAGT[CG]AATG
--------------------------------------------------------------------------------


Time  1.79 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   7  llr = 131  E-value = 4.8e-006
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:a:::9a141:14739:9
pos.-specific     C  :9:6:::a::4634:::3:9:
probability       G  :::::a::1:::::4::::1:
matrix            T  a1:4::a:::4:6646341:1

         bits    2.3      *
                 2.1      * *
                 1.9 * * **** *
                 1.6 * * **** *         *
Relative         1.4 *** ******         *
Entropy          1.2 *** ******        ***
(27.0 bits)      0.9 ********** * * ** ***
                 0.7 ********** * * ** ***
                 0.5 ***************** ***
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TCACAGTCAACCTTGTATACA
consensus               T      TACCTATA
sequence                              C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
45387                       453  4.60e-11 ATTGCATCCC TCATAGTCAACCTTTTATACA CAACTTACAC
40962                       453  4.60e-11 ATTGCATCCC TCATAGTCAACCTTTTATACA CAACTTACAC
38038                       453  4.60e-11 ATTGCATCCC TCATAGTCAACCTTTTATACA CAACTTACAC
26896                       467  2.27e-10 ATCGTCTTAC TCACAGTCAATACCGAAAACA AACAACAGCC
34971                       467  2.68e-09 ACAAAAGTCT TTACAGTCAATACCGAAAACA AACAACAGCC
48012                       307  1.49e-08 CTTTTACACT TCACAGTCAATCACAATCAGA GGTAGTGTCG
26802                       384  3.19e-08 CACAATCGGG TCACAGTCGAAATTGTTCTCT GCGCCGGATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45387                             4.6e-11  452_[+3]_27
40962                             4.6e-11  452_[+3]_27
38038                             4.6e-11  452_[+3]_27
26896                             2.3e-10  466_[+3]_13
34971                             2.7e-09  466_[+3]_13
48012                             1.5e-08  306_[+3]_173
26802                             3.2e-08  383_[+3]_96
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=7
45387                    (  453) TCATAGTCAACCTTTTATACA  1
40962                    (  453) TCATAGTCAACCTTTTATACA  1
38038                    (  453) TCATAGTCAACCTTTTATACA  1
26896                    (  467) TCACAGTCAATACCGAAAACA  1
34971                    (  467) TTACAGTCAATACCGAAAACA  1
48012                    (  307) TCACAGTCAATCACAATCAGA  1
26802                    (  384) TCACAGTCGAAATTGTTCTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 9.263 E= 4.8e-006
  -945   -945   -945    178
  -945    188   -945   -102
   185   -945   -945   -945
  -945    130   -945     56
   185   -945   -945   -945
  -945   -945    233   -945
  -945   -945   -945    178
  -945    210   -945   -945
   162   -945    -48   -945
   185   -945   -945   -945
   -96     88   -945     56
    62    130   -945   -945
   -96     30   -945     97
  -945     88   -945     97
   -96   -945    110     56
    62   -945   -945     97
   136   -945   -945     -2
     4     30   -945     56
   162   -945   -945   -102
  -945    188    -48   -945
   162   -945   -945   -102
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 4.8e-006
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.857143  0.000000  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.571429  0.000000  0.428571
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.857143  0.000000  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.428571  0.000000  0.428571
 0.428571  0.571429  0.000000  0.000000
 0.142857  0.285714  0.000000  0.571429
 0.000000  0.428571  0.000000  0.571429
 0.142857  0.000000  0.428571  0.428571
 0.428571  0.000000  0.000000  0.571429
 0.714286  0.000000  0.000000  0.285714
 0.285714  0.285714  0.000000  0.428571
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.857143  0.142857  0.000000
 0.857143  0.000000  0.000000  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
TCA[CT]AGTCAA[CT][CA][TC][TC][GT][TA][AT][TAC]ACA
--------------------------------------------------------------------------------


Time  2.60 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38038                            1.33e-23  310_[+1(6.41e-13)]_83_\
    [+2(2.17e-12)]_18_[+3(4.60e-11)]_27
48012                            2.22e-14  243_[+2(9.16e-09)]_43_\
    [+3(1.49e-08)]_114_[+1(1.93e-09)]_38
40962                            1.33e-23  310_[+1(6.41e-13)]_83_\
    [+2(2.17e-12)]_18_[+3(4.60e-11)]_27
26802                            1.12e-13  82_[+2(7.87e-09)]_70_[+1(5.77e-09)]_\
    190_[+3(3.19e-08)]_96
54360                            1.60e-10  49_[+2(1.21e-09)]_347_\
    [+1(1.84e-09)]_63
26896                            6.14e-06  466_[+3(2.27e-10)]_13
34971                            8.04e-05  466_[+3(2.68e-09)]_13
45387                            1.33e-23  310_[+1(6.41e-13)]_83_\
    [+2(2.17e-12)]_18_[+3(4.60e-11)]_27
36992                            9.00e-02  500
50283                            1.05e-01  168_[+2(4.82e-05)]_312
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************