********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/40/40.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47193                    1.0000    500  21660                    1.0000    500
47697                    1.0000    500  48388                    1.0000    500
40606                    1.0000    500  49814                    1.0000    500
40880                    1.0000    500  44673                    1.0000    500
45766                    1.0000    500  48203                    1.0000    500
37524                    1.0000    500  43073                    1.0000    500
47231                    1.0000    500  46758                    1.0000    500
49739                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/40/40.seqs.fa -oc motifs/40 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.267 C 0.237 G 0.223 T 0.272
Background letter frequencies (from dataset with add-one prior applied):
A 0.267 C 0.237 G 0.223 T 0.272
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 11 llr = 128 E-value = 8.7e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::a:9:1:95:
pos.-specific     C  3:7:9:::8::3
probability       G  2:::1:a1:15:
matrix            T  5a3::1:82::7

         bits    2.2       *
                 1.9  * *  *
                 1.7  * ** *
                 1.5  * ****  *
Relative         1.3  * **** **
Entropy          1.1  ***********
(16.8 bits)      0.9  ***********
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTCACAGTCAGT
consensus            C T       AC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
43073                       340  6.90e-08 TTTTCCCATT TTCACAGTCAGT ATAACATACC
48388                       225  6.90e-08 GGGCTTTCGG TTCACAGTCAGT CGTGATGACC
47697                       446  6.90e-08 TTGATTGTGC TTCACAGTCAGT AGATGAGCTC
47231                        73  4.00e-07 GATTTTTTGT CTCACAGTCAAT CAGTTCAGGA
44673                       163  6.19e-07 GCCAATCACA GTCACAGTCAAT CAGATACGAA
40880                       443  3.52e-06 TTACTGTTGA TTCACAGACAGC TTGCACTCGT
45766                       407  4.81e-06 AGATATGGGG CTCACTGTCAAT CAAATTAGGA
48203                       269  5.33e-06 CTTCTAGTTA CTTACAGTTAGT GAATGAAGCG
40606                       125  6.77e-06 TGTTGTGGTA GTTACAGTTAGT GCATGCTCAT
46758                       368  1.00e-05 TTTACTAGAA TTTACAGGCAAC GCTCTTATTT
37524                       465  1.95e-05 GTTCTCTTCA TTCAGAGTCGAC CTTCCTGTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43073                             6.9e-08  339_[+1]_149
48388                             6.9e-08  224_[+1]_264
47697                             6.9e-08  445_[+1]_43
47231                               4e-07  72_[+1]_416
44673                             6.2e-07  162_[+1]_326
40880                             3.5e-06  442_[+1]_46
45766                             4.8e-06  406_[+1]_82
48203                             5.3e-06  268_[+1]_220
40606                             6.8e-06  124_[+1]_364
46758                               1e-05  367_[+1]_121
37524                               2e-05  464_[+1]_24
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=11
43073                    (  340) TTCACAGTCAGT  1
48388                    (  225) TTCACAGTCAGT  1
47697                    (  446) TTCACAGTCAGT  1
47231                    (   73) CTCACAGTCAAT  1
44673                    (  163) GTCACAGTCAAT  1
40880                    (  443) TTCACAGACAGC  1
45766                    (  407) CTCACTGTCAAT  1
48203                    (  269) CTTACAGTTAGT  1
40606                    (  125) GTTACAGTTAGT  1
46758                    (  368) TTTACAGGCAAC  1
37524                    (  465) TTCAGAGTCGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 10.4066 E= 8.7e-004
 -1010     20    -29    100
 -1010  -1010  -1010    188
 -1010    161  -1010      0
   190  -1010  -1010  -1010
 -1010    194   -129  -1010
   177  -1010  -1010   -158
 -1010  -1010    216  -1010
  -155  -1010   -129    159
 -1010    178  -1010    -58
   177  -1010   -129  -1010
    77  -1010    129  -1010
 -1010     20  -1010    142
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 8.7e-004
 0.000000  0.272727  0.181818  0.545455
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.727273  0.000000  0.272727
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.909091  0.090909  0.000000
 0.909091  0.000000  0.000000  0.090909
 0.000000  0.000000  1.000000  0.000000
 0.090909  0.000000  0.090909  0.818182
 0.000000  0.818182  0.000000  0.181818
 0.909091  0.000000  0.090909  0.000000
 0.454545  0.000000  0.545455  0.000000
 0.000000  0.272727  0.000000  0.727273
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TC]T[CT]ACAGTCA[GA][TC]
--------------------------------------------------------------------------------

Time 2.18 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 15 sites = 8 llr = 105 E-value = 1.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::8:11:::a18::
pos.-specific     C  1953:3511::4:1:
probability       G  :15::63::a:1:6:
matrix            T  9:::a:199::433a

         bits    2.2          *
                 1.9     *    **   *
                 1.7     *    **   *
                 1.5  *  *    **   *
Relative         1.3 **  *  ****   *
Entropy          1.1 *****  **** * *
(18.8 bits)      0.9 ****** **** ***
                 0.6 ****** **** ***
                 0.4 ****** **** ***
                 0.2 ***************
                 0.0 ---------------

Multilevel           TCCATGCTTGACAGT
consensus              GC CG    TTT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
48388                        83  6.95e-09 TTAAGCCAGT TCGATGGTTGATAGT TCGTTGGTCG
40880                       349  2.46e-08 CTTACTACGG TCCATGCTTGACATT GACTCTATAG
45766                       444  7.44e-08 AACCGTCACA TCCATACTTGACAGT ATACAAATAC
47697                       293  6.65e-07 AGAGCGATTT CCCATGCCTGACAGT ACGTGGCTGT
47193                        70  7.99e-07 ATCCGCAGTT TGGATCCTTGAGAGT GACACTGCGT
37524                         5  1.63e-06       CCGG TCGCTGGTCGAAAGT TGTTTGATCC
48203                       405  1.75e-06 GGTGTGCGTC TCGCTGTTTGATTTT TGACAAATCC
49739                        51  2.58e-06 GGGTGATCAG TCCATCATTGATTCT TATCGGATCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48388                               7e-09  82_[+2]_403
40880                             2.5e-08  348_[+2]_137
45766                             7.4e-08  443_[+2]_42
47697                             6.7e-07  292_[+2]_193
47193                               8e-07  69_[+2]_416
37524                             1.6e-06  4_[+2]_481
48203                             1.8e-06  404_[+2]_81
49739                             2.6e-06  50_[+2]_435
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=8
48388                    (   83) TCGATGGTTGATAGT  1
40880                    (  349) TCCATGCTTGACATT  1
45766                    (  444) TCCATACTTGACAGT  1
47697                    (  293) CCCATGCCTGACAGT  1
47193                    (   70) TGGATCCTTGAGAGT  1
37524                    (    5) TCGCTGGTCGAAAGT  1
48203                    (  405) TCGCTGTTTGATTTT  1
49739                    (   51) TCCATCATTGATTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7290 bayes= 9.83012 E= 1.1e+002
  -965    -92   -965    168
  -965    188    -83   -965
  -965    107    116   -965
   149      7   -965   -965
  -965   -965   -965    187
  -109      7    149   -965
  -109    107     16   -112
  -965    -92   -965    168
  -965    -92   -965    168
  -965   -965    216   -965
   190   -965   -965   -965
  -109     66    -83     46
   149   -965   -965    -12
  -965    -92    149    -12
  -965   -965   -965    187
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 8 E= 1.1e+002
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.250000  0.625000  0.000000
 0.125000  0.500000  0.250000  0.125000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.375000  0.125000  0.375000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.125000  0.625000  0.250000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TC[CG][AC]T[GC][CG]TTGA[CT][AT][GT]T
--------------------------------------------------------------------------------

Time 4.16 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 8 llr = 91 E-value = 6.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  33:::3168a:a
pos.-specific     C  61:a::53::::
probability       G  :4::a:413:a:
matrix            T  13a::8::::::

         bits    2.2    **     *
                 1.9   ***    ***
                 1.7   ***    ***
                 1.5   ***    ***
Relative         1.3   ***    ***
Entropy          1.1   ****  ****
(16.5 bits)      0.9   ****  ****
                 0.6 * **********
                 0.4 * **********
                 0.2 ************
                 0.0 ------------

Multilevel           CGTCGTCAAAGA
consensus            AA   AGCG
sequence              T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
48203                        41  2.42e-07 AAGGCGGCCC CATCGTCAAAGA TCGATTTCTG
48388                       476  2.42e-07 TCCAACAATT CTTCGTCAAAGA CGTGGTGTGG
47697                       176  1.06e-06 TCAAACCATG CTTCGTCCAAGA TTTTCGAAAA
47193                       388  5.00e-06 TCCGATTCTT CGTCGTGGGAGA AAATCATCAA
37524                       416  5.83e-06 TCGAATTCTA CCTCGTAAAAGA AGATAAATCT
21660                       154  5.83e-06 GTCCGACGCC AGTCGTCCGAGA CGTTAAATGT
40606                       335  6.58e-06 AGAATCGTAT AATCGAGAAAGA GTGATAGCAA
49739                       462  7.49e-06 GCACCCGAGG TGTCGAGAAAGA GCCTCTACGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48203                             2.4e-07  40_[+3]_448
48388                             2.4e-07  475_[+3]_13
47697                             1.1e-06  175_[+3]_313
47193                               5e-06  387_[+3]_101
37524                             5.8e-06  415_[+3]_73
21660                             5.8e-06  153_[+3]_335
40606                             6.6e-06  334_[+3]_154
49739                             7.5e-06  461_[+3]_27
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
48203                    (   41) CATCGTCAAAGA  1
48388                    (  476) CTTCGTCAAAGA  1
47697                    (  176) CTTCGTCCAAGA  1
47193                    (  388) CGTCGTGGGAGA  1
37524                    (  416) CCTCGTAAAAGA  1
21660                    (  154) AGTCGTCCGAGA  1
40606                    (  335) AATCGAGAAAGA  1
49739                    (  462) TGTCGAGAAAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 10.5766 E= 6.7e+002
   -10    139   -965   -112
   -10    -92     75    -12
  -965   -965   -965    187
  -965    207   -965   -965
  -965   -965    216   -965
   -10   -965   -965    146
  -109    107     75   -965
   123      7    -83   -965
   149   -965     16   -965
   190   -965   -965   -965
  -965   -965    216   -965
   190   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 6.7e+002
 0.250000  0.625000  0.000000  0.125000
 0.250000  0.125000  0.375000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.125000  0.500000  0.375000  0.000000
 0.625000  0.250000  0.125000  0.000000
 0.750000  0.000000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CA][GAT]TCG[TA][CG][AC][AG]AGA
--------------------------------------------------------------------------------

Time 6.15 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47193                            6.96e-05  69_[+2(7.99e-07)]_303_\
    [+3(5.00e-06)]_101
21660                            4.96e-02  153_[+3(5.83e-06)]_335
47697                            2.05e-09  175_[+3(1.06e-06)]_105_\
    [+2(6.65e-07)]_138_[+1(6.90e-08)]_43
48388                            7.32e-12  82_[+2(6.95e-09)]_127_\
    [+1(6.90e-08)]_239_[+3(2.42e-07)]_13
40606                            5.85e-04  124_[+1(6.77e-06)]_198_\
    [+3(6.58e-06)]_154
49814                            2.04e-01  472_[+1(6.07e-05)]_16
40880                            2.86e-06  271_[+2(3.26e-05)]_62_\
    [+2(2.46e-08)]_79_[+1(3.52e-06)]_46
44673                            3.43e-03  162_[+1(6.19e-07)]_326
45766                            7.51e-06  315_[+1(2.53e-05)]_79_\
    [+1(4.81e-06)]_25_[+2(7.44e-08)]_42
48203                            6.99e-08  40_[+3(2.42e-07)]_216_\
    [+1(5.33e-06)]_124_[+2(1.75e-06)]_81
37524                            3.73e-06  4_[+2(1.63e-06)]_396_[+3(5.83e-06)]_\
    37_[+1(1.95e-05)]_24
43073                            3.01e-04  339_[+1(6.90e-08)]_149
47231                            6.62e-03  72_[+1(4.00e-07)]_416
46758                            5.23e-02  367_[+1(1.00e-05)]_121
49739                            3.83e-04  50_[+2(2.58e-06)]_396_\
    [+3(7.49e-06)]_27
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************