********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/155/155.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
24084                    1.0000    500  24855                    1.0000    500
261602                   1.0000    500  264001                   1.0000    500
31298                    1.0000    500  8003                     1.0000    500
8420                     1.0000    500  8459                     1.0000    500
8582                     1.0000    500  914                      1.0000    500
9910                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/155/155.seqs.fa -oc motifs/155 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.248 C 0.223 G 0.245 T 0.284
Background letter frequencies (from dataset with add-one prior applied):
A 0.248 C 0.223 G 0.245 T 0.284
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   4  llr = 86  E-value = 3.2e-001
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :85:::3::::::5:::::5: pos.-specific C a:3a3888a8a855a8:aa38 probability G :33:5::::3:3:::3:::3: matrix T ::::33:3::::5:::a:::3 bits 2.2 * * * * * ** 1.9 * * * * * ** 1.7 * * * * * *** 1.5 * * * * * *** Relative 1.3 ** * ******* ***** * Entropy 1.1 ** * ************** * (31.0 bits) 0.9 ** * ************** * 0.6 **** **************** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel CAACGCCCCCCCCACCTCCAC consensus GC CTAT G GTC G CT sequence G T G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 31298 243 4.11e-11 ATCTCGTTAT CAACCTCCCCCCTACCTCCCC CCCACGCCCA 264001 94 4.95e-11 GTGACATTTA CGGCGCCCCCCCCCCCTCCAT ACGCTCGAAC 24855 469 9.06e-11 CTGCTCTCTG CAACTCCTCGCCCACCTCCAC CACTATCAAT 261602 433 3.79e-10 TCTCACCAAA CACCGCACCCCGTCCGTCCGC CCTTTGCTCT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 31298 4.1e-11 242_[+1]_237 264001 5e-11 93_[+1]_386 24855 9.1e-11 468_[+1]_11 261602 3.8e-10 432_[+1]_47 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format 
-------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=4 31298 ( 243) CAACCTCCCCCCTACCTCCCC 1 264001 ( 94) CGGCGCCCCCCCCCCCTCCAT 1 24855 ( 469) CAACTCCTCGCCCACCTCCAC 1 261602 ( 433) CACCGCACCCCGTCCGTCCGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.3652 E= 3.2e-001 -865 216 -865 -865 159 -865 3 -865 101 16 3 -865 -865 216 -865 -865 -865 16 103 -18 -865 175 -865 -18 1 175 -865 -865 -865 175 -865 -18 -865 216 -865 -865 -865 175 3 -865 -865 216 -865 -865 -865 175 3 -865 -865 116 -865 81 101 116 -865 -865 -865 216 -865 -865 -865 175 3 -865 -865 -865 -865 181 -865 216 -865 -865 -865 216 -865 -865 101 16 3 -865 -865 175 -865 -18 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 3.2e-001 0.000000 1.000000 0.000000 0.000000 0.750000 0.000000 0.250000 0.000000 0.500000 0.250000 0.250000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.250000 0.500000 0.250000 0.000000 0.750000 0.000000 0.250000 0.250000 0.750000 0.000000 0.000000 0.000000 0.750000 0.000000 0.250000 0.000000 1.000000 0.000000 0.000000 0.000000 0.750000 0.250000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.750000 0.250000 0.000000 0.000000 0.500000 0.000000 0.500000 0.500000 0.500000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.750000 0.250000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 
0.000000 0.000000 0.500000 0.250000 0.250000 0.000000 0.000000 0.750000 0.000000 0.250000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- C[AG][ACG]C[GCT][CT][CA][CT]C[CG]C[CG][CT][AC]C[CG]TCC[ACG][CT] -------------------------------------------------------------------------------- Time 1.48 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 20 sites = 8 llr = 120 E-value = 5.4e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :56:9::543:3::1::3:: pos.-specific C ::::1:5:351::41113:: probability G a51a:8:41193163:95:9 matrix T ::3::35131:59:59::a1 bits 2.2 1.9 * * 1.7 * * * 1.5 * ** * * ** Relative 1.3 * ** * * ** ** Entropy 1.1 ** **** * ** ** ** (21.6 bits) 0.9 ** **** * ** ** ** 0.6 ******** * ** ***** 0.4 ******** **** ***** 0.2 ******** *********** 0.0 -------------------- Multilevel GAAGAGCAACGTTGTTGGTG consensus GT TTGCA A CG A sequence T G C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 8459 160 1.62e-10 ATGTCCGCTA GGAGAGTACCGTTGGTGGTG CAGTTGGAGA 9910 128 7.40e-10 TCTACTTTGA GAAGAGCATCGATGTTGATG GGGGACTTGA 914 181 2.60e-09 AGAAGCCCAT GGAGAGTGAGGTTCTTGGTG 
TACGTGTACG 24855 43 3.14e-08 TTATGAGAGT GAAGATCGCCGGTGTTCGTG TCGTGTGGTC 8582 251 1.71e-07 ATGGAGTGGT GGAGAGTGGACATCTTGCTG CAAGGCTGTT 261602 100 3.34e-07 GATGCTCTGC GATGCGCATCGGTCGCGGTG GAGTCCATCA 31298 33 6.50e-07 TCAGGTAAAG GAGGATTAAAGTGGATGCTG TCGAGAGTCA 264001 323 8.28e-07 AGATGCAACG GGTGAGCTATGTTGCTGATT CTACTGGCTA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 8459 1.6e-10 159_[+2]_321 9910 7.4e-10 127_[+2]_353 914 2.6e-09 180_[+2]_300 24855 3.1e-08 42_[+2]_438 8582 1.7e-07 250_[+2]_230 261602 3.3e-07 99_[+2]_381 31298 6.5e-07 32_[+2]_448 264001 8.3e-07 322_[+2]_158 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=20 seqs=8 8459 ( 160) GGAGAGTACCGTTGGTGGTG 1 9910 ( 128) GAAGAGCATCGATGTTGATG 1 914 ( 181) GGAGAGTGAGGTTCTTGGTG 1 24855 ( 43) GAAGATCGCCGGTGTTCGTG 1 8582 ( 251) GGAGAGTGGACATCTTGCTG 1 261602 ( 100) GATGCGCATCGGTCGCGGTG 1 31298 ( 33) GAGGATTAAAGTGGATGCTG 1 264001 ( 323) GGTGAGCTATGTTGCTGATT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 5291 bayes= 9.36714 E= 5.4e+000 -965 -965 203 -965 101 -965 103 -965 133 -965 -97 -18 -965 -965 203 -965 182 -83 -965 -965 -965 -965 161 -18 -965 116 -965 81 101 -965 61 -118 60 17 -97 -18 1 116 -97 -118 -965 
-83 184 -965 1 -965 3 81 -965 -965 -97 162 -965 75 135 -965 -99 -83 3 81 -965 -83 -965 162 -965 -83 184 -965 1 17 103 -965 -965 -965 -965 181 -965 -965 184 -118 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 5.4e+000 0.000000 0.000000 1.000000 0.000000 0.500000 0.000000 0.500000 0.000000 0.625000 0.000000 0.125000 0.250000 0.000000 0.000000 1.000000 0.000000 0.875000 0.125000 0.000000 0.000000 0.000000 0.000000 0.750000 0.250000 0.000000 0.500000 0.000000 0.500000 0.500000 0.000000 0.375000 0.125000 0.375000 0.250000 0.125000 0.250000 0.250000 0.500000 0.125000 0.125000 0.000000 0.125000 0.875000 0.000000 0.250000 0.000000 0.250000 0.500000 0.000000 0.000000 0.125000 0.875000 0.000000 0.375000 0.625000 0.000000 0.125000 0.125000 0.250000 0.500000 0.000000 0.125000 0.000000 0.875000 0.000000 0.125000 0.875000 0.000000 0.250000 0.250000 0.500000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.875000 0.125000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- G[AG][AT]GA[GT][CT][AG][ACT][CA]G[TAG]T[GC][TG]TG[GAC]TG -------------------------------------------------------------------------------- Time 2.89 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 8 llr = 127 E-value = 2.0e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :6518845:995569554833 pos.-specific C 9:491:6:91:5431446318 probability G :41:1::31:1::::::::6: matrix T 1::::3:3::::11:11:::: bits 2.2 1.9 1.7 1.5 * * *** * Relative 1.3 * * *** * * * Entropy 1.1 ** * ** **** * ** * (22.9 bits) 0.9 ** **** **** * ** * 0.6 ******* ************* 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel CAACAACACAAAAAAAACAGC consensus GC TAG CCC CCACAA sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 8459 459 1.20e-10 CCCAATTTCA CAACAACTCAAACAAAACAAC ACCACTTCAA 24084 234 5.15e-10 CAGCCGACGA CGGCAACACAAAAAACCAAGC CAGCCAGCTG 8003 69 1.40e-09 GGCCGCCACC CACCGACGCAACACACACAGC ACCCGCACAT 8420 472 3.09e-09 ATTTCTAAAT TACCAAAACAACAAAACCAAC TTCTCCCG 24855 433 9.44e-08 TCTGTATTTG CACCCTCACCGACAAACCAGC CGCTCCTGCT 8582 467 1.58e-07 CCGATGCATT CGACAAAACAACTCCATCAGA CTGCACCGCC 261602 475 2.52e-07 CTCAACCCTT CAACATCTCAAACTACAACCA ACACA 31298 326 3.31e-07 CTGCTTCTAT CGAAAAAGGAACAAATAACGC TCATACTGCA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
-------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 8459 1.2e-10 458_[+3]_21 24084 5.1e-10 233_[+3]_246 8003 1.4e-09 68_[+3]_411 8420 3.1e-09 471_[+3]_8 24855 9.4e-08 432_[+3]_47 8582 1.6e-07 466_[+3]_13 261602 2.5e-07 474_[+3]_5 31298 3.3e-07 325_[+3]_154 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=8 8459 ( 459) CAACAACTCAAACAAAACAAC 1 24084 ( 234) CGGCAACACAAAAAACCAAGC 1 8003 ( 69) CACCGACGCAACACACACAGC 1 8420 ( 472) TACCAAAACAACAAAACCAAC 1 24855 ( 433) CACCCTCACCGACAAACCAGC 1 8582 ( 467) CGACAAAACAACTCCATCAGA 1 261602 ( 475) CAACATCTCAAACTACAACCA 1 31298 ( 326) CGAAAAAGGAACAAATAACGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 9.36413 E= 2.0e+000 -965 197 -965 -118 133 -965 61 -965 101 75 -97 -965 -99 197 -965 -965 160 -83 -97 -965 160 -965 -965 -18 60 149 -965 -965 101 -965 3 -18 -965 197 -97 -965 182 -83 -965 -965 182 -965 -97 -965 101 116 -965 -965 101 75 -965 -118 133 17 -965 -118 182 -83 -965 -965 101 75 -965 -118 101 75 -965 -118 60 149 -965 -965 160 17 -965 -965 1 -83 135 -965 1 175 -965 -965 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: 
alength= 4 w= 21 nsites= 8 E= 2.0e+000 0.000000 0.875000 0.000000 0.125000 0.625000 0.000000 0.375000 0.000000 0.500000 0.375000 0.125000 0.000000 0.125000 0.875000 0.000000 0.000000 0.750000 0.125000 0.125000 0.000000 0.750000 0.000000 0.000000 0.250000 0.375000 0.625000 0.000000 0.000000 0.500000 0.000000 0.250000 0.250000 0.000000 0.875000 0.125000 0.000000 0.875000 0.125000 0.000000 0.000000 0.875000 0.000000 0.125000 0.000000 0.500000 0.500000 0.000000 0.000000 0.500000 0.375000 0.000000 0.125000 0.625000 0.250000 0.000000 0.125000 0.875000 0.125000 0.000000 0.000000 0.500000 0.375000 0.000000 0.125000 0.500000 0.375000 0.000000 0.125000 0.375000 0.625000 0.000000 0.000000 0.750000 0.250000 0.000000 0.000000 0.250000 0.125000 0.625000 0.000000 0.250000 0.750000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- C[AG][AC]CA[AT][CA][AGT]CAA[AC][AC][AC]A[AC][AC][CA][AC][GA][CA] -------------------------------------------------------------------------------- Time 4.21 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24084                            6.40e-07  233_[+3(5.15e-10)]_15_\
    [+1(4.67e-05)]_52_[+3(2.53e-05)]_137
24855                            2.27e-14  42_[+2(3.14e-08)]_370_\
    [+3(9.44e-08)]_15_[+1(9.06e-11)]_11
261602                           2.08e-12  99_[+2(3.34e-07)]_313_\
    [+1(3.79e-10)]_21_[+3(2.52e-07)]_5
264001                           2.60e-09  93_[+1(4.95e-11)]_208_\
    [+2(8.28e-07)]_158
31298                            6.19e-13  32_[+2(6.50e-07)]_190_\
    [+1(4.11e-11)]_62_[+3(3.31e-07)]_154
8003                             4.84e-06  68_[+3(1.40e-09)]_411
8420                             1.01e-04  471_[+3(3.09e-09)]_8
8459                             2.03e-13  159_[+2(1.62e-10)]_279_\
    [+3(1.20e-10)]_21
8582                             9.85e-07  250_[+2(1.71e-07)]_196_\
    [+3(1.58e-07)]_13
914                              1.58e-05  180_[+2(2.60e-09)]_300
9910                             2.10e-05  127_[+2(7.40e-10)]_270_\
    [+2(1.60e-05)]_63
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************