********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/276/276.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
38980                    1.0000    500  37227                    1.0000    500
32850                    1.0000    500  41291                    1.0000    500
40630                    1.0000    500  39077                    1.0000    500
46052                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/276/276.seqs.fa -oc motifs/276 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.241 G 0.215 T 0.284
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.241 G 0.215 T 0.284
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   6  llr = 115  E-value = 1.0e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
-------------------------------------------------------------------------------- Simplified A 8:8::::3:53:::38a:22: pos.-specific C :525:352:::::27::a::2 probability G 2::5:733a53:a2:::::88 matrix T :5::a:22::3a:7:2::8:: bits 2.2 * * 2.0 * * ** 1.8 * * ** ** 1.6 * * ** ** ** Relative 1.3 * * ** * ** *** ** Entropy 1.1 * **** ** ** ******* (27.5 bits) 0.9 ****** ** ** ******* 0.7 ******* ** ********** 0.4 ******* ************* 0.2 ********************* 0.0 --------------------- Multilevel ACACTGCAGAATGTCAACTGG consensus T G CGG GG A sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 39077 479 5.69e-12 CGAGTACCTC ACAGTGGGGAGTGTCAACTGG A 38980 479 5.69e-12 CGAGTACCTC ACAGTGGGGAGTGTCAACTGG A 40630 479 4.42e-11 GGTCTACTAG ATACTCCAGGATGTCAACTGG A 41291 479 4.42e-11 GGTCTACTAG ATACTCCAGGATGTCAACTGG A 46052 395 2.31e-08 AAAGCCTGGA ACACTGCTGGTTGCAAACAAC ACCGTCCTCC 32850 100 2.99e-08 CCACGTGGTA GTCGTGTCGATTGGATACTGG GTTTCTTGTC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 39077 5.7e-12 478_[+1]_1 38980 5.7e-12 478_[+1]_1 40630 4.4e-11 478_[+1]_1 41291 4.4e-11 478_[+1]_1 46052 2.3e-08 394_[+1]_85 32850 3e-08 99_[+1]_380 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format 
-------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=6 39077 ( 479) ACAGTGGGGAGTGTCAACTGG 1 38980 ( 479) ACAGTGGGGAGTGTCAACTGG 1 40630 ( 479) ATACTCCAGGATGTCAACTGG 1 41291 ( 479) ATACTCCAGGATGTCAACTGG 1 46052 ( 395) ACACTGCTGGTTGCAAACAAC 1 32850 ( 100) GTCGTGTCGATTGGATACTGG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 9.57485 E= 1.0e-004 168 -923 -36 -923 -923 105 -923 82 168 -53 -923 -923 -923 105 122 -923 -923 -923 -923 182 -923 47 163 -923 -923 105 63 -77 35 -53 63 -77 -923 -923 222 -923 94 -923 122 -923 35 -923 63 23 -923 -923 -923 182 -923 -923 222 -923 -923 -53 -36 123 35 147 -923 -923 168 -923 -923 -77 194 -923 -923 -923 -923 205 -923 -923 -64 -923 -923 155 -64 -923 196 -923 -923 -53 196 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.0e-004 0.833333 0.000000 0.166667 0.000000 0.000000 0.500000 0.000000 0.500000 0.833333 0.166667 0.000000 0.000000 0.000000 0.500000 0.500000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.333333 0.666667 0.000000 0.000000 0.500000 0.333333 0.166667 0.333333 0.166667 0.333333 0.166667 0.000000 0.000000 1.000000 0.000000 0.500000 0.000000 0.500000 0.000000 0.333333 0.000000 0.333333 0.333333 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.166667 0.166667 0.666667 0.333333 0.666667 0.000000 0.000000 0.833333 0.000000 0.000000 0.166667 1.000000 
0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.166667 0.000000 0.000000 0.833333 0.166667 0.000000 0.833333 0.000000 0.000000 0.166667 0.833333 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- A[CT]A[CG]T[GC][CG][AG]G[AG][AGT]TGT[CA]AACTGG -------------------------------------------------------------------------------- Time 0.41 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 7 llr = 121 E-value = 9.5e-004 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :1774:73794a::3:::3:7 pos.-specific C a::3::16314:::31:::41 probability G ::3:6611:::::13::a1:1 matrix T :9:::4::::1:a919a:66: bits 2.2 * 2.0 * * * 1.8 * ** ** 1.6 * ** ** Relative 1.3 ** * *** *** Entropy 1.1 ****** ** *** *** (25.0 bits) 0.9 ******* ** *** *** ** 0.7 ********** *** *** ** 0.4 ************** ****** 0.2 ************** ****** 0.0 --------------------- Multilevel CTAAGGACAAAATTATTGTTA consensus GCAT AC C C AC sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 40630 359 4.16e-11 CATCCCTGTT CTAAATACAACATTCTTGTTA ACTGGGAAAC 41291 359 4.16e-11 TACAAAGGTT 
CTAAATACAACATTCTTGTTA ACTGGGAAAC 39077 142 5.04e-09 GCTCGTTTGG CTACAGAACAAATTATTGACA TGCGTCTTAC 38980 142 5.04e-09 GCTCGTTTGG CTACGGAGCAAATTATTGACA TGCGTCTTAC 37227 4 1.98e-08 GGA CTGAGTGCAACATTGTTGGCG GGGACGTATG 46052 148 4.33e-08 TATGCTGAGT CTGAGGCCAAAATGTCTGTTA TGTAATGTAA 32850 451 6.13e-08 AACCAGTGAT CAAAGGAAACTATTGTTGTTC CCGTTATGCC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 40630 4.2e-11 358_[+2]_121 41291 4.2e-11 358_[+2]_121 39077 5e-09 141_[+2]_338 38980 5e-09 141_[+2]_338 37227 2e-08 3_[+2]_476 46052 4.3e-08 147_[+2]_332 32850 6.1e-08 450_[+2]_29 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=7 40630 ( 359) CTAAATACAACATTCTTGTTA 1 41291 ( 359) CTAAATACAACATTCTTGTTA 1 39077 ( 142) CTACAGAACAAATTATTGACA 1 38980 ( 142) CTACGGAGCAAATTATTGACA 1 37227 ( 4) CTGAGTGCAACATTGTTGGCG 1 46052 ( 148) CTGAGGCCAAAATGTCTGTTA 1 32850 ( 451) CAAAGGAAACTATTGTTGTTC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 8.90388 E= 9.5e-004 -945 205 -945 -945 -87 -945 -945 159 145 -945 41 -945 145 24 -945 -945 72 -945 141 -945 -945 -945 141 59 145 -75 -59 -945 13 124 -59 -945 145 24 -945 -945 172 -75 -945 -945 72 83 -945 -99 194 -945 -945 -945 -945 -945 
-945 182 -945 -945 -59 159 13 24 41 -99 -945 -75 -945 159 -945 -945 -945 182 -945 -945 222 -945 13 -945 -59 101 -945 83 -945 101 145 -75 -59 -945 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 9.5e-004 0.000000 1.000000 0.000000 0.000000 0.142857 0.000000 0.000000 0.857143 0.714286 0.000000 0.285714 0.000000 0.714286 0.285714 0.000000 0.000000 0.428571 0.000000 0.571429 0.000000 0.000000 0.000000 0.571429 0.428571 0.714286 0.142857 0.142857 0.000000 0.285714 0.571429 0.142857 0.000000 0.714286 0.285714 0.000000 0.000000 0.857143 0.142857 0.000000 0.000000 0.428571 0.428571 0.000000 0.142857 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.142857 0.857143 0.285714 0.285714 0.285714 0.142857 0.000000 0.142857 0.000000 0.857143 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.285714 0.000000 0.142857 0.571429 0.000000 0.428571 0.000000 0.571429 0.714286 0.142857 0.142857 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- CT[AG][AC][GA][GT]A[CA][AC]A[AC]ATT[ACG]TTG[TA][TC]A -------------------------------------------------------------------------------- Time 0.81 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 7 llr = 120 E-value = 5.5e-003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 6::4:676:1::9a1:3:69: pos.-specific C ::16:1:17:49::1:4a::9 probability G :66:1:3:376:1:7a:::1: matrix T 443:93:3:1:1::::3:4:1 bits 2.2 * 2.0 * * * 1.8 * * * 1.6 * * * Relative 1.3 * * *** * * ** Entropy 1.1 * ** * * ****** * ** (24.7 bits) 0.9 ** ** * ******** **** 0.7 ***** * ******** **** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel AGGCTAAACGGCAAGGCCAAC consensus TTTA TGTG C A T sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 40630 455 4.25e-11 TGTGCTGCCT ATGCTAAACGCCAAGGTCTAC TAGATACTCC 41291 455 4.25e-11 TGTGCTGCCT ATGCTAAACGCCAAGGTCTAC TAGATACTCC 37227 443 1.79e-09 AACAAGTTCC AGTCTAACCTGCAAGGCCAAC AAAACCACAT 39077 342 7.37e-09 GCATTTTTTA TGGATTGAGGGCAAAGACAAC GCTAGCGATG 38980 342 9.47e-09 GCATTTTTTA TGGATTGAGGGTAAGGACAAC GCTAGCGATG 46052 17 7.94e-08 AACGATATGA ATCATCATCAGCAAGGCCAAT GACGTGCAGA 32850 32 1.49e-07 TTCATGTTGT TGTCGAATCGCCGACGCCTGC AGAGGCCTAG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
-------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 40630 4.2e-11 454_[+3]_25 41291 4.2e-11 454_[+3]_25 37227 1.8e-09 442_[+3]_37 39077 7.4e-09 341_[+3]_138 38980 9.5e-09 341_[+3]_138 46052 7.9e-08 16_[+3]_463 32850 1.5e-07 31_[+3]_448 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=7 40630 ( 455) ATGCTAAACGCCAAGGTCTAC 1 41291 ( 455) ATGCTAAACGCCAAGGTCTAC 1 37227 ( 443) AGTCTAACCTGCAAGGCCAAC 1 39077 ( 342) TGGATTGAGGGCAAAGACAAC 1 38980 ( 342) TGGATTGAGGGTAAGGACAAC 1 46052 ( 17) ATCATCATCAGCAAGGCCAAT 1 32850 ( 32) TGTCGAATCGCCGACGCCTGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 8.90388 E= 5.5e-003 113 -945 -945 59 -945 -945 141 59 -945 -75 141 1 72 124 -945 -945 -945 -945 -59 159 113 -75 -945 1 145 -945 41 -945 113 -75 -945 1 -945 157 41 -945 -87 -945 173 -99 -945 83 141 -945 -945 183 -945 -99 172 -945 -59 -945 194 -945 -945 -945 -87 -75 173 -945 -945 -945 222 -945 13 83 -945 1 -945 205 -945 -945 113 -945 -945 59 172 -945 -59 -945 -945 183 -945 -99 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 5.5e-003 0.571429 0.000000 
0.000000 0.428571 0.000000 0.000000 0.571429 0.428571 0.000000 0.142857 0.571429 0.285714 0.428571 0.571429 0.000000 0.000000 0.000000 0.000000 0.142857 0.857143 0.571429 0.142857 0.000000 0.285714 0.714286 0.000000 0.285714 0.000000 0.571429 0.142857 0.000000 0.285714 0.000000 0.714286 0.285714 0.000000 0.142857 0.000000 0.714286 0.142857 0.000000 0.428571 0.571429 0.000000 0.000000 0.857143 0.000000 0.142857 0.857143 0.000000 0.142857 0.000000 1.000000 0.000000 0.000000 0.000000 0.142857 0.142857 0.714286 0.000000 0.000000 0.000000 1.000000 0.000000 0.285714 0.428571 0.000000 0.285714 0.000000 1.000000 0.000000 0.000000 0.571429 0.000000 0.000000 0.428571 0.857143 0.000000 0.142857 0.000000 0.000000 0.857143 0.000000 0.142857 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [AT][GT][GT][CA]T[AT][AG][AT][CG]G[GC]CAAGG[CAT]C[AT]AC -------------------------------------------------------------------------------- Time 1.22 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38980                            3.17e-17  85_[+1(7.12e-05)]_35_[+2(5.04e-09)]_\
    179_[+3(9.47e-09)]_116_[+1(5.69e-12)]_1
37227                            1.14e-09  3_[+2(1.98e-08)]_418_[+3(1.79e-09)]_\
    37
32850                            1.56e-11  31_[+3(1.49e-07)]_47_[+1(2.99e-08)]_\
    330_[+2(6.13e-08)]_29
41291                            1.26e-20  358_[+2(4.16e-11)]_75_\
    [+3(4.25e-11)]_3_[+1(4.42e-11)]_1
40630                            1.26e-20  358_[+2(4.16e-11)]_75_\
    [+3(4.25e-11)]_3_[+1(4.42e-11)]_1
39077                            2.49e-17  141_[+2(5.04e-09)]_179_\
    [+3(7.37e-09)]_116_[+1(5.69e-12)]_1
46052                            4.90e-12  16_[+3(7.94e-08)]_110_\
    [+2(4.33e-08)]_226_[+1(2.31e-08)]_85
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************