********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/251/251.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31906                    1.0000    500  48693                    1.0000    500
15749                    1.0000    500  44216                    1.0000    500
54251                    1.0000    500  48263                    1.0000    500
33640                    1.0000    500  49145                    1.0000    500
43083                    1.0000    500  49011                    1.0000    500
34600                    1.0000    500  11423                    1.0000    500
47562                    1.0000    500  35397                    1.0000    500
49218                    1.0000    500  43542                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/251/251.seqs.fa -oc motifs/251 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.260 C 0.263 G 0.216 T 0.261
Background letter frequencies (from dataset with add-one prior applied):
A 0.260 C 0.263 G 0.216 T 0.261
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 4 llr = 91 E-value = 1.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :3:::3::3::33:::a:::3
pos.-specific     C  ::5::::a5::8:5:::8:::
probability       G  a85a:8::38a:83:a:35a8
matrix            T  ::::a:a::3:::3a:::5::

         bits    2.2 *  *      *    *   *
                 2.0 *  ** **  *   ***  *
                 1.8 *  ** **  *   ***  *
                 1.5 *  ** **  *   ***  *
Relative         1.3 ** ***** ** * ***  **
Entropy          1.1 ******** **** *******
(32.8 bits)      0.9 ******** **** *******
                 0.7 ******** **** *******
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GGCGTGTCCGGCGCTGACGGG
consensus             AG  A  AT AAG   GT A
sequence                     G    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
11423                        66  2.06e-13 TCCATCCCGC GGCGTGTCCGGCGCTGACGGG AGCAACGCCA
15749                        15  2.06e-13 TCCATCCCGC GGCGTGTCCGGCGCTGACGGG AGCAACGCCA
49011                        36  1.62e-10 CTTTCATAGC GGGGTGTCAGGCAGTGAGTGA CTAGTCACAG
49218                       131  3.11e-10 AAGTAATTCT GAGGTATCGTGAGTTGACTGG GTAAAACGGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11423                             2.1e-13  65_[+1]_414
15749                             2.1e-13  14_[+1]_465
49011                             1.6e-10  35_[+1]_444
49218                             3.1e-10  130_[+1]_349
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=4
11423                    (   66) GGCGTGTCCGGCGCTGACGGG  1
15749                    (   15) GGCGTGTCCGGCGCTGACGGG  1
49011                    (   36) GGGGTGTCAGGCAGTGAGTGA  1
49218                    (  131) GAGGTATCGTGAGTTGACTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 10.9061 E= 1.6e+000
  -865   -865    221   -865
    -6   -865    179   -865
  -865     93    121   -865
  -865   -865    221   -865
  -865   -865   -865    194
    -6   -865    179   -865
  -865   -865   -865    194
  -865    193   -865   -865
    -6     93     21   -865
  -865   -865    179     -6
  -865   -865    221   -865
    -6    151   -865   -865
    -6   -865    179   -865
  -865     93     21     -6
  -865   -865   -865    194
  -865   -865    221   -865
   194   -865   -865   -865
  -865    151     21   -865
  -865   -865    121     94
  -865   -865    221   -865
    -6   -865    179   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 1.6e+000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[GA][CG]GT[GA]TC[CAG][GT]G[CA][GA][CGT]TGA[CG][GT]G[GA]
--------------------------------------------------------------------------------




Time 2.12 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 20 sites = 12 llr = 161 E-value = 1.1e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  4551:225a3:23::2823:
pos.-specific     C  62:::::::222::3::363
probability       G  :338:785:433::68:1:7
matrix            T  :131a2:::1637a11352:

         bits    2.2
                 2.0     *   *    *
                 1.8     *   *    *
                 1.5     * * *    *
Relative         1.3    ** * *    *
Entropy          1.1    ** ***   ** **  *
(19.4 bits)      0.9 *  ******   *****  *
                 0.7 *  ****** * *****  *
                 0.4 * ******* * ***** **
                 0.2 *********** ********
                 0.0 --------------------

Multilevel           CAAGTGGAAGTGTTGGATCG
consensus            AGG    G AGTA C TCAC
sequence               T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
15749                       327  9.44e-10 CAACGTGCCG CAGGTGGAAATTTTGGAACG GTACGCGGTG
11423                       378  2.65e-09 CAATGTGCCG CAGGTGGAAATTTTGGAGCG GTACGCAGTG
47562                       189  2.86e-08 ATCCAACAAT CATGTGAGAGTGTTCGTTCG GACTGTGAAC
31906                       134  5.16e-08 GCGTTGTTCT AAAGTAGAAGGGTTCGATTG CTACAAGTGG
49011                       114  8.89e-08 TGGACGGAGC CAAGTGGAAGTATTGAACAC ATTTATTTTC
48693                        10  9.88e-08  GGCGTTGGG AGTGTGGGATTTATGGTTCG ATCCGTCCCG
43542                       161  1.34e-07 TTCGCCCAAA CCAGTGGAAGCTTTGAAACG CGCTCCGACG
54251                       245  3.40e-07 GTGACACAAA CGTGTGGGACTGTTGTATAC ACGACCTCGC
48263                       402  1.18e-06 ACAGACACTC AAAGTTGGACTAATTGTTCG ACGGTAAAGG
34600                       346  2.13e-06 GCTACTTTCG AGAATGGGAGCGATCGACAC ATACGCGCAC
35397                       372  4.64e-06 ACTGCAACAA ATAGTAAGAAGCATCGATCC TTACGTCGTT
33640                        30  4.88e-06 TTTCTAAGTG CCGTTTGAAAGCTTGGACTG GCTGAACTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
15749                             9.4e-10  326_[+2]_154
11423                             2.7e-09  377_[+2]_103
47562                             2.9e-08  188_[+2]_292
31906                             5.2e-08  133_[+2]_347
49011                             8.9e-08  113_[+2]_367
48693                             9.9e-08  9_[+2]_471
43542                             1.3e-07  160_[+2]_320
54251                             3.4e-07  244_[+2]_236
48263                             1.2e-06  401_[+2]_79
34600                             2.1e-06  345_[+2]_135
35397                             4.6e-06  371_[+2]_109
33640                             4.9e-06  29_[+2]_451
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=12
15749                    (  327) CAGGTGGAAATTTTGGAACG  1
11423                    (  378) CAGGTGGAAATTTTGGAGCG  1
47562                    (  189) CATGTGAGAGTGTTCGTTCG  1
31906                    (  134) AAAGTAGAAGGGTTCGATTG  1
49011                    (  114) CAAGTGGAAGTATTGAACAC  1
48693                    (   10) AGTGTGGGATTTATGGTTCG  1
43542                    (  161) CCAGTGGAAGCTTTGAAACG  1
54251                    (  245) CGTGTGGGACTGTTGTATAC  1
48263                    (  402) AAAGTTGGACTAATTGTTCG  1
34600                    (  346) AGAATGGGAGCGATCGACAC  1
35397                    (  372) ATAGTAAGAAGCATCGATCC  1
33640                    (   30) CCGTTTGAAAGCTTGGACTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7696 bayes= 10.4234 E= 1.1e-001
    68    115  -1023  -1023
    94    -66     21   -164
    94  -1023     21     -6
  -164  -1023    194   -164
 -1023  -1023  -1023    194
   -64  -1023    162    -64
   -64  -1023    194  -1023
    94  -1023    121  -1023
   194  -1023  -1023  -1023
    36    -66     94   -164
 -1023    -66     21    116
   -64    -66     62     35
    36  -1023  -1023    135
 -1023  -1023  -1023    194
 -1023     34    143   -164
   -64  -1023    179   -164
   153  -1023  -1023     -6
   -64     -7   -137     94
    -6    115  -1023    -64
 -1023     34    162  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 12 E= 1.1e-001
 0.416667  0.583333  0.000000  0.000000
 0.500000  0.166667  0.250000  0.083333
 0.500000  0.000000  0.250000  0.250000
 0.083333  0.000000  0.833333  0.083333
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.000000  0.666667  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.166667  0.416667  0.083333
 0.000000  0.166667  0.250000  0.583333
 0.166667  0.166667  0.333333  0.333333
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.583333  0.083333
 0.166667  0.000000  0.750000  0.083333
 0.750000  0.000000  0.000000  0.250000
 0.166667  0.250000  0.083333  0.500000
 0.250000  0.583333  0.000000  0.166667
 0.000000  0.333333  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CA][AG][AGT]GTGG[AG]A[GA][TG][GT][TA]T[GC]G[AT][TC][CA][GC]
--------------------------------------------------------------------------------




Time 4.34 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 5 llr = 100 E-value = 5.8e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  26:a24::4a:::8::42::
pos.-specific     C  :24:66:a::a:::8::2aa
probability       G  8:6:::8:6::aa2:a2:::
matrix            T  :2::2:2:::::::2:46::

         bits    2.2            **  *
                 2.0    *   * ****  *  **
                 1.8    *   * ****  *  **
                 1.5    *   * ****  *  **
Relative         1.3 *  *  ** ***** *  **
Entropy          1.1 * **  **********  **
(29.0 bits)      0.9 * ** ***********  **
                 0.7 **************** ***
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GAGACCGCGACGGACGATCC
consensus            ACC AAT A    GT TA
sequence              T  T           GC

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
11423                       168  1.19e-12 CGGTCCGGAG GAGACCGCGACGGACGTTCC GTTAGCGACC
15749                       117  1.19e-12 CGGTCCGGAG GAGACCGCGACGGACGTTCC GTTGGCGACC
34600                       220  1.38e-09 CGCGAGTTCG GTCACAGCGACGGGCGAACC CAAGTTTCTG
49145                        97  3.23e-09 GTATCCCTCC AACATCGCAACGGATGGTCC TGTCCGCTCC
54251                        78  3.38e-09 AGGCAATGCT GCGAAATCAACGGACGACCC TGGATTCTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11423                             1.2e-12  167_[+3]_313
15749                             1.2e-12  116_[+3]_364
34600                             1.4e-09  219_[+3]_261
49145                             3.2e-09  96_[+3]_384
54251                             3.4e-09  77_[+3]_403
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=5
11423                    (  168) GAGACCGCGACGGACGTTCC  1
15749                    (  117) GAGACCGCGACGGACGTTCC  1
34600                    (  220) GTCACAGCGACGGGCGAACC  1
49145                    (   97) AACATCGCAACGGATGGTCC  1
54251                    (   78) GCGAAATCAACGGACGACCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7696 bayes= 10.8387 E= 5.8e+000
   -38   -897    188   -897
   120    -39   -897    -38
  -897     61    147   -897
   194   -897   -897   -897
   -38    119   -897    -38
    62    119   -897   -897
  -897   -897    188    -38
  -897    193   -897   -897
    62   -897    147   -897
   194   -897   -897   -897
  -897    193   -897   -897
  -897   -897    221   -897
  -897   -897    221   -897
   162   -897    -11   -897
  -897    160   -897    -38
  -897   -897    221   -897
    62   -897    -11     62
   -38    -39   -897    120
  -897    193   -897   -897
  -897    193   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 5.8e+000
 0.200000  0.000000  0.800000  0.000000
 0.600000  0.200000  0.000000  0.200000
 0.000000  0.400000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.600000  0.000000  0.200000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.200000  0.400000
 0.200000  0.200000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GA][ACT][GC]A[CAT][CA][GT]C[GA]ACGG[AG][CT]G[ATG][TAC]CC
--------------------------------------------------------------------------------




Time 6.34 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31906                            1.35e-03  133_[+2(5.16e-08)]_347
48693                            7.80e-05  9_[+2(9.88e-08)]_231_[+3(3.06e-05)]_\
    220
15749                            4.59e-23  14_[+1(2.06e-13)]_81_[+3(1.19e-12)]_\
    190_[+2(9.44e-10)]_91_[+1(8.03e-05)]_42
44216                            1.44e-01  153_[+1(9.70e-05)]_326
54251                            5.44e-08  77_[+3(3.38e-09)]_147_\
    [+2(3.40e-07)]_236
48263                            8.78e-03  401_[+2(1.18e-06)]_79
33640                            2.78e-02  29_[+2(4.88e-06)]_451
49145                            3.22e-05  96_[+3(3.23e-09)]_384
43083                            8.00e-01  500
49011                            1.00e-09  35_[+1(1.62e-10)]_57_[+2(8.89e-08)]_\
    367
34600                            7.09e-08  219_[+3(1.38e-09)]_106_\
    [+2(2.13e-06)]_135
11423                            1.25e-22  65_[+1(2.06e-13)]_81_[+3(1.19e-12)]_\
    190_[+2(2.65e-09)]_103
47562                            6.37e-04  188_[+2(2.86e-08)]_292
35397                            1.53e-02  371_[+2(4.64e-06)]_109
49218                            1.77e-05  130_[+1(3.11e-10)]_349
43542                            1.49e-03  160_[+2(1.34e-07)]_320
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************