********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/168/168.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8891                     1.0000    500  54067                    1.0000    500
46699                    1.0000    500  36906                    1.0000    500
14387                    1.0000    500  43430                    1.0000    500
49226                    1.0000    500  49396                    1.0000    500
49447                    1.0000    500  16725                    1.0000    500
33915                    1.0000    500  45936                    1.0000    500
32411                    1.0000    500  40277                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/168/168.seqs.fa -oc motifs/168 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.233 G 0.223 T 0.270
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.233 G 0.223 T 0.270
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 12 sites = 7 llr = 88 E-value = 6.8e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  4:::9:1:::a:
pos.-specific     C  14:::a3:::::
probability       G  4::9::19:a:a
matrix            T  :6a11:41a:::

         bits    2.2      *   * *
                 1.9   *  *  ****
                 1.7   *  *  ****
                 1.5   ** * *****
Relative         1.3   **** *****
Entropy          1.1  ***** *****
(18.1 bits)      0.9  ***** *****
                 0.6 ****** *****
                 0.4 ****** *****
                 0.2 ************
                 0.0 ------------

Multilevel           ATTGACTGTGAG
consensus            GC    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
49396                       335  1.58e-07 TCGTTACGAA ATTGACTGTGAG AGTGAGTACC
16725                       137  2.57e-07 ATAGTAAGGC ACTGACTGTGAG TTAAGAGATT
54067                       288  3.50e-07 GTTGTCAGGC GCTGACCGTGAG CGATACTCTC
14387                       129  9.34e-07 TTGAAAAAGG ACTGACAGTGAG GAAACGGAGC
32411                       292  1.30e-06 ACCGTCGTCG CTTGACGGTGAG AAAAAAGGTC
46699                       474  1.30e-06 CCTTCCCTTT GTTGACTTTGAG CCTGCCCATC
49226                        24  5.28e-06 CGTTGGAACT GTTTTCCGTGAG TTGATGTTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49396                             1.6e-07  334_[+1]_154
16725                             2.6e-07  136_[+1]_352
54067                             3.5e-07  287_[+1]_201
14387                             9.3e-07  128_[+1]_360
32411                             1.3e-06  291_[+1]_197
46699                             1.3e-06  473_[+1]_15
49226                             5.3e-06  23_[+1]_465
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=7
49396                    (  335) ATTGACTGTGAG  1
16725                    (  137) ACTGACTGTGAG  1
54067                    (  288) GCTGACCGTGAG  1
14387                    (  129) ACTGACAGTGAG  1
32411                    (  292) CTTGACGGTGAG  1
46699                    (  474) GTTGACTTTGAG  1
49226                    (   24) GTTTTCCGTGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 10.5384 E= 6.8e+000
    65    -71     94   -945
  -945     88   -945    108
  -945   -945   -945    189
  -945   -945    194    -92
   164   -945   -945    -92
  -945    210   -945   -945
   -94     29    -64     67
  -945   -945    194    -92
  -945   -945   -945    189
  -945   -945    216   -945
   187   -945   -945   -945
  -945   -945    216   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 6.8e+000
 0.428571  0.142857  0.428571  0.000000
 0.000000  0.428571  0.000000  0.571429
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.857143  0.142857
 0.857143  0.000000  0.000000  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.285714  0.142857  0.428571
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AG][TC]TGAC[TC]GTGAG
--------------------------------------------------------------------------------

Time 1.86 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 20 sites = 8 llr = 121 E-value = 4.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  8943:3a61493aaa65:3:
pos.-specific     C  3:11:5:331::::::3:5:
probability       G  :1:693::4:18:::::331
matrix            T  ::5:1::135:::::438:9

         bits    2.2
                 1.9       *     ***
                 1.7       *     ***
                 1.5     * *     ***
Relative         1.3  *  * *   *****    *
Entropy          1.1 **  * *   *****  * *
(21.7 bits)      0.9 ** ** *   ****** * *
                 0.6 ** *****  ****** ***
                 0.4 ******** ***********
                 0.2 ********************
                 0.0 --------------------

Multilevel           AATGGCAAGTAGAAAAATCT
consensus            C AA A CCA A   TCGA
sequence                  G  T       T G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            --------------------
45936                       179  2.49e-09 CAAGCTTTCA AACGGCACTTAGAAAAATCT GTCTGGCGCG
49396                       261  7.20e-09 TGCACCTCAA AATGGCAACTGGAAATATGT GGGGTCGCCT
40277                       170  2.47e-08 GGGCCGATAA CAAAGAAAGTAGAAAAATAT CAGTTGATCA
43430                        18  6.81e-08 TTGACAACAG AAAGGAAATAAAAAAATTAT CTTGTATTTC
54067                        51  1.02e-07 CTTGGGTTTC AAAATCACGTAGAAATATGT ACCGGTTCTA
14387                        39  1.37e-07 CCTGAGAATG AATCGCAACCAGAAAACTCG CTTCGCAGAA
16725                       475  1.58e-07 ACGGTTGGAG AATGGGATGAAAAAATTGCT AGATTT
36906                        90  2.35e-07 TAGGTATTTA CGTGGGAAAAAGAAAACGCT TGTCTTTCTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45936                             2.5e-09  178_[+2]_302
49396                             7.2e-09  260_[+2]_220
40277                             2.5e-08  169_[+2]_311
43430                             6.8e-08  17_[+2]_463
54067                               1e-07  50_[+2]_430
14387                             1.4e-07  38_[+2]_442
16725                             1.6e-07  474_[+2]_6
36906                             2.3e-07  89_[+2]_391
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=8
45936                    (  179) AACGGCACTTAGAAAAATCT  1
49396                    (  261) AATGGCAACTGGAAATATGT  1
40277                    (  170) CAAAGAAAGTAGAAAAATAT  1
43430                    (   18) AAAGGAAATAAAAAAATTAT  1
54067                    (   51) AAAATCACGTAGAAATATGT  1
14387                    (   39) AATCGCAACCAGAAAACTCG  1
16725                    (  475) AATGGGATGAAAAAATTGCT  1
36906                    (   90) CGTGGGAAAAAGAAAACGCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6734 bayes= 9.71553 E= 4.7e+002
   145     10   -965   -965
   167   -965    -83   -965
    45    -90   -965     89
   -13    -90    149   -965
  -965   -965    197   -111
   -13    110     17   -965
   187   -965   -965   -965
   119     10   -965   -111
  -113     10     75    -11
    45    -90   -965     89
   167   -965    -83   -965
   -13   -965    175   -965
   187   -965   -965   -965
   187   -965   -965   -965
   187   -965   -965   -965
   119   -965   -965     47
    87     10   -965    -11
  -965   -965     17    147
   -13    110     17   -965
  -965   -965    -83    170
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 4.7e+002
 0.750000  0.250000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.375000  0.125000  0.000000  0.500000
 0.250000  0.125000  0.625000  0.000000
 0.000000  0.000000  0.875000  0.125000
 0.250000  0.500000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.625000  0.250000  0.000000  0.125000
 0.125000  0.250000  0.375000  0.250000
 0.375000  0.125000  0.000000  0.500000
 0.875000  0.000000  0.125000  0.000000
 0.250000  0.000000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.625000  0.000000  0.000000  0.375000
 0.500000  0.250000  0.000000  0.250000
 0.000000  0.000000  0.250000  0.750000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.000000  0.125000  0.875000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AC]A[TA][GA]G[CAG]A[AC][GCT][TA]A[GA]AAA[AT][ACT][TG][CAG]T
--------------------------------------------------------------------------------

Time 3.54 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 20 sites = 9 llr = 129 E-value = 1.0e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::96879436:42:6a66::
pos.-specific     C  61::1::3::9:8:4:3:::
probability       G  18:21:123:11:2::1:7:
matrix            T  3112:3::34:4:8:::43a

         bits    2.2
                 1.9                *   *
                 1.7                *   *
                 1.5           *    *   *
Relative         1.3   *   *   * *  *   *
Entropy          1.1  **   *   * ****  **
(20.8 bits)      0.9  ** ***  ** **** ***
                 0.6 *** ***  ** ********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CGAAAAAAAACACTAAAAGT
consensus            T  G T CGT TAGC CTT
sequence                T   GT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            --------------------
14387                       181  5.48e-11 CATAGTGCAT CGAAAAACGTCTCTAAAAGT TCTTCTACAC
45936                       110  3.50e-09 ACTTGGGAGA CGAAAAAGTACTCTAAATTT AACAAGGATT
33915                       370  2.73e-08 CGAAATTTTA TGAGAAAATACGCTAACAGT AAAGGGGATG
54067                       323  6.21e-08 AACCTGACTG TGAAAAACATGACTAACTGT AAGAGAGCTG
49226                       426  2.59e-07 GAAACCAGAT CGATCTAGGTCTCTCAATTT CATGCTCGAG
40277                       200  2.79e-07 CAGTTGATCA CCATAAAAGTCACGAAATTT GTTCGAAAAG
36906                       271  4.50e-07 TTTTGCGCCT GTAGATACTACACTCACAGT CAATCACAGT
49447                        58  5.11e-07 GGTACCCCCG CGAAGTGAAACAATCAAAGT AGACGGCAAT
49396                        90  9.62e-07 GTTCATGTTT TGTAAAAAAACTAGCAGAGT CAAATGTAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14387                             5.5e-11  180_[+3]_300
45936                             3.5e-09  109_[+3]_371
33915                             2.7e-08  369_[+3]_111
54067                             6.2e-08  322_[+3]_158
49226                             2.6e-07  425_[+3]_55
40277                             2.8e-07  199_[+3]_281
36906                             4.5e-07  270_[+3]_210
49447                             5.1e-07  57_[+3]_423
49396                             9.6e-07  89_[+3]_391
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=9
14387                    (  181) CGAAAAACGTCTCTAAAAGT  1
45936                    (  110) CGAAAAAGTACTCTAAATTT  1
33915                    (  370) TGAGAAAATACGCTAACAGT  1
54067                    (  323) TGAAAAACATGACTAACTGT  1
49226                    (  426) CGATCTAGGTCTCTCAATTT  1
40277                    (  200) CCATAAAAGTCACGAAATTT  1
36906                    (  271) GTAGATACTACACTCACAGT  1
49447                    (   58) CGAAGTGAAACAATCAAAGT  1
49396                    (   90) TGTAAAAAAACTAGCAGAGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6734 bayes= 9.67987 E= 1.0e+003
  -982    125   -100     30
  -982   -107    180   -128
   170   -982   -982   -128
   102   -982      0    -28
   150   -107   -100   -982
   128   -982   -982     30
   170   -982   -100   -982
    70     51      0   -982
    28   -982     58     30
   102   -982   -982     72
  -982    193   -100   -982
    70   -982   -100     72
   -30    174   -982   -982
  -982   -982      0    153
   102     93   -982   -982
   187   -982   -982   -982
   102     51   -100   -982
   102   -982   -982     72
  -982   -982    158     30
  -982   -982   -982    189
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 9 E= 1.0e+003
 0.000000  0.555556  0.111111  0.333333
 0.000000  0.111111  0.777778  0.111111
 0.888889  0.000000  0.000000  0.111111
 0.555556  0.000000  0.222222  0.222222
 0.777778  0.111111  0.111111  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.888889  0.000000  0.111111  0.000000
 0.444444  0.333333  0.222222  0.000000
 0.333333  0.000000  0.333333  0.333333
 0.555556  0.000000  0.000000  0.444444
 0.000000  0.888889  0.111111  0.000000
 0.444444  0.000000  0.111111  0.444444
 0.222222  0.777778  0.000000  0.000000
 0.000000  0.000000  0.222222  0.777778
 0.555556  0.444444  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.555556  0.333333  0.111111  0.000000
 0.555556  0.000000  0.000000  0.444444
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CT]GA[AGT]A[AT]A[ACG][AGT][AT]C[AT][CA][TG][AC]A[AC][AT][GT]T
--------------------------------------------------------------------------------

Time 5.28 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8891                             9.26e-01  500
54067                            1.13e-10  50_[+2(1.02e-07)]_217_\
    [+1(3.50e-07)]_23_[+3(6.21e-08)]_158
46699                            4.28e-03  473_[+1(1.30e-06)]_15
36906                            2.85e-06  89_[+2(2.35e-07)]_161_\
    [+3(4.50e-07)]_210
14387                            5.08e-13  38_[+2(1.37e-07)]_70_[+1(9.34e-07)]_\
    40_[+3(5.48e-11)]_300
43430                            1.01e-03  17_[+2(6.81e-08)]_463
49226                            3.35e-06  23_[+1(5.28e-06)]_390_\
    [+3(2.59e-07)]_55
49396                            5.87e-11  89_[+3(9.62e-07)]_151_\
    [+2(7.20e-09)]_54_[+1(1.58e-07)]_154
49447                            7.58e-03  57_[+3(5.11e-07)]_423
16725                            4.72e-07  136_[+1(2.57e-07)]_326_\
    [+2(1.58e-07)]_6
33915                            4.67e-05  369_[+3(2.73e-08)]_111
45936                            5.34e-11  109_[+3(3.50e-09)]_49_\
    [+2(2.49e-09)]_302
32411                            1.22e-02  291_[+1(1.30e-06)]_197
40277                            2.76e-07  58_[+2(4.53e-05)]_91_[+2(2.47e-08)]_\
    10_[+3(2.79e-07)]_281
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************