********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/387/387.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46387                    1.0000    500  47094                    1.0000    500
47164                    1.0000    500  8461                     1.0000    500
47358                    1.0000    500  38407                    1.0000    500
48220                    1.0000    500  5286                     1.0000    500
49280                    1.0000    500  49787                    1.0000    500
40691                    1.0000    500  10187                    1.0000    500
12221                    1.0000    500  38892                    1.0000    500
48944                    1.0000    500  48355                    1.0000    500
50128                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/387/387.seqs.fa -oc motifs/387 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.239 G 0.223 T 0.273
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.239 G 0.223 T 0.273
********************************************************************************


********************************************************************************
MOTIF  1 MEME   width =  12  sites =  17  llr = 161  E-value = 1.9e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  53a99824a15:
pos.-specific     C  :7::12:6:518
probability       G  5::1::3::412
matrix            T  ::::1:5::141

         bits    2.2
                 2.0   *     *
                 1.7   *     *
                 1.5   **    *
Relative         1.3   ***   *
Entropy          1.1 ****** **  *
(13.7 bits)      0.9 ****** **  *
                 0.7 ****** *** *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ACAAAATCACAC
consensus            GA   CGA GT
sequence                   A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
5286                        298  7.14e-07 ACCGGAAAAC ACAAAATCAGTC GGACTCCCAA
46387                        74  2.00e-06 GCGAATGATC ACAAAATAACTC TTCCATCGGC
49787                        50  2.61e-06 CGCTATTTGG GAAAAATCACAC TGTTTCTAAG
38407                       352  2.61e-06 CGCTATTTGG GAAAAATCACAC TGATTCTAAG
47164                       210  4.95e-06 GTTGCTAGGA ACAAAATCACAG TAGTGAGCGC
40691                       153  5.47e-06 TGCTGGAACG GCAAACGCACTC GCAGCGGGCT
48355                       264  5.89e-06 GCGACTGCGA ACAAAAAAACTC CGAAGGACAG
48944                       460  7.36e-06 TACCATAAAC GCAAACACACAC ACGCACACTG
47358                       436  1.40e-05 CCTTTGGTTT GCAAAATAAGTG GTGACCACTA
8461                          6  1.52e-05      TGTAA ACAAAAGCAAAC ACGATCGCAA
49280                       338  3.27e-05 CCGACAAAGA AAAAAAGCAGCC CAGTCGCTTC
50128                       289  3.80e-05 GTAGAGGACC ACAAAAGAATTC ATCACATAAT
12221                       218  4.47e-05 AATACAGACG GAAAAATAAGGC GAAAGAACGA
48220                       449  6.32e-05 AAACTCCCAA ACAAACGCACAT CCTCCTGCGA
38892                       160  7.12e-05 ATGACTGACA GCAGAAAAAGTC TCTAACAAAG
10187                       365  8.07e-05 AAAGAAAGAC GCAACCACAGAC CGGTGACGCA
47094                         7  1.31e-04     AAATTG AAAATATCAGAG CTATTGTCTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5286                              7.1e-07  297_[+1]_191
46387                               2e-06  73_[+1]_415
49787                             2.6e-06  49_[+1]_439
38407                             2.6e-06  351_[+1]_137
47164                             4.9e-06  209_[+1]_279
40691                             5.5e-06  152_[+1]_336
48355                             5.9e-06  263_[+1]_225
48944                             7.4e-06  459_[+1]_29
47358                             1.4e-05  435_[+1]_53
8461                              1.5e-05  5_[+1]_483
49280                             3.3e-05  337_[+1]_151
50128                             3.8e-05  288_[+1]_200
12221                             4.5e-05  217_[+1]_271
48220                             6.3e-05  448_[+1]_40
38892                             7.1e-05  159_[+1]_329
10187                             8.1e-05  364_[+1]_124
47094                             0.00013  6_[+1]_482
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=17
5286                     (  298) ACAAAATCAGTC  1
46387                    (   74) ACAAAATAACTC  1
49787                    (   50) GAAAAATCACAC  1
38407                    (  352) GAAAAATCACAC  1
47164                    (  210) ACAAAATCACAG  1
40691                    (  153) GCAAACGCACTC  1
48355                    (  264) ACAAAAAAACTC  1
48944                    (  460) GCAAACACACAC  1
47358                    (  436) GCAAAATAAGTG  1
8461                     (    6) ACAAAAGCAAAC  1
49280                    (  338) AAAAAAGCAGCC  1
50128                    (  289) ACAAAAGAATTC  1
12221                    (  218) GAAAAATAAGGC  1
48220                    (  449) ACAAACGCACAT  1
38892                    (  160) GCAGAAAAAGTC  1
10187                    (  365) GCAACCACAGAC  1
47094                    (    7) AAAATATCAGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.00042 E= 1.9e-002
   100  -1073    108  -1073
    15    156  -1073  -1073
   192  -1073  -1073  -1073
   183  -1073   -192  -1073
   173   -202  -1073   -221
   153     -2  -1073  -1073
   -17  -1073     40     78
    41    144  -1073  -1073
   192  -1073  -1073  -1073
  -217     98     89   -221
    83   -202   -192     59
 -1073    168    -33   -221
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 17 E= 1.9e-002
 0.529412  0.000000  0.470588  0.000000
 0.294118  0.705882  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.941176  0.000000  0.058824  0.000000
 0.882353  0.058824  0.000000  0.058824
 0.764706  0.235294  0.000000  0.000000
 0.235294  0.000000  0.294118  0.470588
 0.352941  0.647059  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.058824  0.470588  0.411765  0.058824
 0.470588  0.058824  0.058824  0.411765
 0.000000  0.764706  0.176471  0.058824
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AG][CA]AAA[AC][TGA][CA]A[CG][AT]C
--------------------------------------------------------------------------------




Time  2.72 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME   width =  14  sites =   4  llr = 65  E-value = 1.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a::a3:::3:5aa
pos.-specific     C  :::a::::3:35::
probability       G  a:a::88a8:8:::
matrix            T  ::::::3::8::::

         bits    2.2 * **   *
                 2.0 *****  *    **
                 1.7 *****  *    **
                 1.5 *****  *    **
Relative         1.3 ********* * **
Entropy          1.1 **************
(23.5 bits)      0.9 **************
                 0.7 **************
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           GAGCAGGGGTGAAA
consensus                 AT CACC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
38407                       324  4.45e-09 TATTTGGGAT GAGCAGGGGTGAAA TTTTCGCTAT
49787                        22  2.90e-08 TATTTGGGAT GAGCAAGGGTGAAA TTTTCGCTAT
8461                        210  3.34e-08 CCCGAAGTTC GAGCAGGGCAGCAA TGTACCGACG
48355                       201  5.92e-08 GTGCATTTAC GAGCAGTGGTCCAA ACGCGGTACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38407                             4.4e-09  323_[+2]_163
49787                             2.9e-08  21_[+2]_465
8461                              3.3e-08  209_[+2]_277
48355                             5.9e-08  200_[+2]_286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=4
38407                    (  324) GAGCAGGGGTGAAA  1
49787                    (   22) GAGCAAGGGTGAAA  1
8461                     (  210) GAGCAGGGCAGCAA  1
48355                    (  201) GAGCAGTGGTCCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 8279 bayes= 11.0145 E= 1.9e+002
  -865   -865    216   -865
   191   -865   -865   -865
  -865   -865    216   -865
  -865    206   -865   -865
   191   -865   -865   -865
    -8   -865    175   -865
  -865   -865    175    -13
  -865   -865    216   -865
  -865      6    175   -865
    -8   -865   -865    145
  -865      6    175   -865
    91    106   -865   -865
   191   -865   -865   -865
   191   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 4 E= 1.9e+002
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.250000  0.750000  0.000000
 0.500000  0.500000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GAGCA[GA][GT]G[GC][TA][GC][AC]AA
--------------------------------------------------------------------------------




Time  5.27 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME   width =  18  sites =   6  llr = 99  E-value = 2.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  22::::::3:3::::3::
pos.-specific     C  :::::35::2:7:a:722
probability       G  85a::::a23::2:7:88
matrix            T  :3:aa75:55738:3:::

         bits    2.2   *    *     *
                 2.0   ***  *     *
                 1.7   ***  *     *
                 1.5 * ***  *     *  **
Relative         1.3 * ***  *    **  **
Entropy          1.1 * **** *   *******
(23.8 bits)      0.9 * ******  ********
                 0.7 ******** *********
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           GGGTTTCGTTTCTCGCGG
consensus             T   CT AGAT  TA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
49787                        71  3.73e-10 CTGTTTCTAA GTGTTTTGATTCTCGCGG GAAATACTTT
38407                       403  3.73e-10 CTGATTCTAA GTGTTTTGATTCTCGCGG GAAATACTTT
47164                       444  3.73e-10 TACTCGGAAC GGGTTTCGTGTTTCGCGG TTTTTCGACA
48944                       276  8.00e-08 GTTTCGCCTG GAGTTCCGTTTCTCTACG AACGACTTTT
48220                       268  1.06e-07 CATCGTCGTC GGGTTTCGTCACGCTCGC TAGCTCTCTG
38892                       245  1.16e-07 TACACGATAG AGGTTCTGGGATTCGAGG CACCAGCTCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49787                             3.7e-10  70_[+3]_412
38407                             3.7e-10  402_[+3]_80
47164                             3.7e-10  443_[+3]_39
48944                               8e-08  275_[+3]_207
48220                             1.1e-07  267_[+3]_215
38892                             1.2e-07  244_[+3]_238
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=18 seqs=6
49787                    (   71) GTGTTTTGATTCTCGCGG  1
38407                    (  403) GTGTTTTGATTCTCGCGG  1
47164                    (  444) GGGTTTCGTGTTTCGCGG  1
48944                    (  276) GAGTTCCGTTTCTCTACG  1
48220                    (  268) GGGTTTCGTCACGCTCGC  1
38892                    (  245) AGGTTCTGGGATTCGAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 8211 bayes= 10.8651 E= 2.3e+002
   -67   -923    190   -923
   -67   -923    117     29
  -923   -923    217   -923
  -923   -923   -923    187
  -923   -923   -923    187
  -923     48   -923    129
  -923    106   -923     87
  -923   -923    217   -923
    33   -923    -42     87
  -923    -52     58     87
    33   -923   -923    129
  -923    148   -923     29
  -923   -923    -42    161
  -923    206   -923   -923
  -923   -923    158     29
    33    148   -923   -923
  -923    -52    190   -923
  -923    -52    190   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 6 E= 2.3e+002
 0.166667  0.000000  0.833333  0.000000
 0.166667  0.000000  0.500000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.000000  0.166667  0.500000
 0.000000  0.166667  0.333333  0.500000
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  0.166667  0.833333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.166667  0.833333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[GT]GTT[TC][CT]G[TA][TG][TA][CT]TC[GT][CA]GG
--------------------------------------------------------------------------------




Time  7.92 secs.

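The regular expression reported for Motif 3 can be applied directly to other sequences as a quick, approximate scan. The following is a minimal sketch, not part of MEME: the helper names `find_motif` and `sites_matching` are hypothetical, and the site strings are copied from the Motif 3 BLOCKS section above. It assumes forward-strand matching only, in line with the `strands: +` setting of this run.

```python
import re

# Regular expression copied from the "Motif 3 regular expression" section.
MOTIF3_REGEX = re.compile(r"G[GT]GTT[TC][CT]G[TA][TG][TA][CT]TC[GT][CA]GG")

# Site strings copied from the Motif 3 BLOCKS section, keyed by sequence name.
MOTIF3_SITES = {
    "49787": "GTGTTTTGATTCTCGCGG",
    "38407": "GTGTTTTGATTCTCGCGG",
    "47164": "GGGTTTCGTGTTTCGCGG",
    "48944": "GAGTTCCGTTTCTCTACG",
    "48220": "GGGTTTCGTCACGCTCGC",
    "38892": "AGGTTCTGGGATTCGAGG",
}


def find_motif(sequence, pattern=MOTIF3_REGEX):
    """Return 1-based start positions of non-overlapping matches (hypothetical helper)."""
    return [match.start() + 1 for match in pattern.finditer(sequence)]


def sites_matching(sites, pattern=MOTIF3_REGEX):
    """Return the sorted names of sites that the pattern matches in full."""
    return sorted(name for name, site in sites.items() if pattern.fullmatch(site))


if __name__ == "__main__":
    print(sites_matching(MOTIF3_SITES))
    print(find_motif("CTAA" + MOTIF3_SITES["49787"] + "GAAA"))
```

The regular expression leaves out the less frequent letters at each position, so it matches only some of the six reported sites. A scan based on the position-specific scoring matrix would be the more sensitive choice.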
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46387                            1.33e-02  73_[+1(2.00e-06)]_415
47094                            2.92e-01  500
47164                            9.41e-08  209_[+1(4.95e-06)]_222_\
    [+3(3.73e-10)]_39
8461                             1.69e-05  5_[+1(1.52e-05)]_192_[+2(3.34e-08)]_\
    277
47358                            5.70e-02  435_[+1(1.40e-05)]_53
38407                            3.26e-13  323_[+2(4.45e-09)]_14_\
    [+1(2.61e-06)]_9_[+3(3.12e-07)]_12_[+3(3.73e-10)]_80
48220                            3.95e-05  267_[+3(1.06e-07)]_163_\
    [+1(6.32e-05)]_40
5286                             6.97e-03  297_[+1(7.14e-07)]_191
49280                            3.17e-02  337_[+1(3.27e-05)]_151
49787                            1.91e-12  21_[+2(2.90e-08)]_14_[+1(2.61e-06)]_\
    9_[+3(3.73e-10)]_412
40691                            4.15e-03  152_[+1(5.47e-06)]_336
10187                            8.84e-02  364_[+1(8.07e-05)]_124
12221                            2.12e-01  217_[+1(4.47e-05)]_271
38892                            7.05e-05  159_[+1(7.12e-05)]_73_\
    [+3(1.16e-07)]_238
48944                            1.74e-05  275_[+3(8.00e-08)]_166_\
    [+1(7.36e-06)]_29
48355                            1.20e-05  200_[+2(5.92e-08)]_49_\
    [+1(5.89e-06)]_225
50128                            6.25e-02  288_[+1(3.80e-05)]_200
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
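The log-odds matrices above appear to be roughly 100 times the base-2 logarithm of each letter probability divided by its background frequency, rounded to an integer. The sketch below checks that reading against the first two rows of Motif 1. It is an assumption drawn from the numbers in this file, not a statement of MEME's exact procedure: the function name `log_odds_row` is hypothetical, zero probabilities are returned as `None` rather than reproducing the large negative floor values printed above, and the background frequencies are the three-decimal values listed in the command line summary.

```python
import math

# Background letter frequencies as printed in the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.265, "C": 0.239, "G": 0.223, "T": 0.273}
ALPHABET = "ACGT"


def log_odds_row(probs, background=BACKGROUND, scale=100):
    """Convert one letter-probability row to scaled log-odds scores.

    Assumed relation: score = round(scale * log2(p / background)).
    Zero probabilities yield None instead of the floor value in the file.
    """
    row = []
    for letter, p in zip(ALPHABET, probs):
        if p == 0.0:
            row.append(None)
        else:
            row.append(round(scale * math.log2(p / background[letter])))
    return row


# First two rows of the Motif 1 letter-probability matrix.
MOTIF1_ROWS = [
    [0.529412, 0.0, 0.470588, 0.0],
    [0.294118, 0.705882, 0.0, 0.0],
]

if __name__ == "__main__":
    for probs in MOTIF1_ROWS:
        print(log_odds_row(probs))
```

Scores computed this way can differ from the printed ones by a unit. For example, Motif 2 reports 191 for a position with probability 1.0 for A, where this formula gives 192. The cause may be the rounded background values or the prior that MEME applies, so treat the relation as approximate.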