********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/115/115.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42659                    1.0000    500  47395                    1.0000    500
47788                    1.0000    500  54969                    1.0000    500
44639                    1.0000    500  1884                     1.0000    500
51876                    1.0000    500  12346                    1.0000    500
43509                    1.0000    500  44393                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/115/115.seqs.fa -oc motifs/115 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.264 C 0.245 G 0.227 T 0.264
Background letter frequencies (from dataset with add-one prior applied):
A 0.264 C 0.245 G 0.227 T 0.264
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 9 llr = 113 E-value = 2.9e+001
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 722:663a181737:a pos.-specific C :4::24::9:1:4:3: probability G :189::7::28::37: matrix T 32:12::::::32::: bits 2.1 1.9 * * 1.7 * * * 1.5 * ** * Relative 1.3 ** *** ** Entropy 1.1 * ** ******* *** (18.1 bits) 0.9 * ** ******* *** 0.6 * ** ******* *** 0.4 * ************** 0.2 **************** 0.0 ---------------- Multilevel ACGGAAGACAGACAGA consensus TAA CCA G TAGC sequence T T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 51876 48 2.09e-10 CGGTATCGAA ACGGAAGACAGACAGA CTAGCCGCCG 47788 193 2.89e-08 TGAAACCCTC ACGGTAGACAGATAGA TAGATAGATA 42659 267 4.78e-08 CCGTAAAGCT AAGGACGACGGACAGA TGCTATTAGC 47395 77 2.37e-07 TCCATACCGG ATGGCAAACAGACGGA TCTTTCACTA 1884 132 1.47e-06 CACGAGCTCT AAAGAAGACAGTAGCA AGATGGATCA 54969 13 3.50e-06 AACAATAACA AGGTAAAACAGATGGA AACGAAACAA 44393 329 4.41e-06 CAAATGCGAA TTGGTCAACACACACA CGCACCCGAA 12346 58 4.41e-06 TCTCTCTTTG TCAGCCGACGGTAACA GGTATCAAGA 44639 159 4.64e-06 TCTTGATGTC TCGGACGAAAATAAGA GTGGTTCCTC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 51876 2.1e-10 47_[+1]_437 47788 2.9e-08 192_[+1]_292 42659 4.8e-08 266_[+1]_218 47395 2.4e-07 76_[+1]_408 1884 1.5e-06 131_[+1]_353 54969 3.5e-06 12_[+1]_472 44393 4.4e-06 328_[+1]_156 
12346                             4.4e-06  57_[+1]_427
44639                             4.6e-06  158_[+1]_326
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=9
51876                    (   48) ACGGAAGACAGACAGA  1
47788                    (  193) ACGGTAGACAGATAGA  1
42659                    (  267) AAGGACGACGGACAGA  1
47395                    (   77) ATGGCAAACAGACGGA  1
1884                     (  132) AAAGAAGACAGTAGCA  1
54969                    (   13) AGGTAAAACAGATGGA  1
44393                    (  329) TTGGTCAACACACACA  1
12346                    (   58) TCAGCCGACGGTAACA  1
44639                    (  159) TCGGACGAAAATAAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 9.92035 E= 2.9e+001
   134   -982   -982     33
   -25     86   -103    -25
   -25   -982    178   -982
  -982   -982    197   -125
   107    -14   -982    -25
   107     86   -982   -982
    34   -982    155   -982
   192   -982   -982   -982
  -125    186   -982   -982
   156   -982     -3   -982
  -125   -114    178   -982
   134   -982   -982     33
    34     86   -982    -25
   134   -982     55   -982
  -982     44    155   -982
   192   -982   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 2.9e+001
 0.666667  0.000000  0.000000  0.333333
 0.222222  0.444444  0.111111  0.222222
 0.222222  0.000000  0.777778  0.000000
 0.000000  0.000000  0.888889  0.111111
 0.555556  0.222222  0.000000  0.222222
 0.555556  0.444444  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.111111  0.888889  0.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.111111  0.111111
0.777778 0.000000 0.666667 0.000000 0.000000 0.333333 0.333333 0.444444 0.000000 0.222222 0.666667 0.000000 0.333333 0.000000 0.000000 0.333333 0.666667 0.000000 1.000000 0.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [AT][CAT][GA]G[ACT][AC][GA]AC[AG]G[AT][CAT][AG][GC]A -------------------------------------------------------------------------------- Time 0.98 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 10 llr = 102 E-value = 1.3e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :::2::1::3:: pos.-specific C :a2:131:::83 probability G 6:1:::8:2621 matrix T 4:7897:a81:6 bits 2.1 1.9 * * 1.7 * * 1.5 * * * Relative 1.3 * ** *** * Entropy 1.1 ** ****** * (14.8 bits) 0.9 *********** 0.6 ************ 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel GCTTTTGTTGCT consensus T CA C GAGC sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 51876 176 7.79e-07 TCGTTCGTCT TCTTTTGTTACT TTATAATTCA 47395 463 2.13e-06 ACTCTTCGAT TCTTTCGTTGCC TTACTTCCAG 12346 480 2.72e-06 GTTACCGAAC GCCTTTGTGGCT TCCATCGTC 1884 39 4.37e-06 ATACACGACA 
GCCTTTGTTACC CATTCTCAAA
44639                       105  9.01e-06 TCTACCCATT GCTATTGTTTCT ATTTCCAGGA
54969                       177  9.99e-06 CGTTTACTGT GCTTCTGTTGGT CAAAATATGT
43509                       464  1.07e-05 CAAGCCATCA TCTTTTATTGCC CTCGACAGTA
47788                       134  1.33e-05 AATGTCGGGT GCTTTCGTTGGG ACTCTTCGCG
44393                       282  1.84e-05 TGTAAGCTTC TCTATTCTTGCT GGTCCCCACA
42659                       226  2.92e-05 GAGTTTTTTA GCGTTCGTGACT CAAAACAGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
51876                             7.8e-07  175_[+2]_313
47395                             2.1e-06  462_[+2]_26
12346                             2.7e-06  479_[+2]_9
1884                              4.4e-06  38_[+2]_450
44639                               9e-06  104_[+2]_384
54969                               1e-05  176_[+2]_312
43509                             1.1e-05  463_[+2]_25
47788                             1.3e-05  133_[+2]_355
44393                             1.8e-05  281_[+2]_207
42659                             2.9e-05  225_[+2]_263
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=10
51876                    (  176) TCTTTTGTTACT  1
47395                    (  463) TCTTTCGTTGCC  1
12346                    (  480) GCCTTTGTGGCT  1
1884                     (   39) GCCTTTGTTACC  1
44639                    (  105) GCTATTGTTTCT  1
54969                    (  177) GCTTCTGTTGGT  1
43509                    (  464) TCTTTTATTGCC  1
47788                    (  134) GCTTTCGTTGGG  1
44393                    (  282) TCTATTCTTGCT  1
42659                    (  226) GCGTTCGTGACT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 9.18275 E= 1.3e+002
  -997   -997    140     60
  -997    203   -997   -997
  -997    -29   -118    140
   -40   -997   -997    160
  -997   -129   -997    177
  -997     29   -997    140
  -140   -129    182   -997
  -997   -997   -997    192
  -997   -997    -18    160
    19   -997    140   -140
  -997    171    -18   -997
  -997     29   -118    118
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 1.3e+002
 0.000000  0.000000  0.600000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.100000  0.700000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.100000  0.000000  0.900000
 0.000000  0.300000  0.000000  0.700000
 0.100000  0.100000  0.800000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.200000  0.800000
 0.300000  0.000000  0.600000  0.100000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.300000  0.100000  0.600000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GT]C[TC][TA]T[TC]GT[TG][GA][CG][TC]
--------------------------------------------------------------------------------


Time 1.98 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 20 sites = 8 llr = 118 E-value = 1.8e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 114::::::1:6:::11::3 pos.-specific C :35661::1::::98::43: probability G 911331a1695141394:68 matrix T :5:118:93:536:::561: bits 2.1 * 1.9 * 1.7 * 1.5 * ** * * * Relative 1.3 * ** * *** * Entropy 1.1 * ** ** **** * * (21.4 bits) 0.9 * ****** **** *** 0.6 * ****************** 0.4 * ****************** 0.2 ******************** 0.0 -------------------- Multilevel GTCCCTGTGGGATCCGTTGG consensus CAGG T TTG G GCCA sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 47395 295 2.89e-09 AGATGGGTTG GCAGCTGTGGTATCCGATGG TATGATGGCG 43509 42 2.20e-08 CTGGGGTATT GTGCCTGTGGTTGCCGTTCA GAGCTGCTAC 47788 30 2.95e-08 CGAGGTACAC GTCTCTGTGGGATCGATCGG CAGTGACTGA 44639 254 3.93e-08 TTGCGAAAAT GTCCGTGTCGTATCGGTTTG ATAAGTAAAA 51876 443 9.44e-08 TAGTTGATCC GAAGCTGTGAGATCCGGCCG GATAGAAGTC 12346 332 1.03e-07 TCGATGCGGC GCACCGGGTGGAGCCGTCGG CGGTCTTTCG 42659 57 1.21e-07 TCCACCTTGC AGCCTTGTGGGGTCCGGTGG CTTTCTTGTT 1884 477 5.27e-07 TACACCCGTG GTCCGCGTTGTTGGCGGTGA TTCG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47395                             2.9e-09  294_[+3]_186
43509                             2.2e-08  41_[+3]_439
47788                               3e-08  29_[+3]_451
44639                             3.9e-08  253_[+3]_227
51876                             9.4e-08  442_[+3]_38
12346                               1e-07  331_[+3]_149
42659                             1.2e-07  56_[+3]_424
1884                              5.3e-07  476_[+3]_4
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=8
47395                    (  295) GCAGCTGTGGTATCCGATGG  1
43509                    (   42) GTGCCTGTGGTTGCCGTTCA  1
47788                    (   30) GTCTCTGTGGGATCGATCGG  1
44639                    (  254) GTCCGTGTCGTATCGGTTTG  1
51876                    (  443) GAAGCTGTGAGATCCGGCCG  1
12346                    (  332) GCACCGGGTGGAGCCGTCGG  1
42659                    (   57) AGCCTTGTGGGGTCCGGTGG  1
1884                     (  477) GTCCGCGTTGTTGGCGGTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4810 bayes= 9.22942 E= 1.8e+001
  -108   -965    195   -965
  -108      3    -86     92
    51    103    -86   -965
  -965    135     14   -108
  -965    135     14   -108
  -965    -97    -86    150
  -965   -965    214   -965
  -965   -965    -86    173
  -965    -97    146     -8
  -108   -965    195   -965
  -965   -965    114     92
   124   -965    -86     -8
  -965   -965     72    124
  -965    184    -86   -965
  -965    161     14   -965
  -108   -965    195   -965
  -108   -965     72     92
  -965     61   -965    124
  -965      3    146   -108
    -8   -965    172   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 1.8e+001
 0.125000  0.000000  0.875000  0.000000
 0.125000  0.250000  0.125000  0.500000
 0.375000  0.500000  0.125000  0.000000
 0.000000  0.625000  0.250000  0.125000
 0.000000  0.625000  0.250000  0.125000
 0.000000  0.125000  0.125000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.125000  0.625000  0.250000
 0.125000  0.000000  0.875000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.625000  0.000000  0.125000  0.250000
 0.000000  0.000000  0.375000  0.625000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.125000  0.000000  0.875000  0.000000
 0.125000  0.000000  0.375000  0.500000
 0.000000  0.375000  0.000000  0.625000
 0.000000  0.250000  0.625000  0.125000
 0.250000  0.000000  0.750000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[TC][CA][CG][CG]TGT[GT]G[GT][AT][TG]C[CG]G[TG][TC][GC][GA]
--------------------------------------------------------------------------------


Time 2.89 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42659                            6.28e-09  56_[+3(1.21e-07)]_149_\
    [+2(2.92e-05)]_29_[+1(4.78e-08)]_218
47395                            7.71e-11  76_[+1(2.37e-07)]_86_[+2(9.38e-05)]_\
    104_[+3(2.89e-09)]_148_[+2(2.13e-06)]_26
47788                            5.19e-10  29_[+3(2.95e-08)]_84_[+2(1.33e-05)]_\
    47_[+1(2.89e-08)]_292
54969                            5.47e-04  12_[+1(3.50e-06)]_148_\
    [+2(9.99e-06)]_262_[+1(2.31e-05)]_34
44639                            5.11e-08  104_[+2(9.01e-06)]_42_\
    [+1(4.64e-06)]_79_[+3(3.93e-08)]_227
1884                             9.93e-08  38_[+2(4.37e-06)]_81_[+1(1.47e-06)]_\
    329_[+3(5.27e-07)]_4
51876                            1.07e-12  47_[+1(2.09e-10)]_112_\
    [+2(7.79e-07)]_255_[+3(9.44e-08)]_38
12346                            3.93e-08  57_[+1(4.41e-06)]_258_\
    [+3(1.03e-07)]_128_[+2(2.72e-06)]_9
43509                            7.72e-06  41_[+3(2.20e-08)]_215_\
    [+3(4.14e-05)]_167_[+2(1.07e-05)]_25
44393                            1.19e-03  281_[+2(1.84e-05)]_35_\
    [+1(4.41e-06)]_156
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************