********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/273/273.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11112                    1.0000    500  20365                    1.0000    500
21046                    1.0000    500  262343                   1.0000    500
268816                   1.0000    500  28816                    1.0000    500
31665                    1.0000    500  32996                    1.0000    500
37139                    1.0000    500  37442                    1.0000    500
38802                    1.0000    500  6606                     1.0000    500
8417                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/273/273.seqs.fa -oc motifs/273 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 13 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 6500 N= 13 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.260 C 0.242 G 0.239 T 0.259 Background letter frequencies (from dataset with add-one prior applied): A 0.260 C 0.242 G 0.239 T 0.259 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 9 llr = 121 E-value = 5.2e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A ::6232::a:271242 pos.-specific C :1:::11:::::2::: probability G 3948749a:a837831 matrix T 7::::2::::::::27 bits 2.1 * * 1.9 *** 1.7 * **** 1.4 * **** Relative 1.2 * * ***** * Entropy 1.0 ***** ****** * (19.4 bits) 0.8 ***** ******** * 0.6 ***** ******** * 0.4 ***** ********** 0.2 **************** 0.0 ---------------- Multilevel TGAGGGGGAGGAGGAT consensus G GAAA AGCAGA sequence T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 37139 166 8.13e-09 TGAAGAAGGA GGAGGAGGAGGAGGAT TGGTTGACCA 32996 315 1.33e-08 
TGGATTTGAG TGAGGGGGAGAAGGGT GTTGAGTGGC
11112           289  1.07e-07 CGTCGTTAGA GGGAGTGGAGGAGGAT ACGGTGGCCA
268816          262  3.68e-07 TGGTTTGGTT TGGGGCGGAGGGCGTT TGTGTTGTTG
37442            21  4.05e-07 TCAGATAGAC TGAGGGGGAGAGAGGT CGTCTTCATA
31665           432  4.05e-07 CTTCATTGTG TGAGAGCGAGGAGGAA GAAGTGTTCT
262343          197  5.51e-07 TCAGATTATT GCGGAGGGAGGAGGTT TAGGGTAGAC
21046           167  8.66e-07 AGTATGTTGG TGGAGAGGAGGAGAGA TAGGGTGGGG
6606            179  2.32e-06 GAGTTGTGAT TGAGATGGAGGGCAAG ATACGCATAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37139                             8.1e-09  165_[+1]_319
32996                             1.3e-08  314_[+1]_170
11112                             1.1e-07  288_[+1]_196
268816                            3.7e-07  261_[+1]_223
37442                             4.1e-07  20_[+1]_464
31665                             4.1e-07  431_[+1]_53
262343                            5.5e-07  196_[+1]_288
21046                             8.7e-07  166_[+1]_318
6606                              2.3e-06  178_[+1]_306
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=9
37139                    (  166) GGAGGAGGAGGAGGAT  1
32996                    (  315) TGAGGGGGAGAAGGGT  1
11112                    (  289) GGGAGTGGAGGAGGAT  1
268816                   (  262) TGGGGCGGAGGGCGTT  1
37442                    (   21) TGAGGGGGAGAGAGGT  1
31665                    (  432) TGAGAGCGAGGAGGAA  1
262343                   (  197) GCGGAGGGAGGAGGTT  1
21046                    (  167) TGGAGAGGAGGAGAGA  1
6606                     (  179) TGAGATGGAGGGCAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 10.856 E= 5.2e-002
  -982  -982    48   136
  -982  -112   189  -982
   110  -982    89  -982
   -22  -982   170  -982
    36  -982   148  -982
   -22  -112    89   -22
  -982  -112   189  -982
  -982  -982   206  -982
   194  -982  -982  -982
  -982  -982   206  -982
   -22  -982   170  -982
   136  -982    48  -982
  -122   -12   148  -982
   -22  -982   170  -982
    77  -982    48   -22
   -22  -982  -110   136
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 5.2e-002
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.111111  0.888889  0.000000
 0.555556  0.000000  0.444444  0.000000
 0.222222  0.000000  0.777778  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.222222  0.111111  0.444444  0.222222
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.000000  0.777778  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.111111  0.222222  0.666667  0.000000
 0.222222  0.000000  0.777778  0.000000
 0.444444  0.000000  0.333333  0.222222
 0.222222  0.000000  0.111111  0.666667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[TG]G[AG][GA][GA][GAT]GGAG[GA][AG][GC][GA][AGT][TA]
--------------------------------------------------------------------------------

Time 1.52 secs.

******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 13 llr = 145 E-value = 2.7e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 72:8:681:73152:9 pos.-specific C 258:74:2a1:8:18: probability G :11:2::::15:2221 matrix T 13122:27:22126:: bits 2.1 * 1.9 * 1.7 * * 1.4 * ** Relative 1.2 ** * * * ** Entropy 1.0 ** ** * * ** (16.1 bits) 0.8 * ******* * ** 0.6 * *********** ** 0.4 * ************** 0.2 **************** 0.0 ---------------- Multilevel ACCACAATCAGCATCA consensus CT C C A G sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 11112 103 3.26e-08 CACCCGTAGC ACCATAATCAGCGTCA TCGTCAGTGG 32996 76 1.71e-07 CAGTGATGCC ACCACCACCAACAACA ACAACAACAG 8417 142 4.73e-07 GTTGGTGGAG ACCTCAATCAGCTTGA CAGTGTAGAG 6606 90 5.22e-07 ATGATGTACC AGCAGCATCAACATCA CTCACATGGC 21046 15 7.12e-07 CATCGTCGAA ACCACCATCGTCGTCA AACTTTTTAC 37139 485 1.05e-06 TCTCAACCAA CACACCACCAGCAGCA 28816 51 1.53e-06 CGCTTGTATC CACAGAATCTGCATCA CTGATGAAAG 37442 445 3.65e-06 GGGACAGCCA ATCACATCCAGCTCCA GCCGAGCCAT 20365 459 3.65e-06 ACGTCCAAAC ATCACCATCCAAATCA CTCCAAACAT 38802 120 1.14e-05 CATATTAACC CCTACAATCAGTAACA AATTATTTTC 268816 82 1.31e-05 CTCCCTATGA ATGTCAAACAGCGTCA ACCCCGGAAC 31665 174 1.60e-05 GAGTACCTCA TCCACAATCAACTTGG ATAGGATCGA 262343 406 1.95e-05 ACCTACATCC ATCATATTCTTCAGCA GGACGAATCT 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11112                             3.3e-08  102_[+2]_382
32996                             1.7e-07  75_[+2]_409
8417                              4.7e-07  141_[+2]_343
6606                              5.2e-07  89_[+2]_395
21046                             7.1e-07  14_[+2]_470
37139                             1.1e-06  484_[+2]
28816                             1.5e-06  50_[+2]_434
37442                             3.7e-06  444_[+2]_40
20365                             3.7e-06  458_[+2]_26
38802                             1.1e-05  119_[+2]_365
268816                            1.3e-05  81_[+2]_403
31665                             1.6e-05  173_[+2]_311
262343                            1.9e-05  405_[+2]_79
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=13
11112                    (  103) ACCATAATCAGCGTCA  1
32996                    (   76) ACCACCACCAACAACA  1
8417                     (  142) ACCTCAATCAGCTTGA  1
6606                     (   90) AGCAGCATCAACATCA  1
21046                    (   15) ACCACCATCGTCGTCA  1
37139                    (  485) CACACCACCAGCAGCA  1
28816                    (   51) CACAGAATCTGCATCA  1
37442                    (  445) ATCACATCCAGCTCCA  1
20365                    (  459) ATCACCATCCAAATCA  1
38802                    (  120) CCTACAATCAGTAACA  1
268816                   (   82) ATGTCAAACAGCGTCA  1
31665                    (  174) TCCACAATCAACTTGG  1
262343                   (  406) ATCATATTCTTCAGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 8.91886 E= 2.7e-001
   141    -7 -1035  -175
   -75    93  -163    25
 -1035   180  -163  -175
   170 -1035 -1035   -75
 -1035   151   -64   -75
   124    67 -1035 -1035
   170 -1035 -1035   -75
  -175    -7 -1035   142
 -1035   205 -1035 -1035
   141  -165  -163   -75
    24 -1035   117   -75
  -175   180 -1035  -175
   105 -1035    -5   -17
   -75  -165   -64   125
 -1035   180   -64 -1035
   183 -1035  -163 -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 2.7e-001
 0.692308  0.230769  0.000000  0.076923
 0.153846  0.461538  0.076923  0.307692
 0.000000  0.846154  0.076923  0.076923
 0.846154  0.000000  0.000000  0.153846
 0.000000  0.692308  0.153846  0.153846
 0.615385  0.384615  0.000000  0.000000
 0.846154  0.000000  0.000000  0.153846
 0.076923  0.230769  0.000000  0.692308
 0.000000  1.000000  0.000000  0.000000
 0.692308  0.076923  0.076923  0.153846
 0.307692  0.000000  0.538462  0.153846
 0.076923  0.846154  0.000000  0.076923
 0.538462  0.000000  0.230769  0.230769
 0.153846  0.076923  0.153846  0.615385
 0.000000  0.846154  0.153846  0.000000
 0.923077  0.000000  0.076923  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[AC][CT]CAC[AC]A[TC]CA[GA]C[AGT]TCA
--------------------------------------------------------------------------------

Time 2.98 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 9 llr = 115 E-value = 1.2e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :2121::::13:::2: pos.-specific C :121:4::::1::::: probability G a:6224a6:746aa:8 matrix T :71471:4a214::82 bits 2.1 * * ** 1.9 * * * ** 1.7 * * * ** 1.4 * * * ** Relative 1.2 * * * **** Entropy 1.0 * *** ***** (18.4 bits) 0.8 ** * **** ***** 0.6 ** ****** ***** 0.4 *** ****** ***** 0.2 **************** 0.0 ---------------- Multilevel GTGTTCGGTGGGGGTG consensus ACAGG T TAT AT sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 37139 205 3.31e-09 TGTGATCGAT GTGATGGGTGGGGGTG CACCGACGAG 38802 22 7.26e-09 GTAAGGCAAA GTGTTCGGTTGGGGTG GCCCGTTGAA 20365 318 4.75e-07 TTTGTTGCTC GTGCTCGTTGATGGTT GCATCGTTCT 31665 89 6.33e-07 AACAGAGAGT GTATAGGTTGGGGGTG TTGGAAATGT 268816 241 8.21e-07 GGCGACGAGG GAGTTGGGTGCTGGTT TGGTTTGGGG 11112 242 8.99e-07 TGTCTCGCCT GACGGCGGTGATGGTG ATGATTCCAT 21046 207 1.14e-06 TCGTCATGTT GCCTTCGTTGTGGGTG GGCTCGCGTT 262343 65 3.37e-06 ACCAGTATGC GTTGTGGGTAAGGGAG TTCCAATGGG 32996 375 4.38e-06 AGAAGAACGG GTGAGTGTTTGTGGAG AATGAATGGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37139                             3.3e-09  204_[+3]_280
38802                             7.3e-09  21_[+3]_463
20365                             4.7e-07  317_[+3]_167
31665                             6.3e-07  88_[+3]_396
268816                            8.2e-07  240_[+3]_244
11112                               9e-07  241_[+3]_243
21046                             1.1e-06  206_[+3]_278
262343                            3.4e-06  64_[+3]_420
32996                             4.4e-06  374_[+3]_110
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=9
37139                    (  205) GTGATGGGTGGGGGTG  1
38802                    (   22) GTGTTCGGTTGGGGTG  1
20365                    (  318) GTGCTCGTTGATGGTT  1
31665                    (   89) GTATAGGTTGGGGGTG  1
268816                   (  241) GAGTTGGGTGCTGGTT  1
11112                    (  242) GACGGCGGTGATGGTG  1
21046                    (  207) GCCTTCGTTGTGGGTG  1
262343                   (   65) GTTGTGGGTAAGGGAG  1
32996                    (  375) GTGAGTGTTTGTGGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 10.856 E= 1.2e+001
  -982  -982   206  -982
   -22  -112  -982   136
  -122   -12   121  -122
   -22  -112   -11    78
  -122  -982   -11   136
  -982    88    89  -122
  -982  -982   206  -982
  -982  -982   121    78
  -982  -982  -982   195
  -122  -982   148   -22
    36  -112    89  -122
  -982  -982   121    78
  -982  -982   206  -982
  -982  -982   206  -982
   -22  -982  -982   159
  -982  -982   170   -22
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 1.2e+001
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.111111  0.000000  0.666667
 0.111111  0.222222  0.555556  0.111111
 0.222222  0.111111  0.222222  0.444444
 0.111111  0.000000  0.222222  0.666667
 0.000000  0.444444  0.444444  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.555556  0.444444
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.000000  0.666667  0.222222
 0.333333  0.111111  0.444444  0.111111
 0.000000  0.000000  0.555556  0.444444
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.000000  0.000000  0.777778
 0.000000  0.000000  0.777778  0.222222
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
G[TA][GC][TAG][TG][CG]G[GT]T[GT][GA][GT]GG[TA][GT]
--------------------------------------------------------------------------------

Time 4.38 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11112                            1.57e-10  102_[+2(3.26e-08)]_123_\
    [+3(8.99e-07)]_31_[+1(1.07e-07)]_196
20365                            2.85e-05  317_[+3(4.75e-07)]_125_\
    [+2(3.65e-06)]_26
21046                            2.37e-08  14_[+2(7.12e-07)]_136_\
    [+1(8.66e-07)]_24_[+3(1.14e-06)]_278
262343                           8.48e-07  64_[+3(3.37e-06)]_116_\
    [+1(5.51e-07)]_193_[+2(1.95e-05)]_79
268816                           1.14e-07  81_[+2(1.31e-05)]_143_\
    [+3(8.21e-07)]_5_[+1(3.68e-07)]_223
28816                            4.58e-03  50_[+2(1.53e-06)]_434
31665                            1.18e-07  88_[+3(6.33e-07)]_69_[+2(1.60e-05)]_\
    242_[+1(4.05e-07)]_53
32996                            4.62e-10  75_[+2(1.71e-07)]_223_\
    [+1(1.33e-08)]_44_[+3(4.38e-06)]_110
37139                            1.91e-12  165_[+1(8.13e-09)]_23_\
    [+3(3.31e-09)]_264_[+2(1.05e-06)]
37442                            2.82e-05  20_[+1(4.05e-07)]_408_\
    [+2(3.65e-06)]_40
38802                            2.77e-06  21_[+3(7.26e-09)]_82_[+2(1.14e-05)]_\
    365
6606                             1.99e-05  89_[+2(5.22e-07)]_73_[+1(2.32e-06)]_\
    306
8417                             9.55e-04  141_[+2(4.73e-07)]_343
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************