********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/368/368.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
16511                    1.0000    500  24770                    1.0000    500
24813                    1.0000    500  260787                   1.0000    500
260891                   1.0000    500  260899                   1.0000    500
261692                   1.0000    500  262809                   1.0000    500
27273                    1.0000    500  41733                    1.0000    500
42965                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/368/368.seqs.fa -oc motifs/368 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.256 C 0.227 G 0.243 T 0.274
Background letter frequencies (from dataset with add-one prior applied):
A 0.256 C 0.227 G 0.243 T 0.274
********************************************************************************


********************************************************************************
MOTIF 1 MEME    width = 21   sites = 11   llr = 152   E-value = 7.8e-003
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 1137:46::8::54191161: pos.-specific C ::::12121:3:311:4:1:1 probability G 566:55:7821a1271:92:9 matrix T 53134:311:6:141:5:19: bits 2.1 * 1.9 * 1.7 * * 1.5 * * * ** Relative 1.3 * * * * ** Entropy 1.1 * ** * * * ** (19.9 bits) 0.9 ** *** * ** * ** 0.6 ************ **** ** 0.4 ************* ******* 0.2 ********************* 0.0 --------------------- Multilevel GGGAGGAGGATGAAGATGATG consensus TTATTAT C CT C sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 24770 141 4.30e-10 GTGGAGGACG TGATGAAGGATGATGATGATG CTGGGGCAAC 261692 215 2.61e-09 TTGTGGTTGC TGAAGGACGATGACGATGATG CGTCTGTCTT 260891 132 2.95e-09 ATGAGTGGAG GTGATGATGATGATGATGATG ATGATGATGG 24813 103 9.79e-09 GTCGGTATGG TGGAGAAGCACGAAGATGGTG GATGTGCATT 42965 261 5.72e-08 GCCTGAAGTG GGGTTCTGGAGGCTGACGATG TAAATTGTGT 41733 141 1.30e-07 GATAATGAAG TGGACGTCGACGCAGACGGTG ACGTCGCTGA 262809 198 2.17e-07 TAGAGGAAAG AGGAGATGGATGCACAAGATG TGTTTCAGGC 27273 133 3.01e-07 TTGATGGCGT GGGATACGGGCGAGTATGATG GCTGACGCTT 260899 195 6.96e-07 TCTTCTGGAG TTGAGGAGGATGAGAGCGAAG ACGAGCTTTG 16511 105 3.13e-06 ATTCCCACGT GAAATCAGTATGGAGACGCTG TCTTGATCTC 260787 12 2.67e-05 ACCAAAACGA GTTTGGAGGGTGTTGATATTC CGATTCACAG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM 
-------------  ----------------  -------------
24770          4.3e-10           140_[+1]_339
261692         2.6e-09           214_[+1]_265
260891         2.9e-09           131_[+1]_348
24813          9.8e-09           102_[+1]_377
42965          5.7e-08           260_[+1]_219
41733          1.3e-07           140_[+1]_339
262809         2.2e-07           197_[+1]_282
27273          3e-07             132_[+1]_347
260899         7e-07             194_[+1]_285
16511          3.1e-06           104_[+1]_375
260787         2.7e-05           11_[+1]_468
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=11
24770    ( 141) TGATGAAGGATGATGATGATG  1
261692   ( 215) TGAAGGACGATGACGATGATG  1
260891   ( 132) GTGATGATGATGATGATGATG  1
24813    ( 103) TGGAGAAGCACGAAGATGGTG  1
42965    ( 261) GGGTTCTGGAGGCTGACGATG  1
41733    ( 141) TGGACGTCGACGCAGACGGTG  1
262809   ( 198) AGGAGATGGATGCACAAGATG  1
27273    ( 133) GGGATACGGGCGAGTATGATG  1
260899   ( 195) TTGAGGAGGATGAGAGCGAAG  1
16511    ( 105) GAAATCAGTATGGAGACGCTG  1
260787   (  12) GTTTGGAGGGTGTTGATATTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 8.90388 E= 7.8e-003
  -149  -1010     90     73
  -149  -1010    139      0
     9  -1010    139   -159
   151  -1010  -1010      0
 -1010   -132    116     41
    51    -32     90  -1010
   131   -132  -1010      0
 -1010    -32    158   -159
 -1010   -132    175   -159
   168  -1010    -42  -1010
 -1010     26   -142    122
 -1010  -1010    204  -1010
   109     26   -142   -159
    51   -132    -42     41
  -149   -132    158   -159
   183  -1010   -142  -1010
  -149     68  -1010     99
  -149  -1010    190  -1010
   131   -132    -42   -159
  -149  -1010  -1010    173
 -1010   -132    190  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 11 E= 7.8e-003
 0.090909  0.000000  0.454545  0.454545
 0.090909  0.000000  0.636364  0.272727
 0.272727  0.000000  0.636364  0.090909
 0.727273  0.000000  0.000000  0.272727
 0.000000  0.090909  0.545455  0.363636
 0.363636  0.181818  0.454545  0.000000
 0.636364  0.090909  0.000000  0.272727
 0.000000  0.181818  0.727273  0.090909
 0.000000  0.090909  0.818182  0.090909
 0.818182  0.000000  0.181818  0.000000
 0.000000  0.272727  0.090909  0.636364
 0.000000  0.000000  1.000000  0.000000
 0.545455  0.272727  0.090909  0.090909
 0.363636  0.090909  0.181818  0.363636
 0.090909  0.090909  0.727273  0.090909
 0.909091  0.000000  0.090909  0.000000
 0.090909  0.363636  0.000000  0.545455
 0.090909  0.000000  0.909091  0.000000
 0.636364  0.090909  0.181818  0.090909
 0.090909  0.000000  0.000000  0.909091
 0.000000  0.090909  0.909091  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[GT][GT][GA][AT][GT][GA][AT]GGA[TC]G[AC][AT]GA[TC]GATG
--------------------------------------------------------------------------------

Time 1.18 secs.

******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 11 llr = 126 E-value = 2.5e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::92:a246334112: pos.-specific C 9a::7:142564:6:: probability G 1:::1:2:22115:82 matrix T ::182:53:::253:8 bits 2.1 * 1.9 * * 1.7 ** * 1.5 *** * Relative 1.3 **** * ** Entropy 1.1 ****** ** (16.5 bits) 0.9 ****** * *** 0.6 ****** *** **** 0.4 ****** **** **** 0.2 **************** 0.0 ---------------- Multilevel CCATCATAACCAGCGT consensus C AACTT sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 262809 455 1.18e-09 TCCATCAATA CCATCATAACCCTCGT AAAGACTCAT 260891 23 1.38e-08 AGAGGTAGCT CCATCATCGCCCTCGT CAATATGATA 260899 142 1.02e-06 ACATCTTCGT CCATTATAAACCTCAT TTTGTTGACG 260787 84 1.79e-06 AGCCTGTGCT CCATCAATCCGAGCGT TGCTCTATTT 42965 127 2.60e-06 CTGCTCTAAT CCATCATCAGCGGTGG CAGCAACCAT 24770 57 3.08e-06 GTATCATCTC CCATGAACAGCTGCGT TGATGATCTT 41733 374 4.55e-06 TGGCGGTCTG CCATCACAAAATTTGT GGCTGTCTTG 16511 228 6.17e-06 AGGAGTCAAT CCAACAGAACAAGTGG TACAAATACG 24813 6 7.17e-06 CAATC CCTACATTCACAGCGT TGTAAGACGT 261692 363 8.26e-06 CAGCCACCAG CCATCAGTACACAAGT TTTCTATCAG 27273 25 1.30e-05 TCCATTCATC GCATTATCGCCATCAT TGTCGACACG -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
262809         1.2e-09           454_[+2]_30
260891         1.4e-08           22_[+2]_462
260899         1e-06             141_[+2]_343
260787         1.8e-06           83_[+2]_401
42965          2.6e-06           126_[+2]_358
24770          3.1e-06           56_[+2]_428
41733          4.5e-06           373_[+2]_111
16511          6.2e-06           227_[+2]_257
24813          7.2e-06           5_[+2]_479
261692         8.3e-06           362_[+2]_122
27273          1.3e-05           24_[+2]_460
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=11
262809   ( 455) CCATCATAACCCTCGT  1
260891   (  23) CCATCATCGCCCTCGT  1
260899   ( 142) CCATTATAAACCTCAT  1
260787   (  84) CCATCAATCCGAGCGT  1
42965    ( 127) CCATCATCAGCGGTGG  1
24770    (  57) CCATGAACAGCTGCGT  1
41733    ( 374) CCATCACAAAATTTGT  1
16511    ( 228) CCAACAGAACAAGTGG  1
24813    (   6) CCTACATTCACAGCGT  1
261692   ( 363) CCATCAGTACACAAGT  1
27273    (  25) GCATTATCGCCATCAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5335 bayes= 8.91886 E= 2.5e+000
 -1010    200   -142  -1010
 -1010    214  -1010  -1010
   183  -1010  -1010   -159
   -49  -1010  -1010    158
 -1010    168   -142    -59
   197  -1010  -1010  -1010
   -49   -132    -42     99
    51     68  -1010      0
   131    -32    -42  -1010
     9    126    -42  -1010
     9    148   -142  -1010
    51     68   -142    -59
  -149  -1010     90     73
  -149    148  -1010      0
   -49  -1010    175  -1010
 -1010  -1010    -42    158
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 2.5e+000
 0.000000  0.909091  0.090909  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.909091  0.000000  0.000000  0.090909
 0.181818  0.000000  0.000000  0.818182
 0.000000  0.727273  0.090909  0.181818
 1.000000  0.000000  0.000000  0.000000
 0.181818  0.090909  0.181818  0.545455
 0.363636  0.363636  0.000000  0.272727
 0.636364  0.181818  0.181818  0.000000
 0.272727  0.545455  0.181818  0.000000
 0.272727  0.636364  0.090909  0.000000
 0.363636  0.363636  0.090909  0.181818
 0.090909  0.000000  0.454545  0.454545
 0.090909  0.636364  0.000000  0.272727
 0.181818  0.000000  0.818182  0.000000
 0.000000  0.000000  0.181818  0.818182
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
CCATCAT[ACT]A[CA][CA][AC][GT][CT]GT
--------------------------------------------------------------------------------

Time 2.32 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 12 sites = 9 llr = 100 E-value = 3.1e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :73::2::::9: pos.-specific C a3281:3::a11 probability G ::1:7::a1::: matrix T ::32287:9::9 bits 2.1 * * * 1.9 * * * 1.7 * * * 1.5 * * *** Relative 1.3 * * ***** Entropy 1.1 ** * ******* (16.0 bits) 0.9 ** ********* 0.6 ** ********* 0.4 ** ********* 0.2 ** ********* 0.0 ------------ Multilevel CAACGTTGTCAT consensus CTTTAC sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 262809 100 1.35e-07 CTTCCACATT CATCGTTGTCAT GTGCCTGCTT 260891 477 1.93e-07 CCTCGCAACC CACCGTTGTCAT AAAACCGCCA 42965 59 3.63e-07 GGACGAGTTG CATCGTCGTCAT GAGAACAGGA 16511 349 8.99e-07 AACGGTAGGC CAACGATGTCAT CATCTGCAAT 27273 405 6.06e-06 GCTGTTGCCT CCACCTCGTCAT TCTCGAAGAT 24813 226 8.77e-06 CTTATATGAC CCGCTTCGTCAT CGGAGTTTGT 260787 217 1.01e-05 CAAATACCGC CCCTTTTGTCAT TTTCGAGCTG 41733 97 1.56e-05 CAGATGATGA CATCGTTGGCCT ACCGTTGCCG 260899 12 2.16e-05 TTGTGATTTT CAATGATGTCAC GATACTCTTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM 
-------------  ----------------  -------------
262809         1.4e-07           99_[+3]_389
260891         1.9e-07           476_[+3]_12
42965          3.6e-07           58_[+3]_430
16511          9e-07             348_[+3]_140
27273          6.1e-06           404_[+3]_84
24813          8.8e-06           225_[+3]_263
260787         1e-05             216_[+3]_272
41733          1.6e-05           96_[+3]_392
260899         2.2e-05           11_[+3]_477
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
262809   ( 100) CATCGTTGTCAT  1
260891   ( 477) CACCGTTGTCAT  1
42965    (  59) CATCGTCGTCAT  1
16511    ( 349) CAACGATGTCAT  1
27273    ( 405) CCACCTCGTCAT  1
24813    ( 226) CCGCTTCGTCAT  1
260787   ( 217) CCCTTTTGTCAT  1
41733    (  97) CATCGTTGGCCT  1
260899   (  12) CAATGATGTCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 9.3553 E= 3.1e+000
  -982    214   -982   -982
   138     55   -982   -982
    38     -3   -113     28
  -982    177   -982    -30
  -982   -103    145    -30
   -20   -982   -982    151
  -982     55   -982    128
  -982   -982    204   -982
  -982   -982   -113    170
  -982    214   -982   -982
   180   -103   -982   -982
  -982   -103   -982    170
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 3.1e+000
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.333333  0.222222  0.111111  0.333333
 0.000000  0.777778  0.000000  0.222222
 0.000000  0.111111  0.666667  0.222222
 0.222222  0.000000  0.000000  0.777778
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.111111  0.888889
 0.000000  1.000000  0.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.000000  0.111111  0.000000  0.888889
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
C[AC][ATC][CT][GT][TA][TC]GTCAT
--------------------------------------------------------------------------------

Time 3.36 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME  COMBINED P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
16511          4.38e-07          104_[+1(3.13e-06)]_102_\
    [+2(6.17e-06)]_39_[+3(2.27e-05)]_54_[+3(8.99e-07)]_5_[+2(9.18e-05)]_119
24770          6.17e-08          56_[+2(3.08e-06)]_35_[+1(8.03e-07)]_\
    12_[+1(4.30e-10)]_339
24813          2.08e-08          5_[+2(7.17e-06)]_81_[+1(9.79e-09)]_\
    102_[+3(8.77e-06)]_263
260787         8.56e-06          11_[+1(2.67e-05)]_51_[+2(1.79e-06)]_\
    117_[+3(1.01e-05)]_272
260891         5.70e-13          22_[+2(1.38e-08)]_93_[+1(2.95e-09)]_\
    4_[+1(2.78e-06)]_299_[+3(1.93e-07)]_12
260899         3.88e-07          11_[+3(2.16e-05)]_118_\
    [+2(1.02e-06)]_37_[+1(6.96e-07)]_285
261692         8.70e-07          183_[+1(9.05e-05)]_10_\
    [+1(2.61e-09)]_127_[+2(8.26e-06)]_122
262809         2.30e-12          99_[+3(1.35e-07)]_86_[+1(2.17e-07)]_\
    236_[+2(1.18e-09)]_30
27273          5.78e-07          24_[+2(1.30e-05)]_92_[+1(3.01e-07)]_\
    251_[+3(6.06e-06)]_84
41733          2.47e-07          96_[+3(1.56e-05)]_32_[+1(1.30e-07)]_\
    212_[+2(4.55e-06)]_111
42965          2.21e-09          58_[+3(3.63e-07)]_56_[+2(2.60e-06)]_\
    118_[+1(5.72e-08)]_219
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
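
The block diagrams in this report encode each site as a run of spacer lengths and bracketed motif hits (for example 140_[+1]_339: 140 positions, then motif 1 on the + strand, then 339 positions). The sketch below is one possible way to turn such a diagram back into 1-based start positions. It is not part of MEME: the names parse_block_diagram, MOTIF_WIDTHS and SITE_RE are hypothetical, and the motif widths are simply copied from the three MOTIF header lines of this report.

```python
import re

# Motif widths as reported in the "MOTIF n ... width = w" header lines
# of this report (an assumption of this sketch, not read from the file).
MOTIF_WIDTHS = {1: 21, 2: 16, 3: 12}

# A site token looks like "[+1]" or "[+2(6.17e-06)]":
# strand, motif number, and an optional p-value in parentheses.
SITE_RE = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_block_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for one block diagram string.

    sites is a list of (motif, strand, start, p_value) tuples with
    1-based starts; p_value is None when the diagram has no p-value.
    """
    # Line-continuation backslashes and whitespace are layout only.
    cleaned = re.sub(r"[\\\s]", "", diagram)
    position = 0  # sequence positions consumed so far
    sites = []
    for token in cleaned.split("_"):
        if not token:
            continue
        match = SITE_RE.fullmatch(token)
        if match is None:
            position += int(token)  # spacer between sites
            continue
        strand = match.group(1)
        motif = int(match.group(2))
        p_value = float(match.group(3)) if match.group(3) else None
        sites.append((motif, strand, position + 1, p_value))
        position += widths[motif]
    return sites, position


if __name__ == "__main__":
    sites, total = parse_block_diagram("140_[+1]_339")
    print(sites, total)
```

Applied to the combined diagram for sequence 16511, this yields starts 105, 228, 283, 349 and 366 and a total length of 500, which agrees with the start positions and sequence length listed elsewhere in this report.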