********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/486/486.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37164                    1.0000    500  13895                    1.0000    500
21736                    1.0000    500  37908                    1.0000    500
54791                    1.0000    500  42247                    1.0000    500
43374                    1.0000    500  9883                     1.0000    500
43541                    1.0000    500  43657                    1.0000    500
41413                    1.0000    500  34177                    1.0000    500
12411                    1.0000    500  35561                    1.0000    500
46071                    1.0000    500  46303                    1.0000    500
43570                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/486/486.seqs.fa -oc motifs/486 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.249 C 0.253 G 0.235 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.249 C 0.253 G 0.235 T 0.263
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 10 llr = 149 E-value = 3.8e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  328:::::9363339277348
pos.-specific     C  :::7::11132343142132:
probability       G  :8:::a:9:4:424:4:244:
matrix            T  7:23a:9:::2:1:::1:::2

         bits    2.1      *
                 1.9     **
                 1.7     ** *
                 1.5     *****     *
Relative         1.3  ** *****     *     *
Entropy          1.0 *********     *     *
(21.5 bits)      0.8 *********     * **  *
                 0.6 ********* *   * **  *
                 0.4 ************ ********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TGACTGTGAGAGCGACAAGAA
consensus            AATT     ACAAA GCGAGT
sequence                      CTCGC A  CC

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
43374                       401  1.04e-09 CACGGTATCT TGACTGTGAAACCGAGAAGAT TTCCAGTCGG
37164                       102  1.19e-09 GTTATCGAGT TGACTGTGAAAGGAAAAACGA GCGCGAAACG
42247                        54  2.88e-09 GTCCCTTGGC TGACTGTGAGAGCACGAAAGA AGCACCGCCG
34177                       282  3.60e-08 TTACTGTTTG TGTTTGTGACCACGACAACAA TCCAACGAAG
13895                        43  1.04e-07 GTCCCCCCGC TGACTGTGAATCACACCGGCA CGTACGACGA
41413                       171  1.64e-07 TAGATGTGAT AAACTGTGACCAAAACCACGA TCAAACCTTC
46071                       132  2.19e-07 AAGGGAAACC AGATTGTCAGAATGAAAAGAA GCTCGAAGGC
54791                       411  2.52e-07 CGTGGGACCA AATCTGTGAGAGGCACACGAA CGCCAAGGCT
9883                         69  2.69e-07 TGACTGTGAG TGACTGTGAGTGACAGTGAGT GGAACGATAG
43570                       297  3.07e-07 GGCAATAGGT TGATTGCGCCACCGAGAAACA ACGAAGCTAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43374                               1e-09  400_[+1]_79
37164                             1.2e-09  101_[+1]_378
42247                             2.9e-09  53_[+1]_426
34177                             3.6e-08  281_[+1]_198
13895                               1e-07  42_[+1]_437
41413                             1.6e-07  170_[+1]_309
46071                             2.2e-07  131_[+1]_348
54791                             2.5e-07  410_[+1]_69
9883                              2.7e-07  68_[+1]_411
43570                             3.1e-07  296_[+1]_183
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=10
43374                    (  401) TGACTGTGAAACCGAGAAGAT  1
37164                    (  102) TGACTGTGAAAGGAAAAACGA  1
42247                    (   54) TGACTGTGAGAGCACGAAAGA  1
34177                    (  282) TGTTTGTGACCACGACAACAA  1
13895                    (   43) TGACTGTGAATCACACCGGCA  1
41413                    (  171) AAACTGTGACCAAAACCACGA  1
46071                    (  132) AGATTGTCAGAATGAAAAGAA  1
54791                    (  411) AATCTGTGAGAGGCACACGAA  1
9883                     (   69) TGACTGTGAGTGACAGTGAGT  1
43570                    (  297) TGATTGCGCCACCGAGAAACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 9.92248 E= 3.8e-001
    27   -997   -997    141
   -32   -997    177   -997
   168   -997   -997    -40
  -997    147   -997     19
  -997   -997   -997    192
  -997   -997    209   -997
  -997   -134   -997    177
  -997   -134    194   -997
   185   -134   -997   -997
    27     25     77   -997
   127    -34   -997    -40
    27     25     77   -997
    27     66    -23   -139
    27     25     77   -997
   185   -134   -997   -997
   -32     66     77   -997
   149    -34   -997   -139
   149   -134    -23   -997
    27     25     77   -997
    68    -34     77   -997
   168   -997   -997    -40
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 3.8e-001
 0.300000  0.000000  0.000000  0.700000
 0.200000  0.000000  0.800000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.700000  0.000000  0.300000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.100000  0.000000  0.900000
 0.000000  0.100000  0.900000  0.000000
 0.900000  0.100000  0.000000  0.000000
 0.300000  0.300000  0.400000  0.000000
 0.600000  0.200000  0.000000  0.200000
 0.300000  0.300000  0.400000  0.000000
 0.300000  0.400000  0.200000  0.100000
 0.300000  0.300000  0.400000  0.000000
 0.900000  0.100000  0.000000  0.000000
 0.200000  0.400000  0.400000  0.000000
 0.700000  0.200000  0.000000  0.100000
 0.700000  0.100000  0.200000  0.000000
 0.300000  0.300000  0.400000  0.000000
 0.400000  0.200000  0.400000  0.000000
 0.800000  0.000000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TA][GA][AT][CT]TGTGA[GAC][ACT][GAC][CAG][GAC]A[CGA][AC][AG][GAC][AGC][AT]
--------------------------------------------------------------------------------

Time 3.27 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 14 llr = 138 E-value = 6.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1:414:1:743:
pos.-specific     C  :9164::a::2a
probability       G  :::1:a::33::
matrix            T  91512:9::35:

         bits    2.1      *
                 1.9      * *   *
                 1.7  *   * *   *
                 1.5 **   ***   *
Relative         1.3 **   ****  *
Entropy          1.0 **   ****  *
(14.2 bits)      0.8 **   ****  *
                 0.6 ***  ****  *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCTCAGTCAATC
consensus              A C   GGA
sequence                 T    TC

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
21736                       476  2.78e-07 TTTCGACGTA TCACAGTCAATC TCTTCTCGCC
43657                       228  6.20e-07 ACATGTCATG TCACAGTCAGTC CCCTTGCAAA
46303                       136  1.17e-06 TTGGGATAAC TCACAGTCATTC CTCGTCGGGC
41413                       285  1.17e-06 ACGTTCACTG TCTCAGTCAGAC ATCTATTGAC
37164                       485  1.38e-06 CGCAGCTTTC TCTCCGTCGATC CATC
43374                       310  3.80e-06 AAGCAACCCG TCACAGTCAGCC CGAGGACCTT
43570                       189  1.21e-05 AATCGGCGCT TCTCTGTCGGAC TTACACTGTT
13895                       105  1.51e-05 GGACGATGAT TCATTGTCAATC CACGGCTGAT
9883                        128  2.09e-05 CATCGTTGGA TCTGCGTCATCC CTACGACGTG
12411                        98  3.21e-05 TGTGCCTTTT TCTACGTCGTAC GCCATCGACG
54791                       456  3.21e-05 CCCCCGATAT TCCAAGTCAACC TTTCAACAAG
46071                       175  4.69e-05 TTTTGCTGTC TTCCCGTCAATC GAACAGATAG
43541                       272  9.18e-05 AAAACATTCG ACTTTGTCAAAC CCGAAATCCT
34177                       229  9.61e-05 GCGGGAATGT TCTGCGACGTTC GCAACATTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21736                             2.8e-07  475_[+2]_13
43657                             6.2e-07  227_[+2]_261
46303                             1.2e-06  135_[+2]_353
41413                             1.2e-06  284_[+2]_204
37164                             1.4e-06  484_[+2]_4
43374                             3.8e-06  309_[+2]_179
43570                             1.2e-05  188_[+2]_300
13895                             1.5e-05  104_[+2]_384
9883                              2.1e-05  127_[+2]_361
12411                             3.2e-05  97_[+2]_391
54791                             3.2e-05  455_[+2]_33
46071                             4.7e-05  174_[+2]_314
43541                             9.2e-05  271_[+2]_217
34177                             9.6e-05  228_[+2]_260
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=14
21736                    (  476) TCACAGTCAATC  1
43657                    (  228) TCACAGTCAGTC  1
46303                    (  136) TCACAGTCATTC  1
41413                    (  285) TCTCAGTCAGAC  1
37164                    (  485) TCTCCGTCGATC  1
43374                    (  310) TCACAGTCAGCC  1
43570                    (  189) TCTCTGTCGGAC  1
13895                    (  105) TCATTGTCAATC  1
9883                     (  128) TCTGCGTCATCC  1
12411                    (   98) TCTACGTCGTAC  1
54791                    (  456) TCCAAGTCAACC  1
46071                    (  175) TTCCCGTCAATC  1
43541                    (  272) ACTTTGTCAAAC  1
34177                    (  229) TCTGCGACGTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.81792 E= 6.5e+001
  -180  -1045  -1045    182
 -1045    187  -1045   -188
    52    -82  -1045     93
   -80    117    -71    -88
    78     50  -1045    -30
 -1045  -1045    209  -1045
  -180  -1045  -1045    182
 -1045    198  -1045  -1045
   152  -1045     28  -1045
    78  -1045     28     12
    20    -24  -1045     93
 -1045    198  -1045  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 6.5e+001
 0.071429  0.000000  0.000000  0.928571
 0.000000  0.928571  0.000000  0.071429
 0.357143  0.142857  0.000000  0.500000
 0.142857  0.571429  0.142857  0.142857
 0.428571  0.357143  0.000000  0.214286
 0.000000  0.000000  1.000000  0.000000
 0.071429  0.000000  0.000000  0.928571
 0.000000  1.000000  0.000000  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.428571  0.000000  0.285714  0.285714
 0.285714  0.214286  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TC[TA]C[ACT]GTC[AG][AGT][TAC]C
--------------------------------------------------------------------------------

Time 6.45 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 4 llr = 84 E-value = 2.9e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a555:3:85:::aa8::8a83
pos.-specific     C  :5::8833::::::35:::38
probability       G  ::55::8:5:aa:::3a3:::
matrix            T  ::::3::::a:::::3:::::

         bits    2.1 *         ****  * *
                 1.9 *        *****  * *
                 1.7 *        *****  * *
                 1.5 *        *****  * *
Relative         1.3 *   **** ****** *****
Entropy          1.0 *************** *****
(30.4 bits)      0.8 *************** *****
                 0.6 *************** *****
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AAAACCGAATGGAAACGAAAC
consensus             CGGTACCG     CG G CA
sequence                            T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
37164                       281  2.76e-12 TTATAACAGT AAAACCGAATGGAAACGAAAC GTTCCCAAAG
54791                       214  2.69e-10 GACTCACGTC AAAGTCGCGTGGAAATGAAAC GCCGTCTGTT
43541                         9  3.33e-10   TGTCCTCT ACGACAGAATGGAAACGGAAA TTTTACCGTC
46303                       287  6.36e-10 CAAAAACAAT ACGGCCCAGTGGAACGGAACC AAAGGACCCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37164                             2.8e-12  280_[+3]_199
54791                             2.7e-10  213_[+3]_266
43541                             3.3e-10  8_[+3]_471
46303                             6.4e-10  286_[+3]_193
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=4
37164                    (  281) AAAACCGAATGGAAACGAAAC  1
54791                    (  214) AAAGTCGCGTGGAAATGAAAC  1
43541                    (    9) ACGACAGAATGGAAACGGAAA  1
46303                    (  287) ACGGCCCAGTGGAACGGAACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 10.9936 E= 2.9e+003
   200   -865   -865   -865
   100     98   -865   -865
   100   -865    109   -865
   100   -865    109   -865
  -865    157   -865     -7
     0    157   -865   -865
  -865     -2    167   -865
   159     -2   -865   -865
   100   -865    109   -865
  -865   -865   -865    192
  -865   -865    209   -865
  -865   -865    209   -865
   200   -865   -865   -865
   200   -865   -865   -865
   159     -2   -865   -865
  -865     98      9     -7
  -865   -865    209   -865
   159   -865      9   -865
   200   -865   -865   -865
   159     -2   -865   -865
     0    157   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 2.9e+003
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
A[AC][AG][AG][CT][CA][GC][AC][AG]TGGAA[AC][CGT]G[AG]A[AC][CA]
--------------------------------------------------------------------------------

Time 9.31 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37164                            4.76e-16  101_[+1(1.19e-09)]_158_\
    [+3(2.76e-12)]_183_[+2(1.38e-06)]_4
13895                            1.04e-05  42_[+1(1.04e-07)]_41_[+2(1.51e-05)]_\
    384
21736                            1.12e-03  475_[+2(2.78e-07)]_13
37908                            9.41e-01  500
54791                            1.10e-10  213_[+3(2.69e-10)]_38_\
    [+1(2.25e-05)]_117_[+1(2.52e-07)]_24_[+2(3.21e-05)]_33
42247                            7.99e-05  53_[+1(2.88e-09)]_426
43374                            1.46e-07  309_[+2(3.80e-06)]_79_\
    [+1(1.04e-09)]_79
9883                             9.83e-05  68_[+1(2.69e-07)]_38_[+2(2.09e-05)]_\
    361
43541                            3.70e-07  8_[+3(3.33e-10)]_242_[+2(9.18e-05)]_\
    217
43657                            1.40e-03  227_[+2(6.20e-07)]_261
41413                            2.90e-06  170_[+1(1.64e-07)]_93_\
    [+2(1.17e-06)]_204
34177                            3.92e-06  228_[+2(9.61e-05)]_41_\
    [+1(3.60e-08)]_198
12411                            1.21e-01  97_[+2(3.21e-05)]_391
35561                            7.05e-01  500
46071                            9.56e-05  131_[+1(2.19e-07)]_22_\
    [+2(4.69e-05)]_314
46303                            2.34e-08  135_[+2(1.17e-06)]_139_\
    [+3(6.36e-10)]_193
43570                            6.73e-05  188_[+2(1.21e-05)]_96_\
    [+1(3.07e-07)]_183
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************