********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/235/235.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
21039                    1.0000    500  261174                   1.0000    500
261285                   1.0000    500  261481                   1.0000    500
263701                   1.0000    500  268146                   1.0000    500
268889                   1.0000    500  270121                   1.0000    500
32776                    1.0000    500  34344                    1.0000    500
38987                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/235/235.seqs.fa -oc motifs/235 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.263 C 0.231 G 0.240 T 0.266
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.231 G 0.240 T 0.266
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  14  sites =  11  llr = 121  E-value = 4.7e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  2573586111a6:9
pos.-specific     C  8:27324288::a:
probability       G  :31:1::5:::::1
matrix            T  :2::1::311:4::

         bits    2.1             *
                 1.9           * *
                 1.7           * *
                 1.5 *         * **
Relative         1.3 *  * *  *** **
Entropy          1.1 *  * ** ******
(15.9 bits)      0.8 * ** ** ******
                 0.6 * ** ** ******
                 0.4 ******* ******
                 0.2 **************
                 0.0 --------------

Multilevel           CAACAAAGCCAACA
consensus             G AC CT   T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
261285                      315  3.61e-09 CAGCAACACA CAACAAAGCCAACA CGGAACGGTT
270121                      361  9.97e-08 CTACTACAAA CGACCACGCCAACA ACCCGCACCT
261481                      463  3.13e-07 TCAACCACGT CGACAACTCCATCA CCGCCACCAC
38987                       442  1.33e-06 CGTCATCCAG AAACAACTCCATCA CCCTCACCCA
32776                       358  3.30e-06 GTGGTAAGTG ATCCAAAGCCAACA AAGCCACAGA
268146                      123  3.30e-06 AACACAACAA CAACAAAAACAACA ACAACTACCA
21039                       137  4.38e-06 CCCTCTTTAT CAGCAAAGCAAACA TATCTTCTCA
261174                      225  1.57e-05 GGTCAAAGAG CAAAGAACCCAACG TCTTGCCTTG
268889                      474  1.67e-05 GTGAAGAGGA CTACCCATCTAACA TCAAACTCTT
263701                      483  1.77e-05 CCCCCCCGAC CGAACAACTCATCA CACAAA
34344                       436  2.54e-05 ACTGGACTAA CACATCCGCCATCA ACTCTTTAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
261285                            3.6e-09  314_[+1]_172
270121                              1e-07  360_[+1]_126
261481                            3.1e-07  462_[+1]_24
38987                             1.3e-06  441_[+1]_45
32776                             3.3e-06  357_[+1]_129
268146                            3.3e-06  122_[+1]_364
21039                             4.4e-06  136_[+1]_350
261174                            1.6e-05  224_[+1]_262
268889                            1.7e-05  473_[+1]_13
263701                            1.8e-05  482_[+1]_4
34344                             2.5e-05  435_[+1]_51
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=14 seqs=11
261285                   (  315) CAACAAAGCCAACA  1
270121                   (  361) CGACCACGCCAACA  1
261481                   (  463) CGACAACTCCATCA  1
38987                    (  442) AAACAACTCCATCA  1
32776                    (  358) ATCCAAAGCCAACA  1
268146                   (  123) CAACAAAAACAACA  1
21039                    (  137) CAGCAAAGCAAACA  1
261174                   (  225) CAAAGAACCCAACG  1
268889                   (  474) CTACCCATCTAACA  1
263701                   (  483) CGAACAACTCATCA  1
34344                    (  436) CACATCCGCCATCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 5357 bayes= 8.92481 E= 4.7e-001
   -53    182  -1010  -1010
   105  -1010     18    -55
   147    -35   -140  -1010
     5    165  -1010  -1010
   105     24   -140   -155
   164    -35  -1010  -1010
   128     65  -1010  -1010
  -153    -35     92      4
  -153    182  -1010   -155
  -153    182  -1010   -155
   193  -1010  -1010  -1010
   128  -1010  -1010     45
 -1010    211  -1010  -1010
   179  -1010   -140  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 11 E= 4.7e-001
 0.181818  0.818182  0.000000  0.000000
 0.545455  0.000000  0.272727  0.181818
 0.727273  0.181818  0.090909  0.000000
 0.272727  0.727273  0.000000  0.000000
 0.545455  0.272727  0.090909  0.090909
 0.818182  0.181818  0.000000  0.000000
 0.636364  0.363636  0.000000  0.000000
 0.090909  0.181818  0.454545  0.272727
 0.090909  0.818182  0.000000  0.090909
 0.090909  0.818182  0.000000  0.090909
 1.000000  0.000000  0.000000  0.000000
 0.636364  0.000000  0.000000  0.363636
 0.000000  1.000000  0.000000  0.000000
 0.909091  0.000000  0.090909  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[AG]A[CA][AC]A[AC][GT]CCA[AT]CA
--------------------------------------------------------------------------------




Time  1.22 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =  11  llr = 139  E-value = 5.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :122::4:3559:46:24:2
pos.-specific     C  8161386:6:51a4162388
probability       G  1125:1:4:::::2:111::
matrix            T  17:371:6151::133532:

         bits    2.1             *
                 1.9             *
                 1.7             *
                 1.5            **     **
Relative         1.3 *    *     **     **
Entropy          1.1 *   ****   **     **
(18.2 bits)      0.8 * * ****** **  *  **
                 0.6 *** ********* **  **
                 0.4 *** ********* **  **
                 0.2 ********************
                 0.0 --------------------

Multilevel           CTCGTCCTCTAACAACTACC
consensus               TC AGAAC  CTT C
sequence                              T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
261481                      432  2.96e-09 GCTGGGAGTC CTGGTCCTCAAACCACAACC ATCAACCACG
32776                       285  4.11e-08 GAGGAAGGGC CTCATCCTATCACGACGACC GTCTCTCTGC
261174                      455  1.10e-07 TCCAAGATGG CTGGTCAGCTCACGTTTCCC TGCTTCAACT
268146                      367  1.83e-07 CGTTTATTCT CTCTCCAGCTAACCACACTC TGACACTGTG
34344                       477  2.22e-07 ATCATTTCTT CTATCCATCTAACAAGTACC AACC
261285                      441  2.44e-07 CCGACACGTC CTCTTCCTTAAACCTCTTTC AATTCTCTAT
38987                       475  5.52e-07 ATCACATCAC TTCGTTCTCTCACAACTCCA TCTACA
270121                      339  9.88e-07 GCCTGGAACT CCCCTCCGCACACTACTACA AACGACCACG
263701                      449  1.97e-06 AGAGGTGAGG CGCGCCATAAAACCCTTTCC TCCTCCCCCC
21039                       378  7.33e-06 GTGATTGCTG CTCGTGCTCATCCAATCGCC CACCTCATTC
268889                      395  1.22e-05 GTTTGAGGGT GAAATCCGATCACATCCTCC TGGGTAAGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
261481                              3e-09  431_[+2]_49
32776                             4.1e-08  284_[+2]_196
261174                            1.1e-07  454_[+2]_26
268146                            1.8e-07  366_[+2]_114
34344                             2.2e-07  476_[+2]_4
261285                            2.4e-07  440_[+2]_40
38987                             5.5e-07  474_[+2]_6
270121                            9.9e-07  338_[+2]_142
263701                              2e-06  448_[+2]_32
21039                             7.3e-06  377_[+2]_103
268889                            1.2e-05  394_[+2]_86
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=11
261481                   (  432) CTGGTCCTCAAACCACAACC  1
32776                    (  285) CTCATCCTATCACGACGACC  1
261174                   (  455) CTGGTCAGCTCACGTTTCCC  1
268146                   (  367) CTCTCCAGCTAACCACACTC  1
34344                    (  477) CTATCCATCTAACAAGTACC  1
261285                   (  441) CTCTTCCTTAAACCTCTTTC  1
38987                    (  475) TTCGTTCTCTCACAACTCCA  1
270121                   (  339) CCCCTCCGCACACTACTACA  1
263701                   (  449) CGCGCCATAAAACCCTTTCC  1
21039                    (  378) CTCGTGCTCATCCAATCGCC  1
268889                   (  395) GAAATCCGATCACATCCTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5291 bayes= 9.26264 E= 5.6e+000
 -1010    182   -140   -155
  -153   -135   -140    145
   -53    146    -40  -1010
   -53   -135     92      4
 -1010     24  -1010    145
 -1010    182   -140   -155
    47    146  -1010  -1010
 -1010  -1010     60    126
     5    146  -1010   -155
    79  -1010  -1010    104
    79     97  -1010   -155
   179   -135  -1010  -1010
 -1010    211  -1010  -1010
    47     65    -40   -155
   128   -135  -1010      4
 -1010    146   -140      4
   -53    -35   -140    104
    47     24   -140      4
 -1010    182  -1010    -55
   -53    182  -1010  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 11 E= 5.6e+000
 0.000000  0.818182  0.090909  0.090909
 0.090909  0.090909  0.090909  0.727273
 0.181818  0.636364  0.181818  0.000000
 0.181818  0.090909  0.454545  0.272727
 0.000000  0.272727  0.000000  0.727273
 0.000000  0.818182  0.090909  0.090909
 0.363636  0.636364  0.000000  0.000000
 0.000000  0.000000  0.363636  0.636364
 0.272727  0.636364  0.000000  0.090909
 0.454545  0.000000  0.000000  0.545455
 0.454545  0.454545  0.000000  0.090909
 0.909091  0.090909  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.363636  0.363636  0.181818  0.090909
 0.636364  0.090909  0.000000  0.272727
 0.000000  0.636364  0.090909  0.272727
 0.181818  0.181818  0.090909  0.545455
 0.363636  0.272727  0.090909  0.272727
 0.000000  0.818182  0.000000  0.181818
 0.181818  0.818182  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CTC[GT][TC]C[CA][TG][CA][TA][AC]AC[AC][AT][CT]T[ACT]CC
--------------------------------------------------------------------------------




Time  2.42 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =  10  llr = 103  E-value = 7.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  6:8318:46::3
pos.-specific     C  1:21:::1::::
probability       G  3a:69:a32aa4
matrix            T  :::::2:22::3

         bits    2.1  *    *  **
                 1.9  *    *  **
                 1.7  *    *  **
                 1.5  *  * *  **
Relative         1.3  ** ***  **
Entropy          1.1  ** ***  **
(14.8 bits)      0.8  ** ***  **
                 0.6 ******* ***
                 0.4 ******* ****
                 0.2 ************
                 0.0 ------------

Multilevel           AGAGGAGAAGGG
consensus            G CA T GG  A
sequence                    TT  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
263701                      409  4.06e-07 ACGGGAGGAG AGAGGAGGAGGT GTCAGGACAC
34344                         6  8.02e-07      TGTGA GGAGGAGAAGGT TGGTATGGTT
38987                       266  1.14e-06 TGGTAGCATG AGAGGAGATGGG CAAAAGGGAC
268889                      340  2.35e-06 TGAAGATTAA AGAGGAGATGGA GGAGCTTGAA
32776                       336  8.07e-06 CGTTTGCCGA GGACGAGGAGGG GTGGTAAGTG
268146                       21  9.69e-06 TTGACGAGTG AGCAGAGGAGGA AGAATATGTT
261174                      255  1.24e-05 CTTGTCGTAT AGCGGAGCAGGA TGAAGCGGGC
261481                      289  1.59e-05 TGTAAGGATT AGAGGTGTGGGG AACTCCTGTT
21039                       273  4.37e-05 TCTTGACTTT GGAAGTGTGGGG AGAGCAATCA
261285                      230  6.04e-05 GATTTGGCAA CGAAAAGAAGGT GTATGTAGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
263701                            4.1e-07  408_[+3]_80
34344                               8e-07  5_[+3]_483
38987                             1.1e-06  265_[+3]_223
268889                            2.3e-06  339_[+3]_149
32776                             8.1e-06  335_[+3]_153
268146                            9.7e-06  20_[+3]_468
261174                            1.2e-05  254_[+3]_234
261481                            1.6e-05  288_[+3]_200
21039                             4.4e-05  272_[+3]_216
261285                              6e-05  229_[+3]_259
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
263701                   (  409) AGAGGAGGAGGT  1
34344                    (    6) GGAGGAGAAGGT  1
38987                    (  266) AGAGGAGATGGG  1
268889                   (  340) AGAGGAGATGGA  1
32776                    (  336) GGACGAGGAGGG  1
268146                   (   21) AGCAGAGGAGGA  1
261174                   (  255) AGCGGAGCAGGA  1
261481                   (  289) AGAGGTGTGGGG  1
21039                    (  273) GGAAGTGTGGGG  1
261285                   (  230) CGAAAAGAAGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 9.32048 E= 7.4e+001
   119   -121     32   -997
  -997   -997    206   -997
   161    -21   -997   -997
    19   -121    132   -997
  -139   -997    191   -997
   161   -997   -997    -41
  -997   -997    206   -997
    61   -121     32    -41
   119   -997    -26    -41
  -997   -997    206   -997
  -997   -997    206   -997
    19   -997     74     17
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 7.4e+001
 0.600000  0.100000  0.300000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.300000  0.100000  0.600000  0.000000
 0.100000  0.000000  0.900000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.100000  0.300000  0.200000
 0.600000  0.000000  0.200000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.300000  0.000000  0.400000  0.300000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AG]G[AC][GA]G[AT]G[AGT][AGT]GG[GAT]
--------------------------------------------------------------------------------




Time  3.48 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21039                            2.21e-05  136_[+1(4.38e-06)]_122_\
    [+3(4.37e-05)]_93_[+2(7.33e-06)]_103
261174                           5.32e-07  224_[+1(1.57e-05)]_16_\
    [+3(1.24e-05)]_188_[+2(1.10e-07)]_26
261285                           2.17e-09  229_[+3(6.04e-05)]_73_\
    [+1(3.61e-09)]_112_[+2(2.44e-07)]_40
261481                           6.68e-10  288_[+3(1.59e-05)]_131_\
    [+2(2.96e-09)]_11_[+1(3.13e-07)]_24
263701                           3.65e-07  408_[+3(4.06e-07)]_28_\
    [+2(1.97e-06)]_14_[+1(1.77e-05)]_4
268146                           1.64e-07  20_[+3(9.69e-06)]_90_[+1(3.30e-06)]_\
    230_[+2(1.83e-07)]_114
268889                           8.53e-06  294_[+3(8.92e-05)]_33_\
    [+3(2.35e-06)]_43_[+2(1.22e-05)]_59_[+1(1.67e-05)]_13
270121                           3.94e-06  53_[+1(2.54e-05)]_271_\
    [+2(9.88e-07)]_2_[+1(9.97e-08)]_126
32776                            3.56e-08  284_[+2(4.11e-08)]_31_\
    [+3(8.07e-06)]_10_[+1(3.30e-06)]_129
34344                            1.29e-07  5_[+3(8.02e-07)]_418_[+1(2.54e-05)]_\
    27_[+2(2.22e-07)]_4
38987                            2.79e-08  145_[+2(5.76e-06)]_100_\
    [+3(1.14e-06)]_164_[+1(1.33e-06)]_19_[+2(5.52e-07)]_6
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************