********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/18/18.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
4937                     1.0000    500  42434                    1.0000    500
47698                    1.0000    500  47872                    1.0000    500
52304                    1.0000    500  48496                    1.0000    500
2593                     1.0000    500  49291                    1.0000    500
5608                     1.0000    500  49833                    1.0000    500
16341                    1.0000    500  40891                    1.0000    500
44629                    1.0000    500  44975                    1.0000    500
45748                    1.0000    500  35832                    1.0000    500
49085                    1.0000    500  47849                    1.0000    500
49212                    1.0000    500  46216                    1.0000    500
49344                    1.0000    500  46945                    1.0000    500
50171                    1.0000    500  49276                    1.0000    500
45729                    1.0000    500  34694                    1.0000    500
44578                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/18/18.seqs.fa -oc motifs/18 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       27    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           13500    N=              27

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.262 C 0.254 G 0.219 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.262 C 0.254 G 0.219 T 0.265
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  21  sites =  14  llr = 182  E-value = 5.7e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::44:43:61:1319236416
pos.-specific     C  33::a4:6::7:19:173:92
probability       G  :616:13431192114::1::
matrix            T  715::14:19214::4:16:2

         bits    2.2
                 2.0     *
                 1.8     *
                 1.5     *         *
Relative         1.3     *      * **    *
Entropy          1.1 *  **  * * * ** *  *
(18.7 bits)      0.9 *  **  * *** ** *  *
                 0.7 *****  ***** ** ****
                 0.4 ***** ****** ** *****
                 0.2 ************ ********
                 0.0 ---------------------

Multilevel           TGTGCATCATCGTCAGCATCA
consensus            CCAA CAGG T A  TACA C
sequence                   G     G  A    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
47698                       382  2.72e-09 ACGCTGCACC TGAGCAACATCGGCAGCCTCC TTGTCTGGCG
5608                        435  3.15e-08 TGCAATCATG TCAACATCGTCGTCATCCTCC GCTGCCGCAC
46216                       119  4.02e-08 CTCTTATCTG TGAGCAGGATGGGCATCCTCA CAGCAAATGT
52304                       184  6.38e-08 GGGACAGCAG TCGGCCTCGTTGTCATCATCA TTGTTGTCAT
34694                       423  2.01e-07 CTCCGTGTCC TTTGCCGCATTGGCAACAACT CTCTCTCGAC
16341                       342  2.67e-07 TCTCACCGAA TCTACAGCATCGCCATCTACT CGATGGCAAA
49212                        94  2.93e-07 TACCCACATA CGAGCCGGGTCGAGAGAATCA CGGGGATATT
49833                        14  2.93e-07 AAATTTACGT TGTGCATGATTGTAAACATCT TTGTGTGATC
45748                       410  1.12e-06 CGCTGGTTTT CGTGCCAGTTCGACGGCAACC GAAACCAGGG
49291                       379  1.12e-06 TAAACGATCC TGAGCGACAGCGACAGCTTAA ACGAGAACAG
35832                       474  1.20e-06 GTACATGTCG TCAACGTCATCATCATCAGCA CCAATC
44975                       478  2.29e-06 GTACATCTGA TTTGCTTCGTCGTCACAAAAA CA
2593                        131  2.62e-06 ACACCTGAAT CGTACTTGATCTCCAGACTCA GTTCTATCAC
44629                       252  2.79e-06 ACTAAAAAGC CGTACCACTACGACAAAAACA TCTCTTCCCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47698                             2.7e-09  381_[+1]_98
5608                              3.2e-08  434_[+1]_45
46216                               4e-08  118_[+1]_361
52304                             6.4e-08  183_[+1]_296
34694                               2e-07  422_[+1]_57
16341                             2.7e-07  341_[+1]_138
49212                             2.9e-07  93_[+1]_386
49833                             2.9e-07  13_[+1]_466
45748                             1.1e-06  409_[+1]_70
49291                             1.1e-06  378_[+1]_101
35832                             1.2e-06  473_[+1]_6
44975                             2.3e-06  477_[+1]_2
2593                              2.6e-06  130_[+1]_349
44629                             2.8e-06  251_[+1]_228
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=14
47698                    (  382) TGAGCAACATCGGCAGCCTCC  1
5608                     (  435) TCAACATCGTCGTCATCCTCC  1
46216                    (  119) TGAGCAGGATGGGCATCCTCA  1
52304                    (  184) TCGGCCTCGTTGTCATCATCA  1
34694                    (  423) TTTGCCGCATTGGCAACAACT  1
16341                    (  342) TCTACAGCATCGCCATCTACT  1
49212                    (   94) CGAGCCGGGTCGAGAGAATCA  1
49833                    (   14) TGTGCATGATTGTAAACATCT  1
45748                    (  410) CGTGCCAGTTCGACGGCAACC  1
49291                    (  379) TGAGCGACAGCGACAGCTTAA  1
35832                    (  474) TCAACGTCATCATCATCAGCA  1
44975                    (  478) TTTGCTTCGTCGTCACAAAAA  1
2593                     (  131) CGTACTTGATCTCCAGACTCA  1
44629                    (  252) CGTACCACTACGACAAAAACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 12960 bayes= 9.69657 E= 5.7e+001
 -1045     17  -1045    143
 -1045     17    138    -89
    71  -1045   -162     92
    45  -1045    155  -1045
 -1045    197  -1045  -1045
    45     49    -62    -89
    13  -1045     38     70
 -1045    134     70  -1045
   112  -1045     38    -89
  -187  -1045   -162    170
 -1045    149   -162    -30
  -187  -1045    197   -189
    13    -83     -3     43
  -187    175   -162  -1045
   183  -1045   -162  -1045
   -29   -183     70     43
    13    149  -1045  -1045
   112     17  -1045    -89
    45  -1045   -162    111
   -87    175  -1045  -1045
   112    -25  -1045    -30
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 14 E= 5.7e+001
 0.000000  0.285714  0.000000  0.714286
 0.000000  0.285714  0.571429  0.142857
 0.428571  0.000000  0.071429  0.500000
 0.357143  0.000000  0.642857  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.357143  0.357143  0.142857  0.142857
 0.285714  0.000000  0.285714  0.428571
 0.000000  0.642857  0.357143  0.000000
 0.571429  0.000000  0.285714  0.142857
 0.071429  0.000000  0.071429  0.857143
 0.000000  0.714286  0.071429  0.214286
 0.071429  0.000000  0.857143  0.071429
 0.285714  0.142857  0.214286  0.357143
 0.071429  0.857143  0.071429  0.000000
 0.928571  0.000000  0.071429  0.000000
 0.214286  0.071429  0.357143  0.357143
 0.285714  0.714286  0.000000  0.000000
 0.571429  0.285714  0.000000  0.142857
 0.357143  0.000000  0.071429  0.571429
 0.142857  0.857143  0.000000  0.000000
 0.571429  0.214286  0.000000  0.214286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[TC][GC][TA][GA]C[AC][TAG][CG][AG]T[CT]G[TAG]CA[GTA][CA][AC][TA]C[ACT]
--------------------------------------------------------------------------------




Time  5.75 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  12  sites =   8  llr = 99  E-value = 1.2e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :4:33:1:::::
pos.-specific     C  4:8::::6:a::
probability       G  ::385a::a::a
matrix            T  66::3:94::a:

         bits    2.2      *  *  *
                 2.0      *  ****
                 1.8      *  ****
                 1.5      *  ****
Relative         1.3   ** ** ****
Entropy          1.1   ** *******
(17.9 bits)      0.9 **** *******
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTCGGGTCGCTG
consensus            CAGAA  T
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
46216                       452  4.07e-08 CGCCACAAAT TTCGGGTCGCTG ATGAGGTGAG
47872                       109  1.62e-07 TGCCAGTGCA TACGGGTCGCTG TCTTCCTGGC
42434                       357  2.60e-07 CTTTCCAGCG TTCGTGTCGCTG TCATTGCCCC
46945                        61  7.58e-07 TGGGTTTCGT TTCGAGTTGCTG CAAAGCACGG
35832                       402  2.22e-06 CTTTTGCATG CTGGTGTCGCTG CGTTTGGGTA
49212                       366  2.90e-06 CAGTTTCAAC CACAGGTTGCTG CTAGTCTCTT
49833                       145  2.90e-06 CTTTAGAAAA CACAGGTTGCTG CTTGCTCACG
47698                       366  5.52e-06 AGTTTTTCAG TTGGAGACGCTG CACCTGAGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46216                             4.1e-08  451_[+2]_37
47872                             1.6e-07  108_[+2]_380
42434                             2.6e-07  356_[+2]_132
46945                             7.6e-07  60_[+2]_428
35832                             2.2e-06  401_[+2]_87
49212                             2.9e-06  365_[+2]_123
49833                             2.9e-06  144_[+2]_344
47698                             5.5e-06  365_[+2]_123
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=8
46216                    (  452) TTCGGGTCGCTG  1
47872                    (  109) TACGGGTCGCTG  1
42434                    (  357) TTCGTGTCGCTG  1
46945                    (   61) TTCGAGTTGCTG  1
35832                    (  402) CTGGTGTCGCTG  1
49212                    (  366) CACAGGTTGCTG  1
49833                    (  145) CACAGGTTGCTG  1
47698                    (  366) TTGGAGACGCTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 13203 bayes= 11.425 E= 1.2e+003
  -965     56   -965    124
    52   -965   -965    124
  -965    156     19   -965
    -7   -965    177   -965
    -7   -965    119     -8
  -965   -965    219   -965
  -107   -965   -965    172
  -965    130   -965     50
  -965   -965    219   -965
  -965    197   -965   -965
  -965   -965   -965    192
  -965   -965    219   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 1.2e+003
 0.000000  0.375000  0.000000  0.625000
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.625000  0.000000  0.375000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[TC][TA][CG][GA][GAT]GT[CT]GCTG
--------------------------------------------------------------------------------




Time 11.35 secs.

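The Motif 2 regular expression reported above can be checked against the eight Motif 2 sites listed in the BLOCKS section. The following is a minimal Python sketch using only the standard `re` module; the site strings are copied from this report, and the names `MOTIF2_REGEX`, `MOTIF2_SITES`, and `matches` are illustrative, not part of MEME.

```python
import re

# Regular expression reported by MEME for Motif 2 (copied from this report).
MOTIF2_REGEX = re.compile(r"[TC][TA][CG][GA][GAT]GT[CT]GCTG")

# The eight Motif 2 sites from the BLOCKS section, keyed by sequence name.
MOTIF2_SITES = {
    "46216": "TTCGGGTCGCTG",
    "47872": "TACGGGTCGCTG",
    "42434": "TTCGTGTCGCTG",
    "46945": "TTCGAGTTGCTG",
    "35832": "CTGGTGTCGCTG",
    "49212": "CACAGGTTGCTG",
    "49833": "CACAGGTTGCTG",
    "47698": "TTGGAGACGCTG",
}

# fullmatch requires the whole 12-letter site to fit the expression.
matches = {
    name: MOTIF2_REGEX.fullmatch(site) is not None
    for name, site in MOTIF2_SITES.items()
}

for name, ok in matches.items():
    print(name, "matches" if ok else "does not match")
```

Seven of the eight sites match. The site in sequence 47698 does not, because it has an A at position 7, where the expression allows only T even though the probability matrix gives A a probability of 0.125 there. The regular expression is therefore a simplification of the motif, not an exact description of every contributing site.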
********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  12  sites =  13  llr = 138  E-value = 5.4e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a23222::2:::
pos.-specific     C  :::6:::5::5:
probability       G  :812::11:1:a
matrix            T  ::6:8895895:

         bits    2.2            *
                 2.0 *          *
                 1.8 *          *
                 1.5 *     *  * *
Relative         1.3 **  ***  * *
Entropy          1.1 **  *** ** *
(15.3 bits)      0.9 **  *** ****
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AGTCTTTCTTTG
consensus             AAA   TA C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
16341                        41  1.52e-07 AAAATCAAAC AGTCTTTCTTTG GTAGTAGGAC
44578                       305  4.48e-07 GCCCAAATCC AGACTTTTTTTG GGTACGAAAT
48496                        57  5.92e-07 TATTTATATG AGACTTTTTTCG GTTCCCGGGG
34694                       109  1.05e-06 CATCGAATTA AGTATTTCTTCG TGGTAACGCC
49276                       297  4.12e-06 CTCATGTCCA AAACTTTTTTCG TTGACAAGAA
50171                       231  7.25e-06 ACTATCTGTA AGTCTTTGATTG CACAAATGTG
44629                        60  8.19e-06 ACCACATGAT AGTCTATTATTG GTAGAGCCCT
49833                        96  8.19e-06 ATCCCTCTAA AGGATTTCTTTG TTTTCACAAT
45729                       455  8.83e-06 GCCCATTGTG AGTGATTCTTTG ATAACAGAAG
2593                        419  1.35e-05 ACTCTTGGTA AGTCAATTTTTG CGTTTATGCA
49085                       199  1.82e-05 TATGGGGCAG AATATTTCATCG TGGATGTACG
47698                       239  1.82e-05 TTGTCTGTAG AGTGTTTCTGCG GACGACGTTC
47849                       483  2.98e-05 TGTTTTCTTC AAACTTGTTTCG TTCGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
16341                             1.5e-07  40_[+3]_448
44578                             4.5e-07  304_[+3]_184
48496                             5.9e-07  56_[+3]_432
34694                               1e-06  108_[+3]_380
49276                             4.1e-06  296_[+3]_192
50171                             7.3e-06  230_[+3]_258
44629                             8.2e-06  59_[+3]_429
49833                             8.2e-06  95_[+3]_393
45729                             8.8e-06  454_[+3]_34
2593                              1.3e-05  418_[+3]_70
49085                             1.8e-05  198_[+3]_290
47698                             1.8e-05  238_[+3]_250
47849                               3e-05  482_[+3]_6
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=13
16341                    (   41) AGTCTTTCTTTG  1
44578                    (  305) AGACTTTTTTTG  1
48496                    (   57) AGACTTTTTTCG  1
34694                    (  109) AGTATTTCTTCG  1
49276                    (  297) AAACTTTTTTCG  1
50171                    (  231) AGTCTTTGATTG  1
44629                    (   60) AGTCTATTATTG  1
49833                    (   96) AGGATTTCTTTG  1
45729                    (  455) AGTGATTCTTTG  1
2593                     (  419) AGTCAATTTTTG  1
49085                    (  199) AATATTTCATCG  1
47698                    (  239) AGTGTTTCTGCG  1
47849                    (  483) AAACTTGTTTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 13203 bayes= 10.5177 E= 5.4e+003
   193  -1035  -1035  -1035
   -18  -1035    181  -1035
    23  -1035   -151    122
   -18    127    -51  -1035
   -77  -1035  -1035    168
   -77  -1035  -1035    168
 -1035  -1035   -151    180
 -1035     86   -151     80
   -18  -1035  -1035    154
 -1035  -1035   -151    180
 -1035     86  -1035    102
 -1035  -1035    219  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 5.4e+003
 1.000000  0.000000  0.000000  0.000000
 0.230769  0.000000  0.769231  0.000000
 0.307692  0.000000  0.076923  0.615385
 0.230769  0.615385  0.153846  0.000000
 0.153846  0.000000  0.000000  0.846154
 0.153846  0.000000  0.000000  0.846154
 0.000000  0.000000  0.076923  0.923077
 0.000000  0.461538  0.076923  0.461538
 0.230769  0.000000  0.000000  0.769231
 0.000000  0.000000  0.076923  0.923077
 0.000000  0.461538  0.000000  0.538462
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
A[GA][TA][CA]TTT[CT][TA]T[TC]G
--------------------------------------------------------------------------------




Time 17.06 secs.

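The nonzero entries of the Motif 3 log-odds matrix appear to follow from the letter-probability matrix and the background frequencies as 100 * log2(p / background), rounded to the nearest integer. The Python sketch below checks this for the first three rows of Motif 3. It makes two assumptions: the background values are the three-decimal figures printed in the command line summary, so borderline entries elsewhere could differ by one unit, and zero-probability entries are returned as `None` because the rule that produces the -1035 placeholder is not stated in this report. All names are illustrative.

```python
import math

# Background letter frequencies as printed in this report (rounded to 3 decimals).
BACKGROUND = (0.262, 0.254, 0.219, 0.265)  # order: A, C, G, T

# First three rows of the Motif 3 letter-probability matrix (A, C, G, T).
PROB_ROWS = [
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.230769, 0.000000, 0.769231, 0.000000),
    (0.307692, 0.000000, 0.076923, 0.615385),
]


def log_odds(p, background):
    """Return round(100 * log2(p / background)), or None when p is zero."""
    if p <= 0.0:
        # Assumption: the report's -1035 placeholder is not reproduced here.
        return None
    return round(100 * math.log2(p / background))


log_odds_rows = [
    [log_odds(p, bg) for p, bg in zip(row, BACKGROUND)]
    for row in PROB_ROWS
]

for row in log_odds_rows:
    print(row)
```

The non-`None` values agree with the corresponding rows of the Motif 3 position-specific scoring matrix (193; -18 and 181; 23, -151 and 122).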
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4937                             7.33e-01  500
42434                            3.87e-04  356_[+2(2.60e-07)]_132
47698                            9.88e-09  238_[+3(1.82e-05)]_115_\
    [+2(5.52e-06)]_4_[+1(2.72e-09)]_98
47872                            4.86e-04  108_[+2(1.62e-07)]_380
52304                            8.16e-04  183_[+1(6.38e-08)]_296
48496                            4.86e-03  56_[+3(5.92e-07)]_432
2593                             5.72e-04  130_[+1(2.62e-06)]_267_\
    [+3(1.35e-05)]_70
49291                            1.20e-02  378_[+1(1.12e-06)]_101
5608                             8.42e-04  434_[+1(3.15e-08)]_45
49833                            1.92e-07  13_[+1(2.93e-07)]_61_[+3(8.19e-06)]_\
    37_[+2(2.90e-06)]_344
16341                            9.52e-07  40_[+3(1.52e-07)]_289_\
    [+1(2.67e-07)]_138
40891                            9.22e-01  500
44629                            3.24e-04  59_[+3(8.19e-06)]_180_\
    [+1(2.79e-06)]_228
44975                            1.06e-02  220_[+1(4.05e-05)]_236_\
    [+1(2.29e-06)]_2
45748                            4.03e-03  409_[+1(1.12e-06)]_70
35832                            6.53e-05  401_[+2(2.22e-06)]_60_\
    [+1(1.20e-06)]_6
49085                            3.63e-02  198_[+3(1.82e-05)]_290
47849                            1.06e-01  482_[+3(2.98e-05)]_6
49212                            1.89e-05  93_[+1(2.93e-07)]_251_\
    [+2(2.90e-06)]_123
46216                            4.77e-08  118_[+1(4.02e-08)]_312_\
    [+2(4.07e-08)]_37
49344                            7.05e-01  500
46945                            1.26e-02  60_[+2(7.58e-07)]_428
50171                            7.23e-02  230_[+3(7.25e-06)]_258
49276                            3.47e-02  296_[+3(4.12e-06)]_192
45729                            6.78e-02  454_[+3(8.83e-06)]_34
34694                            3.37e-06  108_[+3(1.05e-06)]_302_\
    [+1(2.01e-07)]_57
44578                            1.83e-04  304_[+3(4.48e-07)]_184
--------------------------------------------------------------------------------

********************************************************************************

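Each combined block diagram above alternates spacer lengths with bracketed motif occurrences, so site start positions can be recovered by walking the diagram from left to right. The Python sketch below does this under stated assumptions: motif widths are taken from this report (21, 12, 12), a trailing backslash marks a wrapped line, and positions are 1-based as in the sites tables. The function name `site_positions` and the tuple layout are hypothetical, not a MEME API.

```python
import re

# Motif widths as reported in this file (motif number -> width).
MOTIF_WIDTHS = {1: 21, 2: 12, 3: 12}

# A motif token such as "[+3(1.82e-05)]": strand, motif number, p-value.
MOTIF_TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def site_positions(diagram):
    """Return ([(motif, start, p_value), ...], total_length) for one diagram."""
    # Undo line wrapping: drop continuation backslashes and all whitespace.
    flat = "".join(diagram.replace("\\", "").split())
    position = 1  # 1-based position of the next residue
    sites = []
    for token in flat.split("_"):
        if not token:
            continue
        match = MOTIF_TOKEN.fullmatch(token)
        if match:
            motif = int(match.group(2))
            sites.append((motif, position, float(match.group(3))))
            position += MOTIF_WIDTHS[motif]
        else:
            position += int(token)  # spacer of unmatched residues
    return sites, position - 1


# Diagram for sequence 47698, copied from the summary (including the wrap).
example = r"238_[+3(1.82e-05)]_115_\ [+2(5.52e-06)]_4_[+1(2.72e-09)]_98"
sites, length = site_positions(example)
print(sites, length)
```

For sequence 47698 this yields starts 239, 366 and 382 for motifs 3, 2 and 1, matching the per-motif sites tables, and a total length of 500, matching the training set.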
********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************