********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/37/37.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43106                    1.0000    500  46424                    1.0000    500
36397                    1.0000    500  46461                    1.0000    500
36476                    1.0000    500  47255                    1.0000    500
37579                    1.0000    500  22095                    1.0000    500
32726                    1.0000    500  49469                    1.0000    500
50221                    1.0000    500  43835                    1.0000    500
24186                    1.0000    500  44784                    1.0000    500
45039                    1.0000    500  46163                    1.0000    500
46320                    1.0000    500  48857                    1.0000    500
45117                    1.0000    500
********************************************************************************


********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/37/37.seqs.fa -oc motifs/37 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.258 C 0.245 G 0.221 T 0.276
Background letter frequencies (from dataset with add-one prior applied):
A 0.258 C 0.245 G 0.221 T 0.276
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 5 llr = 104 E-value = 1.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :26:a2:2::68::86:4:::
pos.-specific     C  a828:428aa::a::2:::aa
probability       G  ::22:4:::::2:a:::26::
matrix            T  ::::::8:::4:::22a44::

         bits    2.2              *
                 2.0 *   *   **  **  *  **
                 1.7 *   *   **  **  *  **
                 1.5 *   *   **  **  *  **
Relative         1.3 ** **  *** **** *  **
Entropy          1.1 ** ** **** **** * ***
(30.0 bits)      0.9 ** ** ********* * ***
                 0.7 ***************** ***
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCACACTCCCAACGAATAGCC
consensus             ACG GCA  TG  TC TT
sequence               G  A         T G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
45117                       316  6.32e-12 AATCACAGCC CCACAGTCCCTACGAATATCC AAGAGTACAA
46163                        79  6.32e-12 AATCACAGCC CCACAGTCCCTACGAATATCC AAGAGTACAG
24186                        84  2.46e-10 TAGTGTCATT CCCCAATCCCAACGACTGGCC TCTGGAGTTA
36476                       226  1.18e-09 CCTCTCTTAC CAGCACCACCAACGAATTGCC TAGTTTGGTG
44784                       142  1.24e-09 GCCGCGCGTC CCAGACTCCCAGCGTTTTGCC GGCCTTCGGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45117                             6.3e-12  315_[+1]_164
46163                             6.3e-12  78_[+1]_401
24186                             2.5e-10  83_[+1]_396
36476                             1.2e-09  225_[+1]_254
44784                             1.2e-09  141_[+1]_338
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=5
45117                    (  316) CCACAGTCCCTACGAATATCC  1
46163                    (   79) CCACAGTCCCTACGAATATCC  1
24186                    (   84) CCCCAATCCCAACGACTGGCC  1
36476                    (  226) CAGCACCACCAACGAATTGCC  1
44784                    (  142) CCAGACTCCCAGCGTTTTGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 11.0838 E= 1.6e+000
  -897    203   -897   -897
   -37    171   -897   -897
   122    -29    -14   -897
  -897    171    -14   -897
   195   -897   -897   -897
   -37     71     85   -897
  -897    -29   -897    153
   -37    171   -897   -897
  -897    203   -897   -897
  -897    203   -897   -897
   122   -897   -897     54
   163   -897    -14   -897
  -897    203   -897   -897
  -897   -897    217   -897
   163   -897   -897    -46
   122    -29   -897    -46
  -897   -897   -897    186
    63   -897    -14     54
  -897   -897    144     54
  -897    203   -897   -897
  -897    203   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 1.6e+000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.600000  0.200000  0.200000  0.000000
 0.000000  0.800000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.400000  0.400000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.200000  0.800000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.800000  0.000000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.600000  0.200000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.400000  0.000000  0.200000  0.400000
 0.000000  0.000000  0.600000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[CA][ACG][CG]A[CGA][TC][CA]CC[AT][AG]CG[AT][ACT]T[ATG][GT]CC
--------------------------------------------------------------------------------

Time 3.24 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 10 llr = 148 E-value = 1.4e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  21337:a:92584:83::6:9
pos.-specific     C  :774:8:8152152:2:7:a1
probability       G  12:1:2:::1:115:2:32::
matrix            T  7::23::2:23::323a:2::

         bits    2.2
                 2.0       *         *  *
                 1.7       *         *  *
                 1.5       * *       *  **
Relative         1.3      ****     * *  **
Entropy          1.1   * *****  *  * ** **
(21.4 bits)      0.9  ** *****  *  * ** **
                 0.7 *** *****  **** *****
                 0.4 *** ***** ***** *****
                 0.2 *************** *****
                 0.0 ---------------------

Multilevel           TCCCACACACAACGAATCACA
consensus            AGAATG T AT ATTT GG
sequence                T     TC  C C  T
                                    G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
45117                       292  2.71e-13 GTCGCTTCCG TCCCACACACAACGAATCACA GCCCCACAGT
46163                        55  2.71e-13 GTCGCTTCCG TCCCACACACAACGAATCACA GCCCCACAGT
43835                       434  7.07e-08 TCCTATACTA ACCAAGACAACAACATTCACA ACGGCATTGT
32726                       439  9.93e-08 CTAAACTTTT TCCTTGACATCAACATTCACA CGCAGCTCTG
48857                       410  1.27e-07 ACCCGACATC TCACTCACAGTCAGAGTCACA TCGTTCTTCC
24186                        61  2.85e-07 CTCTGTGGGG TGCATCACCCAACTAGTGTCA TTCCCCAATC
46461                        23  3.05e-07 AGAAAGAAAG AGCAACATACAAGGAATGTCA GATTAATTCC
36397                       175  3.72e-07 GGAAGTAAAT GCACACACATTAAGACTGACC TTGCTTGCTG
22095                       428  5.08e-07 ACAGCAACAA TACGACATACAACTTCTCGCA CCACAAACGA
36476                       475  6.06e-07 CCCTCACGTT TCATACACAATGCTTTTCGCA ACTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45117                             2.7e-13  291_[+2]_188
46163                             2.7e-13  54_[+2]_425
43835                             7.1e-08  433_[+2]_46
32726                             9.9e-08  438_[+2]_41
48857                             1.3e-07  409_[+2]_70
24186                             2.9e-07  60_[+2]_419
46461                             3.1e-07  22_[+2]_457
36397                             3.7e-07  174_[+2]_305
22095                             5.1e-07  427_[+2]_52
36476                             6.1e-07  474_[+2]_5
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=10
45117                    (  292) TCCCACACACAACGAATCACA  1
46163                    (   55) TCCCACACACAACGAATCACA  1
43835                    (  434) ACCAAGACAACAACATTCACA  1
32726                    (  439) TCCTTGACATCAACATTCACA  1
48857                    (  410) TCACTCACAGTCAGAGTCACA  1
24186                    (   61) TGCATCACCCAACTAGTGTCA  1
46461                    (   23) AGCAACATACAAGGAATGTCA  1
36397                    (  175) GCACACACATTAAGACTGACC  1
22095                    (  428) TACGACATACAACTTCTCGCA  1
36476                    (  475) TCATACACAATGCTTTTCGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 10.0831 E= 1.4e+000
   -37   -997   -114    134
  -137    151    -15   -997
    22    151   -997   -997
    22     71   -114    -46
   144   -997   -997     12
  -997    171    -15   -997
   195   -997   -997   -997
  -997    171   -997    -46
   180   -129   -997   -997
   -37    103   -114    -46
    95    -29   -997     12
   163   -129   -114   -997
    63    103   -114   -997
  -997    -29    118     12
   163   -997   -997    -46
    22    -29    -15     12
  -997   -997   -997    186
  -997    151     44   -997
   122   -997    -15    -46
  -997    203   -997   -997
   180   -129   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 1.4e+000
 0.200000  0.000000  0.100000  0.700000
 0.100000  0.700000  0.200000  0.000000
 0.300000  0.700000  0.000000  0.000000
 0.300000  0.400000  0.100000  0.200000
 0.700000  0.000000  0.000000  0.300000
 0.000000  0.800000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.900000  0.100000  0.000000  0.000000
 0.200000  0.500000  0.100000  0.200000
 0.500000  0.200000  0.000000  0.300000
 0.800000  0.100000  0.100000  0.000000
 0.400000  0.500000  0.100000  0.000000
 0.000000  0.200000  0.500000  0.300000
 0.800000  0.000000  0.000000  0.200000
 0.300000  0.200000  0.200000  0.300000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.700000  0.300000  0.000000
 0.600000  0.000000  0.200000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.900000  0.100000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TA][CG][CA][CAT][AT][CG]A[CT]A[CAT][ATC]A[CA][GTC][AT][ATCG]T[CG][AGT]CA
--------------------------------------------------------------------------------

Time 6.47 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 19 llr = 183 E-value = 4.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  5456:54238195818
pos.-specific     C  :41:6:622:9:119:
probability       G  524235:532:15:11
matrix            T  11121:123::::1:1

         bits    2.2
                 2.0
                 1.7           *
                 1.5           **  *
Relative         1.3          ***  *
Entropy          1.1      *   *** ***
(13.9 bits)      0.9 *    **  *******
                 0.7 *  ****  *******
                 0.4 * *****  *******
                 0.2 ******** *******
                 0.0 ----------------

Multilevel           AAAACGCGAACAAACA
consensus            GCGGGAAAG   G
sequence              G      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
44784                       262  9.33e-08 GTCTAGTGAC GGGACGCGCACAGACA ACAGCAAAAC
43835                       133  4.86e-07 GACGACAACA ACAACAACAACAAACA CGGTTGACGT
50221                       313  9.54e-07 ATCAACCTCA GACAGACGTACAGACA TGACCGCCAC
46424                        74  2.41e-06 GTTTAGAGTT AGAATGCAAACAGACA TTGTTGAGAT
22095                        47  4.70e-06 GGAAAATGCC AAGAGGAGCACAACCA ACCTGGGTTT
47255                       183  5.81e-06 TTTCCAGGTG GAATGGCGGACAGACT ATAGAAGTTT
46320                        17  6.43e-06 GGTTAGATGC GCGGCGCAAACGGACA CTTTCGTTGC
48857                        61  7.10e-06 TGTGGCTGGC ACAAGGATTGCAAACA AAAGCGAGCG
45039                       328  7.83e-06 AGCTGACTTT GCAAGAATTGCAGACA TAGGCTCACC
46163                       310  1.13e-05 CGTGGTGACG AAGACAAAGACAAAGA TAAGGTGAAG
36476                       389  1.60e-05 ATAGCTCCAA AAAACGCAAACAGTCG AACGAGAAGG
32726                       253  1.74e-05 TACCCTAATG GTGTCGCCTACAGACA GCGTTCTAAC
43106                       448  1.74e-05 TGCATATAGC TCAGCACTGACAAACA AGATCTTCGT
49469                       482  2.23e-05 GTAGATCGCA ACAACGCCAACAATCG ACC
45117                        46  3.09e-05 GTTAAAGACC GCGTCGAGCAAAAACA GTTTATTATT
24186                       186  3.09e-05 TCCTAGCTCT AGTACACGTACACACA CACACAAACG
46461                         2  4.52e-05          C GACGGACGAACAGAAA GAAAGAGCAA
37579                       296  7.38e-05 CTATCTTGAT AAAATATGGACAACCA AATATATATT
36397                       244  1.76e-04 ACTTTTTTGG GGGGCAAGGGCGAACT TCGTGGTCCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44784                             9.3e-08  261_[+3]_223
43835                             4.9e-07  132_[+3]_352
50221                             9.5e-07  312_[+3]_172
46424                             2.4e-06  73_[+3]_411
22095                             4.7e-06  46_[+3]_438
47255                             5.8e-06  182_[+3]_302
46320                             6.4e-06  16_[+3]_468
48857                             7.1e-06  60_[+3]_424
45039                             7.8e-06  327_[+3]_157
46163                             1.1e-05  309_[+3]_175
36476                             1.6e-05  388_[+3]_96
32726                             1.7e-05  252_[+3]_232
43106                             1.7e-05  447_[+3]_37
49469                             2.2e-05  481_[+3]_3
45117                             3.1e-05  45_[+3]_439
24186                             3.1e-05  185_[+3]_299
46461                             4.5e-05  1_[+3]_483
37579                             7.4e-05  295_[+3]_189
36397                             0.00018  243_[+3]_241
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=19
44784                    (  262) GGGACGCGCACAGACA  1
43835                    (  133) ACAACAACAACAAACA  1
50221                    (  313) GACAGACGTACAGACA  1
46424                    (   74) AGAATGCAAACAGACA  1
22095                    (   47) AAGAGGAGCACAACCA  1
47255                    (  183) GAATGGCGGACAGACT  1
46320                    (   17) GCGGCGCAAACGGACA  1
48857                    (   61) ACAAGGATTGCAAACA  1
45039                    (  328) GCAAGAATTGCAGACA  1
46163                    (  310) AAGACAAAGACAAAGA  1
36476                    (  389) AAAACGCAAACAGTCG  1
32726                    (  253) GTGTCGCCTACAGACA  1
43106                    (  448) TCAGCACTGACAAACA  1
49469                    (  482) ACAACGCCAACAATCG  1
45117                    (   46) GCGTCGAGCAAAAACA  1
24186                    (  186) AGTACACGTACACACA  1
46461                    (    2) GACGGACGAACAGAAA  1
37579                    (  296) AAAATATGGACAACCA  1
36397                    (  244) GGGGCAAGGGCGAACT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9215 bayes= 8.91886 E= 4.2e+001
    88  -1089    110   -239
    51     59     -7   -239
    88   -122     74   -239
   129  -1089     -7    -80
 -1089    124     51   -139
    88  -1089    125  -1089
    51    124  -1089   -239
   -29    -63    110    -80
    29    -63     25     -7
   171  -1089    -49  -1089
  -229    195  -1089  -1089
   179  -1089   -107  -1089
    88   -222    110  -1089
   161   -122  -1089   -139
  -229    187   -207  -1089
   161  -1089   -107   -139
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 19 E= 4.2e+001
 0.473684  0.000000  0.473684  0.052632
 0.368421  0.368421  0.210526  0.052632
 0.473684  0.105263  0.368421  0.052632
 0.631579  0.000000  0.210526  0.157895
 0.000000  0.578947  0.315789  0.105263
 0.473684  0.000000  0.526316  0.000000
 0.368421  0.578947  0.000000  0.052632
 0.210526  0.157895  0.473684  0.157895
 0.315789  0.157895  0.263158  0.263158
 0.842105  0.000000  0.157895  0.000000
 0.052632  0.947368  0.000000  0.000000
 0.894737  0.000000  0.105263  0.000000
 0.473684  0.052632  0.473684  0.000000
 0.789474  0.105263  0.000000  0.105263
 0.052632  0.894737  0.052632  0.000000
 0.789474  0.000000  0.105263  0.105263
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AG][ACG][AG][AG][CG][GA][CA][GA][AGT]ACA[AG]ACA
--------------------------------------------------------------------------------

Time 9.53 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43106                            3.57e-02  447_[+3(1.74e-05)]_37
46424                            1.40e-02  73_[+3(2.41e-06)]_411
36397                            6.40e-04  174_[+2(3.72e-07)]_305
46461                            1.51e-04  1_[+3(4.52e-05)]_5_[+2(3.05e-07)]_\
    457
36476                            5.13e-10  225_[+1(1.18e-09)]_142_\
    [+3(1.60e-05)]_70_[+2(6.06e-07)]_5
47255                            2.97e-02  182_[+3(5.81e-06)]_302
37579                            8.29e-02  295_[+3(7.38e-05)]_189
22095                            1.63e-05  46_[+3(4.70e-06)]_365_\
    [+2(5.08e-07)]_52
32726                            3.01e-05  252_[+3(1.74e-05)]_170_\
    [+2(9.93e-08)]_41
49469                            1.95e-02  481_[+3(2.23e-05)]_3
50221                            4.86e-04  87_[+2(5.49e-05)]_204_\
    [+3(9.54e-07)]_172
43835                            8.28e-08  132_[+3(4.86e-07)]_257_\
    [+1(8.40e-05)]_7_[+2(7.07e-08)]_46
24186                            1.09e-10  60_[+2(2.85e-07)]_2_[+1(2.46e-10)]_\
    81_[+3(3.09e-05)]_299
44784                            7.76e-09  141_[+1(1.24e-09)]_99_\
    [+3(9.33e-08)]_223
45039                            4.82e-02  327_[+3(7.83e-06)]_157
46163                            2.55e-18  54_[+2(2.71e-13)]_3_[+1(6.32e-12)]_\
    210_[+3(1.13e-05)]_175
46320                            2.04e-02  16_[+3(6.43e-06)]_468
48857                            1.77e-06  60_[+3(7.10e-06)]_333_\
    [+2(1.27e-07)]_70
45117                            6.65e-18  45_[+3(3.09e-05)]_230_\
    [+2(2.71e-13)]_3_[+1(6.32e-12)]_164
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************