********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/22/22.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8658                     1.0000    500  14553                    1.0000    500
15565                    1.0000    500  49176                    1.0000    500
55073                    1.0000    500  45710                    1.0000    500
42825                    1.0000    500  43215                    1.0000    500
43602                    1.0000    500  36649                    1.0000    500
36877                    1.0000    500  48105                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/22/22.seqs.fa -oc motifs/22 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.277 C 0.237 G 0.226 T 0.261
Background letter frequencies (from dataset with add-one prior applied):
A 0.277 C 0.237 G 0.226 T 0.260
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  18  sites =   3  llr = 62  E-value = 1.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a::::::a:::7::::3:
pos.-specific     C  :::a:::::7::3::7::
probability       G  :7::aa::a3a37a733a
matrix            T  :3a:::a:::::::3:3:

         bits    2.1    ***  * *  *   *
                 1.9 * ******* *  *   *
                 1.7 * ******* *  *   *
                 1.5 * ******* *  *   *
Relative         1.3 * ********* ** * *
Entropy          1.1 **************** *
(29.9 bits)      0.9 **************** *
                 0.6 **************** *
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           AGTCGGTAGCGAGGGCAG
consensus             T       G GC TGG
sequence                             T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
14553                       187  1.14e-10 GCGCAATCAC AGTCGGTAGCGACGGCTG CAGCATCAAG
49176                       272  3.49e-10 TTATATAGTG AGTCGGTAGGGAGGGGAG AGGGTACCTA
8658                        121  6.48e-10 TCGACACCGG ATTCGGTAGCGGGGTCGG TGATTACAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14553                             1.1e-10  186_[+1]_296
49176                             3.5e-10  271_[+1]_211
8658                              6.5e-10  120_[+1]_362
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=3
14553                    (  187) AGTCGGTAGCGACGGCTG  1
49176                    (  272) AGTCGGTAGGGAGGGGAG  1
8658                     (  121) ATTCGGTAGCGGGGTCGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 5796 bayes= 11.3628 E= 1.7e+002
   185   -823   -823   -823
  -823   -823    156     35
  -823   -823   -823    194
  -823    207   -823   -823
  -823   -823    214   -823
  -823   -823    214   -823
  -823   -823   -823    194
   185   -823   -823   -823
  -823   -823    214   -823
  -823    149     56   -823
  -823   -823    214   -823
   127   -823     56   -823
  -823     49    156   -823
  -823   -823    214   -823
  -823   -823    156     35
  -823    149     56   -823
    27   -823     56     35
  -823   -823    214   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 3 E= 1.7e+002
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.666667  0.333333  0.000000
 0.333333  0.000000  0.333333  0.333333
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
A[GT]TCGGTAG[CG]G[AG][GC]G[GT][CG][AGT]G
--------------------------------------------------------------------------------

Time  1.33 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  18  sites =  10  llr = 124  E-value = 2.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  11::732:31:3:24721
pos.-specific     C  ::4:276726::a23:44
probability       G  ::4a::::2::7:13:::
matrix            T  992:1:2333a::5:345

         bits    2.1    *        *
                 1.9    *      * *
                 1.7    *      * *
                 1.5 ** *      * *
Relative         1.3 ** *      * *
Entropy          1.1 ** * * *  ***  *
(17.9 bits)      0.9 ** * * *  ***  *
                 0.6 ******** ****  * *
                 0.4 ******** **** ****
                 0.2 ******** *********
                 0.0 ------------------

Multilevel           TTCGACCCACTGCTAACT
consensus              G CAATTT A ACTTC
sequence               T   T C    CG A
                             G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
45710                       269  3.25e-10 GTTCTATAGG TTGGACCCACTGCTCACT GACTGTGACA
43602                       469  1.03e-07 TCAATAGAAT TTGGACTCGCTACTGACC GTTCTGAGGT
8658                        402  1.16e-07 GGAAGCGTTT TTCGACACTCTGCGAATT ATGTGCTGCT
48105                       335  2.01e-07 AGTGTAGACG TTCGAACCACTGCACTTT GTTGCCGTCG
36649                       324  1.21e-06 AGAAAGTACC TAGGACATGCTGCTAACT AGCCCAAGCA
42825                       286  1.65e-06 GCTCTAGATC TTGGCCTCCTTGCCATTT CCTTTTTCCT
43215                        33  2.10e-06 TCATTTACAT TTTGACCCTTTACTGTTA AAAAATTCTT
36877                        67  3.58e-06 CCTGACAGCC TTTGTACCTATGCTCACC TTTTCTTATA
14553                       367  4.05e-06 CTACCGGCCG ATCGACCTATTGCAAAAC TGTCCGTCTC
49176                       428  5.15e-06 CCCCCCCCCC TTCGCACTCCTACCGAAC GACTCTCGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45710                             3.2e-10  268_[+2]_214
43602                               1e-07  468_[+2]_14
8658                              1.2e-07  401_[+2]_81
48105                               2e-07  334_[+2]_148
36649                             1.2e-06  323_[+2]_159
42825                             1.7e-06  285_[+2]_197
43215                             2.1e-06  32_[+2]_450
36877                             3.6e-06  66_[+2]_416
14553                             4.1e-06  366_[+2]_116
49176                             5.1e-06  427_[+2]_55
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=10
45710                    (  269) TTGGACCCACTGCTCACT  1
43602                    (  469) TTGGACTCGCTACTGACC  1
8658                     (  402) TTCGACACTCTGCGAATT  1
48105                    (  335) TTCGAACCACTGCACTTT  1
36649                    (  324) TAGGACATGCTGCTAACT  1
42825                    (  286) TTGGCCTCCTTGCCATTT  1
43215                    (   33) TTTGACCCTTTACTGTTA  1
36877                    (   67) TTTGTACCTATGCTCACC  1
14553                    (  367) ATCGACCTATTGCAAAAC  1
49176                    (  428) TTCGCACTCCTACCGAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 5796 bayes= 9.42836 E= 2.3e+002
  -147   -997   -997    179
  -147   -997   -997    179
  -997     75     82    -38
  -997   -997    215   -997
   134    -24   -997   -138
    12    156   -997   -997
   -47    134   -997    -38
  -997    156   -997     20
    12    -24    -18     20
  -147    134   -997     20
  -997   -997   -997    194
    12   -997    163   -997
  -997    208   -997   -997
   -47    -24   -117     94
    53     34     41   -997
   134   -997   -997     20
   -47     75   -997     62
  -147     75   -997     94
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 10 E= 2.3e+002
 0.100000  0.000000  0.000000  0.900000
 0.100000  0.000000  0.000000  0.900000
 0.000000  0.400000  0.400000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.700000  0.200000  0.000000  0.100000
 0.300000  0.700000  0.000000  0.000000
 0.200000  0.600000  0.000000  0.200000
 0.000000  0.700000  0.000000  0.300000
 0.300000  0.200000  0.200000  0.300000
 0.100000  0.600000  0.000000  0.300000
 0.000000  0.000000  0.000000  1.000000
 0.300000  0.000000  0.700000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.200000  0.100000  0.500000
 0.400000  0.300000  0.300000  0.000000
 0.700000  0.000000  0.000000  0.300000
 0.200000  0.400000  0.000000  0.400000
 0.100000  0.400000  0.000000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TT[CGT]G[AC][CA][CAT][CT][ATCG][CT]T[GA]C[TAC][ACG][AT][CTA][TC]
--------------------------------------------------------------------------------

Time  2.62 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  14  sites =  12  llr = 123  E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  985:31811932::
pos.-specific     C  1::a1:1:91:142
probability       G  :15:33:1::1248
matrix            T  :1::3628::762:

         bits    2.1    *
                 1.9    *
                 1.7    *    *
                 1.5 *  *    **   *
Relative         1.3 *  *    **   *
Entropy          1.1 ****   ***   *
(14.7 bits)      0.9 ****  ****   *
                 0.6 **** ****** **
                 0.4 **** *********
                 0.2 **************
                 0.0 --------------

Multilevel           AAACGTATCATTCG
consensus              G TG    A G
sequence                 A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
45710                       412  5.56e-08 TTACAAAGGA AAGCATATCATTCG GCAAAGATAT
48105                       182  1.55e-06 CCGTAATTGC AAGCGGAGCATTCG CAAGCAGAGA
43215                       425  1.96e-06 TTTTCCGAAC AGACATATCATTGG TCTGGAGCAT
42825                       403  3.69e-06 AGGAAATACG AAGCGTCTCATAGG AAACCAAATA
15565                       117  4.03e-06 GTATCCCGAC AAACTGATCATGGC TGTGCATTCA
49176                       111  4.45e-06 TGCCGTTCCC AAACGTTTCAGTCG ATGGAGAGAG
14553                       411  8.53e-06 TGACACATAC AAGCATATCAATTC GGTGTATTAC
8658                          4  8.53e-06        TGG AAACTGAACAATCG AGAAAACCGC
36877                       403  1.50e-05 GCAGCAGTCA AAACCAATCATGGG ACCTGCGTAG
36649                       366  1.61e-05 TCACTCTGAG CAGCTTTTCATTTG TTTGACCAGT
55073                        15  3.94e-05 GGTGAGGTGT ATACGTATCCTACG GATAGTGTAC
43602                       354  4.20e-05 GCTTTTATCC AAGCTGATAAACGG TGGACGTTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45710                             5.6e-08  411_[+3]_75
48105                             1.6e-06  181_[+3]_305
43215                               2e-06  424_[+3]_62
42825                             3.7e-06  402_[+3]_84
15565                               4e-06  116_[+3]_370
49176                             4.5e-06  110_[+3]_376
14553                             8.5e-06  410_[+3]_76
8658                              8.5e-06  3_[+3]_483
36877                             1.5e-05  402_[+3]_84
36649                             1.6e-05  365_[+3]_121
55073                             3.9e-05  14_[+3]_472
43602                             4.2e-05  353_[+3]_133
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=14 seqs=12
45710                    (  412) AAGCATATCATTCG  1
48105                    (  182) AAGCGGAGCATTCG  1
43215                    (  425) AGACATATCATTGG  1
42825                    (  403) AAGCGTCTCATAGG  1
15565                    (  117) AAACTGATCATGGC  1
49176                    (  111) AAACGTTTCAGTCG  1
14553                    (  411) AAGCATATCAATTC  1
8658                     (    4) AAACTGAACAATCG  1
36877                    (  403) AAACCAATCATGGG  1
36649                    (  366) CAGCTTTTCATTTG  1
55073                    (   15) ATACGTATCCTACG  1
43602                    (  354) AAGCTGATAAACGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 5844 bayes= 8.92481 E= 1.5e+002
   173   -151  -1023  -1023
   159  -1023   -144   -164
    85  -1023    115  -1023
 -1023    208  -1023  -1023
   -15   -151     56     36
  -173  -1023     56    116
   144   -151  -1023    -64
  -173  -1023   -144    168
  -173    195  -1023  -1023
   173   -151  -1023  -1023
   -15  -1023   -144    135
   -73   -151    -44    116
 -1023     81     88    -64
 -1023    -51    188  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 12 E= 1.5e+002
 0.916667  0.083333  0.000000  0.000000
 0.833333  0.000000  0.083333  0.083333
 0.500000  0.000000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.083333  0.333333  0.333333
 0.083333  0.000000  0.333333  0.583333
 0.750000  0.083333  0.000000  0.166667
 0.083333  0.000000  0.083333  0.833333
 0.083333  0.916667  0.000000  0.000000
 0.916667  0.083333  0.000000  0.000000
 0.250000  0.000000  0.083333  0.666667
 0.166667  0.083333  0.166667  0.583333
 0.000000  0.416667  0.416667  0.166667
 0.000000  0.166667  0.833333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AA[AG]C[GTA][TG]ATCA[TA]T[CG]G
--------------------------------------------------------------------------------

Time  3.93 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8658                             3.56e-11  3_[+3(8.53e-06)]_103_[+1(6.48e-10)]_\
    32_[+1(7.27e-05)]_213_[+2(1.16e-07)]_81
14553                            1.93e-10  186_[+1(1.14e-10)]_162_\
    [+2(4.05e-06)]_26_[+3(8.53e-06)]_76
15565                            1.47e-02  116_[+3(4.03e-06)]_370
49176                            3.75e-10  110_[+3(4.45e-06)]_147_\
    [+1(3.49e-10)]_138_[+2(5.15e-06)]_55
55073                            6.43e-02  14_[+3(3.94e-05)]_472
45710                            1.07e-09  268_[+2(3.25e-10)]_125_\
    [+3(5.56e-08)]_75
42825                            1.32e-04  285_[+2(1.65e-06)]_99_\
    [+3(3.69e-06)]_84
43215                            9.49e-05  32_[+2(2.10e-06)]_374_\
    [+3(1.96e-06)]_62
43602                            1.97e-05  353_[+3(4.20e-05)]_101_\
    [+2(1.03e-07)]_14
36649                            1.83e-04  323_[+2(1.21e-06)]_24_\
    [+3(1.61e-05)]_121
36877                            4.10e-04  66_[+2(3.58e-06)]_318_\
    [+3(1.50e-05)]_84
48105                            4.91e-06  181_[+3(1.55e-06)]_139_\
    [+2(2.01e-07)]_148
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
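
Note on reading the combined block diagrams: each diagram alternates gap
lengths with bracketed motif occurrences, so the 1-based start of every site
can be recovered by walking the string and adding up gaps and motif widths.
The sketch below illustrates this for the diagrams in this file. It is not
part of MEME; the helper name diagram_to_sites is hypothetical, and the motif
widths are copied by hand from the MOTIF header lines of this file.

```python
import re

# Assumption: widths copied from the "MOTIF n MEME width = ..." lines of this file.
MOTIF_WIDTHS = {1: 18, 2: 18, 3: 14}

# One bracketed occurrence, e.g. "[+2(4.05e-06)]" or "[+2]".
SITE_TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def diagram_to_sites(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, strand, start, p_value), ...], total_length).

    Starts are 1-based; p_value is None when the diagram omits it.
    """
    # Join continuation lines: drop each backslash and the whitespace after it.
    diagram = re.sub(r"\\\s*", "", diagram).strip()
    sites = []
    pos = 0  # number of sequence positions consumed so far
    for token in diagram.split("_"):
        match = SITE_TOKEN.fullmatch(token)
        if match:
            strand, motif, p_value = match.group(1), int(match.group(2)), match.group(3)
            sites.append((motif, strand, pos + 1,
                          float(p_value) if p_value else None))
            pos += widths[motif]
        elif token:
            pos += int(token)  # a plain number is a gap length
    return sites, pos


if __name__ == "__main__":
    # Diagram for sequence 14553, including its line continuation.
    example = "186_[+1(1.14e-10)]_162_\\\n    [+2(4.05e-06)]_26_[+3(8.53e-06)]_76"
    found, length = diagram_to_sites(example)
    for motif, strand, start, p_value in found:
        print(motif, strand, start, p_value)
    print("sequence length:", length)
```

For sequence 14553 this walk gives starts 187, 367 and 411 for motifs 1, 2 and
3, which agree with the Start columns of the per-motif site tables, and a total
length of 500, which agrees with the TRAINING SET table.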