********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/42/42.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8657                     1.0000    500  36423                    1.0000    500
46639                    1.0000    500  39118                    1.0000    500
48710                    1.0000    500  43615                    1.0000    500
23168                    1.0000    500  43981                    1.0000    500
10282                    1.0000    500  11313                    1.0000    500
45408                    1.0000    500  45453                    1.0000    500
12171                    1.0000    500  42880                    1.0000    500
43228                    1.0000    500  45933                    1.0000    500
49599                    1.0000    500  37886                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/42/42.seqs.fa -oc motifs/42 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.239 G 0.225 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.239 G 0.225 T 0.265
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =   12   sites =  15   llr = 154   E-value = 3.0e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::8:6::1852
pos.-specific     C  4:8:a:1:7:23
probability       G  1:22::9:2:31
matrix            T  5a:::4:a:2:4

         bits    2.2     *
                 1.9  *  *  *
                 1.7  *  * **
                 1.5  *  * **
Relative         1.3  **** **
Entropy          1.1  **** ****
(14.8 bits)      0.9  *********
                 0.6 **********
                 0.4 ***********
                 0.2 ************
                 0.0 ------------

Multilevel           TTCACAGTCAAT
consensus            C GG T  GTGC
sequence                       CA

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
43228                        30  1.55e-07 GTGGTACTAT TTCACAGTCAAC AGGATGGGAG
36423                       267  2.97e-07 AAGTTTCTGC TTCACAGTCAGT TAGCTGCGAC
45933                       359  1.11e-06 GGTGGCACGT TTCACAGTCACT GCCAGTCCAT
45408                       401  1.11e-06 CATCTTCGTT CTCACTGTCAAC AGGCACACAC
23168                       151  1.50e-06 TTCCAACACA CTCACAGTCAAA TCACGTCACG
39118                       222  4.71e-06 TTAACATCAG TTCGCTGTCAAC ATTTACCTTA
37886                       289  5.85e-06 AATAGAAATA CTCGCTGTCAAC AAACTACACT
11313                        12  8.43e-06 GCATTTCTTC GTCACAGTCAGC GTACATTGCG
12171                       132  1.06e-05 AACTACTAGA TTCACAGTAAAT GCAAAGCCAA
43615                       129  1.42e-05 TGAAACCACC TTCGCAGTCTGT TTCTTTGTTC
48710                       217  1.64e-05 TCGCTCACAC CTCACAGTCTCA TAAAATATCT
46639                        83  2.05e-05 TAGTGGGACC TTGACTGTGAGT AGGAACCAAC
43981                       464  3.33e-05 GACGAGAATC CTGACTGTGAAA GGTCGTCTCC
45453                       301  3.56e-05 GTCCGGTTCA CTCACTCTCACT CACCCCATCC
42880                       371  8.39e-05 CCAAAGAACA TTGACAGTGTGG TCAACCAGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43228                             1.6e-07  29_[+1]_459
36423                               3e-07  266_[+1]_222
45933                             1.1e-06  358_[+1]_130
45408                             1.1e-06  400_[+1]_88
23168                             1.5e-06  150_[+1]_338
39118                             4.7e-06  221_[+1]_267
37886                             5.8e-06  288_[+1]_200
11313                             8.4e-06  11_[+1]_477
12171                             1.1e-05  131_[+1]_357
43615                             1.4e-05  128_[+1]_360
48710                             1.6e-05  216_[+1]_272
46639                               2e-05  82_[+1]_406
43981                             3.3e-05  463_[+1]_25
45453                             3.6e-05  300_[+1]_188
42880                             8.4e-05  370_[+1]_118
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=15
43228                    (   30) TTCACAGTCAAC  1
36423                    (  267) TTCACAGTCAGT  1
45933                    (  359) TTCACAGTCACT  1
45408                    (  401) CTCACTGTCAAC  1
23168                    (  151) CTCACAGTCAAA  1
39118                    (  222) TTCGCTGTCAAC  1
37886                    (  289) CTCGCTGTCAAC  1
11313                    (   12) GTCACAGTCAGC  1
12171                    (  132) TTCACAGTAAAT  1
43615                    (  129) TTCGCAGTCTGT  1
48710                    (  217) CTCACAGTCTCA  1
46639                    (   83) TTGACTGTGAGT  1
43981                    (  464) CTGACTGTGAAA  1
45453                    (  301) CTCACTCTCACT  1
42880                    (  371) TTGACAGTGTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 9.86941 E= 3.0e-003
 -1055     74   -175    101
 -1055  -1055  -1055    192
 -1055    174    -17  -1055
   156  -1055    -17  -1055
 -1055    206  -1055  -1055
   114  -1055  -1055     60
 -1055   -184    205  -1055
 -1055  -1055  -1055    192
  -202    162    -17  -1055
   156  -1055  -1055    -40
    78    -26     57  -1055
   -44     48   -175     60
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 15 E= 3.0e-003
 0.000000  0.400000  0.066667  0.533333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.800000  0.200000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.066667  0.933333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.066667  0.733333  0.200000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.466667  0.200000  0.333333  0.000000
 0.200000  0.333333  0.066667  0.400000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TC]T[CG][AG]C[AT]GT[CG][AT][AGC][TCA]
--------------------------------------------------------------------------------

Time  2.99 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =   13   sites =  10   llr = 116   E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :34:::9997::6
pos.-specific     C  11::9::1:2a2:
probability       G  56:11a::1::74
matrix            T  4:69::1::1:1:

         bits    2.2      *    *
                 1.9      *    *
                 1.7      *    *
                 1.5    ****** *
Relative         1.3    ****** *
Entropy          1.1    ****** * *
(16.7 bits)      0.9  ************
                 0.6 *************
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           GGTTCGAAAACGA
consensus            TAA      C CG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            -------------
46639                       250  1.50e-08 GTGCATTCTC GGTTCGAAAACGA ACCAATCAAG
43615                       487  9.45e-07 TAAAGGACAC GGAGCGAAAACGA G
23168                        67  1.14e-06 CAGTCTCAAC TCTTCGAAAACGG TAGCCACAGT
36423                       367  1.32e-06 TCACTTATTG TATTCGAAAACCA CTGACTGATA
43228                       387  1.47e-06 CCTCGCTCAC GAATCGAAACCGG GAGCGAGATA
37886                       455  1.89e-06 CGGTACTAGC TGATCGTAAACGA TATCTCTTCT
11313                       397  3.12e-06 GCGCACGCTG GAATCGAAGACGG CGCCAGCAGC
45933                       304  5.49e-06 CCTATCCCGA CGTTCGAAACCCA CTCACTCACT
48710                       318  5.82e-06 AATCAATTCG GGTTCGACATCGG ATCTTGACAA
12171                        86  7.75e-06 ATTGCTATTT TGTTGGAAAACTA AAAAAGTCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46639                             1.5e-08  249_[+2]_238
43615                             9.4e-07  486_[+2]_1
23168                             1.1e-06  66_[+2]_421
36423                             1.3e-06  366_[+2]_121
43228                             1.5e-06  386_[+2]_101
37886                             1.9e-06  454_[+2]_33
11313                             3.1e-06  396_[+2]_91
45933                             5.5e-06  303_[+2]_184
48710                             5.8e-06  317_[+2]_170
12171                             7.8e-06  85_[+2]_402
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=13 seqs=10
46639                    (  250) GGTTCGAAAACGA  1
43615                    (  487) GGAGCGAAAACGA  1
23168                    (   67) TCTTCGAAAACGG  1
36423                    (  367) TATTCGAAAACCA  1
43228                    (  387) GAATCGAAACCGG  1
37886                    (  455) TGATCGTAAACGA  1
11313                    (  397) GAATCGAAGACGG  1
45933                    (  304) CGTTCGAAACCCA  1
48710                    (  318) GGTTCGACATCGG  1
12171                    (   86) TGTTGGAAAACTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 8784 bayes= 9.2107 E= 1.5e+002
  -997   -126    115     60
    14   -126    141   -997
    56   -997   -997    118
  -997   -997   -117    177
  -997    191   -117   -997
  -997   -997    215   -997
   173   -997   -997   -140
   173   -126   -997   -997
   173   -997   -117   -997
   137    -26   -997   -140
  -997    206   -997   -997
  -997    -26    164   -140
   114   -997     83   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 10 E= 1.5e+002
 0.000000  0.100000  0.500000  0.400000
 0.300000  0.100000  0.600000  0.000000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.900000  0.100000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.900000  0.100000  0.000000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.700000  0.200000  0.000000  0.100000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.700000  0.100000
 0.600000  0.000000  0.400000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GT][GA][TA]TCGAAA[AC]C[GC][AG]
--------------------------------------------------------------------------------

Time  5.91 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =   16   sites =   8   llr = 110   E-value = 3.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  93::1:3::5::::13
pos.-specific     C  ::613:5:8:4:8::6
probability       G  :5:86:3a3:6a:a1:
matrix            T  1341:a:::5::3:81

         bits    2.2        *   * *
                 1.9      * *   * *
                 1.7      * *   * *
                 1.5      * *   * *
Relative         1.3 *    * **  ***
Entropy          1.1 * ** * ** ****
(19.8 bits)      0.9 * **** ********
                 0.6 * **************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           AGCGGTCGCAGGCGTC
consensus             AT C A GTC T  A
sequence              T    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ----------------
10282                       470  2.06e-08 TACATTCACG AGCGGTCGGTGGTGTC CAACCGGGAA
36423                       469  3.66e-08 TTGCGACCTA AGCGGTAGCAGGCGTT AATCCCTCAG
45453                         6  7.81e-08      TCTGG AGTCGTCGCACGCGTC GCTGCTCGTC
45408                       145  1.97e-07 ACTGAGATGT AGTGGTGGGTGGCGGC ATGTGGCACA
43615                       452  3.62e-07 CCATATTGTC ATTGATCGCAGGCGTA AAATCTCTAT
46639                       218  3.62e-07 GTTGGTCGGA AACTCTGGCTGGCGTC CGTCCGGTGC
49599                       120  4.26e-07 ATAGGCCCAA AACGGTAGCACGTGTA CCGAGGGGTT
42880                       472  1.38e-06 ACGGTATGGT TTCGCTCGCTCGCGAC GAGTTCTAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10282                             2.1e-08  469_[+3]_15
36423                             3.7e-08  468_[+3]_16
45453                             7.8e-08  5_[+3]_479
45408                               2e-07  144_[+3]_340
43615                             3.6e-07  451_[+3]_33
46639                             3.6e-07  217_[+3]_267
49599                             4.3e-07  119_[+3]_365
42880                             1.4e-06  471_[+3]_13
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
10282                    (  470) AGCGGTCGGTGGTGTC  1
36423                    (  469) AGCGGTAGCAGGCGTT  1
45453                    (    6) AGTCGTCGCACGCGTC  1
45408                    (  145) AGTGGTGGGTGGCGGC  1
43615                    (  452) ATTGATCGCAGGCGTA  1
46639                    (  218) AACTCTGGCTGGCGTC  1
49599                    (  120) AACGGTAGCACGTGTA  1
42880                    (  472) TTCGCTCGCTCGCGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 11.4132 E= 3.7e+002
   169   -965   -965   -108
   -12   -965    115     -8
  -965    139   -965     50
  -965    -93    174   -108
  -112      6    147   -965
  -965   -965   -965    192
   -12    106     15   -965
  -965   -965    215   -965
  -965    165     15   -965
    88   -965   -965     92
  -965     65    147   -965
  -965   -965    215   -965
  -965    165   -965     -8
  -965   -965    215   -965
  -112   -965    -85    150
   -12    139   -965   -108
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 3.7e+002
 0.875000  0.000000  0.000000  0.125000
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.625000  0.000000  0.375000
 0.000000  0.125000  0.750000  0.125000
 0.125000  0.250000  0.625000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.375000  0.625000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.000000  0.125000  0.750000
 0.250000  0.625000  0.000000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
A[GAT][CT]G[GC]T[CAG]G[CG][AT][GC]G[CT]GT[CA]
--------------------------------------------------------------------------------

Time  9.20 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8657                             7.15e-01  500
36423                            6.60e-10  266_[+1(2.97e-07)]_88_\
    [+2(1.32e-06)]_89_[+3(3.66e-08)]_16
46639                            4.37e-09  82_[+1(2.05e-05)]_123_\
    [+3(3.62e-07)]_16_[+2(1.50e-08)]_238
39118                            6.76e-03  221_[+1(4.71e-06)]_267
48710                            8.41e-04  216_[+1(1.64e-05)]_89_\
    [+2(5.82e-06)]_170
43615                            1.40e-07  128_[+1(1.42e-05)]_227_\
    [+1(6.54e-05)]_72_[+3(3.62e-07)]_19_[+2(9.45e-07)]_1
23168                            3.72e-05  52_[+1(1.64e-05)]_2_[+2(1.14e-06)]_\
    71_[+1(1.50e-06)]_338
43981                            6.53e-02  463_[+1(3.33e-05)]_25
10282                            1.29e-04  469_[+3(2.06e-08)]_15
11313                            4.47e-04  11_[+1(8.43e-06)]_373_\
    [+2(3.12e-06)]_91
45408                            2.06e-06  144_[+3(1.97e-07)]_240_\
    [+1(1.11e-06)]_88
45453                            5.88e-05  5_[+3(7.81e-08)]_279_[+1(3.56e-05)]_\
    188
12171                            9.44e-04  85_[+2(7.75e-06)]_33_[+1(1.06e-05)]_\
    357
42880                            9.14e-04  370_[+1(8.39e-05)]_89_\
    [+3(1.38e-06)]_13
43228                            3.19e-06  29_[+1(1.55e-07)]_202_\
    [+1(3.96e-05)]_131_[+2(1.47e-06)]_101
45933                            2.45e-05  303_[+2(5.49e-06)]_4_[+1(6.81e-06)]_\
    26_[+1(1.11e-06)]_130
49599                            2.29e-03  119_[+3(4.26e-07)]_365
37886                            1.36e-04  288_[+1(5.85e-06)]_154_\
    [+2(1.89e-06)]_33
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************