********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/144/144.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42442                    1.0000    500  17408                    1.0000    500
24996                    1.0000    500  54586                    1.0000    500
46588                    1.0000    500  13360                    1.0000    500
37131                    1.0000    500  13777                    1.0000    500
21667                    1.0000    500  14713                    1.0000    500
14915                    1.0000    500  39335                    1.0000    500
25420                    1.0000    500  49313                    1.0000    500
49426                    1.0000    500  33325                    1.0000    500
44586                    1.0000    500  11281                    1.0000    500
45164                    1.0000    500  54375                    1.0000    500
42654                    1.0000    500  38678                    1.0000    500
47494                    1.0000    500  54217                    1.0000    500
49231                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/144/144.seqs.fa -oc motifs/144 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 25 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 12500 N= 25 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.262 C 0.247 G 0.229 T 0.262 Background letter frequencies (from dataset with add-one prior applied): A 0.262 C 0.247 G 0.229 T 0.262 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 14 sites = 12 llr = 148 E-value = 1.2e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A ::9:5:2:a212:3 pos.-specific C :3:a:::2:2:53: probability G :81::a:8:72356 matrix T a:::5:81::8:21 bits 2.1 * 1.9 * * * * 1.7 * * * * 1.5 * ** * * Relative 1.3 **** ** * Entropy 1.1 **** **** (17.8 bits) 0.9 *********** * 0.6 ************** 0.4 ************** 0.2 ************** 0.0 -------------- Multilevel TGACAGTGAGTCGG consensus C T GCA sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------- 17408 437 1.10e-08 GGTACTTCAC TGACTGTGAGTGGG GGTCCTCATT 54375 139 2.93e-08 TGTGAGTATT TGACTGTGAGTGCG 
AAGAGCAGAA 49426 315 1.70e-07 TCGTTCGGAC TGACAGTGAGGCCG CACACTGTCG 49231 362 3.11e-07 CTAACATATG TGACAGTGACTGCG AAAGCCAGCC 47494 415 5.81e-07 GTGGACTTGT TCACAGTCAGTCGG GTCCATTGCA 46588 473 7.85e-07 TCCACCTTGC TGACTGTTAGTCGA ACCATTTCCG 42442 21 9.20e-07 CATACAAACC TGACAGAGAGGGGG TTTGTTTTGA 37131 446 1.43e-06 CGTCCTCTAT TGACTGAGAATCGA ATCCAATCGA 38678 260 1.59e-06 CACTGGTAAG TGACTGTGACTATG TGACAGCATC 11281 136 4.03e-06 GGCATGGCTT TGACTGTGAAACCA GCCCAACATA 39335 79 5.15e-06 AAACAATGAT TCGCAGTGAGTAGA GATTGGCTTG 45164 365 9.27e-06 AAGAACAAAG TCACAGTCAGTCTT TCGATAGGGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 17408 1.1e-08 436_[+1]_50 54375 2.9e-08 138_[+1]_348 49426 1.7e-07 314_[+1]_172 49231 3.1e-07 361_[+1]_125 47494 5.8e-07 414_[+1]_72 46588 7.9e-07 472_[+1]_14 42442 9.2e-07 20_[+1]_466 37131 1.4e-06 445_[+1]_41 38678 1.6e-06 259_[+1]_227 11281 4e-06 135_[+1]_351 39335 5.1e-06 78_[+1]_408 45164 9.3e-06 364_[+1]_122 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=14 seqs=12 17408 ( 437) TGACTGTGAGTGGG 1 54375 ( 139) TGACTGTGAGTGCG 1 49426 ( 315) TGACAGTGAGGCCG 1 49231 ( 362) TGACAGTGACTGCG 1 47494 ( 415) TCACAGTCAGTCGG 1 46588 ( 473) TGACTGTTAGTCGA 1 42442 ( 21) TGACAGAGAGGGGG 1 37131 ( 446) TGACTGAGAATCGA 1 38678 ( 260) TGACTGTGACTATG 1 11281 ( 136) TGACTGTGAAACCA 1 39335 ( 79) TCGCAGTGAGTAGA 1 45164 ( 365) TCACAGTCAGTCTT 1 // -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 14 n= 12175 bayes= 10.4331 E= 1.2e-001 -1023 -1023 -1023 193 -1023 2 171 -1023 181 -1023 -146 -1023 -1023 202 -1023 -1023 93 -1023 -1023 93 -1023 -1023 213 -1023 -65 -1023 -1023 167 -1023 -57 171 -165 193 -1023 -1023 -1023 -65 -57 154 -1023 -165 -1023 -46 151 -65 102 54 -1023 -1023 43 113 -65 35 -1023 135 -165 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 14 nsites= 12 E= 1.2e-001 0.000000 0.000000 0.000000 1.000000 0.000000 0.250000 0.750000 0.000000 0.916667 0.000000 0.083333 0.000000 0.000000 1.000000 0.000000 0.000000 0.500000 0.000000 0.000000 0.500000 0.000000 0.000000 1.000000 0.000000 0.166667 0.000000 0.000000 0.833333 0.000000 0.166667 0.750000 0.083333 1.000000 0.000000 0.000000 0.000000 0.166667 0.166667 0.666667 0.000000 0.083333 0.000000 0.166667 0.750000 0.166667 0.500000 0.333333 0.000000 0.000000 0.333333 0.500000 0.166667 0.333333 0.000000 0.583333 0.083333 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- T[GC]AC[AT]GTGAGT[CG][GC][GA] -------------------------------------------------------------------------------- Time 6.39 secs. 
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 24 llr = 203 E-value = 1.6e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 4::::::1::4: pos.-specific C ::::7:152128 probability G 21:9292:28:: matrix T 59a111846142 bits 2.1 1.9 1.7 * 1.5 ** * Relative 1.3 *** * Entropy 1.1 *** * * * (12.2 bits) 0.9 ****** * * 0.6 ********* * 0.4 ********** * 0.2 ************ 0.0 ------------ Multilevel TTTGCGTCTGAC consensus A G TC T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 14915 147 7.50e-07 TTGACAATCG ATTGCGTCTGCC GTTGGCGTGA 54375 51 1.29e-06 CCGACCATCG GTTGCGTTTGAC TTCTTTTCCA 24996 160 2.52e-06 CACACGTATC TTTGCGTTGGTC ACCCACAGTC 45164 109 1.02e-05 ATAGAACGAC TTTGGGTCTGAT AAACTAGAAA 39335 12 1.40e-05 ATGACCCGAG ATTGCGTTTCAC TTTCACAGTC 21667 311 1.40e-05 ACGGTCCGTA GTTTCGTCTGTC TGTCCGTGAC 37131 180 1.40e-05 ACTCCGAAGG GTTTCGTCTGTC CCTCTTACCA 25420 244 1.74e-05 ATTTCTAATG ATTGCGTACGTC CGTCAACGCA 49231 22 2.39e-05 TGGGAGAGAA TTTGGGTCTCAC CTATCTGTGC 17408 301 2.39e-05 GTACCATAGC ATTGGGGCTGCC ACATTCGGCC 49426 489 2.63e-05 CACTACTCTA TTTGCTTTCGAC 54586 172 3.85e-05 TGCATGTCGG TTTGCGTTCGCT CCTCCCCGTC 38678 219 4.62e-05 GGAAAGAGCA TGTGTGTCTGTC GGTCTGGCCG 14713 295 4.62e-05 ACGGAAAAAG ATCGCGTCTGCC CGTTGCTCCC 47494 197 5.52e-05 GGTAGGAACT TTTGCGCCTTTC GCTCCGTTCC 49313 478 6.57e-05 CATCGGCTCC 
ATTGTGTTTGGC TTTGCTTTAT 13360 350 6.57e-05 CTACCGCGGT TTTGCGTTGGAG CGAAAGGCAA 33325 365 8.36e-05 CTTTGCTTCT TTTGCGTCCTTT GTCCACCTTC 42442 406 1.06e-04 TCCCCCGGCA TTTGTGTCGTTC TTGTCGCCTC 54217 209 1.34e-04 AGGATCTAAA AATGCGTATGTC TCATTGGATT 13777 347 1.68e-04 AAGTGAATGG TGTGGGCCTGAC TATTCCTATT 42654 347 1.93e-04 ATCCAGCTTC GTTTCGGTCGAC GAAGACAACT 11281 203 1.93e-04 ATAGTTGGCC ATTGCTGTTGAT GGGACTGTGC 44586 135 2.22e-04 TGCAGAGTGC ATTGGTGCGGAC AATACGATTA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 14915 7.5e-07 146_[+2]_342 54375 1.3e-06 50_[+2]_438 24996 2.5e-06 159_[+2]_329 45164 1e-05 108_[+2]_380 39335 1.4e-05 11_[+2]_477 21667 1.4e-05 310_[+2]_178 37131 1.4e-05 179_[+2]_309 25420 1.7e-05 243_[+2]_245 49231 2.4e-05 21_[+2]_467 17408 2.4e-05 300_[+2]_188 49426 2.6e-05 488_[+2] 54586 3.9e-05 171_[+2]_317 38678 4.6e-05 218_[+2]_270 14713 4.6e-05 294_[+2]_194 47494 5.5e-05 196_[+2]_292 49313 6.6e-05 477_[+2]_11 13360 6.6e-05 349_[+2]_139 33325 8.4e-05 364_[+2]_124 42442 0.00011 405_[+2]_83 54217 0.00013 208_[+2]_280 13777 0.00017 346_[+2]_142 42654 0.00019 346_[+2]_142 11281 0.00019 202_[+2]_286 44586 0.00022 134_[+2]_354 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=24 14915 ( 147) ATTGCGTCTGCC 1 54375 ( 51) GTTGCGTTTGAC 1 24996 ( 160) TTTGCGTTGGTC 1 45164 ( 109) TTTGGGTCTGAT 1 39335 ( 12) ATTGCGTTTCAC 1 21667 ( 311) GTTTCGTCTGTC 1 37131 ( 180) GTTTCGTCTGTC 1 25420 ( 244) ATTGCGTACGTC 1 49231 ( 22) TTTGGGTCTCAC 1 17408 
( 301) ATTGGGGCTGCC 1 49426 ( 489) TTTGCTTTCGAC 1 54586 ( 172) TTTGCGTTCGCT 1 38678 ( 219) TGTGTGTCTGTC 1 14713 ( 295) ATCGCGTCTGCC 1 47494 ( 197) TTTGCGCCTTTC 1 49313 ( 478) ATTGTGTTTGGC 1 13360 ( 350) TTTGCGTTGGAG 1 33325 ( 365) TTTGCGTCCTTT 1 42442 ( 406) TTTGTGTCGTTC 1 54217 ( 209) AATGCGTATGTC 1 13777 ( 347) TGTGGGCCTGAC 1 42654 ( 347) GTTTCGGTCGAC 1 11281 ( 203) ATTGCTGTTGAT 1 44586 ( 135) ATTGGTGCGGAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 12225 bayes= 9.43796 E= 1.6e+002 52 -1123 -46 80 -265 -1123 -146 174 -1123 -256 -1123 187 -1123 -1123 193 -107 -1123 143 -14 -107 -1123 -1123 193 -107 -1123 -157 -46 151 -165 113 -1123 51 -1123 -24 -46 125 -1123 -157 179 -107 67 -57 -245 51 -1123 168 -245 -65 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 24 E= 1.6e+002 0.375000 0.000000 0.166667 0.458333 0.041667 0.000000 0.083333 0.875000 0.000000 0.041667 0.000000 0.958333 0.000000 0.000000 0.875000 0.125000 0.000000 0.666667 0.208333 0.125000 0.000000 0.000000 0.875000 0.125000 0.000000 0.083333 0.166667 0.750000 0.083333 0.541667 0.000000 0.375000 0.000000 0.208333 0.166667 0.625000 0.000000 0.083333 0.791667 0.125000 0.416667 0.166667 0.041667 0.375000 0.000000 0.791667 0.041667 0.166667 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression 
-------------------------------------------------------------------------------- [TA]TTG[CG]GT[CT][TC]G[AT]C -------------------------------------------------------------------------------- Time 12.78 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 5 llr = 101 E-value = 9.7e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :a:2:::::a2:2a::2:::2 pos.-specific C ::8:2:6:4:::6:2:2::4: probability G a::8224a6:822:::::46: matrix T ::2:68:::::8::8a6a6:8 bits 2.1 * * 1.9 ** * * * * * 1.7 ** * * * * * 1.5 ** * * * * * Relative 1.3 **** * * *** *** * * Entropy 1.1 **** ******* *** **** (29.1 bits) 0.9 **** ******* *** **** 0.6 ********************* 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel GACGTTCGGAGTCATTTTTGT consensus TACGG C AGA C A GCA sequence G G C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 33325 341 1.42e-10 AGATTCACCA GACGGTCGGAGTCACTTTGCT TCTTTTGCGT 49426 161 1.73e-10 TCTGGCCTTG GACATTGGGAGTCATTCTGGT TTCGAAAAGG 38678 156 2.72e-10 CTTCGAAACG GACGTGCGGAGGAATTTTTGT TTTTTGTGAG 21667 49 4.52e-10 GTTGAGGGGC GACGCTCGCAATGATTTTTGT TCGCCGTAGG 49231 231 1.32e-09 CGTGTCCAGC GATGTTGGCAGTCATTATTCA ATTTTAAATA -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 33325 1.4e-10 340_[+3]_139 49426 1.7e-10 160_[+3]_319 38678 2.7e-10 155_[+3]_324 21667 4.5e-10 48_[+3]_431 49231 1.3e-09 230_[+3]_249 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=5 33325 ( 341) GACGGTCGGAGTCACTTTGCT 1 49426 ( 161) GACATTGGGAGTCATTCTGGT 1 38678 ( 156) GACGTGCGGAGGAATTTTTGT 1 21667 ( 49) GACGCTCGCAATGATTTTTGT 1 49231 ( 231) GATGTTGGCAGTCATTATTCA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 12000 bayes= 11.4799 E= 9.7e+002 -897 -897 213 -897 193 -897 -897 -897 -897 169 -897 -39 -39 -897 180 -897 -897 -30 -19 119 -897 -897 -19 161 -897 128 80 -897 -897 -897 213 -897 -897 70 139 -897 193 -897 -897 -897 -39 -897 180 -897 -897 -897 -19 161 -39 128 -19 -897 193 -897 -897 -897 -897 -30 -897 161 -897 -897 -897 193 -39 -30 -897 119 -897 -897 -897 193 -897 -897 80 119 -897 70 139 -897 -39 -897 -897 161 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 9.7e+002 0.000000 0.000000 
1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.800000 0.000000 0.200000 0.200000 0.000000 0.800000 0.000000 0.000000 0.200000 0.200000 0.600000 0.000000 0.000000 0.200000 0.800000 0.000000 0.600000 0.400000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.400000 0.600000 0.000000 1.000000 0.000000 0.000000 0.000000 0.200000 0.000000 0.800000 0.000000 0.000000 0.000000 0.200000 0.800000 0.200000 0.600000 0.200000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.200000 0.000000 0.800000 0.000000 0.000000 0.000000 1.000000 0.200000 0.200000 0.000000 0.600000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.400000 0.600000 0.000000 0.400000 0.600000 0.000000 0.200000 0.000000 0.000000 0.800000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- GA[CT][GA][TCG][TG][CG]G[GC]A[GA][TG][CAG]A[TC]T[TAC]T[TG][GC][TA] -------------------------------------------------------------------------------- Time 18.86 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42442                            4.69e-04  20_[+1(9.20e-07)]_466
17408                            1.60e-06  300_[+2(2.39e-05)]_124_\
    [+1(1.10e-08)]_50
24996                            8.21e-03  159_[+2(2.52e-06)]_329
54586                            1.95e-02  171_[+2(3.85e-05)]_317
46588                            1.46e-03  472_[+1(7.85e-07)]_14
13360                            1.67e-02  349_[+2(6.57e-05)]_139
37131                            3.89e-04  179_[+2(1.40e-05)]_254_\
    [+1(1.43e-06)]_41
13777                            1.98e-01  500
21667                            1.50e-07  48_[+3(4.52e-10)]_241_\
    [+2(1.40e-05)]_178
14713                            1.13e-01  294_[+2(4.62e-05)]_194
14915                            3.13e-03  8_[+2(1.94e-05)]_126_[+2(7.50e-07)]_\
    342
39335                            1.45e-04  11_[+2(1.40e-05)]_55_[+1(5.15e-06)]_\
    408
25420                            1.51e-02  243_[+2(1.74e-05)]_245
49313                            1.39e-01  477_[+2(6.57e-05)]_11
49426                            4.25e-11  81_[+3(8.27e-05)]_58_[+3(1.73e-10)]_\
    133_[+1(1.70e-07)]_160_[+2(2.63e-05)]
33325                            4.94e-07  340_[+3(1.42e-10)]_3_[+2(8.36e-05)]_\
    124
44586                            1.47e-01  500
11281                            7.10e-03  135_[+1(4.03e-06)]_351
45164                            1.26e-03  108_[+2(1.02e-05)]_244_\
    [+1(9.27e-06)]_122
54375                            5.16e-07  50_[+2(1.29e-06)]_76_[+1(2.93e-08)]_\
    348
42654                            3.61e-01  500
38678                            8.73e-10  155_[+3(2.72e-10)]_42_\
    [+2(4.62e-05)]_29_[+1(1.59e-06)]_227
47494                            5.52e-05  196_[+2(5.52e-05)]_206_\
    [+1(5.81e-07)]_72
54217                            3.79e-01  500
49231                            4.55e-10  21_[+2(2.39e-05)]_197_\
    [+3(1.32e-09)]_110_[+1(3.11e-07)]_125
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************