********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/309/309.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10297                    1.0000    500  10407                    1.0000    500
14979                    1.0000    500  19094                    1.0000    500
1947                     1.0000    500  22030                    1.0000    500
22332                    1.0000    500  22835                    1.0000    500
22967                    1.0000    500  24294                    1.0000    500
25133                    1.0000    500  25203                    1.0000    500
25392                    1.0000    500  3137                     1.0000    500
3222                     1.0000    500  41463                    1.0000    500
6850                     1.0000    500  8835                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/309/309.seqs.fa -oc motifs/309 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 18 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 9000 N= 18 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.272 C 0.240 G 0.227 T 0.261 Background letter frequencies (from dataset with add-one prior applied): A 0.272 C 0.240 G 0.227 T 0.261 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 19 sites = 8 llr = 141 E-value = 5.5e-006 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :9::a::3::3:13::::1 pos.-specific C 9::9::9:6a:95:48:89 probability G ::1::::8::8::5::6:: matrix T 1191:a1:4::1436343: bits 2.1 * 1.9 ** * 1.7 ** * 1.5 * ***** * * * Relative 1.3 ******** *** * ** Entropy 1.1 ************ ***** (25.4 bits) 0.9 ************ ***** 0.6 ************* ***** 0.4 ******************* 0.2 ******************* 0.0 ------------------- Multilevel CATCATCGCCGCCGTCGCC consensus AT A TACTTT sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------- 25133 437 2.16e-11 TAGCTGCAAT 
CATCATCGTCGCTGTCGCC AGAGGAGGCA 10407 272 2.16e-11 TAGCTGCAAT CATCATCGTCGCTGTCGCC AGAGGAGGCA 41463 6 2.03e-10 CACAA CATCATCGCCGCCATCGTC CTCGTGTCGT 25203 309 6.01e-09 TCCACTGCAG TATCATTGCCGCCGTCTCC GTCTGGGCTC 22967 183 8.80e-09 CGAACAACTC CTTCATCGCCACCTCCTCC ATCCATCCGA 3222 444 1.66e-08 GTCATCACAT CATCATCACCGCTGCTTCA ATTGACACCT 1947 61 2.75e-08 GAGGCTACCA CAGTATCGCCGCATCCGCC GCAGCCTCAG 22835 420 8.85e-08 CAGAGATTGA CATCATCATCATCATTGTC AAAACTTCGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 25133 2.2e-11 436_[+1]_45 10407 2.2e-11 271_[+1]_210 41463 2e-10 5_[+1]_476 25203 6e-09 308_[+1]_173 22967 8.8e-09 182_[+1]_299 3222 1.7e-08 443_[+1]_38 1947 2.7e-08 60_[+1]_421 22835 8.8e-08 419_[+1]_62 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=19 seqs=8 25133 ( 437) CATCATCGTCGCTGTCGCC 1 10407 ( 272) CATCATCGTCGCTGTCGCC 1 41463 ( 6) CATCATCGCCGCCATCGTC 1 25203 ( 309) TATCATTGCCGCCGTCTCC 1 22967 ( 183) CTTCATCGCCACCTCCTCC 1 3222 ( 444) CATCATCACCGCTGCTTCA 1 1947 ( 61) CAGTATCGCCGCATCCGCC 1 22835 ( 420) CATCATCATCATCATTGTC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 19 n= 8676 bayes= 10.819 E= 5.5e-006 -965 186 -965 -106 168 -965 -965 -106 -965 -965 -86 174 -965 186 
-965 -106 188 -965 -965 -965 -965 -965 -965 194 -965 186 -965 -106 -12 -965 173 -965 -965 138 -965 52 -965 206 -965 -965 -12 -965 173 -965 -965 186 -965 -106 -112 106 -965 52 -12 -965 114 -6 -965 64 -965 126 -965 164 -965 -6 -965 -965 146 52 -965 164 -965 -6 -112 186 -965 -965 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 19 nsites= 8 E= 5.5e-006 0.000000 0.875000 0.000000 0.125000 0.875000 0.000000 0.000000 0.125000 0.000000 0.000000 0.125000 0.875000 0.000000 0.875000 0.000000 0.125000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.875000 0.000000 0.125000 0.250000 0.000000 0.750000 0.000000 0.000000 0.625000 0.000000 0.375000 0.000000 1.000000 0.000000 0.000000 0.250000 0.000000 0.750000 0.000000 0.000000 0.875000 0.000000 0.125000 0.125000 0.500000 0.000000 0.375000 0.250000 0.000000 0.500000 0.250000 0.000000 0.375000 0.000000 0.625000 0.000000 0.750000 0.000000 0.250000 0.000000 0.000000 0.625000 0.375000 0.000000 0.750000 0.000000 0.250000 0.125000 0.875000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- CATCATC[GA][CT]C[GA]C[CT][GAT][TC][CT][GT][CT]C -------------------------------------------------------------------------------- Time 3.20 secs. 
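Note on the two Motif 1 matrices above: the log-odds entries appear to be roughly 100 * log2(p / background), where p comes from the letter-probability matrix and the background frequencies come from the command line summary. The sketch below checks this for the first matrix row only. It is an illustration under that assumption, not MEME's own code: the function name `log_odds_row` is hypothetical, and the large negative entries (-965) that stand in for zero probabilities are not reproduced. Because the background frequencies are printed to three decimals, agreement is only expected to within about one unit.

```python
import math

# Background letter frequencies copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.272, "C": 0.240, "G": 0.227, "T": 0.261}


def log_odds_row(probs, background=BACKGROUND, scale=100):
    """Return scale * log2(p / background) per letter, or None where p == 0.

    Assumption: this mirrors how the reported scores relate to the
    probabilities; MEME's handling of zero counts is not modelled here.
    """
    row = {}
    for letter, p in zip("ACGT", probs):
        if p > 0:
            row[letter] = scale * math.log2(p / background[letter])
        else:
            row[letter] = None
    return row


# First row of the Motif 1 letter-probability matrix (order: A C G T).
first_row_probs = [0.000000, 0.875000, 0.000000, 0.125000]
# Matching first row of the Motif 1 log-odds matrix, as reported.
first_row_reported = {"A": -965, "C": 186, "G": -965, "T": -106}

computed = log_odds_row(first_row_probs)
for letter in "ACGT":
    print(letter, first_row_reported[letter], computed[letter])
```

The same check can be repeated for any other row by substituting its four probabilities and reported scores.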
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 19 sites = 6 llr = 110 E-value = 1.5e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :a:258::::7:8:7:::2 pos.-specific C ::::2::::22::2::3:: probability G a:8832a3a82a:22a538 matrix T ::2::::7::::272:27: bits 2.1 * * * * * 1.9 ** * * * * 1.7 ** * * * * 1.5 **** * ** * * * Relative 1.3 **** ** ** ** * * Entropy 1.1 **** ***** ** * ** (26.4 bits) 0.9 **** ***** ** * ** 0.6 ******************* 0.4 ******************* 0.2 ******************* 0.0 ------------------- Multilevel GAGGAAGTGGAGATAGGTG consensus G G CG sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------- 25133 414 7.56e-12 CCCACGTGAA GAGGAAGTGGAGATAGCTG CAATCATCAT 10407 249 7.56e-12 CCCACGTGAA GAGGAAGTGGAGATAGCTG CAATCATCAT 24294 99 5.56e-09 GAGGAGGCCG GAGGCGGTGGCGATGGGTG TTCGTTGTTG 19094 19 7.79e-09 TGAGGGTGGG GATGGAGGGGAGAGTGGTG AAGTGGACGA 41463 449 1.05e-08 CAGCCTGATG GAGAGAGTGGAGACAGGGA TGAAGTTGAA 6850 305 2.34e-08 TGTGCCGACG GAGGAAGGGCGGTTAGTGG AGAGGTGGTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- 
---------------- ------------- 25133 7.6e-12 413_[+2]_68 10407 7.6e-12 248_[+2]_233 24294 5.6e-09 98_[+2]_383 19094 7.8e-09 18_[+2]_463 41463 1e-08 448_[+2]_33 6850 2.3e-08 304_[+2]_177 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=19 seqs=6 25133 ( 414) GAGGAAGTGGAGATAGCTG 1 10407 ( 249) GAGGAAGTGGAGATAGCTG 1 24294 ( 99) GAGGCGGTGGCGATGGGTG 1 19094 ( 19) GATGGAGGGGAGAGTGGTG 1 41463 ( 449) GAGAGAGTGGAGACAGGGA 1 6850 ( 305) GAGGAAGGGCGGTTAGTGG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 19 n= 8676 bayes= 10.9446 E= 1.5e-002 -923 -923 214 -923 188 -923 -923 -923 -923 -923 188 -65 -71 -923 188 -923 88 -53 56 -923 161 -923 -44 -923 -923 -923 214 -923 -923 -923 56 135 -923 -923 214 -923 -923 -53 188 -923 129 -53 -44 -923 -923 -923 214 -923 161 -923 -923 -65 -923 -53 -44 135 129 -923 -44 -65 -923 -923 214 -923 -923 47 114 -65 -923 -923 56 135 -71 -923 188 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 19 nsites= 6 E= 1.5e-002 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.833333 0.166667 0.166667 0.000000 0.833333 0.000000 0.500000 0.166667 0.333333 0.000000 0.833333 0.000000 0.166667 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 
0.000000 0.333333 0.666667 0.000000 0.000000 1.000000 0.000000 0.000000 0.166667 0.833333 0.000000 0.666667 0.166667 0.166667 0.000000 0.000000 0.000000 1.000000 0.000000 0.833333 0.000000 0.000000 0.166667 0.000000 0.166667 0.166667 0.666667 0.666667 0.000000 0.166667 0.166667 0.000000 0.000000 1.000000 0.000000 0.000000 0.333333 0.500000 0.166667 0.000000 0.000000 0.333333 0.666667 0.166667 0.000000 0.833333 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- GAGG[AG]AG[TG]GGAGATAG[GC][TG]G -------------------------------------------------------------------------------- Time 6.20 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 13 llr = 151 E-value = 1.1e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :212::52:71:7:6: pos.-specific C 845::::4:252:::5 probability G :22::93292:83:45 matrix T 2238a1221:5::a:: bits 2.1 1.9 * * 1.7 ** * * 1.5 ** * * Relative 1.3 * ** * * * Entropy 1.1 * *** * ***** (16.7 bits) 0.9 * *** * ***** 0.6 * **** ******** 0.4 * **** ******** 0.2 * ***** ******** 0.0 ---------------- Multilevel CCCTTGACGACGATAC consensus TATA GG TCG GG sequence G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name 
Start P-value Site ------------- ----- --------- ---------------- 22030 397 2.45e-09 TACAACAACT CGCTTGACGACGATAC GTGCAGGATG 25133 216 3.16e-08 AGGGTGACTG CATTTGACGATGATAG TAGAGCAAAC 10407 51 3.16e-08 AGGGTGACTG CATTTGACGATGATAG TAGAGCAAAC 41463 329 5.88e-07 TTGCCTGCTG CGTTTGTGGACGATGC AACGTTGTTT 22967 73 1.03e-06 GATCTTTACT CCCTTGATGAAGGTAG AAACTCCTAA 10297 20 1.39e-06 AGTGTTGTAC CCCATGGAGATGGTAG ATCGGCATCA 3222 322 1.69e-06 TTCTTGGTCA CTGATGAGGACGATAC TCCATCTAAT 6850 202 2.21e-06 GGACATTTCG CCCTTGGGGGTCATGG CTCGCTTCTT 8835 342 5.25e-06 TCACTCCAGT CTGTTGACGCTCATAC ACATAGTCTA 25203 276 7.32e-06 TGCCCGCTCA CACATGGAGATCGTAC ACGCACTTCC 22332 92 1.05e-05 AAGTGACGTT TGATTGATGGCGATGC GTGATGCTTC 19094 325 1.37e-05 CAATACCCCT TCCTTTTCGACGGTGG TGAACTGAAA 25392 363 2.20e-05 ATACTGCTTT TCTTTGGTTCCGATGC CAAAGGTGTG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 22030 2.5e-09 396_[+3]_88 25133 3.2e-08 215_[+3]_269 10407 3.2e-08 50_[+3]_434 41463 5.9e-07 328_[+3]_156 22967 1e-06 72_[+3]_412 10297 1.4e-06 19_[+3]_465 3222 1.7e-06 321_[+3]_163 6850 2.2e-06 201_[+3]_283 8835 5.2e-06 341_[+3]_143 25203 7.3e-06 275_[+3]_209 22332 1.1e-05 91_[+3]_393 19094 1.4e-05 324_[+3]_160 25392 2.2e-05 362_[+3]_122 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=16 seqs=13 22030 ( 397) CGCTTGACGACGATAC 1 25133 ( 216) CATTTGACGATGATAG 1 10407 ( 51) CATTTGACGATGATAG 1 41463 ( 329) CGTTTGTGGACGATGC 1 22967 ( 73) CCCTTGATGAAGGTAG 1 10297 ( 20) CCCATGGAGATGGTAG 1 3222 
( 322) CTGATGAGGACGATAC 1 6850 ( 202) CCCTTGGGGGTCATGG 1 8835 ( 342) CTGTTGACGCTCATAC 1 25203 ( 276) CACATGGAGATCGTAC 1 22332 ( 92) TGATTGATGGCGATGC 1 19094 ( 325) TCCTTTTCGACGGTGG 1 25392 ( 363) TCTTTGGTTCCGATGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 9.92035 E= 1.1e+000 -1035 168 -1035 -18 -24 68 3 -76 -182 94 -56 24 -24 -1035 -1035 156 -1035 -1035 -1035 194 -1035 -1035 203 -176 98 -1035 44 -76 -82 68 3 -18 -1035 -1035 203 -176 135 -64 -56 -1035 -182 94 -1035 82 -1035 -6 176 -1035 135 -1035 44 -1035 -1035 -1035 -1035 194 118 -1035 76 -1035 -1035 116 103 -1035 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 1.1e+000 0.000000 0.769231 0.000000 0.230769 0.230769 0.384615 0.230769 0.153846 0.076923 0.461538 0.153846 0.307692 0.230769 0.000000 0.000000 0.769231 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.923077 0.076923 0.538462 0.000000 0.307692 0.153846 0.153846 0.384615 0.230769 0.230769 0.000000 0.000000 0.923077 0.076923 0.692308 0.153846 0.153846 0.000000 0.076923 0.461538 0.000000 0.461538 0.000000 0.230769 0.769231 0.000000 0.692308 0.000000 0.307692 0.000000 0.000000 0.000000 0.000000 1.000000 0.615385 0.000000 0.384615 0.000000 0.000000 0.538462 0.461538 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression 
--------------------------------------------------------------------------------
[CT][CAG][CT][TA]TG[AG][CGT]GA[CT][GC][AG]T[AG][CG]
--------------------------------------------------------------------------------

Time 9.58 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10297                            7.36e-03  19_[+3(1.39e-06)]_430_\
    [+3(2.80e-05)]_19
10407                            7.25e-19  50_[+3(3.16e-08)]_117_\
    [+2(3.85e-05)]_46_[+2(7.56e-12)]_4_[+1(2.16e-11)]_210
14979                            1.97e-01  500
19094                            3.44e-06  18_[+2(7.79e-09)]_287_\
    [+3(1.37e-05)]_160
1947                             3.32e-04  60_[+1(2.75e-08)]_421
22030                            1.33e-05  396_[+3(2.45e-09)]_88
22332                            5.51e-02  91_[+3(1.05e-05)]_393
22835                            1.64e-03  419_[+1(8.85e-08)]_62
22967                            4.63e-07  72_[+3(1.03e-06)]_94_[+1(8.80e-09)]_\
    299
24294                            4.48e-05  98_[+2(5.56e-09)]_383
25133                            7.25e-19  215_[+3(3.16e-08)]_117_\
    [+2(3.85e-05)]_46_[+2(7.56e-12)]_4_[+1(2.16e-11)]_45
25203                            1.48e-06  275_[+3(7.32e-06)]_17_\
    [+1(6.01e-09)]_173
25392                            1.56e-01  362_[+3(2.20e-05)]_122
3137                             2.35e-01  500
3222                             9.24e-07  321_[+3(1.69e-06)]_106_\
    [+1(1.66e-08)]_38
41463                            9.88e-14  5_[+1(2.03e-10)]_193_[+1(7.30e-05)]_\
    92_[+3(5.88e-07)]_104_[+2(1.05e-08)]_11_[+3(7.40e-05)]_6
6850                             1.37e-06  201_[+3(2.21e-06)]_29_\
    [+2(1.58e-05)]_39_[+2(2.34e-08)]_177
8835                             1.33e-02  281_[+3(6.70e-05)]_44_\
    [+3(5.25e-06)]_143
--------------------------------------------------------------------------------

********************************************************************************

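Note on reading the combined block diagrams above: each diagram alternates spacer lengths with bracketed sites of the form [strand motif (p-value)], so site start positions can be recovered by accumulating spacer lengths and motif widths. The sketch below does this for sequence 22967, with the backslash line continuation already joined. It is a hypothetical helper, not part of MEME: the names `parse_diagram`, `SITE`, and `MOTIF_WIDTHS` are assumptions introduced for illustration, and the widths are copied from the three MOTIF headers in this file.

```python
import re

# Motif widths as reported in the MOTIF 1, 2 and 3 headers of this file.
MOTIF_WIDTHS = {1: 19, 2: 19, 3: 16}

# One bracketed site, e.g. "[+3(1.03e-06)]": strand, motif number, p-value.
SITE = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram):
    """Return (sites, length) for a combined block diagram string.

    sites is a list of (start, strand, motif, p_value) with 1-based starts;
    length is the total sequence length implied by the diagram.
    """
    position = 0
    sites = []
    for token in diagram.split("_"):
        match = SITE.fullmatch(token)
        if match:
            strand = match.group(1)
            motif = int(match.group(2))
            p_value = float(match.group(3))
            sites.append((position + 1, strand, motif, p_value))
            position += MOTIF_WIDTHS[motif]
        elif token:
            position += int(token)  # spacer between sites
    return sites, position


# Diagram for sequence 22967, continuation line joined.
sites, length = parse_diagram("72_[+3(1.03e-06)]_94_[+1(8.80e-09)]_299")
print(sites)
print(length)
```

For this sequence the recovered starts (73 for motif 3, 183 for motif 1) agree with the per-motif site tables, and the implied length agrees with the 500 listed in the training set.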
********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************