********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/441/441.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11375                    1.0000    500  20810                    1.0000    500
21605                    1.0000    500  22166                    1.0000    500
268009                   1.0000    500  4261                     1.0000    500
4470                     1.0000    500  8181                     1.0000    500
bd1078                   1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/441/441.seqs.fa -oc motifs/441 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.282 C 0.231 G 0.233 T 0.254
Background letter frequencies (from dataset with add-one prior applied):
A 0.282 C 0.231 G 0.233 T 0.254
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  21  sites =   8  llr = 131  E-value = 8.1e-004
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  318:8:::1:::43:4:36::
pos.-specific     C  :::5:8:9349a4:84a11:9
probability       G  8933336:6::::43::6:6:
matrix            T  :::3::41:61:34:3::341

         bits    2.1            *    *
                 1.9            *    *
                 1.7            *    *
                 1.5  *     *  **    *   *
Relative         1.3 **   * *  **  * *   *
Entropy          1.1 *** **** ***  * *  **
(23.6 bits)      0.8 *** ********  * *  **
                 0.6 ************  * *****
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GGACACGCGTCCAGCACGAGC
consensus            A GGGGT CC  CTGC ATT
sequence                T        TA T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
21605                       422  2.98e-10 GATTTTTAAA GGATACGCGTCCTTGCCGAGC ACCCCAACGT
268009                      373  4.67e-10 GTAACAGAGA GGGCACGCGCCCAGCACAAGC AGCACGGAGC
20810                       354  4.67e-10 AGTGCACAGA GGGCACGCGCCCAGCACAAGC CAGCACGAGC
bd1078                      198  1.29e-08 GGACGTGGAC AGAGAGTCCTCCCACCCGAGC CAACTGAAAT
8181                        198  1.29e-08 GGACGTGGAC AGAGAGTCCTCCCACCCGAGC CAACTGAAAT
22166                       344  8.28e-08 GAAAGGAGAT GAACGCGTGTCCTGCACGTTC CCGCCACCTT
4261                        227  1.39e-07 GGAACGTGCA GGACACGCGTTCATCTCCTTT CAGCACAAAA
11375                        83  1.71e-07 CAACTCGTGA GGATGCTCACCCCTGTCGCTC TATTGTCTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21605                               3e-10  421_[+1]_58
268009                            4.7e-10  372_[+1]_107
20810                             4.7e-10  353_[+1]_126
bd1078                            1.3e-08  197_[+1]_282
8181                              1.3e-08  197_[+1]_282
22166                             8.3e-08  343_[+1]_136
4261                              1.4e-07  226_[+1]_253
11375                             1.7e-07  82_[+1]_397
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=8
21605                    (  422) GGATACGCGTCCTTGCCGAGC  1
268009                   (  373) GGGCACGCGCCCAGCACAAGC  1
20810                    (  354) GGGCACGCGCCCAGCACAAGC  1
bd1078                   (  198) AGAGAGTCCTCCCACCCGAGC  1
8181                     (  198) AGAGAGTCCTCCCACCCGAGC  1
22166                    (  344) GAACGCGTGTCCTGCACGTTC  1
4261                     (  227) GGACACGCGTTCATCTCCTTT  1
11375                    (   83) GGATGCTCACCCCTGTCGCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4320 bayes= 9.07414 E= 8.1e-004
   -17   -965    169   -965
  -117   -965    191   -965
   141   -965     10   -965
  -965    111     10     -2
   141   -965     10   -965
  -965    170     10   -965
  -965   -965    142     56
  -965    192   -965   -102
  -117     11    142   -965
  -965     70   -965    130
  -965    192   -965   -102
  -965    211   -965   -965
    41     70   -965     -2
   -17   -965     69     56
  -965    170     10   -965
    41     70   -965     -2
  -965    211   -965   -965
   -17    -89    142   -965
   115    -89   -965     -2
  -965   -965    142     56
  -965    192   -965   -102
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 8.1e-004
 0.250000  0.000000  0.750000  0.000000
 0.125000  0.000000  0.875000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.625000  0.375000
 0.000000  0.875000  0.000000  0.125000
 0.125000  0.250000  0.625000  0.000000
 0.000000  0.375000  0.000000  0.625000
 0.000000  0.875000  0.000000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.375000  0.375000  0.000000  0.250000
 0.250000  0.000000  0.375000  0.375000
 0.000000  0.750000  0.250000  0.000000
 0.375000  0.375000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.125000  0.625000  0.000000
 0.625000  0.125000  0.000000  0.250000
 0.000000  0.000000  0.625000  0.375000
 0.000000  0.875000  0.000000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[GA]G[AG][CGT][AG][CG][GT]C[GC][TC]CC[ACT][GTA][CG][ACT]C[GA][AT][GT]C
--------------------------------------------------------------------------------

Time  0.74 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  21  sites =   4  llr = 99  E-value = 1.0e-003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :5:::::::::5:::5:5::a
pos.-specific     C  a3:::5:::aa55:a::::a:
probability       G  :::::::a:::::a:5:::::
matrix            T  :3aaa5a:a:::5:::a5a::

         bits    2.1 *      * **  **    *
                 1.9 * *** *****  ** * ***
                 1.7 * *** *****  ** * ***
                 1.5 * *** *****  ** * ***
Relative         1.3 * *** *****  ** * ***
Entropy          1.1 * *************** ***
(35.8 bits)      0.8 * *******************
                 0.6 * *******************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CATTTCTGTCCACGCATATCA
consensus             C   T     CT  G T
sequence              T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
bd1078                      394  1.85e-12 ACTCTAACGG CATTTCTGTCCACGCGTATCA CACATCACGT
8181                        394  1.85e-12 ACTCTAACGG CATTTCTGTCCACGCGTATCA CACATCACGT
20810                       295  1.36e-11 ATTTTGAATG CCTTTTTGTCCCTGCATTTCA GAACAATAAT
268009                      314  1.49e-11 TTTTGACTGC CTTTTTTGTCCCTGCATTTCA GTACAATTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd1078                            1.8e-12  393_[+2]_86
8181                              1.8e-12  393_[+2]_86
20810                             1.4e-11  294_[+2]_185
268009                            1.5e-11  313_[+2]_166
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=4
bd1078                   (  394) CATTTCTGTCCACGCGTATCA  1
8181                     (  394) CATTTCTGTCCACGCGTATCA  1
20810                    (  295) CCTTTTTGTCCCTGCATTTCA  1
268009                   (  314) CTTTTTTGTCCCTGCATTTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4320 bayes= 10.0755 E= 1.0e-003
  -865    211   -865   -865
    83     11   -865     -2
  -865   -865   -865    197
  -865   -865   -865    197
  -865   -865   -865    197
  -865    111   -865     98
  -865   -865   -865    197
  -865   -865    210   -865
  -865   -865   -865    197
  -865    211   -865   -865
  -865    211   -865   -865
    83    111   -865   -865
  -865    111   -865     98
  -865   -865    210   -865
  -865    211   -865   -865
    83   -865    110   -865
  -865   -865   -865    197
    83   -865   -865     98
  -865   -865   -865    197
  -865    211   -865   -865
   182   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 1.0e-003
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.250000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
C[ACT]TTT[CT]TGTCC[AC][CT]GC[AG]T[AT]TCA
--------------------------------------------------------------------------------

Time  1.46 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  16  sites =   9  llr = 121  E-value = 1.5e-003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::221:::6a8a9:93
pos.-specific     C  4:3:2:::4:::1:::
probability       G  63431118::2::a17
matrix            T  :7:46992::::::::

         bits    2.1              *
                 1.9          * * *
                 1.7          * * *
                 1.5      **  * * *
Relative         1.3      *** * ****
Entropy          1.1 **   ***********
(19.5 bits)      0.8 **   ***********
                 0.6 **   ***********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTGTTTTGAAAAAGAG
consensus            CGCGC  TC G    A
sequence               AA

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
bd1078                      436  3.25e-09 CTCCGGCATT GTCGTTTGAAAAAGAG AGAGCTTGGC
8181                        436  3.25e-09 CTCCGGCATT GTCGTTTGAAAAAGAG AGAGCTTGGC
21605                       117  1.16e-07 ATCGTCCGCA GGGATTTGCAAAAGAA CTCAAAAGCA
20810                       258  1.16e-07 AAACACTGAA GTGGCTTTAAAAAGAG TCTCAAATAG
268009                      276  2.62e-07 AAAACACTGA GTGACTTTCAAAAGAG TCTCAAATAG
22166                       260  2.62e-07 TCACGGATTC CTCTGTTGAAAAAGAA AATTGCCACC
4470                         28  1.09e-06 TCGCATATTG CGGTATTGCAAAAGGG CACGGATGGG
4261                        258  2.39e-06 CAGCACAAAA CGATTGTGCAGAAGAA AGGTTGAGTT
11375                       242  2.85e-06 CCCCAAAAAG CTATTTGGAAGACGAG GAGTGCAAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd1078                            3.3e-09  435_[+3]_49
8181                              3.3e-09  435_[+3]_49
21605                             1.2e-07  116_[+3]_368
20810                             1.2e-07  257_[+3]_227
268009                            2.6e-07  275_[+3]_209
22166                             2.6e-07  259_[+3]_225
4470                              1.1e-06  27_[+3]_457
4261                              2.4e-06  257_[+3]_227
11375                             2.8e-06  241_[+3]_243
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=9
bd1078                   (  436) GTCGTTTGAAAAAGAG  1
8181                     (  436) GTCGTTTGAAAAAGAG  1
21605                    (  117) GGGATTTGCAAAAGAA  1
20810                    (  258) GTGGCTTTAAAAAGAG  1
268009                   (  276) GTGACTTTCAAAAGAG  1
22166                    (  260) CTCTGTTGAAAAAGAA  1
4470                     (   28) CGGTATTGCAAAAGGG  1
4261                     (  258) CGATTGTGCAGAAGAA  1
11375                    (  242) CTATTTGGAAGACGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4365 bayes= 9.05343 E= 1.5e-003
  -982     94    125   -982
  -982   -982     52    139
   -34     53     93   -982
   -34   -982     52     81
  -134     -6   -107    113
  -982   -982   -107    181
  -982   -982   -107    181
  -982   -982    174    -19
    98     94   -982   -982
   183   -982   -982   -982
   146   -982     -7   -982
   183   -982   -982   -982
   166   -106   -982   -982
  -982   -982    210   -982
   166   -982   -107   -982
    24   -982    152   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 1.5e-003
 0.000000  0.444444  0.555556  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.222222  0.333333  0.444444  0.000000
 0.222222  0.000000  0.333333  0.444444
 0.111111  0.222222  0.111111  0.555556
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.000000  0.777778  0.222222
 0.555556  0.444444  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.333333  0.000000  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[GC][TG][GCA][TGA][TC]TT[GT][AC]A[AG]AAGA[GA]
--------------------------------------------------------------------------------

Time  2.13 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11375                            1.04e-05  82_[+1(1.71e-07)]_138_\
    [+3(2.85e-06)]_243
20810                            8.31e-17  257_[+3(1.16e-07)]_21_\
    [+2(1.36e-11)]_38_[+1(4.67e-10)]_126
21605                            2.10e-09  116_[+3(1.16e-07)]_289_\
    [+1(2.98e-10)]_58
22166                            7.77e-07  259_[+3(2.62e-07)]_68_\
    [+1(8.28e-08)]_136
268009                           1.98e-16  275_[+3(2.62e-07)]_22_\
    [+2(1.49e-11)]_38_[+1(4.67e-10)]_107
4261                             1.20e-06  226_[+1(1.39e-07)]_10_\
    [+3(2.39e-06)]_227
4470                             4.21e-03  27_[+3(1.09e-06)]_457
8181                             9.64e-18  197_[+1(1.29e-08)]_175_\
    [+2(1.85e-12)]_21_[+3(3.25e-09)]_49
bd1078                           9.64e-18  197_[+1(1.29e-08)]_175_\
    [+2(1.85e-12)]_21_[+3(3.25e-09)]_49
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************