********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/104/104.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
14378                    1.0000    500  14490                    1.0000    500
24322                    1.0000    500  25218                    1.0000    500
263115                   1.0000    500  263268                   1.0000    500
264726                   1.0000    500  268043                   1.0000    500
269792                   1.0000    500  33717                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/104/104.seqs.fa -oc motifs/104 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.225 G 0.234 T 0.276
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.225 G 0.234 T 0.276
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   5  llr = 94  E-value = 2.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :466a:66:a4:a:48::a:
pos.-specific     C  26:2:8:2a::8::6::8:4
probability       G  8:42:222::6::8::a::6
matrix            T  ::::::2::::2:2:2:2::

         bits    2.2         *       *
                 1.9     *   **  *   * *
                 1.7     *   **  *   * *
                 1.5     **  **  *   * *
Relative         1.3 *   **  ** ***  ***
Entropy          1.1 *** **  ************
(27.2 bits)      0.9 *** **  ************
                 0.6 ********************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GCAAACAACAGCAGCAGCAG
consensus            CAGC GGC  AT TAT T C
sequence                G  TG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            --------------------
263115                      313  6.02e-12 GAGCATGAAA GCGAACAACAACAGCAGCAG TGATCCCAAC
264726                       15  5.49e-10 CAGGGGTATA GCGAACAGCAGCAGATGCAG ATGGAAGGAG
14490                       417  8.23e-10 AATCACTTTT GAAAACACCAGCATCAGCAC CTCGTGCACA
263268                      480  3.87e-09 TGACCAACAG CAACACTACAACAGCAGCAC A
24322                       450  1.52e-08 AGCATAGACA GCAGAGGACAGTAGAAGTAG CTCTCATCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
263115                              6e-12  312_[+1]_168
264726                            5.5e-10  14_[+1]_466
14490                             8.2e-10  416_[+1]_64
263268                            3.9e-09  479_[+1]_1
24322                             1.5e-08  449_[+1]_31
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=5
263115                   (  313) GCGAACAACAACAGCAGCAG  1
264726                   (   15) GCGAACAGCAGCAGATGCAG  1
14490                    (  417) GAAAACACCAGCATCAGCAC  1
263268                   (  480) CAACACTACAACAGCAGCAC  1
24322                    (  450) GCAGAGGACAGTAGAAGTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4810 bayes= 10.8525 E= 2.3e+001
  -897    -17    177   -897
    59    141   -897   -897
   118   -897     77   -897
   118    -17    -23   -897
   191   -897   -897   -897
  -897    183    -23   -897
   118   -897    -23    -46
   118    -17    -23   -897
  -897    215   -897   -897
   191   -897   -897   -897
    59   -897    136   -897
  -897    183   -897    -46
   191   -897   -897   -897
  -897   -897    177    -46
    59    141   -897   -897
   159   -897   -897    -46
  -897   -897    209   -897
  -897    183   -897    -46
   191   -897   -897   -897
  -897     83    136   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 2.3e+001
 0.000000  0.200000  0.800000  0.000000
 0.400000  0.600000  0.000000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.600000  0.200000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.600000  0.000000  0.200000  0.200000
 0.600000  0.200000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.800000  0.000000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.400000  0.600000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GC][CA][AG][ACG]A[CG][AGT][ACG]CA[GA][CT]A[GT][CA][AT]G[CT]A[GC]
--------------------------------------------------------------------------------




Time  1.03 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  15  sites =  10  llr = 115  E-value = 2.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  16:4:::1::22:::
pos.-specific     C  2:a21751:9:48::
probability       G  72::7::81:5:2:5
matrix            T  :2:4235:9134:a5

         bits    2.2   *
                 1.9   *          *
                 1.7   *      *   *
                 1.5   *     **  **
Relative         1.3   *     **  **
Entropy          1.1   *  *****  ***
(16.6 bits)      0.9 * * ******  ***
                 0.6 *** ******  ***
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GACAGCCGTCGCCTG
consensus            CG TTTT   TTG T
sequence              T C      AA

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ---------------
14490                       343  6.03e-08 ATCGTATCGT GTCAGCCGTCGCCTT CATCTCTACA
268043                      137  2.83e-07 TGTGGATGGA GACTGTCGTCACCTT CGTGGAAGGG
269792                       75  5.22e-07 GACACTAGTG CACCGCCGTCTTCTT CGGGAAGTGA
24322                       355  1.07e-06 TCGCATCATA CACACCTGTCGCCTG ATCCTCCTCC
264726                      169  2.28e-06 AGGTTTATTT AGCCGCTGTCGTCTG CTCCTGATTC
263115                       25  2.71e-06 TGTAGCTGTT GTCAGTCGTCAACTG ATTTGTTGAC
263268                      457  4.75e-06 TCGAGCCTCC GACTGCTATCGAGTG ACCAACAGCA
33717                       406  5.11e-06 AATTCTCACT GGCTGCCCTCGTGTT GTTGTTTTCC
14378                         6  5.91e-06      CGCTT GACATCTGTTTTCTG CGTGCGTGTG
25218                       190  1.00e-05 TATCTCTAAC GACTTTTGGCTCCTT CAAAATACGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14490                               6e-08  342_[+2]_143
268043                            2.8e-07  136_[+2]_349
269792                            5.2e-07  74_[+2]_411
24322                             1.1e-06  354_[+2]_131
264726                            2.3e-06  168_[+2]_317
263115                            2.7e-06  24_[+2]_461
263268                            4.8e-06  456_[+2]_29
33717                             5.1e-06  405_[+2]_80
14378                             5.9e-06  5_[+2]_480
25218                               1e-05  189_[+2]_296
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=10
14490                    (  343) GTCAGCCGTCGCCTT  1
268043                   (  137) GACTGTCGTCACCTT  1
269792                   (   75) CACCGCCGTCTTCTT  1
24322                    (  355) CACACCTGTCGCCTG  1
264726                   (  169) AGCCGCTGTCGTCTG  1
263115                   (   25) GTCAGTCGTCAACTG  1
263268                   (  457) GACTGCTATCGAGTG  1
33717                    (  406) GGCTGCCCTCGTGTT  1
14378                    (    6) GACATCTGTTTTCTG  1
25218                    (  190) GACTTTTGGCTCCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4860 bayes= 8.92184 E= 2.4e+001
  -140    -17    158   -997
   118   -997    -23    -46
  -997    215   -997   -997
    59    -17   -997     53
  -997   -117    158    -46
  -997    164   -997     12
  -997    115   -997     86
  -140   -117    177   -997
  -997   -997   -122    170
  -997    200   -997   -146
   -41   -997    109     12
   -41     83   -997     53
  -997    183    -23   -997
  -997   -997   -997    186
  -997   -997    109     86
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 10 E= 2.4e+001
 0.100000  0.200000  0.700000  0.000000
 0.600000  0.000000  0.200000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.200000  0.000000  0.400000
 0.000000  0.100000  0.700000  0.200000
 0.000000  0.700000  0.000000  0.300000
 0.000000  0.500000  0.000000  0.500000
 0.100000  0.100000  0.800000  0.000000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.900000  0.000000  0.100000
 0.200000  0.000000  0.500000  0.300000
 0.200000  0.400000  0.000000  0.400000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.500000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC][AGT]C[ATC][GT][CT][CT]GTC[GTA][CTA][CG]T[GT]
--------------------------------------------------------------------------------




Time  1.97 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =  10  llr = 101  E-value = 2.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :78:1:2:3:38
pos.-specific     C  61:a:31::83:
probability       G  :22::7:a:14:
matrix            T  4:::9:7:71:2

         bits    2.2    *   *
                 1.9    *   *
                 1.7    *   *
                 1.5    *   *
Relative         1.3   **** * *
Entropy          1.1 * **** *** *
(14.6 bits)      0.9 ****** *** *
                 0.6 ********** *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CAACTGTGTCGA
consensus            TGG  CA A AT
sequence                       C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value               Site
-------------             ----- ---------            ------------
24322                       261  1.12e-07 TCCGTTGAAC CAACTGTGTCCA TTGGATACAA
14378                        99  8.38e-07 ACACCTGTAA CAACTGTGACAA TCAACTATTA
263115                      487  2.51e-06 CGGGATGATG CAACTCAGTCGA TT
268043                       48  3.65e-06 AAGACAAGTC CCACTGTGTCAA CGTGGTTGAC
264726                      470  7.38e-06 ACGACAGGTT TGACTCTGTCCA TGAGGACGAT
25218                       409  9.86e-06 CACCCTTCTT CAACTGTGATGA CTAAGTGGCC
263268                      402  2.03e-05 TCGGAACAGG CAGCTGTGTGAA GAAGTGAAGA
14490                       390  2.03e-05 GGTGGTGCCG TAGCTGCGTCCA TCTTGAATCA
33717                       219  4.25e-05 TCACTTCCTT TGACTCTGACGT TGCCTCTATC
269792                      301  6.24e-05 ATGAAACTGC TAACAGAGTCGT TTGTACTTTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24322                             1.1e-07  260_[+3]_228
14378                             8.4e-07  98_[+3]_390
263115                            2.5e-06  486_[+3]_2
268043                            3.7e-06  47_[+3]_441
264726                            7.4e-06  469_[+3]_19
25218                             9.9e-06  408_[+3]_80
263268                              2e-05  401_[+3]_87
14490                               2e-05  389_[+3]_99
33717                             4.3e-05  218_[+3]_270
269792                            6.2e-05  300_[+3]_188
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
24322                    (  261) CAACTGTGTCCA  1
14378                    (   99) CAACTGTGACAA  1
263115                   (  487) CAACTCAGTCGA  1
268043                   (   48) CCACTGTGTCAA  1
264726                   (  470) TGACTCTGTCCA  1
25218                    (  409) CAACTGTGATGA  1
263268                   (  402) CAGCTGTGTGAA  1
14490                    (  390) TAGCTGCGTCCA  1
33717                    (  219) TGACTCTGACGT  1
269792                   (  301) TAACAGAGTCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 9.18275 E= 2.2e+002
  -997    141   -997     53
   140   -117    -23   -997
   159   -997    -23   -997
  -997    215   -997   -997
  -140   -997   -997    170
  -997     41    158   -997
   -41   -117   -997    134
  -997   -997    209   -997
    18   -997   -997    134
  -997    183   -122   -146
    18     41     77   -997
   159   -997   -997    -46
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 2.2e+002
 0.000000  0.600000  0.000000  0.400000
 0.700000  0.100000  0.200000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.100000  0.000000  0.000000  0.900000
 0.000000  0.300000  0.700000  0.000000
 0.200000  0.100000  0.000000  0.700000
 0.000000  0.000000  1.000000  0.000000
 0.300000  0.000000  0.000000  0.700000
 0.000000  0.800000  0.100000  0.100000
 0.300000  0.300000  0.400000  0.000000
 0.800000  0.000000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CT][AG][AG]CT[GC][TA]G[TA]C[GAC][AT]
--------------------------------------------------------------------------------




Time  2.97 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14378                            1.22e-04  5_[+2(5.91e-06)]_78_[+3(8.38e-07)]_\
    390
14490                            5.44e-11  342_[+2(6.03e-08)]_32_\
    [+3(2.03e-05)]_15_[+1(8.23e-10)]_64
24322                            9.53e-11  260_[+3(1.12e-07)]_82_\
    [+2(1.07e-06)]_80_[+1(1.52e-08)]_31
25218                            5.50e-04  189_[+2(1.00e-05)]_204_\
    [+3(9.86e-06)]_80
263115                           2.71e-12  24_[+2(2.71e-06)]_273_\
    [+1(6.02e-12)]_154_[+3(2.51e-06)]_2
263268                           1.31e-08  401_[+3(2.03e-05)]_43_\
    [+2(4.75e-06)]_8_[+1(3.87e-09)]_1
264726                           4.31e-10  14_[+1(5.49e-10)]_134_\
    [+2(2.28e-06)]_286_[+3(7.38e-06)]_19
268043                           2.26e-05  47_[+3(3.65e-06)]_77_[+2(2.83e-07)]_\
    349
269792                           4.34e-04  74_[+2(5.22e-07)]_211_\
    [+3(6.24e-05)]_188
33717                            3.01e-03  218_[+3(4.25e-05)]_175_\
    [+2(5.11e-06)]_80
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************