********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/139/139.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11622                    1.0000    500  21594                    1.0000    500
2209                     1.0000    500  22726                    1.0000    500
23083                    1.0000    500  24022                    1.0000    500
24373                    1.0000    500  268830                   1.0000    500
269490                   1.0000    500  269575                   1.0000    500
269576                   1.0000    500  39032                    1.0000    500
4774                     1.0000    500  5034                     1.0000    500
5456                     1.0000    500  5615                     1.0000    500
6114                     1.0000    500  7850                     1.0000    500
8715                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/139/139.seqs.fa -oc motifs/139 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.270 C 0.226 G 0.255 T 0.250
Background letter frequencies (from dataset with add-one prior applied):
A 0.270 C 0.226 G 0.255 T 0.250
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 12 sites = 19 llr = 179 E-value = 1.4e-004
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :41::::1:2:2
pos.-specific     C  3:1:::::1:2:
probability       G  727:1a1573:3
matrix            T  :42a9:953686

         bits    2.1
                 1.9    * *
                 1.7    * **
                 1.5    ****
Relative         1.3    ****   *
Entropy          1.1 *  ****   *
(13.6 bits)      0.9 *  **** * *
                 0.6 * **********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GAGTTGTGGTTT
consensus            CT     TTG G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
268830                       73  6.45e-08 TCGTCCATTG GTGTTGTTGTTT GGTGATTACT
5615                        328  5.80e-07 CAGTGCTCAG GTGTTGTTGGTT GCTGAAAGCA
21594                        38  9.19e-07 TGAGCCGAGA CAGTTGTGGTTT GCTCCCTACT
22726                       304  1.19e-06 GAGGCGTTGT GAGTTGTGGTTG TGGGCAGTGA
5456                         71  2.23e-06 GCCTCCAACG CTGTTGTTGTTG TTGTTGTACT
24022                       141  4.36e-06 TTCTTTATTG GAGTTGTTTGTT TGCTCATGTT
6114                         42  4.96e-06 AGCCAGGGGT GTATTGTGGTTT GGTTGGTGCT
2209                        288  9.40e-06 GGGAGAAACA GTCTTGTTGTTT TGGCGGGTGA
39032                       148  1.32e-05 TGAAAGGTGC GGGTTGTGTGTT CGTTTCATAG
269575                       96  1.32e-05 TGTAGGGCTG GTGTTGGTGTTT GTTGGTGTAT
8715                        133  1.98e-05 GTGATTGGCC GGGTGGTGGTTT GGGTTCCGTC
269576                      378  2.98e-05 CAGTCACATA CATTTGTTGTTA TCCACTGGTT
5034                        132  3.25e-05 AATAGAGGCT GAGTTGTAGTTA TGGCGGTGAG
4774                        309  4.21e-05 ACGATTTGAT CTTTTGTTTGTT CATTGACTTT
24373                       170  4.21e-05 GCGTTCTCGT GGGTTGTGGACT GGCAATGGGC
23083                       147  6.81e-05 TTGGTTGAGT GTGTTGTTCATG TTCGTCTCGT
7850                        316  1.43e-04 GCGTCAATGG GAATTGTGTATA ACGATTGAAG
11622                       458  1.43e-04 AGTGCGAGAA CATTTGTGTTCG CATTCTCTTT
269490                      257  1.84e-04 AAAGCCATGA CAGTGGTGGGCG CCAGCGGTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
268830                            6.5e-08  72_[+1]_416
5615                              5.8e-07  327_[+1]_161
21594                             9.2e-07  37_[+1]_451
22726                             1.2e-06  303_[+1]_185
5456                              2.2e-06  70_[+1]_418
24022                             4.4e-06  140_[+1]_348
6114                                5e-06  41_[+1]_447
2209                              9.4e-06  287_[+1]_201
39032                             1.3e-05  147_[+1]_341
269575                            1.3e-05  95_[+1]_393
8715                                2e-05  132_[+1]_356
269576                              3e-05  377_[+1]_111
5034                              3.2e-05  131_[+1]_357
4774                              4.2e-05  308_[+1]_180
24373                             4.2e-05  169_[+1]_319
23083                             6.8e-05  146_[+1]_342
7850                              0.00014  315_[+1]_173
11622                             0.00014  457_[+1]_31
269490                            0.00018  256_[+1]_232
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=19
268830                   (   73) GTGTTGTTGTTT  1
5615                     (  328) GTGTTGTTGGTT  1
21594                    (   38) CAGTTGTGGTTT  1
22726                    (  304) GAGTTGTGGTTG  1
5456                     (   71) CTGTTGTTGTTG  1
24022                    (  141) GAGTTGTTTGTT  1
6114                     (   42) GTATTGTGGTTT  1
2209                     (  288) GTCTTGTTGTTT  1
39032                    (  148) GGGTTGTGTGTT  1
269575                   (   96) GTGTTGGTGTTT  1
8715                     (  133) GGGTGGTGGTTT  1
269576                   (  378) CATTTGTTGTTA  1
5034                     (  132) GAGTTGTAGTTA  1
4774                     (  309) CTTTTGTTTGTT  1
24373                    (  170) GGGTTGTGGACT  1
23083                    (  147) GTGTTGTTCATG  1
7850                     (  316) GAATTGTGTATA  1
11622                    (  458) CATTTGTGTTCG  1
269490                   (  257) CAGTGGTGGGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 8.93074 E= 1.4e-004
 -1089     48    143  -1089
    64  -1089    -69     75
  -136   -210    143    -66
 -1089  -1089  -1089    200
 -1089  -1089   -127    184
 -1089  -1089    197  -1089
 -1089  -1089   -227    192
  -235  -1089     90     92
 -1089   -210    143      7
   -77  -1089      5    121
 -1089    -52  -1089    175
   -77  -1089      5    121
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 19 E= 1.4e-004
 0.000000  0.315789  0.684211  0.000000
 0.421053  0.000000  0.157895  0.421053
 0.105263  0.052632  0.684211  0.157895
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.105263  0.894737
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.052632  0.947368
 0.052632  0.000000  0.473684  0.473684
 0.000000  0.052632  0.684211  0.263158
 0.157895  0.000000  0.263158  0.578947
 0.000000  0.157895  0.000000  0.842105
 0.157895  0.000000  0.263158  0.578947
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[GC][AT]GTTGT[GT][GT][TG]T[TG]
--------------------------------------------------------------------------------

Time 4.36 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 21 sites = 8 llr = 138 E-value = 6.6e-004
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :65:3:1834:31:13:88:3
pos.-specific     C  a1584a6363a8:a84a1:98
probability       G  :3::1:3::3::3:1::1:::
matrix            T  :::33:::11::6::4::31:

         bits    2.1 *    *    *  *  *
                 1.9 *    *    *  *  *
                 1.7 *    *    *  *  *
                 1.5 *    *    *  *  *  *
Relative         1.3 *  * *    ** *  *  **
Entropy          1.1 * ** * *  ** ** * ***
(24.9 bits)      0.9 * ** **** ** ** *****
                 0.6 **** **** ***** *****
                 0.4 **** **** ***********
                 0.2 ********* ***********
                 0.0 ---------------------

Multilevel           CAACCCCACACCTCCCCAACC
consensus             GCTA GCAC AG  T  T A
sequence                 T    G     A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
269576                      455  2.05e-10 AGCCACCCTC CAACACCAAACCTCCCCATCC AAAACACACA
23083                       237  3.07e-10 CGGAAAAACT CACCCCCCCCCCTCGTCAACC AACCTAAAAG
39032                       365  1.33e-09 AACATCAACG CACCTCCACGCCTCCACGTCC ACCACCACCA
4774                        436  1.48e-09 AGACGACGCC CAACGCCAAGCCTCCTCAACA ACTCAAATCC
24373                       277  9.18e-09 GACCGACGAC CGACCCCACACAGCAACAACC ACGCCAAAGG
5456                        474  1.60e-08 GGCGGCCATA CACCTCACCTCCTCCCCAATC TCTACA
2209                        455  4.44e-08 TCTACCTGCT CGCTCCGACACAACCTCCACC TCTCTCGTCC
21594                       383  8.17e-08 TGGTTTGCCG CCATACGATCCCGCCCCAACA GCGTATTGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
269576                              2e-10  454_[+2]_25
23083                             3.1e-10  236_[+2]_243
39032                             1.3e-09  364_[+2]_115
4774                              1.5e-09  435_[+2]_44
24373                             9.2e-09  276_[+2]_203
5456                              1.6e-08  473_[+2]_6
2209                              4.4e-08  454_[+2]_25
21594                             8.2e-08  382_[+2]_97
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
269576                   (  455) CAACACCAAACCTCCCCATCC  1
23083                    (  237) CACCCCCCCCCCTCGTCAACC  1
39032                    (  365) CACCTCCACGCCTCCACGTCC  1
4774                     (  436) CAACGCCAAGCCTCCTCAACA  1
24373                    (  277) CGACCCCACACAGCAACAACC  1
5456                     (  474) CACCTCACCTCCTCCCCAATC  1
2209                     (  455) CGCTCCGACACAACCTCCACC  1
21594                    (  383) CCATACGATCCCGCCCCAACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 10.891 E= 6.6e-004
  -965    215   -965   -965
   121    -85     -3   -965
    89    115   -965   -965
  -965    173   -965      0
   -11     73   -102      0
  -965    215   -965   -965
  -111    147     -3   -965
   147     15   -965   -965
   -11    147   -965   -100
    48     15     -3   -100
  -965    215   -965   -965
   -11    173   -965   -965
  -111   -965     -3    132
  -965    215   -965   -965
  -111    173   -102   -965
   -11     73   -965     58
  -965    215   -965   -965
   147    -85   -102   -965
   147   -965   -965      0
  -965    195   -965   -100
   -11    173   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 6.6e-004
 0.000000  1.000000  0.000000  0.000000
 0.625000  0.125000  0.250000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.375000  0.125000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.625000  0.250000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.625000  0.000000  0.125000
 0.375000  0.250000  0.250000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.125000  0.000000  0.250000  0.625000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.750000  0.125000  0.000000
 0.250000  0.375000  0.000000  0.375000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.125000  0.125000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.875000  0.000000  0.125000
 0.250000  0.750000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
C[AG][AC][CT][CAT]C[CG][AC][CA][ACG]C[CA][TG]CC[CTA]CA[AT]C[CA]
--------------------------------------------------------------------------------

Time 8.69 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 16 sites = 8 llr = 111 E-value = 3.0e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :933::5:155:::::
pos.-specific     C  a:38:a1a3:516a39
probability       G  :13:3:::44::4:::
matrix            T  ::3:8:4:31:9::81

         bits    2.1 *    * *     *
                 1.9 *    * *     *
                 1.7 *    * *     *
                 1.5 *    * *   * * *
Relative         1.3 ** *** *   * ***
Entropy          1.1 ** *** *  ******
(20.0 bits)      0.9 ** *** *  ******
                 0.6 ** *****  ******
                 0.4 ** ***** *******
                 0.2 ** *************
                 0.0 ----------------

Multilevel           CAACTCACGAATCCTC
consensus              CAG T CGC G C
sequence               G     T
                       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
21594                       429  3.25e-09 GCCAGCTGCG CAGCTCTCGGCTCCTC GAAGGTGCTT
5456                        410  3.79e-08 CTAGAACTTC CACCTCTCGTCTCCTC TCATTCATCT
5615                         60  1.15e-07 CTATACTGTT CAGATCACCACTGCTC TCTTTGCCAT
5034                        468  2.96e-07 GTGTTATCCC CAAATCTCAAATCCTC TTCCCTGGAC
24373                        12  4.04e-07 TCCATCCTGT CGACTCACCGATGCTC CATGATCCTC
269576                      258  5.09e-07 CTACCACCAC CACCGCCCTACTCCCC TTCCTGGCAA
4774                        395  5.47e-07 GATCACTTAC CATCTCACGAACGCCC CCCCGTCGCC
269490                      382  7.06e-07 TGCAAACTGT CATCGCACTGATCCTT TCTTCTTCCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21594                             3.2e-09  428_[+3]_56
5456                              3.8e-08  409_[+3]_75
5615                              1.2e-07  59_[+3]_425
5034                                3e-07  467_[+3]_17
24373                               4e-07  11_[+3]_473
269576                            5.1e-07  257_[+3]_227
4774                              5.5e-07  394_[+3]_90
269490                            7.1e-07  381_[+3]_103
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
21594                    (  429) CAGCTCTCGGCTCCTC  1
5456                     (  410) CACCTCTCGTCTCCTC  1
5615                     (   60) CAGATCACCACTGCTC  1
5034                     (  468) CAAATCTCAAATCCTC  1
24373                    (   12) CGACTCACCGATGCTC  1
269576                   (  258) CACCGCCCTACTCCCC  1
4774                     (  395) CATCTCACGAACGCCC  1
269490                   (  382) CATCGCACTGATCCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9215 bayes= 11.4912 E= 3.0e+000
  -965    215   -965   -965
   170   -965   -102   -965
   -11     15     -3      0
   -11    173   -965   -965
  -965   -965     -3    158
  -965    215   -965   -965
    89    -85   -965     58
  -965    215   -965   -965
  -111     15     56      0
    89   -965     56   -100
    89    115   -965   -965
  -965    -85   -965    181
  -965    147     56   -965
  -965    215   -965   -965
  -965     15   -965    158
  -965    195   -965   -100
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 3.0e+000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.250000  0.250000  0.250000  0.250000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.125000  0.000000  0.375000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.250000  0.375000  0.250000
 0.500000  0.000000  0.375000  0.125000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.625000  0.375000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.875000  0.000000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
CA[ACGT][CA][TG]C[AT]C[GCT][AG][AC]T[CG]C[TC]C
--------------------------------------------------------------------------------

Time 12.89 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11622                            3.25e-01  500
21594                            1.44e-11  37_[+1(9.19e-07)]_333_\
    [+2(8.17e-08)]_25_[+3(3.25e-09)]_56
2209                             4.07e-06  287_[+1(9.40e-06)]_155_\
    [+2(4.44e-08)]_25
22726                            1.81e-02  182_[+1(8.26e-06)]_109_\
    [+1(1.19e-06)]_185
23083                            1.15e-07  146_[+1(6.81e-05)]_78_\
    [+2(3.07e-10)]_243
24022                            4.51e-03  140_[+1(4.36e-06)]_348
24373                            5.85e-09  11_[+3(4.04e-07)]_142_\
    [+1(4.21e-05)]_95_[+2(9.18e-09)]_203
268830                           5.96e-04  72_[+1(6.45e-08)]_416
269490                           1.02e-03  381_[+3(7.06e-07)]_103
269575                           1.17e-02  95_[+1(1.32e-05)]_393
269576                           1.55e-10  240_[+2(1.80e-07)]_116_\
    [+1(2.98e-05)]_65_[+2(2.05e-10)]_25
39032                            1.22e-07  147_[+1(1.32e-05)]_95_\
    [+2(8.80e-05)]_16_[+2(1.60e-05)]_20_[+2(1.24e-05)]_11_[+2(1.33e-09)]_115
4774                             1.43e-09  308_[+1(4.21e-05)]_74_\
    [+3(5.47e-07)]_25_[+2(1.48e-09)]_44
5034                             1.38e-04  131_[+1(3.25e-05)]_324_\
    [+3(2.96e-07)]_17
5456                             7.17e-11  70_[+1(2.23e-06)]_327_\
    [+3(3.79e-08)]_48_[+2(1.60e-08)]_6
5615                             2.05e-06  59_[+3(1.15e-07)]_252_\
    [+1(5.80e-07)]_161
6114                             2.33e-03  41_[+1(4.96e-06)]_312_\
    [+2(8.35e-05)]_114
7850                             3.05e-01  500
8715                             4.14e-03  132_[+1(1.98e-05)]_356
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************