********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/309/309.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43095                    1.0000    500  43101                    1.0000    500
46492                    1.0000    500  47109                    1.0000    500
47434                    1.0000    500  47472                    1.0000    500
49785                    1.0000    500  49883                    1.0000    500
50616                    1.0000    500  50617                    1.0000    500
44564                    1.0000    500  45160                    1.0000    500
11811                    1.0000    500  51837                    1.0000    500
48223                    1.0000    500  43885                    1.0000    500
42817                    1.0000    500  45018                    1.0000    500
32218                    1.0000    500  50619                    1.0000    500
49408                    1.0000    500  47234                    1.0000    500
49939                    1.0000    500  47877                    1.0000    500
45440                    1.0000    500
********************************************************************************


********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/309/309.seqs.fa -oc motifs/309 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       25    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           12500    N=              25
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.227 G 0.218 T 0.287
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.227 G 0.218 T 0.287
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 19 llr = 191 E-value = 5.2e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::9:::::873
pos.-specific     C  326:a:::622:
probability       G  1:4::1a32111
matrix            T  68:1:9:73::6

         bits    2.2     * *
                 2.0     * *
                 1.8     * *
                 1.5     ***
Relative         1.3   *****
Entropy          1.1  ******* *
(14.5 bits)      0.9  ******* **
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTCACTGTCAAT
consensus            CCG    GT CA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47877                       153  9.56e-08 AAAAACTAGC TTCACTGTCAAT ACGGAAATTG
50619                        90  9.56e-08 GATGTCATCC TTCACTGTCAAT GATACTCCCC
49408                       121  4.97e-07 AAAACATCGG TTCACTGGCAAT TAGGAATGAG
49939                        87  5.68e-07 CCTACCTGTA CTCACTGTCAAA TGATTAAAGT
47234                        55  9.31e-07 AAATGCCCAT TCCACTGTCAAT GCATTCGAAA
45440                       407  2.80e-06 ATTCTGTGCT CTCACTGTCCAT ACAATCGAGC
49883                       430  9.07e-06 GTTCGTTCAA TTCACTGTTACA AAGTTCAAGT
47109                       100  1.08e-05 ATTGAGAGAT TCGACTGGCAAA ATACGAAAGC
45160                       360  1.32e-05 AATATTCTAC CTCTCTGTCAAA GGCTCTAGAT
48223                       289  1.41e-05 ATGTTGGGCT GTGACTGTGAAT TCTACAGTTC
51837                       200  1.41e-05 TTGGTTTTGC CTGACTGTGACT GTGAGTGGGA
47434                        39  1.52e-05 TCTGGTTTTA CTCACTGTCAGA GGTATCGCCG
50617                       235  1.68e-05 TGGCGGGAAA TTCACTGGTCAT CCGAACCGAC
50616                       235  1.68e-05 AAGAGGAAAA TTCACTGGTCAT CCGAACCGAC
43885                       257  2.87e-05 TAGATTCTTC TTGACTGTTGAT CTTGCCCTTC
45018                       386  3.33e-05 TTCATCAAGT TCCACGGTCAAA ACTACCGTGG
42817                       411  3.51e-05 GACGGATTAA TCGACTGTGAAG TACGTATAAA
43095                       396  4.04e-05 CATGATGACT GTGACTGTCACG GAAGAGCCAA
11811                       467  9.62e-05 ACTACACGCG CTGTCTGGTACT GGCCTCCTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47877                             9.6e-08  152_[+1]_336
50619                             9.6e-08  89_[+1]_399
49408                               5e-07  120_[+1]_368
49939                             5.7e-07  86_[+1]_402
47234                             9.3e-07  54_[+1]_434
45440                             2.8e-06  406_[+1]_82
49883                             9.1e-06  429_[+1]_59
47109                             1.1e-05  99_[+1]_389
45160                             1.3e-05  359_[+1]_129
48223                             1.4e-05  288_[+1]_200
51837                             1.4e-05  199_[+1]_289
47434                             1.5e-05  38_[+1]_450
50617                             1.7e-05  234_[+1]_254
50616                             1.7e-05  234_[+1]_254
43885                             2.9e-05  256_[+1]_232
45018                             3.3e-05  385_[+1]_103
42817                             3.5e-05  410_[+1]_78
43095                               4e-05  395_[+1]_93
11811                             9.6e-05  466_[+1]_22
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=19
47877                    (  153) TTCACTGTCAAT  1
50619                    (   90) TTCACTGTCAAT  1
49408                    (  121) TTCACTGGCAAT  1
49939                    (   87) CTCACTGTCAAA  1
47234                    (   55) TCCACTGTCAAT  1
45440                    (  407) CTCACTGTCCAT  1
49883                    (  430) TTCACTGTTACA  1
47109                    (  100) TCGACTGGCAAA  1
45160                    (  360) CTCTCTGTCAAA  1
48223                    (  289) GTGACTGTGAAT  1
51837                    (  200) CTGACTGTGACT  1
47434                    (   39) CTCACTGTCAGA  1
50617                    (  235) TTCACTGGTCAT  1
50616                    (  235) TTCACTGGTCAT  1
43885                    (  257) TTGACTGTTGAT  1
45018                    (  386) TCCACGGTCAAA  1
42817                    (  411) TCGACTGTGAAG  1
43095                    (  396) GTGACTGTCACG  1
11811                    (  467) CTGTCTGGTACT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 12225 bayes= 10.2258 E= 5.2e-004
 -1089     48   -105    101
 -1089    -11  -1089    146
 -1089    148     76  -1089
   174  -1089  -1089   -145
 -1089    214  -1089  -1089
 -1089  -1089   -205    172
 -1089  -1089    220  -1089
 -1089  -1089     27    136
 -1089    135    -47    -13
   156    -52   -205  -1089
   146    -11   -205  -1089
    24  -1089   -105    101
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 19 E= 5.2e-004
 0.000000  0.315789  0.105263  0.578947
 0.000000  0.210526  0.000000  0.789474
 0.000000  0.631579  0.368421  0.000000
 0.894737  0.000000  0.000000  0.105263
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.052632  0.947368
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.263158  0.736842
 0.000000  0.578947  0.157895  0.263158
 0.789474  0.157895  0.052632  0.000000
 0.736842  0.210526  0.052632  0.000000
 0.315789  0.000000  0.105263  0.578947
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TC][TC][CG]ACTG[TG][CT]A[AC][TA]
--------------------------------------------------------------------------------

Time  5.32 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 5 llr = 105 E-value = 5.7e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::86::::46:84::::6:::
pos.-specific     C  a8::::6a2:a:6:aa6:88:
probability       G  :2:2::2:4::::8::::22a
matrix            T  ::22aa2::4:2:2::44:::

         bits    2.2 *      *  *   **    *
                 2.0 *      *  *   **    *
                 1.8 *   ** *  *   **    *
                 1.5 **  ** *  *   **  ***
Relative         1.3 **  ** *  *  ***  ***
Entropy          1.1 *** ** *  ******* ***
(30.2 bits)      0.9 *** ** * ************
                 0.7 ******** ************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCAATTCCAACACGCCCACCG
consensus             GTG  G GT TAT  TTGG
sequence                T  T C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
50617                       441  2.34e-12 CACCATTTTA CCAATTCCGACACGCCTTCCG ATACTCTCCT
50616                       441  2.34e-12 CACCATTTTA CCAATTCCGACACGCCTTCCG ATACTCTCCT
48223                       420  2.40e-10 GTAACGAGGT CCAATTCCATCTAGCCCAGCG CGGAAGATTG
43095                       162  8.98e-10 CCTATGATTG CGTGTTGCCACACGCCCACCG ACCCTTAAAC
45018                       440  1.69e-09 TCACCGTGGA CCATTTTCATCAATCCCACGG TATTGTACAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50617                             2.3e-12  440_[+2]_39
50616                             2.3e-12  440_[+2]_39
48223                             2.4e-10  419_[+2]_60
43095                               9e-10  161_[+2]_318
45018                             1.7e-09  439_[+2]_40
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=5
50617                    (  441) CCAATTCCGACACGCCTTCCG  1
50616                    (  441) CCAATTCCGACACGCCTTCCG  1
48223                    (  420) CCAATTCCATCTAGCCCAGCG  1
43095                    (  162) CGTGTTGCCACACGCCCACCG  1
45018                    (  440) CCATTTTCATCAATCCCACGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 12000 bayes= 11.4799 E= 5.7e+000
  -897    214   -897   -897
  -897    181    -13   -897
   158   -897   -897    -52
   116   -897    -13    -52
  -897   -897   -897    180
  -897   -897   -897    180
  -897    140    -13    -52
  -897    214   -897   -897
    58    -18     87   -897
   116   -897   -897     48
  -897    214   -897   -897
   158   -897   -897    -52
    58    140   -897   -897
  -897   -897    187    -52
  -897    214   -897   -897
  -897    214   -897   -897
  -897    140   -897     48
   116   -897   -897     48
  -897    181    -13   -897
  -897    181    -13   -897
  -897   -897    219   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 5.7e+000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.600000  0.000000  0.200000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.200000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.200000  0.400000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[CG][AT][AGT]TT[CGT]C[AGC][AT]C[AT][CA][GT]CC[CT][AT][CG][CG]G
--------------------------------------------------------------------------------

Time 10.05 secs.

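The regular expression printed for each motif can be applied directly with a standard regex engine. The following is a minimal Python sketch and is not part of the MEME output: it checks the Motif 2 expression against the five Motif 2 sites and the Motif 1 expression against three of the Motif 1 sites, all copied from the listings above. The names MOTIF1_RE, MOTIF2_RE and matching_sites are hypothetical, introduced only for illustration.

```python
import re

# Regular expressions copied verbatim from the "regular expression" sections.
MOTIF1_RE = re.compile(r"[TC][TC][CG]ACTG[TG][CT]A[AC][TA]")
MOTIF2_RE = re.compile(
    r"C[CG][AT][AGT]TT[CGT]C[AGC][AT]C[AT][CA][GT]CC[CT][AT][CG][CG]G"
)

# Sites copied from the BLOCKS listings (all five for Motif 2, three for Motif 1).
MOTIF2_SITES = [
    "CCAATTCCGACACGCCTTCCG",  # 50617
    "CCAATTCCGACACGCCTTCCG",  # 50616
    "CCAATTCCATCTAGCCCAGCG",  # 48223
    "CGTGTTGCCACACGCCCACCG",  # 43095
    "CCATTTTCATCAATCCCACGG",  # 45018
]
MOTIF1_SITES = [
    "TTCACTGTCAAT",  # 47877
    "GTGACTGTGAAT",  # 48223
    "CTCACTGTCCAT",  # 45440
]


def matching_sites(pattern, sites):
    """Return the sites that the compiled pattern matches over their full length."""
    return [site for site in sites if pattern.fullmatch(site)]


print(len(matching_sites(MOTIF2_RE, MOTIF2_SITES)))  # 5
print(matching_sites(MOTIF1_RE, MOTIF1_SITES))       # ['TTCACTGTCAAT']
```

Two of the three Motif 1 sites do not match because the expression omits the rarer letters at some positions (G at position 1, C at position 10), so the expression is a stricter summary than the site list itself.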
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 3 llr = 77 E-value = 2.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::3:7aa::::7:::a:::a:
pos.-specific     C  ::::3::77:a:::::::::3
probability       G  aa7::::33::::aa:a:a:7
matrix            T  :::a:::::a:3a::::a:::

         bits    2.2 **        *  ** * *
                 2.0 **   **   *  **** **
                 1.8 ** * **  ** ********
                 1.5 ** * **  ** ********
Relative         1.3 ** * ****** *********
Entropy          1.1 *********** *********
(37.1 bits)      0.9 *********************
                 0.7 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GGGTAAACCTCATGGAGTGAG
consensus              A C  GG  T        C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
50617                       371  1.50e-13 CGCTTTGTAA GGGTAAACCTCATGGAGTGAG GACACTTTAT
50616                       371  1.50e-13 CGCTTTGTAA GGGTAAACCTCATGGAGTGAG GACACTTTAT
32218                       409  1.00e-11 GAAAAGCCTG GGATCAAGGTCTTGGAGTGAC CGCGTGAGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50617                             1.5e-13  370_[+3]_109
50616                             1.5e-13  370_[+3]_109
32218                               1e-11  408_[+3]_71
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=3
50617                    (  371) GGGTAAACCTCATGGAGTGAG  1
50616                    (  371) GGGTAAACCTCATGGAGTGAG  1
32218                    (  409) GGATCAAGGTCTTGGAGTGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 12000 bayes= 12.413 E= 2.5e+001
  -823   -823    219   -823
  -823   -823    219   -823
    32   -823    161   -823
  -823   -823   -823    180
   131     55   -823   -823
   190   -823   -823   -823
   190   -823   -823   -823
  -823    155     61   -823
  -823    155     61   -823
  -823   -823   -823    180
  -823    214   -823   -823
   131   -823   -823     21
  -823   -823   -823    180
  -823   -823    219   -823
  -823   -823    219   -823
   190   -823   -823   -823
  -823   -823    219   -823
  -823   -823   -823    180
  -823   -823    219   -823
   190   -823   -823   -823
  -823     55    161   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 3 E= 2.5e+001
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.666667  0.333333  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GG[GA]T[AC]AA[CG][CG]TC[AT]TGGAGTGA[GC]
--------------------------------------------------------------------------------

Time 15.10 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43095                            1.14e-06  161_[+2(8.98e-10)]_213_\
    [+1(4.04e-05)]_93
43101                            8.59e-01  500
46492                            7.99e-02  340_[+2(2.09e-05)]_139
47109                            5.78e-02  99_[+1(1.08e-05)]_389
47434                            6.11e-02  38_[+1(1.52e-05)]_450
47472                            3.64e-01  500
49785                            3.00e-01  500
49883                            4.84e-02  221_[+1(2.87e-05)]_196_\
    [+1(9.07e-06)]_59
50616                            8.21e-19  234_[+1(1.68e-05)]_124_\
    [+3(1.50e-13)]_49_[+2(2.34e-12)]_39
50617                            8.21e-19  234_[+1(1.68e-05)]_124_\
    [+3(1.50e-13)]_49_[+2(2.34e-12)]_39
44564                            4.81e-01  500
45160                            3.45e-02  359_[+1(1.32e-05)]_129
11811                            2.10e-01  466_[+1(9.62e-05)]_22
51837                            1.62e-02  199_[+1(1.41e-05)]_289
48223                            1.43e-07  288_[+1(1.41e-05)]_119_\
    [+2(2.40e-10)]_60
43885                            8.52e-02  256_[+1(2.87e-05)]_232
42817                            2.56e-02  129_[+2(9.67e-05)]_260_\
    [+1(3.51e-05)]_78
45018                            9.46e-07  385_[+1(3.33e-05)]_42_\
    [+2(1.69e-09)]_40
32218                            4.31e-07  408_[+3(1.00e-11)]_71
50619                            9.64e-04  89_[+1(9.56e-08)]_399
49408                            2.87e-03  120_[+1(4.97e-07)]_368
47234                            9.33e-03  54_[+1(9.31e-07)]_434
49939                            2.99e-05  60_[+3(9.04e-05)]_5_[+1(5.68e-07)]_\
    318_[+2(3.99e-05)]_63
47877                            4.09e-05  152_[+1(9.56e-08)]_336
45440                            1.51e-02  381_[+1(8.51e-05)]_13_\
    [+1(2.80e-06)]_82
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
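
The motif diagrams in the summary encode site positions as alternating spacer lengths and motif occurrences. A minimal Python sketch of decoding one diagram follows; it is not part of the MEME output. It assumes the motif widths reported above (12, 21 and 21), treats a trailing backslash as a line continuation, and uses a hypothetical helper name, parse_diagram.

```python
import re

# Motif widths taken from the "MOTIF n ... width = w" lines of this report.
WIDTHS = {1: 12, 2: 21, 3: 21}

# One motif occurrence, e.g. "[+1(1.68e-05)]" or, without a p-value, "[+1]".
SITE_RE = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_diagram(diagram, widths=WIDTHS):
    """Decode a diagram into (motif, strand, 1-based start, p-value) tuples.

    Also returns the total sequence length implied by the diagram.
    """
    # Join continuation lines: drop backslashes and all whitespace.
    flat = "".join(diagram.replace("\\", "").split())
    position = 0  # number of sequence letters consumed so far
    sites = []
    for token in flat.split("_"):
        if token.isdigit():
            position += int(token)  # spacer between sites
            continue
        match = SITE_RE.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised diagram token: {token!r}")
        strand, motif = match.group(1), int(match.group(2))
        p_value = float(match.group(3)) if match.group(3) else None
        sites.append((motif, strand, position + 1, p_value))
        position += widths[motif]
    return sites, position


# Diagram for sequence 50616, copied from the combined block diagrams.
sites, length = parse_diagram(
    "234_[+1(1.68e-05)]_124_\\\n    [+3(1.50e-13)]_49_[+2(2.34e-12)]_39"
)
print([(motif, start) for motif, _, start, _ in sites], length)
# [(1, 235), (3, 371), (2, 441)] 500
```

The decoded starts (235, 371, 441) agree with the Start column of the per-motif site tables for 50616, and the implied length matches the 500 listed in the training set.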