********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/390/390.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42635                    1.0000    500  42675                    1.0000    500
8691                     1.0000    500  36976                    1.0000    500
47329                    1.0000    500  5235                     1.0000    500
6437                     1.0000    500  48376                    1.0000    500
29551                    1.0000    500  15286                    1.0000    500
48846                    1.0000    500  32585                    1.0000    500
43754                    1.0000    500  49805                    1.0000    500
50412                    1.0000    500  34957                    1.0000    500
35566                    1.0000    500  12595                    1.0000    500
43126                    1.0000    500  43323                    1.0000    500
36880                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/390/390.seqs.fa -oc motifs/390 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       21    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10500    N=              21
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.230 G 0.232 T 0.277
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.230 G 0.232 T 0.277
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 10 llr = 114 E-value = 8.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::29:a1:1a3:
pos.-specific     C  1:::a:::::3:
probability       G  ::3:::914:47
matrix            T  9a51:::95::3

         bits    2.1     *
                 1.9  *  **   *
                 1.7  *  ***  *
                 1.5 ** ***** *
Relative         1.3 ** ***** *
Entropy          1.1 ** ***** * *
(16.5 bits)      0.8 ** ***** * *
                 0.6 ** ******* *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTTACAGTTAGG
consensus              G     G AT
sequence               A       C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
35566                       316  4.33e-07 AATGAGCATA TTGACAGTGAGG CTGTCGAGAA
48376                       433  4.33e-07 AAATCGTTCA TTGACAGTGAGG TCAACTTCCC
32585                       299  6.05e-07 TGACTGTCTT TTTACAGTTAAG CTCCACGAGA
42675                       261  1.35e-06 GATAATTAGT TTTACAGTTAGT TGCAAACCTT
43126                       325  1.51e-06 GCAAGACGTG TTAACAGTTAAG TCAAAGGAGA
34957                       461  4.49e-06 ACCGAGCGAT TTTACAGGGACG AAACACAGAT
48846                       157  4.49e-06 AGTTTCCCAA TTAACAGTAAGG ATTCTGATCT
50412                       212  4.63e-06 TATTGCGTTT CTTACAGTTAAG ACCCCTATTG
8691                        325  1.21e-05 TTTGGTCATG TTTACAATTACT GTTTGTTCGT
36880                       198  1.35e-05 TGCTTATGTT TTGTCAGTGACT GTGATTGTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35566                             4.3e-07  315_[+1]_173
48376                             4.3e-07  432_[+1]_56
32585                             6.1e-07  298_[+1]_190
42675                             1.3e-06  260_[+1]_228
43126                             1.5e-06  324_[+1]_164
34957                             4.5e-06  460_[+1]_28
48846                             4.5e-06  156_[+1]_332
50412                             4.6e-06  211_[+1]_277
8691                              1.2e-05  324_[+1]_164
36880                             1.4e-05  197_[+1]_291
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=10
35566                    (  316) TTGACAGTGAGG  1
48376                    (  433) TTGACAGTGAGG  1
32585                    (  299) TTTACAGTTAAG  1
42675                    (  261) TTTACAGTTAGT  1
43126                    (  325) TTAACAGTTAAG  1
34957                    (  461) TTTACAGGGACG  1
48846                    (  157) TTAACAGTAAGG  1
50412                    (  212) CTTACAGTTAAG  1
8691                     (  325) TTTACAATTACT  1
36880                    (  198) TTGTCAGTGACT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10269 bayes= 10.2544 E= 8.3e+001
  -997   -120   -997    170
  -997   -997   -997    185
   -38   -997     37     85
   178   -997   -997   -147
  -997    212   -997   -997
   194   -997   -997   -997
  -138   -997    196   -997
  -997   -997   -121    170
  -138   -997     79     85
   194   -997   -997   -997
    20     38     79   -997
  -997   -997    160     12
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 8.3e+001
 0.000000  0.100000  0.000000  0.900000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.000000  0.300000  0.500000
 0.900000  0.000000  0.000000  0.100000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.100000  0.000000  0.900000  0.000000
 0.000000  0.000000  0.100000  0.900000
 0.100000  0.000000  0.400000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.300000  0.300000  0.400000  0.000000
 0.000000  0.000000  0.700000  0.300000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TT[TGA]ACAGT[TG]A[GAC][GT]
--------------------------------------------------------------------------------

Time 3.77 secs.
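
Note (not part of the MEME output): the relative entropy and the log-odds scores reported for Motif 1 can be cross-checked from the letter-probability matrix and the background frequencies printed above. The Python sketch below assumes that the relative entropy is the sum over positions of p * log2(p / background) and that a log-odds entry is 100 * log2(p / background) rounded to the nearest integer. These assumptions agree with the values printed for Motif 1, but they ignore any pseudocounts MEME may apply internally. The names BACKGROUND, MOTIF1, relative_entropy, and log_odds are illustrative only.

```python
import math

# Background letter frequencies from the command line summary, order A, C, G, T.
BACKGROUND = [0.261, 0.230, 0.232, 0.277]

# Motif 1 letter-probability matrix: one row per position, columns A, C, G, T.
MOTIF1 = [
    [0.0, 0.1, 0.0, 0.9],
    [0.0, 0.0, 0.0, 1.0],
    [0.2, 0.0, 0.3, 0.5],
    [0.9, 0.0, 0.0, 0.1],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.9, 0.0],
    [0.0, 0.0, 0.1, 0.9],
    [0.1, 0.0, 0.4, 0.5],
    [1.0, 0.0, 0.0, 0.0],
    [0.3, 0.3, 0.4, 0.0],
    [0.0, 0.0, 0.7, 0.3],
]


def relative_entropy(matrix, background):
    """Sum of p * log2(p / b) over all positions and letters (zero p skipped)."""
    total = 0.0
    for row in matrix:
        for p, b in zip(row, background):
            if p > 0.0:
                total += p * math.log2(p / b)
    return total


def log_odds(p, b):
    """Assumed scaling: 100 * log2(p / b), rounded to the nearest integer."""
    return round(100 * math.log2(p / b))


if __name__ == "__main__":
    # Should land close to the "(16.5 bits)" label in the Motif 1 description.
    print(f"relative entropy: {relative_entropy(MOTIF1, BACKGROUND):.2f} bits")
    # Position 1, letter T (p = 0.9) against background T (0.277).
    print("log-odds, position 1, T:", log_odds(0.9, 0.277))
```

Zero probabilities are skipped in the entropy sum because their contribution tends to zero; in the log-odds matrix the same cells appear as the large negative entries (-997 for this motif).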
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 5 llr = 80 E-value = 2.4e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a8:a2a:2:::8:::
pos.-specific     C  a:22:2::6284:a6a
probability       G  :::8:6:8:8242:::
matrix            T  :::::::22::2::4:

         bits    2.1 *            * *
                 1.9 **  * *      * *
                 1.7 **  * *      * *
                 1.5 ** ** *  **  * *
Relative         1.3 ***** ** ** ** *
Entropy          1.1 ***** ** ** ****
(23.1 bits)      0.8 ***** ** ** ****
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CAAGAGAGCGCCACCC
consensus              CC A TACGGG T
sequence                  C  T  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ----------------
42635                       437  5.29e-10 TGGAATCGAA CAAGAGAGCGCCACTC GTCATTCCGG
43323                        92  1.70e-08 GTTGATGTAG CAAGAGAGTGGCACCC ACCACATGAA
36880                       388  5.82e-08 CAAGAGAGCA CACGAGAGCGCTGCTC GAAACAAGTC
15286                       297  5.82e-08 TGTGGCGCAT CAAGACATAGCGACCC GTTCTAGTAT
5235                        437  6.85e-08 TTTCCCGGTT CAACAAAGCCCGACCC AAAAGTCGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42635                             5.3e-10  436_[+2]_48
43323                             1.7e-08  91_[+2]_393
36880                             5.8e-08  387_[+2]_97
15286                             5.8e-08  296_[+2]_188
5235                              6.8e-08  436_[+2]_48
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=5
42635                    (  437) CAAGAGAGCGCCACTC  1
43323                    (   92) CAAGAGAGTGGCACCC  1
36880                    (  388) CACGAGAGCGCTGCTC  1
15286                    (  297) CAAGACATAGCGACCC  1
5235                     (  437) CAACAAAGCCCGACCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 10185 bayes= 11.2432 E= 2.4e+003
  -897    212   -897   -897
   193   -897   -897   -897
   161    -20   -897   -897
  -897    -20    179   -897
   193   -897   -897   -897
   -38    -20    137   -897
   193   -897   -897   -897
  -897   -897    179    -47
   -38    138   -897    -47
  -897    -20    179   -897
  -897    179    -21   -897
  -897     80     79    -47
   161   -897    -21   -897
  -897    212   -897   -897
  -897    138   -897     53
  -897    212   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 2.4e+003
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.200000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.200000  0.600000  0.000000  0.200000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.400000  0.400000  0.200000
 0.800000  0.000000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CA[AC][GC]A[GAC]A[GT][CAT][GC][CG][CGT][AG]C[CT]C
--------------------------------------------------------------------------------

Time 7.69 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 21 llr = 175 E-value = 7.9e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1::1:4:::::1
pos.-specific     C  31:47:8a:3::
probability       G  :6:2::2:6::2
matrix            T  52a236::4797

         bits    2.1
                 1.9   *    *
                 1.7   *    *
                 1.5   *    *
Relative         1.3   *   **  *
Entropy          1.1   *   *** *
(12.0 bits)      0.8   * ********
                 0.6  ** ********
                 0.4 *** ********
                 0.2 ************
                 0.0 ------------

Multilevel           TGTCCTCCGTTT
consensus            CT GTAG TC
sequence                T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
35566                       360  6.80e-08 CACTTCCTTC TGTCCTCCGTTT TCTTCTCGCT
12595                       346  1.42e-06 CACGCCCTCT TGTCCACCTTTT TGTGGACACA
36976                       454  1.14e-05 CGACGGTGGG TTTCCTCCGTTG ATGATGTGAC
43754                       446  1.55e-05 AGCCGAAGAG CTTCCACCTTTT TCCTAACCAA
50412                       114  1.78e-05 GCGAGTCGGG AGTGTTCCGTTT TCTACTAGTA
49805                       180  1.78e-05 CTTGCTGCCA CGTTTTCCGCTT CTTGTCTATG
15286                       441  2.04e-05 TTGGACATTC AGTACACCGTTT TGCGGCAAGG
8691                        177  3.92e-05 GAACGAGGGT TGTGTTGCGCTT CACGGTAGCA
42635                       235  3.92e-05 GGACAGGCGA TGTCCACCGTAT GCGAGCGGCA
6437                        298  6.23e-05 GGGGATGGGT CGTACACCTTTG ATTGGTGGAA
42675                       161  6.23e-05 CCAACCAAGG CTTGCTGCGCTT TGGCTTGAAA
43126                       142  7.49e-05 TGGTTGTCTA TGTTCTCCTCTA CTACGGCCGT
43323                       475  8.01e-05 TTTCTTGCAA TGTACAGCTCTT ATCCTCATGG
32585                       241  8.74e-05 ATGCATCGCA TCTCCTCCTTTA GGACTGCTTG
5235                         66  9.39e-05 CTCACATGCA ATTTCTCCTCTT TCGGAGGTCT
36880                        13  1.20e-04 AGCAGTTCTG TGTGTTGCTTTG TTCTAAGTTG
34957                       216  1.20e-04 AGAGCACCTG GTTCCTCCGTTG TGGAAAGCCG
48846                       140  1.49e-04 ATCTCCACGT TGTTTTCAGTTT CCCAATTAAC
29551                        68  1.60e-04 CATGGTCCGA CGTGTACCGATT TATCTCGGCG
48376                       125  1.98e-04 GAATGAGATG CCTTGTCCGTTT CAGAAAATTC
47329                       126  3.25e-04 GCAGGGGTCA TCTCCAGCGTGT GGGGAGGATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35566                             6.8e-08  359_[+3]_129
12595                             1.4e-06  345_[+3]_143
36976                             1.1e-05  453_[+3]_35
43754                             1.6e-05  445_[+3]_43
50412                             1.8e-05  113_[+3]_375
49805                             1.8e-05  179_[+3]_309
15286                               2e-05  440_[+3]_48
8691                              3.9e-05  176_[+3]_312
42635                             3.9e-05  234_[+3]_254
6437                              6.2e-05  297_[+3]_191
42675                             6.2e-05  160_[+3]_328
43126                             7.5e-05  141_[+3]_347
43323                               8e-05  474_[+3]_14
32585                             8.7e-05  240_[+3]_248
5235                              9.4e-05  65_[+3]_423
36880                             0.00012  12_[+3]_476
34957                             0.00012  215_[+3]_273
48846                             0.00015  139_[+3]_349
29551                             0.00016  67_[+3]_421
48376                              0.0002  124_[+3]_364
47329                             0.00033  125_[+3]_363
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=21
35566                    (  360) TGTCCTCCGTTT  1
12595                    (  346) TGTCCACCTTTT  1
36976                    (  454) TTTCCTCCGTTG  1
43754                    (  446) CTTCCACCTTTT  1
50412                    (  114) AGTGTTCCGTTT  1
49805                    (  180) CGTTTTCCGCTT  1
15286                    (  441) AGTACACCGTTT  1
8691                     (  177) TGTGTTGCGCTT  1
42635                    (  235) TGTCCACCGTAT  1
6437                     (  298) CGTACACCTTTG  1
42675                    (  161) CTTGCTGCGCTT  1
43126                    (  142) TGTTCTCCTCTA  1
43323                    (  475) TGTACAGCTCTT  1
32585                    (  241) TCTCCTCCTTTA  1
5235                     (   66) ATTTCTCCTCTT  1
36880                    (   13) TGTGTTGCTTTG  1
34957                    (  216) GTTCCTCCGTTG  1
48846                    (  140) TGTTTTCAGTTT  1
29551                    (   68) CGTGTACCGATT  1
48376                    (  125) CCTTGTCCGTTT  1
47329                    (  126) TCTCCAGCGTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10269 bayes= 8.93074 E= 7.9e+003
   -87     31   -228     92
 -1104    -69    142    -22
 -1104  -1104  -1104    185
   -87     73      4    -22
 -1104    153   -228      4
    54  -1104  -1104    116
 -1104    173      4  -1104
  -245    205  -1104  -1104
 -1104  -1104    142     46
  -245     31  -1104    127
  -245  -1104   -228    171
  -145  -1104    -28    137
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 21 E= 7.9e+003
 0.142857  0.285714  0.047619  0.523810
 0.000000  0.142857  0.619048  0.238095
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.380952  0.238095  0.238095
 0.000000  0.666667  0.047619  0.285714
 0.380952  0.000000  0.000000  0.619048
 0.000000  0.761905  0.238095  0.000000
 0.047619  0.952381  0.000000  0.000000
 0.000000  0.000000  0.619048  0.380952
 0.047619  0.285714  0.000000  0.666667
 0.047619  0.000000  0.047619  0.904762
 0.095238  0.000000  0.190476  0.714286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TC][GT]T[CGT][CT][TA][CG]C[GT][TC]TT
--------------------------------------------------------------------------------

Time 11.53 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42635                            6.30e-07  234_[+3(3.92e-05)]_190_\
    [+2(5.29e-10)]_48
42675                            1.24e-03  160_[+3(6.23e-05)]_88_\
    [+1(1.35e-06)]_228
8691                             2.43e-03  176_[+3(3.92e-05)]_136_\
    [+1(1.21e-05)]_164
36976                            3.52e-02  453_[+3(1.14e-05)]_35
47329                            5.06e-01  500
5235                             1.42e-04  65_[+3(9.39e-05)]_359_\
    [+2(6.85e-08)]_48
6437                             5.35e-02  297_[+3(6.23e-05)]_191
48376                            9.26e-04  432_[+1(4.33e-07)]_56
29551                            2.62e-01  500
15286                            1.88e-05  296_[+2(5.82e-08)]_128_\
    [+3(2.04e-05)]_48
48846                            7.35e-03  156_[+1(4.49e-06)]_332
32585                            8.90e-04  240_[+3(8.74e-05)]_46_\
    [+1(6.05e-07)]_190
43754                            1.26e-01  445_[+3(1.55e-05)]_43
49805                            6.23e-04  150_[+1(3.74e-05)]_17_\
    [+3(1.78e-05)]_309
50412                            4.11e-04  113_[+3(1.78e-05)]_68_\
    [+1(6.74e-05)]_6_[+1(4.63e-06)]_277
34957                            4.95e-03  460_[+1(4.49e-06)]_28
35566                            1.47e-07  208_[+1(4.06e-05)]_95_\
    [+1(4.33e-07)]_32_[+3(6.80e-08)]_129
12595                            1.06e-02  345_[+3(1.42e-06)]_143
43126                            1.51e-03  141_[+3(7.49e-05)]_171_\
    [+1(1.51e-06)]_164
43323                            2.44e-05  91_[+2(1.70e-08)]_367_\
    [+3(8.01e-05)]_14
36880                            2.00e-06  197_[+1(1.35e-05)]_178_\
    [+2(5.82e-08)]_97
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
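
Note (not part of the MEME output): the block diagrams above encode each sequence as gap lengths separated by motif occurrences, for example 234_[+3(3.92e-05)]_190_[+2(5.29e-10)]_48. The Python sketch below shows one way to turn such a diagram back into 1-based start positions. It assumes that a bare number is a gap length, that [+N(p)] or [+N] is an occurrence of motif N on the given strand, and that the motif widths are the ones from the MOTIF header lines (12, 16, and 12). The names WIDTHS, TOKEN, and parse_diagram are illustrative only.

```python
import re

# Motif widths taken from the "MOTIF n MEME width = ..." header lines.
WIDTHS = {1: 12, 2: 16, 3: 12}

# Either a motif occurrence such as "[+3(3.92e-05)]" or "[+1]", or a gap such as "234".
TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]*)\))?\]|(\d+)")


def parse_diagram(diagram, widths):
    """Return (sites, total_length) for one block diagram string.

    Each site is (motif, strand, start, p_value) with a 1-based start;
    p_value is None when the diagram does not carry one.
    """
    # Drop the "\" line-continuation marker and any whitespace around it.
    text = "".join(diagram.replace("\\", "").split())
    position = 0
    sites = []
    for match in TOKEN.finditer(text):
        strand, motif, p_value, gap = match.groups()
        if gap is not None:
            position += int(gap)
        else:
            motif_id = int(motif)
            value = float(p_value) if p_value else None
            sites.append((motif_id, strand, position + 1, value))
            position += widths[motif_id]
    return sites, position


if __name__ == "__main__":
    row = r"234_[+3(3.92e-05)]_190_\ [+2(5.29e-10)]_48"  # sequence 42635
    sites, length = parse_diagram(row, WIDTHS)
    for motif, strand, start, p_value in sites:
        print(f"motif {motif} strand {strand} start {start} p-value {p_value}")
    print("sequence length:", length)
```

For sequence 42635 the recovered starts (235 for motif 3, 437 for motif 2) agree with the per-motif site tables, and the gaps plus motif widths add up to the 500-base sequence length listed in the training set.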