********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/335/335.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42740                    1.0000    500  43116                    1.0000    500
54731                    1.0000    500  47784                    1.0000    500
47798                    1.0000    500  32661                    1.0000    500
15848                    1.0000    500  49712                    1.0000    500
54190                    1.0000    500  44137                    1.0000    500
33512                    1.0000    500  48348                    1.0000    500
41541                    1.0000    500  36095                    1.0000    500
44576                    1.0000    500  44022                    1.0000    500
48124                    1.0000    500  43653                    1.0000    500
49410                    1.0000    500  50590                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/335/335.seqs.fa -oc motifs/335 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10000    N=              20
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.232 G 0.219 T 0.277
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.232 G 0.219 T 0.277
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 19 llr = 195 E-value = 2.7e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  12:::::5:21:1731
pos.-specific     C  51:141217:::413:
probability       G  5:51135:2:9:4152
matrix            T  :758563418:a12:7

         bits    2.2
                 2.0
                 1.8           **
                 1.5           **
Relative         1.3          ***
Entropy          1.1   **    ****
(14.8 bits)      0.9 * ** *  ****   *
                 0.7 ************ ***
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CTGTTTGACTGTGAGT
consensus            GAT CGTT    C A
sequence                           C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ----------------
54190                        53  9.03e-10 TGTCCACTGG CTGTTTGACTGTGAGT AGAAACCCTA
33512                       381  2.39e-07 GGAGAATAAA GCTTTTGTCTGTGAGT GTTGCGTTGT
32661                        12  2.78e-07 AAGAAAGACG CTGTTGTTCTGTCACT GAACGACACA
48124                        10  1.03e-06  TTTGGTCAA CTGCCTGACTGTGAGG GACTTTGATG
44137                        36  1.31e-06 TTGATCGCTG GTTTCTGACTATGAAT CATATTTGTA
43116                       212  1.64e-06 GCACTTGATT GTGTTTCACTGTAAAT TCAAAGTCAA
49410                       288  2.04e-06 TGTCAGTCTT GTGGCGGACTGTGAAT GGTGCTCGCC
41541                        40  2.28e-06 GTAGGAACCT CTTTTTGTGTGTCTGT TTGAGTATGT
54731                       241  2.28e-06 GAGGCTGCAC GTTTGTCACTGTCAGT GACCCTTGTA
47798                       292  5.64e-06 ACCGTCCGGT ATGTCGTTCTGTCACT CCGTAGAGCA
44022                       301  6.79e-06 ACTCCGGGTG CTTTTTCTTTGTCACT TTGCCATGAC
50590                       317  8.93e-06 CGCCAAGACG GCGTTTGACTGTCGAT CGACTACTGC
48348                       390  2.38e-05 TCCGGATTGT GTGCCGGAGTGTGCGT CGGTAGCCGC
15848                       222  2.38e-05 CACATTCTTA CATTCTTACAGTTACT CATCAATAGG
49712                         8  3.19e-05    AAACGGA GTGTCGTTTTGTTTGT CCAATTGTGA
44576                       211  4.22e-05 TCCTCGATAA GTTTCCTCCTGTCAGG AGTTCCAAAG
36095                       266  5.16e-05 ATCAAGAGCA CAGTTTTTCTATGAAA GAATCGACAA
47784                       484  7.53e-05 TTGTTGTGGT CATTTTGTGAGTGTGG C
43653                         4  7.99e-05        GCC CATTTGGACAGTAACA GTTATTTGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54190                               9e-10  52_[+1]_432
33512                             2.4e-07  380_[+1]_104
32661                             2.8e-07  11_[+1]_473
48124                               1e-06  9_[+1]_475
44137                             1.3e-06  35_[+1]_449
43116                             1.6e-06  211_[+1]_273
49410                               2e-06  287_[+1]_197
41541                             2.3e-06  39_[+1]_445
54731                             2.3e-06  240_[+1]_244
47798                             5.6e-06  291_[+1]_193
44022                             6.8e-06  300_[+1]_184
50590                             8.9e-06  316_[+1]_168
48348                             2.4e-05  389_[+1]_95
15848                             2.4e-05  221_[+1]_263
49712                             3.2e-05  7_[+1]_477
44576                             4.2e-05  210_[+1]_274
36095                             5.2e-05  265_[+1]_219
47784                             7.5e-05  483_[+1]_1
43653                               8e-05  3_[+1]_481
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=19
54190                    (   53) CTGTTTGACTGTGAGT  1
33512                    (  381) GCTTTTGTCTGTGAGT  1
32661                    (   12) CTGTTGTTCTGTCACT  1
48124                    (   10) CTGCCTGACTGTGAGG  1
44137                    (   36) GTTTCTGACTATGAAT  1
43116                    (  212) GTGTTTCACTGTAAAT  1
49410                    (  288) GTGGCGGACTGTGAAT  1
41541                    (   40) CTTTTTGTGTGTCTGT  1
54731                    (  241) GTTTGTCACTGTCAGT  1
47798                    (  292) ATGTCGTTCTGTCACT  1
44022                    (  301) CTTTTTCTTTGTCACT  1
50590                    (  317) GCGTTTGACTGTCGAT  1
48348                    (  390) GTGCCGGAGTGTGCGT  1
15848                    (  222) CATTCTTACAGTTACT  1
49712                    (    8) GTGTCGTTTTGTTTGT  1
44576                    (  211) GTTTCCTCCTGTCAGG  1
36095                    (  266) CAGTTTTTCTATGAAA  1
47784                    (  484) CATTTTGTGAGTGTGG  1
43653                    (    4) CATTTGGACAGTAACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 9.18819 E= 2.7e-001
  -237    103    112  -1089
   -38   -114  -1089    131
 -1089  -1089    127     78
 -1089   -114   -205    161
 -1089     86   -205     93
 -1089   -214     53    119
 -1089    -55    127     19
    95   -214  -1089     61
 -1089    167    -47   -139
   -79  -1089  -1089    161
  -137  -1089    203  -1089
 -1089  -1089  -1089    185
  -137     67     95   -139
   143   -214   -205    -81
    -5     18    112  -1089
  -137  -1089    -47    141
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 19 E= 2.7e-001
 0.052632  0.473684  0.473684  0.000000
 0.210526  0.105263  0.000000  0.684211
 0.000000  0.000000  0.526316  0.473684
 0.000000  0.105263  0.052632  0.842105
 0.000000  0.421053  0.052632  0.526316
 0.000000  0.052632  0.315789  0.631579
 0.000000  0.157895  0.526316  0.315789
 0.526316  0.052632  0.000000  0.421053
 0.000000  0.736842  0.157895  0.105263
 0.157895  0.000000  0.000000  0.842105
 0.105263  0.000000  0.894737  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.105263  0.368421  0.421053  0.105263
 0.736842  0.052632  0.052632  0.157895
 0.263158  0.263158  0.473684  0.000000
 0.105263  0.000000  0.157895  0.736842
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[CG][TA][GT]T[TC][TG][GT][AT]CTGT[GC]A[GAC]T
--------------------------------------------------------------------------------




Time 3.37 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 20 sites = 6 llr = 107 E-value = 4.9e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :8:::5:2::::::2:82:2
pos.-specific     C  2:32:32::22:38:::88:
probability       G  8:7:7223a8:2228a:::8
matrix            T  :2:83:75::885:::2:2:

         bits    2.2         *      *
                 2.0         *      *
                 1.8         *      *
                 1.5 *       **   ***   *
Relative         1.3 ****    **** *******
Entropy          1.1 *****   **** *******
(25.6 bits)      0.9 *****   **** *******
                 0.7 ***** * **** *******
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GAGTGATTGGTTTCGGACCG
consensus              C TC G    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            --------------------
43653                        38  7.44e-11 ACAAGATTGA CAGTGATGGGTTCCGGACCG GAACGAGCAA
36095                       111  1.52e-09 TCTGTCAAGT GAGTTCCGGGTTTCGGAACG CGGATGCGAC
54731                        75  4.69e-09 CGATTCATTC GACTGGTTGGTTTCGGACTA TTCCACGTCA
47798                       266  5.05e-09 CGCTCTGTCT GTGTGATTGGCGGCGGACCG TCCGGTATGT
33512                       321  6.37e-09 ACCCTTCCAA GACTTAGAGGTTTCGGTCCG CTCCCCCGAA
32661                       452  2.02e-08 CTCTTTTCAA GAGCGCTTGCTTCGAGACCG CTTCCAGATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43653                             7.4e-11  37_[+2]_443
36095                             1.5e-09  110_[+2]_370
54731                             4.7e-09  74_[+2]_406
47798                               5e-09  265_[+2]_215
33512                             6.4e-09  320_[+2]_160
32661                               2e-08  451_[+2]_29
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=6
43653                    (   38) CAGTGATGGGTTCCGGACCG  1
36095                    (  111) GAGTTCCGGGTTTCGGAACG  1
54731                    (   75) GACTGGTTGGTTTCGGACTA  1
47798                    (  266) GTGTGATTGGCGGCGGACCG  1
33512                    (  321) GACTTAGAGGTTTCGGTCCG  1
32661                    (  452) GAGCGCTTGCTTCGAGACCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 9620 bayes= 10.3047 E= 4.9e+001
  -923    -47    193   -923
   161   -923   -923    -73
  -923     52    161   -923
  -923    -47   -923    159
  -923   -923    161     27
    87     52    -39   -923
  -923    -47    -39    127
   -71   -923     61     85
  -923   -923    219   -923
  -923    -47    193   -923
  -923    -47   -923    159
  -923   -923    -39    159
  -923     52    -39     85
  -923    185    -39   -923
   -71   -923    193   -923
  -923   -923    219   -923
   161   -923   -923    -73
   -71    185   -923   -923
  -923    185   -923    -73
   -71   -923    193   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 4.9e+001
 0.000000  0.166667  0.833333  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.666667  0.333333
 0.500000  0.333333  0.166667  0.000000
 0.000000  0.166667  0.166667  0.666667
 0.166667  0.000000  0.333333  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.333333  0.166667  0.500000
 0.000000  0.833333  0.166667  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.166667  0.000000  0.833333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
GA[GC]T[GT][AC]T[TG]GGTT[TC]CGGACCG
--------------------------------------------------------------------------------




Time 6.98 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 7 llr = 98 E-value = 9.9e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  9:3413:aa1::a77
pos.-specific     C  :11::3a::7::::1
probability       G  :9169:::::a9:3:
matrix            T  1:4::4:::1:1::1

         bits    2.2       *   *
                 2.0       *** * *
                 1.8       *** * *
                 1.5  *  * *** ***
Relative         1.3 **  * *** ***
Entropy          1.1 ** ** *** ****
(20.2 bits)      0.9 ** ** ********
                 0.7 ** ** *********
                 0.4 ** ************
                 0.2 ** ************
                 0.0 ---------------

Multilevel           AGTGGTCAACGGAAA
consensus              AA A       G
sequence                  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ---------------
47784                        51  1.58e-09 AACAGTTCAT AGTGGCCAACGGAAA CGCGGTTTGC
48124                       144  1.09e-08 CACGTGAAAT AGAAGTCAACGGAAA AATGCTTTTT
54190                       381  8.46e-08 AATCGGAAGC AGAGATCAACGGAAA CAACAGCCAG
41541                       177  2.60e-07 TTCGCTAGGT AGGGGACAACGGAGC GTCACAATCT
33512                       361  2.60e-07 GACAACGACG ACTAGACAACGGAGA ATAAAGCTTT
49410                        75  1.17e-06 TACCTCTAGT AGCAGTCAATGTAAA TGGAGATTGA
32661                        93  1.28e-06 GTTAGAATAG TGTGGCCAAAGGAAT GGTAGTCCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47784                             1.6e-09  50_[+3]_435
48124                             1.1e-08  143_[+3]_342
54190                             8.5e-08  380_[+3]_105
41541                             2.6e-07  176_[+3]_309
33512                             2.6e-07  360_[+3]_125
49410                             1.2e-06  74_[+3]_411
32661                             1.3e-06  92_[+3]_393
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=7
47784                    (   51) AGTGGCCAACGGAAA  1
48124                    (  144) AGAAGTCAACGGAAA  1
54190                    (  381) AGAGATCAACGGAAA  1
41541                    (  177) AGGGGACAACGGAGC  1
33512                    (  361) ACTAGACAACGGAGA  1
49410                    (   75) AGCAGTCAATGTAAA  1
32661                    (   93) TGTGGCCAAAGGAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 9720 bayes= 11.0444 E= 9.9e+001
   165   -945   -945    -95
  -945    -70    197   -945
     7    -70    -61     63
    65   -945    138   -945
   -93   -945    197   -945
     7     30   -945     63
  -945    211   -945   -945
   187   -945   -945   -945
   187   -945   -945   -945
   -93    162   -945    -95
  -945   -945    219   -945
  -945   -945    197    -95
   187   -945   -945   -945
   139   -945     39   -945
   139    -70   -945    -95
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 9.9e+001
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.142857  0.857143  0.000000
 0.285714  0.142857  0.142857  0.428571
 0.428571  0.000000  0.571429  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.285714  0.285714  0.000000  0.428571
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.714286  0.000000  0.142857
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.857143  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.714286  0.142857  0.000000  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
AG[TA][GA]G[TAC]CAACGGA[AG]A
--------------------------------------------------------------------------------




Time 10.63 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42740                            9.95e-02  500
43116                            1.83e-03  131_[+1(6.66e-05)]_64_\
    [+1(1.64e-06)]_273
54731                            4.22e-07  74_[+2(4.69e-09)]_146_\
    [+1(2.28e-06)]_112_[+1(4.52e-05)]_116
47784                            2.87e-06  50_[+3(1.58e-09)]_418_\
    [+1(7.53e-05)]_1
47798                            6.43e-07  265_[+2(5.05e-09)]_6_[+1(5.64e-06)]_\
    193
32661                            3.40e-10  11_[+1(2.78e-07)]_65_[+3(1.28e-06)]_\
    344_[+2(2.02e-08)]_29
15848                            1.17e-01  221_[+1(2.38e-05)]_263
49712                            6.31e-02  7_[+1(3.19e-05)]_477
54190                            1.38e-09  52_[+1(9.03e-10)]_312_\
    [+3(8.46e-08)]_105
44137                            9.33e-03  35_[+1(1.31e-06)]_449
33512                            2.27e-11  320_[+2(6.37e-09)]_20_\
    [+3(2.60e-07)]_5_[+1(2.39e-07)]_104
48348                            8.52e-02  389_[+1(2.38e-05)]_95
41541                            1.45e-05  39_[+1(2.28e-06)]_121_\
    [+3(2.60e-07)]_309
36095                            6.02e-07  110_[+2(1.52e-09)]_135_\
    [+1(5.16e-05)]_219
44576                            1.57e-01  210_[+1(4.22e-05)]_274
44022                            2.67e-02  300_[+1(6.79e-06)]_184
48124                            4.38e-07  9_[+1(1.03e-06)]_118_[+3(1.09e-08)]_\
    342
43653                            2.64e-07  3_[+1(7.99e-05)]_18_[+2(7.44e-11)]_\
    443
49410                            1.04e-05  74_[+3(1.17e-06)]_198_\
    [+1(2.04e-06)]_197
50590                            2.02e-02  316_[+1(8.93e-06)]_168
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************