********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/3/3.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42956                    1.0000    500  28431                    1.0000    500
29606                    1.0000    500  10571                    1.0000    500
26375                    1.0000    500  11956                    1.0000    500
46304                    1.0000    500  43480                    1.0000    500
31827                    1.0000    500  40158                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/3/3.seqs.fa -oc motifs/3 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.264 C 0.244 G 0.210 T 0.282
Background letter frequencies (from dataset with add-one prior applied):
A 0.264 C 0.244 G 0.210 T 0.282
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 10 llr = 103 E-value = 7.8e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified         A 3::41:a:3::1
pos.-specific      C :15::a:a31:8
probability        G 1441:::::9::
matrix             T 65159:::4:a1

bits             2.3
                 2.0      ***
                 1.8      *** **
                 1.6      *** **
Relative         1.4     **** **
Entropy          1.1     **** ***
(14.9 bits)      0.9     **** ***
                 0.7 *** **** ***
                 0.5 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTCTTCACTGTC
consensus            AGGA    A
sequence                     C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                Site
-------------            ----- ---------            ------------
31827                      144  3.05e-07 AACTGTGGCG TGCATCACTGTC AATCTCTCGT
29606                      177  3.05e-07 GACAGTGACG TGCATCACTGTC AAAACCAAGT
40158                      285  6.05e-07 GACCAAGATT TTGTTCACTGTC AATCATGTAA
28431                      236  2.63e-06 CTTGCACCAA AGCTTCACAGTC AGTTTCCGCA
10571                      487  5.43e-06 GTTGCGAGAC GGCATCACCGTC TG
43480                      216  7.52e-06 CGGCGTGCGC TTTTTCACAGTC CTCTGAAATC
11956                      185  1.02e-05 TACGCTGTGT TTGTTCACTGTA AGATTTTCAC
42956                       25  1.73e-05 TCCACTCTAA TTGATCACAGTT TTACCTCTCA
26375                      333  4.74e-05 GGAGACGTAC ATGGACACCGTC TCTCAGGCTG
46304                      467  5.77e-05 CGTCGTTGTC ACCTTCACCCTC CTTGTAGATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31827                               3e-07  143_[+1]_345
29606                               3e-07  176_[+1]_312
40158                             6.1e-07  284_[+1]_204
28431                             2.6e-06  235_[+1]_253
10571                             5.4e-06  486_[+1]_2
43480                             7.5e-06  215_[+1]_273
11956                               1e-05  184_[+1]_304
42956                             1.7e-05  24_[+1]_464
26375                             4.7e-05  332_[+1]_156
46304                             5.8e-05  466_[+1]_22
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=10
31827                    (  144) TGCATCACTGTC  1
29606                    (  177) TGCATCACTGTC  1
40158                    (  285) TTGTTCACTGTC  1
28431                    (  236) AGCTTCACAGTC  1
10571                    (  487) GGCATCACCGTC  1
43480                    (  216) TTTTTCACAGTC  1
11956                    (  185) TTGTTCACTGTA  1
42956                    (   25) TTGATCACAGTT  1
26375                    (  333) ATGGACACCGTC  1
46304                    (  467) ACCTTCACCCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 9.18275 E= 7.8e+000
    18   -997   -107    109
  -997   -128     93     82
  -997    104     93   -149
    60   -997   -107     82
  -140   -997   -997    167
  -997    204   -997   -997
   192   -997   -997   -997
  -997    204   -997   -997
    18     30   -997     50
  -997   -128    210   -997
  -997   -997   -997    182
  -140    171   -997   -149
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 7.8e+000
 0.300000  0.000000  0.100000  0.600000
 0.000000  0.100000  0.400000  0.500000
 0.000000  0.500000  0.400000  0.100000
 0.400000  0.000000  0.100000  0.500000
 0.100000  0.000000  0.000000  0.900000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.300000  0.300000  0.000000  0.400000
 0.000000  0.100000  0.900000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.100000  0.800000  0.000000  0.100000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TA][TG][CG][TA]TCAC[TAC]GTC
--------------------------------------------------------------------------------

Time  1.19 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 20 sites = 6 llr = 101 E-value = 2.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified         A ::32::2:832:::2a2::2
pos.-specific      C a:283::a2252:87:::a2
probability        G :2::3a2:::25:22:3::7
matrix             T :85:3:7::523a:::5a::

bits             2.3      *
                 2.0 *    * *       *  *
                 1.8 *    * *    *  * **
                 1.6 *    * *    *  * **
Relative         1.4 ** * * **   ** * **
Entropy          1.1 ** * * **   ** * **
(24.3 bits)      0.9 ** * * **   **** ***
                 0.7 ** * ****  ***** ***
                 0.5 ********** *********
                 0.2 ********************
                 0.0 --------------------

Multilevel           CTTCCGTCATCGTCCATTCG
consensus              A G    A T    G
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                    Site
-------------            ----- ---------            --------------------
26375                      103  3.13e-10 GTGTAGCCTA CTACGGTCATGTTCCATTCG TACCCACGGT
40158                      219  4.40e-09 CAACGGTTCC CTTCCGTCCACGTGCAGTCG TTGTTATTGA
46304                      367  6.62e-09 GCGTTATTCA CTTATGACATCTTCCATTCG GAACCGCAAC
42956                      245  7.30e-09 CCAAACAAAC CTTCGGTCATAGTCGAGTCA CGCGTCAATG
29606                      318  4.42e-08 TGTCCCGTCC CGACTGTCACCGTCAATTCC GAATCGGTAC
31827                      409  4.70e-08 ATCATTATCC CTCCCGGCAATCTCCAATCG CAACATCGCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
26375                             3.1e-10  102_[+2]_378
40158                             4.4e-09  218_[+2]_262
46304                             6.6e-09  366_[+2]_114
42956                             7.3e-09  244_[+2]_236
29606                             4.4e-08  317_[+2]_163
31827                             4.7e-08  408_[+2]_72
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=6
26375                    (  103) CTACGGTCATGTTCCATTCG  1
40158                    (  219) CTTCCGTCCACGTGCAGTCG  1
46304                    (  367) CTTATGACATCTTCCATTCG  1
42956                    (  245) CTTCGGTCATAGTCGAGTCA  1
29606                    (  318) CGACTGTCACCGTCAATTCC  1
31827                    (  409) CTCCCGGCAATCTCCAATCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4810 bayes= 9.30354 E= 2.2e+001
  -923    203   -923   -923
  -923   -923    -33    156
    34    -55   -923     82
   -66    177   -923   -923
  -923     45     67     24
  -923   -923    225   -923
   -66   -923    -33    124
  -923    203   -923   -923
   166    -55   -923   -923
    34    -55   -923     82
   -66    103    -33    -76
  -923    -55    125     24
  -923   -923   -923    182
  -923    177    -33   -923
   -66    145    -33   -923
   192   -923   -923   -923
   -66   -923     67     82
  -923   -923   -923    182
  -923    203   -923   -923
   -66    -55    167   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 2.2e+001
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.333333  0.166667  0.000000  0.500000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.333333  0.333333  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.000000  0.166667  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.333333  0.166667  0.000000  0.500000
 0.166667  0.500000  0.166667  0.166667
 0.000000  0.166667  0.500000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.833333  0.166667  0.000000
 0.166667  0.666667  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.333333  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.166667  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CT[TA]C[CGT]GTCA[TA]C[GT]TCCA[TG]TCG
--------------------------------------------------------------------------------

Time  2.41 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 8 llr = 120 E-value = 2.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified         A 6:8:156:183:1:1536::8
pos.-specific      C :411:1:a413:395:6:331
probability        G 4619913:4:5:61:41388:
matrix             T :::::31:11:a::41:1::1

bits             2.3
                 2.0        *
                 1.8        *   *
                 1.6    **  *   * *
Relative         1.4    **  *   * *    **
Entropy          1.1 ** **  *   * *    **
(21.7 bits)      0.9 *****  * * ***    ***
                 0.7 ***** ** ***** ******
                 0.5 ***** ** ************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AGAGGAACCAGTGCCACAGGA
consensus            GC   TG G A C TGAGCC
sequence                       C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                    Site
-------------            ----- ---------            ---------------------
40158                      144  1.18e-11 AGTATGACCT AGAGGAACCAGTGCCGCACGA CACACGAGAT
42956                      317  5.66e-09 ATACACAGGA AGAGGAACGTGTGCTACGGGT CTAACGTCCG
28431                      183  1.31e-08 ACAACTTGGA GGAGGAGCGAATCCTGAACGA CCGTGGACGT
26375                      420  1.08e-07 GTCCGGAGGT GGCGGTGCCACTACCACAGCA GTTGTTGATC
10571                      390  1.35e-07 TAGCGTGTAA AGAGGTACCCATGCAAAAGGC GATACCGCCT
31827                       70  2.39e-07 ACATTGGAGG ACGCGGACGAGTGGTACGGGA CAACAACACG
11956                      280  2.73e-07 CGAACCAATC GCAGAATCAAGTCCCGCAGCA AGTCTCCGGA
29606                      358  3.78e-07 CTAACGGCGT ACAGGCACTACTGCCTGTGGA AGCGTTTAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40158                             1.2e-11  143_[+3]_336
42956                             5.7e-09  316_[+3]_163
28431                             1.3e-08  182_[+3]_297
26375                             1.1e-07  419_[+3]_60
10571                             1.4e-07  389_[+3]_90
31827                             2.4e-07  69_[+3]_410
11956                             2.7e-07  279_[+3]_200
29606                             3.8e-07  357_[+3]_122
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=8
40158                    (  144) AGAGGAACCAGTGCCGCACGA  1
42956                    (  317) AGAGGAACGTGTGCTACGGGT  1
28431                    (  183) GGAGGAGCGAATCCTGAACGA  1
26375                    (  420) GGCGGTGCCACTACCACAGCA  1
10571                    (  390) AGAGGTACCCATGCAAAAGGC  1
31827                    (   70) ACGCGGACGAGTGGTACGGGA  1
11956                    (  280) GCAGAATCAAGTCCCGCAGCA  1
29606                    (  358) ACAGGCACTACTGCCTGTGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 9.22641 E= 2.4e+001
   124   -965     84   -965
  -965     62    157   -965
   151    -96    -75   -965
  -965    -96    206   -965
  -108   -965    206   -965
    92    -96    -75    -18
   124   -965     25   -117
  -965    203   -965   -965
  -108     62     84   -117
   151    -96   -965   -117
    -8      4    125   -965
  -965   -965   -965    182
  -108      4    157   -965
  -965    184    -75   -965
  -108    104   -965     41
    92   -965     84   -117
    -8    136    -75   -965
   124   -965     25   -117
  -965      4    184   -965
  -965      4    184   -965
   151    -96   -965   -117
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 2.4e+001
 0.625000  0.000000  0.375000  0.000000
 0.000000  0.375000  0.625000  0.000000
 0.750000  0.125000  0.125000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.125000  0.000000  0.875000  0.000000
 0.500000  0.125000  0.125000  0.250000
 0.625000  0.000000  0.250000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.375000  0.375000  0.125000
 0.750000  0.125000  0.000000  0.125000
 0.250000  0.250000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.250000  0.625000  0.000000
 0.000000  0.875000  0.125000  0.000000
 0.125000  0.500000  0.000000  0.375000
 0.500000  0.000000  0.375000  0.125000
 0.250000  0.625000  0.125000  0.000000
 0.625000  0.000000  0.250000  0.125000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.750000  0.125000  0.000000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AG][GC]AGG[AT][AG]C[CG]A[GAC]T[GC]C[CT][AG][CA][AG][GC][GC]A
--------------------------------------------------------------------------------

Time  3.63 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42956                            3.90e-11  24_[+1(1.73e-05)]_208_\
    [+2(7.30e-09)]_52_[+3(5.66e-09)]_163
28431                            1.19e-06  182_[+3(1.31e-08)]_32_\
    [+1(2.63e-06)]_253
29606                            2.45e-10  176_[+1(3.05e-07)]_129_\
    [+2(4.42e-08)]_20_[+3(3.78e-07)]_122
10571                            1.98e-05  389_[+3(1.35e-07)]_76_\
    [+1(5.43e-06)]_2
26375                            8.22e-11  102_[+2(3.13e-10)]_210_\
    [+1(4.74e-05)]_28_[+2(9.95e-05)]_27_[+3(1.08e-07)]_60
11956                            5.92e-05  184_[+1(1.02e-05)]_83_\
    [+3(2.73e-07)]_200
46304                            1.16e-05  366_[+2(6.62e-09)]_80_\
    [+1(5.77e-05)]_22
43480                            3.01e-02  215_[+1(7.52e-06)]_273
31827                            1.69e-10  69_[+3(2.39e-07)]_53_[+1(3.05e-07)]_\
    253_[+2(4.70e-08)]_72
40158                            3.02e-15  143_[+3(1.18e-11)]_54_\
    [+2(4.40e-09)]_46_[+1(6.05e-07)]_204
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************