********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/98/98.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
53929                    1.0000    500  2444                     1.0000    500
14702                    1.0000    500  39311                    1.0000    500
7359                     1.0000    500  8538                     1.0000    500
19188                    1.0000    500  10954                    1.0000    500
44739                    1.0000    500  45682                    1.0000    500
49194                    1.0000    500  46458                    1.0000    500
35067                    1.0000    500  34245                    1.0000    500
49066                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/98/98.seqs.fa -oc motifs/98 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 15 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 7500 N= 15 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.275 C 0.234 G 0.213 T 0.278 Background letter frequencies (from dataset with add-one prior applied): A 0.275 C 0.234 G 0.213 T 0.278 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 12 sites = 15 llr = 137 E-value = 8.3e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 7:97:94a:161 pos.-specific C 37:19:5:3111 probability G :3:3:11:731: matrix T ::1:1::::529 bits 2.2 2.0 1.8 * 1.6 * ** * Relative 1.3 ** ** ** Entropy 1.1 *** ** ** * (13.2 bits) 0.9 ****** ** * 0.7 ********* * 0.4 ********* ** 0.2 ************ 0.0 ------------ Multilevel ACAACACAGTAT consensus CG G A CGT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 49066 364 1.14e-06 GGATCGATCG ACAACACAGCAT TGCGGTAGGT 7359 101 2.28e-06 CCGCCAGCGC ACAACAAAGCAT CGGCCAATAT 53929 138 2.28e-06 CAAATCGATT CGAACACAGTAT 
GGTAGTGTAA 10954 180 2.93e-06 ATGAAATGAA AGAACAAAGGAT TACTGTTGTG 45682 117 5.85e-06 AATCTGAAAC CGAGCACAGTAT AGCCAATTCT 19188 346 9.24e-06 ATGGCAAGAC ACAACACACTCT GGAAACTGCT 39311 158 1.64e-05 CAGGGGAAGA CCAACAGAGAAT TTCCAAAAAC 14702 457 1.64e-05 CGTCCGTCAC CCACCAAAGTAT TATCCTCCCT 34245 407 2.80e-05 ACTTCTCATC ACAACACACATT TTCTTCATTT 2444 116 3.05e-05 AGAAGCAAAG ACAACAGAGTAC ACGCGCCTTT 35067 52 6.82e-05 GAAAACTGCG ACAACACAGTGA CAAGTAAATT 8538 465 7.81e-05 TCCTGCTCCG ACTACAAAGGCT ACCGGCGCTC 49194 42 8.88e-05 TCAACTACCT ACAGCGAAGGTT CTCCTTATTT 46458 411 9.49e-05 TCTGATTCTG ACAGTACACTTT TGAATATTTC 44739 343 1.86e-04 CCATAGACTG CGAGTAAACGAT ATCGCTAAAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 49066 1.1e-06 363_[+1]_125 7359 2.3e-06 100_[+1]_388 53929 2.3e-06 137_[+1]_351 10954 2.9e-06 179_[+1]_309 45682 5.9e-06 116_[+1]_372 19188 9.2e-06 345_[+1]_143 39311 1.6e-05 157_[+1]_331 14702 1.6e-05 456_[+1]_32 34245 2.8e-05 406_[+1]_82 2444 3e-05 115_[+1]_373 35067 6.8e-05 51_[+1]_437 8538 7.8e-05 464_[+1]_24 49194 8.9e-05 41_[+1]_447 46458 9.5e-05 410_[+1]_78 44739 0.00019 342_[+1]_146 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=12 seqs=15 49066 ( 364) ACAACACAGCAT 1 7359 ( 101) ACAACAAAGCAT 1 53929 ( 138) CGAACACAGTAT 1 10954 ( 180) AGAACAAAGGAT 1 45682 ( 117) CGAGCACAGTAT 1 19188 ( 346) ACAACACACTCT 1 39311 ( 158) CCAACAGAGAAT 1 14702 ( 457) CCACCAAAGTAT 1 34245 ( 407) ACAACACACATT 1 2444 ( 116) ACAACAGAGTAC 1 35067 ( 52) 
ACAACACAGTGA  1
8538                     (  465) ACTACAAAGGCT  1
49194                    (   42) ACAGCGAAGGTT  1
46458                    (  411) ACAGTACACTTT  1
44739                    (  343) CGAGTAAACGAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 9.60607 E= 8.3e+001
   128     51  -1055  -1055
 -1055    165     32  -1055
   176  -1055  -1055   -206
   128   -181     32  -1055
 -1055    189  -1055   -106
   176  -1055   -168  -1055
    54    100    -68  -1055
   186  -1055  -1055  -1055
 -1055     19    178  -1055
  -105    -81     32     75
   112    -81   -168    -47
  -204   -181  -1055    164
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 15 E= 8.3e+001
 0.666667  0.333333  0.000000  0.000000
 0.000000  0.733333  0.266667  0.000000
 0.933333  0.000000  0.000000  0.066667
 0.666667  0.066667  0.266667  0.000000
 0.000000  0.866667  0.000000  0.133333
 0.933333  0.000000  0.066667  0.000000
 0.400000  0.466667  0.133333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.266667  0.733333  0.000000
 0.133333  0.133333  0.266667  0.466667
 0.600000  0.133333  0.066667  0.200000
 0.066667  0.066667  0.000000  0.866667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[AC][CG]A[AG]CA[CA]A[GC][TG][AT]T
--------------------------------------------------------------------------------




Time  1.76 secs.

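The two Motif 1 matrices above appear to be related through the background letter frequencies listed in the COMMAND LINE SUMMARY. The sketch below is an illustration only, not MEME's own code: it assumes each nonzero log-odds entry is roughly 100 * log2(probability / background), which agrees with the values printed in this file to within about one unit. The function name `approx_log_odds` is hypothetical. Zero probabilities are skipped because the file reports a large negative score for them (-1055 here), which this simple formula does not reproduce.

```python
import math

# Background frequencies as printed in this file's COMMAND LINE SUMMARY.
BACKGROUND = {"A": 0.275, "C": 0.234, "G": 0.213, "T": 0.278}


def approx_log_odds(prob, letter):
    """Return round(100 * log2(prob / background)), or None when prob is 0.

    Assumption: this scaling is inferred from the numbers in this file,
    so small differences from the printed matrix are expected.
    """
    if prob <= 0.0:
        return None  # the file prints a large negative score here instead
    return round(100 * math.log2(prob / BACKGROUND[letter]))


# First and last rows of the Motif 1 letter-probability matrix (A, C, G, T).
rows = [
    (0.666667, 0.333333, 0.000000, 0.000000),
    (0.066667, 0.066667, 0.000000, 0.866667),
]
for row in rows:
    print([approx_log_odds(p, letter) for p, letter in zip(row, "ACGT")])
```

The printed values can be compared with the first and last rows of the log-odds matrix (128 51 -1055 -1055 and -204 -181 -1055 164).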
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 15 llr = 171 E-value = 6.9e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 12:37213133:137974969 pos.-specific C 51521:563127:6:11313: probability G 33541:113::1913::2:11 matrix T :4:1183:3652::::21::: bits 2.2 2.0 1.8 1.6 * * * Relative 1.3 * * * * Entropy 1.1 * * ** ** * * (16.4 bits) 0.9 * * * ** ** * * 0.7 * * * * ****** *** 0.4 * * ** * ******** *** 0.2 * ****** ************ 0.0 --------------------- Multilevel CTGGATCCCTTCGCAAAAAAA consensus GGCA ATAGAAT AG TC C sequence A C T C G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 2444 324 3.02e-09 ACAGATTCGA CTGGATGCCCTCGCAAACAAA GAACCGACCA 45682 467 1.65e-08 AACTTGGTTT GGGAATTCGTTCGAGAAAAAA AGATATTCAA 46458 362 5.99e-07 AAACGTTTCA CTCTATTAGTTTGCAAAAACA GTAATACATC 14702 130 9.06e-07 TTGGTACAGG CCGCCACACTCCGCAAAAAAA TTCGGTCCGC 53929 185 1.11e-06 TCACGGTATG GTGGTTCCATTCGAAATGACA GTGTCGAAAC 8538 480 2.14e-06 AAAGGCTACC GGCGCTCCGTCCGGAAAACCA 39311 127 2.14e-06 ATACGATTCC ATGAAACCCTACAAAAACAAA CAGGGGAAGA 7359 26 2.55e-06 CCTTATCGGT GACAATGGCTTTGCGAAAAAA CTTGGAACTG 19188 377 2.79e-06 TAATATTACT CTGGAACAAAACGGGAAGAAA CATTGCCTAC 44739 226 3.04e-06 TTATTAGATG CAGAATCCTAAGACAAAAAAA CGGCTAAATC 35067 168 3.91e-06 AAATAAAAGT CCCGATCACACCGCAAATAAG CTATTTGCGG 34245 40 4.99e-06 TTGTACGACA 
GACAATTCTCTCGAAATGAGA CACTCCCATT 10954 442 6.82e-06 CCGACGAACA CGCCATAATTTTGCGACCAAA TTGGTTGTGC 49194 107 1.22e-05 GTAGCCAAAA AGGCTTTCTAACGCAATCAGA CTCCTGACAC 49066 219 1.60e-05 TATCTGTTTA CTCGGTTCGTTCGCACCCCCA CTCACAGTCA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 2444 3e-09 323_[+2]_156 45682 1.7e-08 466_[+2]_13 46458 6e-07 361_[+2]_118 14702 9.1e-07 129_[+2]_350 53929 1.1e-06 184_[+2]_295 8538 2.1e-06 479_[+2] 39311 2.1e-06 126_[+2]_353 7359 2.6e-06 25_[+2]_454 19188 2.8e-06 376_[+2]_103 44739 3e-06 225_[+2]_254 35067 3.9e-06 167_[+2]_312 34245 5e-06 39_[+2]_440 10954 6.8e-06 441_[+2]_38 49194 1.2e-05 106_[+2]_373 49066 1.6e-05 218_[+2]_261 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=15 2444 ( 324) CTGGATGCCCTCGCAAACAAA 1 45682 ( 467) GGGAATTCGTTCGAGAAAAAA 1 46458 ( 362) CTCTATTAGTTTGCAAAAACA 1 14702 ( 130) CCGCCACACTCCGCAAAAAAA 1 53929 ( 185) GTGGTTCCATTCGAAATGACA 1 8538 ( 480) GGCGCTCCGTCCGGAAAACCA 1 39311 ( 127) ATGAAACCCTACAAAAACAAA 1 7359 ( 26) GACAATGGCTTTGCGAAAAAA 1 19188 ( 377) CTGGAACAAAACGGGAAGAAA 1 44739 ( 226) CAGAATCCTAAGACAAAAAAA 1 35067 ( 168) CCCGATCACACCGCAAATAAG 1 34245 ( 40) GACAATTCTCTCGAAATGAGA 1 10954 ( 442) CGCCATAATTTTGCGACCAAA 1 49194 ( 107) AGGCTTTCTAACGCAATCAGA 1 49066 ( 219) CTCGGTTCGTTCGCACCCCCA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 
position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 8.90388 E= 6.9e+001 -105 119 64 -1055 -46 -81 32 53 -1055 100 132 -1055 28 -22 91 -206 128 -81 -168 -106 -46 -1055 -1055 153 -204 100 -68 26 28 136 -168 -1055 -105 51 32 -6 -5 -81 -1055 111 -5 -22 -1055 94 -1055 165 -168 -47 -105 -1055 202 -1055 -5 136 -68 -1055 141 -1055 32 -1055 176 -181 -1055 -1055 128 -81 -1055 -47 54 51 -9 -206 165 -81 -1055 -1055 112 19 -68 -1055 176 -1055 -168 -1055 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 15 E= 6.9e+001 0.133333 0.533333 0.333333 0.000000 0.200000 0.133333 0.266667 0.400000 0.000000 0.466667 0.533333 0.000000 0.333333 0.200000 0.400000 0.066667 0.666667 0.133333 0.066667 0.133333 0.200000 0.000000 0.000000 0.800000 0.066667 0.466667 0.133333 0.333333 0.333333 0.600000 0.066667 0.000000 0.133333 0.333333 0.266667 0.266667 0.266667 0.133333 0.000000 0.600000 0.266667 0.200000 0.000000 0.533333 0.000000 0.733333 0.066667 0.200000 0.133333 0.000000 0.866667 0.000000 0.266667 0.600000 0.133333 0.000000 0.733333 0.000000 0.266667 0.000000 0.933333 0.066667 0.000000 0.000000 0.666667 0.133333 0.000000 0.200000 0.400000 0.333333 0.200000 0.066667 0.866667 0.133333 0.000000 0.000000 0.600000 0.266667 0.133333 0.000000 0.933333 0.000000 0.066667 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- 
[CG][TGA][GC][GAC]A[TA][CT][CA][CGT][TA][TAC][CT]G[CA][AG]A[AT][ACG]A[AC]A -------------------------------------------------------------------------------- Time 3.55 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 15 sites = 5 llr = 77 E-value = 5.4e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :44a:::8:2::::2 pos.-specific C a4::8a:::88:aa: probability G :22:2:822:::::8 matrix T ::4:::2:8:2a::: bits 2.2 2.0 * * ** 1.8 * * * *** 1.6 * * * *** Relative 1.3 * **** ****** Entropy 1.1 * ************ (22.4 bits) 0.9 * ************ 0.7 * ************ 0.4 *************** 0.2 *************** 0.0 --------------- Multilevel CAAACCGATCCTCCG consensus CT G TGGAT A sequence GG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 8538 438 1.32e-09 GAGGAAGACT CCTACCGATCCTCCG ACTCCTGCTC 14702 328 1.13e-08 GGCAGTGCGT CCAAGCGATCCTCCG TCACACGCAA 7359 306 5.79e-08 TACGCAGGTA CGGACCTATCCTCCG GCATCGTCAC 39311 335 7.20e-08 TGAAATACGG CATACCGAGCTTCCG TTGGCGTTGA 53929 235 2.62e-07 TGATGATGTC CAAACCGGTACTCCA CAATGTCGTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF 
DIAGRAM ------------- ---------------- ------------- 8538 1.3e-09 437_[+3]_48 14702 1.1e-08 327_[+3]_158 7359 5.8e-08 305_[+3]_180 39311 7.2e-08 334_[+3]_151 53929 2.6e-07 234_[+3]_251 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=15 seqs=5 8538 ( 438) CCTACCGATCCTCCG 1 14702 ( 328) CCAAGCGATCCTCCG 1 7359 ( 306) CGGACCTATCCTCCG 1 39311 ( 335) CATACCGAGCTTCCG 1 53929 ( 235) CAAACCGGTACTCCA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 15 n= 7290 bayes= 10.7605 E= 5.4e+002 -897 210 -897 -897 54 77 -9 -897 54 -897 -9 53 186 -897 -897 -897 -897 177 -9 -897 -897 210 -897 -897 -897 -897 190 -47 154 -897 -9 -897 -897 -897 -9 152 -46 177 -897 -897 -897 177 -897 -47 -897 -897 -897 185 -897 210 -897 -897 -897 210 -897 -897 -46 -897 190 -897 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 5.4e+002 0.000000 1.000000 0.000000 0.000000 0.400000 0.400000 0.200000 0.000000 0.400000 0.000000 0.200000 0.400000 1.000000 0.000000 0.000000 0.000000 0.000000 0.800000 0.200000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.800000 0.200000 0.800000 0.000000 0.200000 0.000000 0.000000 0.000000 0.200000 0.800000 0.200000 0.800000 0.000000 0.000000 0.000000 0.800000 0.000000 
0.200000 0.000000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.200000 0.000000 0.800000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- C[ACG][ATG]A[CG]C[GT][AG][TG][CA][CT]TCC[GA] -------------------------------------------------------------------------------- Time 5.12 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 53929 2.22e-08 137_[+1(2.28e-06)]_35_\ [+2(1.11e-06)]_29_[+3(2.62e-07)]_251 2444 7.92e-07 115_[+1(3.05e-05)]_196_\ [+2(3.02e-09)]_156 14702 6.29e-09 129_[+2(9.06e-07)]_177_\ [+3(1.13e-08)]_114_[+1(1.64e-05)]_32 39311 7.59e-08 126_[+2(2.14e-06)]_10_\ [+1(1.64e-05)]_165_[+3(7.20e-08)]_151 7359 1.20e-08 25_[+2(2.55e-06)]_54_[+1(2.28e-06)]_\ 193_[+3(5.79e-08)]_180 8538 7.97e-09 120_[+3(7.85e-05)]_302_\ [+3(1.32e-09)]_12_[+1(7.81e-05)]_3_[+2(2.14e-06)] 19188 4.33e-04 345_[+1(9.24e-06)]_19_\ [+2(2.79e-06)]_103 10954 1.95e-04 179_[+1(2.93e-06)]_250_\ [+2(6.82e-06)]_38 44739 3.94e-03 225_[+2(3.04e-06)]_254 45682 3.65e-06 116_[+1(5.85e-06)]_338_\ [+2(1.65e-08)]_13 49194 1.01e-02 41_[+1(8.88e-05)]_53_[+2(1.22e-05)]_\ 373 46458 9.72e-04 361_[+2(5.99e-07)]_28_\ [+1(9.49e-05)]_78 35067 1.82e-03 51_[+1(6.82e-05)]_80_[+2(7.52e-05)]_\ 
    3_[+2(3.91e-06)]_237_[+2(6.85e-05)]_54
34245                            7.54e-04  39_[+2(4.99e-06)]_346_\
    [+1(2.80e-05)]_82
49066                            5.54e-05  218_[+2(1.60e-05)]_124_\
    [+1(1.14e-06)]_125
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
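
The motif diagrams in this file can be read as alternating spacer lengths and motif occurrences. The sketch below is a minimal illustration, not part of MEME: the function name `diagram_sites` is hypothetical, and the motif widths are copied from the MOTIF 1 to MOTIF 3 headers above. It walks one diagram string, recovers the 1-based start of each site, and sums the pieces so the total can be checked against the sequence length of 500 given in the TRAINING SET.

```python
import re

# Motif widths as reported in the MOTIF 1..3 headers of this file.
MOTIF_WIDTHS = {1: 12, 2: 21, 3: 15}


def diagram_sites(diagram):
    """Return ([(motif, 1-based start), ...], total length) for one diagram."""
    # Join continuation pieces: drop backslashes and any whitespace.
    text = re.sub(r"[\\\s]", "", diagram)
    position = 0
    sites = []
    for token in text.split("_"):
        # A site looks like [+2] or [+2(1.11e-06)]; the p-value is optional.
        match = re.fullmatch(r"\[([+-])(\d+)(?:\(([^)]*)\))?\]", token)
        if match:
            motif = int(match.group(2))
            sites.append((motif, position + 1))
            position += MOTIF_WIDTHS[motif]
        elif token:
            position += int(token)  # spacer length between sites
    return sites, position


# Combined diagram for sequence 53929, copied from the summary above.
sites, total = diagram_sites(
    "137_[+1(2.28e-06)]_35_\\ [+2(1.11e-06)]_29_[+3(2.62e-07)]_251"
)
print(sites, total)
```

For sequence 53929 the recovered starts can be compared with the Start column of each motif's "sites sorted by position p-value" table (138, 185 and 235).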