********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/122/122.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
3311                     1.0000    500  9609                     1.0000    500
13125                    1.0000    500  46734                    1.0000    500
46891                    1.0000    500  14453                    1.0000    500
22357                    1.0000    500  43473                    1.0000    500
43499                    1.0000    500  49073                    1.0000    500
49310                    1.0000    500  50177                    1.0000    500
50495                    1.0000    500  44341                    1.0000    500
20424                    1.0000    500  48476                    1.0000    500
48744                    1.0000    500  47289                    1.0000    500
46439                    1.0000    500  34642                    1.0000    500
44926                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/122/122.seqs.fa -oc motifs/122 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 21 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 10500 N= 21 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.280 C 0.246 G 0.220 T 0.254 Background letter frequencies (from dataset with add-one prior applied): A 0.280 C 0.246 G 0.220 T 0.254 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 14 sites = 13 llr = 146 E-value = 6.2e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 151::8:::1:817 pos.-specific C 222:1:82::2:2: probability G 82::5:::a:626: matrix T :27a5228:92:23 bits 2.2 * 2.0 * * 1.7 * * 1.5 * ** Relative 1.3 * ***** * Entropy 1.1 * * ***** * * (16.2 bits) 0.9 * ** ******* * 0.7 * ************ 0.4 * ************ 0.2 ************** 0.0 -------------- Multilevel GATTGACTGTGAGA consensus GC T C T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------- 3311 143 9.54e-09 TGCCCTACCC GGTTGACTGTGAGA TGAAAGTTTC 44341 203 2.46e-08 GGCTACCTTG GTTTGACTGTGAGA CAAAGCGAGT 
48744                       218  2.80e-07 GTCGTGATTG GATTTACTGTGACT GAATGTAAAG
49073                       168  5.11e-07 CCCCCACAAT GTTTGATTGTGAGA GCCCTATACG
20424                       367  1.19e-06 TAGTCGCTCA GATTTTCCGTGAGA AATGTTCTTC
46891                         6  1.61e-06      TACTC GCCTTACTGTTAGA TCAATGTATC
43473                        82  4.70e-06 TGCTAACGAG GGTTGACTGACAGT GATTTCTGAC
43499                       223  5.49e-06 TGCTTTTGAT GCTTTTCTGTGATT TCGTCGCCCA
44926                       473  6.43e-06 CGAGCCGACC CACTGACTGTGAAA TGTACTGATA
49310                       272  7.65e-06 TTCGCGTAGG GAATGACTGTGGCA AGGTACGATG
47289                       485  1.16e-05 AGTCGCGTCG CACTCACTGTCAGA CA
48476                       384  1.83e-05 GAGGTAGGCC AATTTACTGTTATT CGTCAATCCA
50177                       107  2.06e-05 GGAATAAATC GGTTTATCGTCGGA TTTAGGAGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3311                              9.5e-09  142_[+1]_344
44341                             2.5e-08  202_[+1]_284
48744                             2.8e-07  217_[+1]_269
49073                             5.1e-07  167_[+1]_319
20424                             1.2e-06  366_[+1]_120
46891                             1.6e-06  5_[+1]_481
43473                             4.7e-06  81_[+1]_405
43499                             5.5e-06  222_[+1]_264
44926                             6.4e-06  472_[+1]_14
49310                             7.7e-06  271_[+1]_215
47289                             1.2e-05  484_[+1]_2
48476                             1.8e-05  383_[+1]_103
50177                             2.1e-05  106_[+1]_380
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=14 seqs=13
3311                     (  143) GGTTGACTGTGAGA  1
44341                    (  203) GTTTGACTGTGAGA  1
48744                    (  218) GATTTACTGTGACT  1
49073                    (  168) GTTTGATTGTGAGA  1
20424                    (  367) GATTTTCCGTGAGA  1
46891                    (    6) GCCTTACTGTTAGA  1
43473                    (   82) GGTTGACTGACAGT  1
43499                    (  223) GCTTTTCTGTGATT  1
44926                    (  473) CACTGACTGTGAAA  1
49310                    (  272) GAATGACTGTGGCA  1
47289                    (  485) CACTCACTGTCAGA  1
48476                    (  384) AATTTACTGTTATT  1
50177                    (  107) GGTTTATCGTCGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 10227 bayes= 10.1489 E= 6.2e+000
  -186    -68    180  -1035
    72    -68      7    -72
  -186     -9  -1035    145
 -1035  -1035  -1035    198
 -1035   -167    107     86
   159  -1035  -1035    -72
 -1035    178  -1035    -72
 -1035    -68  -1035    174
 -1035  -1035    218  -1035
  -186  -1035  -1035    186
 -1035     -9    148    -72
   159  -1035    -52  -1035
  -186    -68    148    -72
   131  -1035  -1035     28
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 13 E= 6.2e+000
 0.076923  0.153846  0.769231  0.000000
 0.461538  0.153846  0.230769  0.153846
 0.076923  0.230769  0.000000  0.692308
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.076923  0.461538  0.461538
 0.846154  0.000000  0.000000  0.153846
 0.000000  0.846154  0.000000  0.153846
 0.000000  0.153846  0.000000  0.846154
 0.000000  0.000000  1.000000  0.000000
 0.076923  0.000000  0.000000  0.923077
 0.000000  0.230769  0.615385  0.153846
 0.846154  0.000000  0.153846  0.000000
 0.076923  0.153846  0.615385  0.153846
 0.692308  0.000000  0.000000  0.307692
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
G[AG][TC]T[GT]ACTGT[GC]AG[AT]
--------------------------------------------------------------------------------


Time  4.56 secs.

******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 13 llr = 174 E-value = 4.5e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :::2894128696326255a4 pos.-specific C 15612129214:1662642:6 probability G 1248::::61:::11::23:: matrix T 82::1:4::1:13:222:::: bits 2.2 2.0 1.7 * 1.5 * * * * Relative 1.3 * * * * * Entropy 1.1 * ** * * * ** (19.3 bits) 0.9 * **** * ** ** 0.7 ****** ******* ** *** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel TCCGAAACGAAAACCACAAAC consensus GG T C C TA CTCG A sequence T C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 9609 94 8.00e-12 TCCCTCACCG TCCGAACCGAAAACCACAAAC AGAAAAAGGC 48476 302 2.63e-09 GACAAAGCGA TCCAAATCGACAAACACAAAC AGATCCGGAA 43499 397 5.06e-08 TGCTTTGTCG TCGGAACCGAAAACCTTGCAC TGACGTTTCA 3311 238 5.63e-08 CGCATCTTGG TTCCAATCGAAAACCACGGAA ATTGGTTCGA 48744 184 1.51e-07 TTGAGATGGC GGGGAAACGACAACCATCCAC CGAGTCGTGA 44341 354 1.66e-07 TCTCTGTCAG TCCGAAACAAAATCTCAAAAC CCAACGGTTC 44926 434 5.12e-07 GGATCGAAAA TGGGTATCCACATCCTCCAAA CAGAACGACG 46891 253 9.52e-07 TCTTAAAATT TTGGAAAAAAAAAACCCCAAA TATACGAATG 14453 87 1.03e-06 TGATATAGCT TCCGAATCGTCAAGAAAAGAC TTCAAAACCG 13125 289 1.03e-06 GTGGACAAGA TCCGCCTCGAAAAAGCCAAAA GCTGGCGTCG 22357 233 1.47e-06 GACCCTCAGA TCCGAACCCGATTCAACCGAC CGTAGTAATC 47289 396 1.80e-06 
GTAAGGATAT CGGGAAACCCCATCCACAGAA ACACCAACAC
46439                       428  3.43e-06 CGGGCTTCGT TTCACAACGAAACATATCAAC GCGACCAAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9609                                8e-12  93_[+2]_386
48476                             2.6e-09  301_[+2]_178
43499                             5.1e-08  396_[+2]_83
3311                              5.6e-08  237_[+2]_242
48744                             1.5e-07  183_[+2]_296
44341                             1.7e-07  353_[+2]_126
44926                             5.1e-07  433_[+2]_46
46891                             9.5e-07  252_[+2]_227
14453                               1e-06  86_[+2]_393
13125                               1e-06  288_[+2]_191
22357                             1.5e-06  232_[+2]_247
47289                             1.8e-06  395_[+2]_84
46439                             3.4e-06  427_[+2]_52
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=13
9609                     (   94) TCCGAACCGAAAACCACAAAC  1
48476                    (  302) TCCAAATCGACAAACACAAAC  1
43499                    (  397) TCGGAACCGAAAACCTTGCAC  1
3311                     (  238) TTCCAATCGAAAACCACGGAA  1
48744                    (  184) GGGGAAACGACAACCATCCAC  1
44341                    (  354) TCCGAAACAAAATCTCAAAAC  1
44926                    (  434) TGGGTATCCACATCCTCCAAA  1
46891                    (  253) TTGGAAAAAAAAAACCCCAAA  1
14453                    (   87) TCCGAATCGTCAAGAAAAGAC  1
13125                    (  289) TCCGCCTCGAAAAAGCCAAAA  1
22357                    (  233) TCCGAACCCGATTCAACCGAC  1
47289                    (  396) CGGGAAACCCCATCCACAGAA  1
46439                    (  428) TTCACAACGAAACATATCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 10080 bayes= 10.128 E= 4.5e+001
 -1035   -167   -152    174
 -1035    113      7    -14
 -1035    132     80  -1035
   -86   -167    180  -1035
   146    -68  -1035   -172
   172   -167  -1035  -1035
    46     -9  -1035     60
  -186    191  -1035  -1035
   -86     -9    148  -1035
   146   -167   -152   -172
   114     64  -1035  -1035
   172  -1035  -1035   -172
   114   -167  -1035     28
    14    132   -152  -1035
   -86    132   -152    -72
   114     -9  -1035    -72
   -86    132  -1035    -14
    72     64    -52  -1035
    94    -68     48  -1035
   184  -1035  -1035  -1035
    46    132  -1035  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 13 E= 4.5e+001
 0.000000  0.076923  0.076923  0.846154
 0.000000  0.538462  0.230769  0.230769
 0.000000  0.615385  0.384615  0.000000
 0.153846  0.076923  0.769231  0.000000
 0.769231  0.153846  0.000000  0.076923
 0.923077  0.076923  0.000000  0.000000
 0.384615  0.230769  0.000000  0.384615
 0.076923  0.923077  0.000000  0.000000
 0.153846  0.230769  0.615385  0.000000
 0.769231  0.076923  0.076923  0.076923
 0.615385  0.384615  0.000000  0.000000
 0.923077  0.000000  0.000000  0.076923
 0.615385  0.076923  0.000000  0.307692
 0.307692  0.615385  0.076923  0.000000
 0.153846  0.615385  0.076923  0.153846
 0.615385  0.230769  0.000000  0.153846
 0.153846  0.615385  0.000000  0.230769
 0.461538  0.384615  0.153846  0.000000
 0.538462  0.153846  0.307692  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.384615  0.615385  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
T[CGT][CG]GAA[ATC]C[GC]A[AC]A[AT][CA]C[AC][CT][AC][AG]A[CA]
--------------------------------------------------------------------------------


Time  9.16 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 15 sites = 7 llr = 98 E-value = 5.9e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :6:7:1194::3:4: pos.-specific C 3:a:::1:41a:a:a probability G :::3a:311::7::: matrix T 74:::94::9:::6: bits 2.2 * 2.0 * * * * * 1.7 * * * * * 1.5 * * * * * Relative 1.3 * ** * **** * Entropy 1.1 * **** * **** * (20.3 bits) 0.9 ****** * ****** 0.7 ****** * ****** 0.4 ****** ******** 0.2 *************** 0.0 --------------- Multilevel TACAGTTAATCGCTC consensus CT G G C A A sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 44341 325 2.16e-09 ACATGTGATT TACAGTTAATCGCTC ACTGTCTCTG 3311 23 2.11e-08 TGAATAACAG TACGGTGACTCGCTC CAGAGTCGCC 13125 330 2.83e-08 GAAGGACTTA TTCGGTGACTCGCTC GTTCAACAAA 48744 485 2.13e-07 TTCTAGATTT TTCAGATACTCGCAC A 9609 153 4.58e-07 GATCTTGACT TACAGTCAACCGCAC GTTTACTACT 20424 412 7.33e-07 CTGCCAAGCA CTCAGTTGATCACTC ACCCACTCTC 22357 275 1.14e-06 CCCCCAAAAT CACAGTAAGTCACAC ACAGACAACA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 44341 
                                  2.2e-09  324_[+3]_161
3311                              2.1e-08  22_[+3]_463
13125                             2.8e-08  329_[+3]_156
48744                             2.1e-07  484_[+3]_1
9609                              4.6e-07  152_[+3]_333
20424                             7.3e-07  411_[+3]_74
22357                             1.1e-06  274_[+3]_211
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=7
44341                    (  325) TACAGTTAATCGCTC  1
3311                     (   23) TACGGTGACTCGCTC  1
13125                    (  330) TTCGGTGACTCGCTC  1
48744                    (  485) TTCAGATACTCGCAC  1
9609                     (  153) TACAGTCAACCGCAC  1
20424                    (  412) CTCAGTTGATCACTC  1
22357                    (  275) CACAGTAAGTCACAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 10206 bayes= 10.3526 E= 5.9e+002
  -945     22   -945    149
   103   -945   -945     75
  -945    202   -945   -945
   135   -945     38   -945
  -945   -945    218   -945
   -97   -945   -945    175
   -97    -78     38     75
   161   -945    -62   -945
    61     80    -62   -945
  -945    -78   -945    175
  -945    202   -945   -945
     3   -945    170   -945
  -945    202   -945   -945
    61   -945   -945    117
  -945    202   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 5.9e+002
 0.000000  0.285714  0.000000  0.714286
 0.571429  0.000000  0.000000  0.428571
 0.000000  1.000000  0.000000  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.000000  0.000000  0.857143
 0.142857  0.142857  0.285714  0.428571
 0.857143  0.000000  0.142857  0.000000
 0.428571  0.428571  0.142857  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.285714  0.000000  0.714286  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[TC][AT]C[AG]GT[TG]A[AC]TC[GA]C[TA]C
--------------------------------------------------------------------------------


Time 13.75 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3311                             8.01e-13  22_[+3(2.11e-08)]_105_\
    [+1(9.54e-09)]_81_[+2(5.63e-08)]_242
9609                             1.23e-10  93_[+2(8.00e-12)]_38_[+3(4.58e-07)]_\
    333
13125                            1.30e-06  288_[+2(1.03e-06)]_20_\
    [+3(2.83e-08)]_156
46734                            3.66e-01  500
46891                            9.58e-06  5_[+1(1.61e-06)]_233_[+2(9.52e-07)]_\
    227
14453                            9.97e-03  86_[+2(1.03e-06)]_393
22357                            4.69e-05  232_[+2(1.47e-06)]_21_\
    [+3(1.14e-06)]_35_[+2(2.85e-05)]_155
43473                            1.96e-02  81_[+1(4.70e-06)]_405
43499                            7.89e-06  222_[+1(5.49e-06)]_160_\
    [+2(5.06e-08)]_83
49073                            1.70e-03  167_[+1(5.11e-07)]_319
49310                            3.05e-02  271_[+1(7.65e-06)]_215
50177                            5.59e-02  106_[+1(2.06e-05)]_380
50495                            6.43e-01  500
44341                            6.35e-13  202_[+1(2.46e-08)]_108_\
    [+3(2.16e-09)]_14_[+2(1.66e-07)]_126
20424                            2.43e-05  366_[+1(1.19e-06)]_31_\
    [+3(7.33e-07)]_74
48476                            1.20e-06  301_[+2(2.63e-09)]_61_\
    [+1(1.83e-05)]_103
48744                            4.20e-10  156_[+1(9.97e-06)]_13_\
    [+2(1.51e-07)]_13_[+1(2.80e-07)]_107_[+1(8.39e-05)]_132_[+3(2.13e-07)]_1
47289                            1.51e-04  395_[+2(1.80e-06)]_68_\
    [+1(1.16e-05)]_2
46439                            2.44e-02  427_[+2(3.43e-06)]_52
34642                            3.35e-01  500
44926                            4.16e-05  433_[+2(5.12e-07)]_18_\
    [+1(6.43e-06)]_14
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************