********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/480/480.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11697                    1.0000    500  1193                     1.0000    500
1528                     1.0000    500  21749                    1.0000    500
23150                    1.0000    500  23674                    1.0000    500
262439                   1.0000    500  262478                   1.0000    500
262926                   1.0000    500  263820                   1.0000    500
33126                    1.0000    500  33199                    1.0000    500
3418                     1.0000    500  5799                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/480/480.seqs.fa -oc motifs/480 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.260 C 0.221 G 0.256 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.260 C 0.221 G 0.256 T 0.263
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 13 sites = 6 llr = 88 E-value = 1.8e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :a::::52:::::
pos.-specific     C  ::a::a:2:2::2
probability       G  a::a::2:a82:8
matrix            T  ::::a:37::8a:

         bits    2.2   *  *
                 2.0 ******  *  *
                 1.7 ******  *  *
                 1.5 ******  *  *
Relative         1.3 ******  *****
Entropy          1.1 ******  *****
(21.2 bits)      0.9 ******  *****
                 0.7 ****** ******
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           GACGTCATGGTTG
consensus                  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            -------------
5799                        294  1.74e-08 TGCTTCGGTG GACGTCATGGTTG AGAGTGGTTG
21749                         4  3.49e-08        TCC GACGTCTTGGTTG GTTCGCGTGG
11697                       349  1.14e-07 GATTTGGATG GACGTCATGGTTC ACGACGTGAC
33126                       311  1.45e-07 GCAGAGTGCA GACGTCTCGGTTG AGGTATTGTC
3418                        135  2.42e-07 GGAGAAGCAC GACGTCGAGGTTG TTGTTGGCGG
262478                      232  4.16e-07 GGGGTCGTTG GACGTCATGCGTG GTACGGATGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5799                              1.7e-08  293_[+1]_194
21749                             3.5e-08  3_[+1]_484
11697                             1.1e-07  348_[+1]_139
33126                             1.5e-07  310_[+1]_177
3418                              2.4e-07  134_[+1]_353
262478                            4.2e-07  231_[+1]_256
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=13 seqs=6
5799                     (  294) GACGTCATGGTTG  1
21749                    (    4) GACGTCTTGGTTG  1
11697                    (  349) GACGTCATGGTTC  1
33126                    (  311) GACGTCTCGGTTG  1
3418                     (  135) GACGTCGAGGTTG  1
262478                   (  232) GACGTCATGCGTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 6832 bayes= 9.81049 E= 1.8e-001
  -923  -923   196  -923
   194  -923  -923  -923
  -923   218  -923  -923
  -923  -923   196  -923
  -923  -923  -923   192
  -923   218  -923  -923
    94  -923   -62    34
   -64   -40  -923   134
  -923  -923   196  -923
  -923   -40   170  -923
  -923  -923   -62   166
  -923  -923  -923   192
  -923   -40   170  -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 6 E= 1.8e-001
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.166667  0.333333
 0.166667  0.166667  0.000000  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.833333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
GACGTC[AT]TGGTTG
--------------------------------------------------------------------------------

Time 1.58 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 12 sites = 10 llr = 113 E-value = 1.3e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  548:674:aaa:
pos.-specific     C  :22a:33a:::6
probability       G  ::::::1:::::
matrix            T  54::4:2::::4

         bits    2.2    *   *
                 2.0    *   ****
                 1.7    *   ****
                 1.5    *   ****
Relative         1.3   **   ****
Entropy          1.1   ** * *****
(16.2 bits)      0.9 * **** *****
                 0.7 * **** *****
                 0.4 ****** *****
                 0.2 ************
                 0.0 ------------

Multilevel           AAACAAACAAAC
consensus            TTC TCC    T
sequence              C    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
33199                       365  2.37e-07 ACCGGAACCC AAACAAACAAAC GGGCTCCTCT
11697                       424  6.79e-07 AGCGAGAACT TTACTAACAAAC TTCAAACTGA
33126                       382  1.35e-06 TGTCGAGCCA ACACAACCAAAC AAGACGCCGA
263820                      132  2.03e-06 TGTGTTCACT ATACAACCAAAT GATGTCTTTG
23674                       482  2.03e-06 CTCATCTTAT TAACACACAAAC AAGCATC
1193                        279  2.03e-06 TAGAGAGTCG TAACAACCAAAT GTGATTGTAC
3418                        483  4.15e-06 TCTCTCGCTC TTCCAAACAAAC CAAACC
262478                      428  1.24e-05 TATTAGGTTT TTACTAGCAAAT ATCAAATGTA
1528                        378  1.24e-05 CGTTGCAATA AAACTCTCAAAT CTCACACCTT
21749                       280  1.77e-05 AATGGATGAA ACCCTCTCAAAC CACGTCATAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33199                             2.4e-07  364_[+2]_124
11697                             6.8e-07  423_[+2]_65
33126                             1.4e-06  381_[+2]_107
263820                              2e-06  131_[+2]_357
23674                               2e-06  481_[+2]_7
1193                                2e-06  278_[+2]_210
3418                              4.1e-06  482_[+2]_6
262478                            1.2e-05  427_[+2]_61
1528                              1.2e-05  377_[+2]_111
21749                             1.8e-05  279_[+2]_209
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=10
33199                    (  365) AAACAAACAAAC  1
11697                    (  424) TTACTAACAAAC  1
33126                    (  382) ACACAACCAAAC  1
263820                   (  132) ATACAACCAAAT  1
23674                    (  482) TAACACACAAAC  1
1193                     (  279) TAACAACCAAAT  1
3418                     (  483) TTCCAAACAAAC  1
262478                   (  428) TTACTAGCAAAT  1
1528                     (  378) AAACTCTCAAAT  1
21749                    (  280) ACCCTCTCAAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.66888 E= 1.3e+000
    95  -997  -997    92
    62   -14  -997    60
   162   -14  -997  -997
  -997   218  -997  -997
   121  -997  -997    60
   143    44  -997  -997
    62    44  -136   -40
  -997   218  -997  -997
   194  -997  -997  -997
   194  -997  -997  -997
   194  -997  -997  -997
  -997   144  -997    60
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 1.3e+000
 0.500000  0.000000  0.000000  0.500000
 0.400000  0.200000  0.000000  0.400000
 0.800000  0.200000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.700000  0.300000  0.000000  0.000000
 0.400000  0.300000  0.100000  0.200000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[AT][ATC][AC]C[AT][AC][ACT]CAAA[CT]
--------------------------------------------------------------------------------

Time 3.17 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 16 sites = 5 llr = 80 E-value = 4.4e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::8:2:::2::2::
pos.-specific     C  884a2a882a28a::6
probability       G  2::::::2::2::6::
matrix            T  :26:::::8:42:2a4

         bits    2.2    * *   *  *
                 2.0    * *   *  * *
                 1.7    * *   *  * *
                 1.5    * *   *  * *
Relative         1.3 ** ******* ** *
Entropy          1.1 ********** ** **
(23.0 bits)      0.9 ********** ** **
                 0.7 ********** *****
                 0.4 ********** *****
                 0.2 ********** *****
                 0.0 ----------------

Multilevel           CCTCACCCTCTCCGTC
consensus            GTC C AGC AT A T
sequence                       C  T
                               G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ----------------
1528                         99  5.16e-10 GGCTTTGCTT CCTCACCCTCGCCGTC TTGGCCATTG
262926                      476  6.16e-09 GGCAACGCAT CCTCACCCTCCTCGTC CCCCCAATA
33126                       351  2.46e-08 CGAAAAAACG CCCCCCCCCCTCCGTT CTCGTTGTCG
21749                       471  8.14e-08 TTTCATCTCA CTCCACACTCTCCATC TTCCAACGAC
33199                       393  1.72e-07 CTCTTCAAAC GCTCACCGTCACCTTT CTTCGGAGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1528                              5.2e-10  98_[+3]_386
262926                            6.2e-09  475_[+3]_9
33126                             2.5e-08  350_[+3]_134
21749                             8.1e-08  470_[+3]_14
33199                             1.7e-07  392_[+3]_92
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=5
1528                     (   99) CCTCACCCTCGCCGTC  1
262926                   (  476) CCTCACCCTCCTCGTC  1
33126                    (  351) CCCCCCCCCCTCCGTT  1
21749                    (  471) CTCCACACTCTCCATC  1
33199                    (  393) GCTCACCGTCACCTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 11.3501 E= 4.4e+001
  -897   186   -36  -897
  -897   186  -897   -40
  -897    86  -897   119
  -897   218  -897  -897
   162   -14  -897  -897
  -897   218  -897  -897
   -38   186  -897  -897
  -897   186   -36  -897
  -897   -14  -897   160
  -897   218  -897  -897
   -38   -14   -36    60
  -897   186  -897   -40
  -897   218  -897  -897
   -38  -897   122   -40
  -897  -897  -897   192
  -897   144  -897    60
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 4.4e+001
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.400000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.200000  0.200000  0.400000
 0.000000  0.800000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.600000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.000000  0.400000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][CT][TC]C[AC]C[CA][CG][TC]C[TACG][CT]C[GAT]T[CT]
--------------------------------------------------------------------------------

Time 4.72 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11697                            1.11e-06  348_[+1(1.14e-07)]_62_\
    [+2(6.79e-07)]_65
1193                             2.80e-02  278_[+2(2.03e-06)]_210
1528                             2.05e-07  98_[+3(5.16e-10)]_263_\
    [+2(1.24e-05)]_111
21749                            2.10e-09  3_[+1(3.49e-08)]_263_[+2(1.77e-05)]_\
    179_[+3(8.14e-08)]_14
23150                            3.70e-01  500
23674                            8.05e-03  481_[+2(2.03e-06)]_7
262439                           9.46e-01  500
262478                           3.01e-05  231_[+1(4.16e-07)]_183_\
    [+2(1.24e-05)]_61
262926                           1.38e-04  475_[+3(6.16e-09)]_9
263820                           1.99e-02  131_[+2(2.03e-06)]_357
33126                            2.39e-10  310_[+1(1.45e-07)]_27_\
    [+3(2.46e-08)]_15_[+2(1.35e-06)]_107
33199                            1.76e-06  364_[+2(2.37e-07)]_16_\
    [+3(1.72e-07)]_92
3418                             8.74e-06  36_[+1(3.55e-05)]_85_[+1(2.42e-07)]_\
    335_[+2(4.15e-06)]_6
5799                             1.53e-05  205_[+3(3.74e-05)]_72_\
    [+1(1.74e-08)]_194
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************