********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/49/49.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
25                       1.0000    500  47113                    1.0000    500
38095                    1.0000    500  29502                    1.0000    500
48390                    1.0000    500  38946                    1.0000    500
54072                    1.0000    500  43525                    1.0000    500
43708                    1.0000    500  33072                    1.0000    500
40382                    1.0000    500  49817                    1.0000    500
3366                     1.0000    500  41174                    1.0000    500
50325                    1.0000    500  16856                    1.0000    500
54175                    1.0000    500  44952                    1.0000    500
12523                    1.0000    500  45914                    1.0000    500
36956                    1.0000    500  43230                    1.0000    500
49402                    1.0000    500  49256                    1.0000    500
45873                    1.0000    500  41029                    1.0000    500
49553                    1.0000    500  35903                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/49/49.seqs.fa -oc motifs/49 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       28    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           14000    N=              28
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.228 G 0.230 T 0.269
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.228 G 0.230 T 0.269
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  11  llr = 141  E-value = 1.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  4a2::4::6178a:51
pos.-specific     C  :::6::9:19:::32:
probability       G  6:1181181:22::1:
matrix            T  ::7325:22:1::729

         bits    2.1
                 1.9  *          *
                 1.7  *    *  *  *
                 1.5  *  * ** *  *  *
Relative         1.3  *  * ** * **  *
Entropy          1.1 **  * ** * *** *
(18.5 bits)      0.9 ***** ** ***** *
                 0.6 ******** ***** *
                 0.4 ************** *
                 0.2 ****************
                 0.0 ----------------

Multilevel           GATCGTCGACAAATAT
consensus            A  T A       C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ----------------
43525                       246  7.25e-08 TCCTTGATTC AATTGACGACAAATCT GCGGAATTTG
49553                       142  7.98e-08 AAACGCGACT GATCGTCGTCGAACAT AAAACCACCA
35903                       219  1.38e-07 CCATCGTGTC AATCGACGACAAACGT CGCTCACGGG
36956                       398  2.79e-07 AATCTGCTCA AAACGACTACAAATAT GATTCTCCTC
25                          138  3.99e-07 CATTGAGGAA GAGCGTCGTCAGATAT TGGTACAAGA
49817                       433  4.76e-07 CTCTCTAGCA GATTTTCGACTAATAT CCATTACTGA
33072                        61  4.76e-07 GATCCCCAGG AATCGGCGACGAATTT GTCAACAACA
41029                       403  7.26e-07 TGAAGTTCAC GATGGACGCCAAATCT TTCTCTGATG
45914                        79  1.01e-06 TGGTGCTATT GATCGTGGACAAACAA ACGAACGAAT
54072                       262  1.93e-06 GTAAAACTAT GATTTTCTGCAAATAT ATAAATAGAA
3366                        178  2.06e-06 TTGCGAAAAC GAACGTCGAAAGATTT GCGAGGTGGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43525                             7.2e-08  245_[+1]_239
49553                               8e-08  141_[+1]_343
35903                             1.4e-07  218_[+1]_266
36956                             2.8e-07  397_[+1]_87
25                                  4e-07  137_[+1]_347
49817                             4.8e-07  432_[+1]_52
33072                             4.8e-07  60_[+1]_424
41029                             7.3e-07  402_[+1]_82
45914                               1e-06  78_[+1]_406
54072                             1.9e-06  261_[+1]_223
3366                              2.1e-06  177_[+1]_307
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=11
43525                    (  246) AATTGACGACAAATCT  1
49553                    (  142) GATCGTCGTCGAACAT  1
35903                    (  219) AATCGACGACAAACGT  1
36956                    (  398) AAACGACTACAAATAT  1
25                       (  138) GAGCGTCGTCAGATAT  1
49817                    (  433) GATTTTCGACTAATAT  1
33072                    (   61) AATCGGCGACGAATTT  1
41029                    (  403) GATGGACGCCAAATCT  1
45914                    (   79) GATCGTGGACAAACAA  1
54072                    (  262) GATTTTCTGCAAATAT  1
3366                     (  178) GAACGTCGAAAGATTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 13580 bayes= 10.6239 E= 1.6e+002
    41  -1010    147  -1010
   187  -1010  -1010  -1010
   -59  -1010   -134    144
 -1010    148   -134      2
 -1010  -1010    183    -56
    41  -1010   -134    102
 -1010    200   -134  -1010
 -1010  -1010    183    -56
   122   -132   -134    -56
  -159    200  -1010  -1010
   141  -1010    -34   -156
   158  -1010    -34  -1010
   187  -1010  -1010  -1010
 -1010     26  -1010    144
    99    -32   -134    -56
  -159  -1010  -1010    176
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 1.6e+002
 0.363636  0.000000  0.636364  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.181818  0.000000  0.090909  0.727273
 0.000000  0.636364  0.090909  0.272727
 0.000000  0.000000  0.818182  0.181818
 0.363636  0.000000  0.090909  0.545455
 0.000000  0.909091  0.090909  0.000000
 0.000000  0.000000  0.818182  0.181818
 0.636364  0.090909  0.090909  0.181818
 0.090909  0.909091  0.000000  0.000000
 0.727273  0.000000  0.181818  0.090909
 0.818182  0.000000  0.181818  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.272727  0.000000  0.727273
 0.545455  0.181818  0.090909  0.181818
 0.090909  0.000000  0.000000  0.909091
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA]AT[CT]G[TA]CGACAAA[TC]AT
--------------------------------------------------------------------------------

Time  6.55 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  19  sites =  11  llr = 153  E-value = 2.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :6:111:2:14:1:1::::
pos.-specific     C  513:5248:5:2638:18a
probability       G  23454:::132:171::1:
matrix            T  3:45:76:91582::a91:

         bits    2.1                   *
                 1.9                *  *
                 1.7                *  *
                 1.5        **      ** *
Relative         1.3        **  * ******
Entropy          1.1       ***  * ******
(20.0 bits)      0.9     *****  * ******
                 0.6 ** ******  ********
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           CAGGCTTCTCTTCGCTTCC
consensus            TGTTG C  GA  C
sequence               C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            -------------------
38946                       203  3.52e-10 TGGTGTGAAG CACGCTCCTGTTCGCTTCC AACAACAAAC
50325                       463  8.08e-09 CAGTTTCCAG TAGTCCTCTCTTCCCTTCC GCTATCATTC
29502                        55  1.33e-08 GGGAGCGACA CGGGGTTCTCATCGCTCCC GATTCCGGTG
49817                        50  4.05e-08 TTGATTTTCT TATGGTTATCATCCCTTCC AATAGTCAAG
36956                       414  7.42e-08 CTACAAATAT GATTCTCCTCGTCGCTTGC CATTTGAGAA
38095                       249  1.86e-07 GACTGCTGAT CGGTCTTCTGTTGGGTTCC CGTAGAGTCC
33072                       311  4.54e-07 ATGATCTTTT GAGGGATCTCACTGCTTCC GGCAATGGAG
25                           95  7.59e-07 AAAACACTGC CACGCCTCTTTTTGCTTTC GAGACGATGA
47113                        65  8.72e-07 GCGACGACCT TCTTCTCCTCACCGATTCC ATCATGGGTC
41029                        74  1.14e-06 AACAATATAG CACAATTCGGTTCCCTTCC TAAACTACTT
41174                        15  1.89e-06 GTTCGCGTAA CGTTGTCATAGTAGCTTCC GGATTGGGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38946                             3.5e-10  202_[+2]_279
50325                             8.1e-09  462_[+2]_19
29502                             1.3e-08  54_[+2]_427
49817                               4e-08  49_[+2]_432
36956                             7.4e-08  413_[+2]_68
38095                             1.9e-07  248_[+2]_233
33072                             4.5e-07  310_[+2]_171
25                                7.6e-07  94_[+2]_387
47113                             8.7e-07  64_[+2]_417
41029                             1.1e-06  73_[+2]_408
41174                             1.9e-06  14_[+2]_467
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=11
38946                    (  203) CACGCTCCTGTTCGCTTCC  1
50325                    (  463) TAGTCCTCTCTTCCCTTCC  1
29502                    (   55) CGGGGTTCTCATCGCTCCC  1
49817                    (   50) TATGGTTATCATCCCTTCC  1
36956                    (  414) GATTCTCCTCGTCGCTTGC  1
38095                    (  249) CGGTCTTCTGTTGGGTTCC  1
33072                    (  311) GAGGGATCTCACTGCTTCC  1
25                       (   95) CACGCCTCTTTTTGCTTTC  1
47113                    (   65) TCTTCTCCTCACCGATTCC  1
41029                    (   74) CACAATTCGGTTCCCTTCC  1
41174                    (   15) CGTTGTCATAGTAGCTTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 13496 bayes= 10.615 E= 2.1e+001
 -1010    126    -34      2
   122   -132     25  -1010
 -1010     26     66     44
  -159  -1010     98     76
  -159    126     66  -1010
  -159    -32  -1010    144
 -1010     67  -1010    124
   -59    184  -1010  -1010
 -1010  -1010   -134    176
  -159    126     25   -156
    41  -1010    -34     76
 -1010    -32  -1010    161
  -159    148   -134    -56
 -1010     26    166  -1010
  -159    184   -134  -1010
 -1010  -1010  -1010    189
 -1010   -132  -1010    176
 -1010    184   -134   -156
 -1010    213  -1010  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 11 E= 2.1e+001
 0.000000  0.545455  0.181818  0.272727
 0.636364  0.090909  0.272727  0.000000
 0.000000  0.272727  0.363636  0.363636
 0.090909  0.000000  0.454545  0.454545
 0.090909  0.545455  0.363636  0.000000
 0.090909  0.181818  0.000000  0.727273
 0.000000  0.363636  0.000000  0.636364
 0.181818  0.818182  0.000000  0.000000
 0.000000  0.000000  0.090909  0.909091
 0.090909  0.545455  0.272727  0.090909
 0.363636  0.000000  0.181818  0.454545
 0.000000  0.181818  0.000000  0.818182
 0.090909  0.636364  0.090909  0.181818
 0.000000  0.272727  0.727273  0.000000
 0.090909  0.818182  0.090909  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.090909  0.000000  0.909091
 0.000000  0.818182  0.090909  0.090909
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CT][AG][GTC][GT][CG]T[TC]CT[CG][TA]TC[GC]CTTCC
--------------------------------------------------------------------------------

Time 13.01 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   7  llr = 92  E-value = 5.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :9a:9aa94:7:
pos.-specific     C  11:::::1:::1
probability       G  9::a1:::6a39
matrix            T  ::::::::::::

         bits    2.1    *     *
                 1.9   ** **  *
                 1.7   ** **  *
                 1.5 * ** **  * *
Relative         1.3 ******** * *
Entropy          1.1 ************
(19.0 bits)      0.9 ************
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GAAGAAAAGGAG
consensus                    A G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
3366                        351  7.46e-08 CTCAGAAAAC GAAGAAAAGGAG ATAATTGATA
38095                        39  7.46e-08 GTTCCGCAAA GAAGAAAAGGAG ACTATGCTCC
45914                       127  4.88e-07 AAGGGAGGCA GAAGAAACGGAG TGTGGCATCG
35903                       175  1.03e-06 GTTTGGATTG GAAGAAAAAGAC TCCTACGACA
45873                       422  1.03e-06 TGATGCTTGC CAAGAAAAAGAG AACCTGGCGA
49256                        94  1.19e-06 CAAGCAGTAT GAAGGAAAGGGG TCCATTGACT
29502                        19  1.50e-06 TTAGCTCTTC GCAGAAAAAGGG AGAGAGTCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3366                              7.5e-08  350_[+3]_138
38095                             7.5e-08  38_[+3]_450
45914                             4.9e-07  126_[+3]_362
35903                               1e-06  174_[+3]_314
45873                               1e-06  421_[+3]_67
49256                             1.2e-06  93_[+3]_395
29502                             1.5e-06  18_[+3]_470
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=7
3366                     (  351) GAAGAAAAGGAG  1
38095                    (   39) GAAGAAAAGGAG  1
45914                    (  127) GAAGAAACGGAG  1
35903                    (  175) GAAGAAAAAGAC  1
45873                    (  422) CAAGAAAAAGAG  1
49256                    (   94) GAAGGAAAGGGG  1
29502                    (   19) GCAGAAAAAGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 13692 bayes= 10.7767 E= 5.0e+002
  -945    -67    190   -945
   165    -67   -945   -945
   187   -945   -945   -945
  -945   -945    212   -945
   165   -945    -68   -945
   187   -945   -945   -945
   187   -945   -945   -945
   165    -67   -945   -945
    65   -945    131   -945
  -945   -945    212   -945
   138   -945     31   -945
  -945    -67    190   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 5.0e+002
 0.000000  0.142857  0.857143  0.000000
 0.857143  0.142857  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.857143  0.000000  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.428571  0.000000  0.571429  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.142857  0.857143  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GAAGAAAA[GA]G[AG]G
--------------------------------------------------------------------------------

Time 19.17 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25                               2.01e-06  94_[+2(7.59e-07)]_24_[+1(3.99e-07)]_\
    347
47113                            5.34e-04  64_[+2(8.72e-07)]_360_\
    [+1(4.34e-05)]_41
38095                            2.91e-07  38_[+3(7.46e-08)]_198_\
    [+2(1.86e-07)]_233
29502                            8.19e-07  18_[+3(1.50e-06)]_24_[+2(1.33e-08)]_\
    427
48390                            5.02e-01  500
38946                            2.48e-06  202_[+2(3.52e-10)]_279
54072                            7.49e-03  261_[+1(1.93e-06)]_223
43525                            3.87e-04  245_[+1(7.25e-08)]_239
43708                            2.95e-01  500
33072                            4.75e-06  60_[+1(4.76e-07)]_234_\
    [+2(4.54e-07)]_171
40382                            1.79e-01  328_[+2(7.44e-05)]_153
49817                            1.94e-07  49_[+2(4.05e-08)]_364_\
    [+1(4.76e-07)]_52
3366                             2.90e-06  177_[+1(2.06e-06)]_157_\
    [+3(7.46e-08)]_138
41174                            1.73e-02  14_[+2(1.89e-06)]_467
50325                            2.45e-04  462_[+2(8.08e-09)]_19
16856                            1.62e-01  500
54175                            5.10e-01  500
44952                            6.83e-02  482_[+1(8.01e-05)]_2
12523                            7.24e-01  500
45914                            1.26e-05  78_[+1(1.01e-06)]_32_[+3(4.88e-07)]_\
    362
36956                            2.57e-07  397_[+1(2.79e-07)]_[+2(7.42e-08)]_\
    68
43230                            6.85e-01  500
49402                            3.39e-01  500
49256                            2.75e-03  93_[+3(1.19e-06)]_395
45873                            1.93e-03  421_[+3(1.03e-06)]_67
41029                            5.37e-06  73_[+2(1.14e-06)]_310_\
    [+1(7.26e-07)]_82
49553                            1.53e-03  141_[+1(7.98e-08)]_343
35903                            2.87e-06  174_[+3(1.03e-06)]_32_\
    [+1(1.38e-07)]_266
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************