********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/419/419.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
16210                    1.0000    500  20690                    1.0000    500
23648                    1.0000    500  23649                    1.0000    500
261201                   1.0000    500  30866                    1.0000    500
32485                    1.0000    500  8855                     1.0000    500
bd1786                   1.0000    500  bd1787                   1.0000    500
bd1808                   1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/419/419.seqs.fa -oc motifs/419 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.235 G 0.233 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.235 G 0.233 T 0.260
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 19 sites = 6 llr = 111 E-value = 1.4e-003
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :::a::5:28::::3:5:7 pos.-specific C 3:::::22::::::::::: probability G :7a::8::8::8a35a2a3 matrix T 73::a238:2a2:72:3:: bits 2.1 * * * * 1.9 *** * * * * 1.7 *** * * * * 1.5 **** * *** * * Relative 1.3 **** ****** * * Entropy 1.0 ****** ******* * ** (26.8 bits) 0.8 ****** ******* * ** 0.6 ****** ********* ** 0.4 ******************* 0.2 ******************* 0.0 ------------------- Multilevel TGGATGATGATGGTGGAGA consensus CT T GA T G sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------- bd1787 8 9.09e-11 GGATATT TGGATGATGATGGGAGAGA CTCAGTGTCA 23649 9 9.09e-11 TGGATATT TGGATGATGATGGGAGAGA CTCAGTGTCA 8855 198 3.29e-10 TGATGTTGGC TGGATGTTGATGGTTGTGA TTGGTTTGTT bd1808 253 1.18e-09 TTTCTTGCGC TGGATGTTGTTGGTGGTGG GTGTGGCGTG 16210 375 7.40e-09 CAAGTATCAA CTGATGCTAATGGTGGAGG AGGTAGATCC 30866 6 2.75e-08 CTCGT CTGATTACGATTGTGGGGA TGAGTGCAAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- bd1787 9.1e-11 7_[+1]_474 23649 9.1e-11 8_[+1]_473 8855 3.3e-10 197_[+1]_284 bd1808 1.2e-09 252_[+1]_229 16210 7.4e-09 374_[+1]_107 30866 2.7e-08 5_[+1]_476 -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=19 seqs=6 bd1787 ( 8) TGGATGATGATGGGAGAGA 1 23649 ( 9) TGGATGATGATGGGAGAGA 1 8855 ( 198) TGGATGTTGATGGTTGTGA 1 bd1808 ( 253) TGGATGTTGTTGGTGGTGG 1 16210 ( 375) CTGATGCTAATGGTGGAGG 1 30866 ( 6) CTGATTACGATTGTGGGGA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 19 n= 5302 bayes= 10.2336 E= 1.4e-003 -923 51 -923 136 -923 -923 151 36 -923 -923 210 -923 188 -923 -923 -923 -923 -923 -923 194 -923 -923 184 -64 88 -49 -923 36 -923 -49 -923 168 -71 -923 184 -923 161 -923 -923 -64 -923 -923 -923 194 -923 -923 184 -64 -923 -923 210 -923 -923 -923 51 136 29 -923 110 -64 -923 -923 210 -923 88 -923 -48 36 -923 -923 210 -923 129 -923 51 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 19 nsites= 6 E= 1.4e-003 0.000000 0.333333 0.000000 0.666667 0.000000 0.000000 0.666667 0.333333 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.833333 0.166667 0.500000 0.166667 0.000000 0.333333 0.000000 0.166667 0.000000 0.833333 0.166667 0.000000 0.833333 0.000000 0.833333 0.000000 0.000000 0.166667 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.833333 0.166667 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.333333 0.666667 0.333333 0.000000 0.500000 
0.166667 0.000000 0.000000 1.000000 0.000000 0.500000 0.000000 0.166667 0.333333 0.000000 0.000000 1.000000 0.000000 0.666667 0.000000 0.333333 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [TC][GT]GATG[AT]TGATGG[TG][GA]G[AT]G[AG] -------------------------------------------------------------------------------- Time 1.23 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 14 sites = 11 llr = 129 E-value = 3.1e-004 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 2:1::::33:2:1: pos.-specific C ::1:1155:4:5:: probability G 8:119:2:16:::8 matrix T :a79:9326:8592 bits 2.1 1.9 * 1.7 * * 1.5 ** *** ** Relative 1.3 ** *** * ** Entropy 1.0 ** *** ***** (16.9 bits) 0.8 ** *** ***** 0.6 ************** 0.4 ************** 0.2 ************** 0.0 -------------- Multilevel GTTTGTCCTGTTTG consensus TAAC C sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------- bd1787 408 2.42e-08 GTGACCAAGT GTTTGTTCTGTTTG GCTACTCTAA 23649 409 2.42e-08 GTGACCAAGT GTTTGTTCTGTTTG GCTACTCTAA 261201 246 5.93e-08 TGTTTCCCTT GTTTGTCATCTTTG TCACATGTTT bd1786 137 7.60e-07 TGATCCCTGA ATTTGTCCACTCTG TACTAACTCC 
23648 447 7.60e-07 TGATCCCTGA ATTTGTCCACTCTG TACTAACTCC 30866 211 1.26e-06 AGGAAGAGAC GTCTGTCAAGTTTG AGAATGTTAT 20690 62 1.66e-06 CACGAAGGTT GTGTGTCCGGTCTG CTGGTGTAAG bd1808 133 3.19e-06 CTTTATTGGC GTTTGCTCTGATTG AATGGCTCTC 16210 437 8.27e-06 TGGATCTGAG GTATGTGATGTCTT CGTTTAAAGG 8855 176 1.06e-05 TTGATCGGCG GTTGGTGTTGATTG ATGTTGGCTG 32485 415 3.73e-05 CTCTGTAGCA GTTTCTCTTCTCAT TTCTCCACAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- bd1787 2.4e-08 407_[+2]_79 23649 2.4e-08 408_[+2]_78 261201 5.9e-08 245_[+2]_241 bd1786 7.6e-07 136_[+2]_350 23648 7.6e-07 446_[+2]_40 30866 1.3e-06 210_[+2]_276 20690 1.7e-06 61_[+2]_425 bd1808 3.2e-06 132_[+2]_354 16210 8.3e-06 436_[+2]_50 8855 1.1e-05 175_[+2]_311 32485 3.7e-05 414_[+2]_72 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=14 seqs=11 bd1787 ( 408) GTTTGTTCTGTTTG 1 23649 ( 409) GTTTGTTCTGTTTG 1 261201 ( 246) GTTTGTCATCTTTG 1 bd1786 ( 137) ATTTGTCCACTCTG 1 23648 ( 447) ATTTGTCCACTCTG 1 30866 ( 211) GTCTGTCAAGTTTG 1 20690 ( 62) GTGTGTCCGGTCTG 1 bd1808 ( 133) GTTTGCTCTGATTG 1 16210 ( 437) GTATGTGATGTCTT 1 8855 ( 176) GTTGGTGTTGATTG 1 32485 ( 415) GTTTCTCTTCTCAT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 14 n= 5357 
bayes= 8.92481 E= 3.1e-004 -58 -1010 181 -1010 -1010 -1010 -1010 194 -158 -137 -136 148 -1010 -1010 -136 181 -1010 -137 196 -1010 -1010 -137 -1010 181 -1010 122 -36 7 0 122 -1010 -52 0 -1010 -136 129 -1010 63 145 -1010 -58 -1010 -1010 165 -1010 95 -1010 107 -158 -1010 -1010 181 -1010 -1010 181 -52 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 14 nsites= 11 E= 3.1e-004 0.181818 0.000000 0.818182 0.000000 0.000000 0.000000 0.000000 1.000000 0.090909 0.090909 0.090909 0.727273 0.000000 0.000000 0.090909 0.909091 0.000000 0.090909 0.909091 0.000000 0.000000 0.090909 0.000000 0.909091 0.000000 0.545455 0.181818 0.272727 0.272727 0.545455 0.000000 0.181818 0.272727 0.000000 0.090909 0.636364 0.000000 0.363636 0.636364 0.000000 0.181818 0.000000 0.000000 0.818182 0.000000 0.454545 0.000000 0.545455 0.090909 0.000000 0.000000 0.909091 0.000000 0.000000 0.818182 0.181818 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- GTTTGT[CT][CA][TA][GC]T[TC]TG -------------------------------------------------------------------------------- Time 2.26 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 9 llr = 142 E-value = 1.4e-003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::::a42::7:1:942:9137 pos.-specific C :2:2:11::::96:3:4:3:2 probability G 1713:2:82:a::12:4:6:: matrix T 9194:27283::4::811:71 bits 2.1 * 1.9 * * 1.7 * * 1.5 * * * ** * * Relative 1.3 * * * ** ** * * * Entropy 1.0 * * * ******* * * * (22.7 bits) 0.8 *** * ******* * * * 0.6 *** * ******** ****** 0.4 ***** *************** 0.2 ********************* 0.0 --------------------- Multilevel TGTTAATGTAGCCAATCAGTA consensus C G GATGT T CAG CAC sequence C T G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- bd1787 183 8.85e-10 GAGAATACAA TGTTAATGGAGCCACTCACTC GTGAGAGTTC 23649 184 8.85e-10 GAGAATACAA TGTTAATGGAGCCACTCACTC GTGAGAGTTC 30866 334 1.51e-09 TGGAACCGTT TCTGACTGTAGCCAGTGAGTA CCAAGTGCTT bd1786 11 4.42e-09 GTTGTTGGCT TGTTAGAGTAGCCAAACAGAA CAAACACTTG 23648 322 4.42e-09 GTTGTTGGCT TGTTAGAGTAGCCAAACAGAA CAAACACTTG 8855 129 6.74e-08 TTCCTGTCCC TGTCATTGTTGCTGATTACTA GTTTGATACT 261201 12 1.34e-07 AGTAGCATCG TCTCATTTTAGATACTGAGAA CGGTGTACGT 32485 66 3.97e-07 AATTGAGAAC GGGGAATGTTGCTAATGAATT TTAAGGAGTG bd1808 325 4.38e-07 CAACGGTCGG TTTGAACTTTGCTAGTGTGTA TTTTTTGCGG -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd1787                            8.8e-10  182_[+3]_297
23649                             8.8e-10  183_[+3]_296
30866                             1.5e-09  333_[+3]_146
bd1786                            4.4e-09  10_[+3]_469
23648                             4.4e-09  321_[+3]_158
8855                              6.7e-08  128_[+3]_351
261201                            1.3e-07  11_[+3]_468
32485                               4e-07  65_[+3]_414
bd1808                            4.4e-07  324_[+3]_155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=9
bd1787                   (  183) TGTTAATGGAGCCACTCACTC  1
23649                    (  184) TGTTAATGGAGCCACTCACTC  1
30866                    (  334) TCTGACTGTAGCCAGTGAGTA  1
bd1786                   (   11) TGTTAGAGTAGCCAAACAGAA  1
23648                    (  322) TGTTAGAGTAGCCAAACAGAA  1
8855                     (  129) TGTCATTGTTGCTGATTACTA  1
261201                   (   12) TCTCATTTTAGATACTGAGAA  1
32485                    (   66) GGGGAATGTTGCTAATGAATT  1
bd1808                   (  325) TTTGAACTTTGCTAGTGTGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 9.32846 E= 1.4e-003
  -982   -982   -107    177
  -982     -8    151   -122
  -982   -982   -107    177
  -982     -8     51     77
   188   -982   -982   -982
    71   -108     -7    -23
   -29   -108   -982    136
  -982   -982    174    -23
  -982   -982     -7    158
   129   -982   -982     36
  -982   -982    210   -982
  -129    192   -982   -982
  -982    124   -982     77
   171   -982   -107   -982
    71     51     -7   -982
   -29   -982   -982    158
  -982     92     93   -122
   171   -982   -982   -122
  -129     51    125   -982
    29   -982   -982    136
   129     -8   -982   -122
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 1.4e-003
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.222222  0.666667  0.111111
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.222222  0.333333  0.444444
 1.000000  0.000000  0.000000  0.000000
 0.444444  0.111111  0.222222  0.222222
 0.222222  0.111111  0.000000  0.666667
 0.000000  0.000000  0.777778  0.222222
 0.000000  0.000000  0.222222  0.777778
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.888889  0.000000  0.000000
 0.000000  0.555556  0.000000  0.444444
 0.888889  0.000000  0.111111  0.000000
 0.444444  0.333333  0.222222  0.000000
 0.222222  0.000000  0.000000  0.777778
 0.000000  0.444444  0.444444  0.111111
 0.888889  0.000000  0.000000  0.111111
 0.111111  0.333333  0.555556  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.666667  0.222222  0.000000  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[GC]T[TGC]A[AGT][TA][GT][TG][AT]GC[CT]A[ACG][TA][CG]A[GC][TA][AC]
--------------------------------------------------------------------------------

Time 3.23 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
16210                            2.00e-06  374_[+1(7.40e-09)]_43_\
    [+2(8.27e-06)]_50
20690                            7.35e-03  61_[+2(1.66e-06)]_425
23648                            1.08e-07  321_[+3(4.42e-09)]_104_\
    [+2(7.60e-07)]_40
23649                            2.12e-16  8_[+1(9.09e-11)]_156_[+3(8.85e-10)]_\
    204_[+2(2.42e-08)]_78
261201                           1.02e-07  11_[+3(1.34e-07)]_213_\
    [+2(5.93e-08)]_241
30866                            3.34e-12  5_[+1(2.75e-08)]_186_[+2(1.26e-06)]_\
    109_[+3(1.51e-09)]_146
32485                            2.22e-04  65_[+3(3.97e-07)]_328_\
    [+2(3.73e-05)]_72
8855                             1.38e-11  128_[+3(6.74e-08)]_26_\
    [+2(1.06e-05)]_8_[+1(3.29e-10)]_284
bd1786                           1.35e-07  10_[+3(4.42e-09)]_105_\
    [+2(7.60e-07)]_350
bd1787                           2.12e-16  7_[+1(9.09e-11)]_156_[+3(8.85e-10)]_\
    204_[+2(2.42e-08)]_79
bd1808                           8.57e-11  132_[+2(3.19e-06)]_106_\
    [+1(1.18e-09)]_53_[+3(4.38e-07)]_155
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************