********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/471/471.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46746                    1.0000    500  48678                    1.0000    500
50153                    1.0000    500  33107                    1.0000    500
49986                    1.0000    500  44680                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/471/471.seqs.fa -oc motifs/471 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        6    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3000    N=               6
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.221 G 0.229 T 0.277
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.221 G 0.229 T 0.277
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 4 llr = 55 E-value = 3.7e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :358:a:8aa:a
pos.-specific     C  a:53a:::::a:
probability       G  :5::::a:::::
matrix            T  :3:::::3::::

         bits    2.2 *   * *   *
                 2.0 *   *** ****
                 1.7 *   *** ****
                 1.5 *   *** ****
Relative         1.3 *   *** ****
Entropy          1.1 * **********
(19.8 bits)      0.9 * **********
                 0.7 * **********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGAACAGAAACA
consensus             ACC   T
sequence              T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
50153                       343  5.29e-08 TTCCAAAATC CGCACAGAAACA ATGGGAAACA
46746                       471  1.19e-07 TGACGCCAAC CGAACAGAAACA CTGAACAAAT
44680                       359  7.85e-07 ATTGGAAAGT CAACCAGAAACA GATGTGGTAC
49986                       308  1.02e-06 TATTCGATGG CTCACAGTAACA TAAGCAAAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50153                             5.3e-08  342_[+1]_146
46746                             1.2e-07  470_[+1]_18
44680                             7.9e-07  358_[+1]_130
49986                               1e-06  307_[+1]_181
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=4
50153                    (  343) CGCACAGAAACA  1
46746                    (  471) CGAACAGAAACA  1
44680                    (  359) CAACCAGAAACA  1
49986                    (  308) CTCACAGTAACA  1
//

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 2934 bayes= 9.51668 E= 3.7e+002
  -865    218   -865   -865
   -13   -865    113    -15
    87    118   -865   -865
   145     18   -865   -865
  -865    218   -865   -865
   187   -865   -865   -865
  -865   -865    213   -865
   145   -865   -865    -15
   187   -865   -865   -865
   187   -865   -865   -865
  -865    218   -865   -865
   187   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 4 E= 3.7e+002
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.500000  0.500000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
C[GAT][AC][AC]CAG[AT]AACA
--------------------------------------------------------------------------------

Time 0.48 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 6 llr = 97 E-value = 6.1e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  7::332:27372::8:82::7
pos.-specific     C  ::a2:23733:::825::27:
probability       G  :::577:2:2:2:2:::2:33
matrix            T  3a::::7::237a::5278::

         bits    2.2   *
                 2.0  **         *
                 1.7  **         *
                 1.5  **         **
Relative         1.3  **         *** * **
Entropy          1.1  ** * * *   ***** ***
(23.3 bits)      0.9 *** ***** * ***** ***
                 0.7 ********* ***********
                 0.4 ********* ***********
                 0.2 ********* ***********
                 0.0 ---------------------

Multilevel           ATCGGGTCAAATTCACATTCA
consensus            T  AA C CCT    T   GG
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
44680                       458  2.81e-10 ATGATTTTTG ATCGGGCCCAATTCACAGTCA AAGTAATTGA
49986                       283  4.36e-10 ATAGGACTTC TTCGGGTCACAGTCATATTCG ATGGCTCACA
46746                       372  6.64e-09 GTTGGACCAA ATCAGGTCATTTTCACAATGA CCATGTCTCT
33107                       386  9.73e-08 GGCCTTTTAC ATCCGGCCCAAATCCTTTTCA GAGACATTCG
48678                       190  1.03e-07 TAAATTGGAA ATCAAATGACTTTCACATCCG TACTGTTATA
50153                       185  1.79e-07 GCCGACTCCG TTCGACTAAGATTGATATTGA AAACATCGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44680                             2.8e-10  457_[+2]_22
49986                             4.4e-10  282_[+2]_197
46746                             6.6e-09  371_[+2]_108
33107                             9.7e-08  385_[+2]_94
48678                               1e-07  189_[+2]_290
50153                             1.8e-07  184_[+2]_295
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
44680                    (  458) ATCGGGCCCAATTCACAGTCA  1
49986                    (  283) TTCGGGTCACAGTCATATTCG  1
46746                    (  372) ATCAGGTCATTTTCACAATGA  1
33107                    (  386) ATCCGGCCCAAATCCTTTTCA  1
48678                    (  190) ATCAAATGACTTTCACATCCG  1
50153                    (  185) TTCGACTAAGATTGATATTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 2880 bayes= 8.90388 E= 6.1e+002
   128   -923   -923     27
  -923   -923   -923    185
  -923    218   -923   -923
    28    -40    113   -923
    28   -923    154   -923
   -72    -40    154   -923
  -923     59   -923    127
   -72    159    -46   -923
   128     59   -923   -923
    28     59    -46    -73
   128   -923   -923     27
   -72   -923    -46    127
  -923   -923   -923    185
  -923    192    -46   -923
   160    -40   -923   -923
  -923    118   -923     85
   160   -923   -923    -73
   -72   -923    -46    127
  -923    -40   -923    159
  -923    159     54   -923
   128   -923     54   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 6.1e+002
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.166667  0.500000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.166667  0.166667  0.666667  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.166667  0.666667  0.166667  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.333333  0.333333  0.166667  0.166667
 0.666667  0.000000  0.000000  0.333333
 0.166667  0.000000  0.166667  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.833333  0.166667  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.833333  0.000000  0.000000  0.166667
 0.166667  0.000000  0.166667  0.666667
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.666667  0.333333  0.000000
 0.666667  0.000000  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[AT]TC[GA][GA]G[TC]C[AC][AC][AT]TTCA[CT]ATT[CG][AG]
--------------------------------------------------------------------------------

Time 0.92 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 5 llr = 59 E-value = 1.2e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  4:2:::::a4a2
pos.-specific     C  2a22a:a4:6::
probability       G  2:2::::6:::8
matrix            T  2:48:a::::::

         bits    2.2  *  * *
                 2.0  *  *** * *
                 1.7  *  *** * *
                 1.5  *  *** * *
Relative         1.3  *  *** * **
Entropy          1.1  * *********
(17.1 bits)      0.9  * *********
                 0.7  * *********
                 0.4  * *********
                 0.2  * *********
                 0.0 ------------

Multilevel           ACTTCTCGACAG
consensus            C AC   C A A
sequence             G C
                     T G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44680                        59  4.63e-07 CGGTACTGCT TCTTCTCGACAG AAAATTTCGA
33107                        32  1.67e-06 AGAGTGAAAC CCGTCTCCACAG GTTTTTACTG
46746                       351  1.67e-06 ATGCCTTTTG ACATCTCGAAAG TTGGACCAAA
49986                       333  1.94e-06 GCAAAATGAC GCTTCTCCAAAG CCCCGTGCCA
48678                       478  8.46e-06 TGTTCATTTG ACCCCTCGACAA TTTGCGACAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44680                             4.6e-07  58_[+3]_430
33107                             1.7e-06  31_[+3]_457
46746                             1.7e-06  350_[+3]_138
49986                             1.9e-06  332_[+3]_156
48678                             8.5e-06  477_[+3]_11
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=5
44680                    (   59) TCTTCTCGACAG  1
33107                    (   32) CCGTCTCCACAG  1
46746                    (  351) ACATCTCGAAAG  1
49986                    (  333) GCTTCTCCAAAG  1
48678                    (  478) ACCCCTCGACAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 2934 bayes= 9.4462 E= 1.2e+003
    55    -14    -19    -47
  -897    218   -897   -897
   -45    -14    -19     53
  -897    -14   -897    153
  -897    218   -897   -897
  -897   -897   -897    185
  -897    218   -897   -897
  -897     86    139   -897
   187   -897   -897   -897
    55    144   -897   -897
   187   -897   -897   -897
   -45   -897    180   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 5 E= 1.2e+003
 0.400000  0.200000  0.200000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.200000  0.200000  0.400000
 0.000000  0.200000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.600000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[ACGT]C[TACG][TC]CTC[GC]A[CA]A[GA]
--------------------------------------------------------------------------------

Time 1.26 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46746                            7.05e-11  350_[+3(1.67e-06)]_9_[+2(6.64e-09)]_\
    78_[+1(1.19e-07)]_18
48678                            2.38e-05  189_[+2(1.03e-07)]_267_\
    [+3(8.46e-06)]_11
50153                            3.29e-07  184_[+2(1.79e-07)]_137_\
    [+1(5.29e-08)]_146
33107                            4.16e-06  31_[+3(1.67e-06)]_342_\
    [+2(9.73e-08)]_94
49986                            4.76e-11  282_[+2(4.36e-10)]_4_[+1(1.02e-06)]_\
    13_[+3(1.94e-06)]_156
44680                            6.44e-12  58_[+3(4.63e-07)]_288_\
    [+1(7.85e-07)]_87_[+2(2.81e-10)]_22
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************