********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/191/191.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1031                     1.0000    500  1082                     1.0000    500
23908                    1.0000    500  261577                   1.0000    500
262215                   1.0000    500  36742                    1.0000    500
37795                    1.0000    500  4046                     1.0000    500
8138                     1.0000    500  9361                     1.0000    500
9420                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/191/191.seqs.fa -oc motifs/191 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.256 C 0.217 G 0.257 T 0.270
Background letter frequencies (from dataset with add-one prior applied):
A 0.256 C 0.217 G 0.257 T 0.270
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 6 llr = 92 E-value = 1.5e+000
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :38::2::a:73328: pos.-specific C a228:8a2:8272:2a probability G :5::a::7:22::5:: matrix T :::2:::2::::53:: bits 2.2 * * * 2.0 * * * * * 1.8 * * * * * 1.5 * **** ** * Relative 1.3 * ***** ** ** Entropy 1.1 * ***** ** * ** (22.0 bits) 0.9 * ***** ** * ** 0.7 * ********** ** 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel CGACGCCGACACTGAC consensus A AAT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 262215 53 3.97e-09 CATATCGACC CGACGCCCACACTTAC TCGCACATGT 9420 393 5.26e-09 TTCCCCCGAC CGACGCCGACGCAGAC CAGACCGACC 1031 380 1.28e-08 TTATATCATA CCCCGCCGACACTGAC AACGATACTG 261577 459 1.34e-07 CTGTTTCACG CGACGACGAGACCGAC GTCCTCTCCC 1082 447 1.69e-07 AACCGCGGCT CAACGCCGACCAATCC ATCGCCCTAG 9361 114 2.88e-07 GCAACAATTA CAATGCCTACAATAAC GCGTACTAAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 262215 4e-09 52_[+1]_432 9420 5.3e-09 392_[+1]_92 1031 1.3e-08 379_[+1]_105 261577 1.3e-07 458_[+1]_26 1082 1.7e-07 446_[+1]_38 9361 2.9e-07 113_[+1]_371 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format 
-------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=6 262215 ( 53) CGACGCCCACACTTAC 1 9420 ( 393) CGACGCCGACGCAGAC 1 1031 ( 380) CCCCGCCGACACTGAC 1 261577 ( 459) CGACGACGAGACCGAC 1 1082 ( 447) CAACGCCGACCAATCC 1 9361 ( 114) CAATGCCTACAATAAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 5335 bayes= 10.2426 E= 1.5e+000 -923 220 -923 -923 38 -38 96 -923 170 -38 -923 -923 -923 194 -923 -69 -923 -923 196 -923 -62 194 -923 -923 -923 220 -923 -923 -923 -38 137 -69 197 -923 -923 -923 -923 194 -62 -923 138 -38 -62 -923 38 162 -923 -923 38 -38 -923 89 -62 -923 96 30 170 -38 -923 -923 -923 220 -923 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 1.5e+000 0.000000 1.000000 0.000000 0.000000 0.333333 0.166667 0.500000 0.000000 0.833333 0.166667 0.000000 0.000000 0.000000 0.833333 0.000000 0.166667 0.000000 0.000000 1.000000 0.000000 0.166667 0.833333 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.166667 0.666667 0.166667 1.000000 0.000000 0.000000 0.000000 0.000000 0.833333 0.166667 0.000000 0.666667 0.166667 0.166667 0.000000 0.333333 0.666667 0.000000 0.000000 0.333333 0.166667 0.000000 0.500000 0.166667 0.000000 0.500000 0.333333 0.833333 0.166667 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 -------------------------------------------------------------------------------- 
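The "Relative Entropy (22.0 bits)" figure in the Motif 1 description can be cross-checked against the letter-probability matrix and the background frequencies printed in this file. The sketch below is not part of the MEME output; it assumes the usual definition (the sum over positions and letters of p * log2(p / background)), and the names `BACKGROUND`, `MOTIF1` and `relative_entropy` are illustrative only.

```python
import math

# Background letter frequencies as printed in this file (order: A, C, G, T).
BACKGROUND = [0.256, 0.217, 0.257, 0.270]

# Motif 1 letter-probability matrix copied from the output above
# (one row per motif position, columns in the order A, C, G, T).
MOTIF1 = [
    [0.000000, 1.000000, 0.000000, 0.000000],
    [0.333333, 0.166667, 0.500000, 0.000000],
    [0.833333, 0.166667, 0.000000, 0.000000],
    [0.000000, 0.833333, 0.000000, 0.166667],
    [0.000000, 0.000000, 1.000000, 0.000000],
    [0.166667, 0.833333, 0.000000, 0.000000],
    [0.000000, 1.000000, 0.000000, 0.000000],
    [0.000000, 0.166667, 0.666667, 0.166667],
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.000000, 0.833333, 0.166667, 0.000000],
    [0.666667, 0.166667, 0.166667, 0.000000],
    [0.333333, 0.666667, 0.000000, 0.000000],
    [0.333333, 0.166667, 0.000000, 0.500000],
    [0.166667, 0.000000, 0.500000, 0.333333],
    [0.833333, 0.166667, 0.000000, 0.000000],
    [0.000000, 1.000000, 0.000000, 0.000000],
]


def relative_entropy(matrix, background):
    """Sum of p * log2(p / q) over all positions and letters, in bits.

    Zero-probability cells contribute nothing (the limit of p*log p is 0).
    """
    total = 0.0
    for row in matrix:
        for p, q in zip(row, background):
            if p > 0.0:
                total += p * math.log2(p / q)
    return total


motif1_bits = relative_entropy(MOTIF1, BACKGROUND)
print(f"Motif 1 relative entropy: {motif1_bits:.1f} bits")
```

Under these assumptions the result should land close to the reported 22.0 bits. The same function can be applied to the Motif 2 and Motif 3 matrices once their rows are copied in.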
-------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- C[GA]ACGCCGACA[CA][TA][GT]AC -------------------------------------------------------------------------------- Time 1.07 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 7 llr = 120 E-value = 3.5e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::::::::1111:::4:::49 pos.-specific C 3:3:1::16::1:4::::31: probability G :66:9119:69:16a::6741 matrix T 741a:99:33:79::6a4::: bits 2.2 2.0 * * * 1.8 * * * 1.5 * * * Relative 1.3 ***** * * * * * Entropy 1.1 * ***** * *** * * * (24.6 bits) 0.9 ** ***** ********* * 0.7 ******************* * 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel TGGTGTTGCGGTTGGTTGGAA consensus CTC TT C A TCG sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 4046 119 1.40e-09 CTGTCAATGT TGCTGTTGCAGTTGGATGGCA TGATGGAGTA 1031 35 1.95e-09 ATGCTGCTGC TGCTGTTGTTGTTCGATGCAA CATTGTCATC 36742 277 2.94e-09 TCTGTCCAGC CGGTGTGGTGGTTGGTTTGGA GTTGGAGGTC 23908 138 3.93e-09 CGAGGAGTTG TGGTGGTGAGGTTGGATTGGA GGATCTGAGC 9420 142 1.85e-08 CCACCAACTC TTGTCTTGCGGAGCGTTGGGA GGTCGGTCAA 8138 310 1.85e-08 GAGCACTACT TTTTGTTGCGATTGGTTGGAG GCCCACGTGG 1082 
405 2.95e-08 ATCTCCAATC CTGTGTTCCTGCTCGTTTCAA TGAAGTTGCC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 4046 1.4e-09 118_[+2]_361 1031 1.9e-09 34_[+2]_445 36742 2.9e-09 276_[+2]_203 23908 3.9e-09 137_[+2]_342 9420 1.8e-08 141_[+2]_338 8138 1.8e-08 309_[+2]_170 1082 3e-08 404_[+2]_75 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=7 4046 ( 119) TGCTGTTGCAGTTGGATGGCA 1 1031 ( 35) TGCTGTTGTTGTTCGATGCAA 1 36742 ( 277) CGGTGTGGTGGTTGGTTTGGA 1 23908 ( 138) TGGTGGTGAGGTTGGATTGGA 1 9420 ( 142) TTGTCTTGCGGAGCGTTGGGA 1 8138 ( 310) TTTTGTTGCGATTGGTTGGAG 1 1082 ( 405) CTGTGTTCCTGCTCGTTTCAA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 9.40072 E= 3.5e+000 -945 40 -945 140 -945 -945 115 67 -945 40 115 -92 -945 -945 -945 189 -945 -60 173 -945 -945 -945 -85 167 -945 -945 -85 167 -945 -60 173 -945 -84 139 -945 8 -84 -945 115 8 -84 -945 173 -945 -84 -60 -945 140 -945 -945 -85 167 -945 98 115 -945 -945 -945 196 -945 74 -945 -945 108 -945 -945 -945 189 -945 -945 115 67 -945 40 147 -945 74 -60 74 -945 174 -945 -85 -945 -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 3.5e+000 0.000000 0.285714 0.000000 0.714286 0.000000 0.000000 0.571429 0.428571 0.000000 0.285714 0.571429 0.142857 0.000000 0.000000 0.000000 1.000000 0.000000 0.142857 0.857143 0.000000 0.000000 0.000000 0.142857 0.857143 0.000000 0.000000 0.142857 0.857143 0.000000 0.142857 0.857143 0.000000 0.142857 0.571429 0.000000 0.285714 0.142857 0.000000 0.571429 0.285714 0.142857 0.000000 0.857143 0.000000 0.142857 0.142857 0.000000 0.714286 0.000000 0.000000 0.142857 0.857143 0.000000 0.428571 0.571429 0.000000 0.000000 0.000000 1.000000 0.000000 0.428571 0.000000 0.000000 0.571429 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.571429 0.428571 0.000000 0.285714 0.714286 0.000000 0.428571 0.142857 0.428571 0.000000 0.857143 0.000000 0.142857 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [TC][GT][GC]TGTTG[CT][GT]GTT[GC]G[TA]T[GT][GC][AG]A -------------------------------------------------------------------------------- Time 2.30 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 11 llr = 141 E-value = 4.4e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 33::31:35:4712::23::5 pos.-specific C 56193486273:515952:a5 probability G ::3::2:1::2:2:2::23:: matrix T 2161542:33233731347:: bits 2.2 * 2.0 * 1.8 * * * 1.5 * * * * Relative 1.3 * * * * * Entropy 1.1 * * * * * *** (18.5 bits) 0.9 * * ** * * * * *** 0.7 **** **** * **** *** 0.4 ***** **** * **** *** 0.2 ********** ****** *** 0.0 --------------------- Multilevel CCTCTCCCACAACTCCCTTCA consensus AAG AT ATTCTT T TAG C sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 261577 477 1.42e-10 AGACCGACGT CCTCTCCCACATCTCCCTGCA GCA 37795 228 3.08e-08 GATTGGTAAA CCGCTGCCTCTACACCCATCC AACTCAACAC 36742 453 5.65e-08 TTCGTTGCTC CCTCCTCAACAACCTCTTTCC TACCTTGAAC 9361 350 2.07e-07 TGCCTCCTAC CATCCTCCTCCTCTCCAGGCA CAATTTAGAG 262215 451 4.05e-07 GAGTAGAGTG TCCCAGCAACAATTCCCGTCA AGTCTGGTTT 1031 285 9.54e-07 CCTCTCTCAA CAGCTATCACCACTGCTATCA TGCACTTTTC 9420 366 1.31e-06 TAAGTCCACC TCTCTCTCATAAATTCTTTCC CCCGACCGAC 8138 115 1.31e-06 TTCCACTGGT CCTCACCGATGAGTTCAATCA AGTTGAAGTG 1082 359 2.18e-06 AGGCGTAATC AATTTTCCCCTTTTCCCCTCA AGCCAAACAT 4046 443 3.25e-06 TTTGGTTTTG ATGCATCACCCATACCCCTCC TAGCTCCAGC 23908 443 4.72e-06 AATTGAGAGT ACTCCCCCTTGAGTGTCTGCC TGTACGTCAC 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 261577 1.4e-10 476_[+3]_3 37795 3.1e-08 227_[+3]_252 36742 5.7e-08 452_[+3]_27 9361 2.1e-07 349_[+3]_130 262215 4e-07 450_[+3]_29 1031 9.5e-07 284_[+3]_195 9420 1.3e-06 365_[+3]_114 8138 1.3e-06 114_[+3]_365 1082 2.2e-06 358_[+3]_121 4046 3.3e-06 442_[+3]_37 23908 4.7e-06 442_[+3]_37 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=11 261577 ( 477) CCTCTCCCACATCTCCCTGCA 1 37795 ( 228) CCGCTGCCTCTACACCCATCC 1 36742 ( 453) CCTCCTCAACAACCTCTTTCC 1 9361 ( 350) CATCCTCCTCCTCTCCAGGCA 1 262215 ( 451) TCCCAGCAACAATTCCCGTCA 1 1031 ( 285) CAGCTATCACCACTGCTATCA 1 9420 ( 366) TCTCTCTCATAAATTCTTTCC 1 8138 ( 115) CCTCACCGATGAGTTCAATCA 1 1082 ( 359) AATTTTCCCCTTTTCCCCTCA 1 4046 ( 443) ATGCATCACCCATACCCCTCC 1 23908 ( 443) ACTCCCCCTTGAGTGTCTGCC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 8.90388 E= 4.4e+001 9 133 -1010 -57 9 155 -1010 -157 -1010 -125 8 124 -1010 206 -1010 -157 9 33 -1010 75 -149 74 -50 43 -1010 191 -1010 -57 9 155 -150 -1010 109 -26 -1010 1 -1010 174 -1010 1 51 33 -50 -57 151 -1010 -1010 1 -149 107 -50 1 -49 -125 -1010 143 -1010 133 -50 1 -1010 206 -1010 -157 -49 133 -1010 1 9 -26 -50 43 
-1010 -1010 8 143 -1010 220 -1010 -1010 109 107 -1010 -1010 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 11 E= 4.4e+001 0.272727 0.545455 0.000000 0.181818 0.272727 0.636364 0.000000 0.090909 0.000000 0.090909 0.272727 0.636364 0.000000 0.909091 0.000000 0.090909 0.272727 0.272727 0.000000 0.454545 0.090909 0.363636 0.181818 0.363636 0.000000 0.818182 0.000000 0.181818 0.272727 0.636364 0.090909 0.000000 0.545455 0.181818 0.000000 0.272727 0.000000 0.727273 0.000000 0.272727 0.363636 0.272727 0.181818 0.181818 0.727273 0.000000 0.000000 0.272727 0.090909 0.454545 0.181818 0.272727 0.181818 0.090909 0.000000 0.727273 0.000000 0.545455 0.181818 0.272727 0.000000 0.909091 0.000000 0.090909 0.181818 0.545455 0.000000 0.272727 0.272727 0.181818 0.181818 0.363636 0.000000 0.000000 0.272727 0.727273 0.000000 1.000000 0.000000 0.000000 0.545455 0.454545 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [CA][CA][TG]C[TAC][CT]C[CA][AT][CT][AC][AT][CT]T[CT]C[CT][TA][TG]C[AC] -------------------------------------------------------------------------------- Time 3.31 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1031                             1.59e-12  34_[+2(1.95e-09)]_229_\
    [+3(9.54e-07)]_74_[+1(1.28e-08)]_105
1082                             4.91e-10  358_[+3(2.18e-06)]_25_\
    [+2(2.95e-08)]_21_[+1(1.69e-07)]_12_[+2(2.59e-05)]_5
23908                            7.68e-08  137_[+2(3.93e-09)]_284_\
    [+3(4.72e-06)]_37
261577                           1.10e-09  458_[+1(1.34e-07)]_2_[+3(1.42e-10)]_\
    3
262215                           7.64e-08  52_[+1(3.97e-09)]_382_\
    [+3(4.05e-07)]_29
36742                            3.93e-09  276_[+2(2.94e-09)]_155_\
    [+3(5.65e-08)]_27
37795                            3.01e-04  227_[+3(3.08e-08)]_252
4046                             1.67e-07  118_[+2(1.40e-09)]_303_\
    [+3(3.25e-06)]_37
8138                             1.05e-06  114_[+3(1.31e-06)]_174_\
    [+2(1.85e-08)]_127_[+2(2.92e-05)]_22
9361                             1.62e-06  113_[+1(2.88e-07)]_220_\
    [+3(2.07e-07)]_130
9420                             7.68e-12  141_[+2(1.85e-08)]_203_\
    [+3(1.31e-06)]_6_[+1(5.26e-09)]_92
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
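The combined block diagrams in the summary encode site positions as alternating spacer lengths and motif occurrences of the form `[strand motif(p-value)]`. The sketch below is not part of the MEME output; it shows one way to turn such a diagram back into 1-based start positions, using the motif widths reported in this file (16, 21, 21). The helper name `parse_diagram`, the `WIDTHS` table and the regular expression are illustrative assumptions, and the trailing `\` line continuations are assumed to have been joined before parsing.

```python
import re

# Motif widths as reported in the MOTIF header lines of this file.
WIDTHS = {1: 16, 2: 21, 3: 21}

# Either a motif occurrence such as [+2(1.95e-09)] or a bare spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths):
    """Return (sites, total_length) for one block diagram string.

    Each site is a tuple (motif, strand, start, p_value) with a 1-based start.
    """
    position = 0  # number of sequence letters consumed so far
    sites = []
    for match in TOKEN.finditer(diagram):
        if match.group(4) is not None:
            # Spacer: that many letters with no motif occurrence.
            position += int(match.group(4))
        else:
            motif = int(match.group(2))
            sites.append((motif, match.group(1), position + 1, float(match.group(3))))
            position += widths[motif]
    return sites, position


# Diagram for sequence 1031, with the line continuation joined.
diagram_1031 = "34_[+2(1.95e-09)]_229_[+3(9.54e-07)]_74_[+1(1.28e-08)]_105"
sites_1031, length_1031 = parse_diagram(diagram_1031, WIDTHS)

for motif, strand, start, p_value in sites_1031:
    print(f"motif {motif} ({strand}) starts at {start}, p-value {p_value:.2e}")
print(f"total length: {length_1031}")
```

For sequence 1031 the recovered starts (35, 285, 380) match the Start column of the per-motif site tables, and the spacers plus motif widths add up to the sequence length of 500 listed in the training set.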