********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/111/111.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
15142                    1.0000    500  21098                    1.0000    500
21193                    1.0000    500  23407                    1.0000    500
25229                    1.0000    500  268881                   1.0000    500
31674                    1.0000    500  31885                    1.0000    500
36429                    1.0000    500  3793                     1.0000    500
38071                    1.0000    500  39149                    1.0000    500
39864                    1.0000    500  41169                    1.0000    500
5514                     1.0000    500  9460                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/111/111.seqs.fa -oc motifs/111 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.238 G 0.243 T 0.245
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.239 G 0.243 T 0.245
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =   21  sites =  10  llr = 154  E-value = 3.2e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :a1:1:2::::35::4126::
pos.-specific     C  9:941a229:6734825718a
probability       G  :::31:51:2:::2:321:::
matrix            T  1::37:17184:24212:32:

         bits    2.1      *              *
                 1.9  *   *              *
                 1.7 ***  *  *           *
                 1.4 ***  *  *           *
Relative         1.2 ***  *  **    *    **
Entropy          1.0 ***  *  ****  *    **
(22.2 bits)      0.8 ***  * *****  *  * **
                 0.6 *** ** ***** **  ****
                 0.4 ****** ********  ****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CACCTCGTCTCCACCACCACC
consensus               G  AC GTACTTGGATT
sequence                T  C     TG CT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
23407                       365  1.28e-10 TCACACTACA CACCTCATCTTCATCACCTCC ACGCCGTATC
21193                       126  2.32e-09 CTACTTCCTT CACTTCGTCGTCCTCGTCACC GAAGTCAACA
9460                        428  1.12e-08 CAACTGGATA CACCTCGGCTTCCCCCCCCCC TGGCGTCTGC
41169                       432  3.82e-08 CGCCGTCGTC CACGTCCTCTCCCGTCACACC ACCAACTTTT
15142                       422  4.17e-08 ACAAAAACCA CACCACGCCTTCATCATCATC GGCGTTGCTG
39864                        38  6.32e-08 TACAAATCTG CACTGCTCCTCCACCACAACC ACATAAGACG
31885                        68  8.67e-08 AATCACAAAC TACCCCGTCTCATGCACCACC ACAATCTACC
36429                        25  9.36e-08 CTGTCTGCCT CACGTCGTCGCATTCTCGACC GCAACAACAG
3793                         19  1.79e-07 CTCATCATAC CACTTCCTTTCAACCGGATCC GACAAGCATT
21098                       243  3.01e-07 ATCGGTTGCA CAAGTCATCTCCACTGGCTTC GAGCAAAGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23407                             1.3e-10  364_[+1]_115
21193                             2.3e-09  125_[+1]_354
9460                              1.1e-08  427_[+1]_52
41169                             3.8e-08  431_[+1]_48
15142                             4.2e-08  421_[+1]_58
39864                             6.3e-08  37_[+1]_442
31885                             8.7e-08  67_[+1]_412
36429                             9.4e-08  24_[+1]_455
3793                              1.8e-07  18_[+1]_461
21098                               3e-07  242_[+1]_237
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=10
23407                    (  365) CACCTCATCTTCATCACCTCC  1
21193                    (  126) CACTTCGTCGTCCTCGTCACC  1
9460                     (  428) CACCTCGGCTTCCCCCCCCCC  1
41169                    (  432) CACGTCCTCTCCCGTCACACC  1
15142                    (  422) CACCACGCCTTCATCATCATC  1
39864                    (   38) CACTGCTCCTCCACCACAACC  1
31885                    (   68) TACCCCGTCTCATGCACCACC  1
36429                    (   25) CACGTCGTCGCATTCTCGACC  1
3793                     (   19) CACTTCCTTTCAACCGGATCC  1
21098                    (  243) CAAGTCATCTCCACTGGCTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 11.0582 E= 3.2e-004
  -997    191   -997   -129
   187   -997   -997   -997
  -145    191   -997   -997
  -997     75     30     29
  -145   -125   -128    151
  -997    207   -997   -997
   -45    -25    104   -129
  -997    -25   -128    151
  -997    191   -997   -129
  -997   -997    -28    170
  -997    133   -997     70
    14    155   -997   -997
    87     33   -997    -29
  -997     75    -28     70
  -997    174   -997    -29
    55    -25     30   -129
  -145    107    -28    -29
   -45    155   -128   -997
   114   -125   -997     29
  -997    174   -997    -29
  -997    207   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 3.2e-004
 0.000000  0.900000  0.000000  0.100000
 1.000000  0.000000  0.000000  0.000000
 0.100000  0.900000  0.000000  0.000000
 0.000000  0.400000  0.300000  0.300000
 0.100000  0.100000  0.100000  0.700000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.200000  0.500000  0.100000
 0.000000  0.200000  0.100000  0.700000
 0.000000  0.900000  0.000000  0.100000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.600000  0.000000  0.400000
 0.300000  0.700000  0.000000  0.000000
 0.500000  0.300000  0.000000  0.200000
 0.000000  0.400000  0.200000  0.400000
 0.000000  0.800000  0.000000  0.200000
 0.400000  0.200000  0.300000  0.100000
 0.100000  0.500000  0.200000  0.200000
 0.200000  0.700000  0.100000  0.000000
 0.600000  0.100000  0.000000  0.300000
 0.000000  0.800000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CAC[CGT]TC[GAC][TC]C[TG][CT][CA][ACT][CTG][CT][AGC][CGT][CA][AT][CT]C
--------------------------------------------------------------------------------




Time  2.77 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =   12  sites =  13  llr = 139  E-value = 7.4e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :aa285193:86
pos.-specific     C  a::71361:a2:
probability       G  ::::123:2::4
matrix            T  :::1::::5:::

         bits    2.1 *        *
                 1.9 ***      *
                 1.7 ***      *
                 1.4 ***    * *
Relative         1.2 ***    * **
Entropy          1.0 *** *  * ***
(15.4 bits)      0.8 ***** ** ***
                 0.6 ******** ***
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CAACAACATCAA
consensus               A CG A  G
sequence                     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
21098                       165  9.03e-08 TTTGGCAAAA CAACAACATCAA TACATTGATA
31885                       117  1.71e-07 TATGCAGCAT CAACAACATCAG ACATCACAAG
21193                       152  2.50e-07 TCACCGAAGT CAACACCATCAA AGTCCGTGTA
39864                       317  1.77e-06 TCCAAACCAT CAACAGCATCAG ACGCCACCGC
3793                        462  4.34e-06 CTTACTCCAC CAAAACCAACAA ACTCGTTCAC
36429                        47  4.34e-06 TTCTCGACCG CAACAACAGCCA TCAATGACGT
268881                      483  5.48e-06 TTCATCGAAC CAACAAAATCAG CCGACC
25229                       481  8.91e-06 ACCCGAAAGA CAACAACCACAA GGCAGCAA
39149                       489  1.20e-05 CACCACGCAA CAAAACGAACAG
23407                        53  1.31e-05 AAAATGCAAT CAATAGCATCAA AACTATCATT
15142                       365  1.31e-05 AACTTGGGGT CAACACGAGCCA ATTAGTGGAC
31674                        97  1.56e-05 GACGAGGCGG CAACGAGAGCAA GAACGATCTT
38071                       167  3.91e-05 GATGAGATCA CAAACAGAACAG AGAGAGAGGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21098                               9e-08  164_[+2]_324
31885                             1.7e-07  116_[+2]_372
21193                             2.5e-07  151_[+2]_337
39864                             1.8e-06  316_[+2]_172
3793                              4.3e-06  461_[+2]_27
36429                             4.3e-06  46_[+2]_442
268881                            5.5e-06  482_[+2]_6
25229                             8.9e-06  480_[+2]_8
39149                             1.2e-05  488_[+2]
23407                             1.3e-05  52_[+2]_436
15142                             1.3e-05  364_[+2]_124
31674                             1.6e-05  96_[+2]_392
38071                             3.9e-05  166_[+2]_322
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=13
21098                    (  165) CAACAACATCAA  1
31885                    (  117) CAACAACATCAG  1
21193                    (  152) CAACACCATCAA  1
39864                    (  317) CAACAGCATCAG  1
3793                     (  462) CAAAACCAACAA  1
36429                    (   47) CAACAACAGCCA  1
268881                   (  483) CAACAAAATCAG  1
25229                    (  481) CAACAACCACAA  1
39149                    (  489) CAAAACGAACAG  1
23407                    (   53) CAATAGCATCAA  1
15142                    (  365) CAACACGAGCCA  1
31674                    (   97) CAACGAGAGCAA  1
38071                    (  167) CAAACAGAACAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 9.76211 E= 7.4e-003
 -1035    207  -1035  -1035
   187  -1035  -1035  -1035
   187  -1035  -1035  -1035
   -24    154  -1035   -167
   163   -163   -166  -1035
    98     37    -66  -1035
  -182    137     34  -1035
   176   -163  -1035  -1035
    17  -1035     -8     91
 -1035    207  -1035  -1035
   163    -63  -1035  -1035
   117  -1035     66  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 7.4e-003
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.230769  0.692308  0.000000  0.076923
 0.846154  0.076923  0.076923  0.000000
 0.538462  0.307692  0.153846  0.000000
 0.076923  0.615385  0.307692  0.000000
 0.923077  0.076923  0.000000  0.000000
 0.307692  0.000000  0.230769  0.461538
 0.000000  1.000000  0.000000  0.000000
 0.846154  0.153846  0.000000  0.000000
 0.615385  0.000000  0.384615  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CAA[CA]A[AC][CG]A[TAG]CA[AG]
--------------------------------------------------------------------------------




Time  5.82 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =   12  sites =  12  llr = 127  E-value = 8.0e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :33::3::a2:7
pos.-specific     C  8237::a:::::
probability       G  233:a::a::93
matrix            T  1333:7:::81:

         bits    2.1     * **
                 1.9     * ***
                 1.7     * *** *
                 1.4     * *****
Relative         1.2     * *****
Entropy          1.0 *  *********
(15.2 bits)      0.8 *  *********
                 0.6 *  *********
                 0.4 *  *********
                 0.2 *  *********
                 0.0 ------------

Multilevel           CAACGTCGATGA
consensus             GCT A     G
sequence              TG
                       T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
21193                         6  5.72e-08      TCTAC CACCGTCGATGA TGCAACAGAC
9460                        325  1.74e-07 AAGGTTGTAT CATCGTCGATGA AGAAACACTT
268881                      387  6.67e-07 GAGGTGACGA CGACGTCGATGA AAGTGAGAGC
15142                       129  1.98e-06 ACATTTCCGT CGGCGTCGATGG TGGCTTCGTT
38071                        78  2.77e-06 GCATGGCCGT CGTCGACGATGA CGTATCCGTC
39864                       285  5.38e-06 GTGGAGTACC CTCTGACGATGA GCCCTCCCCG
21098                       120  1.07e-05 GATACTTTCC GAACGACGATGA TGCTACGGTA
25229                       230  1.28e-05 TCGCTGCGTT CTGCGTCGAAGG ATGGAGGCCT
31674                       428  2.05e-05 TCAAGATACG CAGTGTCGATTA GGTTTTGTCC
39149                       312  2.21e-05 TGGGCGGTGG TTACGTCGATGG GATCGCTCAT
3793                        263  2.61e-05 GCTGATAAGC CCCTGTCGAAGG AGACTTGAGA
41169                        51  2.72e-05 CGGTTCCGTC GCTTGACGATGA GTCTATGCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21193                             5.7e-08  5_[+3]_483
9460                              1.7e-07  324_[+3]_164
268881                            6.7e-07  386_[+3]_102
15142                               2e-06  128_[+3]_360
38071                             2.8e-06  77_[+3]_411
39864                             5.4e-06  284_[+3]_204
21098                             1.1e-05  119_[+3]_369
25229                             1.3e-05  229_[+3]_259
31674                               2e-05  427_[+3]_61
39149                             2.2e-05  311_[+3]_177
3793                              2.6e-05  262_[+3]_226
41169                             2.7e-05  50_[+3]_438
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=12
21193                    (    6) CACCGTCGATGA  1
9460                     (  325) CATCGTCGATGA  1
268881                   (  387) CGACGTCGATGA  1
15142                    (  129) CGGCGTCGATGG  1
38071                    (   78) CGTCGACGATGA  1
39864                    (  285) CTCTGACGATGA  1
21098                    (  120) GAACGACGATGA  1
25229                    (  230) CTGCGTCGAAGG  1
31674                    (  428) CAGTGTCGATTA  1
39149                    (  312) TTACGTCGATGG  1
3793                     (  263) CCCTGTCGAAGG  1
41169                    (   51) GCTTGACGATGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 10.4472 E= 8.0e-001
 -1023    165    -54   -156
    29    -52      4      3
   -13      7      4      3
 -1023    148  -1023     44
 -1023  -1023    204  -1023
    29  -1023  -1023    144
 -1023    207  -1023  -1023
 -1023  -1023    204  -1023
   187  -1023  -1023  -1023
   -71  -1023  -1023    176
 -1023  -1023    191   -156
   129  -1023     45  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 8.0e-001
 0.000000  0.750000  0.166667  0.083333
 0.333333  0.166667  0.250000  0.250000
 0.250000  0.250000  0.250000  0.250000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  0.916667  0.083333
 0.666667  0.000000  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[AGT][ACGT][CT]G[TA]CGATG[AG]
--------------------------------------------------------------------------------




Time  8.49 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
15142                            3.51e-08  11_[+3(4.31e-05)]_105_\
    [+3(1.98e-06)]_224_[+2(1.31e-05)]_45_[+1(4.17e-08)]_58
21098                            1.05e-08  119_[+3(1.07e-05)]_33_\
    [+2(9.03e-08)]_66_[+1(3.01e-07)]_237
21193                            2.22e-12  5_[+3(5.72e-08)]_108_[+1(2.32e-09)]_\
    5_[+2(2.50e-07)]_337
23407                            5.70e-08  52_[+2(1.31e-05)]_300_\
    [+1(1.28e-10)]_115
25229                            1.80e-03  185_[+3(6.41e-05)]_32_\
    [+3(1.28e-05)]_239_[+2(8.91e-06)]_8
268881                           5.53e-05  386_[+3(6.67e-07)]_84_\
    [+2(5.48e-06)]_6
31674                            4.18e-03  96_[+2(1.56e-05)]_319_\
    [+3(2.05e-05)]_61
31885                            5.96e-07  67_[+1(8.67e-08)]_28_[+2(1.71e-07)]_\
    372
36429                            1.03e-05  24_[+1(9.36e-08)]_1_[+2(4.34e-06)]_\
    224_[+1(9.60e-05)]_197
3793                             5.06e-07  18_[+1(1.79e-07)]_90_[+3(7.38e-05)]_\
    121_[+3(2.61e-05)]_187_[+2(4.34e-06)]_27
38071                            6.91e-04  77_[+3(2.77e-06)]_77_[+2(3.91e-05)]_\
    322
39149                            2.77e-03  311_[+3(2.21e-05)]_165_\
    [+2(1.20e-05)]
39864                            2.06e-08  37_[+1(6.32e-08)]_226_\
    [+3(5.38e-06)]_20_[+2(1.77e-06)]_172
41169                            3.09e-05  50_[+3(2.72e-05)]_369_\
    [+1(3.82e-08)]_48
5514                             7.18e-01  500
9460                             1.05e-07  324_[+3(1.74e-07)]_91_\
    [+1(1.12e-08)]_52
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************