********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/105/105.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10791                    1.0000    500  1093                     1.0000    500
21739                    1.0000    500  21918                    1.0000    500
22941                    1.0000    500  24566                    1.0000    500
25058                    1.0000    500  25060                    1.0000    500
25061                    1.0000    500  25337                    1.0000    500
260919                   1.0000    500  268533                   1.0000    500
33601                    1.0000    500  9710                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/105/105.seqs.fa -oc motifs/105 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.234 G 0.236 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.234 G 0.236 T 0.263
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 13 llr = 141 E-value = 4.9e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  8229226:43:12729
pos.-specific     C  :1::2:221:2:31::
probability       G  228:68:55289228:
matrix            T  :5:1::23:5::3::1

         bits    2.1
                 1.9
                 1.7            *
                 1.5   **      **  **
Relative         1.3 * ** *    **  **
Entropy          1.0 * ** *    **  **
(15.7 bits)      0.8 * ** *    ** ***
                 0.6 * ***** * ** ***
                 0.4 * ********** ***
                 0.2 ************ ***
                 0.0 ----------------

Multilevel           ATGAGGAGGTGGCAGA
consensus             A  AATTAA  TG
sequence              G     C G  A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
21918                       127  4.57e-10 GTTGACGAGG ATGAGGAGGTGGTAGA TTTGTCGTGC
25060                       172  4.52e-08 AAGAGCGAGG ATGAGGACGAGGGAGA GTATGTGAAA
9710                        166  2.65e-07 TCATCTTTGC AGGAAGAGAGGGCAGA CGATATGATT
21739                       391  3.44e-07 CGGTGTTGCG AGGAGGAGCTGGAAGA AAGGCTCTGT
24566                       251  1.35e-06 GGTCACTCTA ACGAGGCTGAGGTAGA GCAACTGATG
25058                        82  5.38e-06 ACAGTGGAAG AAGAAGAGGTGGCGGT GGTGACACTT
260919                       97  8.66e-06 GAAAGACTCA AGGAGATTATGGCAAA GCCTTTTGTC
22941                       121  9.32e-06 ATCTTGATTA GTGAGGATAGGATAGA GGAGCGATGA
25061                       158  1.34e-05 CTCATCGGGG ATGACATCATGGAGGA GATGTTCCGT
33601                       254  1.54e-05 CTTCAGAGTA AAAAGATTGGGGAAGA AGCTCCCGAT
268533                      477  1.88e-05 GAGTGGGTAC ATGTGGAGGAGGCCAA ATTGGAAA
25337                       257  1.88e-05 GACTGGTAGA GTGACGACGACGTGGA CCGAAGATCA
10791                       435  4.08e-05 GGTACACAAT AAAAAGCGATCGGAGA GGAGAGGGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21918                             4.6e-10  126_[+1]_358
25060                             4.5e-08  171_[+1]_313
9710                              2.6e-07  165_[+1]_319
21739                             3.4e-07  390_[+1]_94
24566                             1.3e-06  250_[+1]_234
25058                             5.4e-06  81_[+1]_403
260919                            8.7e-06  96_[+1]_388
22941                             9.3e-06  120_[+1]_364
25061                             1.3e-05  157_[+1]_327
33601                             1.5e-05  253_[+1]_231
268533                            1.9e-05  476_[+1]_8
25337                             1.9e-05  256_[+1]_228
10791                             4.1e-05  434_[+1]_50
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=13
21918                    (  127) ATGAGGAGGTGGTAGA  1
25060                    (  172) ATGAGGACGAGGGAGA  1
9710                     (  166) AGGAAGAGAGGGCAGA  1
21739                    (  391) AGGAGGAGCTGGAAGA  1
24566                    (  251) ACGAGGCTGAGGTAGA  1
25058                    (   82) AAGAAGAGGTGGCGGT  1
260919                   (   97) AGGAGATTATGGCAAA  1
22941                    (  121) GTGAGGATAGGATAGA  1
25061                    (  158) ATGACATCATGGAGGA  1
33601                    (  254) AAAAGATTGGGGAAGA  1
268533                   (  477) ATGTGGAGGAGGCCAA  1
25337                    (  257) GTGACGACGACGTGGA  1
10791                    (  435) AAAAAGCGATCGGAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 9.55736 E= 4.9e+001
   166  -1035    -62  -1035
   -21   -160     -3     81
   -80  -1035    184  -1035
   179  -1035  -1035   -177
   -21    -60    138  -1035
   -21  -1035    171  -1035
   120    -60  -1035    -19
 -1035     -2     97     23
    52   -160    119  -1035
    20  -1035     -3     81
 -1035    -60    184  -1035
  -180  -1035    197  -1035
   -21     40    -62     23
   137   -160     -3  -1035
   -80  -1035    184  -1035
   179  -1035  -1035   -177
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 4.9e+001
 0.846154  0.000000  0.153846  0.000000
 0.230769  0.076923  0.230769  0.461538
 0.153846  0.000000  0.846154  0.000000
 0.923077  0.000000  0.000000  0.076923
 0.230769  0.153846  0.615385  0.000000
 0.230769  0.000000  0.769231  0.000000
 0.615385  0.153846  0.000000  0.230769
 0.000000  0.230769  0.461538  0.307692
 0.384615  0.076923  0.538462  0.000000
 0.307692  0.000000  0.230769  0.461538
 0.000000  0.153846  0.846154  0.000000
 0.076923  0.000000  0.923077  0.000000
 0.230769  0.307692  0.153846  0.307692
 0.692308  0.076923  0.230769  0.000000
 0.153846  0.000000  0.846154  0.000000
 0.923077  0.000000  0.000000  0.076923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
A[TAG]GA[GA][GA][AT][GTC][GA][TAG]GG[CTA][AG]GA
--------------------------------------------------------------------------------

Time 1.82 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 14 llr = 138 E-value = 7.4e-002
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  42:191:::412
pos.-specific     C  :44919:86:98
probability       G  :42::::2:2::
matrix            T  6:4:::a:44::

         bits    2.1
                 1.9       *
                 1.7      **
                 1.5    ****   *
Relative         1.3    *****  **
Entropy          1.0 *  ****** **
(14.2 bits)      0.8 *  ****** **
                 0.6 *  ****** **
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGTCACTCCTCC
consensus            ACC    GTA A
sequence              AG      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
9710                        310  5.38e-07 CCAAACATAC TCTCACTCTTCC AAGAAGTCGT
25337                        88  7.84e-07 CTCCGTCAAT TGTCACTCCGCC GGTGTCTTCC
25061                       289  2.55e-06 CAACCCCCTG TGGCACTCTACC AGCGTCAAAA
25058                       417  3.95e-06 CCAGGGGGTC TGGCACTCTGCC TGCCTGAATA
25060                       273  5.25e-06 ACTTTGCTTT AGGCACTCTACC TTCGTTTGAA
21739                       457  9.51e-06 CGTTCCCCGT TCTCACTCCAAC TCGACGGGTC
22941                       302  1.26e-05 AATATCCAAC AGCCACTGTACC GAAGCTGACT
24566                       329  1.80e-05 CAACTAATTC TCTCACTGCTCA AGTTACTTGA
260919                      485  1.97e-05 CAGCAACCGC AACCACTCCTCA TCAA
21918                       351  1.97e-05 TGGTCGGCCT ACTCACTGTGCC TTTTTCTCTC
268533                      414  2.67e-05 CAGCTCTGGC TACCCCTCCTCC AGTTCCATCG
33601                       482  2.86e-05 GAACTCGACG AACAACTCCTCC AGCAACC
10791                       285  3.02e-05 TGCCTGCACC TCTCACTCCTAA AGAAGAAGGA
1093                        489  7.73e-05 TCAATACCTC TGCAAATCCACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9710                              5.4e-07  309_[+2]_179
25337                             7.8e-07  87_[+2]_401
25061                             2.6e-06  288_[+2]_200
25058                             3.9e-06  416_[+2]_72
25060                             5.2e-06  272_[+2]_216
21739                             9.5e-06  456_[+2]_32
22941                             1.3e-05  301_[+2]_187
24566                             1.8e-05  328_[+2]_160
260919                              2e-05  484_[+2]_4
21918                               2e-05  350_[+2]_138
268533                            2.7e-05  413_[+2]_75
33601                             2.9e-05  481_[+2]_7
10791                               3e-05  284_[+2]_204
1093                              7.7e-05  488_[+2]
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=14
9710                     (  310) TCTCACTCTTCC  1
25337                    (   88) TGTCACTCCGCC  1
25061                    (  289) TGGCACTCTACC  1
25058                    (  417) TGGCACTCTGCC  1
25060                    (  273) AGGCACTCTACC  1
21739                    (  457) TCTCACTCCAAC  1
22941                    (  302) AGCCACTGTACC  1
24566                    (  329) TCTCACTGCTCA  1
260919                   (  485) AACCACTCCTCA  1
21918                    (  351) ACTCACTGTGCC  1
268533                   (  414) TACCCCTCCTCC  1
33601                    (  482) AACAACTCCTCC  1
10791                    (  285) TCTCACTCCTAA  1
1093                     (  489) TGCAAATCCACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.53747 E= 7.4e-002
    42  -1045  -1045    129
   -32     61     86  -1045
 -1045     61    -14     70
   -91    187  -1045  -1045
   179   -171  -1045  -1045
  -190    199  -1045  -1045
 -1045  -1045  -1045    193
 -1045    175    -14  -1045
 -1045    129  -1045     70
    42  -1045    -14     70
   -91    187  -1045  -1045
   -32    175  -1045  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 7.4e-002
 0.357143  0.000000  0.000000  0.642857
 0.214286  0.357143  0.428571  0.000000
 0.000000  0.357143  0.214286  0.428571
 0.142857  0.857143  0.000000  0.000000
 0.928571  0.071429  0.000000  0.000000
 0.071429  0.928571  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.785714  0.214286  0.000000
 0.000000  0.571429  0.000000  0.428571
 0.357143  0.000000  0.214286  0.428571
 0.142857  0.857143  0.000000  0.000000
 0.214286  0.785714  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[TA][GCA][TCG]CACT[CG][CT][TAG]C[CA]
--------------------------------------------------------------------------------

Time 3.56 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 10 llr = 119 E-value = 2.1e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :7142:23::::221
pos.-specific     C  a2753a827989:59
probability       G  ::::1::3::2::::
matrix            T  :1214::231:183:

         bits    2.1 *    *
                 1.9 *    *
                 1.7 *    *   * *  *
                 1.5 *    *   ***  *
Relative         1.3 *    ** ***** *
Entropy          1.0 *    ** ***** *
(17.1 bits)      0.8 ***  ** ***** *
                 0.6 **** ** ***** *
                 0.4 **** ** *******
                 0.2 ******* *******
                 0.0 ---------------

Multilevel           CACCTCCACCCCTCC
consensus             CTAC AGT G AT
sequence                 A  C     A
                            T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
21918                       152  2.44e-08 ATTTGTCGTG CACCGCCGCCCCTCC GTCGTTGTCA
260919                      446  1.57e-07 AAGACCGCCC CACAACCACCCCTAC TCGACCGACT
33601                        43  3.28e-07 GCCACATCGA CACCCCCCCCCCTCA AGTGACGCTG
1093                        420  4.07e-07 AGACAGTGAC CATCCCCCCCCCTAC GTCAGCACAC
25058                         1  8.98e-07          . CCCATCCACCCCATC CTCCGCCTAC
22941                       470  1.68e-06 ATATATCACA CACATCCATCGCACC ACAATCATCA
268533                       18  2.70e-06 CCCATTACAC CAAACCAGCCCCTTC CCGTACACAC
25060                        66  5.51e-06 CGTTGTCATT CTCCTCATTCCCTTC GTTGTTTTGG
9710                        275  8.00e-06 CTTCCTCTCT CCTCACCTCTCCTCC GAGGAGTTTT
21739                       438  9.55e-06 GGAAACAGTA CACTTCCGTCGTTCC CCGTTCTCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21918                             2.4e-08  151_[+3]_334
260919                            1.6e-07  445_[+3]_40
33601                             3.3e-07  42_[+3]_443
1093                              4.1e-07  419_[+3]_66
25058                               9e-07  [+3]_485
22941                             1.7e-06  469_[+3]_16
268533                            2.7e-06  17_[+3]_468
25060                             5.5e-06  65_[+3]_420
9710                                8e-06  274_[+3]_211
21739                             9.5e-06  437_[+3]_48
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=10
21918                    (  152) CACCGCCGCCCCTCC  1
260919                   (  446) CACAACCACCCCTAC  1
33601                    (   43) CACCCCCCCCCCTCA  1
1093                     (  420) CATCCCCCCCCCTAC  1
25058                    (    1) CCCATCCACCCCATC  1
22941                    (  470) CACATCCATCGCACC  1
268533                   (   18) CAAACCAGCCCCTTC  1
25060                    (   66) CTCCTCATTCCCTTC  1
9710                     (  275) CCTCACCTCTCCTCC  1
21739                    (  438) CACTTCCGTCGTTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6804 bayes= 9.66 E= 2.1e+001
  -997    210   -997   -997
   139    -22   -997   -139
  -142    158   -997    -39
    58    110   -997   -139
   -42     36   -124     60
  -997    210   -997   -997
   -42    178   -997   -997
    16    -22     35    -39
  -997    158   -997     19
  -997    194   -997   -139
  -997    178    -24   -997
  -997    194   -997   -139
   -42   -997   -997    160
   -42    110   -997     19
  -142    194   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 10 E= 2.1e+001
 0.000000  1.000000  0.000000  0.000000
 0.700000  0.200000  0.000000  0.100000
 0.100000  0.700000  0.000000  0.200000
 0.400000  0.500000  0.000000  0.100000
 0.200000  0.300000  0.100000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.300000  0.200000  0.300000  0.200000
 0.000000  0.700000  0.000000  0.300000
 0.000000  0.900000  0.000000  0.100000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.900000  0.000000  0.100000
 0.200000  0.000000  0.000000  0.800000
 0.200000  0.500000  0.000000  0.300000
 0.100000  0.900000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
C[AC][CT][CA][TCA]C[CA][AGCT][CT]C[CG]C[TA][CTA]C
--------------------------------------------------------------------------------

Time 5.36 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10791                            1.38e-03  284_[+2(3.02e-05)]_138_\
    [+1(4.08e-05)]_50
1093                             1.96e-04  419_[+3(4.07e-07)]_54_\
    [+2(7.73e-05)]
21739                            7.50e-07  390_[+1(3.44e-07)]_31_\
    [+3(9.55e-06)]_4_[+2(9.51e-06)]_32
21918                            1.32e-11  126_[+1(4.57e-10)]_9_[+3(2.44e-08)]_\
    184_[+2(1.97e-05)]_138
22941                            3.93e-06  120_[+1(9.32e-06)]_165_\
    [+2(1.26e-05)]_156_[+3(1.68e-06)]_16
24566                            2.31e-04  3_[+1(4.08e-05)]_231_[+1(1.35e-06)]_\
    62_[+2(1.80e-05)]_160
25058                            4.82e-07  [+3(8.98e-07)]_66_[+1(5.38e-06)]_38_\
    [+1(6.49e-05)]_90_[+1(7.48e-05)]_159_[+2(3.95e-06)]_72
25060                            4.20e-08  65_[+3(5.51e-06)]_91_[+1(4.52e-08)]_\
    85_[+2(5.25e-06)]_216
25061                            1.48e-04  157_[+1(1.34e-05)]_115_\
    [+2(2.55e-06)]_61_[+1(8.57e-05)]_123
25337                            3.07e-04  87_[+2(7.84e-07)]_157_\
    [+1(1.88e-05)]_228
260919                           6.52e-07  96_[+1(8.66e-06)]_333_\
    [+3(1.57e-07)]_24_[+2(1.97e-05)]_4
268533                           2.16e-05  17_[+3(2.70e-06)]_381_\
    [+2(2.67e-05)]_51_[+1(1.88e-05)]_8
33601                            2.95e-06  42_[+3(3.28e-07)]_196_\
    [+1(1.54e-05)]_212_[+2(2.86e-05)]_7
9710                             3.71e-08  46_[+1(8.96e-05)]_103_\
    [+1(2.65e-07)]_93_[+3(8.00e-06)]_20_[+2(5.38e-07)]_179
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
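
Note on reading the matrices in this file: the non-zero entries of each position-specific scoring matrix appear consistent with 100 * log2(p / b), where p is the matching entry of the position-specific probability matrix and b is the background letter frequency from the command line summary. This is an observation drawn from the numbers printed here, not a statement of MEME's exact procedure. The background is printed to only three decimals, so an entry can differ by about one unit, and zero probabilities are reported as large negative scores (for example -1035) that this sketch does not try to reproduce. The helper name log_odds is hypothetical.

```python
import math

# Background letter frequencies as printed in the command line summary
# (rounded to three decimals in this file, so results are approximate).
BACKGROUND = {"A": 0.268, "C": 0.234, "G": 0.236, "T": 0.263}


def log_odds(p, letter, background=BACKGROUND):
    """Approximate a scoring-matrix entry as round(100 * log2(p / b)).

    Returns None for p == 0: the file reports a large negative score there,
    which presumably comes from a small pseudocount (an assumption).
    """
    if p <= 0.0:
        return None
    return round(100 * math.log2(p / background[letter]))


# First row of the Motif 1 probability matrix and scoring matrix (A, C, G, T).
pspm_row = [0.846154, 0.000000, 0.153846, 0.000000]
pssm_row = [166, -1035, -62, -1035]

estimates = [log_odds(p, letter) for p, letter in zip(pspm_row, "ACGT")]
for letter, estimate, reported in zip("ACGT", estimates, pssm_row):
    print(letter, "estimated:", estimate, "reported:", reported)
```

Comparing the estimated and reported columns row by row is a quick sanity check after copying a matrix out of this file by hand.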