********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/75/75.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10203                    1.0000    500  13175                    1.0000    500
1964                     1.0000    500  2122                     1.0000    500
21892                    1.0000    500  23033                    1.0000    500
24285                    1.0000    500  264314                   1.0000    500
268650                   1.0000    500  35103                    1.0000    500
7044                     1.0000    500  9741                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/75/75.seqs.fa -oc motifs/75 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.245 G 0.220 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.246 G 0.220 T 0.263
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  12  llr = 138  E-value = 6.9e-002
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 1::3::131:32:132 pos.-specific C 1::125:::112:22: probability G 83:73:92:954:655 matrix T :8a:65:69:23a2:3 bits 2.2 2.0 * * 1.7 * * * * 1.5 * * ** * Relative 1.3 * * * ** * Entropy 1.1 *** * ** * (16.6 bits) 0.9 **** ** ** * 0.7 ********** * ** 0.4 ********** **** 0.2 **************** 0.0 ---------------- Multilevel GTTGTCGTTGGGTGGG consensus G AGT A AT AT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 23033 118 2.03e-10 GTTTGGTAGT GTTGTTGTTGGGTGGG GTTCGTCGTC 2122 372 4.09e-09 CGTCATCGTC GTTGTCGTTGGTTGGT CTTTGCCAAA 13175 382 1.15e-07 TGGAGAAGAT GTTGCCGTTGGTTGCG TTGCGTTTGC 35103 300 5.45e-07 TTTAGATCCA ATTGTTGATGGGTGGT AGGCTAGTTC 10203 71 1.61e-06 TGGACAGTCG CTTGTTGTTGTGTCGG CCATTGCTGT 21892 169 2.70e-06 CTCGTTTGGA GTTGTTGGAGACTGGG GTGGGCACAA 9741 351 3.19e-06 GAGTGATGCA GGTGGCGGTCGGTGAG CAGGAGGCCA 268650 251 4.37e-06 CGAGAATGAA GGTGGTGATGATTGAA ATGCCATCGT 264314 109 1.09e-05 AATATATGTT GGTAGCGTTGCATTGG ATCTGACCGC 7044 370 1.16e-05 GCTGTGGTGA GTTATTATTGGGTAAT TATTTGATAC 1964 444 1.87e-05 CCCTCCATCT GTTCTCGTTGAATCCA TCCACCAGTC 24285 143 2.21e-05 CGGTCATGAA GTTACCGATGTCTTAT CGGATGTGGC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 23033 2e-10 
117_[+1]_367 2122 4.1e-09 371_[+1]_113 13175 1.1e-07 381_[+1]_103 35103 5.5e-07 299_[+1]_185 10203 1.6e-06 70_[+1]_414 21892 2.7e-06 168_[+1]_316 9741 3.2e-06 350_[+1]_134 268650 4.4e-06 250_[+1]_234 264314 1.1e-05 108_[+1]_376 7044 1.2e-05 369_[+1]_115 1964 1.9e-05 443_[+1]_41 24285 2.2e-05 142_[+1]_342 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=12 23033 ( 118) GTTGTTGTTGGGTGGG 1 2122 ( 372) GTTGTCGTTGGTTGGT 1 13175 ( 382) GTTGCCGTTGGTTGCG 1 35103 ( 300) ATTGTTGATGGGTGGT 1 10203 ( 71) CTTGTTGTTGTGTCGG 1 21892 ( 169) GTTGTTGGAGACTGGG 1 9741 ( 351) GGTGGCGGTCGGTGAG 1 268650 ( 251) GGTGGTGATGATTGAA 1 264314 ( 109) GGTAGCGTTGCATTGG 1 7044 ( 370) GTTATTATTGGGTAAT 1 1964 ( 444) GTTCTCGTTGAATCCA 1 24285 ( 143) GTTACCGATGTCTTAT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 9.36712 E= 6.9e-002 -170 -156 192 -1023 -1023 -1023 19 151 -1023 -1023 -1023 193 -12 -156 160 -1023 -1023 -56 19 115 -1023 103 -1023 93 -170 -1023 206 -1023 -12 -1023 -40 115 -170 -1023 -1023 180 -1023 -156 206 -1023 -12 -156 118 -66 -70 -56 92 -7 -1023 -1023 -1023 193 -170 -56 141 -66 29 -56 118 -1023 -70 -1023 118 34 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 6.9e-002 
0.083333 0.083333 0.833333 0.000000 0.000000 0.000000 0.250000 0.750000 0.000000 0.000000 0.000000 1.000000 0.250000 0.083333 0.666667 0.000000 0.000000 0.166667 0.250000 0.583333 0.000000 0.500000 0.000000 0.500000 0.083333 0.000000 0.916667 0.000000 0.250000 0.000000 0.166667 0.583333 0.083333 0.000000 0.000000 0.916667 0.000000 0.083333 0.916667 0.000000 0.250000 0.083333 0.500000 0.166667 0.166667 0.166667 0.416667 0.250000 0.000000 0.000000 0.000000 1.000000 0.083333 0.166667 0.583333 0.166667 0.333333 0.166667 0.500000 0.000000 0.166667 0.000000 0.500000 0.333333 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- G[TG]T[GA][TG][CT]G[TA]TG[GA][GT]TG[GA][GT] -------------------------------------------------------------------------------- Time 1.22 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 19 sites = 5 llr = 91 E-value = 5.3e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 284868:a226a:4::22a pos.-specific C ::::::::::2:::2:::: probability G 826:4:a:882:a62288: matrix T :::2:2::::::::68::: bits 2.2 * * 2.0 ** ** * 1.7 ** ** * 1.5 ** ** * Relative 1.3 ** **** ** **** Entropy 1.1 ********** *** **** (26.3 bits) 0.9 ********** *** **** 0.7 ******************* 0.4 ******************* 0.2 ******************* 0.0 ------------------- Multilevel GAGAAAGAGGAAGGTTGGA consensus AGATGT AAC ACGAA sequence G G -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
1964                         50  2.50e-12 GCCAGTGCCC GAGAAAGAGGAAGGTTGGA ATATTAGCGT
10203                        46  7.25e-10 GGCATGGGAT GGGAGAGAGGCAGATTGGA CAGTCGCTTG
2122                        395  4.05e-09 GGTCTTTGCC AAAAATGAGGGAGGTTGGA GGCTGGCAGA
7044                        275  1.60e-08 GGTTGAATAC GAATAAGAAGAAGGCTAGA GAGTTTTTGG
268650                      313  1.66e-08 ACTGCGAGAG GAGAGAGAGAAAGAGGGAA GGGGAGAAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1964                              2.5e-12  49_[+2]_432
10203                             7.3e-10  45_[+2]_436
2122                              4.1e-09  394_[+2]_87
7044                              1.6e-08  274_[+2]_207
268650                            1.7e-08  312_[+2]_169
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=5
1964                     (   50) GAGAAAGAGGAAGGTTGGA  1
10203                    (   46) GGGAGAGAGGCAGATTGGA  1
2122                     (  395) AAAAATGAGGGAGGTTGGA  1
7044                     (  275) GAATAAGAAGAAGGCTAGA  1
268650                   (  313) GAGAGAGAGAAAGAGGGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 5784 bayes= 10.4264 E= 5.3e+001
   -44   -897    186   -897
   156   -897    -14   -897
    56   -897    145   -897
   156   -897   -897    -39
   114   -897     86   -897
   156   -897   -897    -39
  -897   -897    218   -897
   188   -897   -897   -897
   -44   -897    186   -897
   -44   -897    186   -897
   114    -30    -14   -897
   188   -897   -897   -897
  -897   -897    218   -897
    56   -897    145   -897
  -897    -30    -14    119
  -897   -897    -14    160
   -44   -897    186   -897
   -44   -897    186   -897
   188   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 5 E= 5.3e+001
 0.200000  0.000000  0.800000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.600000  0.000000  0.400000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.600000  0.200000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.200000  0.200000  0.600000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.000000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[GA][AG][GA][AT][AG][AT]GA[GA][GA][ACG]AG[GA][TCG][TG][GA][GA]A
--------------------------------------------------------------------------------

Time  2.42 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 12 llr = 132 E-value = 3.3e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A a1283:52:66:851a pos.-specific C :7732542a33a:28: probability G ::2:1213::::3:2: matrix T :3::53:3:12::3:: bits 2.2 2.0 * * * * 1.7 * * * * 1.5 * * * * Relative 1.3 * * * * Entropy 1.1 * * * ** ** (15.8 bits) 0.9 **** * ** ** 0.7 **** ** ** ** ** 0.4 **** ** ******** 0.2 ******* ******** 0.0 ---------------- Multilevel ACCATCAGCAACAACA consensus T CATCT CC GT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 2122 101 2.08e-08 AGTAGAACTA ACCATTATCAACATCA AACCCCATCC 7044 439 3.37e-08 TTCATTTAAT ACCATTCGCACCAACA GGAACTAGCA 264314 335 2.40e-07 CTTCTCTCCT ACCACCACCCACAACA ATCGTAGCAG 24285 411 3.77e-07 GATCTCAATC ATCATTCACAACAACA ACAACAAACA 23033 475 1.55e-06 GACGACGTAG ACCATCACCCTCACCA CCGAGCCACC 13175 462 3.94e-06 TACCAACCCA ACAATCAACAACGCCA TAGACACAAT 9741 471 4.32e-06 CTCACCATTC ATCCTTCTCCCCATCA CATCACTGCA 21892 461 1.24e-05 CAAACAGGAG ACGAAGAGCAACAAAA AGCAAGTTCC 10203 460 1.24e-05 CGTGAGCCGC AACAGCGGCAACAACA GCAGGAGGGT 1964 210 1.62e-05 CTCACGCATC ATCACGCGCACCATGA TTTTTGCTTC 268650 230 1.95e-05 TATCAAACGA ACGCACCTCCACGAGA ATGAAGGTGG 35103 400 5.07e-05 AACAAGCCGA ACACACATCTTCGTCA CCCCTTTTGC 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
2122                              2.1e-08  100_[+3]_384
7044                              3.4e-08  438_[+3]_46
264314                            2.4e-07  334_[+3]_150
24285                             3.8e-07  410_[+3]_74
23033                             1.6e-06  474_[+3]_10
13175                             3.9e-06  461_[+3]_23
9741                              4.3e-06  470_[+3]_14
21892                             1.2e-05  460_[+3]_24
10203                             1.2e-05  459_[+3]_25
1964                              1.6e-05  209_[+3]_275
268650                            1.9e-05  229_[+3]_255
35103                             5.1e-05  399_[+3]_85
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=12
2122                     (  101) ACCATTATCAACATCA  1
7044                     (  439) ACCATTCGCACCAACA  1
264314                   (  335) ACCACCACCCACAACA  1
24285                    (  411) ATCATTCACAACAACA  1
23033                    (  475) ACCATCACCCTCACCA  1
13175                    (  462) ACAATCAACAACGCCA  1
9741                     (  471) ATCCTTCTCCCCATCA  1
21892                    (  461) ACGAAGAGCAACAAAA  1
10203                    (  460) AACAGCGGCAACAACA  1
1964                     (  210) ATCACGCGCACCATGA  1
268650                   (  230) ACGCACCTCCACGAGA  1
35103                    (  400) ACACACATCTTCGTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 9.36712 E= 3.3e+001
   188  -1023  -1023  -1023
  -170    144  -1023     -7
   -70    144    -40  -1023
   146      3  -1023  -1023
   -12    -56   -140     93
 -1023    103    -40     34
    88     76   -140  -1023
   -70    -56     60     34
 -1023    203  -1023  -1023
   110     44  -1023   -166
   110      3  -1023    -66
 -1023    203  -1023  -1023
   146  -1023     19  -1023
    88    -56  -1023     34
  -170    161    -40  -1023
   188  -1023  -1023  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 3.3e+001
 1.000000  0.000000  0.000000  0.000000
 0.083333  0.666667  0.000000  0.250000
 0.166667  0.666667  0.166667  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.166667  0.083333  0.500000
 0.000000  0.500000  0.166667  0.333333
 0.500000  0.416667  0.083333  0.000000
 0.166667  0.166667  0.333333  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.583333  0.333333  0.000000  0.083333
 0.583333  0.250000  0.000000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.500000  0.166667  0.000000  0.333333
 0.083333  0.750000  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
A[CT]C[AC][TA][CT][AC][GT]C[AC][AC]C[AG][AT]CA
--------------------------------------------------------------------------------

Time  3.71 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10203                            6.48e-10  45_[+2(7.25e-10)]_6_[+1(1.61e-06)]_\
    373_[+3(1.24e-05)]_25
13175                            1.50e-05  381_[+1(1.15e-07)]_64_\
    [+3(3.94e-06)]_23
1964                             4.12e-11  49_[+2(2.50e-12)]_141_\
    [+3(1.62e-05)]_218_[+1(1.87e-05)]_41
2122                             2.93e-14  100_[+3(2.08e-08)]_255_\
    [+1(4.09e-09)]_7_[+2(4.05e-09)]_87
21892                            4.56e-04  168_[+1(2.70e-06)]_276_\
    [+3(1.24e-05)]_24
23033                            1.16e-08  117_[+1(2.03e-10)]_341_\
    [+3(1.55e-06)]_10
24285                            1.78e-04  142_[+1(2.21e-05)]_252_\
    [+3(3.77e-07)]_48_[+3(7.89e-06)]_10
264314                           6.80e-05  108_[+1(1.09e-05)]_210_\
    [+3(2.40e-07)]_150
268650                           4.43e-08  229_[+3(1.95e-05)]_5_[+1(4.37e-06)]_\
    46_[+2(1.66e-08)]_169
35103                            3.69e-04  299_[+1(5.45e-07)]_84_\
    [+3(5.07e-05)]_85
7044                             2.97e-10  146_[+2(6.39e-05)]_109_\
    [+2(1.60e-08)]_76_[+1(1.16e-05)]_53_[+3(3.37e-08)]_46
9741                             7.78e-05  350_[+1(3.19e-06)]_104_\
    [+3(4.32e-06)]_14
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************