********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/78/78.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47294                    1.0000    500  37525                    1.0000    500
21664                    1.0000    500  47576                    1.0000    500
47903                    1.0000    500  38671                    1.0000    500
38755                    1.0000    500  22713                    1.0000    500
39534                    1.0000    500  49244                    1.0000    500
15917                    1.0000    500  40271                    1.0000    500
30620                    1.0000    500  16295                    1.0000    500
23694                    1.0000    500  50076                    1.0000    500
50148                    1.0000    500  50273                    1.0000    500
23913                    1.0000    500  16722                    1.0000    500
43792                    1.0000    500  10454                    1.0000    500
39165                    1.0000    500  44140                    1.0000    500
49046                    1.0000    500  49977                    1.0000    500
47852                    1.0000    500  40761                    1.0000    500
49411                    1.0000    500  49428                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/78/78.seqs.fa -oc motifs/78 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       30    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           15000    N=              30
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.233 G 0.229 T 0.272
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.233 G 0.229 T 0.272
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  18  llr = 187  E-value = 4.4e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1a:812:97:5
pos.-specific     C  :4:9:::51:32
probability       G  :5:1:9:5:313
matrix            T  a::12:8::16:

         bits    2.1
                 1.9 * *
                 1.7 * *  *  *
                 1.5 * ** *  *
Relative         1.3 * ** ** *
Entropy          1.1 * *******
(15.0 bits)      0.9 **********
                 0.6 ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGACAGTCAATA
consensus             C  T  G GCG
sequence                        C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47903                       214  1.51e-07 TTTCCGACTC TGACAGTGAATA AACGCGAAAC
30620                       185  9.62e-07 TGTAAGCAAT TGACAGTGAATC TTTTGGTTCG
40761                       445  1.09e-06 AGGCTTTGCT TGACAGTGAGTA CGGTAGCATT
49244                       317  1.34e-06 CGTCGATTTA TGACAGTGAACG AGAACAACTC
37525                       171  1.34e-06 ATCCCTGTGT TCACAGTCAATC GGCATCGTCG
22713                       469  1.70e-06 TTCTTTCCAT TCACAGTCAACG GTCTTGTTCC
49977                       144  3.03e-06 GTTGTGTCTC TCACAGTCAGTC AGTGTCGGGC
43792                       282  3.03e-06 TATTTGTTAC TCACAGTCAGTC GCAGTCAAAA
49046                       381  5.40e-06 CCCAGCGAGG TGACTGTGAACG ACTTTAGACC
49428                       248  6.73e-06 AAAGGCCATT TCACAGTCAGGA ATGTTAGTGG
50076                       279  8.33e-06 ATTGCGGATC TCACAGAGAACG TAAACAAGTA
39534                       101  1.24e-05 GCTAGACTCC TCACTGACAATA AGACGATCAA
50148                       300  1.37e-05 TTCTTTCTGC TGAGAGTCAACA ACTCGAGCTG
47576                       349  2.12e-05 AACGAGCACA TCACAGTCCACG ACGGATGCGT
44140                       467  2.86e-05 TCTTTCACAC TGACAGACATTA GCTTCGACTC
39165                       279  3.02e-05 CCTGTCGGTC TGATTGTGAATA CCGGTAGGCA
50273                       305  3.45e-05 GAAAAAGCAG TAACTGTGAGTA CGGGAGTATT
10454                       214  3.63e-05 ACTGGCACCA TGACAATGAAGA TTCGATCGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47903                             1.5e-07  213_[+1]_275
30620                             9.6e-07  184_[+1]_304
40761                             1.1e-06  444_[+1]_44
49244                             1.3e-06  316_[+1]_172
37525                             1.3e-06  170_[+1]_318
22713                             1.7e-06  468_[+1]_20
49977                               3e-06  143_[+1]_345
43792                               3e-06  281_[+1]_207
49046                             5.4e-06  380_[+1]_108
49428                             6.7e-06  247_[+1]_241
50076                             8.3e-06  278_[+1]_210
39534                             1.2e-05  100_[+1]_388
50148                             1.4e-05  299_[+1]_189
47576                             2.1e-05  348_[+1]_140
44140                             2.9e-05  466_[+1]_22
39165                               3e-05  278_[+1]_210
50273                             3.4e-05  304_[+1]_184
10454                             3.6e-05  213_[+1]_275
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=18
47903                    (  214) TGACAGTGAATA  1
30620                    (  185) TGACAGTGAATC  1
40761                    (  445) TGACAGTGAGTA  1
49244                    (  317) TGACAGTGAACG  1
37525                    (  171) TCACAGTCAATC  1
22713                    (  469) TCACAGTCAACG  1
49977                    (  144) TCACAGTCAGTC  1
43792                    (  282) TCACAGTCAGTC  1
49046                    (  381) TGACTGTGAACG  1
49428                    (  248) TCACAGTCAGGA  1
50076                    (  279) TCACAGAGAACG  1
39534                    (  101) TCACTGACAATA  1
50148                    (  300) TGAGAGTCAACA  1
47576                    (  349) TCACAGTCCACG  1
44140                    (  467) TGACAGACATTA  1
39165                    (  279) TGATTGTGAATA  1
50273                    (  305) TAACTGTGAGTA  1
10454                    (  214) TGACAATGAAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 14670 bayes= 10.5177 E= 4.4e-003
 -1081  -1081  -1081    188
  -226     93    113  -1081
   191  -1081  -1081  -1081
 -1081    193   -204   -229
   155  -1081  -1081    -29
  -226  -1081    205  -1081
   -68  -1081  -1081    161
 -1081    110    113  -1081
   183   -206  -1081  -1081
   132  -1081     28   -229
 -1081     52   -104    103
    91     -7     28  -1081
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 18 E= 4.4e-003
 0.000000  0.000000  0.000000  1.000000
 0.055556  0.444444  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.888889  0.055556  0.055556
 0.777778  0.000000  0.000000  0.222222
 0.055556  0.000000  0.944444  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.500000  0.500000  0.000000
 0.944444  0.055556  0.000000  0.000000
 0.666667  0.000000  0.277778  0.055556
 0.000000  0.333333  0.111111  0.555556
 0.500000  0.222222  0.277778  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[GC]AC[AT]GT[CG]A[AG][TC][AGC]
--------------------------------------------------------------------------------




Time  7.56 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =  11  llr = 146  E-value = 5.8e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3:a1a8a23:555857
pos.-specific     C  1::::::147::::52
probability       G  6a:9:2:2125552:1
matrix            T  :::::::531::::::

         bits    2.1  *
                 1.9  ** * *
                 1.7  **** *
                 1.5  **** *
Relative         1.3  ******      *
Entropy          1.1  ******  ******
(19.2 bits)      0.9 *******  *******
                 0.6 *******  *******
                 0.4 *******  *******
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGAGAAATCCGAGAAA
consensus            A       A AGA C
sequence                     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
49977                       473  2.41e-08 ATTGAGTTCC GGAGAAAGACGGGACA GTGTCCCCCA
47852                        74  5.80e-08 AACCGGATGA GGAGAAATACGGGGCA TTCAGCCTGT
43792                       315  2.40e-07 AAATTTCATT GGAGAAAATGGAGACA TAGTCCATCC
39534                       378  2.40e-07 AGATTCATAA AGAGAAATCCAAAAAC CATTATATAA
21664                       240  2.40e-07 GGATTTGAAA AGAGAAATCCGAGAAG AGCTCGACGG
10454                       151  3.30e-07 CAAGCTCAAA GGAAAAATCCAGAACA AACATTCTAT
38755                       302  4.01e-07 CTTAGAATTA GGAGAAATTTAAAAAA ATTACAAAAA
38671                         9  5.92e-07   TCTGGTTT CGAGAAATTCGGGGAA TGTACATCGT
47294                       289  8.81e-07 GTCACACCGT GGAGAGAAGCGAAAAA GGCCCTTCGG
44140                        25  9.61e-07 TTGGGCGGCT GGAGAAAGAGAAGACC CATCGAAACT
30620                       112  1.34e-06 CTAATCAGAC AGAGAGACCCAGAAAA AAATCGACAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49977                             2.4e-08  472_[+2]_12
47852                             5.8e-08  73_[+2]_411
43792                             2.4e-07  314_[+2]_170
39534                             2.4e-07  377_[+2]_107
21664                             2.4e-07  239_[+2]_245
10454                             3.3e-07  150_[+2]_334
38755                               4e-07  301_[+2]_183
38671                             5.9e-07  8_[+2]_476
47294                             8.8e-07  288_[+2]_196
44140                             9.6e-07  24_[+2]_460
30620                             1.3e-06  111_[+2]_373
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=11
49977                    (  473) GGAGAAAGACGGGACA  1
47852                    (   74) GGAGAAATACGGGGCA  1
43792                    (  315) GGAGAAAATGGAGACA  1
39534                    (  378) AGAGAAATCCAAAAAC  1
21664                    (  240) AGAGAAATCCGAGAAG  1
10454                    (  151) GGAAAAATCCAGAACA  1
38755                    (  302) GGAGAAATTTAAAAAA  1
38671                    (    9) CGAGAAATTCGGGGAA  1
47294                    (  289) GGAGAGAAGCGAAAAA  1
44140                    (   25) GGAGAAAGAGAAGACC  1
30620                    (  112) AGAGAGACCCAGAAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 14550 bayes= 11.3952 E= 5.8e-001
     3   -135    148  -1010
 -1010  -1010    213  -1010
   191  -1010  -1010  -1010
  -155  -1010    199  -1010
   191  -1010  -1010  -1010
   162  -1010    -33  -1010
   191  -1010  -1010  -1010
   -55   -135    -33    100
     3     64   -133      0
 -1010    164    -33   -158
    77  -1010    125  -1010
   103  -1010     99  -1010
    77  -1010    125  -1010
   162  -1010    -33  -1010
   103     97  -1010  -1010
   145    -36   -133  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 5.8e-001
 0.272727  0.090909  0.636364  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.090909  0.000000  0.909091  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.818182  0.000000  0.181818  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.181818  0.090909  0.181818  0.545455
 0.272727  0.363636  0.090909  0.272727
 0.000000  0.727273  0.181818  0.090909
 0.454545  0.000000  0.545455  0.000000
 0.545455  0.000000  0.454545  0.000000
 0.454545  0.000000  0.545455  0.000000
 0.818182  0.000000  0.181818  0.000000
 0.545455  0.454545  0.000000  0.000000
 0.727273  0.181818  0.090909  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GA]GAGAAAT[CAT]C[GA][AG][GA]A[AC]A
--------------------------------------------------------------------------------




Time 14.52 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =  10  llr = 133  E-value = 2.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  9aa:31:4::2319::
pos.-specific     C  1::a:3::55::9:92
probability       G  ::::71214:73:::7
matrix            T  :::::5851514:111

         bits    2.1    *
                 1.9  ***
                 1.7  ***        * *
                 1.5 ****        ***
Relative         1.3 ***** *     ***
Entropy          1.1 ***** *  *  ***
(19.2 bits)      0.9 ***** *  ** ****
                 0.6 ***** ***** ****
                 0.4 ***** **********
                 0.2 ****************
                 0.0 ----------------

Multilevel           AAACGTTTCCGTCACG
consensus                ACGAGTAA   C
sequence                        G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
44140                       345  3.42e-08 TCTCACTATC AAACGTTAGCGTCACC CGGAATTTCT
21664                       365  1.02e-07 TCCGACTCGA AAACGTTACCTACACG GTCCAAGTGG
10454                         8  1.47e-07    CGGAAAA AAACATTTGTATCACG AATATTGCGG
47903                       232  1.83e-07 AATAAACGCG AAACGTTTGTGAAACG AGCCCCTGAC
50148                       423  3.52e-07 ATCCATTTCA CAACACTACCGTCACG TGAAAAATGG
49977                       209  4.61e-07 TGGTGGGAGA AAACGCTTCCATCTCG TCGGAGCGCG
23913                       462  4.61e-07 TATCGCTGGT AAACGCGACCGGCACT TTATCTTTGC
22713                        96  4.61e-07 GTGATGAGTG AAACATTGCTGGCACC CAATCGCTGG
47852                       107  6.61e-07 TGTCGCTTAT AAACGGTTGTGGCATG TAGTCGTTGA
50273                       405  1.12e-06 GACAATGCCA AAACGAGTTTGACACG GAGTACAGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44140                             3.4e-08  344_[+3]_140
21664                               1e-07  364_[+3]_120
10454                             1.5e-07  7_[+3]_477
47903                             1.8e-07  231_[+3]_253
50148                             3.5e-07  422_[+3]_62
49977                             4.6e-07  208_[+3]_276
23913                             4.6e-07  461_[+3]_23
22713                             4.6e-07  95_[+3]_389
47852                             6.6e-07  106_[+3]_378
50273                             1.1e-06  404_[+3]_80
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=10
44140                    (  345) AAACGTTAGCGTCACC  1
21664                    (  365) AAACGTTACCTACACG  1
10454                    (    8) AAACATTTGTATCACG  1
47903                    (  232) AAACGTTTGTGAAACG  1
50148                    (  423) CAACACTACCGTCACG  1
49977                    (  209) AAACGCTTCCATCTCG  1
23913                    (  462) AAACGCGACCGGCACT  1
22713                    (   96) AAACATTGCTGGCACC  1
47852                    (  107) AAACGGTTGTGGCATG  1
50273                    (  405) AAACGAGTTTGACACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 14550 bayes= 10.7575 E= 2.4e+002
   176   -122   -997   -997
   191   -997   -997   -997
   191   -997   -997   -997
  -997    210   -997   -997
    17   -997    161   -997
  -141     37   -119     88
  -997   -997    -19    155
    59   -997   -119     88
  -997    110     81   -144
  -997    110   -997     88
   -41   -997    161   -144
    17   -997     39     55
  -141    195   -997   -997
   176   -997   -997   -144
  -997    195   -997   -144
  -997    -22    161   -144
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 2.4e+002
 0.900000  0.100000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.300000  0.000000  0.700000  0.000000
 0.100000  0.300000  0.100000  0.500000
 0.000000  0.000000  0.200000  0.800000
 0.400000  0.000000  0.100000  0.500000
 0.000000  0.500000  0.400000  0.100000
 0.000000  0.500000  0.000000  0.500000
 0.200000  0.000000  0.700000  0.100000
 0.300000  0.000000  0.300000  0.400000
 0.100000  0.900000  0.000000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.000000  0.900000  0.000000  0.100000
 0.000000  0.200000  0.700000  0.100000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AAAC[GA][TC][TG][TA][CG][CT][GA][TAG]CAC[GC]
--------------------------------------------------------------------------------




Time 21.75 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47294                            3.88e-03  288_[+2(8.81e-07)]_196
37525                            2.41e-03  170_[+1(1.34e-06)]_318
21664                            1.12e-06  239_[+2(2.40e-07)]_109_\
                                           [+3(1.02e-07)]_120
47576                            5.32e-02  348_[+1(2.12e-05)]_140
47903                            4.86e-07  213_[+1(1.51e-07)]_6_[+3(1.83e-07)]_\
                                           253
38671                            2.02e-03  8_[+2(5.92e-07)]_476
38755                            1.44e-03  301_[+2(4.01e-07)]_3_[+2(9.66e-05)]_\
                                           164
22713                            2.48e-05  95_[+3(4.61e-07)]_357_\
                                           [+1(1.70e-06)]_20
39534                            2.36e-05  100_[+1(1.24e-05)]_265_\
                                           [+2(2.40e-07)]_107
49244                            4.45e-03  316_[+1(1.34e-06)]_172
15917                            1.07e-01  500
40271                            5.89e-01  500
30620                            2.19e-05  111_[+2(1.34e-06)]_57_\
                                           [+1(9.62e-07)]_304
16295                            3.07e-01  500
23694                            3.17e-01  500
50076                            2.68e-02  278_[+1(8.33e-06)]_210
50148                            4.08e-05  299_[+1(1.37e-05)]_111_\
                                           [+3(3.52e-07)]_62
50273                            1.71e-04  304_[+1(3.45e-05)]_88_\
                                           [+3(1.12e-06)]_80
23913                            7.85e-03  461_[+3(4.61e-07)]_23
16722                            2.96e-02  15_[+3(3.64e-05)]_469
43792                            5.39e-06  281_[+1(3.03e-06)]_21_\
                                           [+2(2.40e-07)]_170
10454                            5.46e-08  7_[+3(1.47e-07)]_127_[+2(3.30e-07)]_\
                                           47_[+1(3.63e-05)]_275
39165                            1.31e-01  278_[+1(3.02e-05)]_210
44140                            3.09e-08  24_[+2(9.61e-07)]_304_\
                                           [+3(3.42e-08)]_106_[+1(2.86e-05)]_22
49046                            3.51e-02  184_[+1(5.30e-05)]_184_\
                                           [+1(5.40e-06)]_108
49977                            1.44e-09  143_[+1(3.03e-06)]_53_\
                                           [+3(4.61e-07)]_248_[+2(2.41e-08)]_12
47852                            1.53e-06  73_[+2(5.80e-08)]_17_[+3(6.61e-07)]_\
                                           378
40761                            1.07e-03  105_[+2(8.41e-05)]_323_\
                                           [+1(1.09e-06)]_44
49411                            2.46e-01  500
49428                            6.07e-02  247_[+1(6.73e-06)]_241
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************