********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/298/298.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37404                    1.0000    500  47337                    1.0000    500
34944                    1.0000    500  39221                    1.0000    500
41650                    1.0000    500  34180                    1.0000    500
37405                    1.0000    500  44894                    1.0000    500
39667                    1.0000    500  54883                    1.0000    500
38343                    1.0000    500  48621                    1.0000    500
35507                    1.0000    500  33589                    1.0000    500
39364                    1.0000    500  45429                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/298/298.seqs.fa -oc motifs/298 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.227 G 0.223 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.227 G 0.223 T 0.281
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   5  llr = 107  E-value = 1.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::a::2::::::a8::826::
pos.-specific     C  62:4::86:6:::::4:8:::
probability       G  4:::::::::aa:::22::aa
matrix            T  :8:6a824a4:::2a4::4::

         bits    2.2           **       **
                 1.9   *       ***      **
                 1.7   * *   * *** *    **
                 1.5   * *   * *** *    **
Relative         1.3   * * * * *** * ** **
Entropy          1.1 *************** ** **
(31.0 bits)      0.9 *************** *****
                 0.6 *************** *****
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CTATTTCCTCGGAATCACAGG
consensus            GC C ATT T   T TGAT
sequence                            G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
34944                       119  2.63e-12 CGTGGCCTCC CTACTTCCTCGGAATCACTGG TGTCGAGAAC
37404                       119  2.63e-12 CGTGGCCTCC CTACTTCCTCGGAATCACTGG TGTCGAGAAC
39364                       215  2.97e-11 ATGAAATTGC GTATTTCTTTGGAATTACAGG GAGTTATGTC
39667                       475  1.57e-10 ACAAAATTGC GTATTACTTTGGAATTACAGG GAGTT
44894                       326  1.58e-09 CCGTGAGCCT CCATTTTCTCGGATTGGAAGG TCGGACGCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34944                             2.6e-12  118_[+1]_361
37404                             2.6e-12  118_[+1]_361
39364                               3e-11  214_[+1]_265
39667                             1.6e-10  474_[+1]_5
44894                             1.6e-09  325_[+1]_154
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=5
34944                    (  119) CTACTTCCTCGGAATCACTGG  1
37404                    (  119) CTACTTCCTCGGAATCACTGG  1
39364                    (  215) GTATTTCTTTGGAATTACAGG  1
39667                    (  475) GTATTACTTTGGAATTACAGG  1
44894                    (  326) CCATTTTCTCGGATTGGAAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 10.8357 E= 1.4e-001
  -897    140     84   -897
  -897    -18   -897    151
   190   -897   -897   -897
  -897     81   -897    109
  -897   -897   -897    183
   -42   -897   -897    151
  -897    181   -897    -49
  -897    140   -897     51
  -897   -897   -897    183
  -897    140   -897     51
  -897   -897    216   -897
  -897   -897    216   -897
   190   -897   -897   -897
   158   -897   -897    -49
  -897   -897   -897    183
  -897     81    -16     51
   158   -897    -16   -897
   -42    181   -897   -897
   116   -897   -897     51
  -897   -897    216   -897
  -897   -897    216   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 1.4e-001
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.200000  0.000000  0.800000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.400000  0.200000  0.400000
 0.800000  0.000000  0.200000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG][TC]A[TC]T[TA][CT][CT]T[CT]GGA[AT]T[CTG][AG][CA][AT]GG
--------------------------------------------------------------------------------

Time  2.21 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =  16  llr = 181  E-value = 7.3e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  84253121:331a:35::21
pos.-specific     C  325:1:137:32:96:86:1
probability       G  :3113:13:7:2:1:5:284
matrix            T  :13449643:45::1:33:4

         bits    2.2
                 1.9             *
                 1.7             **
                 1.5      *      **    *
Relative         1.3      *      **  * *
Entropy          1.1 *    *  **  ** ** *
(16.3 bits)      0.9 *    *  **  ** ** *
                 0.6 *  * *  **  *******
                 0.4 *  * ** *** ********
                 0.2 ******* ************
                 0.0 --------------------

Multilevel           AACATTTTCGTTACCACCGG
consensus            CGTTA  CTAA   AGTT T
sequence                 G  G  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
41650                       142  1.24e-10 ACGAAGTTTG AACAATTGCGATACCGCCGG TTGAAGTGTA
38343                       233  1.07e-07 GGTGTCTCCA AATTGTTGCACTACCACGGG CTTGGCCGGG
39221                       362  3.53e-07 TTTTTACCGT ACCTTTGTTGTTACCGCCGT ATTGCACAGA
39667                       415  5.68e-07 GATGTACCCA AACATTTGTGTCACAACCGC GAATCAACTC
44894                       291  7.95e-07 CTGCACAGTC CGCTTTATCGCAACAGCCGT ATTGACCGTG
39364                       155  1.10e-06 GATGTGCCTA AACATTTGTGCCACAATTGT GAATCAACTC
34944                       178  1.22e-06 TTCGTCGACA AGTTGTTTCGTGACTACGGG TTACTCTGGT
37404                       178  1.22e-06 TTCGTCGACA AGTTGTTTCGTGACTACGGG TTACTCTGGT
33589                       164  3.75e-06 TCTGATGAGG ACGATTACCGCTACCGCTGA CGGAGACGCT
54883                        81  4.08e-06 AATGGTAAAC ATATGTTTCAAGACCACCAG CACCTTTGTT
34180                       284  4.81e-06 TATCCAAAGG AGCAAACCCAATACCGCCGT CAACAACATT
48621                       312  5.65e-06 TGAGTTACCG ATTAATTACAATACAATCGG GGCCTTACGA
37405                       131  5.65e-06 ATTATGAAAG AAAGTTACCGCTACCGTTGT CTTCAGCATG
35507                        53  1.43e-05 AAGAGACGTT CCAATTCTTGTCACCGCCAT TGTCGCAAAC
47337                       373  1.62e-05 CAGTCAGCCC CACTCTTACATTACAGCTAG TGATTTCGAA
45429                       461  4.13e-05 CTTTGGTCCA CACAATTCTGAAAGCATCGA CTTTTGAAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41650                             1.2e-10  141_[+2]_339
38343                             1.1e-07  232_[+2]_248
39221                             3.5e-07  361_[+2]_119
39667                             5.7e-07  414_[+2]_66
44894                               8e-07  290_[+2]_190
39364                             1.1e-06  154_[+2]_326
34944                             1.2e-06  177_[+2]_303
37404                             1.2e-06  177_[+2]_303
33589                             3.7e-06  163_[+2]_317
54883                             4.1e-06  80_[+2]_400
34180                             4.8e-06  283_[+2]_197
48621                             5.6e-06  311_[+2]_169
37405                             5.6e-06  130_[+2]_350
35507                             1.4e-05  52_[+2]_428
47337                             1.6e-05  372_[+2]_108
45429                             4.1e-05  460_[+2]_20
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=16
41650                    (  142) AACAATTGCGATACCGCCGG  1
38343                    (  233) AATTGTTGCACTACCACGGG  1
39221                    (  362) ACCTTTGTTGTTACCGCCGT  1
39667                    (  415) AACATTTGTGTCACAACCGC  1
44894                    (  291) CGCTTTATCGCAACAGCCGT  1
39364                    (  155) AACATTTGTGCCACAATTGT  1
34944                    (  178) AGTTGTTTCGTGACTACGGG  1
37404                    (  178) AGTTGTTTCGTGACTACGGG  1
33589                    (  164) ACGATTACCGCTACCGCTGA  1
54883                    (   81) ATATGTTTCAAGACCACCAG  1
34180                    (  284) AGCAAACCCAATACCGCCGT  1
48621                    (  312) ATTAATTACAATACAATCGG  1
37405                    (  131) AAAGTTACCGCTACCGTTGT  1
35507                    (   53) CCAATTCTTGTCACCGCCAT  1
47337                    (  373) CACTCTTACATTACAGCTAG  1
45429                    (  461) CACAATTCTGAAAGCATCGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7696 bayes= 8.90689 E= 7.3e-002
   148     14  -1064  -1064
    71    -28     16   -117
   -52    114   -184    -17
    90  -1064   -184     64
   -10   -186     16     64
  -210  -1064  -1064    174
   -52    -86   -184    115
  -110     14     16     42
 -1064    160  -1064     15
    22  -1064    162  -1064
    22     46  -1064     42
  -110    -28    -25     83
   190  -1064  -1064  -1064
 -1064    204   -184  -1064
    22    131  -1064   -117
    90  -1064    116  -1064
 -1064    172  -1064    -17
 -1064    131    -25    -17
   -52  -1064    186  -1064
  -110   -186     97     42
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 16 E= 7.3e-002
 0.750000  0.250000  0.000000  0.000000
 0.437500  0.187500  0.250000  0.125000
 0.187500  0.500000  0.062500  0.250000
 0.500000  0.000000  0.062500  0.437500
 0.250000  0.062500  0.250000  0.437500
 0.062500  0.000000  0.000000  0.937500
 0.187500  0.125000  0.062500  0.625000
 0.125000  0.250000  0.250000  0.375000
 0.000000  0.687500  0.000000  0.312500
 0.312500  0.000000  0.687500  0.000000
 0.312500  0.312500  0.000000  0.375000
 0.125000  0.187500  0.187500  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.937500  0.062500  0.000000
 0.312500  0.562500  0.000000  0.125000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.562500  0.187500  0.250000
 0.187500  0.000000  0.812500  0.000000
 0.125000  0.062500  0.437500  0.375000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AC][AG][CT][AT][TAG]TT[TCG][CT][GA][TAC]TAC[CA][AG][CT][CT]G[GT]
--------------------------------------------------------------------------------

Time  5.04 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   5  llr = 104  E-value = 6.1e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  84:::28::4::::::4:a:2
pos.-specific     C  :46a4:::24a42::::::a:
probability       G  22::282::2:686:a6a::8
matrix            T  ::4:4::a8::::4a::::::

         bits    2.2    *      *    * * *
                 1.9    *      *    * ***
                 1.7    *   *  *   ** ***
                 1.5    *   *  * * ** ***
Relative         1.3 *  * ***  * * ** ****
Entropy          1.1 * ** **** ***********
(30.1 bits)      0.9 * ** **** ***********
                 0.6 * ** **** ***********
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AACCCGATTACGGGTGGGACG
consensus            GCT TAG CC CCT  A   A
sequence              G  G    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
34944                       246  3.73e-12 AAGGACTTGA ACCCTGATTCCGGTTGGGACG GACTTACTAC
37404                       246  3.73e-12 AAGGACTTGA ACCCTGATTCCGGTTGGGACG GACTTACTAC
39364                        24  3.13e-11 TGCATTTGAC AATCCGATTACCGGTGAGACG TTTATTTTGG
33589                       275  1.47e-09 GCATGGCTGG AGTCGAATTGCGCGTGGGACG TCCCGGTGTG
39667                       282  2.03e-09 TGCGTTCGAC GACCCGGTCACCGGTGAGACA TTCATTTTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34944                             3.7e-12  245_[+3]_234
37404                             3.7e-12  245_[+3]_234
39364                             3.1e-11  23_[+3]_456
33589                             1.5e-09  274_[+3]_205
39667                               2e-09  281_[+3]_198
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
34944                    (  246) ACCCTGATTCCGGTTGGGACG  1
37404                    (  246) ACCCTGATTCCGGTTGGGACG  1
39364                    (   24) AATCCGATTACCGGTGAGACG  1
33589                    (  275) AGTCGAATTGCGCGTGGGACG  1
39667                    (  282) GACCCGGTCACCGGTGAGACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 10.8357 E= 6.1e-001
   158   -897    -16   -897
    58     81    -16   -897
  -897    140   -897     51
  -897    213   -897   -897
  -897     81    -16     51
   -42   -897    184   -897
   158   -897    -16   -897
  -897   -897   -897    183
  -897    -18   -897    151
    58     81    -16   -897
  -897    213   -897   -897
  -897     81    142   -897
  -897    -18    184   -897
  -897   -897    142     51
  -897   -897   -897    183
  -897   -897    216   -897
    58   -897    142   -897
  -897   -897    216   -897
   190   -897   -897   -897
  -897    213   -897   -897
   -42   -897    184   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 6.1e-001
 0.800000  0.000000  0.200000  0.000000
 0.400000  0.400000  0.200000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.200000  0.400000
 0.200000  0.000000  0.800000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.000000  0.800000
 0.400000  0.400000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.000000  0.600000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AG][ACG][CT]C[CTG][GA][AG]T[TC][ACG]C[GC][GC][GT]TG[GA]GAC[GA]
--------------------------------------------------------------------------------

Time  7.39 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37404                            1.59e-18  118_[+1(2.63e-12)]_38_\
    [+2(1.22e-06)]_48_[+3(3.73e-12)]_137_[+2(9.67e-05)]_77
47337                            1.35e-02  230_[+1(9.85e-05)]_121_\
    [+2(1.62e-05)]_108
34944                            1.59e-18  118_[+1(2.63e-12)]_38_\
    [+2(1.22e-06)]_48_[+3(3.73e-12)]_137_[+2(9.67e-05)]_77
39221                            1.77e-03  361_[+2(3.53e-07)]_119
41650                            2.78e-06  141_[+2(1.24e-10)]_339
34180                            1.37e-02  283_[+2(4.81e-06)]_197
37405                            3.52e-02  130_[+2(5.65e-06)]_350
44894                            4.11e-09  290_[+2(7.95e-07)]_15_\
    [+1(1.58e-09)]_154
39667                            1.57e-14  281_[+3(2.03e-09)]_112_\
    [+2(5.68e-07)]_40_[+1(1.57e-10)]_5
54883                            1.41e-02  80_[+2(4.08e-06)]_400
38343                            4.80e-04  39_[+2(4.97e-05)]_173_\
    [+2(1.07e-07)]_248
48621                            1.24e-02  311_[+2(5.65e-06)]_169
35507                            7.26e-02  52_[+2(1.43e-05)]_428
33589                            2.74e-07  163_[+2(3.75e-06)]_91_\
    [+3(1.47e-09)]_205
39364                            1.13e-16  23_[+3(3.13e-11)]_110_\
    [+2(1.10e-06)]_40_[+1(2.97e-11)]_265
45429                            1.89e-01  460_[+2(4.13e-05)]_20
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************