********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/52/52.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
48886                    1.0000    500  43472                    1.0000    500
49751                    1.0000    500  45050                    1.0000    500
45308                    1.0000    500  45638                    1.0000    500
54494                    1.0000    500  45904                    1.0000    500
43481                    1.0000    500  44484                    1.0000    500
46444                    1.0000    500  43716                    1.0000    500
38303                    1.0000    500  33260                    1.0000    500
50289                    1.0000    500  47650                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/52/52.seqs.fa -oc motifs/52 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.225 G 0.220 T 0.287
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.225 G 0.220 T 0.287
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  12  llr = 126  E-value = 1.8e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :835a23:::19
pos.-specific     C  a273:2619:3:
probability       G  :::2:31::a::
matrix            T  :::1:3:91:71

         bits    2.2 *        *
                 2.0 *   *    *
                 1.7 *   *   **
                 1.5 *   *   ** *
Relative         1.3 **  *  *** *
Entropy          1.1 *** *  *** *
(15.1 bits)      0.9 *** * **** *
                 0.7 *** * ******
                 0.4 *** * ******
                 0.2 ***** ******
                 0.0 ------------

Multilevel           CACAAGCTCGTA
consensus              AC TA   C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
45050                       461  5.27e-08 AACAGACAGC CACAAGCTCGTA TTGCCGCCCG
45638                       192  5.08e-07 CCAGCAGGAG CAAAAGCTCGTA TCCTGTACAC
46444                       363  9.59e-07 TATTGACTGT CACGATCTCGTA CAACGGCTCT
33260                       427  4.34e-06 CAGCCTATCA CACGACCTCGCA ATCATCCATG
45904                       186  4.74e-06 ACGATGGATC CAAAAAATCGTA GTACGAGTTG
50289                       346  1.18e-05 CCGAACGAAT CACAAGCTTGCA TCATCCTTGC
43472                       338  1.18e-05 AGTGAATATT CCCCACATCGTA CGGGGTGTTA
44484                       431  1.28e-05 ACCCACGCGA CACAAAGTCGCA AACGCCGAGG
48886                       160  1.28e-05 CTTTCACTGT CACAATATCGTT TGTGAGCCCT
54494                       442  1.39e-05 CAAAGGATAC CCACAGATCGTA AATGCTTCTT
47650                       466  1.48e-05 GCTTTTGTAT CAACATCTCGAA AGAGTCTTGA
43716                       325  2.49e-05 TGTAGTACCA CACTATCCCGTA ACGCGGCCTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45050                             5.3e-08  460_[+1]_28
45638                             5.1e-07  191_[+1]_297
46444                             9.6e-07  362_[+1]_126
33260                             4.3e-06  426_[+1]_62
45904                             4.7e-06  185_[+1]_303
50289                             1.2e-05  345_[+1]_143
43472                             1.2e-05  337_[+1]_151
44484                             1.3e-05  430_[+1]_58
48886                             1.3e-05  159_[+1]_329
54494                             1.4e-05  441_[+1]_47
47650                             1.5e-05  465_[+1]_23
43716                             2.5e-05  324_[+1]_164
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=12
45050                    (  461) CACAAGCTCGTA  1
45638                    (  192) CAAAAGCTCGTA  1
46444                    (  363) CACGATCTCGTA  1
33260                    (  427) CACGACCTCGCA  1
45904                    (  186) CAAAAAATCGTA  1
50289                    (  346) CACAAGCTTGCA  1
43472                    (  338) CCCCACATCGTA  1
44484                    (  431) CACAAAGTCGCA  1
48886                    (  160) CACAATATCGTT  1
54494                    (  442) CCACAGATCGTA  1
47650                    (  466) CAACATCTCGAA  1
43716                    (  325) CACTATCCCGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 9.79456 E= 1.8e+000
 -1023    215  -1023  -1023
   164    -43  -1023  -1023
    32    157  -1023  -1023
    90     15    -40   -178
   190  -1023  -1023  -1023
   -68    -43     60     21
    32    137   -140  -1023
 -1023   -143  -1023    167
 -1023    203  -1023   -178
 -1023  -1023    218  -1023
  -168     15  -1023    121
   178  -1023  -1023   -178
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 1.8e+000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.500000  0.250000  0.166667  0.083333
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.166667  0.333333  0.333333
 0.333333  0.583333  0.083333  0.000000
 0.000000  0.083333  0.000000  0.916667
 0.000000  0.916667  0.000000  0.083333
 0.000000  0.000000  1.000000  0.000000
 0.083333  0.250000  0.000000  0.666667
 0.916667  0.000000  0.000000  0.083333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CA[CA][AC]A[GT][CA]TCG[TC]A
--------------------------------------------------------------------------------




Time  2.09 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   6  llr = 106  E-value = 2.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  75853:::3:::::2::::::
pos.-specific     C  :5::2:282:3a:222:2332
probability       G  3::52:823::::8:2:::7:
matrix            T  ::2:3a::2a7:a:77a87:8

         bits    2.2            *
                 2.0            *
                 1.7      *   * **   *
                 1.5      *** * ***  *
Relative         1.3   *  *** * ***  ** **
Entropy          1.1 **** *** *****  *****
(25.4 bits)      0.9 **** *** *****  *****
                 0.7 **** *** ************
                 0.4 **** *** ************
                 0.2 **** *** ************
                 0.0 ---------------------

Multilevel           AAAAATGCATTCTGTTTTTGT
consensus            GC GT   G C       CC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
33260                       345  2.22e-11 GCAGCTGCAA ACAGATGCCTTCTGTTTTTGT TGCATGCGGT
47650                       443  1.01e-09 TTCGGATTTG AAAAGTGCATCCTGCTTTTGT ATCAACATCT
43716                       452  4.13e-09 AAAAGTCTAG ACAATTGCATTCTGTCTCTCT CTTTGCAAAC
44484                       192  9.22e-09 ACGGCGTAGA GAAATTCCGTTCTGTTTTCGC GATAGATTTC
45050                       267  1.14e-08 GGAAGGCGCC GCAGCTGGTTTCTGTGTTTGT CGTTAAACAG
43481                       360  4.05e-08 AGAGTGTCAG AATGATGCGTCCTCATTTCCT AACGACAAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33260                             2.2e-11  344_[+2]_135
47650                               1e-09  442_[+2]_37
43716                             4.1e-09  451_[+2]_28
44484                             9.2e-09  191_[+2]_288
45050                             1.1e-08  266_[+2]_213
43481                             4.1e-08  359_[+2]_120
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
33260                    (  345) ACAGATGCCTTCTGTTTTTGT  1
47650                    (  443) AAAAGTGCATCCTGCTTTTGT  1
43716                    (  452) ACAATTGCATTCTGTCTCTCT  1
44484                    (  192) GAAATTCCGTTCTGTTTTCGC  1
45050                    (  267) GCAGCTGGTTTCTGTGTTTGT  1
43481                    (  360) AATGATGCGTCCTCATTTCCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 10.7686 E= 2.1e+002
   132   -923     60   -923
    90    115   -923   -923
   164   -923   -923    -78
    90   -923    118   -923
    32    -43    -40     21
  -923   -923   -923    180
  -923    -43    192   -923
  -923    189    -40   -923
    32    -43     60    -78
  -923   -923   -923    180
  -923     57   -923    121
  -923    215   -923   -923
  -923   -923   -923    180
  -923    -43    192   -923
   -68    -43   -923    121
  -923    -43    -40    121
  -923   -923   -923    180
  -923    -43   -923    153
  -923     57   -923    121
  -923     57    160   -923
  -923    -43   -923    153
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.1e+002
 0.666667  0.000000  0.333333  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.500000  0.000000  0.500000  0.000000
 0.333333  0.166667  0.166667  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.333333  0.166667  0.333333  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.833333  0.000000
 0.166667  0.166667  0.000000  0.666667
 0.000000  0.166667  0.166667  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.166667  0.000000  0.833333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AG][AC]A[AG][AT]TGC[AG]T[TC]CTGTTTT[TC][GC]T
--------------------------------------------------------------------------------




Time  4.18 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  19  sites =  14  llr = 161  E-value = 2.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  211:54::1:412:4:1:5
pos.-specific     C  146:11:23a:4:218:::
probability       G  1:1a1::43:11:2::18:
matrix            T  661:36a44:448652725

         bits    2.2    *     *
                 2.0    *     *
                 1.7    *  *  *
                 1.5    *  *  *
Relative         1.3    *  *  *     * *
Entropy          1.1    *  *  *  *  * *
(16.6 bits)      0.9    *  *  *  *  * **
                 0.7  *** **  *  * *****
                 0.4 **** *** **********
                 0.2 *******************
                 0.0 -------------------

Multilevel           TTCGATTTTCACTTTCTGA
consensus            AC  TA GC TTACAT TT
sequence                    CG    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
54494                        22  1.30e-09 TCGACCGCCT TTCGATTTGCTTTTACTGT TACAGTTAGT
43716                       207  1.57e-08 CAACCAAATG TCCGATTGTCACTTTCGGA TCTCTCTCTC
38303                       315  3.74e-08 CACTTATAGA TCCGTTTTCCTTTCTCTGA TCAAATTCAT
45050                       311  1.07e-07 TGGGTTCTTT TCGGTTTTGCATTTTCTGA TGTGGTAGGG
50289                       201  4.36e-07 AGTTGAAGTG TTCGTATTTCACTCATTGA AAAGCATAGT
45904                       437  8.49e-07 TGTAGACGTC TCTGAATGCCTCTGACTGT GAGACCAAGC
44484                        93  1.05e-06 GTTCCATGGC ATCGATTGTCTCTTTTTTA CGATGTCTGT
47650                       179  4.12e-06 GATAGAACCT TACGATTCTCGCAGACTGT TTAGAGGTTA
43481                       192  7.05e-06 ACATTCCCTA TCCGTATTTCAAATTTTGT CGGAGGCCAT
48886                       219  9.35e-06 TGTTCGAGAT TTGGGATGGCTTTTCCTTT GCGTTCTCAT
33260                       179  1.00e-05 CAAATAATAG ATAGCATTCCATTTACGGT GCTTCGTGTC
45638                       342  1.07e-05 AATTGAGAAC ATCGATTGACAGTGACAGT GAAGGTTTCA
49751                        70  1.47e-05 TAGGGAATGA GTAGGTTCCCTCACTCTGA GCCCTCCCCT
45308                        83  3.12e-05 ACGGAATTGA CTCGACTCGCGTTTTCATA CAGTGAAAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54494                             1.3e-09  21_[+3]_460
43716                             1.6e-08  206_[+3]_275
38303                             3.7e-08  314_[+3]_167
45050                             1.1e-07  310_[+3]_171
50289                             4.4e-07  200_[+3]_281
45904                             8.5e-07  436_[+3]_45
44484                               1e-06  92_[+3]_389
47650                             4.1e-06  178_[+3]_303
43481                             7.1e-06  191_[+3]_290
48886                             9.3e-06  218_[+3]_263
33260                               1e-05  178_[+3]_303
45638                             1.1e-05  341_[+3]_140
49751                             1.5e-05  69_[+3]_412
45308                             3.1e-05  82_[+3]_399
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=14
54494                    (   22) TTCGATTTGCTTTTACTGT  1
43716                    (  207) TCCGATTGTCACTTTCGGA  1
38303                    (  315) TCCGTTTTCCTTTCTCTGA  1
45050                    (  311) TCGGTTTTGCATTTTCTGA  1
50289                    (  201) TTCGTATTTCACTCATTGA  1
45904                    (  437) TCTGAATGCCTCTGACTGT  1
44484                    (   93) ATCGATTGTCTCTTTTTTA  1
47650                    (  179) TACGATTCTCGCAGACTGT  1
43481                    (  192) TCCGTATTTCAAATTTTGT  1
48886                    (  219) TTGGGATGGCTTTTCCTTT  1
33260                    (  179) ATAGCATTCCATTTACGGT  1
45638                    (  342) ATCGATTGACAGTGACAGT  1
49751                    (   70) GTAGGTTCCCTCACTCTGA  1
45308                    (   83) CTCGACTCGCGTTTTCATA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 7712 bayes= 8.94649 E= 2.3e+002
   -32   -165   -162    116
  -190     67  -1045     99
   -90    151    -62   -201
 -1045  -1045    218  -1045
    90   -165    -62     -1
    42   -165  -1045     99
 -1045  -1045  -1045    180
 -1045     -7     70     58
  -190     35     38     31
 -1045    215  -1045  -1045
    68  -1045    -62     58
  -190     93   -162     58
   -32  -1045  -1045    145
 -1045     -7     -4     99
    68   -165  -1045     80
 -1045    180  -1045    -42
   -90  -1045    -62    131
 -1045  -1045    183    -42
    90  -1045  -1045     80
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 14 E= 2.3e+002
 0.214286  0.071429  0.071429  0.642857
 0.071429  0.357143  0.000000  0.571429
 0.142857  0.642857  0.142857  0.071429
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.071429  0.142857  0.285714
 0.357143  0.071429  0.000000  0.571429
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.214286  0.357143  0.428571
 0.071429  0.285714  0.285714  0.357143
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.000000  0.142857  0.428571
 0.071429  0.428571  0.071429  0.428571
 0.214286  0.000000  0.000000  0.785714
 0.000000  0.214286  0.214286  0.571429
 0.428571  0.071429  0.000000  0.500000
 0.000000  0.785714  0.000000  0.214286
 0.142857  0.000000  0.142857  0.714286
 0.000000  0.000000  0.785714  0.214286
 0.500000  0.000000  0.000000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TA][TC]CG[AT][TA]T[TGC][TCG]C[AT][CT][TA][TCG][TA][CT]T[GT][AT]
--------------------------------------------------------------------------------




Time  6.28 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48886                            1.10e-03  159_[+1(1.28e-05)]_47_\
    [+3(9.35e-06)]_263
43472                            1.40e-02  337_[+1(1.18e-05)]_151
49751                            6.31e-02  69_[+3(1.47e-05)]_412
45050                            4.11e-12  266_[+2(1.14e-08)]_23_\
    [+3(1.07e-07)]_131_[+1(5.27e-08)]_28
45308                            4.66e-02  82_[+3(3.12e-05)]_399
45638                            4.28e-05  191_[+1(5.08e-07)]_138_\
    [+3(1.07e-05)]_140
54494                            2.54e-07  21_[+3(1.30e-09)]_401_\
    [+1(1.39e-05)]_12_[+3(7.99e-05)]_16
45904                            5.95e-05  185_[+1(4.74e-06)]_239_\
    [+3(8.49e-07)]_45
43481                            4.34e-06  191_[+3(7.05e-06)]_149_\
    [+2(4.05e-08)]_120
44484                            4.70e-09  92_[+3(1.05e-06)]_80_[+2(9.22e-09)]_\
    218_[+1(1.28e-05)]_58
46444                            5.22e-03  362_[+1(9.59e-07)]_126
43716                            8.34e-11  206_[+3(1.57e-08)]_99_\
    [+1(2.49e-05)]_12_[+2(1.05e-05)]_82_[+2(4.13e-09)]_28
38303                            2.65e-05  314_[+3(3.74e-08)]_167
33260                            5.17e-11  178_[+3(1.00e-05)]_147_\
    [+2(2.22e-11)]_61_[+1(4.34e-06)]_62
50289                            7.36e-05  200_[+3(4.36e-07)]_126_\
    [+1(1.18e-05)]_143
47650                            2.48e-09  178_[+3(4.12e-06)]_245_\
    [+2(1.01e-09)]_2_[+1(1.48e-05)]_23
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************