********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/169/169.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11109                    1.0000    500  12602                    1.0000    500
21799                    1.0000    500  22898                    1.0000    500
22946                    1.0000    500  2587                     1.0000    500
261384                   1.0000    500  263409                   1.0000    500
33225                    1.0000    500  3788                     1.0000    500
6447                     1.0000    500  6771                     1.0000    500
9707                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/169/169.seqs.fa -oc motifs/169 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.229 G 0.244 T 0.259
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.229 G 0.244 T 0.259
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   9  llr = 135  E-value = 2.2e+000
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :43::93:33:322293::71 pos.-specific C 13:19::61:6::111::12: probability G 9169:172:7:1677:7a9:7 matrix T :11:1::26:462::::::12 bits 2.1 * 1.9 * 1.7 * * 1.5 * *** * ** Relative 1.3 * *** * ** Entropy 1.1 * **** ** **** (21.7 bits) 0.9 * **** ** ****** * 0.6 * ******************* 0.4 * ******************* 0.2 ********************* 0.0 --------------------- Multilevel GAGGCAGCTGCTGGGAGGGAG consensus CA AGAATAAAA A CT sequence T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 261384 370 2.54e-09 CGAGAGGCGT GCGGCAATTGTTGCGAGGGAG AGACGGTACT 22898 190 1.21e-08 GATTGAGGTG GCGGCAGTTGCAAGCAAGGAG GTGGCGGCGA 9707 346 1.64e-08 GACGACCGAC GATGCAACAGCATGGAGGGCG CTCCAAACAA 2587 219 2.65e-08 AGGGAAGGAT GAGGCAGGAGCGAGGAGGGAA CCACCGGCCG 263409 157 3.18e-08 ACAAGTAGAG GCAGCAGCTATTGGAAAGGCT TCTCCATCGT 12602 354 8.14e-08 GCTGATTGTC GGAGCAGCAATTTAAAGGGAG GTGAACCGTC 3788 132 1.20e-07 GGAAAAGTGG CAGCCAGCCGCTGGGCGGGAG ACGGGTGGCC 6771 192 1.30e-07 CAGACATTGT GTGGTGGCTGTTGGGAGGGTG CATCGTCTGA 6447 241 4.38e-07 AGCACAAAGC GAAGCAAGTACAGAGAAGCAT CGGTCAATGC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 261384 2.5e-09 369_[+1]_110 22898 1.2e-08 189_[+1]_290 9707 1.6e-08 345_[+1]_134 
2587 2.6e-08 218_[+1]_261 263409 3.2e-08 156_[+1]_323 12602 8.1e-08 353_[+1]_126 3788 1.2e-07 131_[+1]_348 6771 1.3e-07 191_[+1]_288 6447 4.4e-07 240_[+1]_239 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=9 261384 ( 370) GCGGCAATTGTTGCGAGGGAG 1 22898 ( 190) GCGGCAGTTGCAAGCAAGGAG 1 9707 ( 346) GATGCAACAGCATGGAGGGCG 1 2587 ( 219) GAGGCAGGAGCGAGGAGGGAA 1 263409 ( 157) GCAGCAGCTATTGGAAAGGCT 1 12602 ( 354) GGAGCAGCAATTTAAAGGGAG 1 3788 ( 132) CAGCCAGCCGCTGGGCGGGAG 1 6771 ( 192) GTGGTGGCTGTTGGGAGGGTG 1 6447 ( 241) GAAGCAAGTACAGAGAAGCAT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 6240 bayes= 9.56981 E= 2.2e+000 -982 -104 187 -982 73 54 -113 -122 31 -982 119 -122 -982 -104 187 -982 -982 195 -982 -122 173 -982 -113 -982 31 -982 145 -982 -982 128 -13 -22 31 -104 -982 110 31 -982 145 -982 -982 128 -982 78 31 -982 -113 110 -27 -982 119 -22 -27 -104 145 -982 -27 -104 145 -982 173 -104 -982 -982 31 -982 145 -982 -982 -982 204 -982 -982 -104 187 -982 131 -4 -982 -122 -127 -982 145 -22 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 2.2e+000 0.000000 0.111111 0.888889 0.000000 0.444444 0.333333 0.111111 0.111111 0.333333 0.000000 0.555556 0.111111 0.000000 0.111111 
0.888889 0.000000 0.000000 0.888889 0.000000 0.111111 0.888889 0.000000 0.111111 0.000000 0.333333 0.000000 0.666667 0.000000 0.000000 0.555556 0.222222 0.222222 0.333333 0.111111 0.000000 0.555556 0.333333 0.000000 0.666667 0.000000 0.000000 0.555556 0.000000 0.444444 0.333333 0.000000 0.111111 0.555556 0.222222 0.000000 0.555556 0.222222 0.222222 0.111111 0.666667 0.000000 0.222222 0.111111 0.666667 0.000000 0.888889 0.111111 0.000000 0.000000 0.333333 0.000000 0.666667 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.111111 0.888889 0.000000 0.666667 0.222222 0.000000 0.111111 0.111111 0.000000 0.666667 0.222222 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- G[AC][GA]GCA[GA][CGT][TA][GA][CT][TA][GAT][GA][GA]A[GA]GG[AC][GT] -------------------------------------------------------------------------------- Time 1.87 secs. 
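
Note on reading the two Motif 1 matrices above: the log-odds entries appear to be the letter probabilities divided by the background frequencies, taken as log base 2 and scaled by 100. The Python sketch below checks that reading against the first matrix column. The function name `log_odds_row` is hypothetical and not part of MEME, and the treatment of zero-probability cells (reported as -982 in this file, presumably via a pseudocount) is deliberately left out.

```python
import math

# Background letter frequencies as printed in the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.268, "C": 0.229, "G": 0.244, "T": 0.259}


def log_odds_row(probs, background=BACKGROUND, scale=100):
    """Return scale * log2(p / background) per letter, rounded to an integer.

    Assumption: this mirrors how the log-odds matrix relates to the
    letter-probability matrix. Cells with probability 0 are returned as
    None because the pseudocount MEME applies there is not modelled here.
    """
    row = {}
    for letter, p in probs.items():
        if p > 0:
            row[letter] = round(scale * math.log2(p / background[letter]))
        else:
            row[letter] = None
    return row


# First column of the Motif 1 letter-probability matrix (order A, C, G, T).
first_column = {"A": 0.0, "C": 0.111111, "G": 0.888889, "T": 0.0}
print(log_odds_row(first_column))
```

For this column the computed C and G values land within one unit of the reported -104 and 187; small differences can come from the background frequencies being printed to only three decimals.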
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 8 llr = 96 E-value = 2.3e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::41::16a6:9 pos.-specific C aa:6a4:4:4a: probability G :::::65::::1 matrix T ::63::4::::: bits 2.1 ** * * 1.9 ** * * * 1.7 ** * * * 1.5 ** * * * Relative 1.3 ** * * ** Entropy 1.1 *** ** ***** (17.3 bits) 0.9 ****** ***** 0.6 ************ 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel CCTCCGGAAACA consensus AT CTC C sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 22898 355 1.89e-07 TTTACCTATT CCTCCGTAAACA CGTCACAAAA 261384 82 2.36e-07 CAATGTAGGT CCTCCCGAAACA GCAACTTCAT 33225 189 1.58e-06 TTCAACCCGA CCATCGGAAACA CATCTTTAGC 6447 164 1.66e-06 AGAAACAACA CCTTCGGCACCA ACAACATGAA 12602 452 2.18e-06 AAAGACTCAG CCACCCTCAACA GCCACCGAAC 9707 16 2.70e-06 GGGAATAGAG CCTACGGAACCA ATTGTACTAG 21799 414 4.31e-06 TCCTCCTTCT CCTCCGTCAACG TCGAGGTGGC 2587 475 5.53e-06 CGTAGCAAAC CCACCCAAACCA ACGACCGGAC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 22898 
1.9e-07 354_[+2]_134 261384 2.4e-07 81_[+2]_407 33225 1.6e-06 188_[+2]_300 6447 1.7e-06 163_[+2]_325 12602 2.2e-06 451_[+2]_37 9707 2.7e-06 15_[+2]_473 21799 4.3e-06 413_[+2]_75 2587 5.5e-06 474_[+2]_14 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=8 22898 ( 355) CCTCCGTAAACA 1 261384 ( 82) CCTCCCGAAACA 1 33225 ( 189) CCATCGGAAACA 1 6447 ( 164) CCTTCGGCACCA 1 12602 ( 452) CCACCCTCAACA 1 9707 ( 16) CCTACGGAACCA 1 21799 ( 414) CCTCCGTCAACG 1 2587 ( 475) CCACCCAAACCA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.63231 E= 2.3e+001 -965 212 -965 -965 -965 212 -965 -965 48 -965 -965 127 -110 145 -965 -5 -965 212 -965 -965 -965 71 136 -965 -110 -965 104 53 122 71 -965 -965 190 -965 -965 -965 122 71 -965 -965 -965 212 -965 -965 170 -965 -96 -965 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 2.3e+001 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.375000 0.000000 0.000000 0.625000 0.125000 0.625000 0.000000 0.250000 0.000000 1.000000 0.000000 0.000000 0.000000 0.375000 0.625000 0.000000 0.125000 0.000000 0.500000 0.375000 0.625000 0.375000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.625000 0.375000 0.000000 
0.000000 0.000000 1.000000 0.000000 0.000000 0.875000 0.000000 0.125000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- CC[TA][CT]C[GC][GT][AC]A[AC]CA -------------------------------------------------------------------------------- Time 3.52 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 13 llr = 159 E-value = 3.9e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 1543:1:12::1::321121: pos.-specific C 552:75:251a:521636::7 probability G 51:3::222::::34:433:2 matrix T ::54358529:955222:592 bits 2.1 * 1.9 * 1.7 * 1.5 *** * Relative 1.3 * * *** * Entropy 1.1 * * **** * (17.7 bits) 0.9 * * * **** * ** 0.6 ** *** **** * * ** 0.4 ************** * **** 0.2 ********************* 0.0 --------------------- Multilevel CATTCCTTCTCTCTGCGCTTC consensus GCAATTGG TGAACGG sequence G CT T A -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 12602 259 2.95e-12 GATTGCGAAT CCATCCTTCTCTCTGCGCTTC CTTACCCTTC 21799 239 9.14e-08 ATTCACCCGT CATGCCTGCTCTTGGCGCATT TCGTGCATTG 22898 129 1.48e-07 TAGTGTTGGT GCATTCTTCTCTTGTATCGTC TACTGTACAT 22946 27 2.34e-07 CTTTGCTTTG 
GCTGCCTTCTCTCGCCAGGTC TCGTCGTCGC 11109 221 2.91e-07 CACATCATCA CATTCCTTGCCTTCGCCCTTC TTTGCAATTT 3788 345 3.24e-07 AGCGGTAACT GCAATCGTCTCTCTACTCTTG GTTCTGGTTT 9707 308 3.60e-07 ACTCCCACTC CACACTTGCTCTCTTCCATTC TCTGCCCGAC 261384 94 8.64e-07 TCCCGAAACA GCAACTTCATCTCCAAGCATC TGCTTGGCCA 6771 233 2.55e-06 AGGCTTTTCC AACTCTTGCTCTTGATGGTTC TGTCAATGTG 33225 268 3.19e-06 AGCGGCTCAC CATGTTGTATCTCTGCTCGAC GACGACGGGT 263409 103 8.56e-06 TGCATCAGAT GATACTTCTTCACTACCGGTG ATTTACTATG 6447 34 1.20e-05 ACACAAATAT GCAGTTGTGTCTTCGAGGATT GAATGTAGTT 2587 315 1.49e-05 CATTACCATT CGTTCATATTCTTTTTCCTTC CAGGGACACT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 12602 3e-12 258_[+3]_221 21799 9.1e-08 238_[+3]_241 22898 1.5e-07 128_[+3]_351 22946 2.3e-07 26_[+3]_453 11109 2.9e-07 220_[+3]_259 3788 3.2e-07 344_[+3]_135 9707 3.6e-07 307_[+3]_172 261384 8.6e-07 93_[+3]_386 6771 2.5e-06 232_[+3]_247 33225 3.2e-06 267_[+3]_212 263409 8.6e-06 102_[+3]_377 6447 1.2e-05 33_[+3]_446 2587 1.5e-05 314_[+3]_165 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=13 12602 ( 259) CCATCCTTCTCTCTGCGCTTC 1 21799 ( 239) CATGCCTGCTCTTGGCGCATT 1 22898 ( 129) GCATTCTTCTCTTGTATCGTC 1 22946 ( 27) GCTGCCTTCTCTCGCCAGGTC 1 11109 ( 221) CATTCCTTGCCTTCGCCCTTC 1 3788 ( 345) GCAATCGTCTCTCTACTCTTG 1 9707 ( 308) CACACTTGCTCTCTTCCATTC 1 261384 ( 94) GCAACTTCATCTCCAAGCATC 1 6771 ( 233) AACTCTTGCTCTTGATGGTTC 1 33225 ( 268) CATGTTGTATCTCTGCTCGAC 1 263409 ( 103) GATACTTCTTCACTACCGGTG 1 6447 
( 34) GCAGTTGTGTCTTCGAGGATT 1 2587 ( 315) CGTTCATATTCTTTTTCCTTC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 6240 bayes= 9.43532 E= 3.9e+000 -180 101 92 -1035 78 101 -166 -1035 52 -57 -1035 83 20 -1035 34 57 -1035 159 -1035 25 -180 101 -1035 83 -1035 -1035 -8 157 -180 -57 -8 106 -80 123 -66 -75 -1035 -157 -1035 183 -1035 213 -1035 -1035 -180 -1035 -1035 183 -1035 123 -1035 83 -1035 1 34 83 20 -157 66 -17 -22 142 -1035 -75 -180 43 66 -17 -180 142 34 -1035 -22 -1035 34 83 -180 -1035 -1035 183 -1035 159 -66 -75 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 13 E= 3.9e+000 0.076923 0.461538 0.461538 0.000000 0.461538 0.461538 0.076923 0.000000 0.384615 0.153846 0.000000 0.461538 0.307692 0.000000 0.307692 0.384615 0.000000 0.692308 0.000000 0.307692 0.076923 0.461538 0.000000 0.461538 0.000000 0.000000 0.230769 0.769231 0.076923 0.153846 0.230769 0.538462 0.153846 0.538462 0.153846 0.153846 0.000000 0.076923 0.000000 0.923077 0.000000 1.000000 0.000000 0.000000 0.076923 0.000000 0.000000 0.923077 0.000000 0.538462 0.000000 0.461538 0.000000 0.230769 0.307692 0.461538 0.307692 0.076923 0.384615 0.230769 0.230769 0.615385 0.000000 0.153846 0.076923 0.307692 0.384615 0.230769 0.076923 0.615385 0.307692 0.000000 0.230769 0.000000 0.307692 0.461538 0.076923 0.000000 0.000000 0.923077 0.000000 0.692308 0.153846 0.153846 -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][AC][TA][TAG][CT][CT][TG][TG]CTCT[CT][TGC][GAT][CA][GCT][CG][TGA]TC
--------------------------------------------------------------------------------




Time  5.17 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11109                            1.28e-03  220_[+3(2.91e-07)]_259
12602                            4.34e-14  258_[+3(2.95e-12)]_74_\
    [+1(8.14e-08)]_77_[+2(2.18e-06)]_37
21799                            1.33e-05  238_[+3(9.14e-08)]_154_\
    [+2(4.31e-06)]_75
22898                            1.95e-11  128_[+3(1.48e-07)]_40_\
    [+1(1.21e-08)]_144_[+2(1.89e-07)]_134
22946                            3.23e-03  26_[+3(2.34e-07)]_453
2587                             6.54e-08  218_[+1(2.65e-08)]_75_\
    [+3(1.49e-05)]_139_[+2(5.53e-06)]_14
261384                           2.89e-11  81_[+2(2.36e-07)]_[+3(8.64e-07)]_\
    255_[+1(2.54e-09)]_110
263409                           4.93e-06  102_[+3(8.56e-06)]_33_\
    [+1(3.18e-08)]_323
33225                            5.82e-05  188_[+2(1.58e-06)]_67_\
    [+3(3.19e-06)]_212
3788                             8.64e-07  131_[+1(1.20e-07)]_192_\
    [+3(3.24e-07)]_135
6447                             2.33e-07  33_[+3(1.20e-05)]_109_\
    [+2(1.66e-06)]_65_[+1(4.38e-07)]_239
6771                             5.17e-06  191_[+1(1.30e-07)]_20_\
    [+3(2.55e-06)]_247
9707                             7.06e-10  15_[+2(2.70e-06)]_280_\
    [+3(3.60e-07)]_17_[+1(1.64e-08)]_134
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
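
Note on reading the combined block diagrams: each diagram alternates gap lengths with bracketed motif occurrences written as strand, motif number, and position p-value, and a trailing backslash marks a continuation line. The Python sketch below turns one diagram string into 1-based start positions, so the result can be compared with the Start column of the per-motif site tables. The names `parse_diagram`, `MOTIF_WIDTHS`, and `TOKEN` are hypothetical helpers, not part of MEME or MAST; the widths are copied from the three MOTIF header lines of this file.

```python
import re

# Motif widths as reported in the MOTIF header lines of this file.
MOTIF_WIDTHS = {1: 21, 2: 12, 3: 21}

# A token is either a gap length or a bracketed occurrence such as
# [+3(2.95e-12)] (strand, motif number, position p-value).
TOKEN = re.compile(
    r"(?P<gap>\d+)"
    r"|\[(?P<strand>[+-])(?P<motif>\d+)\((?P<pvalue>[^)]+)\)\]"
)


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, strand, start, pvalue), ...], total_length).

    Starts are 1-based. A missing gap between two occurrences (written
    as a bare "_") is treated as a gap of zero.
    """
    # Drop line-continuation backslashes and any whitespace around them.
    diagram = re.sub(r"[\\\s]", "", diagram)
    offset = 0  # number of sequence positions consumed so far
    sites = []
    for tok in TOKEN.finditer(diagram):
        if tok.group("gap") is not None:
            offset += int(tok.group("gap"))
        else:
            motif = int(tok.group("motif"))
            sites.append(
                (motif, tok.group("strand"), offset + 1, float(tok.group("pvalue")))
            )
            offset += widths[motif]
    return sites, offset


# Diagram for sequence 12602, with its continuation line joined.
sites, total = parse_diagram(
    "258_[+3(2.95e-12)]_74_\\ [+1(8.14e-08)]_77_[+2(2.18e-06)]_37"
)
for motif, strand, start, pvalue in sites:
    print(motif, strand, start, pvalue)
print("sequence length:", total)
```

For sequence 12602 this gives starts 259, 354, and 452 for motifs 3, 1, and 2, matching the Start column of the site tables, and the consumed length adds up to the 500 positions listed in the training set.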