********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/103/103.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
25024                    1.0000    500  262016                   1.0000    500
268358                   1.0000    500  268931                   1.0000    500
31160                    1.0000    500  32226                    1.0000    500
33915                    1.0000    500  35424                    1.0000    500
35982                    1.0000    500  36367                    1.0000    500
3756                     1.0000    500  6422                     1.0000    500
6667                     1.0000    500  7037                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/103/103.seqs.fa -oc motifs/103 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.239 G 0.236 T 0.264
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.239 G 0.236 T 0.264
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 15 sites = 9 llr = 115 E-value = 3.4e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :a:11:9:6:24::a
pos.-specific     C  3:13::1::44::7:
probability       G  7:92:a:3343:a1:
matrix            T  :::39::711:6:2:

         bits    2.1      *      *
                 1.9  *   *      * *
                 1.7  **  *      * *
                 1.5  ** ***     * *
Relative         1.3 *** ***     * *
Entropy          1.0 *** ****    * *
(18.4 bits)      0.8 *** ****   ****
                 0.6 *** ****** ****
                 0.4 *** ***********
                 0.2 ***************
                 0.0 ---------------

Multilevel           GAGCTGATACCTGCA
consensus            C  T   GGGGA T
sequence                G      A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------
25024                       264  4.24e-08 AGTGGAAAGA GAGTTGATGCCAGCA GTGACAAAAG
35424                       262  9.58e-08 CACCGATGGC GAGATGATAGCTGCA AAAGATTTGG
268931                      202  1.79e-07 GAGATCGAGG CAGCTGAGGCCTGCA CATATGATGG
32226                        82  2.66e-07 TTGCATGTGT GAGGTGAGAGAAGCA GGGATCGATC
6667                        290  6.71e-07 ACTACTAGGT GAGCTGATGGGAGGA AGAGGGTATT
35982                        50  1.04e-06 TGCTGACAGA CAGTAGATACGTGCA TTAGAAAAGT
262016                      233  1.58e-06 TACCGCTGGT GAGTTGCTATCTGCA TGCACTCTGA
6422                        231  2.29e-06 CGATCGGTGA GAGGTGAGTGGTGTA GGATTGTGAG
3756                         66  4.16e-06 CTGTCGTCAA CACCTGATACAAGTA ACGTCGTACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25024                             4.2e-08  263_[+1]_222
35424                             9.6e-08  261_[+1]_224
268931                            1.8e-07  201_[+1]_284
32226                             2.7e-07  81_[+1]_404
6667                              6.7e-07  289_[+1]_196
35982                               1e-06  49_[+1]_436
262016                            1.6e-06  232_[+1]_253
6422                              2.3e-06  230_[+1]_255
3756                              4.2e-06  65_[+1]_420
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=9
25024                    (  264) GAGTTGATGCCAGCA  1
35424                    (  262) GAGATGATAGCTGCA  1
268931                   (  202) CAGCTGAGGCCTGCA  1
32226                    (   82) GAGGTGAGAGAAGCA  1
6667                     (  290) GAGCTGATGGGAGGA  1
35982                    (   50) CAGTAGATACGTGCA  1
262016                   (  233) GAGTTGCTATCTGCA  1
6422                     (  231) GAGGTGAGTGGTGTA  1
3756                     (   66) CACCTGATACAAGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6804 bayes= 9.6948 E= 3.4e+000
  -982     48    150   -982
   194   -982   -982   -982
  -982   -110    191   -982
  -123     48     -9     33
  -123   -982   -982    175
  -982   -982    208   -982
   177   -110   -982   -982
  -982   -982     50    133
   109   -982     50   -125
  -982     90     91   -125
   -23     90     50   -982
    77   -982   -982    107
  -982   -982    208   -982
  -982    148   -108    -25
   194   -982   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 9 E= 3.4e+000
 0.000000  0.333333  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.111111  0.888889  0.000000
 0.111111  0.333333  0.222222  0.333333
 0.111111  0.000000  0.000000  0.888889
 0.000000  0.000000  1.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.555556  0.000000  0.333333  0.111111
 0.000000  0.444444  0.444444  0.111111
 0.222222  0.444444  0.333333  0.000000
 0.444444  0.000000  0.000000  0.555556
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.111111  0.222222
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[GC]AG[CTG]TGA[TG][AG][CG][CGA][TA]G[CT]A
--------------------------------------------------------------------------------

Time  1.68 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 20 sites = 7 llr = 117 E-value = 4.8e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::31:33:::::11:1::17
pos.-specific     C  ::1:1:::111:44631:1:
probability       G  :a1:9::a::91:44:6:73
matrix            T  a:49:77:99:94::63a::

         bits    2.1  *     *
                 1.9 **     *         *
                 1.7 **     *         *
                 1.5 **  *  *  *      *
Relative         1.3 ** **  *****     *
Entropy          1.0 ** *********  *  * *
(24.1 bits)      0.8 ** *********  *  ***
                 0.6 ** *****************
                 0.4 ** *****************
                 0.2 ********************
                 0.0 --------------------

Multilevel           TGTTGTTGTTGTCCCTGTGA
consensus              A  AA     TGGCT  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            --------------------
268358                       98  1.26e-11 TTGATATCCA TGATGTTGTTGTTGCTGTGA AGAAGGGAGG
268931                      279  6.86e-11 AGAGGCTGGT TGTTGTTGTTGTCGGTGTGG TTGTGGAGGC
35424                        21  6.52e-09 AGGTGAATAA TGTTGAAGTTGTACGTTTGA GTTTTGCTGT
3756                        235  1.95e-08 TGGTTGTGTC TGTTGTTGTCGTCACCGTCA AGTCAACGGG
36367                       169  1.95e-08 TGTATTCCAG TGAAGAAGTTGTCCGTTTGA TAATTTAGCT
31160                       336  1.06e-07 GTGGCTCGCA TGCTCTTGCTGTTCCACTGA TCCATGACAT
6667                         59  1.40e-07 CGTTTGGCGG TGGTGTTGTTCGTGCCGTAG TCTTCTTGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
268358                            1.3e-11  97_[+2]_383
268931                            6.9e-11  278_[+2]_202
35424                             6.5e-09  20_[+2]_460
3756                              1.9e-08  234_[+2]_246
36367                             1.9e-08  168_[+2]_312
31160                             1.1e-07  335_[+2]_145
6667                              1.4e-07  58_[+2]_422
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=7
268358                   (   98) TGATGTTGTTGTTGCTGTGA  1
268931                   (  279) TGTTGTTGTTGTCGGTGTGG  1
35424                    (   21) TGTTGAAGTTGTACGTTTGA  1
3756                     (  235) TGTTGTTGTCGTCACCGTCA  1
36367                    (  169) TGAAGAAGTTGTCCGTTTGA  1
31160                    (  336) TGCTCTTGCTGTTCCACTGA  1
6667                     (   59) TGGTGTTGTTCGTGCCGTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6734 bayes= 9.7521 E= 4.8e+000
  -945   -945   -945    192
  -945   -945    208   -945
    13    -74    -72     70
   -87   -945   -945    170
  -945    -74    186   -945
    13   -945   -945    143
    13   -945   -945    143
  -945   -945    208   -945
  -945    -74   -945    170
  -945    -74   -945    170
  -945    -74    186   -945
  -945   -945    -72    170
   -87     84   -945     70
   -87     84     86   -945
  -945    126     86   -945
   -87     26   -945    111
  -945    -74    128     11
  -945   -945   -945    192
   -87    -74    160   -945
   145   -945     28   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 4.8e+000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.285714  0.142857  0.142857  0.428571
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.142857  0.857143  0.000000
 0.285714  0.000000  0.000000  0.714286
 0.285714  0.000000  0.000000  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.142857  0.428571  0.000000  0.428571
 0.142857  0.428571  0.428571  0.000000
 0.000000  0.571429  0.428571  0.000000
 0.142857  0.285714  0.000000  0.571429
 0.000000  0.142857  0.571429  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.142857  0.714286  0.000000
 0.714286  0.000000  0.285714  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
TG[TA]TG[TA][TA]GTTGT[CT][CG][CG][TC][GT]TG[AG]
--------------------------------------------------------------------------------

Time  3.39 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 14 llr = 132 E-value = 2.1e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :a341a9:a326
pos.-specific     C  9:756::4:441
probability       G  1:::4::6::41
matrix            T  :::1::1::4:1

         bits    2.1
                 1.9  *   *  *
                 1.7  *   *  *
                 1.5 **   *  *
Relative         1.3 ***  ** *
Entropy          1.0 ***  ****
(13.7 bits)      0.8 *** *****
                 0.6 *********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CACCCAAGACCA
consensus              AAG  C TG
sequence                      AA

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
6667                        266  6.80e-07 GACTCAACGA CACCCAAGAAGA AGACTACTAG
35424                       287  6.80e-07 AAAGATTTGG CACCCAAGAAGA AGGCCGCCAC
33915                        13  3.15e-06 AAAACGCAAG CACCGAAGACAA CCATCGCGAG
36367                       322  6.50e-06 CCCGTGCCTA CAAACAACACCA CAGCTCAACC
7037                        481  7.41e-06 ACGCTCACAC CACCCAACACCT CTCTAACA
268931                      362  7.41e-06 GCCGACAGCC GACCCAAGACGA AGGGAAGAAG
32226                       393  1.05e-05 ACACATCTGT CAACCAACACAA CCAGAACGCG
35982                       350  1.33e-05 CGGTTCCTCA CACACAACATCC TTCTTCATAG
31160                       147  2.01e-05 GAAGGCACTG CACCAAACATCA AACTAGCGGT
262016                      166  2.83e-05 CAACCTCTTC CACAGATGATGA ATTCGCCGTG
25024                       306  2.83e-05 AAGTTGGAAG CAATCAACATGA GTTTGATGCA
268358                      489  4.45e-05 AGCATACTGC CACAGAAGAAAC
6422                        489  1.29e-04 CTCCTGCAGC GACAGAAGAACG
3756                        352  2.06e-04 TATCACATTA CAATGATGATCT CAATATGGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6667                              6.8e-07  265_[+3]_223
35424                             6.8e-07  286_[+3]_202
33915                             3.1e-06  12_[+3]_476
36367                             6.5e-06  321_[+3]_167
7037                              7.4e-06  480_[+3]_8
268931                            7.4e-06  361_[+3]_127
32226                             1.1e-05  392_[+3]_96
35982                             1.3e-05  349_[+3]_139
31160                               2e-05  146_[+3]_342
262016                            2.8e-05  165_[+3]_323
25024                             2.8e-05  305_[+3]_183
268358                            4.4e-05  488_[+3]
6422                              0.00013  488_[+3]
3756                              0.00021  351_[+3]_137
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=14
6667                     (  266) CACCCAAGAAGA  1
35424                    (  287) CACCCAAGAAGA  1
33915                    (   13) CACCGAAGACAA  1
36367                    (  322) CAAACAACACCA  1
7037                     (  481) CACCCAACACCT  1
268931                   (  362) GACCCAAGACGA  1
32226                    (  393) CAACCAACACAA  1
35982                    (  350) CACACAACATCC  1
31160                    (  147) CACCAAACATCA  1
262016                   (  166) CACAGATGATGA  1
25024                    (  306) CAATCAACATGA  1
268358                   (  489) CACAGAAGAAAC  1
6422                     (  489) GACAGAAGAACG  1
3756                     (  352) CAATGATGATCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 10.1548 E= 2.1e+001
 -1045    184    -72  -1045
   194  -1045  -1045  -1045
    13    158  -1045  -1045
    45    107  -1045    -89
  -187    126     60  -1045
   194  -1045  -1045  -1045
   171  -1045  -1045    -89
 -1045     84    128  -1045
   194  -1045  -1045  -1045
    13     58  -1045     43
   -28     84     60  -1045
   130    -74   -172    -89
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 2.1e+001
 0.000000  0.857143  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.714286  0.000000  0.000000
 0.357143  0.500000  0.000000  0.142857
 0.071429  0.571429  0.357143  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.428571  0.571429  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.357143  0.000000  0.357143
 0.214286  0.428571  0.357143  0.000000
 0.642857  0.142857  0.071429  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
CA[CA][CA][CG]AA[GC]A[CTA][CGA]A
--------------------------------------------------------------------------------

Time  5.23 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25024                            1.38e-05  263_[+1(4.24e-08)]_27_\
    [+3(2.83e-05)]_183
262016                           8.92e-05  165_[+3(2.83e-05)]_26_\
    [+1(9.22e-05)]_14_[+1(1.58e-06)]_253
268358                           8.21e-09  97_[+2(1.26e-11)]_371_\
    [+3(4.45e-05)]
268931                           5.73e-12  39_[+2(6.74e-07)]_142_\
    [+1(1.79e-07)]_62_[+2(6.86e-11)]_63_[+3(7.41e-06)]_127
31160                            2.67e-05  146_[+3(2.01e-05)]_177_\
    [+2(1.06e-07)]_145
32226                            6.67e-05  81_[+1(2.66e-07)]_296_\
    [+3(1.05e-05)]_96
33915                            1.34e-02  12_[+3(3.15e-06)]_476
35424                            2.43e-11  20_[+2(6.52e-09)]_221_\
    [+1(9.58e-08)]_10_[+3(6.80e-07)]_24_[+2(6.40e-05)]_158
35982                            2.47e-04  49_[+1(1.04e-06)]_285_\
    [+3(1.33e-05)]_139
36367                            4.14e-06  168_[+2(1.95e-08)]_133_\
    [+3(6.50e-06)]_167
3756                             4.05e-07  65_[+1(4.16e-06)]_154_\
    [+2(1.95e-08)]_246
6422                             2.32e-04  160_[+2(7.40e-05)]_50_\
    [+1(2.29e-06)]_255
6667                             2.60e-09  58_[+2(1.40e-07)]_187_\
    [+3(6.80e-07)]_12_[+1(6.71e-07)]_196
7037                             4.42e-02  480_[+3(7.41e-06)]_8
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************