********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/38/38.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
22414                    1.0000    500  23208                    1.0000    500
23592                    1.0000    500  23801                    1.0000    500
263878                   1.0000    500  33407                    1.0000    500
33603                    1.0000    500  37562                    1.0000    500
5075                     1.0000    500  5802                     1.0000    500
8469                     1.0000    500
********************************************************************************


********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/38/38.seqs.fa -oc motifs/38 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.254 C 0.247 G 0.230 T 0.269
Background letter frequencies (from dataset with add-one prior applied):
A 0.254 C 0.247 G 0.230 T 0.269
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 19 sites = 11 llr = 143 E-value = 1.6e-003
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :3411313a1:9291:75: pos.-specific C 55479357:4a:6:4a258 probability G 121::21::3::111:1:2 matrix T 4:22:33::3:11:5:::: bits 2.1 1.9 * * * 1.7 * * * 1.5 * * ** * * Relative 1.3 * * ** * * * Entropy 1.1 * ** ** * * ** (18.7 bits) 0.8 ** ** ** * **** 0.6 ** ** ** ** * **** 0.4 ** ** *** **** **** 0.2 ***** ************* 0.0 ------------------- Multilevel CCACCACCACCACATCAAC consensus TAC CTA G C C sequence T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------- 33603 438 3.00e-11 TCCTCCGCCG CCCCCCCCACCACATCACC CCCCAACCCT 8469 474 1.85e-08 CACCGCACAA TCTCCATCAGCACACCACC CTCTTACC 5075 383 2.69e-07 CACATCCTCC TGTCCTGCAGCACACCACC TAGATATATA 23208 292 2.96e-07 CGCAGCGTTC CCCTCACCAGCAAACCAAG TCGCAGTCGA 263878 297 4.62e-07 CGGGTTGGTG CCACCGCCACCAGAGCCAC GTCTGTTCAG 23592 30 6.46e-07 CGACCTCTCC TCCACCCAATCAAACCAAC AACAAGTTGT 5802 353 1.03e-06 ACACAACTCT TCGTCTTCATCACATCACG ACCATTGTAT 22414 162 1.20e-06 GTTGGCGGTG CAACCGCAATCTCATCCAC GGCAGCTGCC 23801 143 1.38e-06 AGTGACCATC CACCATTCACCACGTCACC CCTTCCTTCG 33407 68 1.59e-06 AACAGAGCAG CAACCACCACCATAACGAC AGCAGTGGCG 37562 32 2.53e-06 GCTCCGCACC GGACCCAAAACACATCAAC TCACCAATTC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 33603 
3e-11 437_[+1]_44 8469 1.8e-08 473_[+1]_8 5075 2.7e-07 382_[+1]_99 23208 3e-07 291_[+1]_190 263878 4.6e-07 296_[+1]_185 23592 6.5e-07 29_[+1]_452 5802 1e-06 352_[+1]_129 22414 1.2e-06 161_[+1]_320 23801 1.4e-06 142_[+1]_339 33407 1.6e-06 67_[+1]_414 37562 2.5e-06 31_[+1]_450 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=19 seqs=11 33603 ( 438) CCCCCCCCACCACATCACC 1 8469 ( 474) TCTCCATCAGCACACCACC 1 5075 ( 383) TGTCCTGCAGCACACCACC 1 23208 ( 292) CCCTCACCAGCAAACCAAG 1 263878 ( 297) CCACCGCCACCAGAGCCAC 1 23592 ( 30) TCCACCCAATCAAACCAAC 1 5802 ( 353) TCGTCTTCATCACATCACG 1 22414 ( 162) CAACCGCAATCTCATCCAC 1 23801 ( 143) CACCATTCACCACGTCACC 1 33407 ( 68) CAACCACCACCATAACGAC 1 37562 ( 32) GGACCCAAAACACATCAAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 19 n= 5302 bayes= 8.90989 E= 1.6e-003 -1010 114 -134 43 10 114 -34 -1010 52 55 -134 -56 -148 155 -1010 -56 -148 188 -1010 -1010 10 14 -34 2 -148 114 -134 2 10 155 -1010 -1010 198 -1010 -1010 -1010 -148 55 25 2 -1010 201 -1010 -1010 184 -1010 -1010 -156 -48 136 -134 -156 184 -1010 -134 -1010 -148 55 -134 76 -1010 201 -1010 -1010 152 -44 -134 -1010 110 88 -1010 -1010 -1010 172 -34 -1010 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 19 nsites= 
11 E= 1.6e-003 0.000000 0.545455 0.090909 0.363636 0.272727 0.545455 0.181818 0.000000 0.363636 0.363636 0.090909 0.181818 0.090909 0.727273 0.000000 0.181818 0.090909 0.909091 0.000000 0.000000 0.272727 0.272727 0.181818 0.272727 0.090909 0.545455 0.090909 0.272727 0.272727 0.727273 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.090909 0.363636 0.272727 0.272727 0.000000 1.000000 0.000000 0.000000 0.909091 0.000000 0.000000 0.090909 0.181818 0.636364 0.090909 0.090909 0.909091 0.000000 0.090909 0.000000 0.090909 0.363636 0.090909 0.454545 0.000000 1.000000 0.000000 0.000000 0.727273 0.181818 0.090909 0.000000 0.545455 0.454545 0.000000 0.000000 0.000000 0.818182 0.181818 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [CT][CA][AC]CC[ACT][CT][CA]A[CGT]CACA[TC]CA[AC]C -------------------------------------------------------------------------------- Time 1.27 secs. 
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 20 sites = 7 llr = 117 E-value = 2.4e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :9::a:a3:3::13::::41 pos.-specific C 6:13:1:::11:1::1::1: probability G 4:77:9:3a39:16311a39 matrix T :11::::4:3:a61779:1: bits 2.1 * * 1.9 * * * * * 1.7 * * * * * 1.5 *** * ** * * Relative 1.3 * **** * ** ** * Entropy 1.1 ** **** * ** * ** * (24.1 bits) 0.8 ******* * ** **** * 0.6 ******* * ** ***** * 0.4 ********* ** ***** * 0.2 ********* ********** 0.0 -------------------- Multilevel CAGGAGATGAGTTGTTTGAG consensus G C A G AG G sequence G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 5075 244 3.28e-10 TTGGAGGTTG GAGGAGATGCGTTGGTTGGG GGGGAGGATA 33407 385 6.54e-10 ACGCCTACTA CAGGAGATGTGTCGTTTGCG GATCCCCTCG 22414 291 7.91e-09 ATTTTCATGT CAGCAGATGAGTTAGTGGAG TTTGTAGAAG 23592 246 1.57e-08 GGATGCTAGT GATGAGAGGGGTGGTCTGAG TTTGAGGTTA 37562 401 2.00e-08 TGGAGACGTA CTGCAGAAGTGTTGTTTGAA AGCACAATGC 23208 79 3.32e-08 AGAGTTTTTT CACGACAAGAGTTTTTTGGG AATATCCGAA 33603 278 8.54e-08 GGGGGAGGAG GAGGAGAGGGCTAATGTGTG ATCGGTGGTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams 
-------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 5075 3.3e-10 243_[+2]_237 33407 6.5e-10 384_[+2]_96 22414 7.9e-09 290_[+2]_190 23592 1.6e-08 245_[+2]_235 37562 2e-08 400_[+2]_80 23208 3.3e-08 78_[+2]_402 33603 8.5e-08 277_[+2]_203 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=20 seqs=7 5075 ( 244) GAGGAGATGCGTTGGTTGGG 1 33407 ( 385) CAGGAGATGTGTCGTTTGCG 1 22414 ( 291) CAGCAGATGAGTTAGTGGAG 1 23592 ( 246) GATGAGAGGGGTGGTCTGAG 1 37562 ( 401) CTGCAGAAGTGTTGTTTGAA 1 23208 ( 79) CACGACAAGAGTTTTTTGGG 1 33603 ( 278) GAGGAGAGGGCTAATGTGTG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 5291 bayes= 9.40372 E= 2.4e-002 -945 121 90 -945 176 -945 -945 -91 -945 -79 163 -91 -945 21 163 -945 198 -945 -945 -945 -945 -79 190 -945 198 -945 -945 -945 17 -945 31 67 -945 -945 212 -945 17 -79 31 9 -945 -79 190 -945 -945 -945 -945 189 -83 -79 -68 109 17 -945 131 -91 -945 -945 31 141 -945 -79 -68 141 -945 -945 -68 167 -945 -945 212 -945 76 -79 31 -91 -83 -945 190 -945 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 2.4e-002 0.000000 0.571429 0.428571 0.000000 0.857143 
0.000000 0.000000 0.142857 0.000000 0.142857 0.714286 0.142857 0.000000 0.285714 0.714286 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.142857 0.857143 0.000000 1.000000 0.000000 0.000000 0.000000 0.285714 0.000000 0.285714 0.428571 0.000000 0.000000 1.000000 0.000000 0.285714 0.142857 0.285714 0.285714 0.000000 0.142857 0.857143 0.000000 0.000000 0.000000 0.000000 1.000000 0.142857 0.142857 0.142857 0.571429 0.285714 0.000000 0.571429 0.142857 0.000000 0.000000 0.285714 0.714286 0.000000 0.142857 0.142857 0.714286 0.000000 0.000000 0.142857 0.857143 0.000000 0.000000 1.000000 0.000000 0.428571 0.142857 0.285714 0.142857 0.142857 0.000000 0.857143 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [CG]AG[GC]AGA[TAG]G[AGT]GTT[GA][TG]TTG[AG]G -------------------------------------------------------------------------------- Time 2.38 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 14 sites = 10 llr = 113 E-value = 1.7e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 4252:8417a:841 pos.-specific C :841a2:93:9238 probability G 6::6::5:::1::: matrix T ::11::1:::::31 bits 2.1 1.9 * * 1.7 * * 1.5 * * ** Relative 1.3 * ** * *** Entropy 1.1 ** ** ***** * (16.4 bits) 0.8 ** ** ***** * 0.6 *** ******** * 0.4 ************** 0.2 ************** 0.0 -------------- Multilevel GCAGCAGCAACAAC consensus AACA CA C CC sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------- 33407 449 3.00e-09 GGCGGCAGCA GCAGCAGCAACAAC AGGGGGGAGG 37562 472 2.94e-07 CTTAATTCAC GACGCAACAACAAC TCATAAAGAG 5075 15 3.87e-07 GCAATGATGA ACAACAACAACAAC AATAGACGCT 33603 173 1.34e-06 GGAATCTCAT GCACCAGCCACATC TCTTGTGTCG 22414 185 1.75e-06 ATCCACGGCA GCTGCCGCAACACC CGTCGCCTTC 5802 318 1.90e-06 CAGGCAGTCT ACAGCAAAAACACC CAGAATCCAT 23801 458 5.58e-06 CAAATCTTCA ACCGCCGCAACATA ACTACCGCTA 23208 37 7.43e-06 AACCGAGAAA GCCGCAGCCACCTT GAACCCGGGC 263878 486 1.63e-05 GCATAGCACC ACCACATCCACCAC A 8469 414 2.03e-05 TTTCTTTGGC GAATCAACAAGACC ACAAGCAGTG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
-------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 33407 3e-09 448_[+3]_38 37562 2.9e-07 471_[+3]_15 5075 3.9e-07 14_[+3]_472 33603 1.3e-06 172_[+3]_314 22414 1.7e-06 184_[+3]_302 5802 1.9e-06 317_[+3]_169 23801 5.6e-06 457_[+3]_29 23208 7.4e-06 36_[+3]_450 263878 1.6e-05 485_[+3]_1 8469 2e-05 413_[+3]_73 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=14 seqs=10 33407 ( 449) GCAGCAGCAACAAC 1 37562 ( 472) GACGCAACAACAAC 1 5075 ( 15) ACAACAACAACAAC 1 33603 ( 173) GCACCAGCCACATC 1 22414 ( 185) GCTGCCGCAACACC 1 5802 ( 318) ACAGCAAAAACACC 1 23801 ( 458) ACCGCCGCAACATA 1 23208 ( 37) GCCGCAGCCACCTT 1 263878 ( 486) ACCACATCCACCAC 1 8469 ( 414) GAATCAACAAGACC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 14 n= 5357 bayes= 9.31456 E= 1.7e+001 66 -997 138 -997 -34 169 -997 -997 98 69 -997 -143 -34 -131 138 -143 -997 201 -997 -997 166 -31 -997 -997 66 -997 112 -143 -134 186 -997 -997 146 28 -997 -997 198 -997 -997 -997 -997 186 -120 -997 166 -31 -997 -997 66 28 -997 16 -134 169 -997 -143 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 14 nsites= 10 E= 1.7e+001 0.400000 0.000000 
0.600000 0.000000 0.200000 0.800000 0.000000 0.000000 0.500000 0.400000 0.000000 0.100000 0.200000 0.100000 0.600000 0.100000 0.000000 1.000000 0.000000 0.000000 0.800000 0.200000 0.000000 0.000000 0.400000 0.000000 0.500000 0.100000 0.100000 0.900000 0.000000 0.000000 0.700000 0.300000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.900000 0.100000 0.000000 0.800000 0.200000 0.000000 0.000000 0.400000 0.300000 0.000000 0.300000 0.100000 0.800000 0.000000 0.100000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [GA][CA][AC][GA]C[AC][GA]C[AC]AC[AC][ACT]C -------------------------------------------------------------------------------- Time 3.67 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 22414 7.33e-10 161_[+1(1.20e-06)]_4_[+3(1.75e-06)]_\ 92_[+2(7.91e-09)]_190 23208 2.90e-09 36_[+3(7.43e-06)]_28_[+2(3.32e-08)]_\ 193_[+1(2.96e-07)]_190 23592 6.41e-08 29_[+1(6.46e-07)]_197_\ [+2(1.57e-08)]_235 23801 1.63e-04 142_[+1(1.38e-06)]_296_\ [+3(5.58e-06)]_29 263878 1.11e-04 296_[+1(4.62e-07)]_151_\ [+1(1.46e-05)]_[+3(1.63e-05)]_1 33407 2.37e-13 63_[+3(6.40e-07)]_307_\ [+2(6.54e-10)]_44_[+3(3.00e-09)]_38 33603 2.59e-13 172_[+3(1.34e-06)]_91_\ [+2(8.54e-08)]_140_[+1(3.00e-11)]_44 37562 6.63e-10 
31_[+1(2.53e-06)]_350_\ [+2(2.00e-08)]_51_[+3(2.94e-07)]_15 5075 2.26e-12 14_[+3(3.87e-07)]_215_\ [+2(3.28e-10)]_119_[+1(2.69e-07)]_99 5802 5.24e-05 317_[+3(1.90e-06)]_21_\ [+1(1.03e-06)]_129 8469 9.26e-06 413_[+3(2.03e-05)]_46_\ [+1(1.85e-08)]_8 -------------------------------------------------------------------------------- ******************************************************************************** ******************************************************************************** Stopped because nmotifs = 3 reached. ******************************************************************************** CPU: seaotter.hsd1.wa.comcast.net ********************************************************************************