********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/375/375.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17909                    1.0000    500  47025                    1.0000    500
37460                    1.0000    500  5954                     1.0000    500
38886                    1.0000    500  48523                    1.0000    500
48732                    1.0000    500  49052                    1.0000    500
6807                     1.0000    500  16283                    1.0000    500
41470                    1.0000    500  12303                    1.0000    500
45917                    1.0000    500  12759                    1.0000    500
48406                    1.0000    500  47523                    1.0000    500
34752                    1.0000    500  47607                    1.0000    500
40317                    1.0000    500  44124                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/375/375.seqs.fa -oc motifs/375 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 20 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 10000 N= 20 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.264 C 0.255 G 0.229 T 0.252 Background letter frequencies (from dataset with add-one prior applied): A 0.264 C 0.255 G 0.229 T 0.252 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 20 sites = 13 llr = 177 E-value = 3.0e-004 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 6:71::8:5:::a3128372 pos.-specific C :5:3:1:91:12:122:51: probability G 2332292::a:8:224:115 matrix T 22:58::15:9::4522122 bits 2.1 * 1.9 * * 1.7 * ** * 1.5 ** * **** Relative 1.3 **** **** Entropy 1.1 * **** **** * (19.6 bits) 0.8 * **** **** * 0.6 *** ********* * ** 0.4 *** ********* * **** 0.2 *************** **** 0.0 -------------------- Multilevel ACATTGACAGTGATTGACAG consensus TGGC G T AGATA A sequence G C T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 44124 95 2.02e-09 TTGGGGGGTA 
ACAGTGGCTGTGATTCACAG TCTAATGTAA 41470 259 3.75e-09 GGAATCCTGC GCATTGACTGTGAAGAACAG GAATAGAAAG 45917 253 1.47e-08 ACATCTTCTG GGACTGACAGTGAACGACAG TTATCTTTTC 47523 392 1.67e-08 ATAAAATCCA ACACTGACTGTGAGTGAATA GACAACACTG 37460 313 2.14e-08 CTTGGTAGGC ATGTTGACAGTGAGTCACAA TGATCGGAAT 40317 245 2.41e-08 AGTGACTAAT TGATTGACAGTGATTGAAGG AACTGCTGGC 34752 124 3.82e-08 CGATGTAGAG TGGGTGACTGTGAGTGACAT GCTATGAAAT 49052 71 4.15e-07 CACGACCACG ACAATGGCCGTGATGAAAAG AATTTTACCA 48523 149 1.63e-06 CAGTTGCGGT ATACTGACAGTGATACTGAT TGCGGCATCT 48732 303 1.97e-06 TCTAATATCA ACATGGACTGCGAACGATAT GAAACAAAAG 47607 113 2.09e-06 CCGTTCAACA ACACTCACTGTCAATTTCTG CGCGGGGCAC 38886 78 2.65e-06 CTTGCTGTGA AGGTGGACAGTCATTTTCCG TCGTTTTTCT 6807 475 4.79e-06 AGTGAGGTAC TCGTTGGTAGTGACGAAAAA ATTAAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 44124 2e-09 94_[+1]_386 41470 3.7e-09 258_[+1]_222 45917 1.5e-08 252_[+1]_228 47523 1.7e-08 391_[+1]_89 37460 2.1e-08 312_[+1]_168 40317 2.4e-08 244_[+1]_236 34752 3.8e-08 123_[+1]_357 49052 4.2e-07 70_[+1]_410 48523 1.6e-06 148_[+1]_332 48732 2e-06 302_[+1]_178 47607 2.1e-06 112_[+1]_368 38886 2.6e-06 77_[+1]_403 6807 4.8e-06 474_[+1]_6 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=20 seqs=13 44124 ( 95) ACAGTGGCTGTGATTCACAG 1 41470 ( 259) GCATTGACTGTGAAGAACAG 1 45917 ( 253) GGACTGACAGTGAACGACAG 1 47523 ( 392) ACACTGACTGTGAGTGAATA 1 37460 ( 313) ATGTTGACAGTGAGTCACAA 1 40317 ( 245) TGATTGACAGTGATTGAAGG 1 34752 ( 124) 
TGGGTGACTGTGAGTGACAT 1 49052 ( 71) ACAATGGCCGTGATGAAAAG 1 48523 ( 149) ATACTGACAGTGATACTGAT 1 48732 ( 303) ACATGGACTGCGAACGATAT 1 47607 ( 113) ACACTCACTGTCAATTTCTG 1 38886 ( 78) AGGTGGACAGTCATTTTCCG 1 6807 ( 475) TCGTTGGTAGTGACGAAAAA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 9620 bayes= 10.0605 E= 3.0e-004 122 -1035 -58 -13 -1035 108 42 -71 139 -1035 42 -1035 -177 27 -58 87 -1035 -1035 -58 175 -1035 -173 201 -1035 154 -1035 1 -1035 -1035 185 -1035 -171 81 -173 -1035 87 -1035 -1035 212 -1035 -1035 -173 -1035 187 -1035 -73 188 -1035 192 -1035 -1035 -1035 22 -173 1 61 -177 -73 1 110 -19 -15 75 -71 154 -1035 -1035 -13 22 108 -157 -171 139 -173 -157 -71 -19 -1035 123 -13 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 13 E= 3.0e-004 0.615385 0.000000 0.153846 0.230769 0.000000 0.538462 0.307692 0.153846 0.692308 0.000000 0.307692 0.000000 0.076923 0.307692 0.153846 0.461538 0.000000 0.000000 0.153846 0.846154 0.000000 0.076923 0.923077 0.000000 0.769231 0.000000 0.230769 0.000000 0.000000 0.923077 0.000000 0.076923 0.461538 0.076923 0.000000 0.461538 0.000000 0.000000 1.000000 0.000000 0.000000 0.076923 0.000000 0.923077 0.000000 0.153846 0.846154 0.000000 1.000000 0.000000 0.000000 0.000000 0.307692 0.076923 0.230769 0.384615 0.076923 0.153846 0.230769 0.538462 0.230769 0.230769 0.384615 0.153846 0.769231 0.000000 0.000000 0.230769 0.307692 0.538462 0.076923 0.076923 0.692308 0.076923 0.076923 0.153846 
0.230769 0.000000 0.538462 0.230769 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [AT][CG][AG][TC]TG[AG]C[AT]GTGA[TAG][TG][GAC][AT][CA]A[GAT] -------------------------------------------------------------------------------- Time 3.53 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 20 llr = 177 E-value = 1.3e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 31::58:1:::: pos.-specific C :117:1:5:511 probability G :17331::a127 matrix T 88213:a5:583 bits 2.1 * 1.9 * * 1.7 * * 1.5 * * Relative 1.3 * * Entropy 1.1 ** ** * * (12.8 bits) 0.8 **** ** * ** 0.6 **** ******* 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel TTGCAATCGCTG consensus A TGG T TGT sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 41470 400 2.21e-07 TCCACTTTCA TTGCAATTGTTG CTGTGGTTCC 38886 13 1.47e-06 AGAGAGAGAG ATGCAATCGTTG GGATGCGATG 45917 20 1.86e-06 ACAGACCAAC TTGGTATCGCTG GTTGCGTGTA 48406 44 4.65e-06 TAGTCTATCC TTGCGATTGGTG CCGTACGGAT 47025 333 4.65e-06 GGGCAGTCCT TTGGAATTGCGG ATATTCGTTG 47523 325 1.72e-05 TTTGGTTGTC TTGCGATAGTTT TCTCTGTATT 16283 476 1.72e-05 ACGAGGACCC 
TTCCAATCGTTT CTCCTCGCAA 44124 251 2.10e-05 AGGCGGGAGA ATGGAATCGGTG ACGACGGCAA 34752 102 2.86e-05 TGGAGAACTC TGGCTATTGCTG CGATGTAGAG 12759 441 2.86e-05 TACCGTCACT TTGGAATCGCCG GATGGGAATT 17909 281 2.86e-05 AAACATAGGA TTCCAATTGTGG ACGGTCTACT 12303 336 3.42e-05 ACTTACCCTC TCTCAATCGCTG TTGGTTGCAT 40317 478 4.00e-05 CAAGCCAAAC ATTGTATCGTTG CGGGTACAGG 48732 162 5.06e-05 ATGTCTCCCG ATTGAATTGCTT GCCAAGACGC 47607 170 6.78e-05 GGCGGTAGAA ATGCGATTGTTC CCTGTACGAA 5954 107 6.78e-05 GACTGCTTAC TTTCTATAGCTT CCGTCGACGC 37460 29 1.05e-04 CTTCCCCTAT TTGCTCTCGTGT ATGGAACAAT 49052 232 1.16e-04 TGGTACGTAC TTGTGGTCGTTG AGGCGGATAC 48523 136 1.53e-04 AATAAATGAA TCGCAGTTGCGG TATACTGACA 6807 323 2.96e-04 TCGCATGTAC TAGCGCTTGCTT TTATACCCAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 41470 2.2e-07 399_[+2]_89 38886 1.5e-06 12_[+2]_476 45917 1.9e-06 19_[+2]_469 48406 4.6e-06 43_[+2]_445 47025 4.6e-06 332_[+2]_156 47523 1.7e-05 324_[+2]_164 16283 1.7e-05 475_[+2]_13 44124 2.1e-05 250_[+2]_238 34752 2.9e-05 101_[+2]_387 12759 2.9e-05 440_[+2]_48 17909 2.9e-05 280_[+2]_208 12303 3.4e-05 335_[+2]_153 40317 4e-05 477_[+2]_11 48732 5.1e-05 161_[+2]_327 47607 6.8e-05 169_[+2]_319 5954 6.8e-05 106_[+2]_382 37460 0.0001 28_[+2]_460 49052 0.00012 231_[+2]_257 48523 0.00015 135_[+2]_353 6807 0.0003 322_[+2]_166 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=20 41470 ( 400) TTGCAATTGTTG 1 38886 ( 13) ATGCAATCGTTG 1 45917 ( 20) TTGGTATCGCTG 1 48406 ( 44) 
TTGCGATTGGTG 1 47025 ( 333) TTGGAATTGCGG 1 47523 ( 325) TTGCGATAGTTT 1 16283 ( 476) TTCCAATCGTTT 1 44124 ( 251) ATGGAATCGGTG 1 34752 ( 102) TGGCTATTGCTG 1 12759 ( 441) TTGGAATCGCCG 1 17909 ( 281) TTCCAATTGTGG 1 12303 ( 336) TCTCAATCGCTG 1 40317 ( 478) ATTGTATCGTTG 1 48732 ( 162) ATTGAATTGCTT 1 47607 ( 170) ATGCGATTGTTC 1 5954 ( 107) TTTCTATAGCTT 1 37460 ( 29) TTGCTCTCGTGT 1 49052 ( 232) TTGTGGTCGTTG 1 48523 ( 136) TCGCAGTTGCGG 1 6807 ( 323) TAGCGCTTGCTT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 9780 bayes= 8.93074 E= 1.3e+001 -8 -1097 -1097 157 -240 -135 -219 167 -1097 -135 161 -33 -1097 135 39 -233 92 -1097 12 -1 160 -135 -120 -1097 -1097 -1097 -1097 199 -140 82 -1097 84 -1097 -1097 212 -1097 -1097 82 -120 84 -1097 -235 -20 157 -1097 -235 150 25 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 20 E= 1.3e+001 0.250000 0.000000 0.000000 0.750000 0.050000 0.100000 0.050000 0.800000 0.000000 0.100000 0.700000 0.200000 0.000000 0.650000 0.300000 0.050000 0.500000 0.000000 0.250000 0.250000 0.800000 0.100000 0.100000 0.000000 0.000000 0.000000 0.000000 1.000000 0.100000 0.450000 0.000000 0.450000 0.000000 0.000000 1.000000 0.000000 0.000000 0.450000 0.100000 0.450000 0.000000 0.050000 0.200000 0.750000 0.000000 0.050000 0.650000 0.300000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 
regular expression -------------------------------------------------------------------------------- [TA]T[GT][CG][AGT]AT[CT]G[CT][TG][GT] -------------------------------------------------------------------------------- Time 6.83 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 12 sites = 10 llr = 113 E-value = 2.1e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 9:83::966::: pos.-specific C :2::4::::::1 probability G :8256a:349:9 matrix T 1::2::11:1a: bits 2.1 * 1.9 * * 1.7 * *** 1.5 * ** *** Relative 1.3 *** ** *** Entropy 1.1 *** *** **** (16.3 bits) 0.8 *** *** **** 0.6 ************ 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel AGAGGGAAAGTG consensus CGAC GG sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 47025 87 3.13e-07 CTACTACGAG AGAGCGAAGGTG TGCCAAAAAG 37460 261 4.56e-07 TTATCAGTAG AGATGGAAAGTG ACATAACTCA 34752 67 8.31e-07 ACTTTTCGCA ACAGGGAAAGTG GCGAGCGATC 48732 187 1.28e-06 AAGACGCTCA AGAGGGATAGTG TATGCTCCCC 12759 343 2.57e-06 TTCTCGGTCA AGGACGAAAGTG CCGACACCAA 49052 184 3.33e-06 GAAGTCAGCG AGGAGGAGGGTG CCGTATGCGT 44124 290 3.88e-06 AATCACCGAG AGAGCGAAAGTC ACCGCGGAAT 38886 180 5.63e-06 ATGTCGTACT AGAAGGAAGTTG GATACCAAAG 40317 76 1.13e-05 ATCTATTCCC AGATCGTGAGTG CTACACAACG 47607 451 1.27e-05 ATCTGCCGTT TCAGGGAGGGTG TGTAGTATAC 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 47025 3.1e-07 86_[+3]_402 37460 4.6e-07 260_[+3]_228 34752 8.3e-07 66_[+3]_422 48732 1.3e-06 186_[+3]_302 12759 2.6e-06 342_[+3]_146 49052 3.3e-06 183_[+3]_305 44124 3.9e-06 289_[+3]_199 38886 5.6e-06 179_[+3]_309 40317 1.1e-05 75_[+3]_413 47607 1.3e-05 450_[+3]_38 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=12 seqs=10 47025 ( 87) AGAGCGAAGGTG 1 37460 ( 261) AGATGGAAAGTG 1 34752 ( 67) ACAGGGAAAGTG 1 48732 ( 187) AGAGGGATAGTG 1 12759 ( 343) AGGACGAAAGTG 1 49052 ( 184) AGGAGGAGGGTG 1 44124 ( 290) AGAGCGAAAGTC 1 38886 ( 180) AGAAGGAAGTTG 1 40317 ( 76) AGATCGTGAGTG 1 47607 ( 451) TCAGGGAGGGTG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 9780 bayes= 10.184 E= 2.1e+002 177 -997 -997 -133 -997 -35 180 -997 160 -997 -20 -997 19 -997 112 -33 -997 65 139 -997 -997 -997 212 -997 177 -997 -997 -133 119 -997 39 -133 119 -997 80 -997 -997 -997 197 -133 -997 -997 -997 199 -997 -135 197 -997 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix 
-------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 2.1e+002 0.900000 0.000000 0.000000 0.100000 0.000000 0.200000 0.800000 0.000000 0.800000 0.000000 0.200000 0.000000 0.300000 0.000000 0.500000 0.200000 0.000000 0.400000 0.600000 0.000000 0.000000 0.000000 1.000000 0.000000 0.900000 0.000000 0.000000 0.100000 0.600000 0.000000 0.300000 0.100000 0.600000 0.000000 0.400000 0.000000 0.000000 0.000000 0.900000 0.100000 0.000000 0.000000 0.000000 1.000000 0.000000 0.100000 0.900000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- A[GC][AG][GAT][GC]GA[AG][AG]GTG -------------------------------------------------------------------------------- Time 10.19 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 17909 2.89e-02 280_[+2(2.86e-05)]_208 47025 3.34e-05 86_[+3(3.13e-07)]_234_\ [+2(4.65e-06)]_156 37460 3.27e-08 260_[+3(4.56e-07)]_40_\ [+1(2.14e-08)]_168 5954 2.72e-01 106_[+2(6.78e-05)]_382 38886 5.45e-07 12_[+2(1.47e-06)]_53_[+1(2.65e-06)]_\ 82_[+3(5.63e-06)]_309 48523 1.11e-03 148_[+1(1.63e-06)]_332 48732 2.63e-06 161_[+2(5.06e-05)]_13_\ [+3(1.28e-06)]_104_[+1(1.97e-06)]_178 49052 3.19e-06 70_[+1(4.15e-07)]_93_[+3(3.33e-06)]_\ 305 

6807                             1.17e-02  474_[+1(4.79e-06)]_6
16283                            6.20e-02  475_[+2(1.72e-05)]_13
41470                            4.79e-08  258_[+1(3.75e-09)]_121_\
                                           [+2(2.21e-07)]_89
12303                            1.09e-01  335_[+2(3.42e-05)]_153
45917                            9.56e-07  19_[+2(1.86e-06)]_221_\
                                           [+1(1.47e-08)]_228
12759                            1.10e-03  342_[+3(2.57e-06)]_86_\
                                           [+2(2.86e-05)]_48
48406                            2.25e-02  43_[+2(4.65e-06)]_445
47523                            3.97e-06  324_[+2(1.72e-05)]_55_\
                                           [+1(1.67e-08)]_89
34752                            2.99e-08  66_[+3(8.31e-07)]_23_[+2(2.86e-05)]_\
                                           10_[+1(3.82e-08)]_279_[+1(5.55e-06)]_58
47607                            2.74e-05  112_[+1(2.09e-06)]_37_\
                                           [+2(6.78e-05)]_269_[+3(1.27e-05)]_38
40317                            2.87e-07  37_[+1(4.32e-06)]_18_[+3(1.13e-05)]_\
                                           157_[+1(2.41e-08)]_213_[+2(4.00e-05)]_11
44124                            6.22e-09  78_[+2(4.55e-05)]_4_[+1(2.02e-09)]_\
                                           136_[+2(2.10e-05)]_27_[+3(3.88e-06)]_199
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************