********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/422/422.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10482                    1.0000     96  10936                    1.0000    500
11229                    1.0000    500  11616                    1.0000    500
1779                     1.0000    500  17946                    1.0000    500
22957                    1.0000    500  2483                     1.0000    500
261615                   1.0000    500  262517                   1.0000    500
264523                   1.0000    500  264834                   1.0000    500
2660                     1.0000    500  32347                    1.0000    500
32493                    1.0000    500  36970                    1.0000    500
37854                    1.0000    500  4000                     1.0000    500
4736                     1.0000    500  5371                     1.0000    500
6701                     1.0000    500  896                      1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/422/422.seqs.fa -oc motifs/422 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       22    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10596    N=              22
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.275 C 0.232 G 0.227 T 0.266
Background letter frequencies (from dataset with add-one prior applied):
A 0.275 C 0.232 G 0.227 T 0.266
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 10 llr = 151 E-value = 1.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :69247a:a942148::8338
pos.-specific     C  9318:::9::2792:581541
probability       G  1:::52:1::1::32121:1:
matrix            T  :1::11:::131:1:4::221

         bits    2.1
                 1.9       * *
                 1.7 *     ***   *
                 1.5 * *   ****  *
Relative         1.3 * **  ****  * * *
Entropy          1.1 * **  ****  * * **  *
(21.8 bits)      0.9 * ** ***** ** * **  *
                 0.6 ********** ** ****  *
                 0.4 ********** ** ***** *
                 0.2 ********** **********
                 0.0 ---------------------

Multilevel           CAACGAACAAACCAACCACCA
consensus             C AAG    TA GGTG AA
sequence                       C  C    TT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name           Start   P-value                    Site
-------------           ----- ---------            ---------------------
4000                      421  2.37e-09 AAACATTGTA CAACAAACAAACCCATCCCAA TAAACAAAAG
1779                      466  2.37e-09 GCAATCAAAT CAAAAAACAATACAACCACCA CTACCGACTA
36970                      85  2.78e-08 GAGGGCACTC CAACGAACAAATCCAGCAATA CGGCGGGTGT
10482                      65  2.78e-08 CATTTCATCG CAAAGAACAATACAATCACCT ACTATTTCAC
2483                      350  3.32e-08 CCCTTCCCAC CTACTAACAACCCAATCATAA AGCGATTACC
5371                      336  3.62e-08 CACAAACTTC GAACGAAGAACCCAACCATCA CAGAAGATTC
2660                      328  3.62e-08 ACAATTCCAA CCACGTACATACCGGCCACCA GCTTCATCAT
262517                     40  3.62e-08 TGATGTAAGA CCACGGACAATCCTACGACGA GTCGTCCTGG
261615                    471  4.33e-07 TGGCATCATA CCACAAACAAACAGATCGAAC ATATAGACA
6701                      334  5.18e-07 AAGGATAATA CACCAGACAAGCCGGCGAATA CAGTTCTTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME           POSITION P-VALUE  MOTIF DIAGRAM
-------------           ----------------  -------------
4000                             2.4e-09  420_[+1]_59
1779                             2.4e-09  465_[+1]_14
36970                            2.8e-08  84_[+1]_395
10482                            2.8e-08  64_[+1]_11
2483                             3.3e-08  349_[+1]_130
5371                             3.6e-08  335_[+1]_144
2660                             3.6e-08  327_[+1]_152
262517                           3.6e-08  39_[+1]_440
261615                           4.3e-07  470_[+1]_9
6701                             5.2e-07  333_[+1]_146
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=10
4000                    (  421) CAACAAACAAACCCATCCCAA  1
1779                    (  466) CAAAAAACAATACAACCACCA  1
36970                   (   85) CAACGAACAAATCCAGCAATA  1
10482                   (   65) CAAAGAACAATACAATCACCT  1
2483                    (  350) CTACTAACAACCCAATCATAA  1
5371                    (  336) GAACGAAGAACCCAACCATCA  1
2660                    (  328) CCACGTACATACCGGCCACCA  1
262517                  (   40) CCACGGACAATCCTACGACGA  1
261615                  (  471) CCACAAACAAACAGATCGAAC  1
6701                    (  334) CACCAGACAAGCCGGCGAATA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 10156 bayes= 10.2385 E= 1.9e+000
  -997   195  -118  -997
   112    37  -997  -141
   171  -122  -997  -997
   -46   178  -997  -997
    54  -997   114  -141
   135  -997   -18  -141
   186  -997  -997  -997
  -997   195  -118  -997
   186  -997  -997  -997
   171  -997  -997  -141
    54   -22  -118    17
   -46   159  -997  -141
  -146   195  -997  -997
    54   -22    40  -141
   154  -997   -18  -997
  -997   110  -118    59
  -997   178   -18  -997
   154  -122  -118  -997
    12   110  -997   -41
    12    78  -118   -41
   154  -122  -997  -141
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 1.9e+000
 0.000000  0.900000  0.100000  0.000000
 0.600000  0.300000  0.000000  0.100000
 0.900000  0.100000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.400000  0.000000  0.500000  0.100000
 0.700000  0.000000  0.200000  0.100000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.900000  0.100000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.400000  0.200000  0.100000  0.300000
 0.200000  0.700000  0.000000  0.100000
 0.100000  0.900000  0.000000  0.000000
 0.400000  0.200000  0.300000  0.100000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.500000  0.100000  0.400000
 0.000000  0.800000  0.200000  0.000000
 0.800000  0.100000  0.100000  0.000000
 0.300000  0.500000  0.000000  0.200000
 0.300000  0.400000  0.100000  0.200000
 0.800000  0.100000  0.000000  0.100000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[AC]A[CA][GA][AG]ACAA[ATC][CA]C[AGC][AG][CT][CG]A[CAT][CAT]A
--------------------------------------------------------------------------------

Time 3.97 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 8 llr = 132 E-value = 1.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3:3:1:::4:4::116:53:5
pos.-specific     C  ::5::::5::1::::313:1:
probability       G  :8:91:9519:6a6919:495
matrix            T  83318a1:5154:3:::34::

         bits    2.1             *
                 1.9      *      *
                 1.7      *      *
                 1.5    * **  *  * * *  *
Relative         1.3  * * **  *  * * *  *
Entropy          1.1 ** * *** * ** * *  **
(23.8 bits)      0.9 ** ***** * **** *  **
                 0.6 ** ***** * ******  **
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TGCGTTGCTGTGGGGAGAGGA
consensus            ATA    GA AT T C CT G
sequence               T              TA

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name           Start   P-value                    Site
-------------           ----- ---------            ---------------------
2483                      276  1.82e-10 TAGCTAGGTG TGTGTTGCAGAGGGGAGCGGA GTTGATGGGG
896                        58  4.86e-10 GGGAGTGTGT TGCGTTGCAGTTGTGAGCGGA GCGGGGGGTG
32347                     405  1.43e-08 AGTATGAATG TGAGTTTGTGTTGTGCGATGG CAGCGATGGC
2660                      110  1.55e-08 AACGATCGAT ATCGTTGCTGAGGGGCCAGGG GCCAATTGGT
264834                    273  2.50e-08 TGTCGAGGAC TGCGGTGCGGCGGGGGGATGA GGATAAATAA
37854                     160  3.64e-08 ATTGCATTGG ATTGTTGGATTGGGGAGAAGA AATGCGATGA
261615                    112  3.64e-08 TCATTGCGGA TGAGATGGTGATGGGAGTTCG TTTTCCATTT
5371                      154  6.73e-08 GAACAACGAC TGCTTTGGTGTGGAAAGTAGG CACGTGTAGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME           POSITION P-VALUE  MOTIF DIAGRAM
-------------           ----------------  -------------
2483                             1.8e-10  275_[+2]_204
896                              4.9e-10  57_[+2]_422
32347                            1.4e-08  404_[+2]_75
2660                             1.5e-08  109_[+2]_370
264834                           2.5e-08  272_[+2]_207
37854                            3.6e-08  159_[+2]_320
261615                           3.6e-08  111_[+2]_368
5371                             6.7e-08  153_[+2]_326
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
2483                    (  276) TGTGTTGCAGAGGGGAGCGGA  1
896                     (   58) TGCGTTGCAGTTGTGAGCGGA  1
32347                   (  405) TGAGTTTGTGTTGTGCGATGG  1
2660                    (  110) ATCGTTGCTGAGGGGCCAGGG  1
264834                  (  273) TGCGGTGCGGCGGGGGGATGA  1
37854                   (  160) ATTGTTGGATTGGGGAGAAGA  1
261615                  (  112) TGAGATGGTGATGGGAGTTCG  1
5371                    (  154) TGCTTTGGTGTGGAAAGTAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 10156 bayes= 11.0463 E= 1.5e+001
   -14  -965  -965   150
  -965  -965   173    -9
   -14   110  -965    -9
  -965  -965   195  -109
  -114  -965   -86   150
  -965  -965  -965   191
  -965  -965   195  -109
  -965   110   114  -965
    45  -965   -86    91
  -965  -965   195  -109
    45   -89  -965    91
  -965  -965   146    50
  -965  -965   214  -965
  -114  -965   146    -9
  -114  -965   195  -965
   118    10   -86  -965
  -965   -89   195  -965
    86    10  -965    -9
   -14  -965    73    50
  -965   -89   195  -965
    86  -965   114  -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 1.5e+001
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.500000  0.000000  0.250000
 0.000000  0.000000  0.875000  0.125000
 0.125000  0.000000  0.125000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.875000  0.125000
 0.000000  0.500000  0.500000  0.000000
 0.375000  0.000000  0.125000  0.500000
 0.000000  0.000000  0.875000  0.125000
 0.375000  0.125000  0.000000  0.500000
 0.000000  0.000000  0.625000  0.375000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.000000  0.625000  0.250000
 0.125000  0.000000  0.875000  0.000000
 0.625000  0.250000  0.125000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.500000  0.250000  0.000000  0.250000
 0.250000  0.000000  0.375000  0.375000
 0.000000  0.125000  0.875000  0.000000
 0.500000  0.000000  0.500000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TA][GT][CAT]GTTG[CG][TA]G[TA][GT]G[GT]G[AC]G[ACT][GTA]G[AG]
--------------------------------------------------------------------------------

Time 8.05 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 14 llr = 187 E-value = 2.7e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :35:142:6411:21:5::4
pos.-specific     C  51::612631619::9:4a:
probability       G  :5::2:::143::1111::5
matrix            T  515a16641119179:46:1

         bits    2.1                   *
                 1.9    *              *
                 1.7    *        *  *  *
                 1.5    *        *  *  *
Relative         1.3    *       ** **  *
Entropy          1.1 *  *   *   ** ** **
(19.3 bits)      0.9 * **   *   ***** **
                 0.6 * **** *  **********
                 0.4 ********* **********
                 0.2 ********************
                 0.0 --------------------

Multilevel           CGATCTTCAACTCTTCATCG
consensus            TAT GAATCGG  A  TC A
sequence                   C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name           Start   P-value                    Site
-------------           ----- ---------            --------------------
5371                      463  1.98e-08 ATCATCATAA CGTTCTTCCTCTCTTCTCCT TTTGCAAGAA
17946                     134  3.57e-08 GATTCGTACG CGATATCCAACTCTTCTTCA TCCGAATCGT
32493                     117  3.99e-08 CCGCTGAAGG CAATCTCCAGGTCATCACCG AGCCCATCCC
4736                      327  5.57e-08 GAGGAATTCA TGTTCTTTGGGTCTTCACCA TCTTATTCTG
11229                     268  1.04e-07 CAAGATGTTG TATTCTTCAAATCATCACCG TACTACTATC
264523                    135  1.68e-07 AGAATCGGAT CGATCAATAGGTCGTCTTCG ACTGTTAACA
36970                     224  2.02e-07 TGCACCGACA CAATGTTCACGTCATCATCG TCGTCGTCCA
896                       436  2.42e-07 TTCTTTCGTC TAATGTTCTTCTCTTCATCA ATCATCTGTT
6701                      409  3.43e-07 ACGCCGATAC TGATCACTCACACTTCTTCA CCAGCATCGT
22957                      17  4.41e-07 GATAAAGCTG TTTTGTTTAGTTCTTCTTCG AAGGTGGCGA
261615                     71  5.62e-07 GGTAGAGTAT TGTTCATTAGCTTTTGATCG GTTGCTTTCC
1779                      437  1.11e-06 ACGCACACGC CCATCCACCACTCTGCTTCG CAATCAAATC
32347                     107  1.38e-06 CACCCTTCTC TCTTTAACAACTCTTCGTCA GCGAATCTGC
37854                     462  1.69e-06 ATATTCAGGA CGTTCATCCACCCTACACCT CACCCTTGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME           POSITION P-VALUE  MOTIF DIAGRAM
-------------           ----------------  -------------
5371                               2e-08  462_[+3]_18
17946                            3.6e-08  133_[+3]_347
32493                              4e-08  116_[+3]_364
4736                             5.6e-08  326_[+3]_154
11229                              1e-07  267_[+3]_213
264523                           1.7e-07  134_[+3]_346
36970                              2e-07  223_[+3]_257
896                              2.4e-07  435_[+3]_45
6701                             3.4e-07  408_[+3]_72
22957                            4.4e-07  16_[+3]_464
261615                           5.6e-07  70_[+3]_410
1779                             1.1e-06  436_[+3]_44
32347                            1.4e-06  106_[+3]_374
37854                            1.7e-06  461_[+3]_19
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=14
5371                    (  463) CGTTCTTCCTCTCTTCTCCT  1
17946                   (  134) CGATATCCAACTCTTCTTCA  1
32493                   (  117) CAATCTCCAGGTCATCACCG  1
4736                    (  327) TGTTCTTTGGGTCTTCACCA  1
11229                   (  268) TATTCTTCAAATCATCACCG  1
264523                  (  135) CGATCAATAGGTCGTCTTCG  1
36970                   (  224) CAATGTTCACGTCATCATCG  1
896                     (  436) TAATGTTCTTCTCTTCATCA  1
6701                    (  409) TGATCACTCACACTTCTTCA  1
22957                   (   17) TTTTGTTTAGTTCTTCTTCG  1
261615                  (   71) TGTTCATTAGCTTTTGATCG  1
1779                    (  437) CCATCCACCACTCTGCTTCG  1
32347                   (  107) TCTTTAACAACTCTTCGTCA  1
37854                   (  462) CGTTCATCCACCCTACACCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 10178 bayes= 9.34748 E= 2.7e-003
 -1045   110 -1045    91
     5   -70   114  -189
    86 -1045 -1045    91
 -1045 -1045 -1045   191
  -194   147    -8  -189
    38  -170 -1045   110
   -36   -12 -1045   110
 -1045   147 -1045    43
   105    30  -166  -189
    64  -170    66   -90
  -194   130    33  -189
  -194  -170 -1045   169
 -1045   200 -1045  -189
   -36 -1045  -166   143
  -194 -1045  -166   169
 -1045   200  -166 -1045
    86 -1045  -166    69
 -1045    62 -1045   127
 -1045   210 -1045 -1045
    38 -1045   114   -90
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 14 E= 2.7e-003
 0.000000  0.500000  0.000000  0.500000
 0.285714  0.142857  0.500000  0.071429
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.071429  0.642857  0.214286  0.071429
 0.357143  0.071429  0.000000  0.571429
 0.214286  0.214286  0.000000  0.571429
 0.000000  0.642857  0.000000  0.357143
 0.571429  0.285714  0.071429  0.071429
 0.428571  0.071429  0.357143  0.142857
 0.071429  0.571429  0.285714  0.071429
 0.071429  0.071429  0.000000  0.857143
 0.000000  0.928571  0.000000  0.071429
 0.214286  0.000000  0.071429  0.714286
 0.071429  0.000000  0.071429  0.857143
 0.000000  0.928571  0.071429  0.000000
 0.500000  0.000000  0.071429  0.428571
 0.000000  0.357143  0.000000  0.642857
 0.000000  1.000000  0.000000  0.000000
 0.357143  0.000000  0.500000  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CT][GA][AT]T[CG][TA][TAC][CT][AC][AG][CG]TC[TA]TC[AT][TC]C[GA]
--------------------------------------------------------------------------------

Time 11.95 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME           COMBINED P-VALUE  MOTIF DIAGRAM
-------------           ----------------  -------------
10482                           5.65e-05  64_[+1(2.78e-08)]_11
10936                           4.90e-01  500
11229                           4.48e-04  267_[+3(1.04e-07)]_213
11616                           4.25e-01  500
1779                            1.17e-07  436_[+3(1.11e-06)]_9_[+1(2.37e-09)]_\
    14
17946                           1.67e-04  71_[+3(3.82e-06)]_42_[+3(3.57e-08)]_\
    347
22957                           1.21e-03  16_[+3(4.41e-07)]_464
2483                            1.25e-10  275_[+2(1.82e-10)]_53_\
    [+1(3.32e-08)]_130
261615                          4.03e-10  70_[+3(5.62e-07)]_21_[+2(3.64e-08)]_\
    338_[+1(4.33e-07)]_9
262517                          6.73e-05  39_[+1(3.62e-08)]_440
264523                          2.66e-04  134_[+3(1.68e-07)]_346
264834                          1.78e-04  272_[+2(2.50e-08)]_207
2660                            3.47e-08  109_[+2(1.55e-08)]_197_\
    [+1(3.62e-08)]_152
32347                           2.98e-07  106_[+3(1.38e-06)]_278_\
    [+2(1.43e-08)]_75
32493                           4.91e-04  116_[+3(3.99e-08)]_364
36970                           2.94e-07  84_[+1(2.78e-08)]_118_\
    [+3(2.02e-07)]_257
37854                           2.36e-06  159_[+2(3.64e-08)]_281_\
    [+3(1.69e-06)]_19
4000                            2.17e-05  420_[+1(2.37e-09)]_59
4736                            3.72e-04  326_[+3(5.57e-08)]_154
5371                            3.07e-12  153_[+2(6.73e-08)]_161_\
    [+1(3.62e-08)]_106_[+3(1.98e-08)]_18
6701                            3.95e-06  333_[+1(5.18e-07)]_54_\
    [+3(3.43e-07)]_72
896                             7.86e-09  57_[+2(4.86e-10)]_6_[+2(3.40e-05)]_\
    330_[+3(2.42e-07)]_45
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************