********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/317/317.seqs.fa
ALPHABET= ACGT
Sequence name   Weight  Length    Sequence name   Weight  Length
-------------   ------  ------    -------------   ------  ------
11312           1.0000     500    12141           1.0000     500
20800           1.0000     500    21149           1.0000     500
22291           1.0000     500    23405           1.0000     500
24362           1.0000     500    25239           1.0000     500
261063          1.0000     500    261064          1.0000     500
261065          1.0000     500    263277          1.0000     500
269403          1.0000     500    27550           1.0000     500
36947           1.0000     500    9329            1.0000     500
bd1798          1.0000     500    bd756           1.0000     500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/317/317.seqs.fa -oc motifs/317 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.255 C 0.244 G 0.227 T 0.275
Background letter frequencies (from dataset with add-one prior applied):
A 0.255 C 0.244 G 0.227 T 0.275
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =  12  llr = 190  E-value = 8.2e-011
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1317:55:1:11::1:1113
pos.-specific     C  88882a33a9454848:9387
probability       G  3:::::13::::1:111::1:
matrix            T  :1:22:2:::64435:9:7::

Relative Entropy (22.9 bits)
[bits diagram, 0.0 to 2.1 bits per position; column alignment not recoverable]

Multilevel           CCCCACAACCTCCCTCTCTCC
consensus            G A   CC  CTTTC   C A
sequence                    G

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value                    Site
-------------  ----- ---------            ---------------------
261065           313  6.73e-12 AGCCCAACCA CCCCACAGCCTCTCTCTCTCC GACACTCTCC
261064           313  6.73e-12 AGCCCAACCA CCCCACAGCCTCTCTCTCTCC GACACTCTCC
261063           313  6.73e-12 AGCCCAACCA CCCCACAGCCTCTCTCTCTCC GACACTCTCC
bd756            449  5.90e-09 TTAAATTGTC CCCTACTACCTCCTCCTCTCA CCACCGTCCA
12141             10  3.30e-08  GGAGAGAGG CCCCTCTCCCTTCTCCTCCCA CCGACCGCAA
20800            145  7.74e-08 CGATGAACAG CCACCCGACCCTACTCTCCCC TTCATCCTCG
263277           433  8.33e-08 CGCCCCTGCT GCCTACACCCTACCTCTCTGC AATAGCATCA
24362            427  8.33e-08 TACTTCGCAA CCCCACCACACTGCGCTCTCA GCACTGCCCT
23405            433  1.19e-07 CTTTCCTTTG CAACACAACCCTCTCATCTCC TGTCCTCCTC
11312            313  2.02e-07 GATCGAGCAA CCCCTCCACCTCTCCCGACCA CTCCCGACCC
22291            233  2.29e-07 AATATTAAAA GTCCCCCCCCCCTCCGTCTCC GGTGACTGAA
21149            479  4.16e-07 TCAACACGAC GCAAACAACCCTCCTCTCAAC C
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
261065         6.7e-12           312_[+1]_167
261064         6.7e-12           312_[+1]_167
261063         6.7e-12           312_[+1]_167
bd756          5.9e-09           448_[+1]_31
12141          3.3e-08           9_[+1]_470
20800          7.7e-08           144_[+1]_335
263277         8.3e-08           432_[+1]_47
24362          8.3e-08           426_[+1]_53
23405          1.2e-07           432_[+1]_47
11312          2e-07             312_[+1]_167
22291          2.3e-07           232_[+1]_247
21149          4.2e-07           478_[+1]_1
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=12
261065  (  313) CCCCACAGCCTCTCTCTCTCC  1
261064  (  313) CCCCACAGCCTCTCTCTCTCC  1
261063  (  313) CCCCACAGCCTCTCTCTCTCC  1
bd756   (  449) CCCTACTACCTCCTCCTCTCA  1
12141   (   10) CCCCTCTCCCTTCTCCTCCCA  1
20800   (  145) CCACCCGACCCTACTCTCCCC  1
263277  (  433) GCCTACACCCTACCTCTCTGC  1
24362   (  427) CCCCACCACACTGCGCTCTCA  1
23405   (  433) CAACACAACCCTCTCATCTCC  1
11312   (  313) CCCCTCCACCTCTCCCGACCA  1
22291   (  233) GTCCCCCCCCCCTCCGTCTCC  1
21149   (  479) GCAAACAACCCTCCTCTCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 9.14828 E= 8.2e-011
 -1023    162     14  -1023
  -161    177  -1023   -172
    -3    162  -1023  -1023
  -161    162  -1023    -72
   139    -55  -1023    -72
 -1023    204  -1023  -1023
    97      4   -144    -72
    97      4     14  -1023
 -1023    204  -1023  -1023
  -161    191  -1023  -1023
 -1023     77  -1023    109
  -161    104  -1023     60
  -161     77   -144     60
 -1023    162  -1023    -13
 -1023     77   -144     86
  -161    177   -144  -1023
 -1023  -1023   -144    174
  -161    191  -1023  -1023
  -161      4  -1023    128
  -161    177   -144  -1023
    39    145  -1023  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 12 E= 8.2e-011
 0.000000  0.750000  0.250000  0.000000
 0.083333  0.833333  0.000000  0.083333
 0.250000  0.750000  0.000000  0.000000
 0.083333  0.750000  0.000000  0.166667
 0.666667  0.166667  0.000000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.250000  0.083333  0.166667
 0.500000  0.250000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.083333  0.916667  0.000000  0.000000
 0.000000  0.416667  0.000000  0.583333
 0.083333  0.500000  0.000000  0.416667
 0.083333  0.416667  0.083333  0.416667
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.416667  0.083333  0.500000
 0.083333  0.833333  0.083333  0.000000
 0.000000  0.000000  0.083333  0.916667
 0.083333  0.916667  0.000000  0.000000
 0.083333  0.250000  0.000000  0.666667
 0.083333  0.833333  0.083333  0.000000
 0.333333  0.666667  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG]C[CA]CAC[AC][ACG]CC[TC][CT][CT][CT][TC]CTC[TC]C[CA]
--------------------------------------------------------------------------------

Time  3.19 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   6  llr = 135  E-value = 1.0e-008
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::882::28:a:a:a:8:7:
pos.-specific     C  ::a22:a:22::3:2:::3:5
probability       G  :::::7:a5:a:7:8:a:7:5
matrix            T  aa:::2::2::::::::2:3:

Relative Entropy (32.5 bits)
[bits diagram, 0.0 to 2.1 bits per position; column alignment not recoverable]

Multilevel           TTCAAGCGGAGAGAGAGAGAC
consensus                        C     CTG
sequence

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value                    Site
-------------  ----- ---------            ---------------------
261065           392  1.29e-13 GGTGAATAGC TTCAAGCGGAGAGAGAGAGAG AGAGAGTCAT
261064           392  1.29e-13 GGTGAATAGC TTCAAGCGGAGAGAGAGAGAG AGAGAGTCAT
261063           392  1.29e-13 GGTGAATAGC TTCAAGCGGAGAGAGAGAGAG AGAGAGTCAT
bd1798           300  3.42e-10 TGTGGTTCAA TTCAAGCGACGACACAGAGTC TGTACTGGTG
22291            468  3.42e-10 GCGGGTAGCG TTCACTCGTAGACAGAGACAC CCATCGGCCA
25239            358  5.96e-10 AACAATAGCT TTCCAACGCAGAGAGAGTCTC CCAAACCAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
261065         1.3e-13           391_[+2]_88
261064         1.3e-13           391_[+2]_88
261063         1.3e-13           391_[+2]_88
bd1798         3.4e-10           299_[+2]_180
22291          3.4e-10           467_[+2]_12
25239          6e-10             357_[+2]_122
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
261065  (  392) TTCAAGCGGAGAGAGAGAGAG  1
261064  (  392) TTCAAGCGGAGAGAGAGAGAG  1
261063  (  392) TTCAAGCGGAGAGAGAGAGAG  1
bd1798  (  300) TTCAAGCGACGACACAGAGTC  1
22291   (  468) TTCACTCGTAGACAGAGACAC  1
25239   (  358) TTCCAACGCAGAGAGAGTCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 10.1495 E= 1.0e-008
  -923   -923   -923    186
  -923   -923   -923    186
  -923    203   -923   -923
   171    -55   -923   -923
   171    -55   -923   -923
   -61   -923    155    -72
  -923    203   -923   -923
  -923   -923    214   -923
   -61    -55    114    -72
   171    -55   -923   -923
  -923   -923    214   -923
   197   -923   -923   -923
  -923     45    155   -923
   197   -923   -923   -923
  -923    -55    188   -923
   197   -923   -923   -923
  -923   -923    214   -923
   171   -923   -923    -72
  -923     45    155   -923
   139   -923   -923     28
  -923    104    114   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.0e-008
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.166667  0.000000  0.666667  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.166667  0.500000  0.166667
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.333333  0.666667  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.500000  0.500000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TTCAAGCGGAGA[GC]AGAGA[GC][AT][CG]
--------------------------------------------------------------------------------

Time  6.26 secs.

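The regular expression reported for Motif 2 is a compact summary of the motif and does not describe every site. A minimal Python sketch, assuming the six sites are copied by hand from the Motif 2 BLOCKS listing in this file, checks how many of them the expression matches in full. The names MOTIF2_REGEX, SITES and matching_sites are illustrative only and are not part of MEME.

```python
import re

# Regular expression reported by MEME for Motif 2 (copied from this file).
MOTIF2_REGEX = re.compile(r"TTCAAGCGGAGA[GC]AGAGA[GC][AT][CG]")

# Sites from the "Motif 2 in BLOCKS format" listing: sequence name -> site.
SITES = {
    "261065": "TTCAAGCGGAGAGAGAGAGAG",
    "261064": "TTCAAGCGGAGAGAGAGAGAG",
    "261063": "TTCAAGCGGAGAGAGAGAGAG",
    "bd1798": "TTCAAGCGACGACACAGAGTC",
    "22291": "TTCACTCGTAGACAGAGACAC",
    "25239": "TTCCAACGCAGAGAGAGTCTC",
}


def matching_sites(pattern, sites):
    """Return the names of sequences whose site matches the pattern in full."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


if __name__ == "__main__":
    hits = matching_sites(MOTIF2_REGEX, SITES)
    print(f"{len(hits)} of {len(SITES)} sites match: {hits}")
```

Only the three identical sites (261063, 261064, 261065) match, since the expression drops the less frequent letters at each position. For scanning other sequences, the position-specific matrices are likely the more sensitive choice.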
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =  17  llr = 192  E-value = 4.2e-007
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  4:53:52:1121::21
pos.-specific     C  :121:11:213::::1
probability       G  47:1a:1a:8:1a285
matrix            T  2235:47:8158:814

Relative Entropy (16.3 bits)
[bits diagram, 0.0 to 2.1 bits per position; column alignment not recoverable]

Multilevel           GGATGATGTGTTGTGG
consensus            ATTA T    C    T
sequence             T         A

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value                 Site
-------------  ----- ---------            ----------------
bd756            249  1.47e-08 ATTGAAGAGT GGATGAAGTGTTGTGG CGGTTTGTCC
261065           272  2.02e-08 GTAAAAATCA AGAAGATGTGCTGTGT GCAGTTTCCG
261064           272  2.02e-08 GTAAAAATCA AGAAGATGTGCTGTGT GCAGTTTCCG
261063           272  2.02e-08 GTAAAAATCA AGAAGATGTGCTGTGT GCAGTTTCCG
20800            334  9.42e-08 GAGAGGAGAG GTATGTTGTGATGTGT CTTGAGTCTT
11312            111  3.32e-07 ATTCATAGTT AGTTGATGTGATGGGT GTAGGGGCGT
21149            254  9.73e-07 GCGGTGACAG TGCAGTTGCGTTGTGT TGCGTGGCAG
bd1798           205  2.74e-06 TTTCGTCGTT GGACGTTGTTTTGTGG CTGTCCTGTA
12141            352  3.54e-06 GAGGCTCGCT TTTTGTTGTGCTGTGC AGAGCAGTGG
24362            266  4.55e-06 CACTGTGTGC TGTTGTTGCGTTGTTG TTGATTGGTC
23405             36  6.26e-06 CTCCGTTGCC ATTGGTGGTGTTGTGG GCGGCACTCC
27550            227  9.12e-06 AGTACTTGTC GGTTGCAGTATTGTGT TGGTTGCTCA
269403           316  1.05e-05 TTGGAGCCGA ATCTGACGTGCTGGGG AAGAATAGTC
263277           375  1.59e-05 AGGGTCGGGG GGAGGCTGTCTGGTGG CTAATCACAT
25239            177  1.70e-05 GTCAGCAAGG GGATGATGAGAAGTAG AACACATAGA
9329             173  3.75e-05 GAAGTACTCT GCAAGTTGCGTTGTAA ACTCACAGCG
22291            124  3.75e-05 CTTACCTCCC TGCTGAAGTGAGGGAG AGGGACATAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
bd756          1.5e-08           248_[+3]_236
261065         2e-08             271_[+3]_213
261064         2e-08             271_[+3]_213
261063         2e-08             271_[+3]_213
20800          9.4e-08           333_[+3]_151
11312          3.3e-07           110_[+3]_374
21149          9.7e-07           253_[+3]_231
bd1798         2.7e-06           204_[+3]_280
12141          3.5e-06           351_[+3]_133
24362          4.5e-06           265_[+3]_219
23405          6.3e-06           35_[+3]_449
27550          9.1e-06           226_[+3]_258
269403         1.1e-05           315_[+3]_169
263277         1.6e-05           374_[+3]_110
25239          1.7e-05           176_[+3]_308
9329           3.7e-05           172_[+3]_312
22291          3.7e-05           123_[+3]_361
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=17
bd756   (  249) GGATGAAGTGTTGTGG  1
261065  (  272) AGAAGATGTGCTGTGT  1
261064  (  272) AGAAGATGTGCTGTGT  1
261063  (  272) AGAAGATGTGCTGTGT  1
20800   (  334) GTATGTTGTGATGTGT  1
11312   (  111) AGTTGATGTGATGGGT  1
21149   (  254) TGCAGTTGCGTTGTGT  1
bd1798  (  205) GGACGTTGTTTTGTGG  1
12141   (  352) TTTTGTTGTGCTGTGC  1
24362   (  266) TGTTGTTGCGTTGTTG  1
23405   (   36) ATTGGTGGTGTTGTGG  1
27550   (  227) GGTTGCAGTATTGTGT  1
269403  (  316) ATCTGACGTGCTGGGG  1
263277  (  375) GGAGGCTGTCTGGTGG  1
25239   (  177) GGATGATGAGAAGTAG  1
9329    (  173) GCAAGTTGCGTTGTAA  1
22291   (  124) TGCTGAAGTGAGGGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 9.07116 E= 4.2e-007
    47  -1073     86    -22
 -1073   -205    164    -22
   105    -47  -1073     10
    21   -205    -95     95
 -1073  -1073    214  -1073
    88   -105  -1073     58
   -53   -205   -194    136
 -1073  -1073    214  -1073
  -211    -47  -1073    148
  -211   -205    186   -222
   -12     27  -1073     78
  -211  -1073    -95    158
 -1073  -1073    214  -1073
 -1073  -1073    -36    158
   -53  -1073    175   -222
  -211   -205    105     58
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 17 E= 4.2e-007
 0.352941  0.000000  0.411765  0.235294
 0.000000  0.058824  0.705882  0.235294
 0.529412  0.176471  0.000000  0.294118
 0.294118  0.058824  0.117647  0.529412
 0.000000  0.000000  1.000000  0.000000
 0.470588  0.117647  0.000000  0.411765
 0.176471  0.058824  0.058824  0.705882
 0.000000  0.000000  1.000000  0.000000
 0.058824  0.176471  0.000000  0.764706
 0.058824  0.058824  0.823529  0.058824
 0.235294  0.294118  0.000000  0.470588
 0.058824  0.000000  0.117647  0.823529
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.176471  0.823529
 0.176471  0.000000  0.764706  0.058824
 0.058824  0.058824  0.470588  0.411765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GAT][GT][AT][TA]G[AT]TGTG[TCA]TGTG[GT]
--------------------------------------------------------------------------------

Time  9.13 secs.

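The log-odds and letter-probability matrices for Motif 3 appear to be related by score = 100 * log2(p / background), rounded to an integer. The Python sketch below checks that reading on a few Motif 3 entries, using the background letter frequencies from the command line summary. The helper name log_odds is illustrative. Zero probabilities (printed by MEME as large negative scores) and any pseudocount adjustments are left out on purpose, so small entries may differ from the printed scores by a unit or so.

```python
import math

# Background letter frequencies from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.255, "C": 0.244, "G": 0.227, "T": 0.275}


def log_odds(p, letter, background=BACKGROUND):
    """Approximate score: 100 * log2(p / background frequency), rounded."""
    if p <= 0.0:
        raise ValueError("probability must be positive for a finite score")
    return round(100 * math.log2(p / background[letter]))


if __name__ == "__main__":
    # (letter, probability from the Motif 3 matrix, score printed in this file)
    examples = [
        ("G", 1.000000, 214),
        ("T", 0.823529, 158),
        ("T", 0.705882, 136),
        ("A", 0.529412, 105),
    ]
    for letter, p, reported in examples:
        print(letter, p, "computed:", log_odds(p, letter), "reported:", reported)
```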
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME  COMBINED P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
11312          2.35e-06          110_[+3(3.32e-07)]_64_\
    [+3(2.49e-05)]_106_[+1(2.02e-07)]_167
12141          4.56e-06          9_[+1(3.30e-08)]_74_[+1(1.72e-05)]_\
    226_[+3(3.54e-06)]_133
20800          9.11e-08          144_[+1(7.74e-08)]_168_\
    [+3(9.42e-08)]_151
21149          5.68e-06          253_[+3(9.73e-07)]_209_\
    [+1(4.16e-07)]_1
22291          1.44e-10          123_[+3(3.75e-05)]_93_\
    [+1(2.29e-07)]_214_[+2(3.42e-10)]_12
23405          3.41e-06          35_[+3(6.26e-06)]_381_\
    [+1(1.19e-07)]_47
24362          3.04e-06          265_[+3(4.55e-06)]_145_\
    [+1(8.33e-08)]_53
25239          2.69e-08          176_[+3(1.70e-05)]_165_\
    [+2(5.96e-10)]_91_[+1(8.35e-05)]_10
261063         3.02e-21          271_[+3(2.02e-08)]_25_\
    [+1(6.73e-12)]_58_[+2(1.29e-13)]_88
261064         3.02e-21          271_[+3(2.02e-08)]_25_\
    [+1(6.73e-12)]_58_[+2(1.29e-13)]_88
261065         3.02e-21          271_[+3(2.02e-08)]_25_\
    [+1(6.73e-12)]_58_[+2(1.29e-13)]_88
263277         8.75e-06          374_[+3(1.59e-05)]_42_\
    [+1(8.33e-08)]_47
269403         1.06e-02          315_[+3(1.05e-05)]_169
27550          5.20e-02          226_[+3(9.12e-06)]_258
36947          6.63e-01          500
9329           4.97e-02          172_[+3(3.75e-05)]_312
bd1798         3.16e-08          204_[+3(2.74e-06)]_79_\
    [+2(3.42e-10)]_180
bd756          9.94e-10          248_[+3(1.47e-08)]_184_\
    [+1(5.90e-09)]_31
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
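
The combined block diagrams in the summary encode each sequence as alternating gap lengths and motif occurrences, for example 271_[+3(2.02e-08)]_25_[+1(6.73e-12)]_58_[+2(1.29e-13)]_88. A minimal Python sketch, assuming the motif widths from the MOTIF header lines of this file (21, 21 and 16), converts such a diagram back into 1-based start positions. The names MOTIF_WIDTHS, TOKEN and parse_diagram are illustrative and not part of MEME, and the parser has only been reasoned through for the diagram shapes seen in this file.

```python
import re

# Motif widths as reported in the MOTIF header lines of this file.
MOTIF_WIDTHS = {1: 21, 2: 21, 3: 16}

# A token is either a motif occurrence such as [+3(2.02e-08)] or [+1],
# or a plain gap length such as 271.
TOKEN = re.compile(r"\[([+-])(\d+)(?:\([^)]*\))?\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, strand, start), ...], total_length).

    Start positions are 1-based, matching the "Start" column of the
    sites tables. Line-continuation backslashes and spaces are removed.
    """
    cleaned = diagram.replace("\\", "").replace(" ", "")
    position = 0  # number of residues consumed so far
    sites = []
    for part in cleaned.split("_"):
        if not part:
            continue
        match = TOKEN.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognised diagram token: {part!r}")
        if match.group(3) is not None:
            position += int(match.group(3))  # gap between sites
        else:
            motif = int(match.group(2))
            sites.append((motif, match.group(1), position + 1))
            position += widths[motif]
    return sites, position


if __name__ == "__main__":
    example = "271_[+3(2.02e-08)]_25_[+1(6.73e-12)]_58_[+2(1.29e-13)]_88"
    print(parse_diagram(example))
```

For sequence 261063 this gives starts 272, 313 and 392 for motifs 3, 1 and 2, which agree with the per-motif sites tables, and a total length of 500, which agrees with the TRAINING SET section.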