********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/357/357.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43666                    1.0000    500  10271                    1.0000    500
18473                    1.0000    500  50457                    1.0000    500
50513                    1.0000    500  45012                    1.0000    500
45098                    1.0000    500  7660                     1.0000    500
7136                     1.0000    500  26970                    1.0000    500
3787                     1.0000    500  38948                    1.0000    500
47803                    1.0000    500  46332                    1.0000    500
40535                    1.0000    500  43970                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/357/357.seqs.fa -oc motifs/357 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 16 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8000 N= 16 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.271 C 0.261 G 0.215 T 0.253 Background letter frequencies (from dataset with add-one prior applied): A 0.271 C 0.261 G 0.215 T 0.253 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 15 sites = 14 llr = 153 E-value = 8.6e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :47:2:76524:899 pos.-specific C a1::4:11:4531:: probability G :5224a141:171:: matrix T ::18::::44:::11 bits 2.2 * 2.0 * * 1.8 * * 1.6 * * * Relative 1.3 * * * * ** Entropy 1.1 * * * * ** (15.8 bits) 0.9 * ** ** **** 0.7 **** **** ***** 0.4 *************** 0.2 *************** 0.0 --------------- Multilevel CGATCGAAATCGAAA consensus AGGG GTCAC sequence A A -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 45098 485 4.28e-09 CGACGAATTA CGATCGAATTCGAAA A 3787 66 1.49e-08 CGCAGGCCAA CGATGGAATTAGAAA GGAACCAACA 
46332                       486  1.78e-07 CCTTGTACAC CGATAGAAAACGAAA
50457                       342  3.43e-07 GGAAAGGACA CGAGCGAGACAGAAA GAAAGGCATC
50513                        71  1.54e-06 TTTGTACGAG CAAGAGAGATAGAAA GCTAGAAAGG
18473                       284  2.59e-06 AGATAGTGTT CCATGGAAGCCGAAA AGTTACTCGT
47803                       365  3.82e-06 CGTTCGTGTC CGTTCGAGATCCAAA GAGTCTTTTT
43666                       304  8.17e-06 ACTGTTAGTA CAGTCGACTACGAAA CTGGTGACTA
45012                       177  1.27e-05 ACAAACCTTA CCAGCGCATACGAAA AAAAGTCCGA
10271                       183  1.36e-05 GAATCACAGT CAGTGGAAACCGGTA CGACTAATTC
26970                       131  1.56e-05 TCGTTAGTCA CAATAGGAATAGCAA AGGGTAGCAC
40535                        93  1.77e-05 TACGGCACGT CGATCGGGATACAAT GCTACTAGCT
7136                        182  1.77e-05 ACCCAGTCCA CAATGGCGTCACGAA TTCGGTGCGT
7660                        472  2.55e-05 GACTTGCAAA CGGTGGAATCGCATA GTACGGTTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45098                             4.3e-09  484_[+1]_1
3787                              1.5e-08  65_[+1]_420
46332                             1.8e-07  485_[+1]
50457                             3.4e-07  341_[+1]_144
50513                             1.5e-06  70_[+1]_415
18473                             2.6e-06  283_[+1]_202
47803                             3.8e-06  364_[+1]_121
43666                             8.2e-06  303_[+1]_182
45012                             1.3e-05  176_[+1]_309
10271                             1.4e-05  182_[+1]_303
26970                             1.6e-05  130_[+1]_355
40535                             1.8e-05  92_[+1]_393
7136                              1.8e-05  181_[+1]_304
7660                              2.6e-05  471_[+1]_14
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=14
45098                    (  485) CGATCGAATTCGAAA  1
3787                     (   66) CGATGGAATTAGAAA  1
46332                    (  486) CGATAGAAAACGAAA  1
50457                    (  342) CGAGCGAGACAGAAA  1
50513                    (   71) CAAGAGAGATAGAAA  1
18473                    (  284) CCATGGAAGCCGAAA  1
47803                    (  365) CGTTCGAGATCCAAA  1
43666                    (  304) CAGTCGACTACGAAA  1
45012                    (  177) CCAGCGCATACGAAA  1
10271                    (  183) CAGTGGAAACCGGTA  1
26970                    (  131) CAATAGGAATAGCAA  1
40535                    (   93) CGATCGGGATACAAT  1
7136                     (  182) CAATGGCGTCACGAA  1
7660                     (  472) CGGTGGAATCGCATA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7776 bayes= 9.72147 E= 8.6e-001
 -1045    194  -1045  -1045
    40    -87    122  -1045
   140  -1045      0   -182
 -1045  -1045      0    163
   -34     72     73  -1045
 -1045  -1045    222  -1045
   140    -87    -59  -1045
   107   -186     73  -1045
    88  -1045   -159     76
   -34     45  -1045     76
    66     94   -159  -1045
 -1045     13    173  -1045
   153   -186    -59  -1045
   166  -1045  -1045    -83
   177  -1045  -1045   -182
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 14 E= 8.6e-001
 0.000000  1.000000  0.000000  0.000000
 0.357143  0.142857  0.500000  0.000000
 0.714286  0.000000  0.214286  0.071429
 0.000000  0.000000  0.214286  0.785714
 0.214286  0.428571  0.357143  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.714286  0.142857  0.142857  0.000000
 0.571429  0.071429  0.357143  0.000000
 0.500000  0.000000  0.071429  0.428571
 0.214286  0.357143  0.000000  0.428571
 0.428571  0.500000  0.071429  0.000000
 0.000000  0.285714  0.714286  0.000000
 0.785714  0.071429  0.142857  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.928571  0.000000  0.000000  0.071429
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
C[GA][AG][TG][CGA]GA[AG][AT][TCA][CA][GC]AAA
-------------------------------------------------------------------------------- Time 2.23 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 16 llr = 164 E-value = 7.5e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :36:35:114::1::1 pos.-specific C :2::14121:611141 probability G 81:a::9:3:143::6 matrix T 354:611756355963 bits 2.2 * 2.0 * 1.8 * 1.6 * * * Relative 1.3 * * * * Entropy 1.1 * * * ** (14.8 bits) 0.9 * ** * * * ** 0.7 * *** ** *** *** 0.4 * ********** *** 0.2 **************** 0.0 ---------------- Multilevel GTAGTAGTTTCTTTTG consensus TAT AC GATGG CT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 46332 58 2.45e-08 AGTATAGTTG GTTGTCGTTACGTTCG GCATTGCAAA 45012 213 3.24e-07 CTAGCCAGTA GTAGTAGTATCGGTTG TCTGCTCCAA 10271 251 4.31e-07 GGTAGCAGTA GTTGTTGTTTTTGTTG TAGTAGATGA 3787 195 1.35e-06 TCTAGAGCTC GCAGTCGAGACGTTTG GTCTTTCGAT 50457 133 4.82e-06 CCTAGGAGGC GGTGACGTGTCTTTTT GGATCCTCGA 7136 276 5.30e-06 CCGTGTCGTT GTAGCAGCGACTGTCG GTGTCGTAGT 47803 98 5.82e-06 GCCGTGGGTG GAAGTCGTCTCGCTTT CCGTACCGGA 40535 390 7.00e-06 ACAGCAAATT GAAGTAGTTACTATTC TAGCCACCAA 43970 201 7.66e-06 TCCTTCAGAT TTTGTAGTTATGTCTG ACCTGACAAA 18473 239 1.08e-05 GTCTCTTGTG TCAGAAGTTTTGGTTT ACGGTCAGCG 26970 97 1.39e-05 ACGACAACGG TTTGTAGTGACCTTCT TTTACCATTC 50513 320 1.63e-05 GTGAGATTTG 
GCAGTCGCGAGTTTCG CTCCTCTGAC
43666                        14  1.90e-05 TGGCGCGGCC GTTGAATCTTTTTTTG ACATGCCTCT
45098                       254  2.05e-05 CCTCTTTTCC GAAGTACTTTCTTTTA TAATTATTGT
38948                       327  3.89e-05 TTGACAACGA TTAGCTGTTTTTCTCG CCGACGCTGA
7660                        171  4.74e-05 AAGCAATGAA GATGACGACTCGATCG ACAAGTCCCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46332                             2.5e-08  57_[+2]_427
45012                             3.2e-07  212_[+2]_272
10271                             4.3e-07  250_[+2]_234
3787                              1.3e-06  194_[+2]_290
50457                             4.8e-06  132_[+2]_352
7136                              5.3e-06  275_[+2]_209
47803                             5.8e-06  97_[+2]_387
40535                               7e-06  389_[+2]_95
43970                             7.7e-06  200_[+2]_284
18473                             1.1e-05  238_[+2]_246
26970                             1.4e-05  96_[+2]_388
50513                             1.6e-05  319_[+2]_165
43666                             1.9e-05  13_[+2]_471
45098                             2.1e-05  253_[+2]_231
38948                             3.9e-05  326_[+2]_158
7660                              4.7e-05  170_[+2]_314
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=16
46332                    (   58) GTTGTCGTTACGTTCG  1
45012                    (  213) GTAGTAGTATCGGTTG  1
10271                    (  251) GTTGTTGTTTTTGTTG  1
3787                     (  195) GCAGTCGAGACGTTTG  1
50457                    (  133) GGTGACGTGTCTTTTT  1
7136                     (  276) GTAGCAGCGACTGTCG  1
47803                    (   98) GAAGTCGTCTCGCTTT  1
40535                    (  390) GAAGTAGTTACTATTC  1
43970                    (  201) TTTGTAGTTATGTCTG  1
18473                    (  239) TCAGAAGTTTTGGTTT  1
26970                    (   97) TTTGTAGTGACCTTCT  1
50513                    (  320) GCAGTCGCGAGTTTCG  1
43666                    (   14) GTTGAATCTTTTTTTG  1
45098                    (  254) GAAGTACTTTCTTTTA  1
38948                    (  327) TTAGCTGTTTTTCTCG  1
7660                     (  171) GATGACGACTCGATCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 8.91886 E= 7.5e+000
 -1064  -1064    180     -2
   -12    -47   -178     98
   105  -1064  -1064     79
 -1064  -1064    222  -1064
   -12   -106  -1064    130
    88     52  -1064   -102
 -1064   -206    203   -202
  -112    -47  -1064    144
  -212   -106     54     98
    69  -1064  -1064    115
 -1064    126   -178     30
 -1064   -206    103     98
  -112   -106     22     98
 -1064   -206  -1064    189
 -1064     52  -1064    130
  -212   -206    154     -2
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 16 E= 7.5e+000
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.187500  0.062500  0.500000
 0.562500  0.000000  0.000000  0.437500
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.125000  0.000000  0.625000
 0.500000  0.375000  0.000000  0.125000
 0.000000  0.062500  0.875000  0.062500
 0.125000  0.187500  0.000000  0.687500
 0.062500  0.125000  0.312500  0.500000
 0.437500  0.000000  0.000000  0.562500
 0.000000  0.625000  0.062500  0.312500
 0.000000  0.062500  0.437500  0.500000
 0.125000  0.125000  0.250000  0.500000
 0.000000  0.062500  0.000000  0.937500
 0.000000  0.375000  0.000000  0.625000
 0.062500  0.062500  0.625000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[GT][TA][AT]G[TA][AC]GT[TG][TA][CT][TG][TG]T[TC][GT]
--------------------------------------------------------------------------------




Time  4.58 secs.

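Note on the two Motif 2 matrices above: the non-zero log-odds entries appear
to be consistent with round(100 * log2(p / background)), where p is the
matching letter-probability entry and the background is the frequency printed
in the COMMAND LINE SUMMARY. This relationship is inferred from the numbers in
this file, not taken from the MEME source, so treat the sketch below as an
assumption. The helper name log_odds is hypothetical, and zero-probability
entries (printed as -1064 here) are deliberately not modelled.

```python
import math

# Background letter frequencies as printed (rounded) in COMMAND LINE SUMMARY.
BACKGROUND = {"A": 0.271, "C": 0.261, "G": 0.215, "T": 0.253}


def log_odds(prob, letter, background=BACKGROUND):
    """Return round(100 * log2(prob / background[letter])).

    Returns None for a zero probability; how the report derives its
    large negative floor values is not reproduced here.
    """
    if prob <= 0.0:
        return None
    return round(100 * math.log2(prob / background[letter]))


# First row of the Motif 2 letter-probability matrix (columns A, C, G, T).
row = {"A": 0.0, "C": 0.0, "G": 0.75, "T": 0.25}
print({letter: log_odds(p, letter) for letter, p in row.items()})
# -> {'A': None, 'C': None, 'G': 180, 'T': -2}
```

The G and T values match the first row of the log-odds matrix (180 and -2).
Because the background frequencies are printed to three decimals only, an
occasional entry could differ by one unit under this assumption.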
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 20 sites = 4 llr = 80 E-value = 1.8e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :::::a:::3::35::8::: pos.-specific C :3:3::a::::3:::333:3 probability G a5a8:::8:8a583a5:85: matrix T :3::a::3a::3:3:3::58 bits 2.2 * * * * 2.0 * * * * * * * 1.8 * * *** * * * 1.6 * * *** * * * Relative 1.3 * ********* * * * Entropy 1.1 * ********* * * **** (28.9 bits) 0.9 * ********* * * **** 0.7 ************* ****** 0.4 ******************** 0.2 ******************** 0.0 -------------------- Multilevel GGGGTACGTGGGGAGGAGGT consensus C C T A CAG CCCTC sequence T T T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 45098 43 3.97e-12 CGGCGGAGAA GGGGTACGTGGTGAGGAGTT TTAGCTGGTC 10271 19 2.37e-10 GTGACGGTGA GTGCTACTTGGGGAGGAGGT TGAGGAGGAG 40535 245 1.29e-09 ATCCTACCAT GCGGTACGTAGGGTGTCGTT AGAGAATAAA 7136 84 2.61e-09 CGCGAAATAG GGGGTACGTGGCAGGCACGC CTCCGGCCCA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 45098 4e-12 42_[+3]_438 10271 2.4e-10 18_[+3]_462 40535 
1.3e-09  244_[+3]_236
7136                              2.6e-09  83_[+3]_397
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=4
45098                    (   43) GGGGTACGTGGTGAGGAGTT  1
10271                    (   19) GTGCTACTTGGGGAGGAGGT  1
40535                    (  245) GCGGTACGTAGGGTGTCGTT  1
7136                     (   84) GGGGTACGTGGCAGGCACGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7696 bayes= 11.6464 E= 1.8e+002
  -865   -865    222   -865
  -865     -6    122     -2
  -865   -865    222   -865
  -865     -6    180   -865
  -865   -865   -865    198
   188   -865   -865   -865
  -865    194   -865   -865
  -865   -865    180     -2
  -865   -865   -865    198
   -12   -865    180   -865
  -865   -865    222   -865
  -865     -6    122     -2
   -12   -865    180   -865
    88   -865     22     -2
  -865   -865    222   -865
  -865     -6    122     -2
   146     -6   -865   -865
  -865     -6    180   -865
  -865   -865    122     98
  -865     -6   -865    156
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 1.8e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.250000  0.000000  0.750000  0.000000
 0.500000  0.000000  0.250000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.250000  0.000000  0.750000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
G[GCT]G[GC]TAC[GT]T[GA]G[GCT][GA][AGT]G[GCT][AC][GC][GT][TC]
--------------------------------------------------------------------------------




Time  6.48 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43666                            2.24e-03  13_[+2(1.90e-05)]_274_\
    [+1(8.17e-06)]_182
10271                            7.29e-11  18_[+3(2.37e-10)]_144_\
    [+1(1.36e-05)]_53_[+2(4.31e-07)]_2_[+2(1.18e-05)]_216
18473                            7.86e-05  238_[+2(1.08e-05)]_29_\
    [+1(2.59e-06)]_202
50457                            5.26e-06  132_[+2(4.82e-06)]_193_\
    [+1(3.43e-07)]_144
50513                            4.82e-04  70_[+1(1.54e-06)]_234_\
    [+2(1.63e-05)]_165
45012                            5.10e-05  6_[+1(3.36e-05)]_155_[+1(1.27e-05)]_\
    21_[+2(3.24e-07)]_91_[+2(1.76e-05)]_165
45098                            2.96e-14  42_[+3(3.97e-12)]_191_\
    [+2(2.05e-05)]_215_[+1(4.28e-09)]_1
7660                             3.65e-03  170_[+2(4.74e-05)]_285_\
    [+1(2.55e-05)]_14
7136                             8.86e-09  83_[+3(2.61e-09)]_78_[+1(1.77e-05)]_\
    79_[+2(5.30e-06)]_209
26970                            2.83e-03  8_[+1(8.55e-05)]_73_[+2(1.39e-05)]_\
    18_[+1(1.56e-05)]_355
3787                             4.17e-07  65_[+1(1.49e-08)]_114_\
    [+2(1.35e-06)]_290
38948                            1.13e-01  326_[+2(3.89e-05)]_158
47803                            3.88e-04  97_[+2(5.82e-06)]_251_\
    [+1(3.82e-06)]_121
46332                            2.18e-07  57_[+2(2.45e-08)]_412_\
    [+1(1.78e-07)]
40535                            6.01e-09  92_[+1(1.77e-05)]_137_\
    [+3(1.29e-09)]_125_[+2(7.00e-06)]_95
43970                            2.40e-02  200_[+2(7.66e-06)]_284
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
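
Note on reading the motif diagrams in this report: in a diagram such as
57_[+2(2.45e-08)]_412_[+1(1.78e-07)], plain numbers are spacer lengths and
each bracketed token is one occurrence of the numbered motif on the given
strand. Summing spacers and motif widths recovers the 1-based start of each
site. The sketch below illustrates that arithmetic; the function name
site_starts and the regular expression are hypothetical helpers written for
this note, and the motif widths are copied from the three MOTIF headers in
this file.

```python
import re

# Motif widths copied from the "MOTIF n MEME width = ..." headers in this report.
MOTIF_WIDTHS = {1: 15, 2: 16, 3: 20}

# One bracketed site: strand sign, motif number, optional "(p-value)".
SITE_TOKEN = re.compile(r"\[([+-])(\d+)(?:\([^)]*\))?\]")


def site_starts(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, 1-based start), ...], total length) for one diagram."""
    # Drop "\" line continuations and any whitespace left by wrapping.
    compact = re.sub(r"[\\\s]", "", diagram)
    position = 0  # number of sequence positions consumed so far
    sites = []
    for token in compact.split("_"):
        if not token:
            continue
        match = SITE_TOKEN.fullmatch(token)
        if match:
            motif = int(match.group(2))
            sites.append((motif, position + 1))
            position += widths[motif]
        else:
            position += int(token)  # plain number: spacer length
    return sites, position


print(site_starts("57_[+2(2.45e-08)]_412_[+1(1.78e-07)]"))
# -> ([(2, 58), (1, 486)], 500)
```

For sequence 46332 this gives starts 58 (motif 2) and 486 (motif 1), matching
the per-motif site tables, and a total of 500, matching the sequence length
listed in the TRAINING SET section.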