********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/494/494.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10766                    1.0000    500  11029                    1.0000    500
11267                    1.0000    500  1711                     1.0000    500
1949                     1.0000    500  21000                    1.0000    500
21661                    1.0000    500  21808                    1.0000    500
22003                    1.0000    500  22237                    1.0000    500
23411                    1.0000    500  23993                    1.0000    500
24134                    1.0000    500  25428                    1.0000    500
25430                    1.0000    500  25467                    1.0000    500
25659                    1.0000    500  25816                    1.0000    500
268354                   1.0000    500  269826                   1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/494/494.seqs.fa -oc motifs/494 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 20 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 10000 N= 20 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.258 C 0.238 G 0.231 T 0.273 Background letter frequencies (from dataset with add-one prior applied): A 0.258 C 0.238 G 0.231 T 0.273 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 16 llr = 175 E-value = 6.7e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 1843:37718961881 pos.-specific C 9253672:91:47237 probability G 1:1:3:111:::2::1 matrix T :::511:3:11::::1 bits 2.1 1.9 1.7 1.5 * * * Relative 1.3 ** * * ** Entropy 1.1 ** * ** ** (15.8 bits) 0.8 *** ********** 0.6 *** ************ 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel CACTCCAACAAACAAC consensus AAGA T C C sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 25816 83 6.68e-09 CTTAGCACTC CAATCAAACAAACAAC CGACGCGTCT 1949 458 5.96e-07 GACACAAAAT CAATTCAACATCCAAC 
AACATTGTCA 25467 195 8.18e-07 CTCCATTCCT CACACAGACAACCACC TCGAGCACTA 22237 479 8.18e-07 AAGCCTCCAT CCCTGCCACAACCACC ACCACC 268354 285 9.15e-07 TTGGCAGGTT CACTGCAAGAAAGAAC GCGACCAAAG 269826 476 1.01e-06 AGAATACATC CAACCTATCAACCAAC CAATCAACC 21000 156 1.67e-06 TGGTCATTCC CACTCCCACTACCAAA GAACAAAGTT 22003 473 1.84e-06 CTCCTGCCCC CACCTCATCAAACAAT ACCAATAGAT 21808 113 3.19e-06 TGTAGTGGTG CCATCAATCAAACAAT ACCGTAACTC 24134 202 5.37e-06 CAAAAAGAGG CAAAGCAAAAACAAAC CAATAAATTT 23411 309 6.35e-06 TTCTCAGACC GAGTCCGACAAACAAC AACACAAACC 25659 438 1.10e-05 GGCATTCACG CAAACCAACCTCGACC AGCGTTGAAC 25428 33 1.10e-05 ACAAACTCGT CACCGCAGCAAAAAAA GTCGGCAGAA 23993 342 1.28e-05 ATCAGAACCG AACTGCAACAAACCAG ATCATTTATT 11267 482 1.37e-05 CATCTATACA CAACCACTCCAACCAC ATC 21661 426 1.70e-05 ACGAGCTGCT CCCACCAACTAAGCCC TCAACCATCG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 25816 6.7e-09 82_[+1]_402 1949 6e-07 457_[+1]_27 25467 8.2e-07 194_[+1]_290 22237 8.2e-07 478_[+1]_6 268354 9.2e-07 284_[+1]_200 269826 1e-06 475_[+1]_9 21000 1.7e-06 155_[+1]_329 22003 1.8e-06 472_[+1]_12 21808 3.2e-06 112_[+1]_372 24134 5.4e-06 201_[+1]_283 23411 6.3e-06 308_[+1]_176 25659 1.1e-05 437_[+1]_47 25428 1.1e-05 32_[+1]_452 23993 1.3e-05 341_[+1]_143 11267 1.4e-05 481_[+1]_3 21661 1.7e-05 425_[+1]_59 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=16 25816 ( 83) CAATCAAACAAACAAC 1 1949 ( 458) CAATTCAACATCCAAC 1 25467 ( 195) CACACAGACAACCACC 1 22237 ( 479) 
CCCTGCCACAACCACC 1 268354 ( 285) CACTGCAAGAAAGAAC 1 269826 ( 476) CAACCTATCAACCAAC 1 21000 ( 156) CACTCCCACTACCAAA 1 22003 ( 473) CACCTCATCAAACAAT 1 21808 ( 113) CCATCAATCAAACAAT 1 24134 ( 202) CAAAGCAAAAACAAAC 1 23411 ( 309) GAGTCCGACAAACAAC 1 25659 ( 438) CAAACCAACCTCGACC 1 25428 ( 33) CACCGCAGCAAAAAAA 1 23993 ( 342) AACTGCAACAAACCAG 1 11267 ( 482) CAACCACTCCAACCAC 1 21661 ( 426) CCCACCAACTAAGCCC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 9.24139 E= 6.7e+000 -204 188 -188 -1064 165 -34 -1064 -1064 76 107 -188 -1064 -5 7 -1064 87 -1064 124 43 -113 -5 153 -1064 -212 141 -34 -89 -1064 141 -1064 -188 -13 -204 188 -188 -1064 154 -93 -1064 -113 176 -1064 -1064 -113 112 88 -1064 -1064 -105 153 -30 -1064 165 -34 -1064 -1064 154 7 -1064 -1064 -105 153 -188 -113 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 16 E= 6.7e+000 0.062500 0.875000 0.062500 0.000000 0.812500 0.187500 0.000000 0.000000 0.437500 0.500000 0.062500 0.000000 0.250000 0.250000 0.000000 0.500000 0.000000 0.562500 0.312500 0.125000 0.250000 0.687500 0.000000 0.062500 0.687500 0.187500 0.125000 0.000000 0.687500 0.000000 0.062500 0.250000 0.062500 0.875000 0.062500 0.000000 0.750000 0.125000 0.000000 0.125000 0.875000 0.000000 0.000000 0.125000 0.562500 0.437500 0.000000 0.000000 0.125000 0.687500 0.187500 0.000000 0.812500 0.187500 0.000000 0.000000 0.750000 0.250000 0.000000 0.000000 0.125000 0.687500 0.062500 0.125000 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- CA[CA][TAC][CG][CA]A[AT]CAA[AC]CA[AC]C -------------------------------------------------------------------------------- Time 3.41 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 6 llr = 94 E-value = 1.5e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :8:327::73:22::a pos.-specific C ::::82:3:::::::: probability G a2a5::8732a8:8a: matrix T :::2:22::5::82:: bits 2.1 * * * * 1.9 * * * ** 1.7 * * * ** 1.5 * * * * ** *** Relative 1.3 *** * ** ****** Entropy 1.1 *** * *** ****** (22.7 bits) 0.8 *** * *** ****** 0.6 ********* ****** 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel GAGGCAGGATGGTGGA consensus A CGA sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 21661 289 5.82e-10 CAAGGAGAAG GAGGCAGGGTGGTGGA ACGAGGTGTG 25430 317 1.25e-08 TTTCGCAATC GAGAAAGGAAGGTGGA ATACAATCTT 22237 229 2.40e-08 TGATGGGATT GAGGCCGGATGATGGA TAGCACTCTG 23993 141 5.92e-08 ATAACTGGTT GAGACAGGATGGATGA AGAACGAATA 269826 53 1.09e-07 GGGCATTCTC GGGGCTGCAGGGTGGA CTTGTGCTAA 1711 355 1.16e-07 CGTGAAAGAA 
GAGTCATCGAGGTGGA CGGGACATAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 21661 5.8e-10 288_[+2]_196 25430 1.2e-08 316_[+2]_168 22237 2.4e-08 228_[+2]_256 23993 5.9e-08 140_[+2]_344 269826 1.1e-07 52_[+2]_432 1711 1.2e-07 354_[+2]_130 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=16 seqs=6 21661 ( 289) GAGGCAGGGTGGTGGA 1 25430 ( 317) GAGAAAGGAAGGTGGA 1 22237 ( 229) GAGGCCGGATGATGGA 1 23993 ( 141) GAGACAGGATGGATGA 1 269826 ( 53) GGGGCTGCAGGGTGGA 1 1711 ( 355) GAGTCATCGAGGTGGA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 11.1056 E= 1.5e+001 -923 -923 211 -923 169 -923 -47 -923 -923 -923 211 -923 37 -923 111 -71 -63 181 -923 -923 137 -51 -923 -71 -923 -923 185 -71 -923 49 153 -923 137 -923 53 -923 37 -923 -47 87 -923 -923 211 -923 -63 -923 185 -923 -63 -923 -923 161 -923 -923 185 -71 -923 -923 211 -923 195 -923 -923 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 
w= 16 nsites= 6 E= 1.5e+001 0.000000 0.000000 1.000000 0.000000 0.833333 0.000000 0.166667 0.000000 0.000000 0.000000 1.000000 0.000000 0.333333 0.000000 0.500000 0.166667 0.166667 0.833333 0.000000 0.000000 0.666667 0.166667 0.000000 0.166667 0.000000 0.000000 0.833333 0.166667 0.000000 0.333333 0.666667 0.000000 0.666667 0.000000 0.333333 0.000000 0.333333 0.000000 0.166667 0.500000 0.000000 0.000000 1.000000 0.000000 0.166667 0.000000 0.833333 0.000000 0.166667 0.000000 0.000000 0.833333 0.000000 0.000000 0.833333 0.166667 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- GAG[GA]CAG[GC][AG][TA]GGTGGA -------------------------------------------------------------------------------- Time 6.69 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 15 sites = 12 llr = 139 E-value = 6.0e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 7:834:327521a84 pos.-specific C ::::31:8:::::2: probability G 3a3729313539::6 matrix T 1:::2:5:::5::1: bits 2.1 * 1.9 * * 1.7 * * ** 1.5 * * ** Relative 1.3 ** * ** Entropy 1.1 *** * *** ** * (16.8 bits) 0.8 **** * *** **** 0.6 **** * *** **** 0.4 **** ********** 0.2 *************** 0.0 --------------- Multilevel AGAGAGTCAATGAAG consensus G GAC A GGG A sequence G -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 25430 359 3.25e-08 TGATGTTTGG AGAGAGGCAGTGAAA TAACTACCTC 25816 485 1.66e-07 AATTCTAGAG AGAGTGTCAAGGAAA C 22237 417 5.53e-07 GCCCCCCTTG AGAGCGTGAGTGAAG GCTGTCCAAC 1949 333 6.18e-07 TGCCGATCCA AGAGGGTCAGTGACA AAACTGAATG 11029 107 8.55e-07 GTTGAACGGT GGGGAGGCGGTGAAG TGTTGGGAGG 268354 197 1.43e-06 GAAGTCTGAG AGAGAGTCAAGAAAA TGGCAGGGCG 23993 379 2.71e-06 GATGAAGAGA AGAAGGGAAAGGAAG GTATCCTTCT 10766 254 2.71e-06 TCTGATGCTT GGAAAGTCAATGATG AGATTTATGT 22003 300 3.45e-06 GATCCTTTCT AGAACGAAAGGGAAA GAGGAATATT 23411 239 7.57e-06 ATTCTTCCCC AGGGCGACGAAGACG GTCCACCACA 11267 129 8.60e-06 AACAAGCCAC TGAAAGACGAAGAAG AATGGGGTAC 21808 145 1.15e-05 ACTCGCTTGG GGGGTCTCGGTGAAG GGATACTTAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 25430 3.3e-08 358_[+3]_127 25816 1.7e-07 484_[+3]_1 22237 5.5e-07 416_[+3]_69 1949 6.2e-07 332_[+3]_153 11029 8.6e-07 106_[+3]_379 268354 1.4e-06 196_[+3]_289 23993 2.7e-06 378_[+3]_107 10766 2.7e-06 253_[+3]_232 22003 3.5e-06 299_[+3]_186 23411 7.6e-06 238_[+3]_247 11267 8.6e-06 128_[+3]_357 21808 1.2e-05 144_[+3]_341 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=15 seqs=12 25430 ( 359) AGAGAGGCAGTGAAA 1 25816 ( 485) AGAGTGTCAAGGAAA 1 22237 ( 
417) AGAGCGTGAGTGAAG 1 1949 ( 333) AGAGGGTCAGTGACA 1 11029 ( 107) GGGGAGGCGGTGAAG 1 268354 ( 197) AGAGAGTCAAGAAAA 1 23993 ( 379) AGAAGGGAAAGGAAG 1 10766 ( 254) GGAAAGTCAATGATG 1 22003 ( 300) AGAACGAAAGGGAAA 1 23411 ( 239) AGGGCGACGAAGACG 1 11267 ( 129) TGAAAGACGAAGAAG 1 21808 ( 145) GGGGTCTCGGTGAAG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 15 n= 9720 bayes= 10.1079 E= 6.0e+001 137 -1023 11 -171 -1023 -1023 211 -1023 154 -1023 11 -1023 37 -1023 153 -1023 69 7 -47 -71 -1023 -151 199 -1023 -5 -1023 11 87 -63 166 -147 -1023 137 -1023 53 -1023 95 -1023 111 -1023 -63 -1023 53 87 -163 -1023 199 -1023 195 -1023 -1023 -1023 154 -51 -1023 -171 69 -1023 133 -1023 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 15 nsites= 12 E= 6.0e+001 0.666667 0.000000 0.250000 0.083333 0.000000 0.000000 1.000000 0.000000 0.750000 0.000000 0.250000 0.000000 0.333333 0.000000 0.666667 0.000000 0.416667 0.250000 0.166667 0.166667 0.000000 0.083333 0.916667 0.000000 0.250000 0.000000 0.250000 0.500000 0.166667 0.750000 0.083333 0.000000 0.666667 0.000000 0.333333 0.000000 0.500000 0.000000 0.500000 0.000000 0.166667 0.000000 0.333333 0.500000 0.083333 0.000000 0.916667 0.000000 1.000000 0.000000 0.000000 0.000000 0.750000 0.166667 0.000000 0.083333 0.416667 0.000000 0.583333 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- 
Motif 3 regular expression -------------------------------------------------------------------------------- [AG]G[AG][GA][AC]G[TAG]C[AG][AG][TG]GAA[GA] -------------------------------------------------------------------------------- Time 9.89 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 10766 3.33e-03 253_[+3(2.71e-06)]_232 11029 4.92e-03 106_[+3(8.55e-07)]_379 11267 8.16e-04 128_[+3(8.60e-06)]_338_\ [+1(1.37e-05)]_3 1711 4.83e-04 354_[+2(1.16e-07)]_130 1949 1.61e-06 332_[+3(6.18e-07)]_110_\ [+1(5.96e-07)]_27 21000 5.71e-03 155_[+1(1.67e-06)]_329 21661 1.42e-07 288_[+2(5.82e-10)]_121_\ [+1(1.70e-05)]_59 21808 1.13e-04 112_[+1(3.19e-06)]_16_\ [+3(1.15e-05)]_341 22003 1.43e-05 299_[+3(3.45e-06)]_158_\ [+1(1.84e-06)]_12 22237 5.02e-10 228_[+2(2.40e-08)]_172_\ [+3(5.53e-07)]_47_[+1(8.18e-07)]_6 23411 4.03e-04 238_[+3(7.57e-06)]_55_\ [+1(6.35e-06)]_176 23993 6.27e-08 140_[+2(5.92e-08)]_185_\ [+1(1.28e-05)]_21_[+3(2.71e-06)]_107 24134 4.81e-03 201_[+1(5.37e-06)]_283 25428 7.90e-02 32_[+1(1.10e-05)]_452 25430 6.38e-09 316_[+2(1.25e-08)]_26_\ [+3(3.25e-08)]_127 25467 7.26e-03 194_[+1(8.18e-07)]_290 25659 3.11e-02 437_[+1(1.10e-05)]_47 25816 2.07e-08 82_[+1(6.68e-09)]_274_\ [+3(5.91e-05)]_97_[+3(1.66e-07)]_1 268354 1.09e-05 196_[+3(1.43e-06)]_73_\ [+1(9.15e-07)]_180_[+1(6.07e-05)]_4 269826 4.40e-06 52_[+2(1.09e-07)]_407_\ [+1(1.01e-06)]_9 -------------------------------------------------------------------------------- 
********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
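
The motif diagrams in this report encode site positions as spacer lengths separated by motif occurrences, for example `228_[+2(2.40e-08)]_172_[+3(5.53e-07)]_47_[+1(8.18e-07)]_6` for sequence 22237. The following is a minimal Python sketch of how such a diagram could be decoded back into 1-based start positions and checked against the sequence length. The names `parse_diagram` and `MOTIF_WIDTHS` are illustrative and not part of MEME; the widths are copied from the MOTIF header lines above, and the sketch assumes that a trailing backslash only marks a wrapped line.

```python
import re

# Motif widths as reported in the MOTIF header lines of this report
# (hypothetical helper table, not produced by MEME itself).
MOTIF_WIDTHS = {1: 16, 2: 16, 3: 15}

# A diagram element is either a motif occurrence such as "[+3(5.53e-07)]"
# or "[+1]", or a bare integer giving the length of a spacer.
ELEMENT = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]|(\d+)")


def parse_diagram(diagram):
    """Decode a motif diagram.

    Returns (sites, total_length), where sites is a list of
    (start, motif_number, p_value) tuples with 1-based starts and
    p_value set to None when the diagram does not carry one.
    """
    # Assumption: backslashes and spaces only come from line wrapping.
    text = diagram.replace("\\", "").replace(" ", "")
    position = 0
    sites = []
    for part in text.split("_"):
        if not part:
            continue
        match = ELEMENT.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognised diagram element: {part!r}")
        if match.group(4) is not None:
            # Spacer: advance by the number of unmatched positions.
            position += int(match.group(4))
        else:
            motif = int(match.group(2))
            p_value = float(match.group(3)) if match.group(3) else None
            sites.append((position + 1, motif, p_value))
            position += MOTIF_WIDTHS[motif]
    return sites, position


if __name__ == "__main__":
    combined = r"228_[+2(2.40e-08)]_172_\ [+3(5.53e-07)]_47_[+1(8.18e-07)]_6"
    sites, total = parse_diagram(combined)
    for start, motif, p_value in sites:
        print(f"motif {motif} starts at {start} (p-value {p_value})")
    print(f"total length: {total}")
```

For sequence 22237 the decoded starts (229, 417, 479) agree with the Start column of the per-motif site tables, and the spacers plus motif widths add up to the sequence length of 500 given in the training set table.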