********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/349/349.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31778                    1.0000    500  9478                     1.0000    500
46565                    1.0000    500  54611                    1.0000    500
46740                    1.0000    500  14770                    1.0000    500
15747                    1.0000    500  51199                    1.0000    500
30389                    1.0000    500  49594                    1.0000    500
16363                    1.0000    500  4234                     1.0000    500
17048                    1.0000    500  34488                    1.0000    500
43941                    1.0000    500  43191                    1.0000    500
43755                    1.0000    500  49786                    1.0000    500
37873                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/349/349.seqs.fa -oc motifs/349 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.231 G 0.225 T 0.283
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.231 G 0.225 T 0.283
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  19  llr = 182  E-value = 6.6e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::13211::2:42::4
pos.-specific     C  ::91121:42911:8:
probability       G  1::3235:23:13124
matrix            T  9a:3544a541459:3

         bits    2.2
                 1.9
                 1.7  **    *  *
                 1.5 ***    *  *
Relative         1.3 ***    *  *  **
Entropy          1.1 ***    *  *  **
(13.8 bits)      0.9 ***    *  *  **
                 0.6 ***    *  *  **
                 0.4 ***   *** *  ***
                 0.2 *** ***** ******
                 0.0 ----------------

Multilevel           TTCGTTGTTTCATTCA
consensus               TGGT CG TG GG
sequence                A C      A  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name Start  P-value                   Site
------------- -----  --------            ----------------
34488           470  4.55e-09 AAATTCCCTT TTCGTTGTTTCATTCA TATTGATATA
43191            38  2.30e-06 GTCTGTTACC TTCAGTTTGGCATTCG ACTTGGAGTA
16363           365  2.30e-06 CTCATCGTTT TTCTGTGTCTCGTTCA CTGAAGCGCA
15747           176  2.30e-06 GGAGAAAAAA TTCAGTGTCACAGTCA GAAATAACCA
49786           472  7.02e-06 CGTTCTGGGG TTCTTGTTTTCTTGCG TCTGATCGTT
37873           219  7.81e-06 CAAGGTTATG TTCATGGTCTCAATGT GAGGGGACTA
30389           322  8.67e-06 GGCATAAGAG TTCTTCATTCCTTTCG CTTCCTAGTC
54611           296  9.60e-06 TAGATGTGAA TTCTTCTTCTCCATCG TTTCCTAGCT
14770           481  1.17e-05 ACCTGTGTTA TTCGTGTTTACTTTGT AGCT
43941           160  1.28e-05 GAAGAAATCT TTCACAGTCGCAGTCA ACCATATCTC
51199           337  1.41e-05 ATCTCGATCA TTCGTGGTTACTCTCT GTCAGTCTCT
46565            12  1.54e-05 TTATTTACGA TTCCCTTTCTCTGTCG GCTGGTCAGT
9478            137  1.84e-05 ATGGCGAGCT TTCGTGATTGCAATGA CTCGGCTCGG
46740           438  2.18e-05 TCGTTACGGA TTCGTCGTCGTTGTCG TTGCAGTAGT
31778           464  2.36e-05 TGAGCAAAGA TTCTAAGTTGCGGTCA AATAGGTCTT
4234            347  5.97e-05 CGGGAATCCC GTCGACGTTCCATTCT AACACAACGA
49594           474  7.19e-05 TTGTTCCCAC TTCAATTTTCCAAGCT TTTCTTCCAC
17048           376  9.61e-05 CTGGTCAAAA TTCCGTCTGGCCTTCG CCCAAAATAC
43755           269  1.07e-04 GTGTATTCTA TTATTTTTGTCTTTGA TACAGTTTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME POSITION P-VALUE  MOTIF DIAGRAM
------------- ----------------  -------------
34488                  4.5e-09  469_[+1]_15
43191                  2.3e-06  37_[+1]_447
16363                  2.3e-06  364_[+1]_120
15747                  2.3e-06  175_[+1]_309
49786                    7e-06  471_[+1]_13
37873                  7.8e-06  218_[+1]_266
30389                  8.7e-06  321_[+1]_163
54611                  9.6e-06  295_[+1]_189
14770                  1.2e-05  480_[+1]_4
43941                  1.3e-05  159_[+1]_325
51199                  1.4e-05  336_[+1]_148
46565                  1.5e-05  11_[+1]_473
9478                   1.8e-05  136_[+1]_348
46740                  2.2e-05  437_[+1]_47
31778                  2.4e-05  463_[+1]_21
4234                     6e-05  346_[+1]_138
49594                  7.2e-05  473_[+1]_11
17048                  9.6e-05  375_[+1]_109
43755                  0.00011  268_[+1]_216
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=19
34488         (  470) TTCGTTGTTTCATTCA  1
43191         (   38) TTCAGTTTGGCATTCG  1
16363         (  365) TTCTGTGTCTCGTTCA  1
15747         (  176) TTCAGTGTCACAGTCA  1
49786         (  472) TTCTTGTTTTCTTGCG  1
37873         (  219) TTCATGGTCTCAATGT  1
30389         (  322) TTCTTCATTCCTTTCG  1
54611         (  296) TTCTTCTTCTCCATCG  1
14770         (  481) TTCGTGTTTACTTTGT  1
43941         (  160) TTCACAGTCGCAGTCA  1
51199         (  337) TTCGTGGTTACTCTCT  1
46565         (   12) TTCCCTTTCTCTGTCG  1
9478          (  137) TTCGTGATTGCAATGA  1
46740         (  438) TTCGTCGTCGTTGTCG  1
31778         (  464) TTCTAAGTTGCGGTCA  1
4234          (  347) GTCGACGTTCCATTCT  1
49594         (  474) TTCAATTTTCCAAGCT  1
17048         (  376) TTCCGTCTGGCCTTCG  1
43755         (  269) TTATTTTTGTCTTTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9215 bayes= 8.91886 E= 6.6e-001
 -1089  -1089   -209    174
 -1089  -1089  -1089    182
  -231    203  -1089  -1089
     1   -114     49     16
   -72   -114     -9     90
  -131    -14     23     57
  -131   -213    107     38
 -1089  -1089  -1089    182
 -1089     67    -51     74
   -72    -55     49     38
 -1089    203  -1089   -242
    69   -114   -109     38
   -31   -213     23     74
 -1089  -1089   -109    166
 -1089    177     -9  -1089
    50  -1089     71    -10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 19 E= 6.6e-001
 0.000000  0.000000  0.052632  0.947368
 0.000000  0.000000  0.000000  1.000000
 0.052632  0.947368  0.000000  0.000000
 0.263158  0.105263  0.315789  0.315789
 0.157895  0.105263  0.210526  0.526316
 0.105263  0.210526  0.263158  0.421053
 0.105263  0.052632  0.473684  0.368421
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.368421  0.157895  0.473684
 0.157895  0.157895  0.315789  0.368421
 0.000000  0.947368  0.000000  0.052632
 0.421053  0.105263  0.105263  0.368421
 0.210526  0.052632  0.263158  0.473684
 0.000000  0.000000  0.105263  0.894737
 0.000000  0.789474  0.210526  0.000000
 0.368421  0.000000  0.368421  0.263158
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
TTC[GTA][TG][TGC][GT]T[TC][TG]C[AT][TGA]T[CG][AGT]
--------------------------------------------------------------------------------




Time  3.12 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   6  llr = 80  E-value = 1.8e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :8:::a:::::5
pos.-specific     C  228:::a:::::
probability       G  8::3a::3a:a3
matrix            T  ::27:::7:a:2

         bits    2.2     * * * *
                 1.9     *** * *
                 1.7     *** ***
                 1.5 * * *** ***
Relative         1.3 *** *** ***
Entropy          1.1 ***********
(19.1 bits)      0.9 ***********
                 0.6 ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GACTGACTGTGA
consensus               G   G   G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name Start  P-value                 Site
------------- -----  --------            ------------
43755           196  1.03e-07 GTAAAAGTGA GACTGACTGTGG CTTGGACCAA
4234             50  1.03e-07 GATGAATTGT GACTGACTGTGG CGGGAAACTA
46565           485  3.01e-07 TCCAGGTTTC GACGGACGGTGA AAGA
31778           445  4.39e-07 AGGCGAGGAC GCCTGACTGTGA GCAAAGATTC
46740           353  1.01e-06 GGTTTAGATG CACTGACGGTGA CGGTCCATTT
16363           112  2.27e-06 TAAACAACCG GATGGACTGTGT CGAGGGCCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME POSITION P-VALUE  MOTIF DIAGRAM
------------- ----------------  -------------
43755                    1e-07  195_[+2]_293
4234                     1e-07  49_[+2]_439
46565                    3e-07  484_[+2]_4
31778                  4.4e-07  444_[+2]_44
46740                    1e-06  352_[+2]_136
16363                  2.3e-06  111_[+2]_377
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=6
43755         (  196) GACTGACTGTGG  1
4234          (   50) GACTGACTGTGG  1
46565         (  485) GACGGACGGTGA  1
31778         (  445) GCCTGACTGTGA  1
46740         (  353) CACTGACGGTGA  1
16363         (  112) GATGGACTGTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 10.2544 E= 1.8e+002
  -923    -47    189   -923
   167    -47   -923   -923
  -923    185   -923    -76
  -923   -923     57    124
  -923   -923    215   -923
   194   -923   -923   -923
  -923    211   -923   -923
  -923   -923     57    124
  -923   -923    215   -923
  -923   -923   -923    182
  -923   -923    215   -923
    94   -923     57    -76
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 1.8e+002
 0.000000  0.166667  0.833333  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.333333  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
GAC[TG]GAC[TG]GTG[AG]
--------------------------------------------------------------------------------




Time  6.20 secs.
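
The Motif 2 regular expression lists only the more probable letters at each position, so it can be stricter than the set of sites MEME actually reported. A minimal Python sketch, assuming the six Motif 2 sites and the expression are copied by hand from this output (the names `MOTIF2_SITES` and `exact_matches` are illustrative and not part of MEME), shows which reported sites the expression matches in full:

```python
import re

# Motif 2 regular expression and sites, copied from the MEME output.
MOTIF2_REGEX = re.compile(r"GAC[TG]GAC[TG]GTG[AG]")
MOTIF2_SITES = {
    "43755": "GACTGACTGTGG",
    "4234": "GACTGACTGTGG",
    "46565": "GACGGACGGTGA",
    "31778": "GCCTGACTGTGA",
    "46740": "CACTGACGGTGA",
    "16363": "GATGGACTGTGT",
}


def exact_matches(pattern, sites):
    """Return the names of the sites that the pattern matches over their full length."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


matched = exact_matches(MOTIF2_REGEX, MOTIF2_SITES)
print(matched)
```

Three of the six sites differ from the expression at one or two positions, which suggests treating the expression as a compact summary rather than as a complete search pattern.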

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   8  llr = 129  E-value = 1.4e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  63144:8::3:8865::491:
pos.-specific     C  ::::::31a61::433551:8
probability       G  4:645a:9::9:1:3551:91
matrix            T  :8331::::1:31::3::::1

         bits    2.2      *  *
                 1.9      *  *
                 1.7      *  *
                 1.5      * ** *       **
Relative         1.3      * ** *       **
Entropy          1.1 **   **** ** *  * ***
(23.2 bits)      0.9 **   **** ****  * ***
                 0.6 *** ********** ******
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           ATGAGGAGCCGAAAAGCCAGC
consensus            GATGA C  A T CCCGA
sequence                T          GT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name Start  P-value                      Site
------------- -----  --------            ---------------------
17048            54  8.00e-10 CGGAGTTTTA ATGTGGCGCCGAAACTCCAGC ATTGCGGCAG
43755           455  2.65e-09 CCTTTGCGAC AAGAAGACCCGAACAGCCAGC TTGGCAAGCT
46565           319  6.83e-09 AACCTGGATG ATTGGGAGCCGTTACGGCAGC GACCAGCAGT
43191           466  1.33e-08 CGTCCGGAAG GTGGAGAGCAGAACACCAAGT CACACGAGCA
51199           182  2.23e-08 CCAGACGCCC GTGTGGAGCCGAACGTGGAGG GTACCTCCTA
31778            16  4.91e-08 TTCGAGAAAA ATGAGGCGCCGAGAAGGCCAC GATTCGACTG
37873           415  5.67e-08 ACTTTGCTTT ATTGTGAGCTGTAAGGCAAGC ACTTTACAAG
16363           452  1.34e-07 TCTTTCGCGA GAAAAGAGCACAAAACGAAGC AATCAAAATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME POSITION P-VALUE  MOTIF DIAGRAM
------------- ----------------  -------------
17048                    8e-10  53_[+3]_426
43755                  2.7e-09  454_[+3]_25
46565                  6.8e-09  318_[+3]_161
43191                  1.3e-08  465_[+3]_14
51199                  2.2e-08  181_[+3]_298
31778                  4.9e-08  15_[+3]_464
37873                  5.7e-08  414_[+3]_65
16363                  1.3e-07  451_[+3]_28
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=8
17048         (   54) ATGTGGCGCCGAAACTCCAGC  1
43755         (  455) AAGAAGACCCGAACAGCCAGC  1
46565         (  319) ATTGGGAGCCGTTACGGCAGC  1
43191         (  466) GTGGAGAGCAGAACACCAAGT  1
51199         (  182) GTGTGGAGCCGAACGTGGAGG  1
31778         (   16) ATGAGGCGCCGAGAAGGCCAC  1
37873         (  415) ATTGTGAGCTGTAAGGCAAGC  1
16363         (  452) GAAAAGAGCACAAAACGAAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 10.891 E= 1.4e+002
   126   -965     74   -965
    -6   -965   -965    141
  -106   -965    147    -18
    52   -965     74    -18
    52   -965    115   -118
  -965   -965    215   -965
   152     11   -965   -965
  -965    -89    196   -965
  -965    211   -965   -965
    -6    143   -965   -118
  -965    -89    196   -965
   152   -965   -965    -18
   152   -965    -85   -118
   126     70   -965   -965
    94     11     15   -965
  -965     11    115    -18
  -965    111    115   -965
    52    111    -85   -965
   174    -89   -965   -965
  -106   -965    196   -965
  -965    170    -85   -118
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 1.4e+002
 0.625000  0.000000  0.375000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.125000  0.000000  0.625000  0.250000
 0.375000  0.000000  0.375000  0.250000
 0.375000  0.000000  0.500000  0.125000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.625000  0.000000  0.125000
 0.000000  0.125000  0.875000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.750000  0.000000  0.125000  0.125000
 0.625000  0.375000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.500000  0.500000  0.000000
 0.375000  0.500000  0.125000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.125000  0.000000  0.875000  0.000000
 0.000000  0.750000  0.125000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[AG][TA][GT][AGT][GA]G[AC]GC[CA]G[AT]A[AC][ACG][GCT][CG][CA]AGC
--------------------------------------------------------------------------------




Time  9.32 secs.
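
The non-zero entries of the log-odds matrices in this file appear to equal 100 * log2(p / f), rounded to the nearest integer, where p is the matching letter probability and f is the background frequency printed in the command line summary. The Python sketch below checks that reading against the first row of the Motif 3 matrices. It is an observation about this output, not MEME's own code: the large negative scores shown for zero probabilities (-965 here) depend on MEME's pseudocounts and are deliberately not modelled, and the helper name `log_odds_score` is illustrative.

```python
import math

# Background letter frequencies as printed in the command line summary.
BACKGROUND = {"A": 0.261, "C": 0.231, "G": 0.225, "T": 0.283}


def log_odds_score(prob, letter):
    """Return round(100 * log2(prob / background)), or None when prob is zero.

    Zero-probability entries are skipped because their score in the file
    depends on pseudocounts that this sketch does not reproduce.
    """
    if prob <= 0.0:
        return None
    return round(100 * math.log2(prob / BACKGROUND[letter]))


# First row of the Motif 3 letter-probability matrix (columns A, C, G, T).
first_row = dict(zip("ACGT", [0.625, 0.0, 0.375, 0.0]))
scores = {letter: log_odds_score(p, letter) for letter, p in first_row.items()}
print(scores)
```

For this row the sketch gives 126 for A and 74 for G, the same values as the first row of the Motif 3 position-specific scoring matrix.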

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME COMBINED P-VALUE  MOTIF DIAGRAM
------------- ----------------  -------------
31778                 1.74e-08  15_[+3(4.91e-08)]_408_\
    [+2(4.39e-07)]_7_[+1(2.36e-05)]_21
9478                  1.68e-02  136_[+1(1.84e-05)]_348
46565                 1.35e-09  11_[+1(1.54e-05)]_291_\
    [+3(6.83e-09)]_145_[+2(3.01e-07)]_4
54611                 1.38e-02  295_[+1(9.60e-06)]_189
46740                 2.09e-04  352_[+2(1.01e-06)]_73_\
    [+1(2.18e-05)]_47
14770                 4.04e-02  480_[+1(1.17e-05)]_4
15747                 2.93e-02  175_[+1(2.30e-06)]_309
51199                 6.29e-06  181_[+3(2.23e-08)]_134_\
    [+1(1.41e-05)]_148
30389                 3.03e-02  321_[+1(8.67e-06)]_163
49594                 1.59e-01  473_[+1(7.19e-05)]_11
16363                 2.35e-08  111_[+2(2.27e-06)]_241_\
    [+1(2.30e-06)]_71_[+3(1.34e-07)]_28
4234                  1.18e-04  49_[+2(1.03e-07)]_285_\
    [+1(5.97e-05)]_63_[+2(9.85e-05)]_63
17048                 2.68e-06  53_[+3(8.00e-10)]_301_\
    [+1(9.61e-05)]_109
34488                 1.50e-05  469_[+1(4.55e-09)]_15
43941                 6.67e-02  159_[+1(1.28e-05)]_325
43191                 4.16e-07  37_[+1(2.30e-06)]_412_\
    [+3(1.33e-08)]_14
43755                 1.22e-09  195_[+2(1.03e-07)]_247_\
    [+3(2.65e-09)]_25
49786                 2.82e-02  471_[+1(7.02e-06)]_13
37873                 5.69e-06  218_[+1(7.81e-06)]_180_\
    [+3(5.67e-08)]_65
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
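
Each motif diagram in this file alternates gap lengths with bracketed motif occurrences such as [+3(4.91e-08)], so adding the gaps to the motif widths should give back the sequence length (500 for every sequence in the training set). The Python sketch below parses a diagram and performs that check. It assumes that wrapped diagram lines (those ending in a backslash) have already been joined into one string and that the widths are copied from the MOTIF header lines; `MOTIF_WIDTHS`, `TOKEN` and `diagram_length` are illustrative names, not part of MEME.

```python
import re

# Motif widths taken from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 16, 2: 12, 3: 21}

# A token is either a gap length or a bracketed occurrence such as
# "[+3(4.91e-08)]" or "[+1]" (strand, motif number, optional p-value).
TOKEN = re.compile(r"(\d+)|\[([+-])(\d+)(?:\(([^)]*)\))?\]")


def diagram_length(diagram):
    """Return the total sequence length implied by a motif diagram string."""
    total = 0
    for gap, _strand, motif, _pvalue in TOKEN.findall(diagram):
        # A gap token contributes its own length; a motif token contributes its width.
        total += int(gap) if gap else MOTIF_WIDTHS[int(motif)]
    return total


# Combined diagram for sequence 31778, with the wrapped line joined by hand.
combined = "15_[+3(4.91e-08)]_408_[+2(4.39e-07)]_7_[+1(2.36e-05)]_21"
print(diagram_length(combined))
```

The same function also accepts the per-motif diagrams, for example `469_[+1]_15` from the Motif 1 block diagrams, because the p-value in parentheses is optional in the token pattern.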