********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/302/302.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31559                    1.0000    500  8717                     1.0000    500
8876                     1.0000    500  5526                     1.0000    500
46570                    1.0000    500  46813                    1.0000    500
14163                    1.0000    500  15301                    1.0000    500
43452                    1.0000    500  49314                    1.0000    500
49652                    1.0000    500  16707                    1.0000    500
43913                    1.0000    500  43931                    1.0000    500
44346                    1.0000    500  44692                    1.0000    500
11144                    1.0000    500  45250                    1.0000    500
46851                    1.0000    500  39283                    1.0000    500
32144                    1.0000    500  43297                    1.0000    500
46866                    1.0000    500  33141                    1.0000    500
41248                    1.0000    500  49945                    1.0000    500
49055                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/302/302.seqs.fa -oc motifs/302 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       27    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           13500    N=              27
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.258 C 0.238 G 0.223 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.258 C 0.238 G 0.223 T 0.281
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  27  llr = 232  E-value = 1.3e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  241:37:3::39
pos.-specific     C  6:414:a::14:
probability       G  :51:2:::a:2:
matrix            T  1:4912:7:911

         bits    2.2       * *
                 2.0       * *
                 1.7       * *
                 1.5       * *  *
Relative         1.3    *  * ** *
Entropy          1.1    *  **** *
(12.4 bits)      0.9  * * ***** *
                 0.7 ** * ***** *
                 0.4 ** * ***** *
                 0.2 ************
                 0.0 ------------

Multilevel           CGCTCACTGTCA
consensus             AT AT A  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
8876                        129  2.48e-07 TAGCGAATTT CGTTCACTGTCA CTGTCGCGAC
49314                       325  6.84e-07 CAATGAATCT CGCTCACTGTGA CGGTCATTTT
33141                       358  2.71e-06 CGACTCCGAA CATTCACAGTCA GAAACAGTGG
46866                       258  3.31e-06 GACGTCACAC CAGTCACTGTCA AAAATAGGCG
49652                       368  4.99e-06 TCGAGTTCAC TACTCACTGTCA ATTTAAGTAG
46570                       303  4.99e-06 AGAAAATTGG CGTTGACTGTGA TAACTTTCAA
43297                       452  7.36e-06 CGAGAACACC CACTAACAGTAA CATCAATATC
14163                       396  7.36e-06 CTTCTGCGAA CATTGACTGTGA TTGCTGAGAT
46851                       107  1.10e-05 CAACTGCACT CGCTATCAGTCA ACCAATCTAG
39283                       378  2.02e-05 TATCCCCACC CGCCAACTGTAA AACCGAGCGT
44346                       238  2.20e-05 GTTGTCATAT CACTGTCAGTCA GCGCAAACGT
43931                        27  3.75e-05 CATGGTGCAA CGCCCACTGTTA TGGCTTAGCA
45250                       478  4.16e-05 TCAAAGAACA AGCTATCTGTAA ATGCGCGTAC
15301                       460  4.92e-05 GAGCAAAATG CATTCACAGCCA GCGCTAATGC
46813                       455  5.46e-05 TAGACTTTCA CGTTCACAGTCT GTGAGAGATT
49945                       483  5.90e-05 TCGCCTCGTG TGGTAACTGTAA ACACGA
11144                        19  5.90e-05 CGTTGGAGAC CACTCTCAGTTA ACAGTGGAGC
31559                       408  5.90e-05 ATTTTGCAAC AGTCAACTGTCA ACTGCCTTTC
43913                       286  7.06e-05 CTGTAAGTGT AAGTAACTGTGA ATGTTCGCCC
49055                        29  7.67e-05 TGACCTAGGT CACTAACTGGAA AGACGTCCAC
8717                         60  8.95e-05 CGACCGAGAC AGTTTACTGTTA GTTTTACTAC
16707                        15  1.05e-04 CATGACTTGG CACTGGCTGTGA TATTGCTTCC
43452                       383  1.22e-04 CATTTCGAGC TGATGACTGTAA AGGATGCCAA
41248                        35  1.73e-04 TAGCTGCATG CGCTTTCTGCAA ACCTATATCG
5526                        306  2.00e-04 CTTTCGCTCT TGATTACAGTCA CTACACTTAC
44692                        91  2.56e-04 TTGGATTTCA ATTTCTCTGTCA GTGGTCCACC
32144                       408  5.42e-04 GTACGTTGAC GATTTACTGTCT GACTCCTACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8876                              2.5e-07  128_[+1]_360
49314                             6.8e-07  324_[+1]_164
33141                             2.7e-06  357_[+1]_131
46866                             3.3e-06  257_[+1]_231
49652                               5e-06  367_[+1]_121
46570                               5e-06  302_[+1]_186
43297                             7.4e-06  451_[+1]_37
14163                             7.4e-06  395_[+1]_93
46851                             1.1e-05  106_[+1]_382
39283                               2e-05  377_[+1]_111
44346                             2.2e-05  237_[+1]_251
43931                             3.8e-05  26_[+1]_462
45250                             4.2e-05  477_[+1]_11
15301                             4.9e-05  459_[+1]_29
46813                             5.5e-05  454_[+1]_34
49945                             5.9e-05  482_[+1]_6
11144                             5.9e-05  18_[+1]_470
31559                             5.9e-05  407_[+1]_81
43913                             7.1e-05  285_[+1]_203
49055                             7.7e-05  28_[+1]_460
8717                                9e-05  59_[+1]_429
16707                              0.0001  14_[+1]_474
43452                             0.00012  382_[+1]_106
41248                             0.00017  34_[+1]_454
5526                               0.0002  305_[+1]_183
44692                             0.00026  90_[+1]_398
32144                             0.00054  407_[+1]_81
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=27
8876                     (  129) CGTTCACTGTCA  1
49314                    (  325) CGCTCACTGTGA  1
33141                    (  358) CATTCACAGTCA  1
46866                    (  258) CAGTCACTGTCA  1
49652                    (  368) TACTCACTGTCA  1
46570                    (  303) CGTTGACTGTGA  1
43297                    (  452) CACTAACAGTAA  1
14163                    (  396) CATTGACTGTGA  1
46851                    (  107) CGCTATCAGTCA  1
39283                    (  378) CGCCAACTGTAA  1
44346                    (  238) CACTGTCAGTCA  1
43931                    (   27) CGCCCACTGTTA  1
45250                    (  478) AGCTATCTGTAA  1
15301                    (  460) CATTCACAGCCA  1
46813                    (  455) CGTTCACAGTCT  1
49945                    (  483) TGGTAACTGTAA  1
11144                    (   19) CACTCTCAGTTA  1
31559                    (  408) AGTCAACTGTCA  1
43913                    (  286) AAGTAACTGTGA  1
49055                    (   29) CACTAACTGGAA  1
8717                     (   60) AGTTTACTGTTA  1
16707                    (   15) CACTGGCTGTGA  1
43452                    (  383) TGATGACTGTAA  1
41248                    (   35) CGCTTTCTGCAA  1
5526                     (  306) TGATTACAGTCA  1
44692                    (   91) ATTTCTCTGTCA  1
32144                    (  408) GATTTACTGTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 13203 bayes= 8.93074 E= 1.3e-004
   -48    140   -258    -92
    78  -1140    122   -292
  -180     90   -100     40
 -1140   -110  -1140    166
    20     64    -26    -92
   152  -1140   -258    -34
 -1140    207  -1140  -1140
    20  -1140  -1140    132
 -1140  -1140    217  -1140
 -1140   -168   -258    166
     1     90    -26   -134
   184  -1140  -1140   -192
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 27 E= 1.3e-004
 0.185185  0.629630  0.037037  0.148148
 0.444444  0.000000  0.518519  0.037037
 0.074074  0.444444  0.111111  0.370370
 0.000000  0.111111  0.000000  0.888889
 0.296296  0.370370  0.185185  0.148148
 0.740741  0.000000  0.037037  0.222222
 0.000000  1.000000  0.000000  0.000000
 0.296296  0.000000  0.000000  0.703704
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.074074  0.037037  0.888889
 0.259259  0.444444  0.185185  0.111111
 0.925926  0.000000  0.000000  0.074074
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[GA][CT]T[CA][AT]C[TA]GT[CA]A
--------------------------------------------------------------------------------

Time  6.68 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  15  sites =   3  llr = 56  E-value = 4.7e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::::::a:::
pos.-specific     C  a7::3:a:a:::::7
probability       G  ::a77::a:aa:::3
matrix            T  :3:3:a::::::aa:

         bits    2.2 * *   *****
                 2.0 * *   ******
                 1.7 * *  *********
                 1.5 * *  *********
Relative         1.3 * * **********
Entropy          1.1 ***************
(26.9 bits)      0.9 ***************
                 0.7 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CCGGGTCGCGGATTC
consensus             T TC         G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
46851                        60  5.38e-10 GTGGGTCGTA CCGGGTCGCGGATTC GTCGTGGCCT
33141                       184  2.25e-09 TTGGCATACT CTGGGTCGCGGATTC CCTTTCAACG
46813                       281  8.21e-09 ATTTTTGATT CCGTCTCGCGGATTG ACATCGATAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46851                             5.4e-10  59_[+2]_426
33141                             2.2e-09  183_[+2]_302
46813                             8.2e-09  280_[+2]_205
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=3
46851                    (   60) CCGGGTCGCGGATTC  1
33141                    (  184) CTGGGTCGCGGATTC  1
46813                    (  281) CCGTCTCGCGGATTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 13122 bayes= 12.542 E= 4.7e+003
  -823    207   -823   -823
  -823    148   -823     25
  -823   -823    216   -823
  -823   -823    158     25
  -823     48    158   -823
  -823   -823   -823    183
  -823    207   -823   -823
  -823   -823    216   -823
  -823    207   -823   -823
  -823   -823    216   -823
  -823   -823    216   -823
   195   -823   -823   -823
  -823   -823   -823    183
  -823   -823   -823    183
  -823    148     58   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 3 E= 4.7e+003
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.666667  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[CT]G[GT][GC]TCGCGGATT[CG]
--------------------------------------------------------------------------------

Time 13.08 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   7  llr = 88  E-value = 2.3e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  3:a:::19::::
pos.-specific     C  1:::::91:16:
probability       G  49:6:a::a::a
matrix            T  11:4a::::94:

         bits    2.2      *  *  *
                 2.0   *  *  *  *
                 1.7   * **  *  *
                 1.5  ** *** *  *
Relative         1.3  ** ****** *
Entropy          1.1  ***********
(18.2 bits)      0.9  ***********
                 0.7  ***********
                 0.4  ***********
                 0.2 ************
                 0.0 ------------

Multilevel           GGAGTGCAGTCG
consensus            A  T      T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
43931                       385  7.98e-08 GTATTCCCGC GGAGTGCAGTTG ACATGTCATA
43913                       339  7.98e-08 GTTCGCGTAT GGAGTGCAGTTG CCCTTGCTAG
8876                        295  6.89e-07 CGTCTGTTTC TGAGTGCAGTTG CGGTAAACTC
15301                       268  1.19e-06 TCTTTTACCG AGAGTGCCGTCG GAAACGCTTG
44692                       213  1.59e-06 GTCGACACAG GTATTGCAGTCG ACCGATGGTT
46813                       472  1.73e-06 AGTCTGTGAG AGATTGCAGCCG GATGATATTG
16707                       280  3.29e-06 CCTGGAACAA CGATTGAAGTCG TACATGTATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43931                               8e-08  384_[+3]_104
43913                               8e-08  338_[+3]_150
8876                              6.9e-07  294_[+3]_194
15301                             1.2e-06  267_[+3]_221
44692                             1.6e-06  212_[+3]_276
46813                             1.7e-06  471_[+3]_17
16707                             3.3e-06  279_[+3]_209
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=7
43931                    (  385) GGAGTGCAGTTG  1
43913                    (  339) GGAGTGCAGTTG  1
8876                     (  295) TGAGTGCAGTTG  1
15301                    (  268) AGAGTGCCGTCG  1
44692                    (  213) GTATTGCAGTCG  1
46813                    (  472) AGATTGCAGCCG  1
16707                    (  280) CGATTGAAGTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 13203 bayes= 12.1033 E= 2.3e+003
    15    -74     94    -97
  -945   -945    194    -97
   195   -945   -945   -945
  -945   -945    136     61
  -945   -945   -945    183
  -945   -945    217   -945
   -85    185   -945   -945
   173    -74   -945   -945
  -945   -945    217   -945
  -945    -74   -945    161
  -945    126   -945     61
  -945   -945    217   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 2.3e+003
 0.285714  0.142857  0.428571  0.142857
 0.000000  0.000000  0.857143  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.571429  0.428571
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.571429  0.000000  0.428571
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GA]GA[GT]TGCAGT[CT]G
--------------------------------------------------------------------------------

Time 19.53 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31559                            2.12e-02  140_[+2(7.70e-05)]_252_\
    [+1(5.90e-05)]_81
8717                             3.36e-01  59_[+1(8.95e-05)]_429
8876                             7.68e-07  128_[+1(2.48e-07)]_67_\
    [+1(1.59e-05)]_75_[+3(6.89e-07)]_194
5526                             7.18e-02  500
46570                            7.83e-03  141_[+1(2.48e-05)]_149_\
    [+1(4.99e-06)]_186
46813                            2.60e-08  280_[+2(8.21e-09)]_159_\
    [+1(5.46e-05)]_5_[+3(1.73e-06)]_17
14163                            2.33e-02  395_[+1(7.36e-06)]_93
15301                            2.85e-04  267_[+3(1.19e-06)]_180_\
    [+1(4.92e-05)]_29
43452                            1.95e-01  500
49314                            3.75e-03  324_[+1(6.84e-07)]_164
49652                            1.54e-02  367_[+1(4.99e-06)]_121
16707                            1.65e-03  279_[+3(3.29e-06)]_209
43913                            8.13e-06  285_[+1(7.06e-05)]_11_\
    [+2(8.11e-05)]_15_[+3(7.98e-08)]_150
43931                            5.46e-05  26_[+1(3.75e-05)]_346_\
    [+3(7.98e-08)]_104
44346                            3.43e-02  237_[+1(2.20e-05)]_251
44692                            2.04e-03  212_[+3(1.59e-06)]_276
11144                            5.69e-02  18_[+1(5.90e-05)]_470
45250                            5.82e-02  477_[+1(4.16e-05)]_11
46851                            2.74e-07  59_[+2(5.38e-10)]_32_[+1(1.10e-05)]_\
    382
39283                            3.58e-02  377_[+1(2.02e-05)]_111
32144                            2.05e-01  500
43297                            3.10e-02  451_[+1(7.36e-06)]_37
46866                            3.10e-02  257_[+1(3.31e-06)]_231
33141                            2.97e-07  183_[+2(2.25e-09)]_159_\
    [+1(2.71e-06)]_131
41248                            2.40e-01  500
49945                            1.10e-01  482_[+1(5.90e-05)]_6
49055                            6.79e-02  28_[+1(7.67e-05)]_460
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************