********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/433/433.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11242                    1.0000    500  11256                    1.0000    500
11302                    1.0000    500  11357                    1.0000    500
14616                    1.0000    500  25541                    1.0000    500
264575                   1.0000    500  4074                     1.0000    500
5774                     1.0000    500  7623                     1.0000    500
8615                     1.0000    500  8919                     1.0000    500
9282                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/433/433.seqs.fa -oc motifs/433 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.275 C 0.232 G 0.230 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.275 C 0.232 G 0.230 T 0.263
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 14 sites = 9 llr = 108 E-value = 1.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :79::a::32::12
pos.-specific     C  ::::7::2111::7
probability       G  a1:a1:8363::91
matrix            T  :21:2:24:39a::

         bits    2.1 *  *
                 1.9 *  * *     *
                 1.7 *  * *     *
                 1.5 *  * *    ***
Relative         1.3 * ** **   ***
Entropy          1.1 * ** **   ***
(17.4 bits)      0.8 * *****   ****
                 0.6 ******* * ****
                 0.4 ********* ****
                 0.2 **************
                 0.0 --------------

Multilevel           GAAGCAGTGGTTGC
consensus             T  T TGAT   A
sequence                    C A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
264575                      114  2.82e-08 TTTAGCGGCA GAAGCAGCGTTTGC TGCCTTTGGT
8919                        199  6.84e-08 CCCCGCAGAT GAAGCAGTGGTTGA AGTCAATGCA
5774                        185  9.15e-07 TATTGATTTT GAAGTAGGCGTTGC CTCTAGTAAG
9282                        393  1.67e-06 TTTGTTTTTC GTAGCATTGCTTGC TATTTGCATT
25541                         3  1.98e-06         AC GATGCAGCAATTGC ATCAAACTTT
14616                       269  2.65e-06 ACAGGAAGAG GAAGCAGGGTCTGG CTTATGAGAG
8615                        394  3.10e-06 ATATTTGAAC GGAGGAGTATTTGC AGCCTCTGTG
7623                        258  3.10e-06 TCAATTCATA GTAGCATTGATTGA AATGGCTTCC
4074                         84  3.28e-06 AGAAGGAGGC GAAGTAGGAGTTAC CAAAGCCATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
264575                            2.8e-08  113_[+1]_373
8919                              6.8e-08  198_[+1]_288
5774                              9.2e-07  184_[+1]_302
9282                              1.7e-06  392_[+1]_94
25541                               2e-06  2_[+1]_484
14616                             2.6e-06  268_[+1]_218
8615                              3.1e-06  393_[+1]_93
7623                              3.1e-06  257_[+1]_229
4074                              3.3e-06  83_[+1]_403
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=14 seqs=9
264575                   (  114) GAAGCAGCGTTTGC  1
8919                     (  199) GAAGCAGTGGTTGA  1
5774                     (  185) GAAGTAGGCGTTGC  1
9282                     (  393) GTAGCATTGCTTGC  1
25541                    (    3) GATGCAGCAATTGC  1
14616                    (  269) GAAGCAGGGTCTGG  1
8615                     (  394) GGAGGAGTATTTGC  1
7623                     (  258) GTAGCATTGATTGA  1
4074                     (   84) GAAGTAGGAGTTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 6331 bayes= 9.59072 E= 1.4e+001
  -982   -982    212   -982
   128   -982   -105    -24
   169   -982   -982   -124
  -982   -982    212   -982
  -982    152   -105    -24
   186   -982   -982   -982
  -982   -982    176    -24
  -982     -6     54     75
    28   -106    127   -982
   -31   -106     54     34
  -982   -106   -982    175
  -982   -982   -982    192
  -131   -982    195   -982
   -31    152   -105   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 9 E= 1.4e+001
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.111111  0.222222
 0.888889  0.000000  0.000000  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.111111  0.222222
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.777778  0.222222
 0.000000  0.222222  0.333333  0.444444
 0.333333  0.111111  0.555556  0.000000
 0.222222  0.111111  0.333333  0.333333
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.000000  0.888889  0.000000
 0.222222  0.666667  0.111111  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[AT]AG[CT]A[GT][TGC][GA][GTA]TTG[CA]
--------------------------------------------------------------------------------

Time 1.56 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 9 llr = 115 E-value = 5.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :87:1:74:4:6331:
pos.-specific     C  a::11a31a3a:66:9
probability       G  :2::4::1:2::1131
matrix            T  ::393::3:::4::6:

         bits    2.1 *    *  * *
                 1.9 *    *  * *
                 1.7 *    *  * *    *
                 1.5 *  * *  * *    *
Relative         1.3 *  * *  * *    *
Entropy          1.1 **** ** * *    *
(18.4 bits)      0.8 **** ** * **   *
                 0.6 **** ** * ******
                 0.4 **** ** ********
                 0.2 ****************
                 0.0 ----------------

Multilevel           CAATGCAACACACCTC
consensus             GT T CT C TAAG
sequence                      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
8919                        216  3.31e-09 TGGTTGAAGT CAATGCAACGCACCTC TGAGGAGAAA
11357                       145  3.52e-08 TAGGAACCGA CATTTCAACCCTCCTC TTTCCCAACG
264575                      172  1.40e-07 GGCCATCAAC CAATTCAACACACCAC ACAGGCGGTC
5774                         51  2.92e-07 CAATTACAAG CAATGCCTCCCTGCTC AAAAATTCAT
7623                        466  5.95e-07 AAAGCATACG CAATCCAACCCAAAGC TTTAGCGATA
11302                       278  1.37e-06 AACGTACCTT CGATGCAGCACTCGTC TTCCTAGGCC
14616                       370  2.05e-06 GACCCCTTGA CAATACACCACAAAGC TAATGAGAGG
25541                        53  2.33e-06 CTGCATATGG CATTGCCTCGCAACTG AAAAGTGCAG
11242                       451  5.37e-06 CAAGAAGCGT CGTCTCCTCACTCAGC ATCAACGCCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8919                              3.3e-09  215_[+2]_269
11357                             3.5e-08  144_[+2]_340
264575                            1.4e-07  171_[+2]_313
5774                              2.9e-07  50_[+2]_434
7623                                6e-07  465_[+2]_19
11302                             1.4e-06  277_[+2]_207
14616                             2.1e-06  369_[+2]_115
25541                             2.3e-06  52_[+2]_432
11242                             5.4e-06  450_[+2]_34
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=9
8919                     (  216) CAATGCAACGCACCTC  1
11357                    (  145) CATTTCAACCCTCCTC  1
264575                   (  172) CAATTCAACACACCAC  1
5774                     (   51) CAATGCCTCCCTGCTC  1
7623                     (  466) CAATCCAACCCAAAGC  1
11302                    (  278) CGATGCAGCACTCGTC  1
14616                    (  370) CAATACACCACAAAGC  1
25541                    (   53) CATTGCCTCGCAACTG  1
11242                    (  451) CGTCTCCTCACTCAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.58478 E= 5.4e+001
  -982    211   -982   -982
   150   -982     -5   -982
   128   -982   -982     34
  -982   -106   -982    175
  -131   -106     95     34
  -982    211   -982   -982
   128     52   -982   -982
    69   -106   -105     34
  -982    211   -982   -982
    69     52     -5   -982
  -982    211   -982   -982
   101   -982   -982     75
    28    126   -105   -982
    28    126   -105   -982
  -131   -982     54    108
  -982    194   -105   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 5.4e+001
 0.000000  1.000000  0.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.111111  0.000000  0.888889
 0.111111  0.111111  0.444444  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.444444  0.111111  0.111111  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.444444  0.333333  0.222222  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.555556  0.000000  0.000000  0.444444
 0.333333  0.555556  0.111111  0.000000
 0.333333  0.555556  0.111111  0.000000
 0.111111  0.000000  0.333333  0.555556
 0.000000  0.888889  0.111111  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[AG][AT]T[GT]C[AC][AT]C[ACG]C[AT][CA][CA][TG]C
--------------------------------------------------------------------------------

Time 3.09 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 12 llr = 117 E-value = 4.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :2::3a2:a:43
pos.-specific     C  a17a8:38:341
probability       G  :12:::2::::1
matrix            T  :72:::42:825

         bits    2.1 *  *
                 1.9 *  * *  *
                 1.7 *  * *  *
                 1.5 *  * * **
Relative         1.3 *  *** **
Entropy          1.1 *  *** ***
(14.0 bits)      0.8 * **** ***
                 0.6 * **** ***
                 0.4 ****** ****
                 0.2 ****** *****
                 0.0 ------------

Multilevel           CTCCCATCATAT
consensus                A C  CCA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
11357                       174  2.91e-07 CCCAACGGCT CTCCCACCATAT GGAACAGAGG
11302                         3  8.84e-07         GG CTCCCATCACAT CACCCATCAC
7623                        439  1.33e-06 AGGAGAGATA CTGCCATCATCT CATAAAAAGC
9282                        283  1.86e-06 GGACTTGTGA CTCCCAACATAA ATGGTGCCAC
8919                         92  9.30e-06 TCCCCCCACT CTCCCATCATTC ACAGCAATAA
8615                        310  1.29e-05 TTTCTGACAC CTTCAATCATAT CTTTCTCTAG
4074                        403  1.68e-05 TGATATAAAC CACCAATCATCA TGACGGATCA
25541                       221  2.04e-05 ACAACACATC CTCCAACTATCT CTTAAAACAA
11242                       407  4.15e-05 GCAGGTATTT CGGCCACCATCA ACATGAGTAC
14616                       420  5.37e-05 AGAAAATTAG CCCCCAACACAA CATCGAGCCC
264575                      484  6.88e-05 ACTATCCCAG CACCCAGCACCG CCGCG
5774                        106  7.66e-05 TTCCTCATCA CTTCCAGTATTT AGGTCTCTAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11357                             2.9e-07  173_[+3]_315
11302                             8.8e-07  2_[+3]_486
7623                              1.3e-06  438_[+3]_50
9282                              1.9e-06  282_[+3]_206
8919                              9.3e-06  91_[+3]_397
8615                              1.3e-05  309_[+3]_179
4074                              1.7e-05  402_[+3]_86
25541                               2e-05  220_[+3]_268
11242                             4.1e-05  406_[+3]_82
14616                             5.4e-05  419_[+3]_69
264575                            6.9e-05  483_[+3]_5
5774                              7.7e-05  105_[+3]_383
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=12
11357                    (  174) CTCCCACCATAT  1
11302                    (    3) CTCCCATCACAT  1
7623                     (  439) CTGCCATCATCT  1
9282                     (  283) CTCCCAACATAA  1
8919                     (   92) CTCCCATCATTC  1
8615                     (  310) CTTCAATCATAT  1
4074                     (  403) CACCAATCATCA  1
25541                    (  221) CTCCAACTATCT  1
11242                    (  407) CGGCCACCATCA  1
14616                    (  420) CCCCCAACACAA  1
264575                   (  484) CACCCAGCACCG  1
5774                     (  106) CTTCCAGTATTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.49463 E= 4.4e+002
 -1023    211  -1023  -1023
   -72   -147   -146    134
 -1023    152    -46    -66
 -1023    211  -1023  -1023
   -14    169  -1023  -1023
   186  -1023  -1023  -1023
   -72     11    -46     66
 -1023    184  -1023    -66
   186  -1023  -1023  -1023
 -1023     11  -1023    151
    60     85  -1023    -66
    28   -147   -146     92
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 4.4e+002
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.083333  0.083333  0.666667
 0.000000  0.666667  0.166667  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.250000  0.166667  0.416667
 0.000000  0.833333  0.000000  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.416667  0.416667  0.000000  0.166667
 0.333333  0.083333  0.083333  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CTCC[CA]A[TC]CA[TC][AC][TA]
--------------------------------------------------------------------------------

Time 4.48 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11242                            1.75e-03  406_[+3(4.15e-05)]_32_\
    [+2(5.37e-06)]_34
11256                            1.00e+00  500
11302                            1.01e-05  2_[+3(8.84e-07)]_[+3(4.45e-05)]_251_\
    [+2(1.37e-06)]_207
11357                            5.00e-07  144_[+2(3.52e-08)]_13_\
    [+3(2.91e-07)]_315
14616                            5.54e-06  268_[+1(2.65e-06)]_87_\
    [+2(2.05e-06)]_34_[+3(5.37e-05)]_69
25541                            2.03e-06  2_[+1(1.98e-06)]_36_[+2(2.33e-06)]_\
    152_[+3(2.04e-05)]_268
264575                           9.78e-09  113_[+1(2.82e-08)]_44_\
    [+2(1.40e-07)]_296_[+3(6.88e-05)]_5
4074                             1.62e-04  83_[+1(3.28e-06)]_305_\
    [+3(1.68e-05)]_86
5774                             5.07e-07  50_[+2(2.92e-07)]_39_[+3(7.66e-05)]_\
    67_[+1(9.15e-07)]_16_[+1(8.65e-05)]_272
7623                             7.52e-08  257_[+1(3.10e-06)]_167_\
    [+3(1.33e-06)]_15_[+2(5.95e-07)]_19
8615                             7.28e-04  309_[+3(1.29e-05)]_72_\
    [+1(3.10e-06)]_93
8919                             1.10e-10  91_[+3(9.30e-06)]_95_[+1(6.84e-08)]_\
    3_[+2(3.31e-09)]_269
9282                             3.56e-05  282_[+3(1.86e-06)]_98_\
    [+1(1.67e-06)]_94
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************