********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/460/460.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42998                    1.0000    500  37855                    1.0000    500
48200                    1.0000    500  39772                    1.0000    500
15768                    1.0000    500  40416                    1.0000    500
44749                    1.0000    500  11866                    1.0000    500
12157                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/460/460.seqs.fa -oc motifs/460 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.241 C 0.253 G 0.247 T 0.258
Background letter frequencies (from dataset with add-one prior applied):
A 0.241 C 0.253 G 0.247 T 0.258
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 15 sites = 8 llr = 98 E-value = 1.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :94::14596839a9
pos.-specific     C  1:::8:34:1:8:::
probability       G  6:6:364:113:1:1
matrix            T  31:a:3:1:1:::::

         bits    2.1    *         *
                 1.8    *         *
                 1.6    *         *
                 1.4  * *    *   ***
Relative         1.2  * **   * *****
Entropy          1.0  ****   * *****
(17.6 bits)      0.8  ****   * *****
                 0.6 ****** ** *****
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GAGTCGAAAAACAAA
consensus            T A GTGC  GA
sequence                   C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
37855                        91  8.17e-08 GCTTGTGTGT GAGTGTGAAAACAAA ACTCTTGATG
42998                       405  2.18e-07 GGACACAGTT GAGTCGCCGAACAAA AATATTGGTC
11866                       432  3.36e-07 GAGCAGATTG GTGTCGAAAAGCAAA CGCGCTTACG
44749                       281  4.60e-07 TGAAGTCCCC GAGTCGGCAAAAAAG CAACGAAACT
48200                        83  7.12e-07 TGGTTCAGCA GAATGACAAAACAAA GACTACCGAA
39772                       149  1.65e-06 GTTGCCTGCC TAATCGACATGCAAA AACTGGACGA
40416                       260  2.62e-06 CATTCTGCGT TAGTCTATAGACAAA CACGTTTCCA
12157                       429  9.80e-06 GGACGTATCA CAATCGGAACAAGAA CTAGCCGCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37855                             8.2e-08  90_[+1]_395
42998                             2.2e-07  404_[+1]_81
11866                             3.4e-07  431_[+1]_54
44749                             4.6e-07  280_[+1]_205
48200                             7.1e-07  82_[+1]_403
39772                             1.6e-06  148_[+1]_337
40416                             2.6e-06  259_[+1]_226
12157                             9.8e-06  428_[+1]_57
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=8
37855                    (   91) GAGTGTGAAAACAAA  1
42998                    (  405) GAGTCGCCGAACAAA  1
11866                    (  432) GTGTCGAAAAGCAAA  1
44749                    (  281) GAGTCGGCAAAAAAG  1
48200                    (   83) GAATGACAAAACAAA  1
39772                    (  149) TAATCGACATGCAAA  1
40416                    (  260) TAGTCTATAGACAAA  1
12157                    (  429) CAATCGGAACAAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4374 bayes= 8.95433 E= 1.9e+001
  -965   -102    134     -5
   186   -965   -965   -105
    64   -965    134   -965
  -965   -965   -965    195
  -965    157      2   -965
   -95   -965    134     -5
    64     -2     60   -965
   105     57   -965   -105
   186   -965    -98   -965
   137   -102    -98   -105
   164   -965      2   -965
     5    157   -965   -965
   186   -965    -98   -965
   205   -965   -965   -965
   186   -965    -98   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 8 E= 1.9e+001
 0.000000  0.125000  0.625000  0.250000
 0.875000  0.000000  0.000000  0.125000
 0.375000  0.000000  0.625000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.125000  0.000000  0.625000  0.250000
 0.375000  0.250000  0.375000  0.000000
 0.500000  0.375000  0.000000  0.125000
 0.875000  0.000000  0.125000  0.000000
 0.625000  0.125000  0.125000  0.125000
 0.750000  0.000000  0.250000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GT]A[GA]T[CG][GT][AGC][AC]AA[AG][CA]AAA
--------------------------------------------------------------------------------

Time 0.78 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 19 sites = 5 llr = 84 E-value = 5.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a4a488:::2aa46::22:
pos.-specific     C  :::222:::2:::2:8:6:
probability       G  :6:2:::246::22::8:2
matrix            T  :::2::a86:::4:a2:28

         bits    2.1 * *   *   **  *
                 1.8 * *   *   **  *
                 1.6 * *   *   **  *
                 1.4 * *   *   **  *
Relative         1.2 * * ****  **  *** *
Entropy          1.0 *** ***** **  *** *
(24.3 bits)      0.8 *** ***** **  *** *
                 0.6 *** ******** ******
                 0.4 *** ***************
                 0.2 *** ***************
                 0.0 -------------------

Multilevel           AGAAAATTTGAAAATCGCT
consensus             A CCC GGA  TC TAAG
sequence                G     C  GG   T
                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
37855                       266  2.20e-10 AAATTAAAAA AAAAAATTTGAAAATCGAT TTCAAACGAG
11866                       169  6.16e-10 CTGAGGGCTC AGAGAATTGAAAAATCGCT TGTCGCAGGA
40416                         6  7.62e-09      AAGGA AGATACTTTGAATATTGCT CTTTGAGCTG
44749                       392  6.28e-08 ACTGCGTTTC AGACCATTGCAATCTCGTT GGCGTAGTCA
48200                       283  6.63e-08 GATTTTGGTT AAAAAATGTGAAGGTCACG GTCATTGATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37855                             2.2e-10  265_[+2]_216
11866                             6.2e-10  168_[+2]_313
40416                             7.6e-09  5_[+2]_476
44749                             6.3e-08  391_[+2]_90
48200                             6.6e-08  282_[+2]_199
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=5
37855                    (  266) AAAAAATTTGAAAATCGAT  1
11866                    (  169) AGAGAATTGAAAAATCGCT  1
40416                    (    6) AGATACTTTGAATATTGCT  1
44749                    (  392) AGACCATTGCAATCTCGTT  1
48200                    (  283) AAAAAATGTGAAGGTCACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 4338 bayes= 10.011 E= 5.6e+002
   205   -897   -897   -897
    73   -897    128   -897
   205   -897   -897   -897
    73    -34    -31    -37
   173    -34   -897   -897
   173    -34   -897   -897
  -897   -897   -897    195
  -897   -897    -31    163
  -897   -897     69    121
   -27    -34    128   -897
   205   -897   -897   -897
   205   -897   -897   -897
    73   -897    -31     63
   131    -34    -31   -897
  -897   -897   -897    195
  -897    166   -897    -37
   -27   -897    169   -897
   -27    124   -897    -37
  -897   -897    -31    163
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 5 E= 5.6e+002
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.800000  0.200000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  0.400000  0.600000
 0.200000  0.200000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.000000  0.200000  0.400000
 0.600000  0.200000  0.200000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.600000  0.000000  0.200000
 0.000000  0.000000  0.200000  0.800000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
A[GA]A[ACGT][AC][AC]T[TG][TG][GAC]AA[ATG][ACG]T[CT][GA][CAT][TG]
--------------------------------------------------------------------------------

Time 1.55 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 17 sites = 3 llr = 58 E-value = 1.3e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  aa3:::a7a:3a:33::
pos.-specific     C  ::7:::::::7::3::a
probability       G  ::::a3:3::::a37a:
matrix            T  :::a:7:::a:::::::

         bits    2.1 ** ** * ** **  **
                 1.8 ** ** * ** **  **
                 1.6 ** ** * ** **  **
                 1.4 ** ** * ** **  **
Relative         1.2 ** ** * ** **  **
Entropy          1.0 ************* ***
(28.1 bits)      0.8 ************* ***
                 0.6 ************* ***
                 0.4 *****************
                 0.2 *****************
                 0.0 -----------------

Multilevel           AACTGTAAATCAGAGGC
consensus              A  G G  A  CA
sequence                          G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            -----------------
48200                       384  7.77e-10 GTAAACTGTG AACTGTAAATAAGGGGC AAGCTAGTTC
42998                       326  8.81e-10 ATAATAAACT AACTGTAGATCAGCGGC GACGGAACGC
44749                        64  2.48e-09 GTAGTAAATC AAATGGAAATCAGAAGC TTGGGATGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48200                             7.8e-10  383_[+3]_100
42998                             8.8e-10  325_[+3]_158
44749                             2.5e-09  63_[+3]_420
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=17 seqs=3
48200                    (  384) AACTGTAAATAAGGGGC  1
42998                    (  326) AACTGTAGATCAGCGGC  1
44749                    (   64) AAATGGAAATCAGAAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 4356 bayes= 10.1615 E= 1.3e+003
   205   -823   -823   -823
   205   -823   -823   -823
    47    139   -823   -823
  -823   -823   -823    195
  -823   -823    201   -823
  -823   -823     43    136
   205   -823   -823   -823
   146   -823     43   -823
   205   -823   -823   -823
  -823   -823   -823    195
    47    139   -823   -823
   205   -823   -823   -823
  -823   -823    201   -823
    47     40     43   -823
    47   -823    143   -823
  -823   -823    201   -823
  -823    198   -823   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 3 E= 1.3e+003
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.333333  0.666667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.333333  0.333333  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AA[CA]TG[TG]A[AG]AT[CA]AG[ACG][GA]GC
--------------------------------------------------------------------------------

Time 2.37 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42998                            8.09e-09  325_[+3(8.81e-10)]_62_\
    [+1(2.18e-07)]_81
37855                            5.12e-10  90_[+1(8.17e-08)]_160_\
    [+2(2.20e-10)]_216
48200                            2.42e-12  82_[+1(7.12e-07)]_185_\
    [+2(6.63e-08)]_82_[+3(7.77e-10)]_100
39772                            4.27e-03  148_[+1(1.65e-06)]_337
15768                            9.92e-01  500
40416                            4.99e-07  5_[+2(7.62e-09)]_235_[+1(2.62e-06)]_\
    226
44749                            4.53e-12  63_[+3(2.48e-09)]_200_\
    [+1(4.60e-07)]_96_[+2(6.28e-08)]_33_[+1(5.60e-05)]_42
11866                            1.37e-08  168_[+2(6.16e-10)]_244_\
    [+1(3.36e-07)]_54
12157                            6.36e-02  428_[+1(9.80e-06)]_57
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************