********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/178/178.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42483                    1.0000    500  48797                    1.0000    500
54952                    1.0000    500  32535                    1.0000    500
49822                    1.0000    500  50296                    1.0000    500
12313                    1.0000    500  43443                    1.0000    500
39807                    1.0000    500  46199                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/178/178.seqs.fa -oc motifs/178 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.235 G 0.221 T 0.278
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.235 G 0.221 T 0.278
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 8 llr = 120 E-value = 4.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  18::1::111:::8::::534
pos.-specific     C  4:631:5:35144:81:51::
probability       G  53388:166::55331:5:86
matrix            T  ::1::a43:4911::8a:4::

         bits    2.2
                 2.0      *          *
                 1.7      *          *
                 1.5      *          *
Relative         1.3    * *    *   * *  *
Entropy          1.1  * ***    *  ** ** **
(21.6 bits)      0.9  ***** ** *  ***** **
                 0.7 ****************** **
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GACGGTCGGCTGGACTTCAGG
consensus            CGGC  TTCT CCGG  GTAA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                    Site
-------------            ----- ---------            ---------------------
50296                      429  3.14e-10 GCGTCGCCGC CACGGTGGGCTCCACTTGTGG GATATCTCGT
46199                      263  1.74e-09 TGCTGACAGT CACCGTTGGCTGCACTTCAAG AGGTGTCCAC
43443                      109  2.75e-08 AAAGACACTT CGCGGTTGGTTGCAGTTCCGA TCTTGCCGCG
32535                      247  2.75e-08 AGCGGCTTTC GACCGTCGACTCGACGTGAGA TTGTTAACAG
39807                      121  7.83e-08 TGGTTTGTCT GGGGGTCTCCTTGGCTTGAGG AGAAAGCCAA
49822                      288  3.47e-07 TACACTGGTA GACGATTAGTTGTACCTGTGG AATAATGGAA
48797                      384  3.47e-07 AATAGGGAAC GATGGTCGCACGGAGTTCTGA CGGTGAGTTT
12313                      397  4.42e-07 TCCATTTTTG AAGGCTCTGTTCGGCTTCAAG GTCCAATCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50296                             3.1e-10  428_[+1]_51
46199                             1.7e-09  262_[+1]_217
43443                             2.8e-08  108_[+1]_371
32535                             2.8e-08  246_[+1]_233
39807                             7.8e-08  120_[+1]_359
49822                             3.5e-07  287_[+1]_192
48797                             3.5e-07  383_[+1]_96
12313                             4.4e-07  396_[+1]_83
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=8
50296                    (  429) CACGGTGGGCTCCACTTGTGG  1
46199                    (  263) CACCGTTGGCTGCACTTCAAG  1
43443                    (  109) CGCGGTTGGTTGCAGTTCCGA  1
32535                    (  247) GACCGTCGACTCGACGTGAGA  1
39807                    (  121) GGGGGTCTCCTTGGCTTGAGG  1
49822                    (  288) GACGATTAGTTGTACCTGTGG  1
48797                    (  384) GATGGTCGCACGGAGTTCTGA  1
12313                    (  397) AAGGCTCTGTTCGGCTTCAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 9.96434 E= 4.1e+002
  -109     67    117   -965
   150   -965     17   -965
  -965    141     17   -115
  -965      9    176   -965
  -109    -91    176   -965
  -965   -965   -965    185
  -965    109    -82     43
  -109   -965    150    -15
  -109      9    150   -965
  -109    109   -965     43
  -965    -91   -965    166
  -965     67    117   -115
  -965     67    117   -115
   150   -965     17   -965
  -965    167     17   -965
  -965    -91    -82    143
  -965   -965   -965    185
  -965    109    117   -965
    91    -91   -965     43
    -9   -965    176   -965
    50   -965    150   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 4.1e+002
 0.125000  0.375000  0.500000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.625000  0.250000  0.125000
 0.000000  0.250000  0.750000  0.000000
 0.125000  0.125000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.125000  0.375000
 0.125000  0.000000  0.625000  0.250000
 0.125000  0.250000  0.625000  0.000000
 0.125000  0.500000  0.000000  0.375000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.375000  0.500000  0.125000
 0.000000  0.375000  0.500000  0.125000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.125000  0.125000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.500000  0.000000
 0.500000  0.125000  0.000000  0.375000
 0.250000  0.000000  0.750000  0.000000
 0.375000  0.000000  0.625000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GC][AG][CG][GC]GT[CT][GT][GC][CT]T[GC][GC][AG][CG]TT[CG][AT][GA][GA]
--------------------------------------------------------------------------------

Time 0.85 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 8 llr = 88 E-value = 4.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::a3149389
pos.-specific     C  8::a:16:18::
probability       G  :9::::16::::
matrix            T  31a::61:::31

         bits    2.2    *
                 2.0   ***
                 1.7   ***
                 1.5  ****
Relative         1.3 *****   ** *
Entropy          1.1 *****  *****
(15.9 bits)      0.9 *****  *****
                 0.7 ****** *****
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGTCATCGACAA
consensus            T    A A AT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                Site
-------------            ----- ---------            ------------
39807                      320  2.42e-07 CCCGGACATT CGTCATCGACTA AACAACCCGG
48797                        6  3.76e-07      CGGCC TGTCATCGACAA AGCTTAACGC
12313                      449  2.56e-06 ACGTCAGCCC CGTCAACAACTA TATTTGGAAA
42483                       32  3.40e-06 CTTCGTTCCA CGTCATTGAAAA TATATCAGTG
54952                      479  5.19e-06 GTCCAGAGTG TTTCATCGACAA CTCCTTACGT
43443                      159  6.36e-06 CCGTGGTTTA CGTCAACAACAT GAGACCTTGA
50296                      101  8.17e-06 CAACTTCTGC CGTCACAAACAA CTCGCACACA
32535                      215  1.44e-05 TCATTGTCGG CGTCATGGCAAA GCAAAGCTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39807                             2.4e-07  319_[+2]_169
48797                             3.8e-07  5_[+2]_483
12313                             2.6e-06  448_[+2]_40
42483                             3.4e-06  31_[+2]_457
54952                             5.2e-06  478_[+2]_10
43443                             6.4e-06  158_[+2]_330
50296                             8.2e-06  100_[+2]_388
32535                             1.4e-05  214_[+2]_274
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=8
39807                    (  320) CGTCATCGACTA  1
48797                    (    6) TGTCATCGACAA  1
12313                    (  449) CGTCAACAACTA  1
42483                    (   32) CGTCATTGAAAA  1
54952                    (  479) TTTCATCGACAA  1
43443                    (  159) CGTCAACAACAT  1
50296                    (  101) CGTCACAAACAA  1
32535                    (  215) CGTCATGGCAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 8.98975 E= 4.8e+002
  -965    167   -965    -15
  -965   -965    198   -115
  -965   -965   -965    185
  -965    209   -965   -965
   191   -965   -965   -965
    -9    -91   -965    117
  -109    141    -82   -115
    50   -965    150   -965
   172    -91   -965   -965
    -9    167   -965   -965
   150   -965   -965    -15
   172   -965   -965   -115
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 4.8e+002
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.875000  0.125000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.125000  0.000000  0.625000
 0.125000  0.625000  0.125000  0.125000
 0.375000  0.000000  0.625000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.875000  0.000000  0.000000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CT]GTCA[TA]C[GA]A[CA][AT]A
--------------------------------------------------------------------------------

Time 1.73 secs.
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 4 llr = 78 E-value = 2.2e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::83:3:aa::::3853:
pos.-specific     C  35:8::55:a::a5:a3:38a
probability       G  83:3a3:38::::3::33:::
matrix            T  :3a:::33:::::3a:3:3::

         bits    2.2     *    *  *  *    *
                 2.0   * *    **** **    *
                 1.7   * *    **** **    *
                 1.5   * *    **** **    *
Relative         1.3 * ***   ***** **   **
Entropy          1.1 * ****  ***** ** * **
(28.1 bits)      0.9 * ****  ***** ** * **
                 0.7 ****** ********* * **
                 0.4 **************** ****
                 0.2 **************** ****
                 0.0 ---------------------

Multilevel           GCTCGACCGCAACCTCAAACC
consensus            CG G GAGA    G  CGCA
sequence              T    TT     T  G T
                                     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                    Site
-------------            ----- ---------            ---------------------
39807                      287  3.95e-11 GCGTAGACTC GCTCGAATGCAACCTCCAACC AACCCGGACA
12313                      428  1.06e-10 GTCCAATCGG GCTCGACCGCAACGTCAGCCC CGTCAACAAC
50296                      244  3.19e-09 AAATATAAGA GGTGGACCACAACCTCTATAC AAACGCAATG
46199                      325  3.55e-09 TCGTCCGTAG CTTCGGTGGCAACTTCGAACC CGATCCAGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39807                               4e-11  286_[+3]_193
12313                             1.1e-10  427_[+3]_52
50296                             3.2e-09  243_[+3]_236
46199                             3.5e-09  324_[+3]_155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=4
39807                    (  287) GCTCGAATGCAACCTCCAACC  1
12313                    (  428) GCTCGACCGCAACGTCAGCCC  1
50296                    (  244) GGTGGACCACAACCTCTATAC  1
46199                    (  325) CTTCGGTGGCAACTTCGAACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 10.2276 E= 2.2e+003
  -865      9    176   -865
  -865    109     17    -15
  -865   -865   -865    185
  -865    167     17   -865
  -865   -865    217   -865
   149   -865     17   -865
    -9    109   -865    -15
  -865    109     17    -15
    -9   -865    176   -865
  -865    209   -865   -865
   191   -865   -865   -865
   191   -865   -865   -865
  -865    209   -865   -865
  -865    109     17    -15
  -865   -865   -865    185
  -865    209   -865   -865
    -9      9     17    -15
   149   -865     17   -865
    91      9   -865    -15
    -9    167   -865   -865
  -865    209   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 2.2e+003
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.250000  0.500000  0.000000  0.250000
 0.000000  0.500000  0.250000  0.250000
 0.250000  0.000000  0.750000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.250000  0.250000  0.250000
 0.750000  0.000000  0.250000  0.000000
 0.500000  0.250000  0.000000  0.250000
 0.250000  0.750000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC][CGT]T[CG]G[AG][CAT][CGT][GA]CAAC[CGT]TC[ACGT][AG][ACT][CA]C
--------------------------------------------------------------------------------

Time 2.55 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42483                            1.38e-02  31_[+2(3.40e-06)]_457
48797                            1.77e-06  5_[+2(3.76e-07)]_366_[+1(3.47e-07)]_\
    96
54952                            7.25e-03  478_[+2(5.19e-06)]_10
32535                            8.26e-06  214_[+2(1.44e-05)]_20_\
    [+1(2.75e-08)]_233
49822                            1.70e-03  287_[+1(3.47e-07)]_192
50296                            5.84e-13  100_[+2(8.17e-06)]_131_\
    [+3(3.19e-09)]_164_[+1(3.14e-10)]_51
12313                            7.32e-12  396_[+1(4.42e-07)]_10_\
    [+3(1.06e-10)]_[+2(2.56e-06)]_40
43443                            3.70e-06  108_[+1(2.75e-08)]_29_\
    [+2(6.36e-06)]_330
39807                            6.11e-14  120_[+1(7.83e-08)]_145_\
    [+3(3.95e-11)]_12_[+2(2.42e-07)]_169
46199                            4.32e-10  262_[+1(1.74e-09)]_41_\
    [+3(3.55e-09)]_155
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************