********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/363/363.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11097                    1.0000    500  11500                    1.0000    500
14527                    1.0000    500  18378                    1.0000    500
20577                    1.0000    500  20823                    1.0000    500
22899                    1.0000    500  24490                    1.0000    500
264866                   1.0000    500  31156                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/363/363.seqs.fa -oc motifs/363 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.269 C 0.238 G 0.222 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.269 C 0.238 G 0.222 T 0.271
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 10 llr = 121 E-value = 5.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::2::122:::1:1:1
pos.-specific     C  1:1::1::1:1:6:1:
probability       G  873518318:751:9:
matrix            T  13459:571a2439:9

         bits    2.2
                 2.0          *
                 1.7          *    *
                 1.5     *    *   ***
Relative         1.3 **  **  **   ***
Entropy          1.1 ** ***  **   ***
(17.5 bits)      0.9 ** ***  ***  ***
                 0.7 ** *** *********
                 0.4 ** *************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGTGTGTTGTGGCTGT
consensus             TGT  GA  TTT
sequence               A   A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
14527                       451  8.10e-09 ATCACATTGA GGAGTGGTGTGTCTGT AGATAGCACA
20823                       210  2.72e-08 GTCACCCGCC GGTGTCTTGTGGCTGT GATGAGAACC
11500                        10  5.27e-08  GTGTTGATG GGTGTGATGTGGGTGT GATGTAGAGG
11097                       413  4.75e-07 AAAAATTAGT GGTTTGTGGTCTCTGT TTCCGTTCAT
31156                        58  5.20e-07 TATGGTGATG GTGTTGTTGTGGTTGA TCGGTGAATT
24490                       319  8.89e-07 TGTTGTGACT GGCTGGTTGTTGCTGT CGCTCCAAGA
22899                       337  1.57e-06 GCGCCTTATC GGTTTGGTTTGTCTCT CGTCCACCTG
264866                       64  2.14e-06 ACACACCAAC CTGTTGTTGTTGTTGT TTTGAATTCT
18378                       104  1.26e-05 AGACTGATAT TGGGTGAAGTGTTAGT TCCCGAATCC
20577                       330  3.33e-05 GACGTGAACA GTAGTAGACTGACTGT CCGTGTTGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14527                             8.1e-09  450_[+1]_34
20823                             2.7e-08  209_[+1]_275
11500                             5.3e-08  9_[+1]_475
11097                             4.7e-07  412_[+1]_72
31156                             5.2e-07  57_[+1]_427
24490                             8.9e-07  318_[+1]_166
22899                             1.6e-06  336_[+1]_148
264866                            2.1e-06  63_[+1]_421
18378                             1.3e-05  103_[+1]_381
20577                             3.3e-05  329_[+1]_155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=10
14527                    (  451) GGAGTGGTGTGTCTGT  1
20823                    (  210) GGTGTCTTGTGGCTGT  1
11500                    (   10) GGTGTGATGTGGGTGT  1
11097                    (  413) GGTTTGTGGTCTCTGT  1
31156                    (   58) GTGTTGTTGTGGTTGA  1
24490                    (  319) GGCTGGTTGTTGCTGT  1
22899                    (  337) GGTTTGGTTTGTCTCT  1
264866                   (   64) CTGTTGTTGTTGTTGT  1
18378                    (  104) TGGGTGAAGTGTTAGT  1
20577                    (  330) GTAGTAGACTGACTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 8.91886 E= 5.6e-001
  -997   -125    185   -143
  -997   -997    165     15
   -43   -125     43     56
  -997   -997    117     89
  -997   -997   -115    173
  -142   -125    185   -997
   -43   -997     43     89
   -43   -997   -115    137
  -997   -125    185   -143
  -997   -997   -997    188
  -997   -125    165    -44
  -142   -997    117     56
  -997    133   -115     15
  -142   -997   -997    173
  -997   -125    202   -997
  -142   -997   -997    173
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 5.6e-001
 0.000000  0.100000  0.800000  0.100000
 0.000000  0.000000  0.700000  0.300000
 0.200000  0.100000  0.300000  0.400000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.100000  0.900000
 0.100000  0.100000  0.800000  0.000000
 0.200000  0.000000  0.300000  0.500000
 0.200000  0.000000  0.100000  0.700000
 0.000000  0.100000  0.800000  0.100000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.100000  0.700000  0.200000
 0.100000  0.000000  0.500000  0.400000
 0.000000  0.600000  0.100000  0.300000
 0.100000  0.000000  0.000000  0.900000
 0.000000  0.100000  0.900000  0.000000
 0.100000  0.000000  0.000000  0.900000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[GT][TGA][GT]TG[TGA][TA]GT[GT][GT][CT]TGT
--------------------------------------------------------------------------------

Time 1.00 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 8 llr = 108 E-value = 1.4e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  14::94:a1:19:41:
pos.-specific     C  658a::a:854:a6:9
probability       G  31::15::::4:::31
matrix            T  ::3::1::1511::6:

         bits    2.2    *  *     *
                 2.0    *  **    *
                 1.7    *  **    *
                 1.5    *  **    *  *
Relative         1.3   *** **   **  *
Entropy          1.1   *** ** * *** *
(19.5 bits)      0.9 * *** **** *** *
                 0.7 ********** *****
                 0.4 ********** *****
                 0.2 ****************
                 0.0 ----------------

Multilevel           CCCCAGCACCCACCTC
consensus            GAT  A   TG  AG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
22899                       262  2.14e-08 ACTCTTCAAT CCCCAGCACTCACCAC CAGGGCATTG
20577                       479  6.06e-08 TGTCCTCCCG CACCAGCACCAACCGC CCCCTC
24490                       414  1.40e-07 AGGCGGGAAC ACCCAGCACCGACAGC CAACGCCCCG
20823                       429  2.72e-07 TTGAAACAAC CGCCGACACTGACCTC TGAGTCGACA
14527                       286  3.84e-07 AGGTGTTTCC GCCCAGCAACTACCTC GTTCCCGTCC
11500                       461  4.83e-07 CACCACTGAC GCCCATCACCGACCTG TTTATACAGC
31156                       436  8.11e-07 TTATCGCACC CATCAACATTCACATC ATCGTCAACG
11097                       436  8.69e-07 TGTTTCCGTT CATCAACACTCTCATC ACCCTCTCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22899                             2.1e-08  261_[+2]_223
20577                             6.1e-08  478_[+2]_6
24490                             1.4e-07  413_[+2]_71
20823                             2.7e-07  428_[+2]_56
14527                             3.8e-07  285_[+2]_199
11500                             4.8e-07  460_[+2]_24
31156                             8.1e-07  435_[+2]_49
11097                             8.7e-07  435_[+2]_49
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=8
22899                    (  262) CCCCAGCACTCACCAC  1
20577                    (  479) CACCAGCACCAACCGC  1
24490                    (  414) ACCCAGCACCGACAGC  1
20823                    (  429) CGCCGACACTGACCTC  1
14527                    (  286) GCCCAGCAACTACCTC  1
11500                    (  461) GCCCATCACCGACCTG  1
31156                    (  436) CATCAACATTCACATC  1
11097                    (  436) CATCAACACTCTCATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 9.24139 E= 1.4e+000
  -110    139     17   -965
    48    107    -83   -965
  -965    165   -965    -11
  -965    207   -965   -965
   170   -965    -83   -965
    48   -965    117   -111
  -965    207   -965   -965
   189   -965   -965   -965
  -110    165   -965   -111
  -965    107   -965     89
  -110     65     75   -111
   170   -965   -965   -111
  -965    207   -965   -965
    48    139   -965   -965
  -110   -965     17    121
  -965    187    -83   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 1.4e+000
 0.125000  0.625000  0.250000  0.000000
 0.375000  0.500000  0.125000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.375000  0.000000  0.500000  0.125000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.750000  0.000000  0.125000
 0.000000  0.500000  0.000000  0.500000
 0.125000  0.375000  0.375000  0.125000
 0.875000  0.000000  0.000000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.375000  0.625000  0.000000  0.000000
 0.125000  0.000000  0.250000  0.625000
 0.000000  0.875000  0.125000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CG][CA][CT]CA[GA]CAC[CT][CG]AC[CA][TG]C
--------------------------------------------------------------------------------

Time 1.92 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 18 sites = 8 llr = 114 E-value = 2.1e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  8364:31:6::99:6:a:
pos.-specific     C  ::::3::6:3:::1::::
probability       G  35358:4416a11948:8
matrix            T  :311:85:31:::::3:3

         bits    2.2           *
                 2.0           *     *
                 1.7           *     *
                 1.5           *  *  *
Relative         1.3     *     **** ***
Entropy          1.1 *   ** *  ********
(20.6 bits)      0.9 *   ** * *********
                 0.7 * ****************
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           AGAGGTTCAGGAAGAGAG
consensus            GAGACAGGTC    GT T
sequence              T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
22899                        61  1.27e-09 TAGTTAAGTG AGTGGTGCAGGAAGAGAG CTACGATACC
18378                        57  1.17e-08 CAAAATCCAA AGGAGTTGACGAAGGGAG GCAATCAGAG
11097                       238  6.43e-08 CTGTAGGGAG GGAGGAGGAGGAGGAGAG GGTTTGCAGT
14527                       377  1.49e-07 GCCGTGTGCC GTGGCTTCAGGAAGATAG CGCTTCGGCA
11500                        31  1.62e-07 GGTGTGATGT AGAGGAAGTGGAAGATAG CAAGCTGTTG
31156                       109  2.25e-07 AAATGGCGCC AAAAGTTCGCGAAGGGAT AGGGTCGACT
20577                       188  4.69e-07 ATACATATTA AAATCTTCTGGAAGGGAT CCATTTTCCC
20823                        87  8.72e-07 CTTCTTATTT ATAAGTGCATGGACAGAG TTCTCGTCCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22899                             1.3e-09  60_[+3]_422
18378                             1.2e-08  56_[+3]_426
11097                             6.4e-08  237_[+3]_245
14527                             1.5e-07  376_[+3]_106
11500                             1.6e-07  30_[+3]_452
31156                             2.3e-07  108_[+3]_374
20577                             4.7e-07  187_[+3]_295
20823                             8.7e-07  86_[+3]_396
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=18 seqs=8
22899                    (   61) AGTGGTGCAGGAAGAGAG  1
18378                    (   57) AGGAGTTGACGAAGGGAG  1
11097                    (  238) GGAGGAGGAGGAGGAGAG  1
14527                    (  377) GTGGCTTCAGGAAGATAG  1
11500                    (   31) AGAGGAAGTGGAAGATAG  1
31156                    (  109) AAAAGTTCGCGAAGGGAT  1
20577                    (  188) AAATCTTCTGGAAGGGAT  1
20823                    (   87) ATAAGTGCATGGACAGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 4830 bayes= 9.23542 E= 2.1e+000
   148   -965     17   -965
   -10   -965    117    -11
   122   -965     17   -111
    48   -965    117   -111
  -965      7    175   -965
   -10   -965   -965    147
  -110   -965     75     89
  -965    139     75   -965
   122   -965    -83    -11
  -965      7    149   -111
  -965   -965    217   -965
   170   -965    -83   -965
   170   -965    -83   -965
  -965    -93    198   -965
   122   -965     75   -965
  -965   -965    175    -11
   189   -965   -965   -965
  -965   -965    175    -11
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 8 E= 2.1e+000
 0.750000  0.000000  0.250000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.625000  0.000000  0.250000  0.125000
 0.375000  0.000000  0.500000  0.125000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.125000  0.000000  0.375000  0.500000
 0.000000  0.625000  0.375000  0.000000
 0.625000  0.000000  0.125000  0.250000
 0.000000  0.250000  0.625000  0.125000
 0.000000  0.000000  1.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.625000  0.000000  0.375000  0.000000
 0.000000  0.000000  0.750000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AG][GAT][AG][GA][GC][TA][TG][CG][AT][GC]GAAG[AG][GT]A[GT]
--------------------------------------------------------------------------------

Time 2.95 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11097                            1.14e-09  237_[+3(6.43e-08)]_157_\
    [+1(4.75e-07)]_7_[+2(8.69e-07)]_49
11500                            2.02e-10  9_[+1(5.27e-08)]_5_[+3(1.62e-07)]_\
    412_[+2(4.83e-07)]_24
14527                            2.62e-11  285_[+2(3.84e-07)]_75_\
    [+3(1.49e-07)]_56_[+1(8.10e-09)]_34
18378                            3.05e-06  56_[+3(1.17e-08)]_29_[+1(1.26e-05)]_\
    381
20577                            3.07e-08  187_[+3(4.69e-07)]_72_\
    [+2(5.71e-05)]_36_[+1(3.33e-05)]_133_[+2(6.06e-08)]_6
20823                            3.08e-10  86_[+3(8.72e-07)]_105_\
    [+1(2.72e-08)]_203_[+2(2.72e-07)]_56
22899                            2.82e-12  60_[+3(1.27e-09)]_183_\
    [+2(2.14e-08)]_59_[+1(1.57e-06)]_33_[+2(5.71e-05)]_99
24490                            4.19e-06  228_[+2(8.87e-05)]_74_\
    [+1(8.89e-07)]_79_[+2(1.40e-07)]_71
264866                           1.22e-02  63_[+1(2.14e-06)]_421
31156                            3.73e-09  57_[+1(5.20e-07)]_35_[+3(2.25e-07)]_\
    309_[+2(8.11e-07)]_49
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************