********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/450/450.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11037                    1.0000    500  21796                    1.0000    500
23991                    1.0000    500  24338                    1.0000    500
25872                    1.0000    500  263437                   1.0000    500
4813                     1.0000    500  6095                     1.0000    500
6703                     1.0000    500  7451                     1.0000    500
7613                     1.0000    500  9086                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/450/450.seqs.fa -oc motifs/450 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.260 C 0.250 G 0.248 T 0.242
Background letter frequencies (from dataset with add-one prior applied):
A 0.260 C 0.250 G 0.248 T 0.243
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =  10  llr = 151  E-value = 7.6e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::6::1:::::22:21:2:::
pos.-specific     C  392:4:161:8318::a1231
probability       G  212:161:43:41:5::71:9
matrix            T  5::a538457216239::77:

         bits    2.0    *            *
                 1.8    *            *
                 1.6  * *           **   *
                 1.4  * *           **   *
Relative         1.2  * *     **  * **  **
Entropy          1.0  * *  ** **  * **  **
(21.7 bits)      0.8  * * *** **  * ******
                 0.6 ***********  ********
                 0.4 *********** *********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TCATTGTCTTCGTCGTCGTTG
consensus            C C CT TGGTCATT  ACC
sequence             G G        A  A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
23991                       161  3.81e-12 AACTTTCGTG CCATTGTCGTCGTCGTCGTTG GGCGGACTCC
6095                        151  6.16e-10 AGAGGCAGCA CCGTCGTCGTCGTCGTCGTCG TCCAACACCA
21796                       366  4.75e-09 TCAGTCACTT CCCTTGTTGTTCTCTTCGTTG TGAGAGGAGA
4813                        401  3.02e-08 TACGAGAAGG TCGTCATCGTCATCTTCATTG CCCTCCGATA
9086                         68  3.96e-08 ATTCAACACA TCATTGCTTTCAACTTCGCTG GTGGCTTCCA
7451                        140  8.47e-08 TGAAAGCCAC GCCTCGTCTGCGCCATCGTCG TGCATTTGCA
263437                      140  3.12e-07 TAGAATCTGT TCATTTGTTGCTTCATCGGTG TACTGACGTG
6703                        381  3.55e-07 GAGTTCTCAT TCATTTTTCGCGTTGACGCTG AAGACTTCGA
11037                       370  3.55e-07 CCAGCAGTTC GCATCTTCTTCCACGTCCTCC TTGACACCAG
7613                        257  9.46e-07 TCAGTTATGG TGATGGTCTTTCGTGTCATTG CATCCCTGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23991                             3.8e-12  160_[+1]_319
6095                              6.2e-10  150_[+1]_329
21796                             4.8e-09  365_[+1]_114
4813                                3e-08  400_[+1]_79
9086                                4e-08  67_[+1]_412
7451                              8.5e-08  139_[+1]_340
263437                            3.1e-07  139_[+1]_340
6703                              3.5e-07  380_[+1]_99
11037                             3.5e-07  369_[+1]_110
7613                              9.5e-07  256_[+1]_223
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=10
23991                    (  161) CCATTGTCGTCGTCGTCGTTG  1
6095                     (  151) CCGTCGTCGTCGTCGTCGTCG  1
21796                    (  366) CCCTTGTTGTTCTCTTCGTTG  1
4813                     (  401) TCGTCATCGTCATCTTCATTG  1
9086                     (   68) TCATTGCTTTCAACTTCGCTG  1
7451                     (  140) GCCTCGTCTGCGCCATCGTCG  1
263437                   (  140) TCATTTGTTGCTTCATCGGTG  1
6703                     (  381) TCATTTTTCGCGTTGACGCTG  1
11037                    (  370) GCATCTTCTTCCACGTCCTCC  1
7613                     (  257) TGATGGTCTTTCGTGTCATTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 10.112 E= 7.6e-004
  -997     26    -31    104
  -997    185   -131   -997
   121    -32    -31   -997
  -997   -997   -997    204
  -997     68   -131    104
  -137   -997    127     31
  -997   -132   -131    172
  -997    126   -997     72
  -997   -132     69    104
  -997   -997     27    153
  -997    168   -997    -28
   -38     26     69   -128
   -38   -132   -131    131
  -997    168   -997    -28
   -38   -997    101     31
  -137   -997   -997    189
  -997    200   -997   -997
   -38   -132    150   -997
  -997    -32   -131    153
  -997     26   -997    153
  -997   -132    186   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 7.6e-004
 0.000000  0.300000  0.200000  0.500000
 0.000000  0.900000  0.100000  0.000000
 0.600000  0.200000  0.200000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.400000  0.100000  0.500000
 0.100000  0.000000  0.600000  0.300000
 0.000000  0.100000  0.100000  0.800000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.100000  0.400000  0.500000
 0.000000  0.000000  0.300000  0.700000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.300000  0.400000  0.100000
 0.200000  0.100000  0.100000  0.600000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.000000  0.500000  0.300000
 0.100000  0.000000  0.000000  0.900000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.100000  0.700000  0.000000
 0.000000  0.200000  0.100000  0.700000
 0.000000  0.300000  0.000000  0.700000
 0.000000  0.100000  0.900000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TCG]C[ACG]T[TC][GT]T[CT][TG][TG][CT][GCA][TA][CT][GTA]TC[GA][TC][TC]G
--------------------------------------------------------------------------------




Time  1.24 secs.

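The non-negative-probability entries of the Motif 1 log-odds matrix appear to equal 100 * log2(p / background), rounded to the nearest integer, where p comes from the letter-probability matrix and the background comes from the COMMAND LINE SUMMARY. The following Python sketch checks that relationship under stated assumptions: the function name `log_odds_row` is hypothetical, the background values are the three-decimal figures printed in this file (so a result may differ from the file by one unit), and zero probabilities are returned as None because the file reports a large negative value (-997 here) whose derivation is not shown.

```python
import math

# Background letter frequencies as printed in this file (rounded to 3 decimals).
BACKGROUND = {"A": 0.260, "C": 0.250, "G": 0.248, "T": 0.243}


def log_odds_row(probs, background=BACKGROUND):
    """Convert one letter-probability row (A, C, G, T order) to scaled log-odds.

    Returns None for zero probabilities; the file prints a large negative
    score there, and this sketch does not try to reproduce it.
    """
    row = []
    for letter, p in zip("ACGT", probs):
        if p == 0.0:
            row.append(None)
        else:
            row.append(round(100 * math.log2(p / background[letter])))
    return row


# First two rows of the Motif 1 letter-probability matrix.
print(log_odds_row([0.0, 0.3, 0.2, 0.5]))
print(log_odds_row([0.0, 0.9, 0.1, 0.0]))
```

For these two rows the non-None values agree with the first two rows of the log-odds matrix (26, -31, 104 and 185, -131).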
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   7  llr = 119  E-value = 2.4e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3::::4::41:::::::::
pos.-specific     C  :::1:43:::1:4:::::::
probability       G  47:34:37:361394:3a1:
matrix            T  6:a666:3a319316a7:9a

         bits    2.0   *     *      * * *
                 1.8   *     *      * * *
                 1.6   *     *      * * *
                 1.4   *     *  * * * ***
Relative         1.2  **    **  * * *****
Entropy          1.0 *** ** **  * *******
(24.4 bits)      0.8 *** ** **  * *******
                 0.6 ****** **  * *******
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           TGTTTTAGTAGTCGTTTGTT
consensus            GA GGCCT G  G G G
sequence                   G  T  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
11037                        41  4.07e-10 CGATCTTCAT GGTGTTCGTAGTCGGTTGTT TTGTTGGTGT
23991                       322  9.87e-10 GTGGTTTGTG TGTGGTGGTTGTTGTTTGTT TGTGCCGTCC
4813                        152  1.37e-08 TATGTTGGGG TGTTGTGGTAGTGTGTGGTT GTAGATGAAT
21796                        27  1.50e-08 TGGTGGTGCG GGTTTCAGTATTGGTTTGGT AGAGGAAAGG
6095                        311  2.17e-08 CGGAGCTGGC GGTCGTATTTGTCGGTGGTT GCCATCTCGT
7613                        167  2.50e-08 GACTCAAAGT TATTTCCTTGCTTGTTTGTT GTATGTTTGT
7451                         60  2.85e-08 CCAAGTAGAT TATTTCAGTGAGCGTTTGTT CACATCACAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11037                             4.1e-10  40_[+2]_440
23991                             9.9e-10  321_[+2]_159
4813                              1.4e-08  151_[+2]_329
21796                             1.5e-08  26_[+2]_454
6095                              2.2e-08  310_[+2]_170
7613                              2.5e-08  166_[+2]_314
7451                              2.9e-08  59_[+2]_421
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=7
11037                    (   41) GGTGTTCGTAGTCGGTTGTT  1
23991                    (  322) TGTGGTGGTTGTTGTTTGTT  1
4813                     (  152) TGTTGTGGTAGTGTGTGGTT  1
21796                    (   27) GGTTTCAGTATTGGTTTGGT  1
6095                     (  311) GGTCGTATTTGTCGGTGGTT  1
7613                     (  167) TATTTCCTTGCTTGTTTGTT  1
7451                     (   60) TATTTCAGTGAGCGTTTGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5772 bayes= 9.52943 E= 2.4e-002
  -945   -945     79    124
    14   -945    152   -945
  -945   -945   -945    204
  -945    -80     20    124
  -945   -945     79    124
  -945     78   -945    124
    72     19     20   -945
  -945   -945    152     24
  -945   -945   -945    204
    72   -945     20     24
   -86    -80    120    -76
  -945   -945    -79    182
  -945     78     20     24
  -945   -945    179    -76
  -945   -945     79    124
  -945   -945   -945    204
  -945   -945     20    156
  -945   -945    201   -945
  -945   -945    -79    182
  -945   -945   -945    204
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 2.4e-002
 0.000000  0.000000  0.428571  0.571429
 0.285714  0.000000  0.714286  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.142857  0.285714  0.571429
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.428571  0.000000  0.571429
 0.428571  0.285714  0.285714  0.000000
 0.000000  0.000000  0.714286  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.428571  0.000000  0.285714  0.285714
 0.142857  0.142857  0.571429  0.142857
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.428571  0.285714  0.285714
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TG][GA]T[TG][TG][TC][ACG][GT]T[AGT]GT[CGT]G[TG]T[TG]GTT
--------------------------------------------------------------------------------




Time  2.44 secs.

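The Motif 2 regular expression is a simplification of the probability matrix: it appears to keep only the more frequent letters at each position, so a site that carries a rare letter will not match it. The following Python sketch illustrates this with the seven sites from the BLOCKS table. The names `MOTIF2_RE`, `SITES` and `matching_sites` are hypothetical and exist only for this example.

```python
import re

# Regular expression copied from the "Motif 2 regular expression" block.
MOTIF2_RE = re.compile(
    r"[TG][GA]T[TG][TG][TC][ACG][GT]T[AGT]GT[CGT]G[TG]T[TG]GTT"
)

# Sites copied from the "Motif 2 in BLOCKS format" table (name -> site).
SITES = {
    "11037": "GGTGTTCGTAGTCGGTTGTT",
    "23991": "TGTGGTGGTTGTTGTTTGTT",
    "4813": "TGTTGTGGTAGTGTGTGGTT",
    "21796": "GGTTTCAGTATTGGTTTGGT",
    "6095": "GGTCGTATTTGTCGGTGGTT",
    "7613": "TATTTCCTTGCTTGTTTGTT",
    "7451": "TATTTCAGTGAGCGTTTGTT",
}


def matching_sites(pattern, sites):
    """Return the names of the sites that the pattern matches in full."""
    return [name for name, seq in sites.items() if pattern.fullmatch(seq)]


print(matching_sites(MOTIF2_RE, SITES))
```

Only the two best-scoring sites (11037 and 23991) match in full. Site 7451, for example, has A at position 11, where the expression allows only G. A search that relies on the regular expression alone would therefore miss sites that the scoring matrix accepts.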
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =  11  llr = 134  E-value = 1.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  15:12:6:2:2:31:7
pos.-specific     C  1::1::::::8:22::
probability       G  6:9579458a:7519:
matrix            T  251311:5:::3:613

         bits    2.0          *
                 1.8          *
                 1.6   *  *   *    *
                 1.4   *  *   *    *
Relative         1.2   *  *  ****  **
Entropy          1.0  **  *******  **
(17.6 bits)      0.8  ** ********  **
                 0.6 *** ************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTGGGGAGGGCGGTGA
consensus             A T  GT   TA  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
24338                         6  1.86e-08      CGGAG GAGTGGAGGGCTGTGA GAACACTGGA
6703                         53  2.80e-08 AAATGATGAA GAGCGGAGGGCGGTGA TGCACCATCC
7451                        257  1.06e-07 TGTTTGAGTG GTGGTGATGGCGATGA GAACATTCAA
4813                         80  1.34e-07 AGATTATAGA GAGTGGATGGCGGAGA GGAGGTGAAA
263437                      367  5.19e-07 GTTCAAAAAA GAGAGGGGGGAGGTGA CGTGGCGGTG
7613                         86  6.32e-07 TGGGGATGTC ATGGGGGGGGCTGTGT TGTGGTGGCA
23991                       279  8.28e-07 GATGGAGTTT TTGTGGGTGGCGCTGT CCAGCTCATC
21796                       105  1.91e-06 GGTAGGCTGT TTGGGTGTGGCGGCGA CGTTGTTGTC
9086                        136  6.18e-06 ACAAGATAGA GAGGGGAGAGCGAGTA CGTTCGGTCT
6095                          4  9.33e-06        ATC GTTGAGATGGAGCTGT CGGCTGATGA
25872                       481  1.28e-05 GCAACAATTA CTGGAGAGAGCTACGA ATGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24338                             1.9e-08  5_[+3]_479
6703                              2.8e-08  52_[+3]_432
7451                              1.1e-07  256_[+3]_228
4813                              1.3e-07  79_[+3]_405
263437                            5.2e-07  366_[+3]_118
7613                              6.3e-07  85_[+3]_399
23991                             8.3e-07  278_[+3]_206
21796                             1.9e-06  104_[+3]_380
9086                              6.2e-06  135_[+3]_349
6095                              9.3e-06  3_[+3]_481
25872                             1.3e-05  480_[+3]_4
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=11
24338                    (    6) GAGTGGAGGGCTGTGA  1
6703                     (   53) GAGCGGAGGGCGGTGA  1
7451                     (  257) GTGGTGATGGCGATGA  1
4813                     (   80) GAGTGGATGGCGGAGA  1
263437                   (  367) GAGAGGGGGGAGGTGA  1
7613                     (   86) ATGGGGGGGGCTGTGT  1
23991                    (  279) TTGTGGGTGGCGCTGT  1
21796                    (  105) TTGGGTGTGGCGGCGA  1
9086                     (  136) GAGGGGAGAGCGAGTA  1
6095                     (    4) GTTGAGATGGAGCTGT  1
25872                    (  481) CTGGAGAGAGCTACGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 10.0725 E= 1.6e-001
  -151   -146    136    -42
    81  -1010  -1010    117
 -1010  -1010    187   -141
  -151   -146    114     17
   -51  -1010    155   -141
 -1010  -1010    187   -141
   129  -1010     55  -1010
 -1010  -1010    114     91
   -51  -1010    172  -1010
 -1010  -1010    201  -1010
   -51    171  -1010  -1010
 -1010  -1010    155     17
     7    -46    114  -1010
  -151    -46   -145    139
 -1010  -1010    187   -141
   149  -1010  -1010     17
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 1.6e-001
 0.090909  0.090909  0.636364  0.181818
 0.454545  0.000000  0.000000  0.545455
 0.000000  0.000000  0.909091  0.090909
 0.090909  0.090909  0.545455  0.272727
 0.181818  0.000000  0.727273  0.090909
 0.000000  0.000000  0.909091  0.090909
 0.636364  0.000000  0.363636  0.000000
 0.000000  0.000000  0.545455  0.454545
 0.181818  0.000000  0.818182  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.181818  0.818182  0.000000  0.000000
 0.000000  0.000000  0.727273  0.272727
 0.272727  0.181818  0.545455  0.000000
 0.090909  0.181818  0.090909  0.636364
 0.000000  0.000000  0.909091  0.090909
 0.727273  0.000000  0.000000  0.272727
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[TA]G[GT]GG[AG][GT]GGC[GT][GA]TG[AT]
--------------------------------------------------------------------------------




Time  3.55 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11037                            8.58e-09  40_[+2(4.07e-10)]_309_\
    [+1(3.55e-07)]_110
21796                            8.21e-12  26_[+2(1.50e-08)]_58_[+3(1.91e-06)]_\
    67_[+2(9.35e-05)]_158_[+1(4.75e-09)]_114
23991                            3.30e-16  160_[+1(3.81e-12)]_97_\
    [+3(8.28e-07)]_13_[+2(3.73e-06)]_6_[+1(3.91e-05)]_146
24338                            4.39e-04  5_[+3(1.86e-08)]_479
25872                            4.42e-02  480_[+3(1.28e-05)]_4
263437                           1.21e-06  139_[+1(3.12e-07)]_206_\
    [+3(5.19e-07)]_118
4813                             3.52e-12  79_[+3(1.34e-07)]_56_[+2(1.37e-08)]_\
    166_[+1(1.34e-06)]_42_[+1(3.02e-08)]_34_[+1(5.42e-05)]_24
6095                             7.57e-12  3_[+3(9.33e-06)]_131_[+1(6.16e-10)]_\
    139_[+2(2.17e-08)]_170
6703                             1.89e-07  52_[+3(2.80e-08)]_312_\
    [+1(3.55e-07)]_99
7451                             1.49e-11  59_[+2(2.85e-08)]_60_[+1(8.47e-08)]_\
    96_[+3(1.06e-07)]_228
7613                             6.62e-10  85_[+3(6.32e-07)]_65_[+2(2.50e-08)]_\
    70_[+1(9.46e-07)]_223
9086                             1.76e-06  67_[+1(3.96e-08)]_47_[+3(6.18e-06)]_\
    349
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
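Each combined motif diagram alternates gap lengths with bracketed hits of the form [strand motif (p-value)], joined by underscores. The following Python sketch turns one diagram into 1-based start positions. It assumes that a diagram wrapped with a trailing backslash has already been joined into one string, and it takes the motif widths from the MOTIF header lines of this file. The names `WIDTHS` and `parse_diagram` are hypothetical.

```python
import re

# Motif widths as reported in the MOTIF header lines of this file.
WIDTHS = {1: 21, 2: 20, 3: 16}

# One bracketed hit, e.g. "[+1(3.81e-12)]": strand, motif number, p-value.
HIT_RE = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram, widths):
    """Return (hits, total_length) for one combined motif diagram.

    Each hit is (motif, strand, start, p_value) with a 1-based start.
    """
    hits = []
    offset = 0  # number of sequence positions consumed so far
    for token in diagram.split("_"):
        if token.isdigit():
            offset += int(token)  # plain number: a gap between hits
            continue
        match = HIT_RE.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised token: {token!r}")
        motif = int(match.group(2))
        hits.append((motif, match.group(1), offset + 1, float(match.group(3))))
        offset += widths[motif]
    return hits, offset


# Diagram for sequence 23991, with the backslash continuation joined.
hits, total = parse_diagram(
    "160_[+1(3.81e-12)]_97_[+3(8.28e-07)]_13_[+2(3.73e-06)]_6_[+1(3.91e-05)]_146",
    WIDTHS,
)
for hit in hits:
    print(hit)
print(total)
```

For sequence 23991 the gaps and motif widths add up to 500, the sequence length listed in the TRAINING SET table, and the first two hits start at 161 and 279, in agreement with the per-motif site tables.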