********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/341/341.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10023                    1.0000    500  10100                    1.0000    500
10529                    1.0000    500  10537                    1.0000    500
10781                    1.0000    500  11408                    1.0000    500
11648                    1.0000    500  11728                    1.0000    500
11790                    1.0000    500  1960                     1.0000    500
24577                    1.0000    500  3030                     1.0000    500
3608                     1.0000    500  4845                     1.0000    500
8009                     1.0000    500  8243                     1.0000    500
8617                     1.0000    500  9154                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/341/341.seqs.fa -oc motifs/341 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 18 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 9000 N= 18 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.281 C 0.240 G 0.219 T 0.260 Background letter frequencies (from dataset with add-one prior applied): A 0.281 C 0.240 G 0.219 T 0.260 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 12 sites = 12 llr = 125 E-value = 7.4e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 1:::88:3:12: pos.-specific C :41331:3:1:: probability G :3:8::61a::a matrix T 939::144:88: bits 2.2 * * 2.0 * * 1.8 * * 1.5 * * * * Relative 1.3 * ** * ** Entropy 1.1 * ***** **** (15.1 bits) 0.9 * ***** **** 0.7 * ***** **** 0.4 ******* **** 0.2 ************ 0.0 ------------ Multilevel TCTGAAGTGTTG consensus T CC TA sequence G C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 24577 253 3.20e-07 GATGGTGGGA TCTGAAGAGTTG GAAAGGTTGT 3030 20 1.93e-06 ACCACTTTGC TTTGCATTGTTG ACATATTGAC 4845 210 2.89e-06 
CCATCTCGTC TCTCCAGTGTTG ACATTGTACC 11648 99 4.43e-06 GAGTACGATG TCTCCAGCGTTG GAGCGACAGA 1960 90 5.13e-06 TTGTGATGGT TGTGATGTGTTG CTTACGCTTC 10023 28 5.65e-06 GAAATCGGCC TGCGAAGTGTTG ATCTGTGCGA 8243 265 6.08e-06 GCAAGAAGAG TCTGAATAGTAG TGCAACAGAT 10100 216 7.94e-06 TCGTCTCATT TTTGAATTGCTG GGCTCTATTT 8009 188 8.83e-06 ACAGAGCAGC TTTGAAGCGATG ACGCTACCGT 3608 26 1.19e-05 GGGGCAGTGC TTTCAATGGTTG TATTGCCTTC 11790 171 1.19e-05 TTGGGAGAAG AGTGAAGAGTTG TTTGAAGGAT 11728 301 3.18e-05 ACGGTCTACC TCTGACTCGTAG GAGTCCCATT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 24577 3.2e-07 252_[+1]_236 3030 1.9e-06 19_[+1]_469 4845 2.9e-06 209_[+1]_279 11648 4.4e-06 98_[+1]_390 1960 5.1e-06 89_[+1]_399 10023 5.7e-06 27_[+1]_461 8243 6.1e-06 264_[+1]_224 10100 7.9e-06 215_[+1]_273 8009 8.8e-06 187_[+1]_301 3608 1.2e-05 25_[+1]_463 11790 1.2e-05 170_[+1]_318 11728 3.2e-05 300_[+1]_188 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=12 seqs=12 24577 ( 253) TCTGAAGAGTTG 1 3030 ( 20) TTTGCATTGTTG 1 4845 ( 210) TCTCCAGTGTTG 1 11648 ( 99) TCTCCAGCGTTG 1 1960 ( 90) TGTGATGTGTTG 1 10023 ( 28) TGCGAAGTGTTG 1 8243 ( 265) TCTGAATAGTAG 1 10100 ( 216) TTTGAATTGCTG 1 8009 ( 188) TTTGAAGCGATG 1 3608 ( 26) TTTCAATGGTTG 1 11790 ( 171) AGTGAAGAGTTG 1 11728 ( 301) TCTGACTCGTAG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific 
scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 9.17512 E= 7.4e+001
  -175  -1023  -1023    182
 -1023     79     19     36
 -1023   -152  -1023    182
 -1023      6    178  -1023
   142      6  -1023  -1023
   157   -152  -1023   -164
 -1023  -1023    141     68
   -17      6   -139     68
 -1023  -1023    219  -1023
  -175   -152  -1023    168
   -75  -1023  -1023    168
 -1023  -1023    219  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 7.4e+001
 0.083333  0.000000  0.000000  0.916667
 0.000000  0.416667  0.250000  0.333333
 0.000000  0.083333  0.000000  0.916667
 0.000000  0.250000  0.750000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.833333  0.083333  0.000000  0.083333
 0.000000  0.000000  0.583333  0.416667
 0.250000  0.250000  0.083333  0.416667
 0.000000  0.000000  1.000000  0.000000
 0.083333  0.083333  0.000000  0.833333
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
T[CTG]T[GC][AC]A[GT][TAC]GTTG
--------------------------------------------------------------------------------

Time  3.00 secs.

******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 9 llr = 116 E-value = 1.1e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::2::2211::42:4: pos.-specific C ::1::::11:21112: probability G a:7::8:311816819 matrix T :a:aa:8479:31121 bits 2.2 * 2.0 ** ** 1.8 ** ** * 1.5 ** ** * * Relative 1.3 ** *** ** * Entropy 1.1 ** **** ** * * (18.6 bits) 0.9 ******* ** * * 0.7 ******* ** * * 0.4 ******* *** ** * 0.2 **************** 0.0 ---------------- Multilevel GTGTTGTTTTGAGGAG consensus A AAG CTA C sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 8009 1 4.24e-10 . GTGTTGTTTTGTGGAG AAGTGGGAGG 10781 307 1.50e-08 CATTAGGGCT GTGTTATTTTGAGGCG GCATCCGGAG 10529 176 8.93e-08 GTTTTGTTCG GTGTTGTGTTGAGTGG AGACAGGTGC 4845 101 5.77e-07 GTCCTCGATA GTATTGTGGTGTCGAG TAACCATGAT 11790 3 1.22e-06 TA GTGTTGACTGGGGGAG GAGAGGTGGG 1960 46 1.52e-06 TGGAGTTATA GTGTTATTTTCTGGTT ACAGTCTTGG 3608 1 1.87e-06 . 
GTCTTGTGATGCAGAG GGGCAGTGCT 11728 42 2.01e-06 GATGTTCTCG GTGTTGTACTCAAGTG TACCACAACA 10100 115 3.64e-06 GTTGGTGTTG GTATTGATTTGATCCG TCATGTCGCA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 8009 4.2e-10 [+2]_484 10781 1.5e-08 306_[+2]_178 10529 8.9e-08 175_[+2]_309 4845 5.8e-07 100_[+2]_384 11790 1.2e-06 2_[+2]_482 1960 1.5e-06 45_[+2]_439 3608 1.9e-06 [+2]_484 11728 2e-06 41_[+2]_443 10100 3.6e-06 114_[+2]_370 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=16 seqs=9 8009 ( 1) GTGTTGTTTTGTGGAG 1 10781 ( 307) GTGTTATTTTGAGGCG 1 10529 ( 176) GTGTTGTGTTGAGTGG 1 4845 ( 101) GTATTGTGGTGTCGAG 1 11790 ( 3) GTGTTGACTGGGGGAG 1 1960 ( 46) GTGTTATTTTCTGGTT 1 3608 ( 1) GTCTTGTGATGCAGAG 1 11728 ( 42) GTGTTGTACTCAAGTG 1 10100 ( 115) GTATTGATTTGATCCG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 10.769 E= 1.1e+002 -982 -982 219 -982 -982 -982 -982 194 -34 -111 160 -982 -982 -982 -982 194 -982 -982 -982 194 -34 -982 183 -982 -34 -982 -982 158 -134 -111 61 77 -134 -111 -98 136 -982 -982 -98 177 -982 -11 183 -982 66 -111 -98 36 -34 -111 134 -122 -982 -111 183 -122 66 -11 -98 -22 -982 -982 202 -122 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 1.1e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.222222  0.111111  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.222222  0.000000  0.777778  0.000000
 0.222222  0.000000  0.000000  0.777778
 0.111111  0.111111  0.333333  0.444444
 0.111111  0.111111  0.111111  0.666667
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.222222  0.777778  0.000000
 0.444444  0.111111  0.111111  0.333333
 0.222222  0.111111  0.555556  0.111111
 0.000000  0.111111  0.777778  0.111111
 0.444444  0.222222  0.111111  0.222222
 0.000000  0.000000  0.888889  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
GT[GA]TT[GA][TA][TG]TT[GC][AT][GA]G[ACT]G
--------------------------------------------------------------------------------

Time  5.95 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 12 sites = 18 llr = 158 E-value = 1.3e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 86351126:69: pos.-specific C :46:938:a2:8 probability G ::12:1:1:21: matrix T 2::3:6:4:::2 bits 2.2 2.0 * 1.8 * 1.5 * * Relative 1.3 * * * ** Entropy 1.1 * * * * ** (12.7 bits) 0.9 ** * * * ** 0.7 *** * ****** 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel AACACTCACAAC consensus CAT C T G sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 10100 442 3.64e-07 CTCCTCACTC ACCACTCTCAAC TCTCACAATA 11648 355 4.58e-07 ATCATCCTGT AACTCTCACAAC TTCATCCCAC 10537 301 7.80e-07 CAGCAGCAAC AACACCCACAAC TGTATTAGCA 3608 363 5.33e-06 ACAGTAGTGC AAAGCTCTCAAC TCGGTAGTCA 10529 13 5.33e-06 AATCAACAGT AACACTCACAAT AACGCCACCC 3030 317 1.00e-05 TCACGTGACC ACAGCCCACAAC CACTCAAAGT 24577 482 1.63e-05 TCTTGCCATC ACCTCACTCAAC GCTTCCG 11790 488 1.82e-05 TCTTCTTTCA ACCTCTCGCAAC C 4845 454 2.95e-05 GAGCAGTTGG TCCACTCTCCAC ACAACCACCT 11408 260 3.91e-05 ATCCTTACCG AAAGCTCACAGC TATATTGAAA 8617 260 7.09e-05 ACCAGGGGTG ACGGCTCACAAT CAATCTAAAA 11728 272 7.09e-05 TACACAAGCA ACAACTCTCGGC CCATAACACG 8243 232 8.28e-05 ATGACACACA AAAACCAACGAC GAGAACAATG 1960 488 8.28e-05 AAGTTGCACA TCCTCCCTCCAC C 9154 2 1.26e-04 C AACAACAACAAC AACAACAACA 10023 242 1.34e-04 CTACCGGAAA AACAATCACGAT 
GATCATCGCA 10781 242 1.95e-04 GCATGCCGAG AAGTCGCTCGAC TAGCTGGTTA 8009 386 3.28e-04 TGTTACACTC TACACAAACCAC CGATACTTCA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 10100 3.6e-07 441_[+3]_47 11648 4.6e-07 354_[+3]_134 10537 7.8e-07 300_[+3]_188 3608 5.3e-06 362_[+3]_126 10529 5.3e-06 12_[+3]_476 3030 1e-05 316_[+3]_172 24577 1.6e-05 481_[+3]_7 11790 1.8e-05 487_[+3]_1 4845 3e-05 453_[+3]_35 11408 3.9e-05 259_[+3]_229 8617 7.1e-05 259_[+3]_229 11728 7.1e-05 271_[+3]_217 8243 8.3e-05 231_[+3]_257 1960 8.3e-05 487_[+3]_1 9154 0.00013 1_[+3]_487 10023 0.00013 241_[+3]_247 10781 0.00019 241_[+3]_247 8009 0.00033 385_[+3]_103 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=12 seqs=18 10100 ( 442) ACCACTCTCAAC 1 11648 ( 355) AACTCTCACAAC 1 10537 ( 301) AACACCCACAAC 1 3608 ( 363) AAAGCTCTCAAC 1 10529 ( 13) AACACTCACAAT 1 3030 ( 317) ACAGCCCACAAC 1 24577 ( 482) ACCTCACTCAAC 1 11790 ( 488) ACCTCTCGCAAC 1 4845 ( 454) TCCACTCTCCAC 1 11408 ( 260) AAAGCTCACAGC 1 8617 ( 260) ACGGCTCACAAT 1 11728 ( 272) ACAACTCTCGGC 1 8243 ( 232) AAAACCAACGAC 1 1960 ( 488) TCCTCCCTCCAC 1 9154 ( 2) AACAACAACAAC 1 10023 ( 242) AACAATCACGAT 1 10781 ( 242) AAGTCGCTCGAC 1 8009 ( 386) TACACAAACCAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix 
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 9.0653 E= 1.3e+002
   157  -1081  -1081    -64
    98     89  -1081  -1081
    -2    135    -98  -1081
    83  -1081      2     10
  -134    189  -1081  -1081
  -134     21   -198    110
   -75    179  -1081  -1081
    98  -1081   -198     58
 -1081    206  -1081  -1081
   112    -53      2  -1081
   166  -1081    -98  -1081
 -1081    179  -1081    -64
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 18 E= 1.3e+002
 0.833333  0.000000  0.000000  0.166667
 0.555556  0.444444  0.000000  0.000000
 0.277778  0.611111  0.111111  0.000000
 0.500000  0.000000  0.222222  0.277778
 0.111111  0.888889  0.000000  0.000000
 0.111111  0.277778  0.055556  0.555556
 0.166667  0.833333  0.000000  0.000000
 0.555556  0.000000  0.055556  0.388889
 0.000000  1.000000  0.000000  0.000000
 0.611111  0.166667  0.222222  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.000000  0.833333  0.000000  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
A[AC][CA][ATG]C[TC]C[AT]C[AG]AC
--------------------------------------------------------------------------------

Time  8.86 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10023                            7.01e-04  27_[+1(5.65e-06)]_461
10100                            2.82e-07  114_[+2(3.64e-06)]_85_\
    [+1(7.94e-06)]_214_[+3(3.64e-07)]_47
10529                            1.43e-05  12_[+3(5.33e-06)]_151_\
    [+2(8.93e-08)]_309
10537                            1.31e-02  300_[+3(7.80e-07)]_188
10781                            4.94e-06  31_[+2(1.32e-06)]_17_[+1(9.27e-05)]_\
    47_[+2(1.32e-06)]_17_[+1(9.27e-05)]_47_[+2(1.32e-06)]_75_[+2(1.50e-08)]_\
    178
11408                            1.72e-01  259_[+3(3.91e-05)]_229
11648                            1.21e-05  98_[+1(4.43e-06)]_244_\
    [+3(4.58e-07)]_134
11728                            6.17e-05  41_[+2(2.01e-06)]_214_\
    [+3(7.09e-05)]_17_[+1(3.18e-05)]_188
11790                            5.13e-06  2_[+2(1.22e-06)]_152_[+1(1.19e-05)]_\
    305_[+3(1.82e-05)]_1
1960                             1.12e-05  45_[+2(1.52e-06)]_28_[+1(5.13e-06)]_\
    386_[+3(8.28e-05)]_1
24577                            8.00e-05  252_[+1(3.20e-07)]_78_\
    [+1(7.86e-05)]_127_[+3(1.63e-05)]_7
3030                             2.38e-04  19_[+1(1.93e-06)]_285_\
    [+3(1.00e-05)]_137_[+3(3.57e-05)]_23
3608                             2.52e-06  [+2(1.87e-06)]_9_[+1(1.19e-05)]_325_\
    [+3(5.33e-06)]_126
4845                             1.14e-06  100_[+2(5.77e-07)]_93_\
    [+1(2.89e-06)]_232_[+3(2.95e-05)]_35
8009                             3.71e-08  [+2(4.24e-10)]_171_[+1(8.83e-06)]_\
    26_[+2(6.60e-05)]_259
8243                             5.19e-03  231_[+3(8.28e-05)]_21_\
    [+1(6.08e-06)]_224
8617                             2.50e-01  259_[+3(7.09e-05)]_229
9154                             3.52e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
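
The log-odds matrices in this report appear to be derived from the letter-probability matrices and the background letter frequencies as 100 * log2(p / f), rounded to an integer. The sketch below checks that reading against a few rows of Motif 1. The function name `log_odds_row`, the scale factor of 100, and the use of the 3-digit rounded background frequencies printed above are assumptions made for illustration, not part of MEME itself. Entries with zero probability are returned as None here, whereas the report prints a large negative value (-1023 for Motif 1).

```python
import math

# Background letter frequencies as printed in the report (rounded to 3 digits).
BACKGROUND = {"A": 0.281, "C": 0.240, "G": 0.219, "T": 0.260}
ALPHABET = "ACGT"


def log_odds_row(probs, background=BACKGROUND, scale=100):
    """Convert one letter-probability row into scaled log2-odds scores.

    Hypothetical helper: zero probabilities have no finite log-odds, so
    None is returned for them; the report prints a large negative value
    in those positions instead.
    """
    scores = []
    for letter, p in zip(ALPHABET, probs):
        if p == 0.0:
            scores.append(None)
        else:
            scores.append(round(scale * math.log2(p / background[letter])))
    return scores


if __name__ == "__main__":
    # Rows 1, 4 and 9 of the Motif 1 letter-probability matrix.
    for row in ([0.083333, 0.0, 0.0, 0.916667],
                [0.0, 0.25, 0.75, 0.0],
                [0.0, 0.0, 1.0, 0.0]):
        print(log_odds_row(row))
```

Because the background frequencies are only printed to three decimal places, a recomputed score can differ from the reported one by one unit (for example, the C entry in row 2 of Motif 1).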