********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/16/16.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11214                    1.0000    500  11460                    1.0000    500
18078                    1.0000    500  20194                    1.0000    500
21261                    1.0000    500  21348                    1.0000    500
21604                    1.0000    500  22076                    1.0000    500
22214                    1.0000    500  22546                    1.0000    500
2371                     1.0000    500  24023                    1.0000    500
25610                    1.0000    500  25891                    1.0000    500
262116                   1.0000    500  263935                   1.0000    500
264039                   1.0000    500  26530                    1.0000    500
268343                   1.0000    500  268592                   1.0000    500
268669                   1.0000    500  270213                   1.0000    500
31930                    1.0000    500  33701                    1.0000    500
34551                    1.0000    500  3883                     1.0000    500
4737                     1.0000    500  5945                     1.0000    500
5959                     1.0000    500  687                      1.0000    500
6988                     1.0000    500  7679                     1.0000    500
7740                     1.0000    500  8014                     1.0000    500
9720                     1.0000    500  bd1857                   1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/16/16.seqs.fa -oc motifs/16 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 36 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 18000 N= 36 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.259 C 0.233 G 0.240 T 0.269 Background letter frequencies (from dataset with add-one prior applied): A 0.259 C 0.233 G 0.240 T 0.269 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 27 llr = 283 E-value = 8.3e-011 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :643a5:8:9422651 pos.-specific C :21:::::11231323 probability G a337:5a19:327126 matrix T ::1::::1::13::1: bits 2.1 * 1.9 * * * 1.7 * * * 1.5 * * * ** Relative 1.3 * * * ** Entropy 1.1 * ****** (15.1 bits) 0.8 * ******* * * 0.6 ** ******* ** * 0.4 ** ******* ** * 0.2 *********** **** 0.0 ---------------- Multilevel GAAGAGGAGAATGAAG consensus GGA A GC CCC sequence A G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 25610 64 3.47e-08 TTGTGTATGA GAGGAGGAGAAGGAGG TCAAAGGAGA 21604 49 3.47e-08 GTTTGTGATG 
GAGGAGGAGAAGGAGG ACCGAGATTT 6988 218 3.14e-07 TGAGAGGACC GAAGAGGAGAAGGCTG CCATTAAGGC 2371 39 5.76e-07 AGTGAGAGAT GGCGAAGAGAACGACG CAACAATGAC 11460 62 7.72e-07 AGGCAACGTA GAAGAGGAGAGTGGGG GTTCGCGTGG 7740 306 8.83e-07 CTAAAAATAC GAGGAGGAGAAGCAAC CTCCCCGTTT 22076 196 1.47e-06 GCGATAGGTA GATGAAGACAGTGAAG AAATTTGTAG 270213 239 1.66e-06 ATAAAGACGG GAAAAGGAGCAAGACG AGTGGTGGAA 268343 174 1.66e-06 ATAAAGACGG GAAAAGGAGCAAGACG AGTGGTGGAA 8014 102 2.66e-06 TGGATAGGAA GATGAGGAGAGTGGCG TGAATGTTTG 31930 207 2.66e-06 CCTCTGACGC GGTGAGGAGAACACAG TCAGCACAAC 5959 75 4.07e-06 TCATGCGGAA GGAGAAGAGATCAAAC GTTGGTTTGG 5945 34 6.69e-06 GCTGCTTAGC GGGTAAGAGAGGGAAG CTATGGGGGT 33701 387 7.34e-06 CTGAAGTGGC GGGAAAGAGACCACAG CCTAACAAAA 26530 87 8.05e-06 GAGCATCACC GGAGAAGAGACCCCAC AAGGCGCCTC 7679 170 1.14e-05 TGGAGCAGTA GGAGAGGTCAATGAAC TGGAAGCACT 34551 170 1.14e-05 CCTTACTTAG GCGAAAGAGATTGAAA ACCTTTTGAA 22546 169 1.14e-05 CCTTACTTAG GCGAAAGAGATTGAAA ACCTTTTGAA 3883 152 1.24e-05 AAGTAGTGCG GACGAAGAGACAAACC TCAACCATAA 22214 250 1.24e-05 TGATTTTTGG GAGGAGGAGCTAGCTG TTTGCAGCTC 262116 181 1.46e-05 GCCATCGTCT GCAGAAGAGCAGGGGG CGCAAGAAAA 687 340 1.58e-05 GTTTGTCACA GATAAAGTGAGAGCAG TAGAGTTGTT 9720 306 2.14e-05 GGCGGCTAGT GCAGAGGTGACTGCTG CGATGGAGAG 25891 281 2.14e-05 CAGAGAACAC GACGAAGGGAACAAAC GCCTCGACTA 11214 431 2.66e-05 CTGTGGCGAA GAGGAGGGCAGTGAGC AGGAGTTGTG 20194 93 3.94e-05 TATCAAATTT ACAAAAGAGAGAGACG TGATCAAAAA 264039 338 6.29e-05 CCTAACTGAG GAAAAGGACACCCCAA GCTCGTCCAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 25610 3.5e-08 63_[+1]_421 21604 3.5e-08 48_[+1]_436 6988 3.1e-07 217_[+1]_267 2371 5.8e-07 38_[+1]_446 11460 7.7e-07 61_[+1]_423 7740 8.8e-07 305_[+1]_179 22076 1.5e-06 
195_[+1]_289 270213 1.7e-06 238_[+1]_246 268343 1.7e-06 173_[+1]_311 8014 2.7e-06 101_[+1]_383 31930 2.7e-06 206_[+1]_278 5959 4.1e-06 74_[+1]_410 5945 6.7e-06 33_[+1]_451 33701 7.3e-06 386_[+1]_98 26530 8e-06 86_[+1]_398 7679 1.1e-05 169_[+1]_315 34551 1.1e-05 169_[+1]_315 22546 1.1e-05 168_[+1]_316 3883 1.2e-05 151_[+1]_333 22214 1.2e-05 249_[+1]_235 262116 1.5e-05 180_[+1]_304 687 1.6e-05 339_[+1]_145 9720 2.1e-05 305_[+1]_179 25891 2.1e-05 280_[+1]_204 11214 2.7e-05 430_[+1]_54 20194 3.9e-05 92_[+1]_392 264039 6.3e-05 337_[+1]_147 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=27 25610 ( 64) GAGGAGGAGAAGGAGG 1 21604 ( 49) GAGGAGGAGAAGGAGG 1 6988 ( 218) GAAGAGGAGAAGGCTG 1 2371 ( 39) GGCGAAGAGAACGACG 1 11460 ( 62) GAAGAGGAGAGTGGGG 1 7740 ( 306) GAGGAGGAGAAGCAAC 1 22076 ( 196) GATGAAGACAGTGAAG 1 270213 ( 239) GAAAAGGAGCAAGACG 1 268343 ( 174) GAAAAGGAGCAAGACG 1 8014 ( 102) GATGAGGAGAGTGGCG 1 31930 ( 207) GGTGAGGAGAACACAG 1 5959 ( 75) GGAGAAGAGATCAAAC 1 5945 ( 34) GGGTAAGAGAGGGAAG 1 33701 ( 387) GGGAAAGAGACCACAG 1 26530 ( 87) GGAGAAGAGACCCCAC 1 7679 ( 170) GGAGAGGTCAATGAAC 1 34551 ( 170) GCGAAAGAGATTGAAA 1 22546 ( 169) GCGAAAGAGATTGAAA 1 3883 ( 152) GACGAAGAGACAAACC 1 22214 ( 250) GAGGAGGAGCTAGCTG 1 262116 ( 181) GCAGAAGAGCAGGGGG 1 687 ( 340) GATAAAGTGAGAGCAG 1 9720 ( 306) GCAGAGGTGACTGCTG 1 25891 ( 281) GACGAAGGGAACAAAC 1 11214 ( 431) GAGGAGGGCAGTGAGC 1 20194 ( 93) ACAAAAGAGAGAGACG 1 264039 ( 338) GAAAAGGACACCCCAA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 
4 w= 16 n= 17460 bayes= 9.90439 E= 8.3e-011 -280 -1140 201 -1140 110 -33 11 -1140 65 -107 48 -86 20 -1140 148 -286 195 -1140 -1140 -1140 90 -1140 111 -1140 -1140 -1140 206 -1140 165 -1140 -169 -127 -1140 -65 183 -1140 172 -65 -1140 -1140 65 -33 11 -86 -22 15 -11 14 -48 -107 155 -1140 120 35 -111 -1140 90 -7 -37 -127 -122 15 139 -1140 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 27 E= 8.3e-011 0.037037 0.000000 0.962963 0.000000 0.555556 0.185185 0.259259 0.000000 0.407407 0.111111 0.333333 0.148148 0.296296 0.000000 0.666667 0.037037 1.000000 0.000000 0.000000 0.000000 0.481481 0.000000 0.518519 0.000000 0.000000 0.000000 1.000000 0.000000 0.814815 0.000000 0.074074 0.111111 0.000000 0.148148 0.851852 0.000000 0.851852 0.148148 0.000000 0.000000 0.407407 0.185185 0.259259 0.148148 0.222222 0.259259 0.222222 0.296296 0.185185 0.111111 0.703704 0.000000 0.592593 0.296296 0.111111 0.000000 0.481481 0.222222 0.185185 0.111111 0.111111 0.259259 0.629630 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- G[AG][AG][GA]A[GA]GAGA[AG][TCAG]G[AC][AC][GC] -------------------------------------------------------------------------------- Time 13.22 secs. 
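Note on the Motif 1 regular expression: in this output, each bracketed class appears to list the letters whose probability in the position-specific probability matrix is at least 0.2, ordered from most to least probable. That threshold is inferred from the numbers above and is an assumption, not something this file states. A minimal Python sketch that rebuilds the expression from the matrix under that assumption (derive_regex, MOTIF1_PROBS, and THRESHOLD are hypothetical names introduced for illustration):

```python
import re

ALPHABET = "ACGT"   # column order given by "ALPHABET= ACGT"
THRESHOLD = 0.2     # assumed cutoff, inferred from this output only

# Motif 1 letter-probability matrix, copied from the output above.
# One row per motif position; columns follow ALPHABET.
MOTIF1_PROBS = [
    (0.037037, 0.000000, 0.962963, 0.000000),
    (0.555556, 0.185185, 0.259259, 0.000000),
    (0.407407, 0.111111, 0.333333, 0.148148),
    (0.296296, 0.000000, 0.666667, 0.037037),
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.481481, 0.000000, 0.518519, 0.000000),
    (0.000000, 0.000000, 1.000000, 0.000000),
    (0.814815, 0.000000, 0.074074, 0.111111),
    (0.000000, 0.148148, 0.851852, 0.000000),
    (0.851852, 0.148148, 0.000000, 0.000000),
    (0.407407, 0.185185, 0.259259, 0.148148),
    (0.222222, 0.259259, 0.222222, 0.296296),
    (0.185185, 0.111111, 0.703704, 0.000000),
    (0.592593, 0.296296, 0.111111, 0.000000),
    (0.481481, 0.222222, 0.185185, 0.111111),
    (0.111111, 0.259259, 0.629630, 0.000000),
]


def derive_regex(rows, alphabet=ALPHABET, threshold=THRESHOLD):
    """Build a regular expression from a letter-probability matrix."""
    parts = []
    for row in rows:
        # Keep letters at or above the threshold, most probable first.
        # sorted() is stable, so ties stay in alphabet order.
        kept = [(letter, p) for letter, p in zip(alphabet, row) if p >= threshold]
        kept.sort(key=lambda pair: -pair[1])
        letters = "".join(letter for letter, _ in kept)
        parts.append(letters if len(letters) == 1 else "[" + letters + "]")
    return "".join(parts)


pattern = derive_regex(MOTIF1_PROBS)
matches_consensus = re.fullmatch(pattern, "GAAGAGGAGAATGAAG") is not None
matches_best_site = re.fullmatch(pattern, "GAGGAGGAGAAGGAGG") is not None
```

The expression is a lossy summary of the matrix. The multilevel consensus GAAGAGGAGAATGAAG matches it, but the best-scoring site GAGGAGGAGAAGGAGG does not, because G at position 15 has probability 0.185185 and so falls outside [AC].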
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 35 llr = 339 E-value = 2.5e-007 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 214::23532172:63:2235 pos.-specific C 17419552137:5821a126: probability G :::1:3:111:3::21:42:1 matrix T 72281:13643122:5:3414 bits 2.1 1.9 * 1.7 * 1.5 * * * Relative 1.3 * * * Entropy 1.1 * * * (14.0 bits) 0.8 * ** ** * * * 0.6 ** *** ** ** * ** 0.4 ******* * ***** * ** 0.2 ********* ******** ** 0.0 --------------------- Multilevel TCCTCCCATTCACCATCGTCA consensus A A GATACTGT CA TAAT sequence T A C A AC G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 21261 478 3.66e-10 ATATAACCAA TCATCCCATACACCATCATCA CC 3883 376 6.17e-08 TACACCACCT TCCTCCCAACCGACGTCGTCT CTGTGCAACG 20194 234 3.91e-07 GCATTCGCTC TCCTCGCATCTGCCAACGCTT TTGCACCCGC 687 412 4.49e-07 TGCAATATTA TCATCACATACACCGTCGGTA GCACTTCACC 262116 24 5.15e-07 ACTGTTGTGA TTCTCGCCAGCAACATCTTCA TGGAGAGGGA 21604 457 5.90e-07 GAGCATGAGT TCCTCCTCATCGTCATCTGCA ACCGACAATA 6988 313 6.75e-07 ACGCATATCA TCATCACATCCGCTCACATCA TAATATGTGC 24023 322 7.70e-07 AGTTAGTAGT ACATCCAAATTAACAACATCA ACATGAGCAA 5945 420 1.13e-06 AGCCTCTACG TCTTCCAATATACCATCCCAT CTCATCGGAC 270213 135 2.63e-06 CTTCCAATCA TCTTCCATCTCATCGTCGTCT TCAAAACATG 268343 70 2.63e-06 CTTCCAATCA TCTTCCATCTCATCGTCGTCT TCAAAACATG 2371 402 2.95e-06 CTCCGCACTC 
CTCTCGCCAGTACCATCGACA CTCTCTCTCT 7740 377 4.59e-06 CATAATCATC TACTTCCCTCCATCCACGTCA TTCTATCTTC 25891 17 4.59e-06 ATTGAAACGG TACTCGATGCCACCATCTCAA ATTGAGACAT 34551 445 6.31e-06 AATACCGTCG TACTCACATTCACTCGCTACA GTGAACTACG 22546 444 6.31e-06 AATACCGTCG TACTCACATTCACTCGCTACA GTGAACTACG 33701 230 8.58e-06 ACACACCATA TCTGCGAAAGCACTAACTGCA CACGCGACAA 268592 122 9.48e-06 CCCCCCCATA ACAGCGCTTCCAACACCGCAA CATGAGACCG 22076 480 9.48e-06 CCCCCCCATA ACAGCGCTTCCAACACCGCAA 5959 421 1.27e-05 TGTCTGTGAG TCATCGTCTTCATCCACACCG TCCATCCTCA 26530 406 1.27e-05 CCTGGCACTA CCACCACCACCACCACCACCA CGAACTACAC 263935 400 1.53e-05 ATCGCAAATA ACATAGCTATCGCCATCATCT CCCAATCATA 21348 457 2.02e-05 GTACCCTAAA TTCTCAACTATGTCAACTTAA TTTACAATTT 268669 322 2.21e-05 GGTGACTGAC TCATCGAGAGCAACAACCAAA ACAAAATCCC 22214 348 2.21e-05 CAGAACCAGC TTCGCAGTTCTACCATCGTCT GTCCGTCGCT 31930 239 3.71e-05 CAACTGCGTG TTATCAAAGACAGCATCGTCT GGCATAAAAC 25610 180 4.38e-05 CAGGGGCGCA CCCGCCCGTTCGCCCTCCAAT CACTCCCGGC 264039 290 4.75e-05 TCCATAAATT ACTTCCCTTTTACTAGCTGCG ATGGAAGTTA 9720 120 5.14e-05 GTTGAGGTTC ATCTTCAATTCTCCATCCTCT TTTGCTTTTA 7679 432 7.58e-05 GCTGAGTCAG TCCTCCCCAATGTCCTTGGCT TCCTTTTTCC 4737 19 1.02e-04 CGTAGTGCCG TCTTCGTTAGCTCCACCTGAA GTGGTAAAAC 11214 5 1.02e-04 TATA CCACCCCATTTGTCCACTATT TCTTTGATTG 8014 345 1.55e-04 TCAACTTAAC TATTCCTATTTGACGACAAAT TTCATAATTG bd1857 148 2.59e-04 AGACGAAGAT ACATCCAATCAACTGTCTGTG GGTAAGACAG 11460 221 4.90e-04 CGATCTTACA CCGCTCAATCAACCAACGACT ATGAAACCAG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 21261 3.7e-10 477_[+2]_2 3883 6.2e-08 375_[+2]_104 20194 3.9e-07 233_[+2]_246 687 4.5e-07 411_[+2]_68 262116 5.2e-07 23_[+2]_456 21604 5.9e-07 456_[+2]_23 6988 6.7e-07 312_[+2]_167 
24023 7.7e-07 321_[+2]_158 5945 1.1e-06 419_[+2]_60 270213 2.6e-06 134_[+2]_345 268343 2.6e-06 69_[+2]_410 2371 2.9e-06 401_[+2]_78 7740 4.6e-06 376_[+2]_103 25891 4.6e-06 16_[+2]_463 34551 6.3e-06 444_[+2]_35 22546 6.3e-06 443_[+2]_36 33701 8.6e-06 229_[+2]_250 268592 9.5e-06 121_[+2]_358 22076 9.5e-06 479_[+2] 5959 1.3e-05 420_[+2]_59 26530 1.3e-05 405_[+2]_74 263935 1.5e-05 399_[+2]_80 21348 2e-05 456_[+2]_23 268669 2.2e-05 321_[+2]_158 22214 2.2e-05 347_[+2]_132 31930 3.7e-05 238_[+2]_241 25610 4.4e-05 179_[+2]_300 264039 4.7e-05 289_[+2]_190 9720 5.1e-05 119_[+2]_360 7679 7.6e-05 431_[+2]_48 4737 0.0001 18_[+2]_461 11214 0.0001 4_[+2]_475 8014 0.00015 344_[+2]_135 bd1857 0.00026 147_[+2]_332 11460 0.00049 220_[+2]_259 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=35 21261 ( 478) TCATCCCATACACCATCATCA 1 3883 ( 376) TCCTCCCAACCGACGTCGTCT 1 20194 ( 234) TCCTCGCATCTGCCAACGCTT 1 687 ( 412) TCATCACATACACCGTCGGTA 1 262116 ( 24) TTCTCGCCAGCAACATCTTCA 1 21604 ( 457) TCCTCCTCATCGTCATCTGCA 1 6988 ( 313) TCATCACATCCGCTCACATCA 1 24023 ( 322) ACATCCAAATTAACAACATCA 1 5945 ( 420) TCTTCCAATATACCATCCCAT 1 270213 ( 135) TCTTCCATCTCATCGTCGTCT 1 268343 ( 70) TCTTCCATCTCATCGTCGTCT 1 2371 ( 402) CTCTCGCCAGTACCATCGACA 1 7740 ( 377) TACTTCCCTCCATCCACGTCA 1 25891 ( 17) TACTCGATGCCACCATCTCAA 1 34551 ( 445) TACTCACATTCACTCGCTACA 1 22546 ( 444) TACTCACATTCACTCGCTACA 1 33701 ( 230) TCTGCGAAAGCACTAACTGCA 1 268592 ( 122) ACAGCGCTTCCAACACCGCAA 1 22076 ( 480) ACAGCGCTTCCAACACCGCAA 1 5959 ( 421) TCATCGTCTTCATCCACACCG 1 26530 ( 406) CCACCACCACCACCACCACCA 1 263935 ( 400) ACATAGCTATCGCCATCATCT 1 21348 ( 457) TTCTCAACTATGTCAACTTAA 1 268669 ( 322) TCATCGAGAGCAACAACCAAA 1 22214 ( 348) TTCGCAGTTCTACCATCGTCT 1 31930 ( 239) TTATCAAAGACAGCATCGTCT 1 25610 ( 180) 
CCCGCCCGTTCGCCCTCCAAT 1 264039 ( 290) ACTTCCCTTTTACTAGCTGCG 1 9720 ( 120) ATCTTCAATTCTCCATCCTCT 1 7679 ( 432) TCCTCCCCAATGTCCTTGGCT 1 4737 ( 19) TCTTCGTTAGCTCCACCTGAA 1 11214 ( 5) CCACCCCATTTGTCCACTATT 1 8014 ( 345) TATTCCTATTTGACGACAAAT 1 bd1857 ( 148) ACATCCAATCAACTGTCTGTG 1 11460 ( 221) CCGCTCAATCAACCAACGACT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 17280 bayes= 8.91194 E= 2.5e-007 -37 -70 -1177 129 -86 156 -1177 -65 52 78 -307 -43 -1177 -144 -75 152 -318 193 -1177 -165 -18 97 39 -1177 41 114 -307 -123 82 -3 -207 -6 28 -203 -207 109 -59 43 -75 47 -218 150 -1177 9 134 -1177 25 -223 -37 122 -307 -23 -1177 183 -1177 -65 121 -3 -48 -1177 28 -103 -148 85 -1177 206 -1177 -323 -37 -103 63 23 -18 -22 -26 47 -1 143 -1177 -123 99 -1177 -148 57 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 35 E= 2.5e-007 0.200000 0.142857 0.000000 0.657143 0.142857 0.685714 0.000000 0.171429 0.371429 0.400000 0.028571 0.200000 0.000000 0.085714 0.142857 0.771429 0.028571 0.885714 0.000000 0.085714 0.228571 0.457143 0.314286 0.000000 0.342857 0.514286 0.028571 0.114286 0.457143 0.228571 0.057143 0.257143 0.314286 0.057143 0.057143 0.571429 0.171429 0.314286 0.142857 0.371429 0.057143 0.657143 0.000000 0.285714 0.657143 0.000000 0.285714 0.057143 0.200000 0.542857 0.028571 0.228571 0.000000 0.828571 0.000000 0.171429 0.600000 0.228571 0.171429 0.000000 0.314286 0.114286 0.085714 0.485714 0.000000 0.971429 0.000000 0.028571 
0.200000 0.114286 0.371429 0.314286 0.228571 0.200000 0.200000 0.371429 0.257143 0.628571 0.000000 0.114286 0.514286 0.000000 0.085714 0.400000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [TA]C[CAT]TC[CGA][CA][ATC][TA][TC][CT][AG][CTA]C[AC][TA]C[GTA][TACG][CA][AT] -------------------------------------------------------------------------------- Time 26.01 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 15 sites = 28 llr = 279 E-value = 2.8e-007 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::311:::1::2:32 pos.-specific C ::1::::2:26136: probability G 481911a:36143:8 matrix T 625:89:8623341: bits 2.1 * 1.9 * 1.7 * 1.5 * ** Relative 1.3 * *** * Entropy 1.1 ** ***** * (14.4 bits) 0.8 ** ***** * * 0.6 ** ******** ** 0.4 ** ******** *** 0.2 *************** 0.0 --------------- Multilevel TGTGTTGTTGCGTCG consensus G A CGCTTCA sequence T AG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 20194 437 1.98e-08 ATGTCGGATT TGTGTTGTTGCATCG GATATCGGCG 264039 140 5.58e-07 TTCTCGTGAG TGTGTTGTTGCACCA TGATTGCTGC 11214 448 1.15e-06 GCAGTGAGCA GGAGTTGTGGCCTCG TCATGACGGG 270213 459 1.33e-06 
CGATGACGTG TGAGTTGTTGCTGCA CACATCCACA 268343 394 1.33e-06 CGATGACGTG TGAGTTGTTGCTGCA CACATCCACA 3883 277 1.50e-06 ATTGCCTGGT TGTGTTGTGTCCCCG TTCCAAATTA 21348 251 1.72e-06 TGATTCCGCC TCTGTTGTTGCGTCG GCTGAATGAA 687 104 2.50e-06 GTCCACTCAC GGTGTTGTGTTGGAG GTGAAGGCGG 34551 3 2.50e-06 GT TGGGTTGTGCCGTCG TCGGTGTCGT 22546 2 2.50e-06 T TGGGTTGTGCCGTCG TCGGTGTCGT 21261 99 2.79e-06 AAGGAAGAGG GGTGTTGTTTCCCAG AGTCAGCTGT 8014 84 4.42e-06 TATTTCGAGA GGTGTTGTTGGATAG GAAGATGAGG 22214 54 4.91e-06 GTCGCAGACG GGTGATGTTGTTTCG TCATATCTCT 21604 138 6.07e-06 GGTCGTGGTC GTCGTTGCTGCGTCG GGTGGGCATG 7679 104 6.74e-06 GAAGTTGACT TGTGTGGTGGTTGCG AGTGTTCTCT bd1857 11 1.21e-05 GATTGAGATT TGAGTTGCAGCGTAG TGATGATGCG 5959 33 1.21e-05 ACTGGTTAAG GGAGTTGTACCGGAG GAGGGCATCG 4737 307 1.45e-05 ACAAAGAGTG GGCGTTGTGTCTCTG TAATTTTTGT 25610 2 1.45e-05 T GGTGGTGTGGTGGTG ACGAAAAGGA 262116 258 1.73e-05 CAGATCAAAC TTCGTTGCTGTTGCG TATCAATCTG 18078 288 3.32e-05 TTTGTTCTGG TGAGTTGTTCGTGTG TTGCTTTTGC 5945 64 3.58e-05 GGGGGTAAGG GTTGTTGTCGTGCAG GCTGCCTCGG 268669 161 3.86e-05 AAAGCTATCA TGCGTTGTTTTACAA GAGAGAGCGA 268592 46 3.86e-05 TCCTAATTCG TGTAGTGCTGCACCG TTGAAATATT 22076 404 3.86e-05 TCCTAATTCG TGTAGTGCTGCACCG TTGAAATATT 11460 33 6.73e-05 GACGGCGTGA TGTAATGCTCCGCCG AAGGAGGCAA 24023 5 1.04e-04 CATG GTTGTGGTTCTGTTG TCTGCTGCAG 33701 300 1.23e-04 CGCGAGTGCA GTAGTTGTGTTTGAA AGAACCCATT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 20194 2e-08 436_[+3]_49 264039 5.6e-07 139_[+3]_346 11214 1.2e-06 447_[+3]_38 270213 1.3e-06 458_[+3]_27 268343 1.3e-06 393_[+3]_92 3883 1.5e-06 276_[+3]_209 21348 1.7e-06 250_[+3]_235 687 2.5e-06 103_[+3]_382 34551 2.5e-06 2_[+3]_483 22546 2.5e-06 1_[+3]_484 21261 2.8e-06 98_[+3]_387 
8014 4.4e-06 83_[+3]_402 22214 4.9e-06 53_[+3]_432 21604 6.1e-06 137_[+3]_348 7679 6.7e-06 103_[+3]_382 bd1857 1.2e-05 10_[+3]_475 5959 1.2e-05 32_[+3]_453 4737 1.4e-05 306_[+3]_179 25610 1.4e-05 1_[+3]_484 262116 1.7e-05 257_[+3]_228 18078 3.3e-05 287_[+3]_198 5945 3.6e-05 63_[+3]_422 268669 3.9e-05 160_[+3]_325 268592 3.9e-05 45_[+3]_440 22076 3.9e-05 403_[+3]_82 11460 6.7e-05 32_[+3]_453 24023 0.0001 4_[+3]_481 33701 0.00012 299_[+3]_186 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=15 seqs=28 20194 ( 437) TGTGTTGTTGCATCG 1 264039 ( 140) TGTGTTGTTGCACCA 1 11214 ( 448) GGAGTTGTGGCCTCG 1 270213 ( 459) TGAGTTGTTGCTGCA 1 268343 ( 394) TGAGTTGTTGCTGCA 1 3883 ( 277) TGTGTTGTGTCCCCG 1 21348 ( 251) TCTGTTGTTGCGTCG 1 687 ( 104) GGTGTTGTGTTGGAG 1 34551 ( 3) TGGGTTGTGCCGTCG 1 22546 ( 2) TGGGTTGTGCCGTCG 1 21261 ( 99) GGTGTTGTTTCCCAG 1 8014 ( 84) GGTGTTGTTGGATAG 1 22214 ( 54) GGTGATGTTGTTTCG 1 21604 ( 138) GTCGTTGCTGCGTCG 1 7679 ( 104) TGTGTGGTGGTTGCG 1 bd1857 ( 11) TGAGTTGCAGCGTAG 1 5959 ( 33) GGAGTTGTACCGGAG 1 4737 ( 307) GGCGTTGTGTCTCTG 1 25610 ( 2) GGTGGTGTGGTGGTG 1 262116 ( 258) TTCGTTGCTGTTGCG 1 18078 ( 288) TGAGTTGTTCGTGTG 1 5945 ( 64) GTTGTTGTCGTGCAG 1 268669 ( 161) TGCGTTGTTTTACAA 1 268592 ( 46) TGTAGTGCTGCACCG 1 22076 ( 404) TGTAGTGCTGCACCG 1 11460 ( 33) TGTAATGCTCCGCCG 1 24023 ( 5) GTTGTGGTTCTGTTG 1 33701 ( 300) GTAGTTGTGTTTGAA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 15 n= 17496 bayes= 9.89159 E= 2.8e-007 -1145 -1145 84 109 -1145 -270 171 -59 -5 -70 -175 
100 -127 -1145 190 -1145 -186 -1145 -116 161 -1145 -1145 -175 179 -1145 -1145 206 -1145 -1145 -12 -1145 155 -186 -270 42 109 -1145 -12 125 -33 -1145 138 -175 26 -27 -112 71 9 -1145 46 42 41 14 129 -1145 -91 -53 -1145 178 -1145 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 15 nsites= 28 E= 2.8e-007 0.000000 0.000000 0.428571 0.571429 0.000000 0.035714 0.785714 0.178571 0.250000 0.142857 0.071429 0.535714 0.107143 0.000000 0.892857 0.000000 0.071429 0.000000 0.107143 0.821429 0.000000 0.000000 0.071429 0.928571 0.000000 0.000000 1.000000 0.000000 0.000000 0.214286 0.000000 0.785714 0.071429 0.035714 0.321429 0.571429 0.000000 0.214286 0.571429 0.214286 0.000000 0.607143 0.071429 0.321429 0.214286 0.107143 0.392857 0.285714 0.000000 0.321429 0.321429 0.357143 0.285714 0.571429 0.000000 0.142857 0.178571 0.000000 0.821429 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [TG]G[TA]GTTG[TC][TG][GCT][CT][GTA][TCG][CA]G -------------------------------------------------------------------------------- Time 38.30 secs. 
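Note on the relative entropy figures: the "bits" value printed beside each motif logo is numerically consistent with the llr on the MOTIF header line divided by the number of sites and by ln 2. That reading treats llr as a natural-log quantity, which is an assumption inferred from the arithmetic here rather than stated in this file. A minimal Python sketch of the check (MOTIFS and bits_per_site are hypothetical names):

```python
import math

# (llr, sites, reported bits) copied from the three MOTIF sections above.
MOTIFS = {
    1: (283, 27, 15.1),
    2: (339, 35, 14.0),
    3: (279, 28, 14.4),
}


def bits_per_site(llr, sites):
    """Convert a log likelihood ratio (assumed to be in nats) to bits per site."""
    return llr / (sites * math.log(2))


# Map each motif number to (computed bits rounded to one decimal, reported bits).
comparison = {
    motif: (round(bits_per_site(llr, sites), 1), reported)
    for motif, (llr, sites, reported) in MOTIFS.items()
}
```

Because llr is printed as a rounded integer, agreement to one decimal place is the most this check can show.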
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11214                            4.33e-05  183_[+1(9.41e-05)]_231_\
                                           [+1(2.66e-05)]_1_[+3(1.15e-06)]_38
11460                            2.45e-04  32_[+3(6.73e-05)]_14_[+1(7.72e-07)]_\
                                           423
18078                            1.33e-01  287_[+3(3.32e-05)]_198
20194                            1.08e-08  92_[+1(3.94e-05)]_125_\
                                           [+2(3.91e-07)]_182_[+3(1.98e-08)]_49
21261                            5.88e-08  98_[+3(2.79e-06)]_364_\
                                           [+2(3.66e-10)]_2
21348                            5.45e-04  250_[+3(1.72e-06)]_191_\
                                           [+2(2.02e-05)]_23
21604                            4.74e-09  48_[+1(3.47e-08)]_73_[+3(6.07e-06)]_\
                                           304_[+2(5.90e-07)]_23
22076                            9.35e-06  195_[+1(1.47e-06)]_192_\
                                           [+3(3.86e-05)]_61_[+2(9.48e-06)]
22214                            2.12e-05  53_[+3(4.91e-06)]_181_\
                                           [+1(1.24e-05)]_82_[+2(2.21e-05)]_132
22546                            3.55e-06  1_[+3(2.50e-06)]_152_[+1(1.14e-05)]_\
                                           259_[+2(6.31e-06)]_36
2371                             2.46e-05  38_[+1(5.76e-07)]_347_\
                                           [+2(2.95e-06)]_78
24023                            5.29e-04  321_[+2(7.70e-07)]_158
25610                            5.33e-07  1_[+3(1.45e-05)]_47_[+1(3.47e-08)]_\
                                           100_[+2(4.38e-05)]_300
25891                            1.22e-03  16_[+2(4.59e-06)]_243_\
                                           [+1(2.14e-05)]_204
262116                           2.66e-06  23_[+2(5.15e-07)]_136_\
                                           [+1(1.46e-05)]_61_[+3(1.73e-05)]_228
263935                           9.94e-02  399_[+2(1.53e-05)]_80
264039                           2.51e-05  139_[+3(5.58e-07)]_135_\
                                           [+2(4.75e-05)]_27_[+1(6.29e-05)]_147
26530                            9.63e-04  86_[+1(8.05e-06)]_303_\
                                           [+2(1.27e-05)]_74
268343                           1.62e-07  69_[+2(2.63e-06)]_83_[+1(1.66e-06)]_\
                                           204_[+3(1.33e-06)]_92
268592                           3.79e-03  45_[+3(3.86e-05)]_61_[+2(9.48e-06)]_\
                                           272_[+2(6.51e-05)]_65
268669                           5.36e-03  160_[+3(3.86e-05)]_146_\
                                           [+2(2.21e-05)]_158
270213                           1.62e-07  134_[+2(2.63e-06)]_83_\
                                           [+1(1.66e-06)]_204_[+3(1.33e-06)]_27
31930                            5.29e-04  157_[+1(6.29e-05)]_33_\
                                           [+1(2.66e-06)]_16_[+2(3.71e-05)]_241
33701                            9.57e-05  229_[+2(8.58e-06)]_136_\
                                           [+1(7.34e-06)]_98
34551                            3.55e-06  2_[+3(2.50e-06)]_152_[+1(1.14e-05)]_\
                                           259_[+2(6.31e-06)]_35
3883                             3.66e-08  151_[+1(1.24e-05)]_109_\
                                           [+3(1.50e-06)]_84_[+2(6.17e-08)]_104
4737                             8.66e-03  306_[+3(1.45e-05)]_179
5945                             5.11e-06  33_[+1(6.69e-06)]_14_[+3(3.58e-05)]_\
                                           59_[+1(3.94e-05)]_266_[+2(1.13e-06)]_60
5959                             1.08e-05  32_[+3(1.21e-05)]_27_[+1(4.07e-06)]_\
                                           330_[+2(1.27e-05)]_59
687                              4.43e-07  40_[+3(3.32e-05)]_48_[+3(2.50e-06)]_\
                                           221_[+1(1.58e-05)]_56_[+2(4.49e-07)]_68
6988                             7.51e-06  217_[+1(3.14e-07)]_79_\
                                           [+2(6.75e-07)]_167
7679                             7.54e-05  103_[+3(6.74e-06)]_51_\
                                           [+1(1.14e-05)]_61_[+3(4.79e-05)]_170_[+2(7.58e-05)]_48
7740                             8.65e-05  305_[+1(8.83e-07)]_55_\
                                           [+2(4.59e-06)]_56_[+2(6.02e-05)]_26
8014                             2.68e-05  83_[+3(4.42e-06)]_3_[+1(2.66e-06)]_\
                                           383
9720                             6.14e-03  119_[+2(5.14e-05)]_165_\
                                           [+1(2.14e-05)]_179
bd1857                           1.49e-02  10_[+3(1.21e-05)]_475
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
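Note on reading the block diagrams: each diagram alternates gap lengths with motif occurrences, so 1-based start positions can be recovered by accumulating gaps and motif widths. A minimal Python sketch, assuming the widths from the three MOTIF headers (16, 21, and 15) and treating a backslash as a line continuation; parse_diagram and MOTIF_WIDTHS are hypothetical names introduced for illustration, not part of MEME:

```python
import re

# Motif widths copied from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 16, 2: 21, 3: 15}

# A token is either "[+2(3.91e-07)]" / "[+2]" (a motif hit) or "125" (a gap).
TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, strand, start, p_value), ...], total_length)."""
    hits = []
    offset = 0  # number of sequence positions consumed so far
    # Drop continuation backslashes and the whitespace that follows them.
    text = diagram.replace("\\", "").replace(" ", "")
    for token in text.split("_"):
        if not token:
            continue
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError("unrecognised diagram token: " + token)
        if match.group(4) is not None:
            offset += int(match.group(4))  # plain gap
        else:
            motif = int(match.group(2))
            p_value = float(match.group(3)) if match.group(3) else None
            hits.append((motif, match.group(1), offset + 1, p_value))
            offset += widths[motif]
    return hits, offset


# Combined diagram for sequence 21604, with the continuation already joined.
hits, total = parse_diagram(
    "48_[+1(3.47e-08)]_73_[+3(6.07e-06)]_304_[+2(5.90e-07)]_23"
)
```

For sequence 21604 this yields starts 49, 138, and 457 for motifs 1, 3, and 2, which agree with the Start columns in the per-motif site tables, and a total of 500, which agrees with the Length column of the training set.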