********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/121/121.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11319                    1.0000    500  11977                    1.0000    500
1932                     1.0000    500  20817                    1.0000    500
22784                    1.0000    500  22857                    1.0000    500
24626                    1.0000    500  263361                   1.0000    500
269952                   1.0000    500  31382                    1.0000    500
6476                     1.0000    500  6723                     1.0000    500
9666                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/121/121.seqs.fa -oc motifs/121 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.222 G 0.245 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.222 G 0.245 T 0.262
********************************************************************************

********************************************************************************
MOTIF  1 MEME    width =  15  sites =  13  llr = 146  E-value = 2.1e-004
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  123::1:12:2::5:
pos.-specific     C  121:33112::::::
probability       G  :52:71:7:9::8:2
matrix            T  825a:592518a258

         bits    2.2
                 2.0    *       *
                 1.7    *     * *
                 1.5    *  *  * *
Relative         1.3    *  *  * ** *
Entropy          1.1 *  ** *  **** *
(16.2 bits)      0.9 *  ** *  ******
                 0.7 *  ** ** ******
                 0.4 *  ************
                 0.2 ***************
                 0.0 ---------------

Multilevel           TGTTGTTGTGTTGAT
consensus             TA CC  A A  T
sequence                     C
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
20817                       186  2.74e-09 CGGCTTCGTT TGTTGTTGTGTTGTT TGGTTTGTCT
22857                       224  6.00e-08 TCTCTGGTAT TTTTGTTGCGTTGAT TATACAGACG
1932                        130  1.67e-07 TCTACCAGAG TCGTGTTGTGTTGAT GGGAGTAAAT
11319                       237  1.92e-07 CGGCTGCTTG TAATGCTGTGTTGTT AGGAGTACTG
24626                       179  2.14e-06 ACTGGGAGCT TGATCTTGTGATGAG CTGACACCGG
22784                       216  3.04e-06 TCACTTGTGG TATTGGTGTGATGAT AATAGATGTA
31382                       179  3.92e-06 TGGATGATGG TGTTGATGAGTTGAG AGGAGCTGCC
263361                      134  4.24e-06 TTACTGTGTG TTATCTTGAGTTTTT TGTCAGTACC
6476                        380  5.80e-06 GGGAAAGCAA TGCTCTTCCGTTGAT AAACATTAAG
9666                        310  8.37e-06 TTATTGTACT TGTTCCTTAGTTTTT TATTACAAAA
6723                         33  1.10e-05 CATGATGTTG CTGTGCTGTGATGTT TGTGTCTGCG
11977                       462  2.18e-05 GGTCGCCTTC TCATGCTTCTTTGTT CTACTTTCCT
269952                      280  2.72e-05 CAGTGGAATG AGTTGTCATGTTGAT GTATTGTGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20817                             2.7e-09  185_[+1]_300
22857                               6e-08  223_[+1]_262
1932                              1.7e-07  129_[+1]_356
11319                             1.9e-07  236_[+1]_249
24626                             2.1e-06  178_[+1]_307
22784                               3e-06  215_[+1]_270
31382                             3.9e-06  178_[+1]_307
263361                            4.2e-06  133_[+1]_352
6476                              5.8e-06  379_[+1]_106
9666                              8.4e-06  309_[+1]_176
6723                              1.1e-05  32_[+1]_453
11977                             2.2e-05  461_[+1]_24
269952                            2.7e-05  279_[+1]_206
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=13
20817                    (  186) TGTTGTTGTGTTGTT  1
22857                    (  224) TTTTGTTGCGTTGAT  1
1932                     (  130) TCGTGTTGTGTTGAT  1
11319                    (  237) TAATGCTGTGTTGTT  1
24626                    (  179) TGATCTTGTGATGAG  1
22784                    (  216) TATTGGTGTGATGAT  1
31382                    (  179) TGTTGATGAGTTGAG  1
263361                   (  134) TTATCTTGAGTTTTT  1
6476                     (  380) TGCTCTTCCGTTGAT  1
9666                     (  310) TGTTCCTTAGTTTTT  1
6723                     (   33) CTGTGCTGTGATGTT  1
11977                    (  462) TCATGCTTCTTTGTT  1
269952                   (  280) AGTTGTCATGTTGAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6318 bayes= 9.45327 E= 2.1e-004
  -182   -153  -1035    169
   -82    -53     91    -18
    18   -153    -67     82
 -1035  -1035  -1035    193
 -1035     47    150  -1035
  -182     47   -167    104
 -1035   -153  -1035    182
  -182   -153    150    -77
   -24      6  -1035    104
 -1035  -1035    191   -176
   -24  -1035  -1035    155
 -1035  -1035  -1035    193
 -1035  -1035    179    -77
    99  -1035  -1035     82
 -1035  -1035    -67    169
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 13 E= 2.1e-004
 0.076923  0.076923  0.000000  0.846154
 0.153846  0.153846  0.461538  0.230769
 0.307692  0.076923  0.153846  0.461538
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.307692  0.692308  0.000000
 0.076923  0.307692  0.076923  0.538462
 0.000000  0.076923  0.000000  0.923077
 0.076923  0.076923  0.692308  0.153846
 0.230769  0.230769  0.000000  0.538462
 0.000000  0.000000  0.923077  0.076923
 0.230769  0.000000  0.000000  0.769231
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.846154  0.153846
 0.538462  0.000000  0.000000  0.461538
 0.000000  0.000000  0.153846  0.846154
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
T[GT][TA]T[GC][TC]TG[TAC]G[TA]TG[AT]T
--------------------------------------------------------------------------------

Time  1.91 secs.

********************************************************************************

********************************************************************************
MOTIF  2 MEME    width =  20  sites =  12  llr = 155  E-value = 7.7e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  13::52:3137:333::623
pos.-specific     C  8:98:6846::776:86378
probability       G  :5:1:1:1:3:3::::11::
matrix            T  23115222343111823:2:

         bits    2.2
                 2.0
                 1.7   *
                 1.5   *   *        *
Relative         1.3   **  *        *   *
Entropy          1.1 * **  *   *   **   *
(18.7 bits)      0.9 * *** * * ******* **
                 0.7 * *** * * **********
                 0.4 ******* ************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CGCCACCCCTACCCTCCACC
consensus             A  T  ATATGAAA TC A
sequence              T       G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
263361                      244  1.06e-09 TCTCTTCTCC CGCCACCCCTACCCACCGCC CGTCCACACA
24626                       462  2.51e-08 ACACGAACAC CGCCACCAAAAGCATCCCCC CAGTTTCCAT
31382                       312  3.17e-08 CGCCGGCCCA CGCCAGCCCGACACTCCAAC CAATCAACAA
1932                        106  9.20e-08 CTGGAGTGTC CTCGTCCATAACCATCTACC AGAGTCGTGT
22857                       399  1.13e-07 GAAACTCACT CTCCACTCTTTCTCTCCACC TAACACCACG
6476                        164  3.48e-07 CATTGAAGAT TGCTTACCCTTGCCTCCACC ATAGCACCGA
9666                        280  4.52e-07 AGACGATTAT CTCCTTCACAACCATCGCAC TTATTGTACT
6723                        144  8.77e-07 AGGACTCATC CACCACCGCGTTCCTCTATC CGTCCAACCA
269952                      435  1.03e-06 AACAACCACT CGCCTCTCTGAGAATCTCCA ACCACCGCCG
20817                       294  1.20e-06 GCAAAATCAT CACCAACACAACACATCACA GAAGCCCTGG
22784                       331  3.06e-06 CAGCTGTCAA AACCTCCTCTACCTTTCACA CCAAAGGCTG
11977                       239  1.61e-05 TGAACTCACT TGTCTTCTTTTCCCACTCTC AAGCTAAATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
263361                            1.1e-09  243_[+2]_237
24626                             2.5e-08  461_[+2]_19
31382                             3.2e-08  311_[+2]_169
1932                              9.2e-08  105_[+2]_375
22857                             1.1e-07  398_[+2]_82
6476                              3.5e-07  163_[+2]_317
9666                              4.5e-07  279_[+2]_201
6723                              8.8e-07  143_[+2]_337
269952                              1e-06  434_[+2]_46
20817                             1.2e-06  293_[+2]_187
22784                             3.1e-06  330_[+2]_150
11977                             1.6e-05  238_[+2]_242
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=12
263361                   (  244) CGCCACCCCTACCCACCGCC  1
24626                    (  462) CGCCACCAAAAGCATCCCCC  1
31382                    (  312) CGCCAGCCCGACACTCCAAC  1
1932                     (  106) CTCGTCCATAACCATCTACC  1
22857                    (  399) CTCCACTCTTTCTCTCCACC  1
6476                     (  164) TGCTTACCCTTGCCTCCACC  1
9666                     (  280) CTCCTTCACAACCATCGCAC  1
6723                     (  144) CACCACCGCGTTCCTCTATC  1
269952                   (  435) CGCCTCTCTGAGAATCTCCA  1
20817                    (  294) CACCAACACAACACATCACA  1
22784                    (  331) AACCTCCTCTACCTTTCACA  1
11977                    (  239) TGTCTTCTTTTCCCACTCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6253 bayes= 10.1236 E= 7.7e-001
  -170    176  -1023    -65
   -12  -1023    103     -7
 -1023    205  -1023   -165
 -1023    191   -155   -165
    88  -1023  -1023     93
   -70    139   -155    -65
 -1023    191  -1023    -65
    29     91   -155    -65
  -170    139  -1023     35
    29  -1023      3     67
   129  -1023  -1023     35
 -1023    159      3   -165
   -12    159  -1023   -165
    29    139  -1023   -165
   -12  -1023  -1023    152
 -1023    191  -1023    -65
 -1023    139   -155     35
   110     59   -155  -1023
   -70    159  -1023    -65
   -12    176  -1023  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 12 E= 7.7e-001
 0.083333  0.750000  0.000000  0.166667
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.916667  0.000000  0.083333
 0.000000  0.833333  0.083333  0.083333
 0.500000  0.000000  0.000000  0.500000
 0.166667  0.583333  0.083333  0.166667
 0.000000  0.833333  0.000000  0.166667
 0.333333  0.416667  0.083333  0.166667
 0.083333  0.583333  0.000000  0.333333
 0.333333  0.000000  0.250000  0.416667
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.666667  0.250000  0.083333
 0.250000  0.666667  0.000000  0.083333
 0.333333  0.583333  0.000000  0.083333
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.583333  0.083333  0.333333
 0.583333  0.333333  0.083333  0.000000
 0.166667  0.666667  0.000000  0.166667
 0.250000  0.750000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
C[GAT]CC[AT]CC[CA][CT][TAG][AT][CG][CA][CA][TA]C[CT][AC]C[CA]
--------------------------------------------------------------------------------

Time  3.70 secs.

********************************************************************************

********************************************************************************
MOTIF  3 MEME    width =  12  sites =   9  llr = 102  E-value = 3.0e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :19::::4::::
pos.-specific     C  791:a:16166:
probability       G  :::::7::::22
matrix            T  3::a:39:9428

         bits    2.2     *
                 2.0    **
                 1.7  * **
                 1.5  * ** * *
Relative         1.3  **** * *
Entropy          1.1 ********** *
(16.3 bits)      0.9 ********** *
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CCATCGTCTCCT
consensus            T    T A TGG
sequence                       T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
9666                        473  1.28e-07 ACACACACTA CCATCGTCTTCT TGATGTCCTT
1932                        318  1.28e-07 ATCAGTATTA CCATCGTCTTCT GCATTGCATT
20817                       437  1.28e-06 AGTGTCAGCT TCATCGTCTCGT ATCCCACAAA
22784                       120  2.62e-06 TGATGTGGCT CAATCGTCTCCT GGAAGGAGTC
11977                       133  4.17e-06 CGCCAGCTAG CCATCTTCTCGG AAAGCTCTGA
6476                        408  5.22e-06 CATTAAGTTT TCCTCGTATCCT AAGGGTTTGA
263361                      436  5.22e-06 TGTTCGGGAC TCATCTTATCTT GAGCTTTGCT
24626                       488  9.52e-06 CCCCCAGTTT CCATCGCATTTT C
6723                        225  1.68e-05 GTCGGCACCT CCATCTTACTCG GATTATGCGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9666                              1.3e-07  472_[+3]_16
1932                              1.3e-07  317_[+3]_171
20817                             1.3e-06  436_[+3]_52
22784                             2.6e-06  119_[+3]_369
11977                             4.2e-06  132_[+3]_356
6476                              5.2e-06  407_[+3]_81
263361                            5.2e-06  435_[+3]_53
24626                             9.5e-06  487_[+3]_1
6723                              1.7e-05  224_[+3]_264
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
9666                     (  473) CCATCGTCTTCT  1
1932                     (  318) CCATCGTCTTCT  1
20817                    (  437) TCATCGTCTCGT  1
22784                    (  120) CAATCGTCTCCT  1
11977                    (  133) CCATCTTCTCGG  1
6476                     (  408) TCCTCGTATCCT  1
263361                   (  436) TCATCTTATCTT  1
24626                    (  488) CCATCGCATTTT  1
6723                     (  225) CCATCTTACTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 10.3111 E= 3.0e+001
  -982    159   -982     35
  -129    200   -982   -982
   171   -100   -982   -982
  -982   -982   -982    193
  -982    217   -982   -982
  -982   -982    144     35
  -982   -100   -982    176
    71    132   -982   -982
  -982   -100   -982    176
  -982    132   -982     76
  -982    132    -14    -24
  -982   -982    -14    157
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 3.0e+001
 0.000000  0.666667  0.000000  0.333333
 0.111111  0.888889  0.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.111111  0.000000  0.888889
 0.444444  0.555556  0.000000  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.555556  0.000000  0.444444
 0.000000  0.555556  0.222222  0.222222
 0.000000  0.000000  0.222222  0.777778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[CT]CATC[GT]T[CA]T[CT][CGT][TG]
--------------------------------------------------------------------------------

Time  5.41 secs.

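The regular expressions reported for each motif can be used for a quick text scan of other sequences. The following is a minimal Python sketch, not part of MEME or MAST; the names find_regex_sites and fragment are hypothetical. The regular expression is a simplification of the probability matrix, so it can miss real sites: the Motif 3 site in sequence 22784 (CAATCGTCTCCT) has an A at position 2, which the expression does not allow.

```python
import re

# Regular expression reported above for Motif 3.
MOTIF3_REGEX = "[CT]CATC[GT]T[CA]T[CT][CGT][TG]"


def find_regex_sites(sequence, pattern=MOTIF3_REGEX):
    """Return 1-based start positions of all matches, including overlapping ones."""
    # Wrapping the pattern in a lookahead lets the scan report overlapping hits,
    # which a plain finditer over the bare pattern would skip.
    scanner = re.compile("(?=(" + pattern + "))")
    return [m.start() + 1 for m in scanner.finditer(sequence.upper())]


# Left flank, site and right flank copied from the Motif 3 row for sequence 9666.
fragment = "ACACACACTA" + "CCATCGTCTTCT" + "TGATGTCCTT"
print(find_regex_sites(fragment))

# The 22784 site is listed by MEME but is not matched by the regular expression.
print(find_regex_sites("CAATCGTCTCCT"))
```

For scoring that uses the full matrices rather than this simplification, the file itself points to MAST.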
********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11319                            4.78e-03  236_[+1(1.92e-07)]_249
11977                            2.29e-05  132_[+3(4.17e-06)]_94_\
    [+2(1.61e-05)]_203_[+1(2.18e-05)]_24
1932                             1.02e-10  105_[+2(9.20e-08)]_4_[+1(1.67e-07)]_\
    173_[+3(1.28e-07)]_171
20817                            2.08e-10  185_[+1(2.74e-09)]_93_\
    [+2(1.20e-06)]_123_[+3(1.28e-06)]_52
22784                            5.96e-07  119_[+3(2.62e-06)]_84_\
    [+1(3.04e-06)]_100_[+2(3.06e-06)]_150
22857                            1.32e-07  197_[+2(6.11e-05)]_6_[+1(6.00e-08)]_\
    160_[+2(1.13e-07)]_82
24626                            1.77e-08  178_[+1(2.14e-06)]_43_\
    [+2(4.08e-05)]_205_[+2(2.51e-08)]_6_[+3(9.52e-06)]_1
263361                           1.02e-09  133_[+1(4.24e-06)]_95_\
    [+2(1.06e-09)]_172_[+3(5.22e-06)]_53
269952                           2.17e-04  279_[+1(2.72e-05)]_140_\
    [+2(1.03e-06)]_46
31382                            3.06e-06  69_[+1(4.08e-05)]_94_[+1(3.92e-06)]_\
    118_[+2(3.17e-08)]_169
6476                             2.80e-07  163_[+2(3.48e-07)]_196_\
    [+1(5.80e-06)]_13_[+3(5.22e-06)]_81
6723                             3.27e-06  32_[+1(1.10e-05)]_96_[+2(8.77e-07)]_\
    61_[+3(1.68e-05)]_264
9666                             1.68e-08  279_[+2(4.52e-07)]_10_\
    [+1(8.37e-06)]_148_[+3(1.28e-07)]_16
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
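
The "Relative Entropy (… bits)" figure in each motif description can be cross-checked from the letter-probability matrix and the background frequencies. The sketch below is an illustration in Python, not MEME's own code; the function name relative_entropy_bits is hypothetical, and it assumes the figure is the sum over positions of p * log2(p / background), an interpretation checked only against the numbers in this file. For Motif 3 the sum should come out close to the reported 16.3 bits.

```python
import math

# Background letter frequencies copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.272, "C": 0.222, "G": 0.245, "T": 0.262}

# Motif 3 letter-probability matrix: one row per position, columns A, C, G, T.
MOTIF3 = [
    [0.000000, 0.666667, 0.000000, 0.333333],
    [0.111111, 0.888889, 0.000000, 0.000000],
    [0.888889, 0.111111, 0.000000, 0.000000],
    [0.000000, 0.000000, 0.000000, 1.000000],
    [0.000000, 1.000000, 0.000000, 0.000000],
    [0.000000, 0.000000, 0.666667, 0.333333],
    [0.000000, 0.111111, 0.000000, 0.888889],
    [0.444444, 0.555556, 0.000000, 0.000000],
    [0.000000, 0.111111, 0.000000, 0.888889],
    [0.000000, 0.555556, 0.000000, 0.444444],
    [0.000000, 0.555556, 0.222222, 0.222222],
    [0.000000, 0.000000, 0.222222, 0.777778],
]


def relative_entropy_bits(matrix, background=BACKGROUND):
    """Sum p * log2(p / background) over every position and letter."""
    total = 0.0
    for row in matrix:
        for letter, p in zip("ACGT", row):
            if p > 0.0:  # zero-probability letters contribute nothing
                total += p * math.log2(p / background[letter])
    return total


print(f"Motif 3 relative entropy: {relative_entropy_bits(MOTIF3):.2f} bits")
```

A position whose probabilities equal the background contributes zero bits, which is why weakly conserved columns add little to the total.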