********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/382/382.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11665                    1.0000    500  20565                    1.0000    500
20576                    1.0000    500  20773                    1.0000    500
21756                    1.0000    500  21857                    1.0000    500
24359                    1.0000    500  24455                    1.0000    500
24657                    1.0000    500  25419                    1.0000    500
262103                   1.0000    500  262422                   1.0000    500
268084                   1.0000    500  269586                   1.0000    500
35713                    1.0000    500  9002                     1.0000    500
ThpsCs002                1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/382/382.seqs.fa -oc motifs/382 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 17 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8500 N= 17 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.261 C 0.228 G 0.244 T 0.267 Background letter frequencies (from dataset with add-one prior applied): A 0.261 C 0.228 G 0.244 T 0.267 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 20 sites = 12 llr = 174 E-value = 4.2e-006 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :22111:::342637385:1 pos.-specific C 88:8325a9428:8182188 probability G :11:15:::2::::1:::1: matrix T 2:81535:114:4:2::411 bits 2.1 * 1.9 * 1.7 ** 1.5 * ** * Relative 1.3 * * ** * * ** ** Entropy 1.1 ** * *** * * ** ** (20.9 bits) 0.9 **** *** *** ** ** 0.6 **** *** ********* 0.4 ***** *** ********** 0.2 ******************** 0.0 -------------------- Multilevel CCTCTGCCCCACACACAACC consensus CTT AT TA A T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 269586 215 1.30e-10 TTCAAAACCT 
CCTCCGCCCCACACAAATCC CCATAAATCT
11665           215  1.30e-10 TTCAAAACCT CCTCCGCCCCACACAAATCC CCATAAATCT
268084          473  2.60e-09 AGCCCTCTCT CCTCTTCCCCTCTCACAAGC TCTCAACC
25419           208  3.73e-08 TGAGACGAGG CATCTCTCCATCACACAACA ATAATCAGCG
21857           435  3.73e-08 TCATCTTATC CCTCTCTCCCTCTCTCATCT GCCTCGAACC
262422          441  1.51e-07 CAGAATCCCG CCTCCGTCCGCCTCCCAATC CCAACCCATC
9002            457  1.77e-07 AAAAAACTCA CCTTCGCCCCAAACAACACC TTCTACGAAG
35713           449  2.61e-07 AACAACAATC CATCGGTCCACAAAACAACC CATCCAGGCA
21756           478  2.81e-07 TACACAACCA CGACAGCCCAACTAACATCC GAC
24657           455  6.18e-07 GCCTCCAACC TCAATTTCCAACACACCTCC AACACCTTTT
20773           144  8.63e-07 TGTGGAGGGC TCGCTTCCCGTCAAGCAACC AGAGAAAGAC
20565           382  1.52e-06 ATGATTGGCT CCTCTATCTTTCTCTCACCC TTCTTCATGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME   POSITION P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
269586               1.3e-10      214_[+1]_266
11665                1.3e-10      214_[+1]_266
268084               2.6e-09      472_[+1]_8
25419                3.7e-08      207_[+1]_273
21857                3.7e-08      434_[+1]_46
262422               1.5e-07      440_[+1]_40
9002                 1.8e-07      456_[+1]_24
35713                2.6e-07      448_[+1]_32
21756                2.8e-07      477_[+1]_3
24657                6.2e-07      454_[+1]_26
20773                8.6e-07      143_[+1]_337
20565                1.5e-06      381_[+1]_99
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=12
269586     (  215) CCTCCGCCCCACACAAATCC  1
11665      (  215) CCTCCGCCCCACACAAATCC  1
268084     (  473) CCTCTTCCCCTCTCACAAGC  1
25419      (  208) CATCTCTCCATCACACAACA  1
21857      (  435) CCTCTCTCCCTCTCTCATCT  1
262422     (  441) CCTCCGTCCGCCTCCCAATC  1
9002       (  457) CCTTCGCCCCAAACAACACC  1
35713      (  449) CATCGGTCCACAAAACAACC  1
21756      (  478) CGACAGCCCAACTAACATCC  1
24657      (  455) TCAATTTCCAACACACCTCC  1
20773      (  144) TCGCTTCCCGTCAAGCAACC  1
20565      (  382) CCTCTATCTTTCTCTCACCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 8177 bayes= 9.8583 E= 4.2e-006
 -1023    187  -1023    -68
   -65    172   -155  -1023
   -65  -1023   -155    149
  -164    187  -1023   -168
  -164     55   -155     90
  -164    -45    103     -9
 -1023    113  -1023     90
 -1023    213  -1023  -1023
 -1023    201  -1023   -168
    35     87    -55   -168
    67    -45  -1023     64
   -65    187  -1023  -1023
   116  -1023  -1023     64
    -6    172  -1023  -1023
   135   -145   -155    -68
    -6    172  -1023  -1023
   167    -45  -1023  -1023
    94   -145  -1023     64
 -1023    187   -155   -168
  -164    187  -1023   -168
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 12 E= 4.2e-006
 0.000000  0.833333  0.000000  0.166667
 0.166667  0.750000  0.083333  0.000000
 0.166667  0.000000  0.083333  0.750000
 0.083333  0.833333  0.000000  0.083333
 0.083333  0.333333  0.083333  0.500000
 0.083333  0.166667  0.500000  0.250000
 0.000000  0.500000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.916667  0.000000  0.083333
 0.333333  0.416667  0.166667  0.083333
 0.416667  0.166667  0.000000  0.416667
 0.166667  0.833333  0.000000  0.000000
 0.583333  0.000000  0.000000  0.416667
 0.250000  0.750000  0.000000  0.000000
 0.666667  0.083333  0.083333  0.166667
 0.250000  0.750000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.500000  0.083333  0.000000  0.416667
 0.000000  0.833333  0.083333  0.083333
 0.083333  0.833333  0.000000  0.083333
--------------------------------------------------------------------------------
-------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- CCTC[TC][GT][CT]CC[CA][AT]C[AT][CA]A[CA]A[AT]CC -------------------------------------------------------------------------------- Time 2.64 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 7 llr = 124 E-value = 1.2e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::7::131::46:1::a44:: pos.-specific C 791941::9911a1aa:43a: probability G 3:1:37::1:11:1:::1::9 matrix T :1:13:79:131:6::::3:1 bits 2.1 * ** * 1.9 * *** * 1.7 * *** * 1.5 * * ** * *** ** Relative 1.3 ** * *** * *** ** Entropy 1.1 ** * **** * *** ** (25.5 bits) 0.9 **** ***** * *** ** 0.6 **** ***** * **** ** 0.4 ********** * ******* 0.2 ********************* 0.0 --------------------- Multilevel CCACCGTTCCAACTCCAAACG consensus G G A T CC sequence T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 269586 309 9.93e-13 CCCTCTCCAT CCACCGTTCCAACTCCACTCG ATTCTCATTC 11665 309 9.93e-13 CCCTCTCCAT CCACCGTTCCAACTCCACTCG ATTCTCATTC 24455 474 7.63e-09 GTCTCACAGA CCACGATTCCAGCCCCAGCCG ATCAGA 21857 353 1.56e-08 ATTAACCAAC GCCCTGTTCTTTCTCCAAACG TAGGATTGGA 262422 89 2.17e-08 GATGGGCGAT GTACTGATGCCACTCCAAACG GTTGATGGCT 20773 439 2.78e-08 TCAACCTTTG 
CCGTGGTTCCTACACCACACT TTCGCTCTCC
25419           379  4.15e-08 TGCACAACAT CCACCCAACCGCCGCCAACCG AGTCGCCCGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME   POSITION P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
269586               9.9e-13      308_[+2]_171
11665                9.9e-13      308_[+2]_171
24455                7.6e-09      473_[+2]_6
21857                1.6e-08      352_[+2]_127
262422               2.2e-08      88_[+2]_391
20773                2.8e-08      438_[+2]_41
25419                4.1e-08      378_[+2]_101
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=7
269586     (  309) CCACCGTTCCAACTCCACTCG  1
11665      (  309) CCACCGTTCCAACTCCACTCG  1
24455      (  474) CCACGATTCCAGCCCCAGCCG  1
21857      (  353) GCCCTGTTCTTTCTCCAAACG  1
262422     (   89) GTACTGATGCCACTCCAAACG  1
20773      (  439) CCGTGGTTCCTACACCACACT  1
25419      (  379) CCACCCAACCGCCGCCAACCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 10.7919 E= 1.2e-002
  -945    165     23   -945
  -945    191   -945    -90
   145    -67    -77   -945
  -945    191   -945    -90
  -945     91     23     10
   -87    -67    155   -945
    13   -945   -945    142
   -87   -945   -945    168
  -945    191    -77   -945
  -945    191   -945    -90
    71    -67    -77     10
   113    -67    -77    -90
  -945    213   -945   -945
   -87    -67    -77    110
  -945    213   -945   -945
  -945    213   -945   -945
   194   -945   -945   -945
    71     91    -77   -945
    71     33   -945     10
  -945    213   -945   -945
  -945   -945    181    -90
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 1.2e-002
 0.000000  0.714286  0.285714  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.714286  0.142857  0.142857  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.428571  0.285714  0.285714
 0.142857  0.142857  0.714286  0.000000
 0.285714  0.000000  0.000000  0.714286
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.857143  0.142857  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.428571  0.142857  0.142857  0.285714
 0.571429  0.142857  0.142857  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.142857  0.142857  0.571429
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.428571  0.428571  0.142857  0.000000
 0.428571  0.285714  0.000000  0.285714
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.857143  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[CG]CAC[CGT]G[TA]TCC[AT]ACTCCA[AC][ACT]CG
--------------------------------------------------------------------------------

Time  5.05 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 15 sites = 17 llr = 172 E-value = 5.5e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 21::51:31111:21 pos.-specific C ::1141:221:2::: probability G 84:81:91:56:9:8 matrix T :59118148337181 bits 2.1 1.9 1.7 * * 1.5 * * * Relative 1.3 * ** * *** Entropy 1.1 * ** * * *** (14.6 bits) 0.9 * ** ** * **** 0.6 **** ** * ***** 0.4 ******* ******* 0.2 ******* ******* 0.0 --------------- Multilevel GTTGATGTTGGTGTG consensus AG C A TTC sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 21756 47 4.66e-09 AGCTGCTGTT GTTGCTGTTGGTGTG GAGGAGGTGC 269586 102 1.57e-07 AATACGTCGA GGTGATGATTGCGTG AACAAGTGAC 11665 102 1.57e-07 AATACGTCGA GGTGATGATTGCGTG AACAAGTGAC 24455 146 2.75e-07 TGAACATGGC GGTGCTGTTTGTGAG CTGAGGAGTA 25419 17 8.44e-07 TGTTGTTGTT GTTGTTGTTGTTGTG GGGCTGTGGT 20773 123 1.63e-06 CAAGGGAGAG ATTGCTGTCGTTGTG GAGGGCTCGC 35713 221 3.01e-06 TGAGCAATCT GTTGATGCTGATGTT GGGGGAAACA 9002 237 3.65e-06 TTCCGGTCAT GTTGACGGCGGTGTG CATCAACATC 24359 196 6.39e-06 GTTGAGTTGG GATGATGATGATGAG TTTGATGACT 20576 426 7.00e-06 GCCAGATACC ATTCATGCTTGTGTG TGGGAGGGAT 24657 103 8.33e-06 AGGTTTAGAA GGTGATGCTAGCGAG GTTTGGTTCT 268084 397 3.68e-05 TCGAGTGTGT GGCGGTGAAGGTGTG TCTGTTTGCA 262422 150 4.23e-05 ATACCATCAT GGTGCATTTGGTGTT AAGAACGACG 262103 456 5.52e-05 TGTTGAGAGC 
AGCGATGCTGTTTTG ACTAGTACAG
20565           166  6.29e-05 TTCAATGACA GTTTCAGGTCGCGTG AGGATACTTA
ThpsCs002       157  1.03e-04 AATTGAACAA GATGCTGATTTAGTA TTAATGTTAT
21857            90  1.53e-04 AGATGATGAT ATTTCCGTCCTTGTG TTGACGCTCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME   POSITION P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
21756                4.7e-09      46_[+3]_439
269586               1.6e-07      101_[+3]_384
11665                1.6e-07      101_[+3]_384
24455                2.8e-07      145_[+3]_340
25419                8.4e-07      16_[+3]_469
20773                1.6e-06      122_[+3]_363
35713                  3e-06      220_[+3]_265
9002                 3.6e-06      236_[+3]_249
24359                6.4e-06      195_[+3]_290
20576                  7e-06      425_[+3]_60
24657                8.3e-06      102_[+3]_383
268084               3.7e-05      396_[+3]_89
262422               4.2e-05      149_[+3]_336
262103               5.5e-05      455_[+3]_30
20565                6.3e-05      165_[+3]_320
ThpsCs002             0.0001      156_[+3]_329
21857                0.00015      89_[+3]_396
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=17
21756      (   47) GTTGCTGTTGGTGTG  1
269586     (  102) GGTGATGATTGCGTG  1
11665      (  102) GGTGATGATTGCGTG  1
24455      (  146) GGTGCTGTTTGTGAG  1
25419      (   17) GTTGTTGTTGTTGTG  1
20773      (  123) ATTGCTGTCGTTGTG  1
35713      (  221) GTTGATGCTGATGTT  1
9002       (  237) GTTGACGGCGGTGTG  1
24359      (  196) GATGATGATGATGAG  1
20576      (  426) ATTCATGCTTGTGTG  1
24657      (  103) GGTGATGCTAGCGAG  1
268084     (  397) GGCGGTGAAGGTGTG  1
262422     (  150) GGTGCATTTGGTGTT  1
262103     (  456) AGCGATGCTGTTTTG  1
20565      (  166) GTTTCAGGTCGCGTG  1
ThpsCs002  (  157) GATGCTGATTTAGTA  1
21857      (   90) ATTTCCGTCCTTGTG  1
//

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8262 bayes= 8.99152 E= 5.5e-002
   -15  -1073    165  -1073
  -115  -1073     75     82
 -1073    -95  -1073    172
 -1073   -195    175   -118
    85     85   -205   -218
  -115    -95  -1073    152
 -1073  -1073    195   -218
    17      5   -105     40
  -215    -37  -1073    152
  -215    -95    112     14
  -115  -1073    127     14
  -215      5  -1073    140
 -1073  -1073    195   -218
   -56  -1073  -1073    162
  -215  -1073    175   -118
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 17 E= 5.5e-002
 0.235294  0.000000  0.764706  0.000000
 0.117647  0.000000  0.411765  0.470588
 0.000000  0.117647  0.000000  0.882353
 0.000000  0.058824  0.823529  0.117647
 0.470588  0.411765  0.058824  0.058824
 0.117647  0.117647  0.000000  0.764706
 0.000000  0.000000  0.941176  0.058824
 0.294118  0.235294  0.117647  0.352941
 0.058824  0.176471  0.000000  0.764706
 0.058824  0.117647  0.529412  0.294118
 0.117647  0.000000  0.588235  0.294118
 0.058824  0.235294  0.000000  0.705882
 0.000000  0.000000  0.941176  0.058824
 0.176471  0.000000  0.000000  0.823529
 0.058824  0.000000  0.823529  0.117647
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[GA][TG]TG[AC]TG[TAC]T[GT][GT][TC]GTG
--------------------------------------------------------------------------------

Time  7.62 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME   COMBINED P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
11665               2.68e-18      101_[+3(1.57e-07)]_98_\
    [+1(1.30e-10)]_74_[+2(9.93e-13)]_20_[+1(1.87e-05)]_131
20565               5.79e-04      165_[+3(6.29e-05)]_201_\
    [+1(1.52e-06)]_99
20576               6.04e-02      425_[+3(7.00e-06)]_60
20773               1.62e-09      122_[+3(1.63e-06)]_6_[+1(8.63e-07)]_\
    275_[+2(2.78e-08)]_41
21756               1.16e-08      46_[+3(4.66e-09)]_72_[+3(2.76e-05)]_\
    329_[+1(2.81e-07)]_3
21857               3.37e-09      213_[+2(7.20e-05)]_118_\
    [+2(1.56e-08)]_61_[+1(3.73e-08)]_46
24359               4.75e-02      195_[+3(6.39e-06)]_290
24455               5.23e-08      145_[+3(2.75e-07)]_313_\
    [+2(7.63e-09)]_6
24657               1.01e-04      102_[+3(8.33e-06)]_337_\
    [+1(6.18e-07)]_26
25419               6.85e-11      16_[+3(8.44e-07)]_176_\
    [+1(3.73e-08)]_151_[+2(4.15e-08)]_101
262103              1.01e-01      455_[+3(5.52e-05)]_30
262422              5.16e-09      88_[+2(2.17e-08)]_40_[+3(4.23e-05)]_\
    276_[+1(1.51e-07)]_40
268084              9.98e-07      396_[+3(3.68e-05)]_61_\
    [+1(2.60e-09)]_8
269586              2.68e-18      101_[+3(1.57e-07)]_98_\
    [+1(1.30e-10)]_74_[+2(9.93e-13)]_20_[+1(1.87e-05)]_131
35713               1.68e-05      220_[+3(3.01e-06)]_213_\
    [+1(2.61e-07)]_32
9002                4.25e-06      236_[+3(3.65e-06)]_205_\
    [+1(1.77e-07)]_24
ThpsCs002           8.39e-02      500
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************