********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/267/267.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10282                    1.0000    500  10905                    1.0000    500
11114                    1.0000    500  12943                    1.0000    500
22723                    1.0000    500  22801                    1.0000    500
23738                    1.0000    500  262303                   1.0000    500
262980                   1.0000    500  264727                   1.0000    500
32894                    1.0000    500  3777                     1.0000    500
38466                    1.0000    500  41300                    1.0000    500
5132                     1.0000    500  7530                     1.0000    500
7804                     1.0000    500  8503                     1.0000    500
9371                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/267/267.seqs.fa -oc motifs/267 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 19 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 9500 N= 19 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.249 C 0.231 G 0.248 T 0.272 Background letter frequencies (from dataset with add-one prior applied): A 0.249 C 0.231 G 0.248 T 0.272 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 19 sites = 19 llr = 217 E-value = 3.0e-008 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :42512511422151264: pos.-specific C a152782554288:88329 probability G ::3:::::113:1:::21: matrix T :5:32:353241152::41 bits 2.1 * 1.9 * 1.7 * * 1.5 * * * Relative 1.3 * * * * * Entropy 1.1 * * ** ** * (16.5 bits) 0.8 * ** * ***** * 0.6 *** ** * ****** * 0.4 ********** ****** * 0.2 ********** ******** 0.0 ------------------- Multilevel CTCACCACCCTCCACCAAC consensus AGTTATTTAG T CT sequence AC C A C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------- 8503 400 2.28e-09 TTTAGGTCCT CTCCCCACCAGCCACCACC 
ACCACCACCA
5132                        219  1.27e-08 CCGACCATTT CAGACCCCCAGCCTCCATC TTGGCTACAA
7804                        104  2.16e-07 CTCACAGCTT CTCTCCACCAACCACAGTC CACCAAAATA
23738                       429  2.75e-07 CACCGCCGTC CTGCCAACCACCCTCCACC TTTCCAAGCT
262303                      245  3.08e-07 ACCAAACATT CTCTCCTCCAACAACCACC CCCCCTCTGC
12943                       412  4.33e-07 GTGGCCTCTG CACATCATCATCGTCCATC TGCGTAGGTA
32894                       443  4.84e-07 GAAAACACAG CTCAACACAAGCCACCCAC CACACCTGGC
264727                      429  8.23e-07 CCACTCTTCA CTAACATTTCGCCTCCCTC GATTTTCAAC
22801                       480  1.49e-06 ACTCACCACA CACTCCTTTCTCCACCCCT TC
10905                       471  2.17e-06 TTCAACCCTT CAATCCATCTCCCTTCCAC CACCCGAAGA
3777                        469  3.67e-06 TTCCTTTGAC CAAACAACCCTCTTCCGAC CTCCCCGCAC
22723                       389  4.00e-06 ACCTCCTTCA CACTTCACCCTCCACACTT CACGAGACGA
10282                        91  4.00e-06 CATGTTCCTC CTGCTCCATCTCCACCATC AAACTCTGGA
7530                        388  4.72e-06 AAACAGAGGC CCGACCTTTGGCCACCAAC ACATCCGCTA
11114                       172  5.12e-06 ACATGAGAGA CTCACCTTATTACATCAAC AACAGTGCCG
38466                        14  6.50e-06 ATCACACGCA CACATCATGCACCTACAAC ATACCTGCAA
262980                      466  7.03e-06 TTCCACTGAG CTCACAATTCATCACCAGC GATTGGCATA
41300                       112  2.32e-05 CTTGTCAATG CTGCCCCCTTTACTTCGTC GGAACTATCT
9371                        161  7.59e-05 CAATATCCTC CAATACCTCCCAATCAAAC TCTTCGCTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8503                              2.3e-09  399_[+1]_82
5132                              1.3e-08  218_[+1]_263
7804                              2.2e-07  103_[+1]_378
23738                             2.7e-07  428_[+1]_53
262303                            3.1e-07  244_[+1]_237
12943                             4.3e-07  411_[+1]_70
32894                             4.8e-07  442_[+1]_39
264727                            8.2e-07  428_[+1]_53
22801                             1.5e-06  479_[+1]_2
10905                             2.2e-06  470_[+1]_11
3777                              3.7e-06  468_[+1]_13
22723                               4e-06  388_[+1]_93
10282                               4e-06  90_[+1]_391
7530                              4.7e-06  387_[+1]_94
11114                             5.1e-06  171_[+1]_310
38466                             6.5e-06  13_[+1]_468
262980                              7e-06  465_[+1]_16
41300                             2.3e-05  111_[+1]_370
9371                              7.6e-05  160_[+1]_321
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=19
8503                     (  400) CTCCCCACCAGCCACCACC  1
5132                     (  219) CAGACCCCCAGCCTCCATC  1
7804                     (  104) CTCTCCACCAACCACAGTC  1
23738                    (  429) CTGCCAACCACCCTCCACC  1
262303                   (  245) CTCTCCTCCAACAACCACC  1
12943                    (  412) CACATCATCATCGTCCATC  1
32894                    (  443) CTCAACACAAGCCACCCAC  1
264727                   (  429) CTAACATTTCGCCTCCCTC  1
22801                    (  480) CACTCCTTTCTCCACCCCT  1
10905                    (  471) CAATCCATCTCCCTTCCAC  1
3777                     (  469) CAAACAACCCTCTTCCGAC  1
22723                    (  389) CACTTCACCCTCCACACTT  1
10282                    (   91) CTGCTCCATCTCCACCATC  1
7530                     (  388) CCGACCTTTGGCCACCAAC  1
11114                    (  172) CTCACCTTATTACATCAAC  1
38466                    (   14) CACATCATGCACCTACAAC  1
262980                   (  466) CTCACAATTCATCACCAGC  1
41300                    (  112) CTGCCCCCTTTACTTCGTC  1
9371                     (  161) CAATACCTCCCAATCAAAC  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 9158 bayes= 8.90989 E= 3.0e-008
 -1089    212  -1089  -1089
    76   -213  -1089     95
   -24    119      8  -1089
    93    -13  -1089     22
  -124    157  -1089    -37
   -24    178  -1089  -1089
   108    -13  -1089     -5
  -224    104  -1089     80
  -124    119   -223     22
    56     87   -223    -78
   -24    -55      8     44
   -66    178  -1089   -237
  -124    178   -223   -237
   108  -1089  -1089     80
  -224    178  -1089    -78
   -66    187  -1089  -1089
   122     19    -65  -1089
    56    -13   -223     44
 -1089    196  -1089   -137
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 19 E= 3.0e-008
 0.000000  1.000000  0.000000  0.000000
 0.421053  0.052632  0.000000  0.526316
 0.210526  0.526316  0.263158  0.000000
 0.473684  0.210526  0.000000  0.315789
 0.105263  0.684211  0.000000  0.210526
 0.210526  0.789474  0.000000  0.000000
 0.526316  0.210526  0.000000  0.263158
 0.052632  0.473684  0.000000  0.473684
 0.105263  0.526316  0.052632  0.315789
 0.368421  0.421053  0.052632  0.157895
 0.210526  0.157895  0.263158  0.368421
 0.157895  0.789474  0.000000  0.052632
 0.105263  0.789474  0.052632  0.052632
 0.526316  0.000000  0.000000  0.473684
 0.052632  0.789474  0.000000  0.157895
 0.157895  0.842105  0.000000  0.000000
 0.578947  0.263158  0.157895  0.000000
 0.368421  0.210526  0.052632  0.368421
 0.000000  0.894737  0.000000  0.105263
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
C[TA][CGA][ATC][CT][CA][ATC][CT][CT][CA][TGA]CC[AT]CC[AC][ATC]C
--------------------------------------------------------------------------------

Time  3.34 secs.
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 13 llr = 157 E-value = 3.9e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :5::47165:7::2:: pos.-specific C ::1:1311:21:1:11 probability G 75:a5:53582:9627 matrix T 3:9:::3:1::a:272 bits 2.1 1.9 * * 1.7 * ** 1.5 ** * ** Relative 1.3 ** * ** Entropy 1.1 **** * * ** (17.4 bits) 0.8 **** * * **** ** 0.6 ****** ********* 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel GATGGAGAAGATGGTG consensus TG ACTGG G AGT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 41300 434 1.28e-08 CAAAGGAAGA GATGGAGAAGATGGTT TCTTCCTTGG 3777 314 3.65e-08 TAATACGAAC GATGAAGAAGATGTTG GGCATGAAAG 23738 226 1.64e-07 TACTTGAATG GGTGGCGGAGGTGGTG GACTGGGTGG 262980 229 2.14e-07 GGATGGCACG TGTGGAGGGGGTGGTG GGCGACGGAC 8503 141 3.78e-07 GGCGATGGAT GATGGATGGGATGAGG AAAAAGAGGT 38466 159 3.78e-07 GGAAGCCTAG GGTGGAGAGGGTGGCG TTGTTGAAGA 22801 282 1.01e-06 AATGGAAGAT GATGACGAGGATGATC TTTATAGCTA 22723 13 1.69e-06 GTATTGGTAG GATGACAAACATGGTG AAGATGCGAG 9371 353 3.39e-06 TGATGGTGTG TGTGGAGGAGATGTGT TTGGAATTGT 11114 268 4.47e-06 AGCAACTGTG TATGGATCGGATGGGT GGCAGGCGTC 7804 304 6.60e-06 TGTGACGGAT GATGACTAGGCTCGTG TCATCGGTCA 32894 251 6.60e-06 GGGCCGTGGA GGCGAACAACATGGTG TGTGTATTTG 10282 294 7.44e-06 TGCAGCTCAT TGTGCATATGATGATG TTCAAAAGAA 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41300                             1.3e-08  433_[+2]_51
3777                              3.6e-08  313_[+2]_171
23738                             1.6e-07  225_[+2]_259
262980                            2.1e-07  228_[+2]_256
8503                              3.8e-07  140_[+2]_344
38466                             3.8e-07  158_[+2]_326
22801                               1e-06  281_[+2]_203
22723                             1.7e-06  12_[+2]_472
9371                              3.4e-06  352_[+2]_132
11114                             4.5e-06  267_[+2]_217
7804                              6.6e-06  303_[+2]_181
32894                             6.6e-06  250_[+2]_234
10282                             7.4e-06  293_[+2]_191
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=13
41300                    (  434) GATGGAGAAGATGGTT  1
3777                     (  314) GATGAAGAAGATGTTG  1
23738                    (  226) GGTGGCGGAGGTGGTG  1
262980                   (  229) TGTGGAGGGGGTGGTG  1
8503                     (  141) GATGGATGGGATGAGG  1
38466                    (  159) GGTGGAGAGGGTGGCG  1
22801                    (  282) GATGACGAGGATGATC  1
22723                    (   13) GATGACAAACATGGTG  1
9371                     (  353) TGTGGAGGAGATGTGT  1
11114                    (  268) TATGGATCGGATGGGT  1
7804                     (  304) GATGACTAGGCTCGTG  1
32894                    (  251) GGCGAACAACATGGTG  1
10282                    (  294) TGTGCATATGATGATG  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9215 bayes= 9.2225 E= 3.9e-001
 -1035  -1035    148     18
   111  -1035     89  -1035
 -1035   -158  -1035    176
 -1035  -1035    201  -1035
    63   -158    112  -1035
   147     42  -1035  -1035
  -169   -158    112     18
   130   -158     31  -1035
    89  -1035     89   -182
 -1035    -58    177  -1035
   147   -158    -11  -1035
 -1035  -1035  -1035    188
 -1035   -158    189  -1035
   -11  -1035    131    -82
 -1035   -158    -11    135
 -1035   -158    148    -24
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 3.9e-001
 0.000000  0.000000  0.692308  0.307692
 0.538462  0.000000  0.461538  0.000000
 0.000000  0.076923  0.000000  0.923077
 0.000000  0.000000  1.000000  0.000000
 0.384615  0.076923  0.538462  0.000000
 0.692308  0.307692  0.000000  0.000000
 0.076923  0.076923  0.538462  0.307692
 0.615385  0.076923  0.307692  0.000000
 0.461538  0.000000  0.461538  0.076923
 0.000000  0.153846  0.846154  0.000000
 0.692308  0.076923  0.230769  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.076923  0.923077  0.000000
 0.230769  0.000000  0.615385  0.153846
 0.000000  0.076923  0.230769  0.692308
 0.000000  0.076923  0.692308  0.230769
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[GT][AG]TG[GA][AC][GT][AG][AG]G[AG]TG[GA][TG][GT]
--------------------------------------------------------------------------------

Time  6.62 secs.
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 14 sites = 9 llr = 117 E-value = 5.8e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :a:16::4:617:: pos.-specific C 8:881a9:93936a probability G ::::2:::1::::: matrix T 2:211:16:1::4: bits 2.1 * * 1.9 * * * 1.7 * * * * * 1.5 * ** * * * Relative 1.3 *** ** * * * Entropy 1.1 **** ** * **** (18.8 bits) 0.8 **** **** **** 0.6 **** ********* 0.4 ************** 0.2 ************** 0.0 -------------- Multilevel CACCACCTCACACC consensus T T G A C CT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------- 12943 458 3.75e-09 CATTCTTCGT CACCACCACACACC TCTTCACCGC 8503 424 1.35e-08 CCACCACCAC CACCACCACCCACC ACGCTTCTGT 262303 455 4.10e-08 AAATACATCA CACCACCTCCCCTC CTACGTACGT 38466 462 2.30e-07 TGAACTCATT CACCACCTGACATC ACTGAAATGC 23738 409 2.91e-07 ACTCCTCTAA TATCACCTCACACC GCCGTCCTGC 7804 141 9.03e-07 TACTTGACAA CACCGCTACACCCC AGACAGCGCA 264727 411 2.57e-06 TCTGATATTC CACCTCCTCCACTC TTCACTAACA 10905 76 2.66e-06 CCCTTTATCT TACACCCTCACATC ATCTACTGAT 22801 463 4.98e-06 ACACACCACC CATTGCCACTCACC ACACACTCCT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12943                             3.8e-09  457_[+3]_29
8503                              1.3e-08  423_[+3]_63
262303                            4.1e-08  454_[+3]_32
38466                             2.3e-07  461_[+3]_25
23738                             2.9e-07  408_[+3]_78
7804                                9e-07  140_[+3]_346
264727                            2.6e-06  410_[+3]_76
10905                             2.7e-06  75_[+3]_411
22801                               5e-06  462_[+3]_24
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=14 seqs=9
12943                    (  458) CACCACCACACACC  1
8503                     (  424) CACCACCACCCACC  1
262303                   (  455) CACCACCTCCCCTC  1
38466                    (  462) CACCACCTGACATC  1
23738                    (  409) TATCACCTCACACC  1
7804                     (  141) CACCGCTACACCCC  1
264727                   (  411) CACCTCCTCCACTC  1
10905                    (   76) TACACCCTCACATC  1
22801                    (  463) CATTGCCACTCACC  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 9253 bayes= 11.4096 E= 5.8e-001
  -982    175   -982    -29
   200   -982   -982   -982
  -982    175   -982    -29
  -116    175   -982   -129
   116   -105    -16   -129
  -982    212   -982   -982
  -982    195   -982   -129
    83   -982   -982    103
  -982    195   -116   -982
   116     53   -982   -129
  -116    195   -982   -982
   142     53   -982   -982
  -982    127   -982     71
  -982    212   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 9 E= 5.8e-001
 0.000000  0.777778  0.000000  0.222222
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.777778  0.000000  0.222222
 0.111111  0.777778  0.000000  0.111111
 0.555556  0.111111  0.222222  0.111111
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.888889  0.000000  0.111111
 0.444444  0.000000  0.000000  0.555556
 0.000000  0.888889  0.111111  0.000000
 0.555556  0.333333  0.000000  0.111111
 0.111111  0.888889  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.000000  0.555556  0.000000  0.444444
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[CT]A[CT]C[AG]CC[TA]C[AC]C[AC][CT]C
--------------------------------------------------------------------------------

Time  9.59 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10282                            2.71e-04  90_[+1(4.00e-06)]_184_\
    [+2(7.44e-06)]_191
10905                            1.35e-04  75_[+3(2.66e-06)]_381_\
    [+1(2.17e-06)]_11
11114                            3.85e-04  2_[+2(3.07e-05)]_153_[+1(5.12e-06)]_\
    77_[+2(4.47e-06)]_217
12943                            6.26e-08  411_[+1(4.33e-07)]_27_\
    [+3(3.75e-09)]_29
22723                            4.62e-05  12_[+2(1.69e-06)]_360_\
    [+1(4.00e-06)]_93
22801                            2.05e-07  281_[+2(1.01e-06)]_165_\
    [+3(4.98e-06)]_3_[+1(1.49e-06)]_2
23738                            5.96e-10  225_[+2(1.64e-07)]_167_\
    [+3(2.91e-07)]_6_[+1(2.75e-07)]_53
262303                           4.48e-07  224_[+1(4.00e-05)]_1_[+1(3.08e-07)]_\
    191_[+3(4.10e-08)]_32
262980                           5.57e-06  181_[+2(5.51e-05)]_31_\
    [+2(2.14e-07)]_221_[+1(7.03e-06)]_16
264727                           3.18e-05  208_[+1(7.22e-05)]_183_\
    [+3(2.57e-06)]_4_[+1(8.23e-07)]_53
32894                            1.40e-05  250_[+2(6.60e-06)]_176_\
    [+1(4.84e-07)]_39
3777                             4.42e-06  313_[+2(3.65e-08)]_139_\
    [+1(3.67e-06)]_13
38466                            1.93e-08  13_[+1(6.50e-06)]_126_\
    [+2(3.78e-07)]_287_[+3(2.30e-07)]_25
41300                            4.55e-06  47_[+2(7.32e-05)]_48_[+1(2.32e-05)]_\
    303_[+2(1.28e-08)]_51
5132                             7.50e-05  218_[+1(1.27e-08)]_263
7530                             4.49e-02  387_[+1(4.72e-06)]_94
7804                             4.11e-08  103_[+1(2.16e-07)]_18_\
    [+3(9.03e-07)]_149_[+2(6.60e-06)]_162_[+1(8.38e-05)]
8503                             8.23e-13  140_[+2(3.78e-07)]_243_\
    [+1(2.28e-09)]_5_[+3(1.35e-08)]_11_[+3(5.32e-06)]_38
9371                             2.68e-03  160_[+1(7.59e-05)]_173_\
    [+2(3.39e-06)]_132
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************