********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/231/231.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17335                    1.0000    500  13326                    1.0000    500
32976                    1.0000    500  40052                    1.0000    500
49843                    1.0000    500  55157                    1.0000    500
23908                    1.0000    500  50559                    1.0000    500
44876                    1.0000    500  11916                    1.0000    500
45701                    1.0000    500  54486                    1.0000    500
45964                    1.0000    500  45965                    1.0000    500
44340                    1.0000    500  46046                    1.0000    500
37368                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/231/231.seqs.fa -oc motifs/231 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.254 C 0.261 G 0.222 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.254 C 0.261 G 0.222 T 0.263
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  12  sites =  14  llr = 141  E-value = 3.0e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::21:1::31
pos.-specific     C  :1:11:8:::36
probability       G  21896:294a::
matrix            T  872::9::6:43

         bits    2.2          *
                 2.0          *
                 1.7    *   * *
                 1.5    *   * *
Relative         1.3 * ** *** *
Entropy          1.1 * ** *****
(14.5 bits)      0.9 **********
                 0.7 ********** *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTGGGTCGTGTC
consensus            G T A G G AT
sequence                       C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44340                       194  8.52e-08 TTGGGGGAAG TTGGGTCGGGTC TATCCGGTCG
23908                       270  2.52e-07 CGTGTGGGTT TTGGGTCGGGCC AAACCATTCC
17335                       294  1.49e-06 TCCCGTGAAA TTGGCTCGTGTC GAAACTATCG
32976                       352  1.93e-06 TCTGGAACGG TCGGGTCGGGTC CCTTGCCGTC
54486                        64  5.05e-06 TCGGGGAGTT TGGGATCGTGTC ATTTTCTACT
44876                       217  7.28e-06 CGAGTATCAT TTGGGTGGTGCA TAAGCGGACA
49843                        17  1.03e-05 TACCTAGGCG TTTGATCGGGCC CGTGAGGAGT
45701                       363  1.53e-05 AGGAGATGAA TTTGGTGGTGAT TTTCATTTCT
45964                        76  1.87e-05 GATGGAAGAA GTGGCTCGTGAT GATTGGTTGG
46046                        46  2.01e-05 TTCCATATAA TTGGGTCAGGAT CGATCGGTAT
11916                       106  3.23e-05 TCGCGCCCTC TTGCGTCGTGAA GGCGGTCGGC
55157                       161  3.23e-05 ATCTCACTCG GCTGGTCGTGCC ATCAAACTTC
13326                        27  3.23e-05 ATGTTGGATC TGGGGAGGGGTC TCGGCGCCGG
40052                        71  4.19e-05 GTTAGCTGTG GTGGAACGTGTT TTTGGTTTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44340                             8.5e-08  193_[+1]_295
23908                             2.5e-07  269_[+1]_219
17335                             1.5e-06  293_[+1]_195
32976                             1.9e-06  351_[+1]_137
54486                             5.1e-06  63_[+1]_425
44876                             7.3e-06  216_[+1]_272
49843                               1e-05  16_[+1]_472
45701                             1.5e-05  362_[+1]_126
45964                             1.9e-05  75_[+1]_413
46046                               2e-05  45_[+1]_443
11916                             3.2e-05  105_[+1]_383
55157                             3.2e-05  160_[+1]_328
13326                             3.2e-05  26_[+1]_462
40052                             4.2e-05  70_[+1]_418
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=14
44340                    (  194) TTGGGTCGGGTC  1
23908                    (  270) TTGGGTCGGGCC  1
17335                    (  294) TTGGCTCGTGTC  1
32976                    (  352) TCGGGTCGGGTC  1
54486                    (   64) TGGGATCGTGTC  1
44876                    (  217) TTGGGTGGTGCA  1
49843                    (   17) TTTGATCGGGCC  1
45701                    (  363) TTTGGTGGTGAT  1
45964                    (   76) GTGGCTCGTGAT  1
46046                    (   46) TTGGGTCAGGAT  1
11916                    (  106) TTGCGTCGTGAA  1
55157                    (  161) GCTGGTCGTGCC  1
13326                    (   27) TGGGGAGGGGTC  1
40052                    (   71) GTGGAACGTGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.81792 E= 3.0e+000
 -1045  -1045     -5    158
 -1045    -87    -64    144
 -1045  -1045    182    -29
 -1045   -187    206  -1045
   -25    -87    153  -1045
   -83  -1045  -1045    171
 -1045    159     -5  -1045
  -183  -1045    206  -1045
 -1045  -1045     95    112
 -1045  -1045    217  -1045
    17     13  -1045     71
   -83    113  -1045     12
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 3.0e+000
 0.000000  0.000000  0.214286  0.785714
 0.000000  0.142857  0.142857  0.714286
 0.000000  0.000000  0.785714  0.214286
 0.000000  0.071429  0.928571  0.000000
 0.214286  0.142857  0.642857  0.000000
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.785714  0.214286  0.000000
 0.071429  0.000000  0.928571  0.000000
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.000000  1.000000  0.000000
 0.285714  0.285714  0.000000  0.428571
 0.142857  0.571429  0.000000  0.285714
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[TG]T[GT]G[GA]T[CG]G[TG]G[TAC][CT]
--------------------------------------------------------------------------------

Time  2.82 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  12  sites =  13  llr = 132  E-value = 4.4e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :4a224266a::
pos.-specific     C  9::73::42:82
probability       G  16:2558:2:28
matrix            T  :::::1::::::

         bits    2.2
                 2.0   *      *
                 1.7   *      *
                 1.5 * *      * *
Relative         1.3 * *   *  ***
Entropy          1.1 ***   ** ***
(14.6 bits)      0.9 **** *** ***
                 0.7 **** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGACGGGAAACG
consensus             A  CAACG
sequence                 A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
32976                       209  2.19e-07 GGACCAGACC CAACGGGAAACG GGACCTATTC
44340                       297  5.48e-07 CAGCCGCTAC CAACGAGAAACG ATTCCTCCCC
11916                       264  5.48e-07 GCAATCGATT CGACCGGCAACG GAGGATTCCC
50559                       239  2.50e-06 AGTGTGAGTC CGACCGGACACG ATTGGTTGCA
17335                        80  6.75e-06 GATATTGTTC CGACAGGAAACC TCTAACCCTG
37368                        92  1.17e-05 GGGCCCACAC CGACCGACGACG ACATCAACAG
54486                       439  1.17e-05 CTACCCTCAC CGACAAGCAAGG AAACAAAAAA
40052                       320  1.30e-05 GCCATACCAG CGAAGGGAAACC GCACATTGAC
13326                       371  1.50e-05 TTCAGCATGA CAACGTGAGACG AACTAGTTTC
45964                       278  1.60e-05 GCGATGATCT CAAGAAGCAACG TCACACGTGA
55157                        97  2.52e-05 GTGATTGGTG GGACGAGACACG TACGAGAATC
49843                       406  4.00e-05 GATCGGAACT CGAACGACGACG ACGACGACTC
44876                       258  5.50e-05 CAATCAAAAA CAAGGAAAAAGG CCTTCTGCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32976                             2.2e-07  208_[+2]_280
44340                             5.5e-07  296_[+2]_192
11916                             5.5e-07  263_[+2]_225
50559                             2.5e-06  238_[+2]_250
17335                             6.8e-06  79_[+2]_409
37368                             1.2e-05  91_[+2]_397
54486                             1.2e-05  438_[+2]_50
40052                             1.3e-05  319_[+2]_169
13326                             1.5e-05  370_[+2]_118
45964                             1.6e-05  277_[+2]_211
55157                             2.5e-05  96_[+2]_392
49843                               4e-05  405_[+2]_83
44876                             5.5e-05  257_[+2]_231
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=13
32976                    (  209) CAACGGGAAACG  1
44340                    (  297) CAACGAGAAACG  1
11916                    (  264) CGACCGGCAACG  1
50559                    (  239) CGACCGGACACG  1
17335                    (   80) CGACAGGAAACC  1
37368                    (   92) CGACCGACGACG  1
54486                    (  439) CGACAAGCAAGG  1
40052                    (  320) CGAAGGGAAACC  1
13326                    (  371) CAACGTGAGACG  1
45964                    (  278) CAAGAAGCAACG  1
55157                    (   97) GGACGAGACACG  1
49843                    (  406) CGAACGACGACG  1
44876                    (  258) CAAGGAAAAAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.84967 E= 4.4e+001
 -1035    182   -153  -1035
    60  -1035    147  -1035
   197  -1035  -1035  -1035
   -72    141    -53  -1035
   -14     24    105  -1035
    60  -1035    128   -177
   -14  -1035    179  -1035
   127     56  -1035  -1035
   127    -76      5  -1035
   197  -1035  -1035  -1035
 -1035    170    -53  -1035
 -1035    -76    193  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 4.4e+001
 0.000000  0.923077  0.076923  0.000000
 0.384615  0.000000  0.615385  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.153846  0.692308  0.153846  0.000000
 0.230769  0.307692  0.461538  0.000000
 0.384615  0.000000  0.538462  0.076923
 0.230769  0.000000  0.769231  0.000000
 0.615385  0.384615  0.000000  0.000000
 0.615385  0.153846  0.230769  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.846154  0.153846  0.000000
 0.000000  0.153846  0.846154  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
C[GA]AC[GCA][GA][GA][AC][AG]ACG
--------------------------------------------------------------------------------

Time  5.58 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  12  sites =  17  llr = 147  E-value = 9.8e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a84:247699:6
pos.-specific     C  :21924:4:131
probability       G  ::11423:1:73
matrix            T  ::511::::1::

         bits    2.2
                 2.0 *
                 1.7 *       *
                 1.5 *       *
Relative         1.3 *  *    ***
Entropy          1.1 ** *  *****
(12.5 bits)      0.9 ** *  *****
                 0.7 ** *  ******
                 0.4 ** * *******
                 0.2 ************
                 0.0 ------------

Multilevel           AATCGAAAAAGA
consensus             CA ACGC  CG
sequence                 CG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
17335                       457  6.86e-07 TCGTACTCAC AATCCAAAAAGA ACGTGGGATA
55157                       310  4.60e-06 TAGTCGCACC AAACAAGAAAGA CTCGTTTCAC
40052                       206  4.60e-06 TAGGTTTCCA AATCGGACAAGG TACAGTATTC
54486                       375  7.04e-06 ACGCACCAAC AAACTCAAAAGA CTGTGCAAGC
49843                       161  8.10e-06 TTGACTCGGC ACTCGCAAAAGG GAGGACCGAA
23908                       352  1.80e-05 CTAGAGTAAC AAACAAACAACA TTCATCCTTT
13326                        65  1.80e-05 ATGGAACTCC AATCACGAAACA AGGGACTGGA
45965                        79  2.89e-05 TCCACAATTT AATCCAACAAGC TGGATGTCAC
37368                       149  4.11e-05 CGCCGAATGG AATGACAAAAGA GGACGCAACA
11916                       344  4.47e-05 TGCTGGCCCA AAGCGGAAAACA TGGAGATTTT
46046                       266  4.83e-05 ACAAGGCAAG ACTCGAACAAGC CATTCCCGTA
32976                       156  5.21e-05 ACCGTCTCTA ACCCGGAAAAGG CGTGACGATG
45701                         6  8.81e-05      TGTTT AACCGGGCAACA TCCCCCCACC
44876                        48  9.46e-05 CAAATATTAG AAACCAAAATGG GAGTATCGAT
50559                       359  1.09e-04 ATGTGACGTG AAACCCGAGAGA TGCACGGACG
45964                       307  1.82e-04 TGATTCACAG AAATTCACAAGA CCCCTTCATT
44340                        94  3.49e-04 CCACTCGGCC ACTCGAGAACCG GTACTCAATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17335                             6.9e-07  456_[+3]_32
55157                             4.6e-06  309_[+3]_179
40052                             4.6e-06  205_[+3]_283
54486                               7e-06  374_[+3]_114
49843                             8.1e-06  160_[+3]_328
23908                             1.8e-05  351_[+3]_137
13326                             1.8e-05  64_[+3]_424
45965                             2.9e-05  78_[+3]_410
37368                             4.1e-05  148_[+3]_340
11916                             4.5e-05  343_[+3]_145
46046                             4.8e-05  265_[+3]_223
32976                             5.2e-05  155_[+3]_333
45701                             8.8e-05  5_[+3]_483
44876                             9.5e-05  47_[+3]_441
50559                             0.00011  358_[+3]_130
45964                             0.00018  306_[+3]_182
44340                             0.00035  93_[+3]_395
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=17
17335                    (  457) AATCCAAAAAGA  1
55157                    (  310) AAACAAGAAAGA  1
40052                    (  206) AATCGGACAAGG  1
54486                    (  375) AAACTCAAAAGA  1
49843                    (  161) ACTCGCAAAAGG  1
23908                    (  352) AAACAAACAACA  1
13326                    (   65) AATCACGAAACA  1
45965                    (   79) AATCCAACAAGC  1
37368                    (  149) AATGACAAAAGA  1
11916                    (  344) AAGCGGAAAACA  1
46046                    (  266) ACTCGAACAAGC  1
32976                    (  156) ACCCGGAAAAGG  1
45701                    (    6) AACCGGGCAACA  1
44876                    (   48) AAACCAAAATGG  1
50559                    (  359) AAACCCGAGAGA  1
45964                    (  307) AAATTCACAAGA  1
44340                    (   94) ACTCGAGAACCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.00042 E= 9.8e+002
   197  -1073  -1073  -1073
   159    -15  -1073  -1073
    47   -115   -192     84
 -1073    176   -192   -216
   -11    -15     89   -116
    69     44      8  -1073
   147  -1073     40  -1073
   135     44  -1073  -1073
   189  -1073   -192  -1073
   179   -215  -1073   -216
 -1073     17    167  -1073
   121   -115     40  -1073
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 17 E= 9.8e+002
 1.000000  0.000000  0.000000  0.000000
 0.764706  0.235294  0.000000  0.000000
 0.352941  0.117647  0.058824  0.470588
 0.000000  0.882353  0.058824  0.058824
 0.235294  0.235294  0.411765  0.117647
 0.411765  0.352941  0.235294  0.000000
 0.705882  0.000000  0.294118  0.000000
 0.647059  0.352941  0.000000  0.000000
 0.941176  0.000000  0.058824  0.000000
 0.882353  0.058824  0.000000  0.058824
 0.000000  0.294118  0.705882  0.000000
 0.588235  0.117647  0.294118  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
A[AC][TA]C[GAC][ACG][AG][AC]AA[GC][AG]
--------------------------------------------------------------------------------

Time  8.17 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17335                            1.94e-07  79_[+2(6.75e-06)]_202_\
    [+1(1.49e-06)]_151_[+3(6.86e-07)]_32
13326                            1.10e-04  26_[+1(3.23e-05)]_26_[+3(1.80e-05)]_\
    294_[+2(1.50e-05)]_118
32976                            5.52e-07  155_[+3(5.21e-05)]_41_\
    [+2(2.19e-07)]_131_[+1(1.93e-06)]_137
40052                            3.74e-05  70_[+1(4.19e-05)]_123_\
    [+3(4.60e-06)]_102_[+2(1.30e-05)]_169
49843                            4.78e-05  16_[+1(1.03e-05)]_132_\
    [+3(8.10e-06)]_233_[+2(4.00e-05)]_83
55157                            5.30e-05  96_[+2(2.52e-05)]_52_[+1(3.23e-05)]_\
    137_[+3(4.60e-06)]_179
23908                            8.81e-05  269_[+1(2.52e-07)]_70_\
    [+3(1.80e-05)]_137
50559                            3.39e-03  238_[+2(2.50e-06)]_125_\
    [+2(2.35e-05)]_113
44876                            3.83e-04  47_[+3(9.46e-05)]_157_\
    [+1(7.28e-06)]_29_[+2(5.50e-05)]_231
11916                            1.35e-05  105_[+1(3.23e-05)]_146_\
    [+2(5.48e-07)]_68_[+3(4.47e-05)]_145
45701                            1.31e-02  5_[+3(8.81e-05)]_345_[+1(1.53e-05)]_\
    126
54486                            7.74e-06  63_[+1(5.05e-06)]_299_\
    [+3(7.04e-06)]_52_[+2(1.17e-05)]_50
45964                            5.13e-04  75_[+1(1.87e-05)]_190_\
    [+2(1.60e-05)]_211
45965                            1.44e-01  78_[+3(2.89e-05)]_410
44340                            3.93e-07  193_[+1(8.52e-08)]_91_\
    [+2(5.48e-07)]_192
46046                            1.61e-03  45_[+1(2.01e-05)]_208_\
    [+3(4.83e-05)]_223
37368                            3.76e-03  91_[+2(1.17e-05)]_45_[+3(4.11e-05)]_\
    340
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************