********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/194/194.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10111                    1.0000    500  12180                    1.0000    500
2133                     1.0000    500  24529                    1.0000    500
25076                    1.0000    500  25101                    1.0000    500
25297                    1.0000    500  25517                    1.0000    500
264461                   1.0000    500  270115                   1.0000    500
31543                    1.0000    500  31658                    1.0000    500
33153                    1.0000    500  33895                    1.0000    500
38991                    1.0000    500  40704                    1.0000    500
6592                     1.0000    500  7440                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/194/194.seqs.fa -oc motifs/194 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 18 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 9000 N= 18 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.271 C 0.219 G 0.237 T 0.272 Background letter frequencies (from dataset with add-one prior applied): A 0.271 C 0.219 G 0.237 T 0.272 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 11 llr = 141 E-value = 2.7e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 18:14:31:5::57:2 pos.-specific C 9:a6:a55a1265197 probability G ::::5:22:432::1: matrix T :2:32::2::52:2:1 bits 2.2 * * * 2.0 * * * 1.8 * * * * * 1.5 * * * * * Relative 1.3 * * * * * Entropy 1.1 *** * * * ** (18.5 bits) 0.9 **** * * ***** 0.7 **** ** ******** 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel CACCGCCCCATCCACC consensus TA A GG A sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 270115 92 6.00e-09 TCATGCAATT CACCACCCCATCCTCC GCAAAGACTC 33153 476 1.12e-08 
TACCACTCGC CACCGCCGCAGCAACC TACCAAACC 25297 485 2.99e-08 CGGACACTAA CACTGCGCCATCAACC 24529 4 3.50e-07 CGC CTCCGCATCGGCCACC GCTGAACCCG 12180 425 4.21e-07 TGGTGTTGTA CACTGCACCGCCCACA AAATACAATC 40704 445 8.40e-07 CAAAACTCTA CACCGCCACCTCCCCC TTGCCTTCTC 31543 307 9.10e-07 CCACGTGCAA CACCACACCACGAACA GAATATTCCG 2133 411 1.67e-06 TGTTTTGTCC ATCCACCCCGTTCACC ATCACTTGCC 25101 32 1.79e-06 CAAATGCCCA CACATCGTCATCAACC CCAGATCGTC 7440 273 2.68e-06 TCACTTCGTG CACTTCCCCAGGAACT TTAGGGGATA 33895 305 3.44e-06 ATTGTAGCAG CACCACCGCGTTCTGC CAAACGGCAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 270115 6e-09 91_[+1]_393 33153 1.1e-08 475_[+1]_9 25297 3e-08 484_[+1] 24529 3.5e-07 3_[+1]_481 12180 4.2e-07 424_[+1]_60 40704 8.4e-07 444_[+1]_40 31543 9.1e-07 306_[+1]_178 2133 1.7e-06 410_[+1]_74 25101 1.8e-06 31_[+1]_453 7440 2.7e-06 272_[+1]_212 33895 3.4e-06 304_[+1]_180 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=11 270115 ( 92) CACCACCCCATCCTCC 1 33153 ( 476) CACCGCCGCAGCAACC 1 25297 ( 485) CACTGCGCCATCAACC 1 24529 ( 4) CTCCGCATCGGCCACC 1 12180 ( 425) CACTGCACCGCCCACA 1 40704 ( 445) CACCGCCACCTCCCCC 1 31543 ( 307) CACCACACCACGAACA 1 2133 ( 411) ATCCACCCCGTTCACC 1 25101 ( 32) CACATCGTCATCAACC 1 7440 ( 273) CACTTCCCCAGGAACT 1 33895 ( 305) CACCACCGCGTTCTGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 
position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 11.1651 E= 2.7e-001 -157 205 -1010 -1010 159 -1010 -1010 -58 -1010 219 -1010 -1010 -157 154 -1010 0 42 -1010 94 -58 -1010 219 -1010 -1010 1 131 -38 -1010 -157 131 -38 -58 -1010 219 -1010 -1010 101 -127 62 -1010 -1010 -27 20 100 -1010 154 -38 -58 75 131 -1010 -1010 142 -127 -1010 -58 -1010 205 -138 -1010 -58 173 -1010 -158 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 2.7e-001 0.090909 0.909091 0.000000 0.000000 0.818182 0.000000 0.000000 0.181818 0.000000 1.000000 0.000000 0.000000 0.090909 0.636364 0.000000 0.272727 0.363636 0.000000 0.454545 0.181818 0.000000 1.000000 0.000000 0.000000 0.272727 0.545455 0.181818 0.000000 0.090909 0.545455 0.181818 0.181818 0.000000 1.000000 0.000000 0.000000 0.545455 0.090909 0.363636 0.000000 0.000000 0.181818 0.272727 0.545455 0.000000 0.636364 0.181818 0.181818 0.454545 0.545455 0.000000 0.000000 0.727273 0.090909 0.000000 0.181818 0.000000 0.909091 0.090909 0.000000 0.181818 0.727273 0.000000 0.090909 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- CAC[CT][GA]C[CA]CC[AG][TG]C[CA]ACC -------------------------------------------------------------------------------- Time 2.88 secs. 
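
The regular expression reported for Motif 1 can be tried directly against sequences. The sketch below is not part of the MEME output; it only uses the expression, the multilevel consensus, and the 11 sites printed above, and the names `MOTIF1_REGEX`, `SITES`, and `matches` are illustrative. It suggests that the expression describes the consensus rather than every contributing site, so a strict regex scan may miss sites that the probability matrix still scores well.

```python
import re

# Motif 1 regular expression, copied from the report above.
MOTIF1_REGEX = re.compile(r"CAC[CT][GA]C[CA]CC[AG][TG]C[CA]ACC")

# Top line of the multilevel consensus for Motif 1.
CONSENSUS = "CACCGCCCCATCCACC"

# The 11 sites listed under "Motif 1 sites sorted by position p-value".
SITES = {
    "270115": "CACCACCCCATCCTCC",
    "33153": "CACCGCCGCAGCAACC",
    "25297": "CACTGCGCCATCAACC",
    "24529": "CTCCGCATCGGCCACC",
    "12180": "CACTGCACCGCCCACA",
    "40704": "CACCGCCACCTCCCCC",
    "31543": "CACCACACCACGAACA",
    "2133": "ATCCACCCCGTTCACC",
    "25101": "CACATCGTCATCAACC",
    "7440": "CACTTCCCCAGGAACT",
    "33895": "CACCACCGCGTTCTGC",
}


def matches(seq):
    """Return True if the whole 16-mer satisfies the Motif 1 expression."""
    return MOTIF1_REGEX.fullmatch(seq) is not None


# Names of listed sites that satisfy the expression exactly.
matching_sites = [name for name, seq in SITES.items() if matches(seq)]

print("consensus matches:", matches(CONSENSUS))
print("listed sites matching exactly:", matching_sites)
```

For scanning new sequences, the position-specific matrices (for example through MAST, as the header of this file suggests) are likely a better fit than the regular expression alone.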
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 12 llr = 148 E-value = 6.0e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 2:2::3::3:111:2: pos.-specific C :::3:1:::::3:11: probability G :92:a:29:65419:6 matrix T 8178:68184438:84 bits 2.2 2.0 * 1.8 * * * * 1.5 * * * * Relative 1.3 ** * ** * Entropy 1.1 ** ** **** ** * (17.9 bits) 0.9 ** ** **** **** 0.7 *********** **** 0.4 *********** **** 0.2 **************** 0.0 ---------------- Multilevel TGTTGTTGTGGGTGTG consensus C A ATTC T sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 24529 133 6.30e-09 GCGTTGTTGT TGTTGTTGTTGGTGTT GGTGGGCTGG 25517 46 8.00e-09 GGCGATATGA TGTTGTTGTTGTTGTG ATGATGATGA 7440 138 9.52e-09 TGTGTTGCTA TGTTGTTGTGTCTGTT GTCAGCAAAG 12180 406 4.33e-08 TACGAACGCT TGTTGTTGATGGTGTT GTACACTGCA 25076 198 2.06e-07 TGATGGATTG TGTTGATGTGTCTGCG GTCTGTCCTG 10111 422 1.12e-06 CAAATAGGCT TGTTGATGTGAGAGTG AGAGGTCGTT 25297 305 1.93e-06 CCCAATCTAT TTTCGTTGATTGTGTG ATCGAGATAG 31658 102 3.88e-06 GTCATTGGTA AGATGTTGTTTGTCTG CTTCGCTCTC 40704 76 4.68e-06 CCTCCTGCCA TGTCGCTGTGGAGGTG TGGAAGACAA 33895 180 4.68e-06 CTCAAGACAG TGACGAGGTGGTTGAG ATTGACGAAT 2133 204 4.99e-06 CTTTTTGTCT TGGTGTGTTGTCTGTT TGAATTTTTG 264461 36 7.96e-06 GAGTGGGGAG AGGTGATGAGGTTGAT TGAAGCGAAG 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 24529 6.3e-09 132_[+2]_352 25517 8e-09 45_[+2]_439 7440 9.5e-09 137_[+2]_347 12180 4.3e-08 405_[+2]_79 25076 2.1e-07 197_[+2]_287 10111 1.1e-06 421_[+2]_63 25297 1.9e-06 304_[+2]_180 31658 3.9e-06 101_[+2]_383 40704 4.7e-06 75_[+2]_409 33895 4.7e-06 179_[+2]_305 2133 5e-06 203_[+2]_281 264461 8e-06 35_[+2]_449 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=16 seqs=12 24529 ( 133) TGTTGTTGTTGGTGTT 1 25517 ( 46) TGTTGTTGTTGTTGTG 1 7440 ( 138) TGTTGTTGTGTCTGTT 1 12180 ( 406) TGTTGTTGATGGTGTT 1 25076 ( 198) TGTTGATGTGTCTGCG 1 10111 ( 422) TGTTGATGTGAGAGTG 1 25297 ( 305) TTTCGTTGATTGTGTG 1 31658 ( 102) AGATGTTGTTTGTCTG 1 40704 ( 76) TGTCGCTGTGGAGGTG 1 33895 ( 180) TGACGAGGTGGTTGAG 1 2133 ( 204) TGGTGTGTTGTCTGTT 1 264461 ( 36) AGGTGATGAGGTTGAT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 9.95281 E= 6.0e-001 -70 -1023 -1023 161 -1023 -1023 195 -171 -70 -1023 -51 129 -1023 19 -1023 146 -1023 -1023 208 -1023 30 -140 -1023 110 -1023 -1023 -51 161 -1023 -1023 195 -171 -12 -1023 -1023 146 -1023 -1023 130 61 -170 -1023 108 61 -170 19 81 -12 -170 -1023 -151 161 -1023 -140 195 -1023 -70 -140 -1023 146 -1023 
-1023 130 61 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 6.0e-001 0.166667 0.000000 0.000000 0.833333 0.000000 0.000000 0.916667 0.083333 0.166667 0.000000 0.166667 0.666667 0.000000 0.250000 0.000000 0.750000 0.000000 0.000000 1.000000 0.000000 0.333333 0.083333 0.000000 0.583333 0.000000 0.000000 0.166667 0.833333 0.000000 0.000000 0.916667 0.083333 0.250000 0.000000 0.000000 0.750000 0.000000 0.000000 0.583333 0.416667 0.083333 0.000000 0.500000 0.416667 0.083333 0.250000 0.416667 0.250000 0.083333 0.000000 0.083333 0.833333 0.000000 0.083333 0.916667 0.000000 0.166667 0.083333 0.000000 0.750000 0.000000 0.000000 0.583333 0.416667 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- TGT[TC]G[TA]TG[TA][GT][GT][GCT]TGT[GT] -------------------------------------------------------------------------------- Time 5.81 secs. 
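
As a reading aid for the letter-probability matrix of Motif 2, the sketch below picks the most probable letter in each row and compares the result with the top line of the multilevel consensus. It is an illustration only: the column order A, C, G, T is taken from the ALPHABET line of this report, and `MOTIF2_PROBS` and `top_letters` are hypothetical names. It does not reproduce how MEME builds the additional consensus lines.

```python
# Columns follow the alphabet order reported in this file: A, C, G, T.
ALPHABET = "ACGT"

# Letter-probability matrix for Motif 2 (w = 16, nsites = 12), copied from the report.
MOTIF2_PROBS = [
    (0.166667, 0.000000, 0.000000, 0.833333),
    (0.000000, 0.000000, 0.916667, 0.083333),
    (0.166667, 0.000000, 0.166667, 0.666667),
    (0.000000, 0.250000, 0.000000, 0.750000),
    (0.000000, 0.000000, 1.000000, 0.000000),
    (0.333333, 0.083333, 0.000000, 0.583333),
    (0.000000, 0.000000, 0.166667, 0.833333),
    (0.000000, 0.000000, 0.916667, 0.083333),
    (0.250000, 0.000000, 0.000000, 0.750000),
    (0.000000, 0.000000, 0.583333, 0.416667),
    (0.083333, 0.000000, 0.500000, 0.416667),
    (0.083333, 0.250000, 0.416667, 0.250000),
    (0.083333, 0.000000, 0.083333, 0.833333),
    (0.000000, 0.083333, 0.916667, 0.000000),
    (0.166667, 0.083333, 0.000000, 0.750000),
    (0.000000, 0.000000, 0.583333, 0.416667),
]


def top_letters(matrix):
    """Return the most probable letter of each row, joined into a string."""
    return "".join(ALPHABET[row.index(max(row))] for row in matrix)


consensus = top_letters(MOTIF2_PROBS)
print(consensus)
```

Each row should sum to roughly 1; checking this is a cheap way to catch a mis-copied matrix before using it elsewhere.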
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 8 llr = 129 E-value = 1.5e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 198313:::19:a13943411 pos.-specific C 913::51:3:1a:54:15::9 probability G :::1:39:68:::4111::6: matrix T :::69::a11::::3:4363: bits 2.2 * 2.0 * ** 1.8 * ** 1.5 * ** ** * Relative 1.3 ** * ** *** * * Entropy 1.1 *** * ** *** * * (23.4 bits) 0.9 *** * ******* * * * 0.7 ************** * *** 0.4 ************** * **** 0.2 ********************* 0.0 --------------------- Multilevel CAATTCGTGGACACCAACTGC consensus CA A C GA TAAT sequence G T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 25101 420 1.62e-10 AATAATTAGA CAATTAGTGGACACAATATGC CACACTCCAA 6592 437 1.87e-10 ATAAACTAGA CAATTGGTGGACACCACCTTC CAACTTCACA 38991 439 7.63e-09 CATTTATGAA AAATTGGTGGACACTATCTAC CCCTCGGTGT 7440 371 1.85e-08 TTGCTACTCT CAAATAGTGGACAGGAAAATC ATGTTGAAGT 33895 328 2.56e-08 TGCCAAACGG CAAATCGTTGACAGCGATAGC GATTCTGAGA 25297 269 5.05e-08 CGTATCAACT CCCTTCGTCGACAACATCTGA CATCTCCCAA 10111 99 1.12e-07 TCTAGTAGAT CAAGTCCTGAACAGAAGTTGC CTTCGCTTCC 25517 436 1.52e-07 GCTTGTCTTT CACTACGTCTCCACTAACAGC AACCATACCT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
-------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 25101 1.6e-10 419_[+3]_60 6592 1.9e-10 436_[+3]_43 38991 7.6e-09 438_[+3]_41 7440 1.8e-08 370_[+3]_109 33895 2.6e-08 327_[+3]_152 25297 5e-08 268_[+3]_211 10111 1.1e-07 98_[+3]_381 25517 1.5e-07 435_[+3]_44 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=8 25101 ( 420) CAATTAGTGGACACAATATGC 1 6592 ( 437) CAATTGGTGGACACCACCTTC 1 38991 ( 439) AAATTGGTGGACACTATCTAC 1 7440 ( 371) CAAATAGTGGACAGGAAAATC 1 33895 ( 328) CAAATCGTTGACAGCGATAGC 1 25297 ( 269) CCCTTCGTCGACAACATCTGA 1 10111 ( 99) CAAGTCCTGAACAGAAGTTGC 1 25517 ( 436) CACTACGTCTCCACTAACAGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 10.0755 E= 1.5e+000 -111 199 -965 -965 169 -81 -965 -965 147 19 -965 -965 -12 -965 -92 120 -111 -965 -965 168 -12 119 8 -965 -965 -81 188 -965 -965 -965 -965 187 -965 19 140 -112 -111 -965 166 -112 169 -81 -965 -965 -965 219 -965 -965 188 -965 -965 -965 -111 119 66 -965 -12 77 -92 -12 169 -965 -92 -965 47 -81 -92 46 -12 119 -965 -12 47 -965 -965 120 -111 -965 140 -12 -111 199 -965 -965 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability 
matrix: alength= 4 w= 21 nsites= 8 E= 1.5e+000 0.125000 0.875000 0.000000 0.000000 0.875000 0.125000 0.000000 0.000000 0.750000 0.250000 0.000000 0.000000 0.250000 0.000000 0.125000 0.625000 0.125000 0.000000 0.000000 0.875000 0.250000 0.500000 0.250000 0.000000 0.000000 0.125000 0.875000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.250000 0.625000 0.125000 0.125000 0.000000 0.750000 0.125000 0.875000 0.125000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.125000 0.500000 0.375000 0.000000 0.250000 0.375000 0.125000 0.250000 0.875000 0.000000 0.125000 0.000000 0.375000 0.125000 0.125000 0.375000 0.250000 0.500000 0.000000 0.250000 0.375000 0.000000 0.000000 0.625000 0.125000 0.000000 0.625000 0.250000 0.125000 0.875000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- CA[AC][TA]T[CAG]GT[GC]GACA[CG][CAT]A[AT][CAT][TA][GT]C -------------------------------------------------------------------------------- Time 8.70 secs. 
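
The "Simplified pos.-specific probability matrix" rows in each motif description appear to be a one-character-per-position digest of the letter-probability matrix. The sketch below assumes a rule inferred from this file rather than taken from MEME documentation: probability times ten, rounded half up, with `:` for zero and `a` for ten. Under that assumption it reproduces the A row of Motif 3 from the A column of the matrix above. `simplify` and `MOTIF3_A` are illustrative names.

```python
# A-column of the Motif 3 letter-probability matrix (w = 21), copied from the report.
MOTIF3_A = [
    0.125, 0.875, 0.750, 0.250, 0.125, 0.250, 0.000,
    0.000, 0.000, 0.125, 0.875, 0.000, 1.000, 0.125,
    0.250, 0.875, 0.375, 0.250, 0.375, 0.125, 0.125,
]


def simplify(prob):
    """Map a probability to one character (assumed rule, inferred from this file)."""
    digit = int(prob * 10 + 0.5)  # round half up; Python's round() rounds half to even
    if digit == 0:
        return ":"
    if digit >= 10:
        return "a"
    return str(digit)


simplified_a = "".join(simplify(p) for p in MOTIF3_A)
print(simplified_a)
```

The same rule also agrees with the A row printed for Motif 1, but it should be treated as a convenience for reading the digest, not as a specification.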
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10111                            4.94e-06  98_[+3(1.12e-07)]_302_\
                                           [+2(1.12e-06)]_63
12180                            8.00e-07  405_[+2(4.33e-08)]_3_[+1(4.21e-07)]_\
                                           60
2133                             9.60e-05  203_[+2(4.99e-06)]_191_\
                                           [+1(1.67e-06)]_41_[+1(4.89e-05)]_17
24529                            7.96e-08  3_[+1(3.50e-07)]_113_[+2(6.30e-09)]_\
                                           352
25076                            4.73e-03  197_[+2(2.06e-07)]_287
25101                            1.93e-08  31_[+1(1.79e-06)]_372_\
                                           [+3(1.62e-10)]_60
25297                            1.46e-10  268_[+3(5.05e-08)]_15_\
                                           [+2(1.93e-06)]_164_[+1(2.99e-08)]
25517                            4.01e-08  45_[+2(8.00e-09)]_13_[+2(2.04e-08)]_\
                                           345_[+3(1.52e-07)]_44
264461                           4.45e-02  35_[+2(7.96e-06)]_449
270115                           1.25e-04  91_[+1(6.00e-09)]_393
31543                            3.56e-03  306_[+1(9.10e-07)]_178
31658                            3.89e-02  101_[+2(3.88e-06)]_383
33153                            1.67e-04  475_[+1(1.12e-08)]_9
33895                            1.43e-08  179_[+2(4.68e-06)]_109_\
                                           [+1(3.44e-06)]_7_[+3(2.56e-08)]_152
38991                            1.24e-04  438_[+3(7.63e-09)]_41
40704                            6.01e-06  75_[+2(4.68e-06)]_353_\
                                           [+1(8.40e-07)]_40
6592                             9.93e-06  436_[+3(1.87e-10)]_43
7440                             2.65e-11  137_[+2(9.52e-09)]_119_\
                                           [+1(2.68e-06)]_82_[+3(1.85e-08)]_109
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
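
The motif diagrams in the summary encode site positions as alternating gap lengths and bracketed motif occurrences. The sketch below is a hypothetical helper, not part of MEME: it walks one diagram string and recovers the 1-based start of each site, using the motif widths reported in this file (16, 16, and 21). The name `parse_diagram` and the `WIDTHS` table are assumptions made for illustration.

```python
import re

# Motif widths as reported in this file: motif number -> width.
WIDTHS = {1: 16, 2: 16, 3: 21}

# One bracketed occurrence, e.g. "[+3(1.12e-07)]" or "[+1]".
SITE_TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_diagram(diagram):
    """Return ([(motif, start), ...], total_length) for one diagram string.

    Starts are 1-based, matching the "Start" column of the site tables.
    Line-continuation backslashes and whitespace are dropped first.
    """
    cleaned = diagram.replace("\\", "").replace(" ", "").replace("\n", "")
    position = 0  # number of sequence letters consumed so far
    sites = []
    for token in cleaned.split("_"):
        if not token:
            continue
        site = SITE_TOKEN.fullmatch(token)
        if site:
            motif = int(site.group(2))
            sites.append((motif, position + 1))
            position += WIDTHS[motif]
        else:
            position += int(token)  # plain number: a gap of that many letters
    return sites, position


sites, length = parse_diagram("98_[+3(1.12e-07)]_302_\\ [+2(1.12e-06)]_63")
print(sites, length)
```

For sequence 10111 this yields starts 99 (motif 3) and 422 (motif 2), which agree with the per-motif site tables, and a total of 500 letters, which agrees with the training-set length.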
******************************************************************************** CPU: seaotter.hsd1.wa.comcast.net ********************************************************************************