********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/26/26.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
14374                    1.0000    500  20663                    1.0000    500
23505                    1.0000    500  24623                    1.0000    500
262963                   1.0000    500  262964                   1.0000    500
30963                    1.0000    500  35005                    1.0000    500
35523                    1.0000    500  5715                     1.0000    500
7863                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/26/26.seqs.fa -oc motifs/26 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.246 C 0.238 G 0.236 T 0.280
Background letter frequencies (from dataset with add-one prior applied):
A 0.246 C 0.238 G 0.236 T 0.280
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   7  llr = 125  E-value = 2.1e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1a1:61:71:3::74:::7::
pos.-specific     C  1:4:41:::::4:14176:::
probability       G  7:1a::a::a4:a:19:3::7
matrix            T  ::3::7:39:36:1::313a3

         bits        (per-position information content plot, 0.0 to 2.1 bits)
Relative Entropy     (25.8 bits)

Multilevel           GACGATGATGGTGAAGCCATG
consensus              T C  T  AC  C TGT T
sequence                       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
262964                      114  1.21e-11 ACCGATGTGA GACGATGATGGCGACGCCTTG TGCTTATTGT
262963                      114  1.21e-11 ACCGATGTGA GACGATGATGGCGACGCCTTG TGCTTATTGT
20663                        31  3.51e-10 CTCTTTGCTC GAGGCTGATGTTGAGGCCATG GTGTGGACTT
35005                       186  2.43e-09 TGTGTCGGCC AATGATGTTGGTGAAGCCATT GGTCACCGAA
24623                        41  3.46e-09 CGAAGCCGAC GACGCTGATGATGCACCGATG CCACTGCTGA
23505                       365  2.84e-08 CGGGAGTCGA GATGACGTTGATGAAGTTATT CCAACATTGC
35523                       119  1.23e-07 CAGCATCGTC CAAGCAGAAGTCGTCGTGATG ACCAGGATTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
262964                            1.2e-11  113_[+1]_366
262963                            1.2e-11  113_[+1]_366
20663                             3.5e-10  30_[+1]_449
35005                             2.4e-09  185_[+1]_294
24623                             3.5e-09  40_[+1]_439
23505                             2.8e-08  364_[+1]_115
35523                             1.2e-07  118_[+1]_361
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=7
262964                   (  114) GACGATGATGGCGACGCCTTG  1
262963                   (  114) GACGATGATGGCGACGCCTTG  1
20663                    (   31) GAGGCTGATGTTGAGGCCATG  1
35005                    (  186) AATGATGTTGGTGAAGCCATT  1
24623                    (   41) GACGCTGATGATGCACCGATG  1
23505                    (  365) GATGACGTTGATGAAGTTATT  1
35523                    (  119) CAAGCAGAAGTCGTCGTGATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.1634 E= 2.1e-003
   -78    -73    159   -945
   202   -945   -945   -945
   -78     85    -73      3
  -945   -945    208   -945
   122     85   -945   -945
   -78    -73   -945    135
  -945   -945    208   -945
   154   -945   -945      3
   -78   -945   -945    161
  -945   -945    208   -945
    22   -945     86      3
  -945     85   -945    103
  -945   -945    208   -945
   154    -73   -945    -97
    80     85    -73   -945
  -945    -73    186   -945
  -945    159   -945      3
  -945    126     27    -97
   154   -945   -945      3
  -945   -945   -945    183
  -945   -945    159      3
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 2.1e-003
 0.142857  0.142857  0.714286  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.428571  0.142857  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.571429  0.428571  0.000000  0.000000
 0.142857  0.142857  0.000000  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.714286  0.000000  0.000000  0.285714
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  1.000000  0.000000
 0.285714  0.000000  0.428571  0.285714
 0.000000  0.428571  0.000000  0.571429
 0.000000  0.000000  1.000000  0.000000
 0.714286  0.142857  0.000000  0.142857
 0.428571  0.428571  0.142857  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.714286  0.000000  0.285714
 0.000000  0.571429  0.285714  0.142857
 0.714286  0.000000  0.000000  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.714286  0.285714
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GA[CT]G[AC]TG[AT]TG[GAT][TC]GA[AC]G[CT][CG][AT]T[GT]
--------------------------------------------------------------------------------




Time  1.16 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   8  llr = 131  E-value = 1.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  31::1:6:::::::::6:11:
pos.-specific     C  ::::1111::1:::6:::::3
probability       G  691a16145a5:9::419:81
matrix            T  1:9:63155:4a1a4631916

         bits        (per-position information content plot, 0.0 to 2.1 bits)
Relative Entropy     (23.7 bits)

Multilevel           GGTGTGATGGGTGTCTAGTGT
consensus            A    T GT T   TGT   C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
262964                      288  2.92e-13 TTGGCTCTGT GGTGTGATGGGTGTCTAGTGT CTAGTAAAAC
262963                      288  2.92e-13 TTGGCTCTGT GGTGTGATGGGTGTCTAGTGT CTAGTAAAAC
24623                       213  2.25e-08 TGGTGGTGAG GGTGGCACTGTTGTTTAGTGC CGTGGTTGTC
5715                        367  4.09e-08 CCATGGTGGC GGTGTTGGGGTTGTCTGTTGT GGTTGTGGTT
20663                        65  4.09e-08 TGGACTTGTG TGGGCGAGGGGTGTCGAGTGG GTAGGGGGGG
35005                       101  5.05e-08 GATCTTTGGA AGTGTTCTTGTTGTTGAGTTT GAGAGGTGGC
30963                       343  5.79e-08 CCGATACAAT GGTGAGAGTGGTGTTGTGAAT GCTCTTGATT
23505                       335  3.98e-07 TCCAGTCCTG AATGTGTTTGCTTTCTTGTGC GGGAGTCGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
262964                            2.9e-13  287_[+2]_192
262963                            2.9e-13  287_[+2]_192
24623                             2.2e-08  212_[+2]_267
5715                              4.1e-08  366_[+2]_113
20663                             4.1e-08  64_[+2]_415
35005                               5e-08  100_[+2]_379
30963                             5.8e-08  342_[+2]_137
23505                               4e-07  334_[+2]_145
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
262964                   (  288) GGTGTGATGGGTGTCTAGTGT  1
262963                   (  288) GGTGTGATGGGTGTCTAGTGT  1
24623                    (  213) GGTGGCACTGTTGTTTAGTGC  1
5715                     (  367) GGTGTTGGGGTTGTCTGTTGT  1
20663                    (   65) TGGGCGAGGGGTGTCGAGTGG  1
35005                    (  101) AGTGTTCTTGTTGTTGAGTTT  1
30963                    (  343) GGTGAGAGTGGTGTTGTGAAT  1
23505                    (  335) AATGTGTTTGCTTTCTTGTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 9.36413 E= 1.6e-001
     2   -965    140   -116
   -97   -965    189   -965
  -965   -965    -92    164
  -965   -965    208   -965
   -97    -93    -92    116
  -965    -93    140    -16
   135    -93    -92   -116
  -965    -93     67     83
  -965   -965    108     83
  -965   -965    208   -965
  -965    -93    108     42
  -965   -965   -965    183
  -965   -965    189   -116
  -965   -965   -965    183
  -965    139   -965     42
  -965   -965     67    116
   135   -965    -92    -16
  -965   -965    189   -116
   -97   -965   -965    164
   -97   -965    166   -116
  -965      7    -92    116
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 1.6e-001
 0.250000  0.000000  0.625000  0.125000
 0.125000  0.000000  0.875000  0.000000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.125000  0.125000  0.625000
 0.000000  0.125000  0.625000  0.250000
 0.625000  0.125000  0.125000  0.125000
 0.000000  0.125000  0.375000  0.500000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.125000  0.500000  0.375000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.875000  0.125000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.625000  0.000000  0.375000
 0.000000  0.000000  0.375000  0.625000
 0.625000  0.000000  0.125000  0.250000
 0.000000  0.000000  0.875000  0.125000
 0.125000  0.000000  0.000000  0.875000
 0.125000  0.000000  0.750000  0.125000
 0.000000  0.250000  0.125000  0.625000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GA]GTGT[GT]A[TG][GT]G[GT]TGT[CT][TG][AT]GTG[TC]
--------------------------------------------------------------------------------




Time  2.33 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   5  llr = 101  E-value = 3.2e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::8::::6:4:::26:2::
pos.-specific     C  6:2a2:848:a24aa6:a:2a
probability       G  4a4::a:62:::6::2::48:
matrix            T  ::4:::2::4:4::::4:4::

         bits        (per-position information content plot, 0.0 to 2.1 bits)
Relative Entropy     (29.2 bits)

Multilevel           CGGCAGCGCACAGCCCACGGC
consensus            G T C TCGT TC  AT TC
sequence               C        C   G  A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
262964                       35  1.25e-11 CCCCATCTTT GGGCAGCGCACTCCCCACTGC ATTATCGTCC
262963                       35  1.25e-11 CCCCATCTTT GGGCAGCGCACTCCCCACTGC ATTATCGTCC
20663                       371  8.70e-10 CTCGTCACCT CGTCAGTGCTCAGCCATCGGC CAGTGGCTAG
35523                       326  1.43e-09 AGACACTCTC CGCCCGCCCACCGCCCACGCC CCGGCGTCGA
7863                        365  1.75e-09 TCCGCGTCGT CGTCAGCCGTCAGCCGTCAGC GCCGCCGCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
262964                            1.3e-11  34_[+3]_445
262963                            1.3e-11  34_[+3]_445
20663                             8.7e-10  370_[+3]_109
35523                             1.4e-09  325_[+3]_154
7863                              1.7e-09  364_[+3]_115
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
262964                   (   35) GGGCAGCGCACTCCCCACTGC  1
262963                   (   35) GGGCAGCGCACTCCCCACTGC  1
20663                    (  371) CGTCAGTGCTCAGCCATCGGC  1
35523                    (  326) CGCCCGCCCACCGCCCACGCC  1
7863                     (  365) CGTCAGCCGTCAGCCGTCAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.9871 E= 3.2e-001
  -897    133     76   -897
  -897   -897    208   -897
  -897    -25     76     51
  -897    207   -897   -897
   170    -25   -897   -897
  -897   -897    208   -897
  -897    175   -897    -49
  -897     75    134   -897
  -897    175    -24   -897
   129   -897   -897     51
  -897    207   -897   -897
    70    -25   -897     51
  -897     75    134   -897
  -897    207   -897   -897
  -897    207   -897   -897
   -30    133    -24   -897
   129   -897   -897     51
  -897    207   -897   -897
   -30   -897     76     51
  -897    -25    176   -897
  -897    207   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 3.2e-001
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.200000  0.400000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.200000  0.000000  0.400000
 0.000000  0.400000  0.600000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.600000  0.200000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.400000  0.400000
 0.000000  0.200000  0.800000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CG]G[GTC]C[AC]G[CT][GC][CG][AT]C[ATC][GC]CC[CAG][AT]C[GTA][GC]C
--------------------------------------------------------------------------------




Time  3.33 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14374                            4.70e-01  500
20663                            1.23e-15  5_[+1(3.06e-06)]_4_[+1(3.51e-10)]_\
    13_[+2(4.09e-08)]_285_[+3(8.70e-10)]_109
23505                            3.76e-07  334_[+2(3.98e-07)]_9_[+1(2.84e-08)]_\
    115
24623                            6.07e-10  40_[+1(3.46e-09)]_8_[+1(4.15e-06)]_\
    122_[+2(2.25e-08)]_267
262963                           9.23e-24  34_[+3(1.25e-11)]_58_[+1(1.21e-11)]_\
    62_[+2(6.24e-05)]_70_[+2(2.92e-13)]_192
262964                           9.23e-24  34_[+3(1.25e-11)]_58_[+1(1.21e-11)]_\
    62_[+2(6.24e-05)]_70_[+2(2.92e-13)]_192
30963                            8.69e-04  342_[+2(5.79e-08)]_137
35005                            4.70e-09  100_[+2(5.05e-08)]_64_\
    [+1(2.43e-09)]_294
35523                            1.07e-08  118_[+1(1.23e-07)]_186_\
    [+3(1.43e-09)]_154
5715                             6.23e-04  366_[+2(4.09e-08)]_113
7863                             4.92e-05  364_[+3(1.75e-09)]_115
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************