********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/155/155.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
5762                     1.0000    500  47878                    1.0000    500
4283                     1.0000    500  52368                    1.0000    500
44174                    1.0000    500  44177                    1.0000    500
35336                    1.0000    500  46126                    1.0000    500
43370                    1.0000    500  50124                    1.0000    500
45086                    1.0000    500  44420                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/155/155.seqs.fa -oc motifs/155 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.249 G 0.222 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.267 C 0.249 G 0.222 T 0.261
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  16  sites =  10  llr = 123  E-value = 4.0e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :795112aa5279:87
pos.-specific     C  9:134:3::522:4:3
probability       G  13:2391:::61:62:
matrix            T  ::::2:4:::::1:::

         bits    2.2
                 2.0        **
                 1.7      * **
                 1.5 * *  * **   *
Relative         1.3 * *  * **   * *
Entropy          1.1 ***  * **   ****
(17.7 bits)      0.9 ***  * *** *****
                 0.7 ***  * *********
                 0.4 **** * *********
                 0.2 ****************
                 0.0 ----------------

Multilevel           CAAACGTAAAGAAGAA
consensus             G CG C  CAC CGC
sequence                GT A   C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
35336                       387  1.32e-08 TAATGTAACA CAACGGCAAAGAAGAA ACATACACAC
50124                       394  2.23e-08 CTTTGTTGAC CAAACGAAAAGAACAA CCTGCTTTAC
47878                       447  4.02e-08 TTTTGTATAA CAAAGGTAACGAACAC CAACCAAGCA
5762                        369  2.04e-07 GACCGGAATT CAACCGCAACCAACAA AGACTGCTTT
44420                       352  4.43e-07 CAAGTCAGTT CAAGGGTAAAGCAGAC AGCCCTCTGC
43370                       103  9.53e-07 CAAACAACAG CAACTGTAAAAAAGGA TGTACGCACA
46126                        32  4.63e-06 CAAAGGAAAT GGAACGAAACGATGAA AAAGTGCCCC
4283                        348  4.63e-06 CAATCCGTTG CGAACGGAACGCACGC AACCGGTGGT
44177                        84  6.13e-06 TTATTTTAGT CACATGTAAAAGAGAA GGTTATCAGG
52368                       190  1.00e-05 GGCCACGGCT CGAGAACAACCAAGAA CATACGTAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35336                             1.3e-08  386_[+1]_98
50124                             2.2e-08  393_[+1]_91
47878                               4e-08  446_[+1]_38
5762                                2e-07  368_[+1]_116
44420                             4.4e-07  351_[+1]_133
43370                             9.5e-07  102_[+1]_382
46126                             4.6e-06  31_[+1]_453
4283                              4.6e-06  347_[+1]_137
44177                             6.1e-06  83_[+1]_401
52368                               1e-05  189_[+1]_295
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=10
35336                    (  387) CAACGGCAAAGAAGAA  1
50124                    (  394) CAAACGAAAAGAACAA  1
47878                    (  447) CAAAGGTAACGAACAC  1
5762                     (  369) CAACCGCAACCAACAA  1
44420                    (  352) CAAGGGTAAAGCAGAC  1
43370                    (  103) CAACTGTAAAAAAGGA  1
46126                    (   32) GGAACGAAACGATGAA  1
4283                     (  348) CGAACGGAACGCACGC  1
44177                    (   84) CACATGTAAAAGAGAA  1
52368                    (  190) CGAGAACAACCAAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 9.43433 E= 4.0e+000
  -997    185   -115   -997
   139   -997     43   -997
   175   -131   -997   -997
    90     27    -15   -997
  -142     68     43    -39
  -142   -997    202   -997
   -42     27   -115     61
   190   -997   -997   -997
   190   -997   -997   -997
    90    101   -997   -997
   -42    -32    143   -997
   139    -32   -115   -997
   175   -997   -997   -138
  -997     68    143   -997
   158   -997    -15   -997
   139     27   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 4.0e+000
 0.000000  0.900000  0.100000  0.000000
 0.700000  0.000000  0.300000  0.000000
 0.900000  0.100000  0.000000  0.000000
 0.500000  0.300000  0.200000  0.000000
 0.100000  0.400000  0.300000  0.200000
 0.100000  0.000000  0.900000  0.000000
 0.200000  0.300000  0.100000  0.400000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.200000  0.200000  0.600000  0.000000
 0.700000  0.200000  0.100000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.000000  0.400000  0.600000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.700000  0.300000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
C[AG]A[ACG][CGT]G[TCA]AA[AC][GAC][AC]A[GC][AG][AC]
--------------------------------------------------------------------------------




Time  1.28 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  15  sites =  12  llr = 130  E-value = 7.0e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  28::::::1::2:::
pos.-specific     C  2:37771:1513:57
probability       G  7:1:217:1:92::3
matrix            T  :273233a85:3a51

         bits    2.2
                 2.0        *    *
                 1.7        *  * *
                 1.5        *  * *
Relative         1.3  *     *  * *
Entropy          1.1  * *   *  * *
(15.7 bits)      0.9 ******** ** ***
                 0.7 *********** ***
                 0.4 *********** ***
                 0.2 *********** ***
                 0.0 ---------------

Multilevel           GATCCCGTTCGCTCC
consensus              CT TT  T T TG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
46126                       178  7.63e-09 GGGGAACGGG GATCCCGTTCGGTCC GCAGTCGTCC
4283                        284  1.37e-07 CTCTCGCATC GACCCCGTTCGTTCG CTCCACGCGT
50124                       347  1.04e-06 ACGATGCTTC GACCCCGTGTGTTTC CAATTCTTTC
45086                        92  2.83e-06 AATATTATGA AATCTTGTTCGCTTC TTGAAGTAAC
44420                       292  3.41e-06 GGCGATTCGT GTTTCCTTTTGGTTC AAAAGATTTC
44174                       322  3.73e-06 CTACATCTTT GATCGCGTACGCTCG AGTTCGAAAT
47878                       289  3.73e-06 GCGGATGCGG CATTGCGTTCGATTC TGGGTGTCCC
43370                         2  4.45e-06          A CATCCTTTTTGCTCG ACCGTAAAGA
5762                        234  7.33e-06 AATAGTCTTC GTTCCTCTTTGTTCC ATTGGAACTA
44177                       348  1.24e-05 GAAAGAGTTC GATTCCGTCCCTTCC TCAAATTCCG
52368                        89  2.25e-05 GTAGTTTCTG GAGCTCGTTTGATTT GCGTCACGAC
35336                       466  2.40e-05 CAATCAATTC AACTCGTTTTGCTTC TTATTTGTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46126                             7.6e-09  177_[+2]_308
4283                              1.4e-07  283_[+2]_202
50124                               1e-06  346_[+2]_139
45086                             2.8e-06  91_[+2]_394
44420                             3.4e-06  291_[+2]_194
44174                             3.7e-06  321_[+2]_164
47878                             3.7e-06  288_[+2]_197
43370                             4.5e-06  1_[+2]_484
5762                              7.3e-06  233_[+2]_252
44177                             1.2e-05  347_[+2]_138
52368                             2.3e-05  88_[+2]_397
35336                             2.4e-05  465_[+2]_20
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=12
46126                    (  178) GATCCCGTTCGGTCC  1
4283                     (  284) GACCCCGTTCGTTCG  1
50124                    (  347) GACCCCGTGTGTTTC  1
45086                    (   92) AATCTTGTTCGCTTC  1
44420                    (  292) GTTTCCTTTTGGTTC  1
44174                    (  322) GATCGCGTACGCTCG  1
47878                    (  289) CATTGCGTTCGATTC  1
43370                    (    2) CATCCTTTTTGCTCG  1
5762                     (  234) GTTCCTCTTTGTTCC  1
44177                    (  348) GATTCCGTCCCTTCC  1
52368                    (   89) GAGCTCGTTTGATTT  1
35336                    (  466) AACTCGTTTTGCTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 5832 bayes= 9.37009 E= 7.0e+000
   -68    -58    159  -1023
   164  -1023  -1023    -65
 -1023      1   -141    135
 -1023    142  -1023     35
 -1023    142    -41    -65
 -1023    142   -141     -6
 -1023   -158    159     -6
 -1023  -1023  -1023    193
  -168   -158   -141    152
 -1023    101  -1023     93
 -1023   -158    204  -1023
   -68     42    -41     35
 -1023  -1023  -1023    193
 -1023    101  -1023     93
 -1023    142     17   -165
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 12 E= 7.0e+000
 0.166667  0.166667  0.666667  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.250000  0.083333  0.666667
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.666667  0.166667  0.166667
 0.000000  0.666667  0.083333  0.250000
 0.000000  0.083333  0.666667  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.083333  0.083333  0.083333  0.750000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.083333  0.916667  0.000000
 0.166667  0.333333  0.166667  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.666667  0.250000  0.083333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
GA[TC][CT]C[CT][GT]TT[CT]G[CT]T[CT][CG]
--------------------------------------------------------------------------------




Time  2.57 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  21  sites =   6  llr = 104  E-value = 3.6e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :3:2:::::::::::8:382:
pos.-specific     C  2::3a:::2:5:5:::35:5:
probability       G  :535:3272:::2a827::28
matrix            T  827::7837a5a3:2::2222

         bits    2.2              *
                 2.0     *    * * *
                 1.7     *    * * *
                 1.5     *    * * **     *
Relative         1.3 *   * *  * * **** * *
Entropy          1.1 * * **** * * **** * *
(24.9 bits)      0.9 * * **** *** **** * *
                 0.7 ***************** * *
                 0.4 ******************* *
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TGTGCTTGTTCTCGGAGCACG
consensus             AGC G T  T T   CA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
47878                       259  1.24e-10 GGAATACGGT TGGGCGTGTTTTCGGAGTACG CGGATGCGGC
46126                       155  3.12e-09 ACGGCGGCGA CATGCTTGTTCTTGGGGAACG GGGATCCCGT
35336                        81  7.52e-09 GGAGAGGTTG TATACTTTTTCTCGGAGCTGG CGATAGAGGT
50124                       276  8.15e-09 TTTCGAATCG TTTCCGTGTTTTGGGACCAAG TTCGAAATTT
5762                        303  8.15e-09 ATTTCGATCG TGTGCTTGGTCTCGTAGAACT TCTTTGACCA
45086                       312  4.04e-08 TGGACCTCCG TGGCCTGTCTTTTGGACCATG ATGGGGGCGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47878                             1.2e-10  258_[+3]_221
46126                             3.1e-09  154_[+3]_325
35336                             7.5e-09  80_[+3]_399
50124                             8.2e-09  275_[+3]_204
5762                              8.2e-09  302_[+3]_177
45086                               4e-08  311_[+3]_168
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
47878                    (  259) TGGGCGTGTTTTCGGAGTACG  1
46126                    (  155) CATGCTTGTTCTTGGGGAACG  1
35336                    (   81) TATACTTTTTCTCGGAGCTGG  1
50124                    (  276) TTTCCGTGTTTTGGGACCAAG  1
5762                     (  303) TGTGCTTGGTCTCGTAGAACT  1
45086                    (  312) TGGCCTGTCTTTTGGACCATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 9.56395 E= 3.6e+002
  -923    -58   -923    167
    32   -923    117    -65
  -923   -923     59    135
   -68     42    117   -923
  -923    200   -923   -923
  -923   -923     59    135
  -923   -923    -41    167
  -923   -923    158     35
  -923    -58    -41    135
  -923   -923   -923    193
  -923    100   -923     93
  -923   -923   -923    193
  -923    100    -41     35
  -923   -923    217   -923
  -923   -923    191    -65
   164   -923    -41   -923
  -923     42    158   -923
    32    100   -923    -65
   164   -923   -923    -65
   -68    100    -41    -65
  -923   -923    191    -65
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 3.6e+002
 0.000000  0.166667  0.000000  0.833333
 0.333333  0.000000  0.500000  0.166667
 0.000000  0.000000  0.333333  0.666667
 0.166667  0.333333  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.166667  0.166667  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.166667  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.333333  0.500000  0.000000  0.166667
 0.833333  0.000000  0.000000  0.166667
 0.166667  0.500000  0.166667  0.166667
 0.000000  0.000000  0.833333  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
T[GA][TG][GC]C[TG]T[GT]TT[CT]T[CT]GGA[GC][CA]ACG
--------------------------------------------------------------------------------




Time  4.20 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5762                             5.54e-10  233_[+2(7.33e-06)]_54_\
    [+3(8.15e-09)]_45_[+1(2.04e-07)]_116
47878                            1.28e-12  258_[+3(1.24e-10)]_9_[+2(3.73e-06)]_\
    143_[+1(4.02e-08)]_38
4283                             4.74e-06  283_[+2(1.37e-07)]_49_\
    [+1(4.63e-06)]_137
52368                            1.25e-03  88_[+2(2.25e-05)]_86_[+1(1.00e-05)]_\
    295
44174                            8.92e-03  321_[+2(3.73e-06)]_164
44177                            9.39e-04  83_[+1(6.13e-06)]_248_\
    [+2(1.24e-05)]_138
35336                            1.20e-10  80_[+3(7.52e-09)]_285_\
    [+1(1.32e-08)]_63_[+2(2.40e-05)]_20
46126                            6.80e-12  31_[+1(4.63e-06)]_107_\
    [+3(3.12e-09)]_2_[+2(7.63e-09)]_266_[+2(5.68e-05)]_27
43370                            1.20e-05  1_[+2(4.45e-06)]_86_[+1(9.53e-07)]_\
    382
50124                            1.13e-11  275_[+3(8.15e-09)]_50_\
    [+2(1.04e-06)]_32_[+1(2.23e-08)]_91
45086                            3.77e-06  91_[+2(2.83e-06)]_205_\
    [+3(4.04e-08)]_168
44420                            3.31e-05  291_[+2(3.41e-06)]_45_\
    [+1(4.43e-07)]_133
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************