********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/374/374.seqs.fa
ALPHABET= ACGT
Sequence name  Weight Length  Sequence name  Weight Length
-------------  ------ ------  -------------  ------ ------
36922          1.0000    500  47160          1.0000    500
37903          1.0000    500  48525          1.0000    500
2527           1.0000    500  23380          1.0000    500
50228          1.0000    500  45204          1.0000    500
4129           1.0000    500  45411          1.0000    500
42496          1.0000    500  38240          1.0000    500
44974          1.0000    500  43372          1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/374/374.seqs.fa -oc motifs/374 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.267 C 0.248 G 0.242 T 0.243
Background letter frequencies (from dataset with add-one prior applied):
A 0.267 C 0.248 G 0.242 T 0.243
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 19 sites = 7 llr = 114 E-value = 5.9e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  6a::7:19::43:4::176
pos.-specific     C  ::6:36:::::7::16:3:
probability       G  4::a:491:a1:66:33:4
matrix            T  ::4:::::a:4:4:916::

         bits    2.0    *    **
                 1.8  * *    **
                 1.6  * *    **
                 1.4  * *  ****    *
Relative         1.2  * *  ****    *
Entropy          1.0 ********** ****  **
(23.4 bits)      0.8 ********** ****  **
                 0.6 *******************
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           AACGACGATGACGGTCTAA
consensus            G T CG    TATA GGCG
sequence

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value                   Site
-------------  ----- ---------            -------------------
44974            249  1.99e-09 ACGGACTTTT GACGCGGATGTCTGTCTAA TGCCAAAGAC
4129             414  3.09e-09 CGTTATTGAA GATGACGATGACGGTTTAA ACTGGGATAC
37903            109  3.95e-09 AAGAACACAA AACGACGATGAAGATCGAA GTCAGGAAAA
43372            109  1.66e-08 GACCGAACGG AACGAGGGTGTCTGTCTCA GTCCCAATTC
2527             438  5.69e-08 TCCAGAGCTC AATGCGGATGACGATGAAG ATGATACTGA
42496            257  6.43e-08 GGGTCTCTTC GATGACAATGTCTGTCGCG TCCCCCACTC
45204             45  9.54e-08 TACAGCGCAA AACGACGATGGAGACGTAG GACATACACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
44974                     2e-09  248_[+1]_233
4129                    3.1e-09  413_[+1]_68
37903                     4e-09  108_[+1]_373
43372                   1.7e-08  108_[+1]_373
2527                    5.7e-08  437_[+1]_44
42496                   6.4e-08  256_[+1]_225
45204                   9.5e-08  44_[+1]_437
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=7
44974          (  249) GACGCGGATGTCTGTCTAA  1
4129           (  414) GATGACGATGACGGTTTAA  1
37903          (  109) AACGACGATGAAGATCGAA  1
43372          (  109) AACGAGGGTGTCTGTCTCA  1
2527           (  438) AATGCGGATGACGATGAAG  1
42496          (  257) GATGACAATGTCTGTCGCG  1
45204          (   45) AACGACGATGGAGACGTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 6748 bayes= 9.7551 E= 5.9e+001
  109  -945    82  -945
  190  -945  -945  -945
 -945   120  -945    82
 -945  -945   205  -945
  142    20  -945  -945
 -945   120    82  -945
  -90  -945   182  -945
  168  -945   -76  -945
 -945  -945  -945   204
 -945  -945   205  -945
   68  -945   -76    82
   10   153  -945  -945
 -945  -945   124    82
   68  -945   124  -945
 -945   -79  -945   182
 -945   120    24   -76
  -90  -945    24   123
  142    20  -945  -945
  109  -945    82  -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 7 E= 5.9e+001
 0.571429  0.000000  0.428571  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.571429  0.000000  0.428571
 0.000000  0.000000  1.000000  0.000000
 0.714286  0.285714  0.000000  0.000000
 0.000000  0.571429  0.428571  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.000000  0.142857  0.428571
 0.285714  0.714286  0.000000  0.000000
 0.000000  0.000000  0.571429  0.428571
 0.428571  0.000000  0.571429  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.571429  0.285714  0.142857
 0.142857  0.000000  0.285714  0.571429
 0.714286  0.285714  0.000000  0.000000
 0.571429  0.000000  0.428571  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[AG]A[CT]G[AC][CG]GATG[AT][CA][GT][GA]T[CG][TG][AC][AG]
--------------------------------------------------------------------------------

Time 1.66 secs.
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 15 sites = 7 llr = 95 E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a1:::7::3::::9
pos.-specific     C  9:3::::a:41:11:
probability       G  1:3:a33:13:696:
matrix            T  ::3a:7::9:94:31

         bits    2.0    **  *
                 1.8  * **  *
                 1.6  * **  *
                 1.4 ** **  ** * * *
Relative         1.2 ** *** ** * * *
Entropy          1.0 ** ****** *** *
(19.6 bits)      0.8 ** ****** *** *
                 0.6 ** ****** *****
                 0.4 ** ************
                 0.2 ** ************
                 0.0 ---------------

Multilevel           CACTGTACTCTGGGA
consensus              G  GG  A T T
sequence               T      G

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value                 Site
-------------  ----- ---------            ---------------
37903            165  5.10e-09 TATGAGTCTA CATTGTACTCTTGGA TATTCGTAGG
42496             68  4.16e-08 CACTGTCGGC CAATGTACTATGGGA CCATAAGTTC
50228            464  1.13e-07 ACAAGAATCG CACTGGACTCTTGTA GCAAAGATTG
38240             65  1.81e-07 AGCGACATTT CAGTGTACTATGCGA TTGCGGTGAA
45204            297  1.95e-07 CATCACATCT CAGTGGACTGTTGTA AGGCACGGTT
48525            209  1.55e-06 GGGCCCGTAG CACTGTGCTGCGGGT ACCAGTAGCA
23380            435  2.74e-06 GCTATTGATC GATTGTGCGCTGGCA TTCGATCGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
37903                   5.1e-09  164_[+2]_321
42496                   4.2e-08  67_[+2]_418
50228                   1.1e-07  463_[+2]_22
38240                   1.8e-07  64_[+2]_421
45204                   1.9e-07  296_[+2]_189
48525                   1.6e-06  208_[+2]_277
23380                   2.7e-06  434_[+2]_51
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=7
37903          (  165) CATTGTACTCTTGGA  1
42496          (   68) CAATGTACTATGGGA  1
50228          (  464) CACTGGACTCTTGTA  1
38240          (   65) CAGTGTACTATGCGA  1
45204          (  297) CAGTGGACTGTTGTA  1
48525          (  209) CACTGTGCTGCGGGT  1
23380          (  435) GATTGTGCGCTGGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6804 bayes= 10.5296 E= 1.5e+002
 -945   179   -76  -945
  190  -945  -945  -945
  -90    20    24    23
 -945  -945  -945   204
 -945  -945   205  -945
 -945  -945    24   156
  142  -945    24  -945
 -945   201  -945  -945
 -945  -945   -76   182
   10    79    24  -945
 -945   -79  -945   182
 -945  -945   124    82
 -945   -79   182  -945
 -945   -79   124    23
  168  -945  -945   -76
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 1.5e+002
 0.000000  0.857143  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.285714  0.285714  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.285714  0.714286
 0.714286  0.000000  0.285714  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.285714  0.428571  0.285714  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  0.571429  0.428571
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.142857  0.571429  0.285714
 0.857143  0.000000  0.000000  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
CA[CGT]TG[TG][AG]CT[CAG]T[GT]G[GT]A
--------------------------------------------------------------------------------

Time 3.38 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 3 llr = 54 E-value = 6.2e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a::::::::::::
pos.-specific     C  :::7:::::::a:::
probability       G  a7:33a:aa7a:3a:
matrix            T  :3::7:a::3::7:a

         bits    2.0 *    **** ** **
                 1.8 * *  **** ** **
                 1.6 * *  **** ** **
                 1.4 * *  **** ** **
Relative         1.2 ***  ******* **
Entropy          1.0 ***************
(25.9 bits)      0.8 ***************
                 0.6 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GGACTGTGGGGCTGT
consensus             T GG    T  G
sequence

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value                 Site
-------------  ----- ---------            ---------------
37903            189  6.74e-10 ATATTCGTAG GGACTGTGGGGCTGT GATTCGTTGT
38240            320  4.02e-09 TAGCCGAAAG GGAGTGTGGGGCTGT TCACTGTAAC
2527             341  2.06e-08 TTGATTGCAC GTACGGTGGTGCGGT ACCGGAAGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
37903                   6.7e-10  188_[+3]_297
38240                     4e-09  319_[+3]_166
2527                    2.1e-08  340_[+3]_145
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=3
37903          (  189) GGACTGTGGGGCTGT  1
38240          (  320) GGAGTGTGGGGCTGT  1
2527           (  341) GTACGGTGGTGCGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6804 bayes= 11.5942 E= 6.2e+002
 -823  -823   204  -823
 -823  -823   146    46
  190  -823  -823  -823
 -823   142    46  -823
 -823  -823    46   145
 -823  -823   204  -823
 -823  -823  -823   204
 -823  -823   204  -823
 -823  -823   204  -823
 -823  -823   146    46
 -823  -823   204  -823
 -823   201  -823  -823
 -823  -823    46   145
 -823  -823   204  -823
 -823  -823  -823   204
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 3 E= 6.2e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
G[GT]A[CG][TG]GTGG[GT]GC[TG]GT
--------------------------------------------------------------------------------

Time 5.00 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME  COMBINED P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
36922                  8.74e-01  500
47160                  9.80e-01  500
37903                  1.36e-15  108_[+1(3.95e-09)]_37_\
    [+2(5.10e-09)]_9_[+3(6.74e-10)]_297
48525                  8.44e-05  208_[+2(1.55e-06)]_49_\
    [+3(9.18e-05)]_8_[+1(6.36e-05)]_186
2527                   6.19e-08  340_[+3(2.06e-08)]_82_\
    [+1(5.69e-08)]_44
23380                  2.33e-03  434_[+2(2.74e-06)]_51
50228                  8.39e-04  463_[+2(1.13e-07)]_22
45204                  2.54e-07  44_[+1(9.54e-08)]_233_\
    [+2(1.95e-07)]_189
4129                   5.97e-05  146_[+1(2.76e-05)]_248_\
    [+1(3.09e-09)]_68
45411                  6.71e-01  500
42496                  2.77e-08  67_[+2(4.16e-08)]_174_\
    [+1(6.43e-08)]_225
38240                  1.57e-08  64_[+2(1.81e-07)]_240_\
    [+3(4.02e-09)]_166
44974                  4.97e-05  248_[+1(1.99e-09)]_233
43372                  1.17e-04  108_[+1(1.66e-08)]_373
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************