********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/347/347.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46447                    1.0000    500  46493                    1.0000    500
49003                    1.0000    500  41106                    1.0000    500
43903                    1.0000    500  34859                    1.0000    500
45757                    1.0000    500  12927                    1.0000    500
40219                    1.0000    500  42461                    1.0000    500
44024                    1.0000    500  48869                    1.0000    500
32286                    1.0000    500  36424                    1.0000    500
44612                    1.0000    500  47497                    1.0000    500
40676                    1.0000    500  49927                    1.0000    500
37941                    1.0000    500  35756                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/347/347.seqs.fa -oc motifs/347 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10000    N=              20
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.289 C 0.220 G 0.219 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.289 C 0.220 G 0.219 T 0.271
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  18  llr = 182  E-value = 2.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  2a49716263:::136
pos.-specific     C  :::128:3:4241152
probability       G  7:6:::4441:1472:
matrix            T  1:::11:2:28561:3

         bits    2.2
                 2.0
                 1.8  *
                 1.5  * *
Relative         1.3  * * *
Entropy          1.1 **** *    *
(14.6 bits)      0.9 ******* * ****
                 0.7 ******* * *****
                 0.4 ******* * ******
                 0.2 ****************
                 0.0 ----------------

Multilevel           GAGAACAGACTTTGCA
consensus            A A C GCGACCG AT
sequence                           G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
12927                       212  1.56e-08 ACTGGATGGG GAGAACAGACTTTGCC ATGTGGGACC
41106                       279  2.27e-07 CGTGAGAACG GAGACCGCACTCGGCT CCCAACCAAA
32286                       451  4.59e-07 TGCTAACTGT AAGAACAAACTTTGCA CAAACCTGCA
46447                       420  9.95e-07 CAGGCCTATC AAGAACGGAATCTGAA CAAGGACAAT
40676                       142  5.13e-06 TTCCTGCATG GAGAACGGGTTTCGCT TACAGTTTGA
48869                       478  7.42e-06 TCACCATAGC GAAAACGAGACTTGGA GTTCACC
36424                       475  8.15e-06 AAGAATACAG TAGAACAAACCCTGCA AGAACACACT
42461                        73  8.15e-06 TTAGCCGTAA AAAAACGCAGTTTGCA CTATATGTAG
45757                       326  8.15e-06 GTCTAGAAAA GAAAATAGACTCGGGC GTACTTGATG
49927                        43  8.91e-06 AGCTGTGAGG GAGATCACGTTCGGCT GGTTTCGGTC
37941                         3  1.06e-05         TT GAGACAAGGCTCGGAA TTTAAGTTGG
49003                       297  1.06e-05 TATTTCTACA GAGAACATGATTGAAA TAGTGACTGT
44024                        36  1.16e-05 AACAAATACA GAGACCAGGATTTTGT TTTCAATACT
34859                       266  1.37e-05 TCAAACAATC GAAAACATACTTTCAA AATTTCAAAA
40219                       160  1.49e-05 ACGTCACGGA GAACCCGGACTCGGCC GATGTCACAG
47497                       396  1.89e-05 GACCCCACAC GAGAACGCGACCTTGT CTTGATAGTC
46493                        81  2.37e-05 TTGCAAATTC AAAAACACATTGGGCA CTTTGTTTCC
44612                       161  1.62e-04 AAAGTTAATT GAAAATATAACTTAAA TAGTATTTCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12927                             1.6e-08  211_[+1]_273
41106                             2.3e-07  278_[+1]_206
32286                             4.6e-07  450_[+1]_34
46447                             9.9e-07  419_[+1]_65
40676                             5.1e-06  141_[+1]_343
48869                             7.4e-06  477_[+1]_7
36424                             8.1e-06  474_[+1]_10
42461                             8.1e-06  72_[+1]_412
45757                             8.1e-06  325_[+1]_159
49927                             8.9e-06  42_[+1]_442
37941                             1.1e-05  2_[+1]_482
49003                             1.1e-05  296_[+1]_188
44024                             1.2e-05  35_[+1]_449
34859                             1.4e-05  265_[+1]_219
40219                             1.5e-05  159_[+1]_325
47497                             1.9e-05  395_[+1]_89
46493                             2.4e-05  80_[+1]_404
44612                             0.00016  160_[+1]_324
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=18
12927                    (  212) GAGAACAGACTTTGCC  1
41106                    (  279) GAGACCGCACTCGGCT  1
32286                    (  451) AAGAACAAACTTTGCA  1
46447                    (  420) AAGAACGGAATCTGAA  1
40676                    (  142) GAGAACGGGTTTCGCT  1
48869                    (  478) GAAAACGAGACTTGGA  1
36424                    (  475) TAGAACAAACCCTGCA  1
42461                    (   73) AAAAACGCAGTTTGCA  1
45757                    (  326) GAAAATAGACTCGGGC  1
49927                    (   43) GAGATCACGTTCGGCT  1
37941                    (    3) GAGACAAGGCTCGGAA  1
49003                    (  297) GAGAACATGATTGAAA  1
44024                    (   36) GAGACCAGGATTTTGT  1
34859                    (  266) GAAAACATACTTTCAA  1
40219                    (  160) GAACCCGGACTCGGCC  1
47497                    (  396) GAGAACGCGACCTTGT  1
46493                    (   81) AAAAACACATTGGGCA  1
44612                    (  161) GAAAATATAACTTAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 9.92035 E= 2.2e+002
   -38  -1081    172   -228
   179  -1081  -1081  -1081
    43  -1081    148  -1081
   171   -199  -1081  -1081
   132      1  -1081   -228
  -238    192  -1081   -128
   108  -1081     83  -1081
   -80     33     83    -70
   108  -1081     83  -1081
    20    101   -198    -70
 -1081      1  -1081    152
 -1081    101   -198     88
 -1081   -199     83    104
  -138   -199    172   -128
    -6    118      2  -1081
    94    -40  -1081      4
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 18 E= 2.2e+002
 0.222222  0.000000  0.722222  0.055556
 1.000000  0.000000  0.000000  0.000000
 0.388889  0.000000  0.611111  0.000000
 0.944444  0.055556  0.000000  0.000000
 0.722222  0.222222  0.000000  0.055556
 0.055556  0.833333  0.000000  0.111111
 0.611111  0.000000  0.388889  0.000000
 0.166667  0.277778  0.388889  0.166667
 0.611111  0.000000  0.388889  0.000000
 0.333333  0.444444  0.055556  0.166667
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.444444  0.055556  0.500000
 0.000000  0.055556  0.388889  0.555556
 0.111111  0.055556  0.722222  0.111111
 0.277778  0.500000  0.222222  0.000000
 0.555556  0.166667  0.000000  0.277778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA]A[GA]A[AC]C[AG][GC][AG][CA][TC][TC][TG]G[CAG][AT]
--------------------------------------------------------------------------------

Time  3.77 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   5  llr = 97  E-value = 2.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :4:4::624:4::::::2:::
pos.-specific     C  2::4a:28:8::a:6:822:8
probability       G  8482:a::6::a:a42:26::
matrix            T  :22:::2::26::::8242a2

         bits    2.2     **     ***
                 2.0     **     ***     *
                 1.8     **     ***     *
                 1.5 *   **     ***     *
Relative         1.3 * * ** * * ******  **
Entropy          1.1 * * ** *** ******  **
(27.9 bits)      0.9 * * ** **********  **
                 0.7 * * ** ********** ***
                 0.4 ***************** ***
                 0.2 ***************** ***
                 0.0 ---------------------

Multilevel           GAGACGACGCTGCGCTCTGTC
consensus            CGTC  CAATA   GGTAC T
sequence              T G  T          CT
                                      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
41106                       200  8.49e-12 CAAATTTACG GAGCCGACGCTGCGCTCTTTC TCAAAACACT
43903                       195  1.84e-10 TCTGGTTTAT GGGCCGAAACTGCGGTCAGTC ACAGCAAGCG
49927                       426  7.82e-10 AAGACGAGTA GGTGCGACACTGCGCTCCCTC GTCGGATCTT
35756                        43  1.37e-09 CCAAAACCCA GAGACGCCGTAGCGCGCGGTC TTTGCTGGGG
40219                       306  9.78e-09 TTTGTAGTGC CTGACGTCGCAGCGGTTTGTT GCCTCCCAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41106                             8.5e-12  199_[+2]_280
43903                             1.8e-10  194_[+2]_285
49927                             7.8e-10  425_[+2]_54
35756                             1.4e-09  42_[+2]_437
40219                             9.8e-09  305_[+2]_174
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=5
41106                    (  200) GAGCCGACGCTGCGCTCTTTC  1
43903                    (  195) GGGCCGAAACTGCGGTCAGTC  1
49927                    (  426) GGTGCGACACTGCGCTCCCTC  1
35756                    (   43) GAGACGCCGTAGCGCGCGGTC  1
40219                    (  306) CTGACGTCGCAGCGGTTTGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9600 bayes= 11.1578 E= 2.9e+002
  -897    -14    186   -897
    47   -897     87    -44
  -897   -897    186    -44
    47     86    -13   -897
  -897    218   -897   -897
  -897   -897    219   -897
   105    -14   -897    -44
   -53    186   -897   -897
    47   -897    145   -897
  -897    186   -897    -44
    47   -897   -897    115
  -897   -897    219   -897
  -897    218   -897   -897
  -897   -897    219   -897
  -897    144     87   -897
  -897   -897    -13    156
  -897    186   -897    -44
   -53    -14    -13     56
  -897    -14    145    -44
  -897   -897   -897    188
  -897    186   -897    -44
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 2.9e+002
 0.000000  0.200000  0.800000  0.000000
 0.400000  0.000000  0.400000  0.200000
 0.000000  0.000000  0.800000  0.200000
 0.400000  0.400000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.600000  0.200000  0.000000  0.200000
 0.200000  0.800000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.200000  0.200000  0.400000
 0.000000  0.200000  0.600000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.800000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC][AGT][GT][ACG]CG[ACT][CA][GA][CT][TA]GCG[CG][TG][CT][TACG][GCT]T[CT]
--------------------------------------------------------------------------------

Time  7.14 secs.
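
Note on reading the two Motif 2 matrices together: the non-zero entries of the log-odds matrix appear to be consistent with 100 * log2(p / background), where p is the matching entry of the letter-probability matrix and the background frequencies are those listed in the command line summary. The Python sketch below checks this for a few cells. It is an illustration, not MEME's own code: the function name log_odds_score is hypothetical, the formula is an assumption inferred from the numbers in this file, and zero-probability cells (reported as -897) are deliberately not reproduced because they depend on how MEME smooths counts.

```python
import math

# Background letter frequencies copied from the command line summary.
BACKGROUND = {"A": 0.289, "C": 0.220, "G": 0.219, "T": 0.271}


def log_odds_score(prob, letter, background=BACKGROUND):
    """Return round(100 * log2(prob / background[letter])).

    Assumption: a plain ratio without pseudocounts. Returns None for a
    zero probability, because the file's value for such cells (-897)
    cannot be derived from this simple formula.
    """
    if prob <= 0.0:
        return None
    return round(100 * math.log2(prob / background[letter]))


if __name__ == "__main__":
    # (letter, probability, value reported in the Motif 2 log-odds matrix)
    cells = [("C", 1.0, 218), ("G", 1.0, 219), ("T", 1.0, 188), ("A", 0.6, 105)]
    for letter, prob, reported in cells:
        print(letter, prob, "computed:", log_odds_score(prob, letter),
              "reported:", reported)
```

Because the background frequencies are printed to only three decimals, a computed score can differ from the reported one by a unit or so.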
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   5  llr = 80  E-value = 2.1e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::22:aaa4288::::
pos.-specific     C  :::2:::::6::::8:
probability       G  a:24a:::2:22:a2:
matrix            T  :a62::::42::a::a

         bits    2.2 *   *        *
                 2.0 **  *       ** *
                 1.8 **  ****    ** *
                 1.5 **  ****    ****
Relative         1.3 **  ****    ****
Entropy          1.1 **  ****  ******
(23.1 bits)      0.9 **  ****  ******
                 0.7 *** **** *******
                 0.4 *** ************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTTGGAAAACAATGCT
consensus              AA    TAGG  G
sequence               GC    GT
                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
48869                        93  4.93e-09 ATGTGAAATC GTTGGAAAACAGTGCT GGTGGACGAG
46493                       110  1.22e-08 TGTTTCCCGG GTTCGAAATCGATGCT TTCTGTTAAA
44024                        67  3.02e-08 ATACTCGTTT GTTTGAAAAAAATGCT ATTCGTCTCT
32286                       429  3.33e-08 ACTTTTCGAC GTGAGAAAGCAATGCT AACTGTAAGA
37941                       260  1.25e-07 GTCCAATTTT GTAGGAAATTAATGGT AACAATTAGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48869                             4.9e-09  92_[+3]_392
46493                             1.2e-08  109_[+3]_375
44024                               3e-08  66_[+3]_418
32286                             3.3e-08  428_[+3]_56
37941                             1.3e-07  259_[+3]_225
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=5
48869                    (   93) GTTGGAAAACAGTGCT  1
46493                    (  110) GTTCGAAATCGATGCT  1
44024                    (   67) GTTTGAAAAAAATGCT  1
32286                    (  429) GTGAGAAAGCAATGCT  1
37941                    (  260) GTAGGAAATTAATGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 11.1728 E= 2.1e+003
  -897   -897    219   -897
  -897   -897   -897    188
   -53   -897    -13    115
   -53    -14     87    -44
  -897   -897    219   -897
   179   -897   -897   -897
   179   -897   -897   -897
   179   -897   -897   -897
    47   -897    -13     56
   -53    144   -897    -44
   146   -897    -13   -897
   146   -897    -13   -897
  -897   -897   -897    188
  -897   -897    219   -897
  -897    186    -13   -897
  -897   -897   -897    188
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 2.1e+003
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.000000  0.200000  0.600000
 0.200000  0.200000  0.400000  0.200000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.000000  0.200000  0.400000
 0.200000  0.600000  0.000000  0.200000
 0.800000  0.000000  0.200000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GT[TAG][GACT]GAAA[ATG][CAT][AG][AG]TG[CG]T
--------------------------------------------------------------------------------

Time 10.50 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46447                            2.88e-03  419_[+1(9.95e-07)]_65
46493                            2.35e-06  80_[+1(2.37e-05)]_13_[+3(1.22e-08)]_\
    375
49003                            6.95e-02  296_[+1(1.06e-05)]_188
41106                            1.61e-10  199_[+2(8.49e-12)]_58_\
    [+1(2.27e-07)]_206
43903                            1.44e-06  194_[+2(1.84e-10)]_285
34859                            2.29e-02  265_[+1(1.37e-05)]_219
45757                            6.48e-02  325_[+1(8.15e-06)]_159
12927                            9.30e-05  211_[+1(1.56e-08)]_273
40219                            3.20e-06  159_[+1(1.49e-05)]_130_\
    [+2(9.78e-09)]_174
42461                            3.71e-02  72_[+1(8.15e-06)]_412
44024                            3.77e-06  35_[+1(1.16e-05)]_15_[+3(3.02e-08)]_\
    418
48869                            5.62e-07  92_[+3(4.93e-09)]_369_\
    [+1(7.42e-06)]_7
32286                            2.25e-07  339_[+3(2.83e-05)]_73_\
    [+3(3.33e-08)]_6_[+1(4.59e-07)]_34
36424                            2.15e-02  474_[+1(8.15e-06)]_10
44612                            1.74e-01  500
47497                            2.82e-02  395_[+1(1.89e-05)]_89
40676                            2.96e-02  141_[+1(5.13e-06)]_343
49927                            2.78e-07  42_[+1(8.91e-06)]_367_\
    [+2(7.82e-10)]_54
37941                            2.37e-05  2_[+1(1.06e-05)]_241_[+3(1.25e-07)]_\
    225
35756                            1.77e-05  42_[+2(1.37e-09)]_387_\
    [+2(4.28e-05)]_29
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
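
Note on reading the MOTIF DIAGRAM column: each diagram alternates spacer lengths with bracketed sites, so a site's 1-based start equals the number of bases consumed before it plus one. The Python sketch below decodes one diagram under that reading. The names parse_diagram, WIDTHS and SITE are hypothetical helpers, not part of MEME; the motif widths are taken from the MOTIF header lines of this file, and diagrams wrapped with a trailing backslash are assumed to have been joined into a single string first.

```python
import re

# Motif widths taken from the MOTIF header lines of this file.
WIDTHS = {1: 16, 2: 21, 3: 16}

# One bracketed site: strand, motif number, optional p-value in parentheses.
SITE = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_diagram(diagram, widths=WIDTHS):
    """Decode one MOTIF DIAGRAM string.

    Returns (sites, total_length). Each site is a tuple
    (motif, strand, start, p_value) with a 1-based start; p_value is None
    when the diagram carries no p-value (per-motif block diagrams).
    """
    sites = []
    position = 0  # bases consumed so far
    for token in diagram.split("_"):
        match = SITE.fullmatch(token)
        if match is None:
            position += int(token)  # spacer of unannotated bases
            continue
        strand, motif, p_value = match.group(1), int(match.group(2)), match.group(3)
        sites.append((motif, strand, position + 1,
                      float(p_value) if p_value else None))
        position += widths[motif]
    return sites, position


if __name__ == "__main__":
    # Sequence 41106 from the combined block diagrams, continuation joined.
    sites, total = parse_diagram("199_[+2(8.49e-12)]_58_[+1(2.27e-07)]_206")
    for site in sites:
        print(site)
    print("sequence length:", total)
```

For sequence 41106 the decoded starts can be compared with the Start column of the per-motif site tables, and the total with the Length column of the training set.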