********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/446/446.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10048                    1.0000    500  11145                    1.0000    500
16114                    1.0000    500  16378                    1.0000    500
23682                    1.0000    500  23938                    1.0000    500
23941                    1.0000    500  24564                    1.0000    500
3075                     1.0000    500  3640                     1.0000    500
3900                     1.0000    500  3954                     1.0000    500
6673                     1.0000    500  9240                     1.0000    500
9242                     1.0000    500  972                      1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/446/446.seqs.fa -oc motifs/446 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.275 C 0.224 G 0.232 T 0.269
Background letter frequencies (from dataset with add-one prior applied):
A 0.275 C 0.224 G 0.232 T 0.269
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   9  llr = 149  E-value = 4.5e-006
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:::347::86:6:241:18
pos.-specific     C  92967422a2382a3::a82
probability       G  ::11:1::::::::::::::
matrix            T  :8:3::18::122:469:1:

         bits    2.2         *    *   *
                 1.9         *    *   *
                 1.7 * *     *    *   *
                 1.5 * *     *    *   *
Relative         1.3 ***    **  * *  **
Entropy          1.1 *** *  *** * *  ****
(23.9 bits)      0.9 *** *  *** * * *****
                 0.6 ************ * *****
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CTCCCAATCAACACTTTCCA
consensus             C TACCC CCTC CA   C
sequence                         T A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
3900                        460  3.15e-10 ATAAGCTCCA CTCTCAATCACCACTTTCCC GTTCATCTCA
16114                       459  3.15e-10 ATAAGCTCCA CTCTCAATCACCACTTTCCC GTTCATCTCA
9240                        467  3.66e-10 CGACCACTTT CTCCCCATCAATACATTCCA ATACAAACAT
24564                       472  2.95e-09 CGACCACTAA CTCCCGATCAATACATTCCA ATACAAACA
10048                       471  1.47e-08 CCTCTTCAAT CTCCACCTCAACCCCATCAA TCCATAAACA
23682                       305  2.21e-08 GAGATTCCAC CTCCAAACCAACTCTATCTA TACCCAGATC
6673                        439  6.33e-08 GCAGGGGCCC CTCTACTTCCTCCCTTTCCA AGCCGGAGTT
3640                        474  8.08e-08 TCACCCATCA ACGCCCATCACCTCCATCCA TCAGCTC
3954                        463  1.90e-07 TTTGATACCA CCCGCACCCCACACCAACCA CCTCTAGGCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3900                              3.1e-10  459_[+1]_21
16114                             3.1e-10  458_[+1]_22
9240                              3.7e-10  466_[+1]_14
24564                               3e-09  471_[+1]_9
10048                             1.5e-08  470_[+1]_10
23682                             2.2e-08  304_[+1]_176
6673                              6.3e-08  438_[+1]_42
3640                              8.1e-08  473_[+1]_7
3954                              1.9e-07  462_[+1]_18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=9
3900                     (  460) CTCTCAATCACCACTTTCCC  1
16114                    (  459) CTCTCAATCACCACTTTCCC  1
9240                     (  467) CTCCCCATCAATACATTCCA  1
24564                    (  472) CTCCCGATCAATACATTCCA  1
10048                    (  471) CTCCACCTCAACCCCATCAA  1
23682                    (  305) CTCCAAACCAACTCTATCTA  1
6673                     (  439) CTCTACTTCCTCCCTTTCCA  1
3640                     (  474) ACGCCCATCACCTCCATCCA  1
3954                     (  463) CCCGCACCCCACACCAACCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7696 bayes= 10.587 E= 4.5e-006
  -130   199  -982  -982
  -982    -1  -982   153
  -982   199  -106  -982
  -982   131  -106    31
    28   157  -982  -982
    69    99  -106  -982
   128    -1  -982  -127
  -982    -1  -982   153
  -982   216  -982  -982
   150    -1  -982  -982
   101    57  -982  -127
  -982   179  -982   -27
   101    -1  -982   -27
  -982   216  -982  -982
   -31    57  -982    72
    69  -982  -982   105
  -130  -982  -982   172
  -982   216  -982  -982
  -130   179  -982  -127
   150    -1  -982  -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 9 E= 4.5e-006
 0.111111  0.888889  0.000000  0.000000
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.888889  0.111111  0.000000
 0.000000  0.555556  0.111111  0.333333
 0.333333  0.666667  0.000000  0.000000
 0.444444  0.444444  0.111111  0.000000
 0.666667  0.222222  0.000000  0.111111
 0.000000  0.222222  0.000000  0.777778
 0.000000  1.000000  0.000000  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.555556  0.333333  0.000000  0.111111
 0.000000  0.777778  0.000000  0.222222
 0.555556  0.222222  0.000000  0.222222
 0.000000  1.000000  0.000000  0.000000
 0.222222  0.333333  0.000000  0.444444
 0.444444  0.000000  0.000000  0.555556
 0.111111  0.000000  0.000000  0.888889
 0.000000  1.000000  0.000000  0.000000
 0.111111  0.777778  0.000000  0.111111
 0.777778  0.222222  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[TC]C[CT][CA][AC][AC][TC]C[AC][AC][CT][ACT]C[TCA][TA]TCC[AC]
--------------------------------------------------------------------------------

Time  2.40 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   8  llr = 136  E-value = 1.8e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3::611:11::6149:5a:4:
pos.-specific     C  8:339118:13:51:a::a:4
probability       G  ::8::1:1::6:431:4::5:
matrix            T  :a:1:69:9914:3::1::16

         bits    2.2                *  *
                 1.9  *             * **
                 1.7  *             * **
                 1.5  *  *          * **
Relative         1.3 *** * * **    ** **
Entropy          1.1 *** * ****    ** ** *
(24.4 bits)      0.9 *** * ******  ** ** *
                 0.6 ***** ******* *******
                 0.4 ************* *******
                 0.2 ************* *******
                 0.0 ---------------------

Multilevel           CTGACTTCTTGACAACAACGT
consensus            A CC      CTGG  G  AC
sequence                          T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
3900                        152  2.59e-13 CGAACGTCGT CTGACTTCTTGACAACAACGT TGATGCAAGA
16114                       151  2.59e-13 CGAACGTCGT CTGACTTCTTGACAACAACGT TGATGCAAGA
972                         343  1.36e-09 GTTCCAGTTT CTGCCTTCTTCTGTACGACAC TACGTCGCTG
11145                       426  1.30e-08 CACCTTCTCA CTGACATCACGACGACGACGT CGATACAACA
10048                       206  2.94e-08 GTCACATCAC ATCACGTGTTGAGGACGACGC TCGCTCTCCG
3075                        148  4.98e-08 CGATGACAAC ATCTCTTCTTCTGAACAACTT GATTGTGCGT
24564                        14  8.94e-08 CTGCTGACGG CTGACCTCTTTTCTGCTACAC ATCATGCGGA
9240                        378  2.14e-07 AGGCACGAAG CTGCATCATTGAACACAACAT CTCACTCTAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3900                              2.6e-13  151_[+2]_328
16114                             2.6e-13  150_[+2]_329
972                               1.4e-09  342_[+2]_137
11145                             1.3e-08  425_[+2]_54
10048                             2.9e-08  205_[+2]_274
3075                                5e-08  147_[+2]_332
24564                             8.9e-08  13_[+2]_466
9240                              2.1e-07  377_[+2]_102
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
3900                     (  152) CTGACTTCTTGACAACAACGT  1
16114                    (  151) CTGACTTCTTGACAACAACGT  1
972                      (  343) CTGCCTTCTTCTGTACGACAC  1
11145                    (  426) CTGACATCACGACGACGACGT  1
10048                    (  206) ATCACGTGTTGAGGACGACGC  1
3075                     (  148) ATCTCTTCTTCTGAACAACTT  1
24564                    (   14) CTGACCTCTTTTCTGCTACAC  1
9240                     (  378) CTGCATCATTGAACACAACAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 9.90539 E= 1.8e-002
   -14   174  -965  -965
  -965  -965  -965   189
  -965    16   169  -965
   118    16  -965  -110
  -113   196  -965  -965
  -113   -84   -89   122
  -965   -84  -965   170
  -113   174   -89  -965
  -113  -965  -965   170
  -965   -84  -965   170
  -965    16   143  -110
   118  -965  -965    48
  -113   116    69  -965
    45   -84    11   -10
   167  -965   -89  -965
  -965   216  -965  -965
    86  -965    69  -110
   186  -965  -965  -965
  -965   216  -965  -965
    45  -965   111  -110
  -965    74  -965   122
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 1.8e-002
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.750000  0.000000
 0.625000  0.250000  0.000000  0.125000
 0.125000  0.875000  0.000000  0.000000
 0.125000  0.125000  0.125000  0.625000
 0.000000  0.125000  0.000000  0.875000
 0.125000  0.750000  0.125000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.250000  0.625000  0.125000
 0.625000  0.000000  0.000000  0.375000
 0.125000  0.500000  0.375000  0.000000
 0.375000  0.125000  0.250000  0.250000
 0.875000  0.000000  0.125000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.375000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.375000  0.000000  0.500000  0.125000
 0.000000  0.375000  0.000000  0.625000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CA]T[GC][AC]CTTCTT[GC][AT][CG][AGT]AC[AG]AC[GA][TC]
--------------------------------------------------------------------------------

Time  4.55 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   8  llr = 120  E-value = 2.1e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  3::4::::::11::1a
pos.-specific     C  :44::::1::811:3:
probability       G  :65:5a::a1:8::6:
matrix            T  8:165:a9:91:9a::

         bits    2.2      *  *
                 1.9      ** *    * *
                 1.7      ** *    * *
                 1.5      ** *    * *
Relative         1.3      *****  ** *
Entropy          1.1 **  ********** *
(21.7 bits)      0.9 ** *************
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TGGTGGTTGTCGTTGA
consensus            ACCAT         C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
16378                        37  1.94e-09 AATGGCGTCA TCGTTGTTGTCGTTGA GGCGGCGGGC
3900                        324  3.10e-09 TGCCTTCGAC TGGTGGTTGTCGTTCA AAAATACAGT
16114                       323  3.10e-09 TGCCTTCGAC TGGTGGTTGTCGTTCA AAAATACAGT
3640                          5  1.42e-08       GTAG TCTTGGTTGTCGTTGA GCTCGTCGAG
23682                         9  1.91e-07   CTGCTGTC ACCATGTTGGCGTTGA TGGCTGCCAC
3954                        322  2.45e-07 ATAGGAGATG TGCATGTCGTAGTTGA TAAACGTAGG
3075                        406  2.73e-07 AGAGAGCGAG AGGTGGTTGTTCTTGA TCACCTCCCG
23941                       238  7.19e-07 AGCCAGAATT TGCATGTTGTCACTAA TGTTCTCTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
16378                             1.9e-09  36_[+3]_448
3900                              3.1e-09  323_[+3]_161
16114                             3.1e-09  322_[+3]_162
3640                              1.4e-08  4_[+3]_480
23682                             1.9e-07  8_[+3]_476
3954                              2.5e-07  321_[+3]_163
3075                              2.7e-07  405_[+3]_79
23941                             7.2e-07  237_[+3]_247
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
16378                    (   37) TCGTTGTTGTCGTTGA  1
3900                     (  324) TGGTGGTTGTCGTTCA  1
16114                    (  323) TGGTGGTTGTCGTTCA  1
3640                     (    5) TCTTGGTTGTCGTTGA  1
23682                    (    9) ACCATGTTGGCGTTGA  1
3954                     (  322) TGCATGTCGTAGTTGA  1
3075                     (  406) AGGTGGTTGTTCTTGA  1
23941                    (  238) TGCATGTTGTCACTAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 10.6579 E= 2.1e-002
   -14  -965  -965   148
  -965    74   143  -965
  -965    74   111  -110
    45  -965  -965   122
  -965  -965   111    89
  -965  -965   211  -965
  -965  -965  -965   189
  -965   -84  -965   170
  -965  -965   211  -965
  -965  -965   -89   170
  -113   174  -965  -110
  -113   -84   169  -965
  -965   -84  -965   170
  -965  -965  -965   189
  -113    16   143  -965
   186  -965  -965  -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 2.1e-002
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.375000  0.625000  0.000000
 0.000000  0.375000  0.500000  0.125000
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.125000  0.875000
 0.125000  0.750000  0.000000  0.125000
 0.125000  0.125000  0.750000  0.000000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.250000  0.625000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TA][GC][GC][TA][GT]GTTGTCGTT[GC]A
--------------------------------------------------------------------------------

Time  6.80 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10048                            1.67e-08  205_[+2(2.94e-08)]_244_\
    [+1(1.47e-08)]_10
11145                            2.51e-04  425_[+2(1.30e-08)]_54
16114                            3.96e-20  150_[+2(2.59e-13)]_151_\
    [+3(3.10e-09)]_120_[+1(3.15e-10)]_22
16378                            4.16e-05  36_[+3(1.94e-09)]_448
23682                            2.42e-08  8_[+3(1.91e-07)]_280_[+1(2.21e-08)]_\
    176
23938                            3.91e-01  500
23941                            1.62e-03  237_[+3(7.19e-07)]_247
24564                            5.71e-09  13_[+2(8.94e-08)]_437_\
    [+1(2.95e-09)]_9
3075                             2.42e-07  147_[+2(4.98e-08)]_237_\
    [+3(2.73e-07)]_79
3640                             6.87e-09  4_[+3(1.42e-08)]_5_[+3(8.01e-05)]_\
    375_[+1(6.28e-06)]_37_[+1(8.08e-08)]_7
3900                             3.96e-20  151_[+2(2.59e-13)]_151_\
    [+3(3.10e-09)]_120_[+1(3.15e-10)]_21
3954                             1.38e-06  321_[+3(2.45e-07)]_125_\
    [+1(1.90e-07)]_18
6673                             2.16e-04  438_[+1(6.33e-08)]_42
9240                             3.30e-09  92_[+1(1.41e-05)]_265_\
    [+2(2.14e-07)]_68_[+1(3.66e-10)]_14
9242                             8.36e-01  500
972                              4.22e-06  342_[+2(1.36e-09)]_137
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************