********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/226/226.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
53935                    1.0000    500  17862                    1.0000    500
46930                    1.0000    500  9709                     1.0000    500
10068                    1.0000    500  10025                    1.0000    500
30113                    1.0000    500  10896                    1.0000    500
41856                    1.0000    500  44833                    1.0000    500
34582                    1.0000    500  35311                    1.0000    500
12233                    1.0000    500  20657                    1.0000    500
27821                    1.0000    500  33675                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/226/226.seqs.fa -oc motifs/226 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.243 C 0.267 G 0.230 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.243 C 0.267 G 0.230 T 0.259
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 10 llr = 111 E-value = 6.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1::::12:4:::
pos.-specific     C  2::8a::712a:
probability       G  3:1::9::3::a
matrix            T  4a92::8328::

         bits    2.1            *
                 1.9  *  *     **
                 1.7  *  **    **
                 1.5  ** **    **
Relative         1.3  ******  ***
Entropy          1.1  ******* ***
(16.0 bits)      0.8  ******* ***
                 0.6  ******* ***
                 0.4  ******* ***
                 0.2 ************
                 0.0 ------------

Multilevel           TTTCCGTCATCG
consensus            G  T  ATGC
sequence             C       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
44833                       394  7.61e-08 TTCCGGGTAC TTTCCGTCATCG TGCATAATAT
27821                       234  5.13e-07 CAATTTTCCT TTTCCGTTATCG AGTTTTCTGC
9709                        474  6.60e-07 ACGCCGCGCC GTTCCGTCTTCG GCTTTGTTCA
30113                       322  1.52e-06 GGTATCCTCG ATTCCGTCGTCG TCGTCATCGT
10068                       228  2.03e-06 CATTGATTTT GTTTCGTCATCG CTACCATGAT
46930                       413  5.13e-06 ACAATATCCA GTTCCATCGTCG CTTTCCGAGT
17862                       274  8.19e-06 GGAGTTTTAC TTTTCGTTTTCG AAAAGTAAAT
10896                       400  9.69e-06 GCCGTCAACT TTGCCGTCCTCG AATTGGTGCG
34582                       236  1.05e-05 GATTGTCGCT CTTCCGACACCG GACGATCGCC
10025                       343  2.15e-05 CACGAAGATT CTTCCGATGCCG TGAGATGATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44833                             7.6e-08  393_[+1]_95
27821                             5.1e-07  233_[+1]_255
9709                              6.6e-07  473_[+1]_15
30113                             1.5e-06  321_[+1]_167
10068                               2e-06  227_[+1]_261
46930                             5.1e-06  412_[+1]_76
17862                             8.2e-06  273_[+1]_215
10896                             9.7e-06  399_[+1]_89
34582                             1.1e-05  235_[+1]_253
10025                             2.2e-05  342_[+1]_146
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=10
44833                    (  394) TTTCCGTCATCG  1
27821                    (  234) TTTCCGTTATCG  1
9709                     (  474) GTTCCGTCTTCG  1
30113                    (  322) ATTCCGTCGTCG  1
10068                    (  228) GTTTCGTCATCG  1
46930                    (  413) GTTCCATCGTCG  1
17862                    (  274) TTTTCGTTTTCG  1
10896                    (  400) TTGCCGTCCTCG  1
34582                    (  236) CTTCCGACACCG  1
10025                    (  343) CTTCCGATGCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 9.86175 E= 6.6e+000
  -128    -42     38     62
  -997   -997   -997    195
  -997   -997   -120    179
  -997    158   -997    -38
  -997    190   -997   -997
  -128   -997    197   -997
   -28   -997   -997    162
  -997    139   -997     21
    72   -142     38    -38
  -997    -42   -997    162
  -997    190   -997   -997
  -997   -997    212   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 6.6e+000
 0.100000  0.200000  0.300000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.800000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.100000  0.000000  0.900000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.700000  0.000000  0.300000
 0.400000  0.100000  0.300000  0.200000
 0.000000  0.200000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TGC]TT[CT]CG[TA][CT][AGT][TC]CG
--------------------------------------------------------------------------------

Time 2.28 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 8 llr = 112 E-value = 9.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::413:6:::1::11:
pos.-specific     C  3a11:a11:a:3:94:
probability       G  1:18:::91:18::1a
matrix            T  6:4:8:3:9:8:a:4:

         bits    2.1                *
                 1.9  *   *   *  *  *
                 1.7  *   *   *  *  *
                 1.5  *   * ***  ** *
Relative         1.3  *   * *** *** *
Entropy          1.1  * *** *** *** *
(20.2 bits)      0.8  * *** ******* *
                 0.6 ** *********** *
                 0.4 ** *********** *
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCAGTCAGTCTGTCCG
consensus            C T A T    C  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ----------------
34582                       305  5.15e-10 TCACTCACAG TCAGTCAGTCTGTCTG TCTACTTGAT
44833                       115  4.58e-08 AAGTTCACAG TCAGTCAGTCAGTCAG TCAGTCAGTC
27821                       138  7.66e-08 AGTCCAAATC CCTCTCAGTCTGTCTG GATATACGGT
10068                       445  1.57e-07 TCGTTTTCTA CCAGTCAGGCTGTCGG ATTCAGTTCT
35311                       356  3.18e-07 CGGGTTGGTC TCTGTCTCTCTCTCCG CTCGTCTCCA
30113                       459  3.36e-07 TCAACTACTA GCTGTCTGTCTGTACG CATTCTTGTA
41856                       119  6.45e-07 CCATCTTCCT TCCGACAGTCGCTCTG TTGCTCAATC
33675                       152  8.86e-07 CGGTATACGA TCGAACCGTCTGTCCG CGGAAACAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34582                             5.1e-10  304_[+2]_180
44833                             4.6e-08  114_[+2]_370
27821                             7.7e-08  137_[+2]_347
10068                             1.6e-07  444_[+2]_40
35311                             3.2e-07  355_[+2]_129
30113                             3.4e-07  458_[+2]_26
41856                             6.4e-07  118_[+2]_366
33675                             8.9e-07  151_[+2]_333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=8
34582                    (  305) TCAGTCAGTCTGTCTG  1
44833                    (  115) TCAGTCAGTCAGTCAG  1
27821                    (  138) CCTCTCAGTCTGTCTG  1
10068                    (  445) CCAGTCAGGCTGTCGG  1
35311                    (  356) TCTGTCTCTCTCTCCG  1
30113                    (  459) GCTGTCTGTCTGTACG  1
41856                    (  119) TCCGACAGTCGCTCTG  1
33675                    (  152) TCGAACCGTCTGTCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 10.6579 E= 9.9e+000
  -965    -10    -88    127
  -965    190   -965   -965
    62   -109    -88     53
   -96   -109    170   -965
     4   -965   -965    153
  -965    190   -965   -965
   136   -109   -965     -5
  -965   -109    192   -965
  -965   -965    -88    175
  -965    190   -965   -965
   -96   -965    -88    153
  -965    -10    170   -965
  -965   -965   -965    194
   -96    171   -965   -965
   -96     49    -88     53
  -965   -965    212   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 9.9e+000
 0.000000  0.250000  0.125000  0.625000
 0.000000  1.000000  0.000000  0.000000
 0.375000  0.125000  0.125000  0.375000
 0.125000  0.125000  0.750000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.625000  0.125000  0.000000  0.250000
 0.000000  0.125000  0.875000  0.000000
 0.000000  0.000000  0.125000  0.875000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.000000  0.125000  0.750000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.875000  0.000000  0.000000
 0.125000  0.375000  0.125000  0.375000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TC]C[AT]G[TA]C[AT]GTCT[GC]TC[CT]G
--------------------------------------------------------------------------------

Time 4.97 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 19 sites = 9 llr = 123 E-value = 4.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  9188611a88:313312::
pos.-specific     C  1:2:1:2:::41:24:::1
probability       G  :9:1194:::4291233:9
matrix            T  :::12:2:2213:3:64a:

         bits    2.1        *
                 1.9        *         *
                 1.7  *   * *    *    **
                 1.5 **   * *    *    **
Relative         1.3 ***  * ***  *    **
Entropy          1.1 **** * ***  *    **
(19.7 bits)      0.8 **** * ***  *    **
                 0.6 **** * **** *  * **
                 0.4 ****** **** * *****
                 0.2 ************* *****
                 0.0 -------------------

Multilevel           AGAAAGGAAACAGACTTTG
consensus              C T C TTGT TAGG
sequence                   T    G CG A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
33675                       218  3.83e-10 TACTGTTATT AGAAAGGAAACTGTGTGTG CGCCTGTAGG
41856                       324  1.06e-08 TATCTTGTAC AGCAAGTAAAGTGACGGTG GAAGAAGATC
10025                       206  6.43e-08 TTGAGGGTAT AGAAAGGATAGGGTAAATG ATTGAGGAAT
17862                       135  1.15e-07 TTAGACATGT AGAAAGTATACAAAATTTG AGGCAACTTC
9709                        274  1.51e-07 CATTCTCACG AGATTGGAAACCGCCTTTG GCGATACGCC
27821                        45  6.31e-07 AAGGTAGTTC AGAATGAAAATTGTCTTTC CGGTAGCTAC
12233                       178  8.90e-07 GAAGCCAGAT AGAGGGCAATGAGAAGATG CTGCCGATGT
44833                       253  1.16e-06 CATTCACGTG AACAAAGAAAGAGGCGTTG CACGCATGCT
20657                       367  2.10e-06 AACCCGGAAA CGAACGCAATCGGCGTGTG GTTGCGAGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33675                             3.8e-10  217_[+3]_264
41856                             1.1e-08  323_[+3]_158
10025                             6.4e-08  205_[+3]_276
17862                             1.1e-07  134_[+3]_347
9709                              1.5e-07  273_[+3]_208
27821                             6.3e-07  44_[+3]_437
12233                             8.9e-07  177_[+3]_304
44833                             1.2e-06  252_[+3]_229
20657                             2.1e-06  366_[+3]_115
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=9
33675                    (  218) AGAAAGGAAACTGTGTGTG  1
41856                    (  324) AGCAAGTAAAGTGACGGTG  1
10025                    (  206) AGAAAGGATAGGGTAAATG  1
17862                    (  135) AGAAAGTATACAAAATTTG  1
9709                     (  274) AGATTGGAAACCGCCTTTG  1
27821                    (   45) AGAATGAAAATTGTCTTTC  1
12233                    (  178) AGAGGGCAATGAGAAGATG  1
44833                    (  253) AACAAAGAAAGAGGCGTTG  1
20657                    (  367) CGAACGCAATCGGCGTGTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 7712 bayes= 9.87573 E= 4.8e+001
   187   -126   -982   -982
  -113   -982    195   -982
   168    -27   -982   -982
   168   -982   -105   -122
   119   -126   -105    -22
  -113   -982    195   -982
  -113    -27     95    -22
   204   -982   -982   -982
   168   -982   -982    -22
   168   -982   -982    -22
  -982     73     95   -122
    45   -126     -5     36
  -113   -982    195   -982
    45    -27   -105     36
    45     73     -5   -982
  -113   -982     53    110
   -13   -982     53     78
  -982   -982   -982    195
  -982   -126    195   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 9 E= 4.8e+001
 0.888889  0.111111  0.000000  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.777778  0.000000  0.111111  0.111111
 0.555556  0.111111  0.111111  0.222222
 0.111111  0.000000  0.888889  0.000000
 0.111111  0.222222  0.444444  0.222222
 1.000000  0.000000  0.000000  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.777778  0.000000  0.000000  0.222222
 0.000000  0.444444  0.444444  0.111111
 0.333333  0.111111  0.222222  0.333333
 0.111111  0.000000  0.888889  0.000000
 0.333333  0.222222  0.111111  0.333333
 0.333333  0.444444  0.222222  0.000000
 0.111111  0.000000  0.333333  0.555556
 0.222222  0.000000  0.333333  0.444444
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.111111  0.888889  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AG[AC]A[AT]G[GCT]A[AT][AT][CG][ATG]G[ATC][CAG][TG][TGA]TG
--------------------------------------------------------------------------------

Time 7.55 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
53935                            6.36e-01  500
17862                            1.43e-05  134_[+3(1.15e-07)]_120_\
    [+1(8.19e-06)]_215
46930                            2.51e-02  412_[+1(5.13e-06)]_76
9709                             3.47e-06  273_[+3(1.51e-07)]_181_\
    [+1(6.60e-07)]_15
10068                            3.89e-06  227_[+1(2.03e-06)]_64_\
    [+2(4.71e-06)]_125_[+2(1.57e-07)]_40
10025                            1.89e-05  205_[+3(6.43e-08)]_118_\
    [+1(2.15e-05)]_146
30113                            9.53e-06  227_[+1(1.19e-05)]_42_\
    [+2(5.16e-05)]_24_[+1(1.52e-06)]_125_[+2(3.36e-07)]_26
10896                            3.76e-02  399_[+1(9.69e-06)]_89
41856                            1.72e-07  118_[+2(6.45e-07)]_189_\
    [+3(1.06e-08)]_158
44833                            2.00e-10  114_[+2(4.58e-08)]_[+2(2.08e-05)]_\
    106_[+3(1.16e-06)]_122_[+1(7.61e-08)]_95
34582                            1.70e-07  235_[+1(1.05e-05)]_57_\
    [+2(5.15e-10)]_180
35311                            2.25e-03  355_[+2(3.18e-07)]_129
12233                            7.19e-03  177_[+3(8.90e-07)]_304
20657                            1.06e-02  366_[+3(2.10e-06)]_115
27821                            1.08e-09  44_[+3(6.31e-07)]_74_[+2(7.66e-08)]_\
    80_[+1(5.13e-07)]_255
33675                            8.67e-09  151_[+2(8.86e-07)]_50_\
    [+3(3.83e-10)]_264
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************