********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/283/283.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
2301                     1.0000    500  24284                    1.0000    500
262257                   1.0000    500  263554                   1.0000    500
269091                   1.0000    500  270379                   1.0000    500
31637                    1.0000    500  bd174                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/283/283.seqs.fa -oc motifs/283 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.249 C 0.212 G 0.268 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.249 C 0.212 G 0.268 T 0.271
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   4  llr = 90  E-value = 2.2e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :a::::385385::8::a5a
pos.-specific     C  3:aa::8:58:3aa:8a:::
probability       G  8:::aa:3:::3::3:::5:
matrix            T  ::::::::::3::::3::::

         bits    2.2   **        **  *
                 2.0  ***        **  ** *
                 1.8  *****      **  ** *
                 1.6  *****      **  ** *
Relative         1.3  ******  *  ** *** *
Entropy          1.1 *********** ****** *
(32.4 bits)      0.9 *********** ********
                 0.7 *********** ********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GACCGGCAACAACCACCAAA
consensus            C     AGCATC  GT  G
sequence                        G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
270379                      479  8.00e-13 TCAACACACC GACCGGCAACAACCACCAAA CA
263554                      480  8.00e-13 TCAACACACC GACCGGCAACAACCACCAAA C
bd174                       327  2.29e-10 AGACAGACTG GACCGGAGCAAGCCACCAGA GACAGTGAAA
269091                      133  3.30e-10 AATCAACACT CACCGGCACCTCCCGTCAGA AGGATATTCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
270379                              8e-13  478_[+1]_2
263554                              8e-13  479_[+1]_1
bd174                             2.3e-10  326_[+1]_154
269091                            3.3e-10  132_[+1]_348
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=4
270379                   (  479) GACCGGCAACAACCACCAAA  1
263554                   (  480) GACCGGCAACAACCACCAAA  1
bd174                    (  327) GACCGGAGCAAGCCACCAGA  1
269091                   (  133) CACCGGCACCTCCCGTCAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3848 bayes= 9.90839 E= 2.2e-003
  -865     24    148   -865
   200   -865   -865   -865
  -865    223   -865   -865
  -865    223   -865   -865
  -865   -865    190   -865
  -865   -865    190   -865
     1    182   -865   -865
   159   -865    -10   -865
   100    124   -865   -865
     1    182   -865   -865
   159   -865   -865    -12
   100     24    -10   -865
  -865    223   -865   -865
  -865    223   -865   -865
   159   -865    -10   -865
  -865    182   -865    -12
  -865    223   -865   -865
   200   -865   -865   -865
   100   -865     90   -865
   200   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 2.2e-003
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.500000  0.250000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GC]ACCGG[CA][AG][AC][CA][AT][ACG]CC[AG][CT]CA[AG]A
--------------------------------------------------------------------------------

Time  0.51 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   7  llr = 121  E-value = 7.9e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  13:1:::91314:6a:7:3:3
pos.-specific     C  ::39:74:619:a4:9:6437
probability       G  :73:736:1::::::13::7:
matrix            T  9:4:3::116:6:::::43::

         bits    2.2             *
                 2.0             * *
                 1.8             * *
                 1.6    *      * * **
Relative         1.3 *  * * *  * * **    *
Entropy          1.1 ** *****  * ****** **
(24.9 bits)      0.9 ** *****  ******** **
                 0.7 ** ***** ********* **
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TGTCGCGACTCTCAACACCGC
consensus             AC TGC  A A C  GTACA
sequence               G               T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
270379                      458  2.86e-10 CACTACTGAA TGGCGGGACTCTCAACACACC GACCGGCAAC
263554                      459  2.86e-10 CACTACTGAA TGGCGGGACTCTCAACACACC GACCGGCAAC
269091                      478  6.29e-10 CCTGTGAGCC AGCCGCGACACACAACACTGC AG
31637                       146  5.74e-09 GGAGCATGTC TGTCTCGAGTCTCCACGTTGC GACGCGTGTA
262257                      114  5.74e-09 GGACAACAAG TGTCGCCACAAACAACGCCGA GTGGGTGACA
bd174                       402  2.47e-08 TCTATCCTAC TACCGCCTATCACCACATCGA CATAGTCAAC
24284                       405  1.50e-07 ATAGCGTCGG TATATCCATCCTCCAGATCGC GCTGTCGCTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
270379                            2.9e-10  457_[+2]_22
263554                            2.9e-10  458_[+2]_21
269091                            6.3e-10  477_[+2]_2
31637                             5.7e-09  145_[+2]_334
262257                            5.7e-09  113_[+2]_366
bd174                             2.5e-08  401_[+2]_78
24284                             1.5e-07  404_[+2]_75
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=7
270379                   (  458) TGGCGGGACTCTCAACACACC  1
263554                   (  459) TGGCGGGACTCTCAACACACC  1
269091                   (  478) AGCCGCGACACACAACACTGC  1
31637                    (  146) TGTCTCGAGTCTCCACGTTGC  1
262257                   (  114) TGTCGCCACAAACAACGCCGA  1
bd174                    (  402) TACCGCCTATCACCACATCGA  1
24284                    (  405) TATATCCATCCTCCAGATCGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3840 bayes= 10.3208 E= 7.9e-003
   -80   -945   -945    166
    20   -945    141   -945
  -945     43      9     66
   -80    201   -945   -945
  -945   -945    141      8
  -945    175      9   -945
  -945    101    109   -945
   178   -945   -945    -92
   -80    143    -91    -92
    20    -57   -945    108
   -80    201   -945   -945
    78   -945   -945    108
  -945    224   -945   -945
   120    101   -945   -945
   200   -945   -945   -945
  -945    201    -91   -945
   152   -945      9   -945
  -945    143   -945     66
    20    101   -945      8
  -945     43    141   -945
    20    175   -945   -945
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 7.9e-003
 0.142857  0.000000  0.000000  0.857143
 0.285714  0.000000  0.714286  0.000000
 0.000000  0.285714  0.285714  0.428571
 0.142857  0.857143  0.000000  0.000000
 0.000000  0.000000  0.714286  0.285714
 0.000000  0.714286  0.285714  0.000000
 0.000000  0.428571  0.571429  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.142857  0.571429  0.142857  0.142857
 0.285714  0.142857  0.000000  0.571429
 0.142857  0.857143  0.000000  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  1.000000  0.000000  0.000000
 0.571429  0.428571  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.857143  0.142857  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.571429  0.000000  0.428571
 0.285714  0.428571  0.000000  0.285714
 0.000000  0.285714  0.714286  0.000000
 0.285714  0.714286  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[GA][TCG]C[GT][CG][GC]AC[TA]C[TA]C[AC]AC[AG][CT][CAT][GC][CA]
--------------------------------------------------------------------------------

Time  1.03 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  20  sites =   7  llr = 117  E-value = 1.9e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :7:9:36:aa:37611a1::
pos.-specific     C  ::7::6:1::313:19::14
probability       G  9:3:71:9:::6:1:::3:6
matrix            T  13:13:4:::7::37::69:

         bits    2.2
                 2.0         **      *
                 1.8         **      *
                 1.6         **     **
Relative         1.3 * **   ***     ** *
Entropy          1.1 *****  **** *  ** **
(24.1 bits)      0.9 ***** ***** * *** **
                 0.7 ***************** **
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GACAGCAGAATGAATCATTG
consensus             TG TAT   CACT   G C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
270379                       63  1.10e-11 GCAAACGCTT GACAGCAGAATGATTCATTG TTTGATGCAC
263554                       64  1.10e-11 GCAAACGCTT GACAGCAGAATGATTCATTG TTTGATGCAC
31637                        53  3.82e-09 GGAGGAGGAG GAGAGCAGAACAAACCATTC TTGAAACAGA
24284                       204  3.49e-08 AGAGAGAGGT GTCATGTGAATACATCAGTC CTACAACATC
bd174                       468  5.10e-08 ATTTTGACCG GACAGCACAATGAGAAATTC GTCCATAGCC
2301                         23  8.51e-08 GAAGCAAAAT TAGATATGAATGCATCAATG AACGAGTGCA
269091                       62  1.58e-07 CGACGCCTCA GTCTGATGAACCAATCAGCG TGGTTCCACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
270379                            1.1e-11  62_[+3]_418
263554                            1.1e-11  63_[+3]_417
31637                             3.8e-09  52_[+3]_428
24284                             3.5e-08  203_[+3]_277
bd174                             5.1e-08  467_[+3]_13
2301                              8.5e-08  22_[+3]_458
269091                            1.6e-07  61_[+3]_419
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=7
270379                   (   63) GACAGCAGAATGATTCATTG  1
263554                   (   64) GACAGCAGAATGATTCATTG  1
31637                    (   53) GAGAGCAGAACAAACCATTC  1
24284                    (  204) GTCATGTGAATACATCAGTC  1
bd174                    (  468) GACAGCACAATGAGAAATTC  1
2301                     (   23) TAGATATGAATGCATCAATG  1
269091                   (   62) GTCTGATGAACCAATCAGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3848 bayes= 9.70653 E= 1.9e-002
  -945   -945    168    -92
   152   -945   -945      8
  -945    175      9   -945
   178   -945   -945    -92
  -945   -945    141      8
    20    143    -91   -945
   120   -945   -945     66
  -945    -57    168   -945
   200   -945   -945   -945
   200   -945   -945   -945
  -945     43   -945    140
    20    -57    109   -945
   152     43   -945   -945
   120   -945    -91      8
   -80    -57   -945    140
   -80    201   -945   -945
   200   -945   -945   -945
   -80   -945      9    108
  -945    -57   -945    166
  -945    101    109   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 1.9e-002
 0.000000  0.000000  0.857143  0.142857
 0.714286  0.000000  0.000000  0.285714
 0.000000  0.714286  0.285714  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.000000  0.714286  0.285714
 0.285714  0.571429  0.142857  0.000000
 0.571429  0.000000  0.000000  0.428571
 0.000000  0.142857  0.857143  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.285714  0.000000  0.714286
 0.285714  0.142857  0.571429  0.000000
 0.714286  0.285714  0.000000  0.000000
 0.571429  0.000000  0.142857  0.285714
 0.142857  0.142857  0.000000  0.714286
 0.142857  0.857143  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.000000  0.285714  0.571429
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.428571  0.571429  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[AT][CG]A[GT][CA][AT]GAA[TC][GA][AC][AT]TCA[TG]T[GC]
--------------------------------------------------------------------------------

Time  1.57 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
2301                             1.63e-03  22_[+3(8.51e-08)]_458
24284                            1.31e-07  203_[+3(3.49e-08)]_181_\
                                           [+2(1.50e-07)]_13_[+3(8.87e-05)]_42
262257                           1.43e-05  113_[+2(5.74e-09)]_166_\
                                           [+1(9.50e-05)]_180
263554                           4.65e-22  63_[+3(1.10e-11)]_375_\
                                           [+2(2.86e-10)]_[+1(8.00e-13)]_1
269091                           3.09e-15  61_[+3(1.58e-07)]_51_[+1(3.30e-10)]_\
                                           325_[+2(6.29e-10)]_2
270379                           4.65e-22  62_[+3(1.10e-11)]_375_\
                                           [+2(2.86e-10)]_[+1(8.00e-13)]_2
31637                            4.07e-10  52_[+3(3.82e-09)]_73_[+2(5.74e-09)]_\
                                           334
bd174                            2.44e-14  326_[+1(2.29e-10)]_55_\
                                           [+2(2.47e-08)]_45_[+3(5.10e-08)]_13
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************