********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/175/175.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31408                    1.0000    500  31409                    1.0000    500
42613                    1.0000    500  43169                    1.0000    500
9258                     1.0000    500  46395                    1.0000    500
14529                    1.0000    500  29157                    1.0000    500
22122                    1.0000    500  54082                    1.0000    500
32708                    1.0000    500  42282                    1.0000    500
49957                    1.0000    500  23639                    1.0000    500
23717                    1.0000    500  23850                    1.0000    500
43831                    1.0000    500  25956                    1.0000    500
10518                    1.0000    500  18893                    1.0000    500
34120                    1.0000    500  44651                    1.0000    500
44437                    1.0000    500  46917                    1.0000    500
41038                    1.0000    500  34270                    1.0000    500
41030                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/175/175.seqs.fa -oc motifs/175 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       27    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           13500    N=              27
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.254 C 0.248 G 0.234 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.254 C 0.248 G 0.234 T 0.263
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  20  llr = 204  E-value = 3.7e-006
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::a:611:a42
pos.-specific     C  4:6:a2::6::2
probability       G  ::3:::a13162
matrix            T  7a2::3:92::5

         bits    2.1     *
                 1.9  * ** *
                 1.7  * ** *  *
                 1.5  * ** *  *
Relative         1.3  * ** ** *
Entropy          1.0 ** ** ** **
(14.7 bits)      0.8 ** ** ** **
                 0.6 ***********
                 0.4 ***********
                 0.2 ************
                 0.0 ------------

Multilevel           TTCACAGTCAGT
consensus            C G  T  G AA
sequence               T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             -----  ---------            ------------
23850                       484  6.55e-08 CGTATATACT TTCACAGTCAGT AAGGA
34120                       107  1.98e-07 CGGTACCTCC TTCACAGTCAAT TTATGAATCT
31409                       242  1.98e-07 GAGCGACAGG CTCACAGTCAGT AGAGAAGACC
49957                       397  1.61e-06 TCGTGAGTGC TTGACAGTGAGT GCTGCTTCTA
23639                       309  1.93e-06 ATACACCTTG TTCACAGTCAAG CTCTGACTCT
18893                       379  2.32e-06 AGGATTCTTT CTCACAGTCAGC TCGTTCGAGG
41038                       449  4.47e-06 CTCGAATTGC TTGACTGTGAGT CGTTTGTTCA
43831                        90  4.47e-06 TGAACTGGGG TTTACAGTCAGG AAACACTCTT
44651                       179  5.61e-06 TTACTCCGAG TTTACCGTCAGT AGATTTCACG
31408                       168  7.81e-06 CAGTCACTCA CTCACTGTCAAA CAAACAACGC
22122                       461  8.63e-06 AACTTTGTAC CTGACAGTTAGT TCACTCTACT
43169                       415  9.39e-06 GCATACGAGA CTCACTGTCAAG GCTGTGTGGA
42282                        24  1.76e-05 GACAACGACT CTCACTGTTAGA TCAAAAACAA
14529                       266  2.49e-05 CAACATCATC TTGACCGTGAGC CCAGCTACCG
46395                       346  2.49e-05 AAACACAAAT TTTACAGAGAGT AAGGTAGTGA
54082                       423  2.97e-05 CGGAGAGGTC TTGACTGTCGGT GGCGCTCTCT
25956                       103  3.55e-05 CTTTACGGTA TTCACAATCAAA TCCAGCGTGA
34270                       292  3.95e-05 AGACATTCCC CTCACAGGCAAA TATTGGAGCG
32708                       328  3.95e-05 ACTGGTTTCC TTCACCGAGAAT CTTTTGCTTA
42613                        31  4.43e-05 AAGAGGCTGC TTTACTGTTAAC TATGAGAGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23850                             6.5e-08  483_[+1]_5
34120                               2e-07  106_[+1]_382
31409                               2e-07  241_[+1]_247
49957                             1.6e-06  396_[+1]_92
23639                             1.9e-06  308_[+1]_180
18893                             2.3e-06  378_[+1]_110
41038                             4.5e-06  448_[+1]_40
43831                             4.5e-06  89_[+1]_399
44651                             5.6e-06  178_[+1]_310
31408                             7.8e-06  167_[+1]_321
22122                             8.6e-06  460_[+1]_28
43169                             9.4e-06  414_[+1]_74
42282                             1.8e-05  23_[+1]_465
14529                             2.5e-05  265_[+1]_223
46395                             2.5e-05  345_[+1]_143
54082                               3e-05  422_[+1]_66
25956                             3.5e-05  102_[+1]_386
34270                             3.9e-05  291_[+1]_197
32708                             3.9e-05  327_[+1]_161
42613                             4.4e-05  30_[+1]_458
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=20
23850 (  484) TTCACAGTCAGT  1
34120 (  107) TTCACAGTCAAT  1
31409 (  242) CTCACAGTCAGT  1
49957 (  397) TTGACAGTGAGT  1
23639 (  309) TTCACAGTCAAG  1
18893 (  379) CTCACAGTCAGC  1
41038 (  449) TTGACTGTGAGT  1
43831 (   90) TTTACAGTCAGG  1
44651 (  179) TTTACCGTCAGT  1
31408 (  168) CTCACTGTCAAA  1
22122 (  461) CTGACAGTTAGT  1
43169 (  415) CTCACTGTCAAG  1
42282 (   24) CTCACTGTTAGA  1
14529 (  266) TTGACCGTGAGC  1
46395 (  346) TTTACAGAGAGT  1
54082 (  423) TTGACTGTCGGT  1
25956 (  103) TTCACAATCAAA  1
34270 (  292) CTCACAGGCAAA  1
32708 (  328) TTCACCGAGAAT  1
42613 (   31) TTTACTGTTAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 13203 bayes= 10.3089 E= 3.7e-006
 -1097     50  -1097    130
 -1097  -1097  -1097    192
 -1097    115      9    -40
   197  -1097  -1097  -1097
 -1097    201  -1097  -1097
   111    -72  -1097     19
  -234  -1097    202  -1097
  -135  -1097   -223    169
 -1097    128      9    -81
   190  -1097   -223  -1097
    65  -1097    136  -1097
   -35    -72    -64     92
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 20 E= 3.7e-006
 0.000000  0.350000  0.000000  0.650000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.550000  0.250000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.550000  0.150000  0.000000  0.300000
 0.050000  0.000000  0.950000  0.000000
 0.100000  0.000000  0.050000  0.850000
 0.000000  0.600000  0.250000  0.150000
 0.950000  0.000000  0.050000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.200000  0.150000  0.150000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TC]T[CGT]AC[AT]GT[CG]A[GA][TA]
--------------------------------------------------------------------------------




Time  6.28 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   4  llr = 83  E-value = 4.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :5:::5:::::::::3:::3
pos.-specific     C  a:a:a33:a:5a:88::a38
probability       G  :3:a::5::a::8338::8:
matrix            T  :3:::33a::5:3:::a:::

         bits    2.1 * ***   ** *     *
                 1.9 * ***  *** *    **
                 1.7 * ***  *** *    **
                 1.5 * ***  *** *    **
Relative         1.3 * ***  *** *********
Entropy          1.0 * ***  *************
(30.0 bits)      0.8 * ***  *************
                 0.6 * *** **************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CACGCAGTCGCCGCCGTCGC
consensus             G   CC   T TGGA  CA
sequence              T   TT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             -----  ---------            --------------------
23850                       215  6.06e-11 GCGACATTCT CACGCACTCGTCGGCGTCGC CCACAGGCCA
31409                       380  2.19e-10 GAAGTCGAAA CTCGCTGTCGTCGCCATCGC GTTGTTTGTT
42282                       211  2.52e-10 CCCTAGTCCT CACGCCGTCGCCTCGGTCGC ACAGTCGACG
14529                       471  5.95e-10 TGTCCCTCGT CGCGCATTCGCCGCCGTCCA GCTCCATTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23850                             6.1e-11  214_[+2]_266
31409                             2.2e-10  379_[+2]_101
42282                             2.5e-10  210_[+2]_270
14529                               6e-10  470_[+2]_10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=4
23850 (  215) CACGCACTCGTCGGCGTCGC  1
31409 (  380) CTCGCTGTCGTCGCCATCGC  1
42282 (  211) CACGCCGTCGCCTCGGTCGC  1
14529 (  471) CGCGCATTCGCCGCCGTCCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 12987 bayes= 11.6643 E= 4.2e+002
  -865    201   -865   -865
    97   -865      9     -7
  -865    201   -865   -865
  -865   -865    209   -865
  -865    201   -865   -865
    97      1   -865     -7
  -865      1    109     -7
  -865   -865   -865    192
  -865    201   -865   -865
  -865   -865    209   -865
  -865    101   -865     92
  -865    201   -865   -865
  -865   -865    168     -7
  -865    159      9   -865
  -865    159      9   -865
    -2   -865    168   -865
  -865   -865   -865    192
  -865    201   -865   -865
  -865      1    168   -865
    -2    159   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 4.2e+002
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.250000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.250000  0.000000  0.250000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.750000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[AGT]CGC[ACT][GCT]TCG[CT]C[GT][CG][CG][GA]TC[GC][CA]
--------------------------------------------------------------------------------




Time 12.58 secs.
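
Note (not part of the MEME output): the Motif 2 regular expression above
can be applied directly with an ordinary regex engine. The following is a
minimal Python sketch, assuming the regular expression and the four site
strings are copied by hand from the Motif 2 sections; the helper names
matching_sites and find_matches are hypothetical and are not provided by
MEME or MAST. It checks which reported sites the expression matches and
scans a longer sequence for candidate occurrences on the + strand only.

```python
import re

# Copied from "Motif 2 regular expression" and "Motif 2 in BLOCKS format".
MOTIF2_REGEX = "C[AGT]CGC[ACT][GCT]TCG[CT]C[GT][CG][CG][GA]TC[GC][CA]"
MOTIF2_SITES = {
    "23850": "CACGCACTCGTCGGCGTCGC",
    "31409": "CTCGCTGTCGTCGCCATCGC",
    "42282": "CACGCCGTCGCCTCGGTCGC",
    "14529": "CGCGCATTCGCCGCCGTCCA",
}


def matching_sites(regex, sites):
    """Return the names of the sites that the expression matches in full."""
    pattern = re.compile(regex)
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


def find_matches(regex, sequence):
    """Return 1-based start positions of all matches, overlaps included.

    The lookahead wrapper lets the scan report overlapping occurrences,
    which a plain finditer over the expression would skip.
    """
    pattern = re.compile("(?=(" + regex + "))")
    return [m.start() + 1 for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    print(matching_sites(MOTIF2_REGEX, MOTIF2_SITES))
    # Site from sequence 23850 with the flanking bases shown in the report.
    flanked = "GCGACATTCT" + MOTIF2_SITES["23850"] + "CCACAGGCCA"
    print(find_matches(MOTIF2_REGEX, flanked))
```

A regular-expression hit is a much cruder test than the position-specific
scoring matrix, and with an E-value of 4.2e+002 this motif is not
statistically significant, so any hits should be read as candidates only.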

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   7  llr = 123  E-value = 2.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :34:6661::::3:4::3:::
pos.-specific     C  6:3:::::6:4:::16::11:
probability       G  373a41:7:a:9:a:4:7:97
matrix            T  1::::3414:617:4:a:9:3

         bits    2.1    *     *   *
                 1.9    *     *   *  *
                 1.7    *     *   *  *
                 1.5    *     * * *  *  *
Relative         1.3  * *     * * *  *****
Entropy          1.0  * ** * ****** ******
(25.3 bits)      0.8  * ** ******** ******
                 0.6 ** *********** ******
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CGAGAAAGCGTGTGACTGTGG
consensus            GAC GTT T C A TG A  T
sequence               G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             -----  ---------            ---------------------
23717                        91  6.74e-11 TGGTGGTATA CGAGAATGCGCGAGACTGTGG ATAAAGGACC
22122                       185  1.20e-09 TACTGGTTTC CGCGAAAACGTGTGTCTGTGT ATGAGATTTG
34270                        88  2.77e-09 ACGAATCTCA GGGGGTATCGTGTGACTGTGG AATACATCCG
23850                       301  6.70e-09 CCAAGATACC CACGGGAGTGTGTGTGTGTGT TTAGCTAGGT
42282                       295  1.06e-08 AACAACGTCG GGAGGATGTGCTTGTGTATGG TTGCATGCCT
23639                       447  1.14e-08 ATTCCTGATT CAAGAAAGCGCGAGACTATCG TGACGGCAAA
41030                        61  2.75e-08 CGCAACGGTT TGGGATTGTGTGTGCGTGCGG TGCCTGGCGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23717                             6.7e-11  90_[+3]_389
22122                             1.2e-09  184_[+3]_295
34270                             2.8e-09  87_[+3]_392
23850                             6.7e-09  300_[+3]_179
42282                             1.1e-08  294_[+3]_185
23639                             1.1e-08  446_[+3]_33
41030                             2.7e-08  60_[+3]_419
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=7
23717 (   91) CGAGAATGCGCGAGACTGTGG  1
22122 (  185) CGCGAAAACGTGTGTCTGTGT  1
34270 (   88) GGGGGTATCGTGTGACTGTGG  1
23850 (  301) CACGGGAGTGTGTGTGTGTGT  1
42282 (  295) GGAGGATGTGCTTGTGTATGG  1
23639 (  447) CAAGAAAGCGCGAGACTATCG  1
41030 (   61) TGGGATTGTGTGTGCGTGCGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 12960 bayes= 11.4596 E= 2.2e+002
  -945    120     28    -88
    17   -945    161   -945
    75     20     28   -945
  -945   -945    209   -945
   117   -945     87   -945
   117   -945    -71     12
   117   -945   -945     70
   -83   -945    161    -88
  -945    120   -945     70
  -945   -945    209   -945
  -945     79   -945    112
  -945   -945    187    -88
    17   -945   -945    144
  -945   -945    209   -945
    75    -79   -945     70
  -945    120     87   -945
  -945   -945   -945    192
    17   -945    161   -945
  -945    -79   -945    170
  -945    -79    187   -945
  -945   -945    161     12
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 2.2e+002
 0.000000  0.571429  0.285714  0.142857
 0.285714  0.000000  0.714286  0.000000
 0.428571  0.285714  0.285714  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.571429  0.000000  0.428571  0.000000
 0.571429  0.000000  0.142857  0.285714
 0.571429  0.000000  0.000000  0.428571
 0.142857  0.000000  0.714286  0.142857
 0.000000  0.571429  0.000000  0.428571
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.428571  0.000000  0.571429
 0.000000  0.000000  0.857143  0.142857
 0.285714  0.000000  0.000000  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.142857  0.000000  0.428571
 0.000000  0.571429  0.428571  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.285714  0.000000  0.714286  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.000000  0.714286  0.285714
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][GA][ACG]G[AG][AT][AT]G[CT]G[TC]G[TA]G[AT][CG]T[GA]TG[GT]
--------------------------------------------------------------------------------




Time 19.01 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31408                            7.36e-03  167_[+1(7.81e-06)]_155_\
    [+1(2.33e-05)]_154
31409                            1.09e-09  241_[+1(1.98e-07)]_126_\
    [+2(2.19e-10)]_9_[+1(1.61e-06)]_80
42613                            2.39e-01  30_[+1(4.43e-05)]_458
43169                            8.30e-02  414_[+1(9.39e-06)]_74
9258                             1.42e-01  206_[+2(8.78e-05)]_274
46395                            3.88e-02  345_[+1(2.49e-05)]_143
14529                            4.46e-07  265_[+1(2.49e-05)]_193_\
    [+2(5.95e-10)]_10
29157                            2.89e-01  500
22122                            2.01e-07  184_[+3(1.20e-09)]_255_\
    [+1(8.63e-06)]_28
54082                            4.84e-02  422_[+1(2.97e-05)]_66
32708                            7.11e-02  327_[+1(3.95e-05)]_161
42282                            3.04e-12  23_[+1(1.76e-05)]_175_\
    [+2(2.52e-10)]_64_[+3(1.06e-08)]_66_[+3(4.36e-05)]_98
49957                            1.98e-04  84_[+3(3.85e-05)]_291_\
    [+1(1.61e-06)]_92
23639                            9.89e-07  174_[+1(2.62e-05)]_122_\
    [+1(1.93e-06)]_126_[+3(1.14e-08)]_33
23717                            1.35e-06  90_[+3(6.74e-11)]_389
23850                            2.57e-15  214_[+2(6.06e-11)]_39_\
    [+2(9.87e-05)]_7_[+3(6.70e-09)]_162_[+1(6.55e-08)]_5
43831                            3.02e-02  89_[+1(4.47e-06)]_399
25956                            4.88e-02  102_[+1(3.55e-05)]_386
10518                            1.57e-01  500
18893                            1.94e-03  378_[+1(2.32e-06)]_32_\
    [+1(4.47e-06)]_66
34120                            4.38e-04  106_[+1(1.98e-07)]_382
44651                            1.09e-02  178_[+1(5.61e-06)]_155_\
    [+1(3.18e-05)]_143
44437                            6.94e-01  500
46917                            4.37e-01  500
41038                            8.82e-03  448_[+1(4.47e-06)]_40
34270                            3.99e-07  87_[+3(2.77e-09)]_183_\
    [+1(3.95e-05)]_197
41030                            3.97e-04  60_[+3(2.75e-08)]_419
--------------------------------------------------------------------------------
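
Note (not part of the MEME output): each MOTIF DIAGRAM string above
alternates spacer lengths with motif occurrences of the form
[strand motif(p-value)], joined by underscores, with a trailing backslash
marking a continued line. The following is a minimal Python sketch of how
such a string could be decoded into 1-based start positions, assuming the
motif widths 12, 20 and 21 from the MOTIF header lines of this file; the
function name parse_diagram is hypothetical and is not part of MEME.

```python
import re

# Motif widths taken from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 12, 2: 20, 3: 21}

# A token is either a motif occurrence such as [+1(1.98e-07)] or a spacer.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Decode a combined block diagram.

    Returns (sites, total_length), where each site is a tuple
    (motif number, strand, 1-based start, p-value).
    """
    # Join continuation lines: drop each backslash and the whitespace after it.
    diagram = re.sub(r"\\\s*", "", diagram.strip())
    position = 0
    sites = []
    for token in diagram.split("_"):
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError("unrecognised diagram token: " + repr(token))
        if match.group(4) is not None:
            # Spacer: advance by the given number of bases.
            position += int(match.group(4))
        else:
            motif = int(match.group(2))
            sites.append((motif, match.group(1), position + 1,
                          float(match.group(3))))
            position += widths[motif]
    return sites, position


if __name__ == "__main__":
    # Diagram for sequence 31409, copied from the table above.
    text = "241_[+1(1.98e-07)]_126_\\\n    [+2(2.19e-10)]_9_[+1(1.61e-06)]_80"
    sites, total = parse_diagram(text)
    for site in sites:
        print(site)
    print("sequence length:", total)
```

The spacers and motif widths should add up to the sequence length listed
in the TRAINING SET section (500 for every sequence here), which gives a
simple consistency check on a decoded diagram.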

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************