********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/365/365.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10732                    1.0000    500  174                      1.0000    500
1791                     1.0000    500  20828                    1.0000    500
21539                    1.0000    500  2240                     1.0000    500
23101                    1.0000    500  25875                    1.0000    500
262408                   1.0000    500  262956                   1.0000    500
264364                   1.0000    500  268279                   1.0000    500
268963                   1.0000    500  27220                    1.0000    500
2791                     1.0000    500  32176                    1.0000    500
34441                    1.0000    500  4086                     1.0000    500
4403                     1.0000    500  5080                     1.0000    500
5330                     1.0000    500  6248                     1.0000    500
6873                     1.0000    500  7268                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/365/365.seqs.fa -oc motifs/365 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 24 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 12000 N= 24 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.261 C 0.242 G 0.230 T 0.267 Background letter frequencies (from dataset with add-one prior applied): A 0.261 C 0.242 G 0.230 T 0.267 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 21 sites = 8 llr = 140 E-value = 8.2e-004 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 4:55:81:9::3954::41:4 pos.-specific C :::1:::::11::35::1::6 probability G 6a41a19a1895131a816a: matrix T ::13:1:::1:3::::343:: bits 2.1 * * * * * 1.9 * * * * * 1.7 * * * * * 1.5 * * *** * * * * Relative 1.3 * * *** * * ** * Entropy 1.1 ** * ***** * ** ** (25.2 bits) 0.8 ** ******* * ** *** 0.6 *** ******* * *** *** 0.4 *** ************* *** 0.2 ********************* 0.0 --------------------- Multilevel GGAAGAGGAGGGAACGGAGGC consensus A GT A CA TTT A sequence T G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- 
--------------------- 2791 101 1.38e-11 TGAATGCGGT GGAAGAGGAGGGAAAGGATGC CATTGTGCTT 10732 59 1.71e-10 AGGCTTCTTG GGGAGAGGATGGACCGGTGGC TGCCATGATT 268279 197 4.72e-10 GCGGCAGCGG AGGAGGGGAGGGAACGGCGGC ATGCCCCTCG 32176 268 6.68e-10 CCAAGCACAA GGAGGAGGGGGGAACGGAGGA AACAACAACA 6873 412 4.07e-09 TAGGATGTGA GGACGAGGAGGAAGAGTTGGA CCACGAAGGT 4403 63 5.03e-08 CTTTGAGGCG AGATGAAGACGAAGCGTTGGC TTTTGTTGCG 34441 332 5.89e-08 GCAGTCTGGA AGGAGTGGAGGTGCAGGGTGC AGCGTACACC 20828 297 7.19e-08 CCTTACCCTT GGTTGAGGAGCTAAGGGAAGA GTGAAGCAAC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 2791 1.4e-11 100_[+1]_379 10732 1.7e-10 58_[+1]_421 268279 4.7e-10 196_[+1]_283 32176 6.7e-10 267_[+1]_212 6873 4.1e-09 411_[+1]_68 4403 5e-08 62_[+1]_417 34441 5.9e-08 331_[+1]_148 20828 7.2e-08 296_[+1]_183 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=8 2791 ( 101) GGAAGAGGAGGGAAAGGATGC 1 10732 ( 59) GGGAGAGGATGGACCGGTGGC 1 268279 ( 197) AGGAGGGGAGGGAACGGCGGC 1 32176 ( 268) GGAGGAGGGGGGAACGGAGGA 1 6873 ( 412) GGACGAGGAGGAAGAGTTGGA 1 4403 ( 63) AGATGAAGACGAAGCGTTGGC 1 34441 ( 332) AGGAGTGGAGGTGCAGGGTGC 1 20828 ( 297) GGTTGAGGAGCTAAGGGAAGA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 
11520 bayes= 11.2282 E= 8.2e-004 52 -965 144 -965 -965 -965 212 -965 94 -965 71 -109 94 -95 -88 -10 -965 -965 212 -965 152 -965 -88 -109 -106 -965 193 -965 -965 -965 212 -965 174 -965 -88 -965 -965 -95 171 -109 -965 -95 193 -965 -6 -965 112 -10 174 -965 -88 -965 94 5 12 -965 52 105 -88 -965 -965 -965 212 -965 -965 -965 171 -10 52 -95 -88 49 -106 -965 144 -10 -965 -965 212 -965 52 137 -965 -965 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 8.2e-004 0.375000 0.000000 0.625000 0.000000 0.000000 0.000000 1.000000 0.000000 0.500000 0.000000 0.375000 0.125000 0.500000 0.125000 0.125000 0.250000 0.000000 0.000000 1.000000 0.000000 0.750000 0.000000 0.125000 0.125000 0.125000 0.000000 0.875000 0.000000 0.000000 0.000000 1.000000 0.000000 0.875000 0.000000 0.125000 0.000000 0.000000 0.125000 0.750000 0.125000 0.000000 0.125000 0.875000 0.000000 0.250000 0.000000 0.500000 0.250000 0.875000 0.000000 0.125000 0.000000 0.500000 0.250000 0.250000 0.000000 0.375000 0.500000 0.125000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.750000 0.250000 0.375000 0.125000 0.125000 0.375000 0.125000 0.000000 0.625000 0.250000 0.000000 0.000000 1.000000 0.000000 0.375000 0.625000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [GA]G[AG][AT]GAGGAGG[GAT]A[ACG][CA]G[GT][AT][GT]G[CA] -------------------------------------------------------------------------------- Time 5.38 secs. 
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 13 llr = 190 E-value = 1.9e-006 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 23:2252:2288:83:6915: pos.-specific C 82558237842:a:5a21929 probability G 1:2::132:4:2:12:2::2: matrix T :533:222:::::1:::::21 bits 2.1 * * 1.9 * * 1.7 * * * * 1.5 * * * ** * Relative 1.3 * * **** * ** * Entropy 1.1 * * * **** * ** * (21.0 bits) 0.8 * * ** **** * ** * 0.6 * * * ** ********* * 0.4 ***** ************ * 0.2 ****** ************** 0.0 --------------------- Multilevel CTCCCACCCCAACACCAACAC consensus ATTACG GC A C sequence C A A A -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 5080 470 3.59e-11 CTCCGTAAAC CTCTCAGCCGAACACCCACAC ACCATCATAC 262408 469 3.59e-11 CTCCGTAAAC CTCTCAGCCGAACACCCACAC ACCATCATAC 7268 154 1.34e-08 AGCGTTCGGC CACTCCAGCCAACACCAACGC CGCAACCACA 4403 451 1.86e-08 CGACTGCCGA CAGCCCCCACAACAACAACAC GATGCGTTTG 10732 397 2.29e-08 AGCGCCCCGG CCGTCAGCCGAGCACCAACTC AGACCACCTC 20828 469 5.56e-08 TTGTGTATCA CCTCAACCCCAACAACAAAAC ACAATCCCAG 6873 191 1.57e-07 TAATATCCGG CCCCATCCCCAACAGCGACTC TACTTGAGTC 23101 290 2.00e-07 ATGTAGCCTC CTTCCAACCGCACACCACCAT TTAGTTGCAA 27220 12 2.71e-07 ACTACAACAT CTTACAATCAAACGCCGACAC GGACGACGGC 1791 453 4.17e-07 GAGCGTTGAC GTTCCCTGCGAACAGCCACAC CTACAACCAA 32176 336 6.23e-07 ATGACGATTA CTCAAATTCAAGCAACAACCC 
ATATTTTAAC 6248 460 8.52e-07 GCACAAGCAC AACACTCCAACACACCAACCC ACTAAAACAC 2791 387 1.02e-06 TAAGCACCCC AACCCGGCCCCACTACAACGC CACCTTTTCT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 5080 3.6e-11 469_[+2]_10 262408 3.6e-11 468_[+2]_11 7268 1.3e-08 153_[+2]_326 4403 1.9e-08 450_[+2]_29 10732 2.3e-08 396_[+2]_83 20828 5.6e-08 468_[+2]_11 6873 1.6e-07 190_[+2]_289 23101 2e-07 289_[+2]_190 27220 2.7e-07 11_[+2]_468 1791 4.2e-07 452_[+2]_27 32176 6.2e-07 335_[+2]_144 6248 8.5e-07 459_[+2]_20 2791 1e-06 386_[+2]_93 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=13 5080 ( 470) CTCTCAGCCGAACACCCACAC 1 262408 ( 469) CTCTCAGCCGAACACCCACAC 1 7268 ( 154) CACTCCAGCCAACACCAACGC 1 4403 ( 451) CAGCCCCCACAACAACAACAC 1 10732 ( 397) CCGTCAGCCGAGCACCAACTC 1 20828 ( 469) CCTCAACCCCAACAACAAAAC 1 6873 ( 191) CCCCATCCCCAACAGCGACTC 1 23101 ( 290) CTTCCAACCGCACACCACCAT 1 27220 ( 12) CTTACAATCAAACGCCGACAC 1 1791 ( 453) GTTCCCTGCGAACAGCCACAC 1 32176 ( 336) CTCAAATTCAAGCAACAACCC 1 6248 ( 460) AACACTCCAACACACCAACCC 1 2791 ( 387) AACCCGGCCCCACTACAACGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 11520 bayes= 10.3208 E= 1.9e-006 -76 167 -158 -1035 24 -7 -1035 79 -1035 115 -58 
20 -18 93 -1035 20 -18 167 -1035 -1035 104 -7 -158 -80 -18 35 42 -80 -1035 152 -58 -80 -76 180 -1035 -1035 -18 67 74 -1035 156 -7 -1035 -1035 170 -1035 -58 -1035 -1035 205 -1035 -1035 170 -1035 -158 -179 24 115 -58 -1035 -1035 205 -1035 -1035 124 -7 -58 -1035 182 -165 -1035 -1035 -176 193 -1035 -1035 104 -65 -58 -80 -1035 193 -1035 -179 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 13 E= 1.9e-006 0.153846 0.769231 0.076923 0.000000 0.307692 0.230769 0.000000 0.461538 0.000000 0.538462 0.153846 0.307692 0.230769 0.461538 0.000000 0.307692 0.230769 0.769231 0.000000 0.000000 0.538462 0.230769 0.076923 0.153846 0.230769 0.307692 0.307692 0.153846 0.000000 0.692308 0.153846 0.153846 0.153846 0.846154 0.000000 0.000000 0.230769 0.384615 0.384615 0.000000 0.769231 0.230769 0.000000 0.000000 0.846154 0.000000 0.153846 0.000000 0.000000 1.000000 0.000000 0.000000 0.846154 0.000000 0.076923 0.076923 0.307692 0.538462 0.153846 0.000000 0.000000 1.000000 0.000000 0.000000 0.615385 0.230769 0.153846 0.000000 0.923077 0.076923 0.000000 0.000000 0.076923 0.923077 0.000000 0.000000 0.538462 0.153846 0.153846 0.153846 0.000000 0.923077 0.000000 0.076923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- C[TAC][CT][CTA][CA][AC][CGA]CC[CGA][AC]ACA[CA]C[AC]ACAC -------------------------------------------------------------------------------- Time 10.83 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 7 llr = 128 E-value = 2.0e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :::746::::::::a3:7:63 pos.-specific C :::31:3:3::::4::::9:: probability G 49:::13:11a3a6::::13: matrix T 61a:434a69:7:::7a3:17 bits 2.1 * * 1.9 * * * * * * 1.7 * * * * * * 1.5 ** * * * * * * Relative 1.3 ** * ** * * * * Entropy 1.1 **** * ********** * (26.4 bits) 0.8 **** * ********** * 0.6 **** * ************** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel TGTAAATTTTGTGGATTACAT consensus G CTTC C G C A T GA sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 32176 25 5.45e-11 TATGCTGTGT TGTATTGTTTGTGCATTACAT ATCATGGTAC 5080 5 9.33e-10 TAGT TGTCAACTTTGGGGATTACAA CAAAATGTGA 262408 4 9.33e-10 AGT TGTCAACTTTGGGGATTACAA CAAAATGTGA 1791 194 2.16e-09 GCACAGGAAA GTTACATTTTGTGCATTACAT CGCGTACATA 2240 267 3.96e-09 TACGAAGAGC GGTATTGTCTGTGGAATTCGT TCAAAATTTG 7268 374 6.69e-09 TATCAACCAA GGTAAGTTCTGTGCAATTCGT GATGTTAGTC 2791 266 1.68e-08 GAGACTGTTT TGTATATTGGGTGGATTAGTT TTTTAGATTG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
-------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 32176 5.5e-11 24_[+3]_455 5080 9.3e-10 4_[+3]_475 262408 9.3e-10 3_[+3]_476 1791 2.2e-09 193_[+3]_286 2240 4e-09 266_[+3]_213 7268 6.7e-09 373_[+3]_106 2791 1.7e-08 265_[+3]_214 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=7 32176 ( 25) TGTATTGTTTGTGCATTACAT 1 5080 ( 5) TGTCAACTTTGGGGATTACAA 1 262408 ( 4) TGTCAACTTTGGGGATTACAA 1 1791 ( 194) GTTACATTTTGTGCATTACAT 1 2240 ( 267) GGTATTGTCTGTGGAATTCGT 1 7268 ( 374) GGTAAGTTCTGTGCAATTCGT 1 2791 ( 266) TGTATATTGGGTGGATTAGTT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 11520 bayes= 10.5274 E= 2.0e+000 -945 -945 90 110 -945 -945 190 -90 -945 -945 -945 190 145 24 -945 -945 71 -76 -945 68 113 -945 -68 10 -945 24 31 68 -945 -945 -945 190 -945 24 -68 110 -945 -945 -68 168 -945 -945 212 -945 -945 -945 31 142 -945 -945 212 -945 -945 82 131 -945 194 -945 -945 -945 13 -945 -945 142 -945 -945 -945 190 145 -945 -945 10 -945 182 -68 -945 113 -945 31 -90 13 -945 -945 142 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 2.0e+000 0.000000 0.000000 0.428571 
0.571429 0.000000 0.000000 0.857143 0.142857 0.000000 0.000000 0.000000 1.000000 0.714286 0.285714 0.000000 0.000000 0.428571 0.142857 0.000000 0.428571 0.571429 0.000000 0.142857 0.285714 0.000000 0.285714 0.285714 0.428571 0.000000 0.000000 0.000000 1.000000 0.000000 0.285714 0.142857 0.571429 0.000000 0.000000 0.142857 0.857143 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.285714 0.714286 0.000000 0.000000 1.000000 0.000000 0.000000 0.428571 0.571429 0.000000 1.000000 0.000000 0.000000 0.000000 0.285714 0.000000 0.000000 0.714286 0.000000 0.000000 0.000000 1.000000 0.714286 0.000000 0.000000 0.285714 0.000000 0.857143 0.142857 0.000000 0.571429 0.000000 0.285714 0.142857 0.285714 0.000000 0.000000 0.714286 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [TG]GT[AC][AT][AT][TCG]T[TC]TG[TG]G[GC]A[TA]T[AT]C[AG][TA] -------------------------------------------------------------------------------- Time 16.06 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10732                            2.36e-10  58_[+1(1.71e-10)]_317_\
    [+2(2.29e-08)]_83
174                              3.44e-01  500
1791                             1.05e-08  193_[+3(2.16e-09)]_238_\
    [+2(4.17e-07)]_27
20828                            1.74e-07  296_[+1(7.19e-08)]_151_\
    [+2(5.56e-08)]_11
21539                            7.29e-02  270_[+1(5.87e-05)]_209
2240                             5.38e-05  266_[+3(3.96e-09)]_213
23101                            1.92e-03  289_[+2(2.00e-07)]_190
25875                            2.15e-01  500
262408                           3.34e-12  3_[+3(9.33e-10)]_444_[+2(3.59e-11)]_\
    11
262956                           4.01e-01  500
264364                           4.45e-01  500
268279                           1.10e-05  196_[+1(4.72e-10)]_64_\
    [+1(1.32e-05)]_198
268963                           7.30e-01  500
27220                            8.35e-04  11_[+2(2.71e-07)]_357_\
    [+2(8.60e-05)]_90
2791                             2.02e-14  100_[+1(1.38e-11)]_34_\
    [+1(7.12e-05)]_89_[+3(1.68e-08)]_100_[+2(1.02e-06)]_93
32176                            2.16e-15  24_[+3(5.45e-11)]_222_\
    [+1(6.68e-10)]_17_[+1(4.24e-05)]_9_[+2(6.23e-07)]_144
34441                            5.59e-04  331_[+1(5.89e-08)]_148
4086                             8.57e-01  500
4403                             2.67e-08  62_[+1(5.03e-08)]_367_\
    [+2(1.86e-08)]_29
5080                             3.34e-12  4_[+3(9.33e-10)]_444_[+2(3.59e-11)]_\
    10
5330                             5.25e-01  500
6248                             4.71e-03  459_[+2(8.52e-07)]_20
6873                             2.70e-08  190_[+2(1.57e-07)]_200_\
    [+1(4.07e-09)]_68
7268                             4.32e-09  153_[+2(1.34e-08)]_199_\
    [+3(6.69e-09)]_106
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************