********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/188/188.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
13444                    1.0000    500  13909                    1.0000    500
20791                    1.0000    500  20834                    1.0000    500
22994                    1.0000    500  23902                    1.0000    500
30400                    1.0000    500  30928                    1.0000    500
31894                    1.0000    500  38095                    1.0000    500
40218                    1.0000    500  4441                     1.0000    500
4858                     1.0000    500  4991                     1.0000    500
7245                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/188/188.seqs.fa -oc motifs/188 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.242 C 0.238 G 0.255 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.242 C 0.238 G 0.255 T 0.265
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 14 llr = 141 E-value = 5.8e-003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  211:1::4:5::
pos.-specific     C  5179:aa:a239
probability       G  :4:1:::::311
matrix            T  342:9::6::6:

         bits    2.1      ** *
                 1.9      ** *
                 1.7    * ** *
                 1.4    * ** *  *
Relative         1.2    **** *  *
Entropy          1.0   *******  *
(14.6 bits)      0.8   ******* **
                 0.6 * **********
                 0.4 * **********
                 0.2 ************
                 0.0 ------------

Multilevel           CGCCTCCTCATC
consensus            TTT    A GC
sequence             A        C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
38095                       227  3.01e-07 ATCACCACTA CGCCTCCTCGTC GAAACCATCA
20834                       481  5.04e-07 AGCTGAGAGG CGCCTCCTCACC AGGCGACG
20791                        39  6.48e-07 TCTTCACTCT CGCCTCCACGTC CTTCTGATAT
4441                         70  1.68e-06 TGTATGATCC CTCCTCCTCGCC AATACTGCTA
30928                        60  1.68e-06 CAGATACGTC TGCCTCCTCGTC ATGGTGCAAT
4991                         59  3.87e-06 ATCCCGTTCC ATCCTCCACACC AATCTTGATA
23902                       236  6.40e-06 ACAGCACTGA CCTCTCCACATC AAAAAGAGAG
31894                       308  9.87e-06 CCGTTGAATC AGCCACCTCATC GTTCGCGACG
40218                       486  1.75e-05 TCTACGTAAC CTACTCCACACC GTC
13909                        51  2.24e-05 CTCCGAGGCC ATCGTCCTCATC CACGTTGTCG
4858                        396  3.56e-05 CTCATTCTGC CATCACCACATC ATAAACTCGT
7245                        459  3.78e-05 ACAGCTCCCA TCCCTCCTCCTG ACAACTGACA
13444                       449  4.84e-05 CCTTCCAACT TACCTCCTCCGC CGTGCTGTCT
22994                       472  5.71e-05 GATCGGAGAC TTTCTCCACCTG CCTTCCACAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38095                               3e-07  226_[+1]_262
20834                               5e-07  480_[+1]_8
20791                             6.5e-07  38_[+1]_450
4441                              1.7e-06  69_[+1]_419
30928                             1.7e-06  59_[+1]_429
4991                              3.9e-06  58_[+1]_430
23902                             6.4e-06  235_[+1]_253
31894                             9.9e-06  307_[+1]_181
40218                             1.8e-05  485_[+1]_3
13909                             2.2e-05  50_[+1]_438
4858                              3.6e-05  395_[+1]_93
7245                              3.8e-05  458_[+1]_30
13444                             4.8e-05  448_[+1]_40
22994                             5.7e-05  471_[+1]_17
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=14
38095                    (  227) CGCCTCCTCGTC  1
20834                    (  481) CGCCTCCTCACC  1
20791                    (   39) CGCCTCCACGTC  1
4441                     (   70) CTCCTCCTCGCC  1
30928                    (   60) TGCCTCCTCGTC  1
4991                     (   59) ATCCTCCACACC  1
23902                    (  236) CCTCTCCACATC  1
31894                    (  308) AGCCACCTCATC  1
40218                    (  486) CTACTCCACACC  1
13909                    (   51) ATCGTCCTCATC  1
4858                     (  396) CATCACCACATC  1
7245                     (  459) TCCCTCCTCCTG  1
13444                    (  449) TACCTCCTCCGC  1
22994                    (  472) TTTCTCCACCTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 9.63714 E= 5.8e-003
   -17   107 -1045    11
   -76   -74    49    43
  -176   158 -1045   -31
 -1045   196  -183 -1045
   -76 -1045 -1045   169
 -1045   207 -1045 -1045
 -1045   207 -1045 -1045
    83 -1045 -1045   111
 -1045   207 -1045 -1045
   105   -15    17 -1045
 -1045    26  -183   128
 -1045   185   -83 -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 5.8e-003
 0.214286  0.500000  0.000000  0.285714
 0.142857  0.142857  0.357143  0.357143
 0.071429  0.714286  0.000000  0.214286
 0.000000  0.928571  0.071429  0.000000
 0.142857  0.000000  0.000000  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.214286  0.285714  0.000000
 0.000000  0.285714  0.071429  0.642857
 0.000000  0.857143  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[CTA][GT][CT]CTCC[TA]C[AGC][TC]C
--------------------------------------------------------------------------------

Time 1.89 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 15 llr = 142 E-value = 3.1e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3:922461:921
pos.-specific     C  21:4::39::8:
probability       G  :9:3731:a::9
matrix            T  5:1113:::1::

         bits    2.1         *
                 1.9         *
                 1.7   *     *
                 1.4  **    *** *
Relative         1.2  **    *****
Entropy          1.0  ** *  *****
(13.6 bits)      0.8  ** * ******
                 0.6 *** * ******
                 0.4 *** ********
                 0.2 ************
                 0.0 ------------

Multilevel           TGACGAACGACG
consensus            A  GATC   A
sequence             C  A G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
20791                       110  4.30e-07 CGCCGCGTAA TGAGGTACGACG CCGCTGCGGC
38095                         1  6.43e-07          . AGACGTACGACG AACGTCGAGA
30400                       485  6.43e-07 CTCTCCCACC AGACGTACGACG AACG
30928                       256  3.07e-06 TTTGTGGTGG TGACGAACGACA GGGAGCATAC
13444                       483  3.07e-06 TCCCTTCCAC TGATGACCGACG ACCACT
4441                        489  7.44e-06 GACCAGCAAG CGACAGACGACG
4991                        194  1.13e-05 ATCTGGATGA TGATGTACGAAG TGATGACGAC
31894                       485  1.65e-05 ATAGTGGCGT TGAAGTGCGACG TGGA
22994                        61  3.21e-05 GTAGGATTCG TGAAAACCGAAG AGATGGATCC
20834                       280  3.43e-05 CGGACGACGA TGAAGAAAGAAG CAGCGTTGCT
40218                       277  4.00e-05 CGCCAATGGA TCAGGAACGTCG TATCGACTTA
13909                       215  6.07e-05 GCATCTTCTT CGTCGGCCGACG CACTGTCAGA
23902                       254  6.94e-05 CATCAAAAAG AGAGAACCGTCG ACGATCTTCT
7245                        258  8.87e-05 ACTGGACGCT AGAGGGAAGACA CCGAGCGGCG
4858                         64  1.54e-04 GGGCAATTTG CCACTGCCGACG GAGGAGGTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20791                             4.3e-07  109_[+2]_379
38095                             6.4e-07  [+2]_488
30400                             6.4e-07  484_[+2]_4
30928                             3.1e-06  255_[+2]_233
13444                             3.1e-06  482_[+2]_6
4441                              7.4e-06  488_[+2]
4991                              1.1e-05  193_[+2]_295
31894                             1.6e-05  484_[+2]_4
22994                             3.2e-05  60_[+2]_428
20834                             3.4e-05  279_[+2]_209
40218                               4e-05  276_[+2]_212
13909                             6.1e-05  214_[+2]_274
23902                             6.9e-05  253_[+2]_235
7245                              8.9e-05  257_[+2]_231
4858                              0.00015  63_[+2]_425
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=15
20791                    (  110) TGAGGTACGACG  1
38095                    (    1) AGACGTACGACG  1
30400                    (  485) AGACGTACGACG  1
30928                    (  256) TGACGAACGACA  1
13444                    (  483) TGATGACCGACG  1
4441                     (  489) CGACAGACGACG  1
4991                     (  194) TGATGTACGAAG  1
31894                    (  485) TGAAGTGCGACG  1
22994                    (   61) TGAAAACCGAAG  1
20834                    (  280) TGAAGAAAGAAG  1
40218                    (  277) TCAGGAACGTCG  1
13909                    (  215) CGTCGGCCGACG  1
23902                    (  254) AGAGAACCGTCG  1
7245                     (  258) AGAGGGAAGACA  1
4858                     (   64) CCACTGCCGACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 9.60607 E= 3.1e-001
    14   -25 -1055   101
 -1055   -84   177 -1055
   195 -1055 -1055  -199
   -27    75     7   -99
   -27 -1055   153  -199
    73 -1055     7    33
   131    48  -193 -1055
   -86   186 -1055 -1055
 -1055 -1055   197 -1055
   184 -1055 -1055   -99
   -27   175 -1055 -1055
   -86 -1055   177 -1055
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 15 E= 3.1e-001
 0.266667  0.200000  0.000000  0.533333
 0.000000  0.133333  0.866667  0.000000
 0.933333  0.000000  0.000000  0.066667
 0.200000  0.400000  0.266667  0.133333
 0.200000  0.000000  0.733333  0.066667
 0.400000  0.000000  0.266667  0.333333
 0.600000  0.333333  0.066667  0.000000
 0.133333  0.866667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.866667  0.000000  0.000000  0.133333
 0.200000  0.800000  0.000000  0.000000
 0.133333  0.000000  0.866667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[TAC]GA[CGA][GA][ATG][AC]CGA[CA]G
--------------------------------------------------------------------------------

Time 3.97 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 15 llr = 158 E-value = 3.4e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  25315::69:362153
pos.-specific     C  8:291591:71373:7
probability       G  :31::1131:71:55:
matrix            T  :24:54:::3::12::

         bits    2.1
                 1.9
                 1.7    *  *
                 1.4    *  * *
Relative         1.2 *  *  * **     *
Entropy          1.0 *  *  * **    **
(15.2 bits)      0.8 *  * ******** **
                 0.6 ** ********** **
                 0.4 ** ********** **
                 0.2 ** *************
                 0.0 ----------------

Multilevel           CATCACCAACGACGGC
consensus            AGA TT G TACACAA
sequence              TC          T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
31894                       153  3.48e-08 GAGGAGGATG CGTCTCCGACGACGGC GACATCATGG
30928                        86  4.19e-07 TGCAATCCAT CGTCTCCAACAACCGA ACAGTCCATC
40218                        47  7.08e-07 TCTTGCCTCA CATCACCAATGAAGAA GTGACATCCT
38095                        33  9.03e-07 CGGTTGACCT AAACATCAACGCCCAC TTCTTCCTCC
7245                        482  2.73e-06 CAACTGACAA CAACTCCAATCACCGC AGC
22994                       164  2.73e-06 ATGGACGTTC CATCTTGAACGACTAC GTCTTTGTAA
13444                       173  4.84e-06 GGCGTACTCG CAGATCCAACGACTGC GACATACCAT
20834                       352  5.30e-06 AGGGAGGCTT CACCACCGGCAACGGA GGGTCCAAAA
13909                       275  5.30e-06 TCTACAGTGC AGACACCAACGCTGGC GTCGTCGTAA
20791                       443  7.51e-06 CGGTGCCATT CTTCTTCCACGCCGAA ACACAGACCC
4858                        452  8.17e-06 AAAGAACGTT CACCACCAATACCAAC TCATCACCTC
23902                        75  1.22e-05 ACATACATGC CTGCATCAACACACGC CTCGTGTCTC
4991                        478  1.76e-05 AGTTGAGGAC ATCCTGCGACGACGAC AACCATT
4441                         12  2.03e-05 CTCTTTTTCG CATCATCGACGGATAA TTCCTCCTCG
30400                       180  1.08e-04 CGTTGATGGA CGACCTCGGTGATGGC GCGATGGGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31894                             3.5e-08  152_[+3]_332
30928                             4.2e-07  85_[+3]_399
40218                             7.1e-07  46_[+3]_438
38095                               9e-07  32_[+3]_452
7245                              2.7e-06  481_[+3]_3
22994                             2.7e-06  163_[+3]_321
13444                             4.8e-06  172_[+3]_312
20834                             5.3e-06  351_[+3]_133
13909                             5.3e-06  274_[+3]_210
20791                             7.5e-06  442_[+3]_42
4858                              8.2e-06  451_[+3]_33
23902                             1.2e-05  74_[+3]_410
4991                              1.8e-05  477_[+3]_7
4441                                2e-05  11_[+3]_473
30400                             0.00011  179_[+3]_305
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=15
31894                    (  153) CGTCTCCGACGACGGC  1
30928                    (   86) CGTCTCCAACAACCGA  1
40218                    (   47) CATCACCAATGAAGAA  1
38095                    (   33) AAACATCAACGCCCAC  1
7245                     (  482) CAACTCCAATCACCGC  1
22994                    (  164) CATCTTGAACGACTAC  1
13444                    (  173) CAGATCCAACGACTGC  1
20834                    (  352) CACCACCGGCAACGGA  1
13909                    (  275) AGACACCAACGCTGGC  1
20791                    (  443) CTTCTTCCACGCCGAA  1
4858                     (  452) CACCACCAATACCAAC  1
23902                    (   75) CTGCATCAACACACGC  1
4991                     (  478) ATCCTGCGACGACGAC  1
4441                     (   12) CATCATCGACGGATAA  1
30400                    (  180) CGACCTCGGTGATGGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 8.91886 E= 3.4e-001
   -27   175 -1055 -1055
   114 -1055     7   -41
    14   -25   -93    59
  -186   197 -1055 -1055
    95  -184 -1055    81
 -1055   116  -193    59
 -1055   197  -193 -1055
   131  -184    39 -1055
   184 -1055   -93 -1055
 -1055   162 -1055     1
    14  -184   139 -1055
   131    48  -193 -1055
   -27   148 -1055   -99
  -186    16    87   -41
    95 -1055   107 -1055
    46   148 -1055 -1055
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 15 E= 3.4e-001
 0.200000  0.800000  0.000000  0.000000
 0.533333  0.000000  0.266667  0.200000
 0.266667  0.200000  0.133333  0.400000
 0.066667  0.933333  0.000000  0.000000
 0.466667  0.066667  0.000000  0.466667
 0.000000  0.533333  0.066667  0.400000
 0.000000  0.933333  0.066667  0.000000
 0.600000  0.066667  0.333333  0.000000
 0.866667  0.000000  0.133333  0.000000
 0.000000  0.733333  0.000000  0.266667
 0.266667  0.066667  0.666667  0.000000
 0.600000  0.333333  0.066667  0.000000
 0.200000  0.666667  0.000000  0.133333
 0.066667  0.266667  0.466667  0.200000
 0.466667  0.000000  0.533333  0.000000
 0.333333  0.666667  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[CA][AGT][TAC]C[AT][CT]C[AG]A[CT][GA][AC][CA][GCT][GA][CA]
--------------------------------------------------------------------------------

Time 5.83 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
13444                            1.24e-05  172_[+3(4.84e-06)]_260_\
    [+1(4.84e-05)]_22_[+2(3.07e-06)]_6
13909                            9.27e-05  50_[+1(2.24e-05)]_152_\
    [+2(6.07e-05)]_48_[+3(5.30e-06)]_210
20791                            6.50e-08  38_[+1(6.48e-07)]_59_[+2(4.30e-07)]_\
    321_[+3(7.51e-06)]_42
20834                            1.98e-06  279_[+2(3.43e-05)]_60_\
    [+3(5.30e-06)]_113_[+1(5.04e-07)]_8
22994                            6.75e-05  60_[+2(3.21e-05)]_91_[+3(2.73e-06)]_\
    292_[+1(5.71e-05)]_17
23902                            7.21e-05  74_[+3(1.22e-05)]_145_\
    [+1(6.40e-06)]_6_[+2(6.94e-05)]_235
30400                            7.82e-04  484_[+2(6.43e-07)]_4
30928                            6.70e-08  59_[+1(1.68e-06)]_14_[+3(4.19e-07)]_\
    154_[+2(3.07e-06)]_233
31894                            1.60e-07  152_[+3(3.48e-08)]_139_\
    [+1(9.87e-06)]_165_[+2(1.65e-05)]_4
38095                            6.66e-09  [+2(6.43e-07)]_20_[+3(9.03e-07)]_94_\
    [+1(4.35e-05)]_72_[+1(3.01e-07)]_262
40218                            8.90e-06  46_[+3(7.08e-07)]_214_\
    [+2(4.00e-05)]_197_[+1(1.75e-05)]_3
4441                             4.93e-06  11_[+3(2.03e-05)]_42_[+1(1.68e-06)]_\
    407_[+2(7.44e-06)]
4858                             4.33e-04  395_[+1(3.56e-05)]_44_\
    [+3(8.17e-06)]_33
4991                             1.32e-05  58_[+1(3.87e-06)]_123_\
    [+2(1.13e-05)]_272_[+3(1.76e-05)]_7
7245                             1.13e-04  257_[+2(8.87e-05)]_189_\
    [+1(3.78e-05)]_11_[+3(2.73e-06)]_3
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************