********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/88/88.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11943                    1.0000    500  12128                    1.0000    500
15537                    1.0000    500  25447                    1.0000    500
261219                   1.0000    500  262854                   1.0000    500
269281                   1.0000    500  31142                    1.0000    500
35059                    1.0000    500  4341                     1.0000    500
5551                     1.0000    500  5722                     1.0000    500
6434                     1.0000    500  7202                     1.0000    500
8515                     1.0000    500  8693                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/88/88.seqs.fa -oc motifs/88 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 16 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8000 N= 16 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.276 C 0.227 G 0.239 T 0.258 Background letter frequencies (from dataset with add-one prior applied): A 0.276 C 0.227 G 0.240 T 0.258 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 15 sites = 14 llr = 155 E-value = 3.6e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 246a19a:194198: pos.-specific C 161:91:85144:26 probability G 612::::14:151:4 matrix T ::1::::1:111::: bits 2.1 1.9 * * 1.7 * * 1.5 ** * * Relative 1.3 **** * Entropy 1.1 ***** * *** (16.0 bits) 0.9 * ***** * *** 0.6 ** ******* *** 0.4 ********** **** 0.2 *************** 0.0 --------------- Multilevel GCAACAACCAAGAAC consensus AAG G CC CG sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 8515 252 4.54e-08 GTGGCAATAA GCAACAACGAGGAAC CAAGACGTGA 25447 195 9.11e-08 AACCCAACGA GCAACCACCACCAAC CAACCACTGG 
4341 285 1.79e-07 GGGACGTGTG GCGACAACGAAGAAG GGGATAAATA 35059 2 3.91e-07 T GCAACAACCATGACG CCCAAAACAC 7202 343 1.17e-06 TCCTGCGTCA GAAACCACCACCACC ACCACACCCT 11943 305 1.71e-06 CCTTCTCAGA ACAACAACGTCGAAC AACTCAGTGC 31142 397 2.50e-06 AAAACCCAAG GCGACAACGAAAAAG AAAAAGAAGG 269281 140 3.56e-06 CTAGTGGCAT ACAACAATCATGAAC TATTTCGGAC 8693 442 5.86e-06 CAAGCCGCCT CGGACAACGACCAAC GACAGCGCTA 15537 218 9.92e-06 TGTACCTAAC AAAAAAACAAACAAC CCAAACCGGA 5551 473 1.31e-05 CTTCAATTTG CAAAAAAGCAAGAAC ACGGTTCAAC 262854 395 1.40e-05 AGTCTCTTGT GCAACAAGCAGCGAG CACCAAAAAG 5722 268 1.50e-05 CAGATTGGCC GACACAACCCAGACC AGCACTTGAG 6434 458 2.20e-05 GATATTCACT GATACAACAACTAAG CCAACTGCGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 8515 4.5e-08 251_[+1]_234 25447 9.1e-08 194_[+1]_291 4341 1.8e-07 284_[+1]_201 35059 3.9e-07 1_[+1]_484 7202 1.2e-06 342_[+1]_143 11943 1.7e-06 304_[+1]_181 31142 2.5e-06 396_[+1]_89 269281 3.6e-06 139_[+1]_346 8693 5.9e-06 441_[+1]_44 15537 9.9e-06 217_[+1]_268 5551 1.3e-05 472_[+1]_13 262854 1.4e-05 394_[+1]_91 5722 1.5e-05 267_[+1]_218 6434 2.2e-05 457_[+1]_28 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=15 seqs=14 8515 ( 252) GCAACAACGAGGAAC 1 25447 ( 195) GCAACCACCACCAAC 1 4341 ( 285) GCGACAACGAAGAAG 1 35059 ( 2) GCAACAACCATGACG 1 7202 ( 343) GAAACCACCACCACC 1 11943 ( 305) ACAACAACGTCGAAC 1 31142 ( 397) GCGACAACGAAAAAG 1 269281 ( 140) ACAACAATCATGAAC 1 8693 ( 442) CGGACAACGACCAAC 1 15537 ( 218) AAAAAAACAAACAAC 1 5551 ( 
473) CAAAAAAGCAAGAAC 1 262854 ( 395) GCAACAAGCAGCGAG 1 5722 ( 268) GACACAACCCAGACC 1 6434 ( 458) GATACAACAACTAAG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 15 n= 7776 bayes= 9.72147 E= 3.6e-002 -36 -66 142 -1045 37 133 -174 -1045 122 -166 -16 -185 186 -1045 -1045 -1045 -95 192 -1045 -1045 164 -66 -1045 -1045 186 -1045 -1045 -1045 -1045 179 -74 -185 -95 114 58 -1045 164 -166 -1045 -185 37 66 -74 -85 -195 66 106 -185 175 -1045 -174 -1045 151 -8 -1045 -1045 -1045 150 58 -1045 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 15 nsites= 14 E= 3.6e-002 0.214286 0.142857 0.642857 0.000000 0.357143 0.571429 0.071429 0.000000 0.642857 0.071429 0.214286 0.071429 1.000000 0.000000 0.000000 0.000000 0.142857 0.857143 0.000000 0.000000 0.857143 0.142857 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.785714 0.142857 0.071429 0.142857 0.500000 0.357143 0.000000 0.857143 0.071429 0.000000 0.071429 0.357143 0.357143 0.142857 0.142857 0.071429 0.357143 0.500000 0.071429 0.928571 0.000000 0.071429 0.000000 0.785714 0.214286 0.000000 0.000000 0.000000 0.642857 0.357143 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [GA][CA][AG]ACAAC[CG]A[AC][GC]A[AC][CG] 
-------------------------------------------------------------------------------- Time 2.10 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 9 llr = 134 E-value = 2.7e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :41:131:11:6467176:99 pos.-specific C a:28761a:38333291:a1: probability G :1:1:1::72::1:::1:::: matrix T :4712:8:2321111:14::1 bits 2.1 * * * 1.9 * * * 1.7 * * * * 1.5 * * * * Relative 1.3 * * * * *** Entropy 1.1 * * ** * * *** (21.5 bits) 0.9 * *** *** * * **** 0.6 * ******* ** *** **** 0.4 ********* ** ******** 0.2 ********************* 0.0 --------------------- Multilevel CATCCCTCGCCAAAACAACAA consensus TC TA TTTCCCC T sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 25447 117 6.25e-11 GCCCCATCCT CGTCCCTCGTCACAACAACAA ACGGTCTGGT 11943 235 1.24e-08 CAACTGTCAC CACCTCTCGACCAACCAACAA TGACCACATC 15537 471 2.70e-08 ATTCTGTCGT CATCCATCGGCTACAAAACAA CAGCAACTG 35059 275 6.54e-08 ATACCAACTG CACCCATCTCTAACACCACAA ACAGGTGACC 8693 363 7.11e-08 CTTCGCTCAA CTTCCGACGCCACACCGTCAA CAACGCGTGC 261219 478 7.72e-08 CCATCGGCCA CAACACCCGGCACAACATCAA CC 7202 469 1.24e-07 CATCGCACCG CTTGCATCGTCAACTCATCCA AACACACCAC 5551 292 1.82e-07 CCGCTTCATT CTTCCCTCTCTCGTACTACAA AGAAACAGCT 6434 156 7.22e-07 CGTTTGCCAA CTTTTCTCATCCTAACATCAT TTCTGGTAGG 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 25447 6.3e-11 116_[+2]_363 11943 1.2e-08 234_[+2]_245 15537 2.7e-08 470_[+2]_9 35059 6.5e-08 274_[+2]_205 8693 7.1e-08 362_[+2]_117 261219 7.7e-08 477_[+2]_2 7202 1.2e-07 468_[+2]_11 5551 1.8e-07 291_[+2]_188 6434 7.2e-07 155_[+2]_324 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=9 25447 ( 117) CGTCCCTCGTCACAACAACAA 1 11943 ( 235) CACCTCTCGACCAACCAACAA 1 15537 ( 471) CATCCATCGGCTACAAAACAA 1 35059 ( 275) CACCCATCTCTAACACCACAA 1 8693 ( 363) CTTCCGACGCCACACCGTCAA 1 261219 ( 478) CAACACCCGGCACAACATCAA 1 7202 ( 469) CTTGCATCGTCAACTCATCCA 1 5551 ( 292) CTTCCCTCTCTCGTACTACAA 1 6434 ( 156) CTTTTCTCATCCTAACATCAT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 9.86973 E= 2.7e+001 -982 214 -982 -982 69 -982 -111 78 -131 -3 -982 137 -982 178 -111 -121 -131 156 -982 -22 27 129 -111 -982 -131 -103 -982 159 -982 214 -982 -982 -131 -982 148 -22 -131 56 -11 37 -982 178 -982 -22 101 56 -982 -121 69 56 -111 -121 101 56 -982 -121 127 -3 -982 -121 -131 197 -982 -982 127 -103 -111 -121 101 -982 -982 78 -982 214 -982 -982 169 -103 -982 -982 169 -982 -982 -121 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 2.7e+001 0.000000 1.000000 0.000000 0.000000 0.444444 0.000000 0.111111 0.444444 0.111111 0.222222 0.000000 0.666667 0.000000 0.777778 0.111111 0.111111 0.111111 0.666667 0.000000 0.222222 0.333333 0.555556 0.111111 0.000000 0.111111 0.111111 0.000000 0.777778 0.000000 1.000000 0.000000 0.000000 0.111111 0.000000 0.666667 0.222222 0.111111 0.333333 0.222222 0.333333 0.000000 0.777778 0.000000 0.222222 0.555556 0.333333 0.000000 0.111111 0.444444 0.333333 0.111111 0.111111 0.555556 0.333333 0.000000 0.111111 0.666667 0.222222 0.000000 0.111111 0.111111 0.888889 0.000000 0.000000 0.666667 0.111111 0.111111 0.111111 0.555556 0.000000 0.000000 0.444444 0.000000 1.000000 0.000000 0.000000 0.888889 0.111111 0.000000 0.000000 0.888889 0.000000 0.000000 0.111111 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- C[AT][TC]C[CT][CA]TC[GT][CTG][CT][AC][AC][AC][AC]CA[AT]CAA -------------------------------------------------------------------------------- Time 4.23 secs. 
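The Motif 2 regular expression above can be reproduced from the Motif 2 letter-probability matrix. The following Python sketch is a hedged illustration, not MEME's own code: it assumes that a letter is listed at a position when its probability is at least 0.2 and that letters are ordered by decreasing probability, with ties kept in ACGT order. That rule is inferred from the three motifs in this file only. The names `MOTIF2_ROWS`, `THRESHOLD` and `matrix_to_regex` are hypothetical.

```python
# Motif 2 letter-probability rows copied from this file (columns: A C G T).
MOTIF2_ROWS = """\
0.000000 1.000000 0.000000 0.000000
0.444444 0.000000 0.111111 0.444444
0.111111 0.222222 0.000000 0.666667
0.000000 0.777778 0.111111 0.111111
0.111111 0.666667 0.000000 0.222222
0.333333 0.555556 0.111111 0.000000
0.111111 0.111111 0.000000 0.777778
0.000000 1.000000 0.000000 0.000000
0.111111 0.000000 0.666667 0.222222
0.111111 0.333333 0.222222 0.333333
0.000000 0.777778 0.000000 0.222222
0.555556 0.333333 0.000000 0.111111
0.444444 0.333333 0.111111 0.111111
0.555556 0.333333 0.000000 0.111111
0.666667 0.222222 0.000000 0.111111
0.111111 0.888889 0.000000 0.000000
0.666667 0.111111 0.111111 0.111111
0.555556 0.000000 0.000000 0.444444
0.000000 1.000000 0.000000 0.000000
0.888889 0.111111 0.000000 0.000000
0.888889 0.000000 0.000000 0.111111
"""

ALPHABET = "ACGT"
THRESHOLD = 0.2  # assumption inferred from this output, not taken from MEME


def matrix_to_regex(rows_text, threshold=THRESHOLD):
    """Build one regex element per matrix row from the letters >= threshold."""
    parts = []
    for line in rows_text.strip().splitlines():
        probs = [float(x) for x in line.split()]
        kept = [(p, letter) for p, letter in zip(probs, ALPHABET) if p >= threshold]
        # Stable sort: equal probabilities stay in ACGT order.
        kept.sort(key=lambda pair: -pair[0])
        letters = "".join(letter for _, letter in kept)
        parts.append(letters if len(letters) == 1 else "[" + letters + "]")
    return "".join(parts)


print(matrix_to_regex(MOTIF2_ROWS))
```

Because the expression keeps only letters at or above the threshold, it is a summary of the matrix, and a site reported for the motif is not guaranteed to match it.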
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 19 sites = 6 llr = 102 E-value = 1.8e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :7:::5:2::3::2323:: pos.-specific C 73a:a:58382728285aa probability G 3::3:5:::25:::2:::: matrix T :::7::5:7::38:3:2:: bits 2.1 * * ** 1.9 * * ** 1.7 * * ** 1.5 * * * * * * ** Relative 1.3 * * * * * ** * ** Entropy 1.1 ***** **** *** * ** (24.4 bits) 0.9 ********** *** * ** 0.6 ************** **** 0.4 ************** **** 0.2 ************** **** 0.0 ------------------- Multilevel CACTCACCTCGCTCACCCC consensus GC G GT C AT T A sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------- 7202 314 2.04e-10 TGTAAAGTTA CCCTCACCCCGCTCTCCCC TCCTGCGTCA 262854 346 1.50e-09 CTGCGCTTTC CACTCGCCTGACTCACCCC TCGTGCTTTG 5722 206 5.13e-09 TCATCATGTT CACTCACCTCGCTACCACC GATGATCCTC 261219 303 1.47e-08 ACACGTCAAC GCCTCGTATCGTTCACCCC CGATCAATCT 11943 358 2.44e-08 TGATGCATCA GACGCATCTCACTCTAACC AATTGAAATA 25447 452 8.80e-08 GGTTCCTCTC CACGCGTCCCCTCCGCTCC GAAGTATCTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- 
---------------- ------------- 7202 2e-10 313_[+3]_168 262854 1.5e-09 345_[+3]_136 5722 5.1e-09 205_[+3]_276 261219 1.5e-08 302_[+3]_179 11943 2.4e-08 357_[+3]_124 25447 8.8e-08 451_[+3]_30 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=19 seqs=6 7202 ( 314) CCCTCACCCCGCTCTCCCC 1 262854 ( 346) CACTCGCCTGACTCACCCC 1 5722 ( 206) CACTCACCTCGCTACCACC 1 261219 ( 303) GCCTCGTATCGTTCACCCC 1 11943 ( 358) GACGCATCTCACTCTAACC 1 25447 ( 452) CACGCGTCCCCTCCGCTCC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 19 n= 7712 bayes= 10.7746 E= 1.8e+001 -923 156 48 -923 127 56 -923 -923 -923 214 -923 -923 -923 -923 48 137 -923 214 -923 -923 86 -923 106 -923 -923 114 -923 95 -72 188 -923 -923 -923 56 -923 137 -923 188 -52 -923 27 -44 106 -923 -923 156 -923 37 -923 -44 -923 169 -72 188 -923 -923 27 -44 -52 37 -72 188 -923 -923 27 114 -923 -63 -923 214 -923 -923 -923 214 -923 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 19 nsites= 6 E= 1.8e+001 0.000000 0.666667 0.333333 0.000000 0.666667 0.333333 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.333333 0.666667 0.000000 1.000000 0.000000 0.000000 0.500000 0.000000 0.500000 0.000000 0.000000 0.500000 0.000000 0.500000 0.166667 
0.833333 0.000000 0.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.833333  0.166667  0.000000
 0.333333  0.166667  0.500000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.833333  0.000000  0.000000
 0.333333  0.166667  0.166667  0.333333
 0.166667  0.833333  0.000000  0.000000
 0.333333  0.500000  0.000000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][AC]C[TG]C[AG][CT]C[TC]C[GA][CT]TC[AT]C[CA]CC
--------------------------------------------------------------------------------

Time 6.37 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11943                            2.87e-11  234_[+2(1.24e-08)]_49_\
    [+1(1.71e-06)]_38_[+3(2.44e-08)]_124
12128                            9.79e-01  500
15537                            7.13e-06  148_[+1(9.47e-05)]_54_\
    [+1(9.92e-06)]_238_[+2(2.70e-08)]_9
25447                            4.16e-14  116_[+2(6.25e-11)]_57_\
    [+1(9.11e-08)]_242_[+3(8.80e-08)]_30
261219                           7.83e-09  302_[+3(1.47e-08)]_156_\
    [+2(7.72e-08)]_2
262854                           4.67e-07  345_[+3(1.50e-09)]_30_\
    [+1(1.40e-05)]_91
269281                           2.46e-02  139_[+1(3.56e-06)]_346
31142                            4.43e-03  396_[+1(2.50e-06)]_89
35059                            3.44e-07  1_[+1(3.91e-07)]_258_[+2(6.54e-08)]_\
    205
4341                             3.93e-04  284_[+1(1.79e-07)]_201
5551                             2.34e-05  291_[+2(1.82e-07)]_160_\
    [+1(1.31e-05)]_13
5722                             2.78e-06  205_[+3(5.13e-09)]_43_\
    [+1(1.50e-05)]_218
6434                             1.99e-04  155_[+2(7.22e-07)]_281_\
    [+1(2.20e-05)]_28
7202                             1.97e-12  313_[+3(2.04e-10)]_10_\
    [+1(1.17e-06)]_111_[+2(1.24e-07)]_11
8515                             8.30e-04  251_[+1(4.54e-08)]_234
8693                             1.06e-05  362_[+2(7.11e-08)]_58_\
    [+1(5.86e-06)]_44
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
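The motif diagrams in this file encode site positions as runs of spacer lengths and motif occurrences, with a trailing backslash marking a diagram continued on the next line. The Python sketch below is one possible way to turn a diagram back into 1-based start positions; it is an illustration under stated assumptions, not part of MEME. The motif widths (15, 21 and 19) are taken from the MOTIF headers in this file, while the names `MOTIF_WIDTHS` and `diagram_to_sites` are hypothetical.

```python
import re

# Widths from the MOTIF 1, MOTIF 2 and MOTIF 3 headers in this file.
MOTIF_WIDTHS = {1: 15, 2: 21, 3: 19}

# One motif occurrence, e.g. "[+2(1.24e-08)]" or "[+1]".
SITE_TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def diagram_to_sites(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for one motif diagram string.

    Each site is (motif number, strand, 1-based start, p-value or None).
    """
    # Join continuation lines: drop the backslash and any whitespace.
    diagram = re.sub(r"[\\\s]", "", diagram)
    position = 0  # number of sequence letters consumed so far
    sites = []
    for token in diagram.split("_"):
        match = SITE_TOKEN.fullmatch(token)
        if match:
            strand, motif, pvalue = match.group(1), int(match.group(2)), match.group(3)
            sites.append((motif, strand, position + 1,
                          float(pvalue) if pvalue else None))
            position += widths[motif]
        elif token:
            position += int(token)  # spacer between sites
    return sites, position


# Diagram for sequence 11943, copied from the combined block diagrams.
sites, total = diagram_to_sites(
    r"234_[+2(1.24e-08)]_49_\ [+1(1.71e-06)]_38_[+3(2.44e-08)]_124")
for motif, strand, start, pvalue in sites:
    print(motif, strand, start, pvalue)
print("sequence length:", total)
```

For sequence 11943 the recovered starts (235, 305 and 358) agree with the Start column of the per-motif site tables, and the spacers plus motif widths add up to the training-set length of 500.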