********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/127/127.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11970                    1.0000    500  13556                    1.0000    500
15123                    1.0000    500  21229                    1.0000    500
21268                    1.0000    500  22859                    1.0000    500
23450                    1.0000    500  24035                    1.0000    500
25833                    1.0000    500  261448                   1.0000    500
26192                    1.0000    500  262528                   1.0000    500
268862                   1.0000    500  269504                   1.0000    500
269863                   1.0000    500  5393                     1.0000    500
5496                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/127/127.seqs.fa -oc motifs/127 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.251 C 0.239 G 0.243 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.251 C 0.239 G 0.243 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 7 llr = 120 E-value = 2.6e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  74:::4:311:36::6::1::
pos.-specific     C  :4:993:7:994:3a3:a176
probability       G  :1a::1::7::1:::11::33
matrix            T  3::111a:1:1147::9:7:1

         bits    2.1   *           *  *
                 1.9   *   *       *  *
                 1.7   *   *       *  *
                 1.4   *** *  **   *  *
Relative         1.2   *** ** **   * ** *
Entropy          1.0 * *** ** ** *** ** *
(24.7 bits)      0.8 * *** ***** *** ****
                 0.6 ***** ***** *********
                 0.4 ***** ***** *********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AAGCCATCGCCCATCATCTCC
consensus            TC   C A   ATC C   GG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
22859                       409  6.46e-11 GATCCTCTCT AAGCCATCTCCCTTCATCTCC ACCTCGGCAG
25833                        30  1.79e-10 CTGGTGATCT ACGCCATCACCCATCATCTGC GGTGAAGACT
11970                       329  2.53e-09 TGCTCATGCG ACGCCTTCGCCTTTCAGCTCC CTCAGATTCG
13556                       385  4.89e-09 CGTAAACGCC ACGTCCTCGCCAACCATCACC GAGCCATCTC
269863                      179  2.50e-08 CAACCTACGA TAGCTCTCGCCAATCGTCTGG TGCCCTTGCT
21268                       457  5.57e-08 AGGCCGTGCA TAGCCATAGCTGATCCTCCCG ACGGGGGCAT
15123                       445  8.12e-08 AATGGCACAA AGGCCGTAGACCTCCCTCTCT ATCTCGTAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22859                             6.5e-11  408_[+1]_71
25833                             1.8e-10  29_[+1]_450
11970                             2.5e-09  328_[+1]_151
13556                             4.9e-09  384_[+1]_95
269863                            2.5e-08  178_[+1]_301
21268                             5.6e-08  456_[+1]_23
15123                             8.1e-08  444_[+1]_35
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=7
22859                    ( 409) AAGCCATCTCCCTTCATCTCC  1
25833                    (  30) ACGCCATCACCCATCATCTGC  1
11970                    ( 329) ACGCCTTCGCCTTTCAGCTCC  1
13556                    ( 385) ACGTCCTCGCCAACCATCACC  1
269863                   ( 179) TAGCTCTCGCCAATCGTCTGG  1
21268                    ( 457) TAGCCATAGCTGATCCTCCCG  1
15123                    ( 445) AGGCCGTAGACCTCCCTCTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 10.7919 E= 2.6e+000
   151   -945   -945     10
    77     84    -76   -945
  -945   -945    204   -945
  -945    184   -945    -90
  -945    184   -945    -90
    77     26    -76    -90
  -945   -945   -945    190
    18    158   -945   -945
   -81   -945    156    -90
   -81    184   -945   -945
  -945    184   -945    -90
    18     84    -76    -90
   118   -945   -945     68
  -945     26   -945    142
  -945    207   -945   -945
   118     26    -76   -945
  -945   -945    -76    168
  -945    207   -945   -945
   -81    -74   -945    142
  -945    158     24   -945
  -945    126     24    -90
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 2.6e+000
 0.714286  0.000000  0.000000  0.285714
 0.428571  0.428571  0.142857  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.857143  0.000000  0.142857
 0.428571  0.285714  0.142857  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.285714  0.714286  0.000000  0.000000
 0.142857  0.000000  0.714286  0.142857
 0.142857  0.857143  0.000000  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.285714  0.428571  0.142857  0.142857
 0.571429  0.000000  0.000000  0.428571
 0.000000  0.285714  0.000000  0.714286
 0.000000  1.000000  0.000000  0.000000
 0.571429  0.285714  0.142857  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.142857  0.000000  0.714286
 0.000000  0.714286  0.285714  0.000000
 0.000000  0.571429  0.285714  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[AT][AC]GCC[AC]T[CA]GCC[CA][AT][TC]C[AC]TCT[CG][CG]
--------------------------------------------------------------------------------

Time 3.21 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 17 llr = 169 E-value = 3.2e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1421732246:111:8
pos.-specific     C  1:::21:2::1:1:2:
probability       G  964913742177:78:
matrix            T  ::4:1411432292:2

         bits    2.1
                 1.9
                 1.7    *
                 1.4 *  *          *
Relative         1.2 *  *        * **
Entropy          1.0 ** *        * **
(14.3 bits)      0.8 ** ** *  *******
                 0.6 ** ** *  *******
                 0.4 ***** * ********
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGGGATGGAAGGTGGA
consensus             AT  A ATT   T T
sequence                  G C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
25833                       196  4.72e-08 TTGTTCGCAT GGGGAAGGAAGTTGGA CACAAACTCT
21229                        20  4.72e-08 ACGCCGTGGA GATGAAGCAAGGTGGA CTGCAATAGG
269863                      108  2.87e-07 CTCAGAAGCG GGTGATGGATGATGGA CGGTACACGC
23450                       308  1.77e-06 GTATTTGCAA GAGGGAGGGAGGTGGA TTGTAGGTAT
15123                       300  1.77e-06 AGGGTCGGCC GGGGAGTAAAGGTGCA TAAAGATATG
5393                        304  3.04e-06 TTGCCGAGCA GAGGCTGGTAGGTTGT GGTAGTGGTG
26192                        57  5.02e-06 TGCACACCTT GGTGATAATATGTTGA GGTTATCTCG
11970                       459  5.53e-06 ATGATAAAGT GGTGATATTTGGTGGT ACGGGTGGTG
269504                      411  9.48e-06 CAAATAGAAA GATGAGGAAATGTAGA GACAAATTTG
268862                       45  1.04e-05 GATCGGCAGA GGGAAAGATTGGTGGT TGCGGGGCTG
21268                       131  1.13e-05 CTCAGATTTA GGAGACGCATGATGGA TGCTGATGTT
24035                        25  1.71e-05 GAGGCAGTGC GGGGCGAGGAGGAGGA CGGTTCGGAT
13556                        32  2.18e-05 ATGATGAAGA AGGGAGGGTGGGTTGA TGATGAATTG
261448                      220  4.27e-05 AGCAATCTTT GGTGAGTCAATTTGCA GCTCTACAAT
262528                       48  6.90e-05 CAGGCTTTGT GAAGCTGGGACGTTGT TCAATGCAGT
5496                        208  1.21e-04 TGTCGTCACT GGTGTAGTTACGCGGA CTGTTCCCGT
22859                       373  1.21e-04 GCACCTCTGG CAAGATGCTTGTTGCA GTAAATAACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25833                             4.7e-08  195_[+2]_289
21229                             4.7e-08  19_[+2]_465
269863                            2.9e-07  107_[+2]_377
23450                             1.8e-06  307_[+2]_177
15123                             1.8e-06  299_[+2]_185
5393                                3e-06  303_[+2]_181
26192                               5e-06  56_[+2]_428
11970                             5.5e-06  458_[+2]_26
269504                            9.5e-06  410_[+2]_74
268862                              1e-05  44_[+2]_440
21268                             1.1e-05  130_[+2]_354
24035                             1.7e-05  24_[+2]_460
13556                             2.2e-05  31_[+2]_453
261448                            4.3e-05  219_[+2]_265
262528                            6.9e-05  47_[+2]_437
5496                              0.00012  207_[+2]_277
22859                             0.00012  372_[+2]_112
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=17
25833                    ( 196) GGGGAAGGAAGTTGGA  1
21229                    (  20) GATGAAGCAAGGTGGA  1
269863                   ( 108) GGTGATGGATGATGGA  1
23450                    ( 308) GAGGGAGGGAGGTGGA  1
15123                    ( 300) GGGGAGTAAAGGTGCA  1
5393                     ( 304) GAGGCTGGTAGGTTGT  1
26192                    (  57) GGTGATAATATGTTGA  1
11970                    ( 459) GGTGATATTTGGTGGT  1
269504                   ( 411) GATGAGGAAATGTAGA  1
268862                   (  45) GGGAAAGATTGGTGGT  1
21268                    ( 131) GGAGACGCATGATGGA  1
24035                    (  25) GGGGCGAGGAGGAGGA  1
13556                    (  32) AGGGAGGGTGGGTTGA  1
261448                   ( 220) GGTGAGTCAATTTGCA  1
262528                   (  48) GAAGCTGGGACGTTGT  1
5496                     ( 208) GGTGTAGTTACGCGGA  1
22859                    ( 373) CAAGATGCTTGTTGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8245 bayes= 8.98854 E= 3.2e+001
  -209   -202    186  -1073
    49  -1073    141  -1073
   -51  -1073     76     62
  -209  -1073    196  -1073
   149    -43   -204   -218
    23   -202     28     40
   -51  -1073    154   -118
   -10     -2     76   -118
    71  -1073    -46     62
   136  -1073   -204     14
 -1073   -102    154    -60
  -109  -1073    154    -60
  -209   -202  -1073    172
  -209  -1073    154    -18
 -1073    -43    176  -1073
   160  -1073  -1073    -18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 17 E= 3.2e+001
 0.058824  0.058824  0.882353  0.000000
 0.352941  0.000000  0.647059  0.000000
 0.176471  0.000000  0.411765  0.411765
 0.058824  0.000000  0.941176  0.000000
 0.705882  0.176471  0.058824  0.058824
 0.294118  0.058824  0.294118  0.352941
 0.176471  0.000000  0.705882  0.117647
 0.235294  0.235294  0.411765  0.117647
 0.411765  0.000000  0.176471  0.411765
 0.647059  0.000000  0.058824  0.294118
 0.000000  0.117647  0.705882  0.176471
 0.117647  0.000000  0.705882  0.176471
 0.058824  0.058824  0.000000  0.882353
 0.058824  0.000000  0.705882  0.235294
 0.000000  0.176471  0.823529  0.000000
 0.764706  0.000000  0.000000  0.235294
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
G[GA][GT]GA[TAG]G[GAC][AT][AT]GGT[GT]G[AT]
--------------------------------------------------------------------------------

Time 6.62 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 19 sites = 5 llr = 90 E-value = 4.8e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  8::8:448:4:2a:::82:
pos.-specific     C  2aa:a222a428:66a26a
probability       G  :::2:24:::8::::::2:
matrix            T  :::::2:::2:::44::::

         bits    2.1  ** *   *   *  *  *
                 1.9  ** *   *   *  *  *
                 1.7  ** *   *   *  *  *
                 1.4  ** *   *   *  *  *
Relative         1.2 *****  ** ***  ** *
Entropy          1.0 *****  ** ******* *
(26.0 bits)      0.8 *****  ** ******* *
                 0.6 *****  ** *********
                 0.4 ***** *************
                 0.2 ***** *************
                 0.0 -------------------

Multilevel           ACCACAAACAGCACCCACC
consensus            C  G CGC CCA TT CA
sequence                  GC  T       G
                          T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
21268                       407  5.99e-11 ATCAACAACA ACCACTAACCGCACCCACC CCACAGTCCG
22859                       300  1.81e-09 TAGCAAAGGT CCCACAGACAGCACCCCCC GCTAGTGCCG
269504                      108  3.12e-09 ATGAATTGAC ACCACACACTGCATCCAAC AAACTGCCAA
13556                       191  7.27e-09 TGCAGTCTGC ACCGCGGACAGAACTCACC TCCCTCTCTG
25833                         2  2.16e-08          T ACCACCACCCCCATTCAGC TGGTGATCTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21268                               6e-11  406_[+3]_75
22859                             1.8e-09  299_[+3]_182
269504                            3.1e-09  107_[+3]_374
13556                             7.3e-09  190_[+3]_291
25833                             2.2e-08  1_[+3]_480
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=5
21268                    ( 407) ACCACTAACCGCACCCACC  1
22859                    ( 300) CCCACAGACAGCACCCCCC  1
269504                   ( 108) ACCACACACTGCATCCAAC  1
13556                    ( 191) ACCGCGGACAGAACTCACC  1
25833                    (   2) ACCACCACCCCCATTCAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 8194 bayes= 10.9292 E= 4.8e+001
   167    -25   -897   -897
  -897    207   -897   -897
  -897    207   -897   -897
   167   -897    -28   -897
  -897    207   -897   -897
    67    -25    -28    -42
    67    -25     72   -897
   167    -25   -897   -897
  -897    207   -897   -897
    67     74   -897    -42
  -897    -25    172   -897
   -33    174   -897   -897
   199   -897   -897   -897
  -897    133   -897     58
  -897    133   -897     58
  -897    207   -897   -897
   167    -25   -897   -897
   -33    133    -28   -897
  -897    207   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 5 E= 4.8e+001
 0.800000  0.200000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.400000  0.200000  0.400000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.400000  0.000000  0.200000
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.800000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.600000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.200000  0.600000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[AC]CC[AG]C[ACGT][AGC][AC]C[ACT][GC][CA]A[CT][CT]C[AC][CAG]C
--------------------------------------------------------------------------------

Time 9.99 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11970                            5.03e-07  328_[+1(2.53e-09)]_109_\
                                           [+2(5.53e-06)]_26
13556                            4.18e-11  31_[+2(2.18e-05)]_143_\
                                           [+3(7.27e-09)]_175_[+1(4.89e-09)]_95
15123                            8.11e-07  299_[+2(1.77e-06)]_129_\
                                           [+1(8.12e-08)]_35
21229                            1.31e-03  19_[+2(4.72e-08)]_465
21268                            2.46e-12  130_[+2(1.13e-05)]_260_\
                                           [+3(5.99e-11)]_31_[+1(5.57e-08)]_23
22859                            9.55e-13  268_[+3(2.47e-05)]_12_\
                                           [+3(1.81e-09)]_90_[+1(6.46e-11)]_71
23450                            2.77e-03  307_[+2(1.77e-06)]_177
24035                            1.28e-02  24_[+2(1.71e-05)]_460
25833                            1.59e-14  1_[+3(2.16e-08)]_9_[+1(1.79e-10)]_\
                                           102_[+2(3.98e-05)]_27_[+2(4.72e-08)]_289
261448                           1.16e-01  219_[+2(4.27e-05)]_265
26192                            1.26e-02  56_[+2(5.02e-06)]_428
262528                           6.33e-03  47_[+2(6.90e-05)]_256_\
                                           [+3(4.46e-05)]_162
268862                           4.02e-02  44_[+2(1.04e-05)]_440
269504                           1.10e-06  107_[+3(3.12e-09)]_284_\
                                           [+2(9.48e-06)]_74
269863                           1.52e-07  107_[+2(2.87e-07)]_55_\
                                           [+1(2.50e-08)]_301
5393                             9.53e-03  303_[+2(3.04e-06)]_181
5496                             8.80e-02  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************