********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/412/412.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10149                    1.0000    500  10934                    1.0000    500
2066                     1.0000    500  21220                    1.0000    500
2222                     1.0000    500  23848                    1.0000    500
261613                   1.0000    500  261974                   1.0000    500
263138                   1.0000    500  264061                   1.0000    500
34746                    1.0000    500  8750                     1.0000    500
9048                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/412/412.seqs.fa -oc motifs/412 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.218 G 0.236 T 0.272
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.218 G 0.236 T 0.272
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  13  llr = 128  E-value = 3.9e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :a2:8:22:43:
pos.-specific     C  :::32114:1:1
probability       G  a:87:9834539
matrix            T  ::1::::26:4:

         bits    2.2
                 2.0 **
                 1.8 **   *     *
                 1.5 **   *     *
Relative         1.3 ** ***     *
Entropy          1.1 ******* *  *
(14.2 bits)      0.9 ******* *  *
                 0.7 ******* ** *
                 0.4 ******* ****
                 0.2 ************
                 0.0 ------------

Multilevel           GAGGAGGCTGTG
consensus               C   GGAA
sequence                       G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
2222                         26  4.94e-08 GCCGGGGGGA GAGGAGGCTGTG TGCTCAGGGT
261613                      281  6.30e-07 GAGCAAGAGA GAGGAGGCTAAG ACAACCGACG
21220                       223  1.06e-06 CCTGCTGAGG GAGGAGGCGAGG TGGAGACGAG
10934                        29  1.06e-06 ATCACCAGAA GAGCAGGCTGAG ATTGCACTGA
10149                        74  1.17e-06 TTTGGATGAG GAGGAGGGTAAG AGACGAACTG
261974                      166  2.88e-06 AAACCCAAAG GAGGAGGAGGAG GTTGTCAGGA
8750                        243  1.57e-05 ACGAGAATCA GAGCAGCCTGGG ATGAAACCTT
263138                      362  1.70e-05 TTCACTCTTC GAGCAGAGGGTG AGGACCAAGA
34746                       257  3.15e-05 AAGCAGAGGT GAAGAGAGTGGG CATTATTGAA
2066                          7  3.33e-05     AGCATC GAGCCGGATAGG AACAAGCCAG
23848                       432  7.24e-05 ACTTTTGACC GAAGAGGTGCTG CCTCGTGATA
9048                        278  7.68e-05 GAGATACTTT GAGGACGGTGTC CAGGCAACAA
264061                      409  1.14e-04 GCGTCTCTCA GATGCGGTGATG CCTTTCTGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
2222                              4.9e-08  25_[+1]_463
261613                            6.3e-07  280_[+1]_208
21220                             1.1e-06  222_[+1]_266
10934                             1.1e-06  28_[+1]_460
10149                             1.2e-06  73_[+1]_415
261974                            2.9e-06  165_[+1]_323
8750                              1.6e-05  242_[+1]_246
263138                            1.7e-05  361_[+1]_127
34746                             3.2e-05  256_[+1]_232
2066                              3.3e-05  6_[+1]_482
23848                             7.2e-05  431_[+1]_57
9048                              7.7e-05  277_[+1]_211
264061                            0.00011  408_[+1]_80
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=13
2222                     (   26) GAGGAGGCTGTG  1
261613                   (  281) GAGGAGGCTAAG  1
21220                    (  223) GAGGAGGCGAGG  1
10934                    (   29) GAGCAGGCTGAG  1
10149                    (   74) GAGGAGGGTAAG  1
261974                   (  166) GAGGAGGAGGAG  1
8750                     (  243) GAGCAGCCTGGG  1
263138                   (  362) GAGCAGAGGGTG  1
34746                    (  257) GAAGAGAGTGGG  1
2066                     (    7) GAGCCGGATAGG  1
23848                    (  432) GAAGAGGTGCTG  1
9048                     (  278) GAGGACGGTGTC  1
264061                   (  409) GATGCGGTGATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 8.93074 E= 3.9e-001
 -1035  -1035    208  -1035
   187  -1035  -1035  -1035
   -83  -1035    170   -182
 -1035     50    155  -1035
   163    -50  -1035  -1035
 -1035   -150    197  -1035
   -83   -150    170  -1035
   -83     82     38    -82
 -1035  -1035     71    118
    49   -150    119  -1035
    17  -1035     38     50
 -1035   -150    197  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 3.9e-001
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.153846  0.000000  0.769231  0.076923
 0.000000  0.307692  0.692308  0.000000
 0.846154  0.153846  0.000000  0.000000
 0.000000  0.076923  0.923077  0.000000
 0.153846  0.076923  0.769231  0.000000
 0.153846  0.384615  0.307692  0.153846
 0.000000  0.000000  0.384615  0.615385
 0.384615  0.076923  0.538462  0.000000
 0.307692  0.000000  0.307692  0.384615
 0.000000  0.076923  0.923077  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
GAG[GC]AGG[CG][TG][GA][TAG]G
--------------------------------------------------------------------------------

Time  1.43 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  15  sites =   5  llr = 79  E-value = 1.7e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::22:2:::::4:a
pos.-specific     C  aaa:4a82a:6848:
probability       G  ::::4::::2:::::
matrix            T  :::8:::8:84222:

         bits    2.2 ***  *  *
                 2.0 ***  *  *     *
                 1.8 ***  *  *     *
                 1.5 ***  *  *     *
Relative         1.3 ***  ****  * **
Entropy          1.1 **** ******* **
(22.8 bits)      0.9 **** ******* **
                 0.7 ************ **
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CCCTCCCTCTCCACA
consensus               AG AC GTTCT
sequence                 A       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
2066                        389  1.37e-09 CGAGTTGCAT CCCTGCCTCTCCACA CCTACCTTCC
261974                      459  1.97e-08 ATAATTCATA CCCTCCCTCGTCACA GCGACACACC
2222                        268  4.27e-08 CGCCAACGCT CCCTCCACCTCCCCA CAAGCCCTAT
263138                      461  6.81e-08 CTCTGGACAC CCCAACCTCTCCTCA TGAGGTGGTC
264061                      452  1.19e-07 CTCTGTCCTT CCCTGCCTCTTTCTA TTCTCATTTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
2066                              1.4e-09  388_[+2]_97
261974                              2e-08  458_[+2]_27
2222                              4.3e-08  267_[+2]_218
263138                            6.8e-08  460_[+2]_25
264061                            1.2e-07  451_[+2]_34
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=5
2066                     (  389) CCCTGCCTCTCCACA  1
261974                   (  459) CCCTCCCTCGTCACA  1
2222                     (  268) CCCTCCACCTCCCCA  1
263138                   (  461) CCCAACCTCTCCTCA  1
264061                   (  452) CCCTGCCTCTTTCTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6318 bayes= 11.2461 E= 1.7e+001
  -897    219   -897   -897
  -897    219   -897   -897
  -897    219   -897   -897
   -45   -897   -897    156
   -45     87     76   -897
  -897    219   -897   -897
   -45    187   -897   -897
  -897    -13   -897    156
  -897    219   -897   -897
  -897   -897    -24    156
  -897    146   -897     56
  -897    187   -897    -44
    54     87   -897    -44
  -897    187   -897    -44
   186   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 1.7e+001
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.200000  0.400000  0.400000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.800000  0.000000  0.200000
 0.400000  0.400000  0.000000  0.200000
 0.000000  0.800000  0.000000  0.200000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
CCC[TA][CGA]C[CA][TC]C[TG][CT][CT][ACT][CT]A
--------------------------------------------------------------------------------

Time  2.84 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  15  sites =  13  llr = 140  E-value = 2.9e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :83194:72841a32
pos.-specific     C  ::2:::113:26:2:
probability       G  81:91:9:5:53:17
matrix            T  225::6:2:2:::51

         bits    2.2
                 2.0             *
                 1.8    *  *     *
                 1.5 *  ** *     *
Relative         1.3 *  ** *  *  *
Entropy          1.1 *  ** *  *  *
(15.5 bits)      0.9 ** ****  * ** *
                 0.7 ** ***** **** *
                 0.4 ************* *
                 0.2 ***************
                 0.0 ---------------

Multilevel           GATGATGAGAGCATG
consensus              A  A TC AG AA
sequence               C     A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
261974                       29  1.17e-09 GGATATGGGG GATGATGAGAGCATG TTTTGTTCAA
10149                       115  7.63e-08 CAGGAAAGTC GACGATGAGACCATG CTCTTGGATA
21220                       255  1.96e-07 TAGTGCAGTT GACGATGAGAAGAAG AAGCAGTGAT
261613                      235  1.27e-06 TGGCATTACA GAAGATGAGAAGAAA GTTAGTGTTA
23848                       184  3.09e-06 CATATATAGA GAAGAAGACAACATT CAAATTGAAT
9048                        304  3.37e-06 CAACAAGAAA GGTGAAGTCAGCATG AAGATGCGTG
264061                       26  3.37e-06 GTGTACTGAC GTTGATGAAAGGACG GAATACAGCA
8750                        389  5.58e-06 AGTATTCTCG TATGATGCCAGCAAG GAGGGTTTTC
2222                        426  6.64e-06 CAAATGCCTT GACAAAGACAGCACG TTTCCTTACT
263138                      123  1.21e-05 ACCATACGGT GATGGAGAGAAAATG CTTGGACGGG
34746                       239  1.61e-05 AGGAAGGGGG GAAGAAGTAAGCAGA GGTGAAGAGA
2066                        143  4.98e-05 AATTATGAAA GAAGATCAATACAAA GCAAGCTAAA
10934                       431  7.39e-05 GAGCTGATTG TTTGATGTGTCGATG TAATGCTCTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
261974                            1.2e-09  28_[+3]_457
10149                             7.6e-08  114_[+3]_371
21220                               2e-07  254_[+3]_231
261613                            1.3e-06  234_[+3]_251
23848                             3.1e-06  183_[+3]_302
9048                              3.4e-06  303_[+3]_182
264061                            3.4e-06  25_[+3]_460
8750                              5.6e-06  388_[+3]_97
2222                              6.6e-06  425_[+3]_60
263138                            1.2e-05  122_[+3]_363
34746                             1.6e-05  238_[+3]_247
2066                                5e-05  142_[+3]_343
10934                             7.4e-05  430_[+3]_55
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=13
261974                   (   29) GATGATGAGAGCATG  1
10149                    (  115) GACGATGAGACCATG  1
21220                    (  255) GACGATGAGAAGAAG  1
261613                   (  235) GAAGATGAGAAGAAA  1
23848                    (  184) GAAGAAGACAACATT  1
9048                     (  304) GGTGAAGTCAGCATG  1
264061                   (   26) GTTGATGAAAGGACG  1
8750                     (  389) TATGATGCCAGCAAG  1
2222                     (  426) GACAAAGACAGCACG  1
263138                   (  123) GATGGAGAGAAAATG  1
34746                    (  239) GAAGAAGTAAGCAGA  1
2066                     (  143) GAAGATCAATACAAA  1
10934                    (  431) TTTGATGTGTCGATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6318 bayes= 8.92184 E= 2.9e+000
 -1035  -1035    184    -82
   149  -1035   -161    -82
    17      8  -1035     76
  -183  -1035    197  -1035
   175  -1035   -161  -1035
    49  -1035  -1035    118
 -1035   -150    197  -1035
   134   -150  -1035    -24
   -25     50     97  -1035
   163  -1035  -1035    -82
    49    -50     97  -1035
  -183    150     38  -1035
   187  -1035  -1035  -1035
    17    -50   -161     76
   -25  -1035    155   -182
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 13 E= 2.9e+000
 0.000000  0.000000  0.846154  0.153846
 0.769231  0.000000  0.076923  0.153846
 0.307692  0.230769  0.000000  0.461538
 0.076923  0.000000  0.923077  0.000000
 0.923077  0.000000  0.076923  0.000000
 0.384615  0.000000  0.000000  0.615385
 0.000000  0.076923  0.923077  0.000000
 0.692308  0.076923  0.000000  0.230769
 0.230769  0.307692  0.461538  0.000000
 0.846154  0.000000  0.000000  0.153846
 0.384615  0.153846  0.461538  0.000000
 0.076923  0.615385  0.307692  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.307692  0.153846  0.076923  0.461538
 0.230769  0.000000  0.692308  0.076923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
GA[TAC]GA[TA]G[AT][GCA]A[GA][CG]A[TA][GA]
--------------------------------------------------------------------------------

Time  4.40 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10149                            3.18e-06  73_[+1(1.17e-06)]_29_[+3(7.63e-08)]_\
    371
10934                            9.07e-04  28_[+1(1.06e-06)]_390_\
    [+3(7.39e-05)]_55
2066                             6.86e-08  6_[+1(3.33e-05)]_124_[+3(4.98e-05)]_\
    231_[+2(1.37e-09)]_97
21220                            3.60e-06  222_[+1(1.06e-06)]_20_\
    [+3(1.96e-07)]_231
2222                             6.40e-10  25_[+1(4.94e-08)]_230_\
    [+2(4.27e-08)]_143_[+3(6.64e-06)]_60
23848                            5.81e-04  183_[+3(3.09e-06)]_233_\
    [+1(7.24e-05)]_57
261613                           6.64e-06  234_[+3(1.27e-06)]_31_\
    [+1(6.30e-07)]_208
261974                           4.30e-12  28_[+3(1.17e-09)]_122_\
    [+1(2.88e-06)]_281_[+2(1.97e-08)]_27
263138                           3.65e-07  122_[+3(1.21e-05)]_224_\
    [+1(1.70e-05)]_87_[+2(6.81e-08)]_25
264061                           1.03e-06  25_[+3(3.37e-06)]_411_\
    [+2(1.19e-07)]_34
34746                            4.40e-03  238_[+3(1.61e-05)]_3_[+1(3.15e-05)]_\
    232
8750                             6.40e-05  242_[+1(1.57e-05)]_53_\
    [+2(5.37e-05)]_66_[+3(5.58e-06)]_97
9048                             2.35e-03  277_[+1(7.68e-05)]_14_\
    [+3(3.37e-06)]_182
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
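
Note on the reported relative entropy: the "(… bits)" figure in each motif's description block appears to agree with that motif's llr and sites values under the relation re = llr / (sites * ln 2). The sketch below is a consistency check only, not part of MEME. The function name and the dictionary are illustrative, the values are copied by hand from the three MOTIF header lines in this report, and because llr is printed as a rounded integer the comparison is only meaningful to about one decimal place.

```python
import math


def relative_entropy_bits(llr: float, sites: int) -> float:
    """Relative entropy in bits, assuming re = llr / (sites * ln 2).

    llr is the log-likelihood ratio (natural log) from a MOTIF header line;
    sites is the number of sites contributing to that motif.
    """
    return llr / (sites * math.log(2))


# motif number -> (llr, sites, bits reported in the description block),
# transcribed from this report.
REPORTED = {
    1: (128, 13, 14.2),
    2: (79, 5, 22.8),
    3: (140, 13, 15.5),
}

for motif_id, (llr, sites, reported_bits) in REPORTED.items():
    computed = relative_entropy_bits(llr, sites)
    print(f"motif {motif_id}: computed {computed:.1f} bits, "
          f"reported {reported_bits} bits")
```

For this report the computed and reported values match to one decimal place for all three motifs. Motif 2 has the highest per-site information content (22.8 bits) but is built from only 5 sites, which is in line with its weaker E-value of 1.7e+001.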