********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/253/253.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42588                    1.0000    500  31722                    1.0000    500
51407                    1.0000    500  46762                    1.0000    500
46794                    1.0000    500  37393                    1.0000    500
47346                    1.0000    500  49425                    1.0000    500
55128                    1.0000    500  5546                     1.0000    500
3687                     1.0000    500  45260                    1.0000    500
35472                    1.0000    500  54708                    1.0000    500
46540                    1.0000    500  44646                    1.0000    500
48988                    1.0000    500  47049                    1.0000    500
47874                    1.0000    500  44886                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/253/253.seqs.fa -oc motifs/253 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 20 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 10000 N= 20 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.270 C 0.248 G 0.232 T 0.250 Background letter frequencies (from dataset with add-one prior applied): A 0.270 C 0.248 G 0.232 T 0.250 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 21 sites = 6 llr = 116 E-value = 1.3e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :::3:3::::22:2::::::2 pos.-specific C 8285:3a:32::2:2:2:::8 probability G :8:2a::a::87587::85:: matrix T 2:2::3::78:23:2a825a: bits 2.1 * ** 1.9 * ** * * 1.7 * ** * * 1.5 * * ** * * * * * Relative 1.3 *** * ** ** * *** ** Entropy 1.1 *** * ***** * ****** (28.0 bits) 0.8 *** * ****** ******** 0.6 ***** *************** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel CGCCGACGTTGGGGGTTGGTC consensus A C C T T sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- 
                                          ---------------------
35472             5  4.80e-12        GGAA CGCCGCCGCTGGTGGTTGGTC ATGAAAAGCC
37393             5  4.80e-12        GGAA CGCCGCCGCTGGTGGTTGGTC ATGAAAAGGC
54708           180  1.97e-10  GAGTCACCTG CGCCGACGTTGGGGGTTTGTA CGATCGAGAC
55128           254  3.89e-09  ACTCAACAAT CCCAGTCGTTGGCGTTCGTTC ACCACAATGG
45260            89  5.16e-09  GGTTCTGATC CGTAGTCGTTGTGACTTGTTC GGGAGGTTTC
47346            51  9.28e-09  CATGGAGGAG TGCGGACGTCAAGGGTTGTTC CGTGAAGTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35472                             4.8e-12  4_[+1]_475
37393                             4.8e-12  4_[+1]_475
54708                               2e-10  179_[+1]_300
55128                             3.9e-09  253_[+1]_226
45260                             5.2e-09  88_[+1]_391
47346                             9.3e-09  50_[+1]_429
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
35472                    (    5) CGCCGCCGCTGGTGGTTGGTC  1
37393                    (    5) CGCCGCCGCTGGTGGTTGGTC  1
54708                    (  180) CGCCGACGTTGGGGGTTTGTA  1
55128                    (  254) CCCAGTCGTTGGCGTTCGTTC  1
45260                    (   89) CGTAGTCGTTGTGACTTGTTC  1
47346                    (   51) TGCGGACGTCAAGGGTTGTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9600 bayes= 11.743 E= 1.3e-001
  -923    174   -923    -58
  -923    -57    184   -923
  -923    174   -923    -58
    31    101    -48   -923
  -923   -923    211   -923
    31     42   -923     42
  -923    201   -923   -923
  -923   -923    211   -923
  -923     42   -923    141
  -923    -57   -923    174
   -69   -923    184   -923
   -69   -923    152    -58
  -923    -57    111     42
   -69   -923    184   -923
  -923    -57    152    -58
  -923   -923   -923    200
  -923    -57   -923    174
  -923   -923    184    -58
  -923   -923    111    100
  -923   -923   -923    200
   -69    174   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.3e-001
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.333333  0.500000  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.000000  0.833333  0.000000
 0.166667  0.000000  0.666667  0.166667
 0.000000  0.166667  0.500000  0.333333
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.166667  0.666667  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.833333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
CGC[CA]G[ACT]CG[TC]TGG[GT]GGTTG[GT]TC
--------------------------------------------------------------------------------

Time  3.53 secs.

******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 11 llr = 139 E-value = 1.8e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :1::3:::1::5:::: pos.-specific C :3:363:21:81:::3 probability G ::a514142222::76 matrix T a6:2:49568:3aa31 bits 2.1 * 1.9 * * ** 1.7 * * ** 1.5 * * * ** Relative 1.3 * * * ** *** Entropy 1.1 * * * ** *** (18.3 bits) 0.8 *** * * ** **** 0.6 ***** ** ** **** 0.4 *********** **** 0.2 **************** 0.0 ---------------- Multilevel TTGGCGTTTTCATTGG consensus C CAT G T TC sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 54708 290 2.38e-08 GGTCTAGCCG TTGGCTTGTTCCTTGG TGGAATGGTG 35472 192 5.93e-08 GTCAGTCGAT TTGGACTTTTCATTTG AAGATCTCAT 37393 192 5.93e-08 GTCAGTCGAT TTGGACTTTTCATTTG AAGACTCATG 44886 195 1.28e-07 CTCTACGGAG TTGTCGTGTGCATTGG CGATTACATA 48988 270 1.63e-07 CCAATATTTT TTGCACTCTTCATTGG ATTTCCTGTC 46540 134 5.27e-07 CAGGGGCATG TTGGCTTGTGGGTTGG AACCGAATGG 46762 123 1.26e-06 ACAAGAAACT TCGGCTGTGTCTTTGG CAGGTGGTGG 5546 127 1.71e-06 TAGTAGAGAC TAGCCGTCTTCTTTGC AAGTGTCAGT 49425 213 2.98e-06 GGCTACAGAT TTGGCGTTCTGTTTTC CGCACGACCG 55128 224 3.58e-06 TCCCAGATTC TCGTCGTTGTCGTTGT CGTCACTCAA 46794 460 5.59e-06 AATACAAACA TCGCGTTGATCATTGC TCCAGAGGAG -------------------------------------------------------------------------------- 

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54708                             2.4e-08  289_[+2]_195
35472                             5.9e-08  191_[+2]_293
37393                             5.9e-08  191_[+2]_293
44886                             1.3e-07  194_[+2]_290
48988                             1.6e-07  269_[+2]_215
46540                             5.3e-07  133_[+2]_351
46762                             1.3e-06  122_[+2]_362
5546                              1.7e-06  126_[+2]_358
49425                               3e-06  212_[+2]_272
55128                             3.6e-06  223_[+2]_261
46794                             5.6e-06  459_[+2]_25
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=11
54708                    (  290) TTGGCTTGTTCCTTGG  1
35472                    (  192) TTGGACTTTTCATTTG  1
37393                    (  192) TTGGACTTTTCATTTG  1
44886                    (  195) TTGTCGTGTGCATTGG  1
48988                    (  270) TTGCACTCTTCATTGG  1
46540                    (  134) TTGGCTTGTGGGTTGG  1
46762                    (  123) TCGGCTGTGTCTTTGG  1
5546                     (  127) TAGCCGTCTTCTTTGC  1
49425                    (  213) TTGGCGTTCTGTTTTC  1
55128                    (  224) TCGTCGTTGTCGTTGT  1
46794                    (  460) TCGCGTTGATCATTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 10.1382 E= 1.8e+000
 -1010  -1010  -1010    200
  -157     13  -1010    135
 -1010  -1010    211  -1010
 -1010     13    123    -46
     2    136   -135  -1010
 -1010     13     65     54
 -1010  -1010   -135    186
 -1010    -45     65     86
  -157   -145    -35    135
 -1010  -1010    -35    171
 -1010    172    -35  -1010
    75   -145    -35     13
 -1010  -1010  -1010    200
 -1010  -1010  -1010    200
 -1010  -1010    165     13
 -1010     13    145   -146
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 1.8e+000
 0.000000  0.000000  0.000000  1.000000
 0.090909  0.272727  0.000000  0.636364
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.272727  0.545455  0.181818
 0.272727  0.636364  0.090909  0.000000
 0.000000  0.272727  0.363636  0.363636
 0.000000  0.000000  0.090909  0.909091
 0.000000  0.181818  0.363636  0.454545
 0.090909  0.090909  0.181818  0.636364
 0.000000  0.000000  0.181818  0.818182
 0.000000  0.818182  0.181818  0.000000
 0.454545  0.090909  0.181818  0.272727
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.727273  0.272727
 0.000000  0.272727  0.636364  0.090909
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
T[TC]G[GC][CA][GTC]T[TG]TTC[AT]TT[GT][GC]
--------------------------------------------------------------------------------

Time  7.11 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 18 llr = 183 E-value = 7.7e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 928:98:632498112 pos.-specific C 11:9::8:24211372 probability G :6:1:22461411111 matrix T :12:1::::3::1626 bits 2.1 1.9 1.7 * * 1.5 * ** Relative 1.3 * **** * Entropy 1.1 * ****** * (14.7 bits) 0.8 * ****** ** 0.6 ********* ** * 0.4 ********* ***** 0.2 **************** 0.0 ---------------- Multilevel AGACAACAGCAAATCT consensus AT GGATG C sequence AC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 55128 197 5.58e-08 AGGGATACGA AGACAACACAGAATCT CTCCCAGATT 47049 61 6.28e-07 GAACAAAGGA AGACAACGATCAATGT ACTTACTTTC 31722 371 6.28e-07 ATGGTCGACA AGACAACGGTGAGTCA GAGCAACAGT 51407 76 7.03e-07 ACCCCCTTTC AGACAACGACAAGCCT GAGCACACCG 35472 284 1.05e-06 AATACAGCGC AAACAAGAGCAAATTT AGGGTTATTA 37393 283 1.05e-06 AATACAGCGC AAACAAGAGCAAATTT AGGGTTATTA 46762 425 2.49e-06 AAGTGAACCA AAACAAGAATGAATCC AAACCTTTTC 54708 207 5.37e-06 TGTACGATCG AGACAGGGGCGAATAT GAAGCCGAAG 44886 353 7.06e-06 TACACAAAAA AGTCAACGGAACACCT AGCCGCACTC 46540 414 7.63e-06 TCTAGATTAC AGTGAACGGAGAATCG GACCCCACAG 47874 470 1.37e-05 CTACCTGCCT AATCAACGATCAACTT TAACATACAG 5546 370 1.37e-05 ACCCACCAGC ATACAACAACCAAACC GGACCGTTGT 46794 440 1.73e-05 CCATCGTCCA AGACAACAGAAATACA AACATCGCGT 49425 405 2.03e-05 TACTAGACTC 
                                          CGACAACACTGAACGT CGAACGAATC
47346           228  2.35e-05  CGCCGAAGAG AGTCAGCAGCCAAGCA GTCGTCTCGT
48988           183  3.61e-05  AGTTCGTTAG AGAGAACAGGAGACCT CGTAGCGGTT
3687            204  4.40e-05  CTTTTGTCCC ACACAACACCGACTCC TATTGTAGAA
42588           174  5.74e-05  ACGATTCCTG ATACTGCGGTAAATCG GCACGCCACG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
55128                             5.6e-08  196_[+3]_288
47049                             6.3e-07  60_[+3]_424
31722                             6.3e-07  370_[+3]_114
51407                               7e-07  75_[+3]_409
35472                             1.1e-06  283_[+3]_201
37393                             1.1e-06  282_[+3]_202
46762                             2.5e-06  424_[+3]_60
54708                             5.4e-06  206_[+3]_278
44886                             7.1e-06  352_[+3]_132
46540                             7.6e-06  413_[+3]_71
47874                             1.4e-05  469_[+3]_15
5546                              1.4e-05  369_[+3]_115
46794                             1.7e-05  439_[+3]_45
49425                               2e-05  404_[+3]_80
47346                             2.4e-05  227_[+3]_257
48988                             3.6e-05  182_[+3]_302
3687                              4.4e-05  203_[+3]_281
42588                             5.7e-05  173_[+3]_311
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=18
55128                    (  197) AGACAACACAGAATCT  1
47049                    (   61) AGACAACGATCAATGT  1
31722                    (  371) AGACAACGGTGAGTCA  1
51407                    (   76) AGACAACGACAAGCCT  1
35472                    (  284) AAACAAGAGCAAATTT  1
37393                    (  283) AAACAAGAGCAAATTT  1
46762                    (  425) AAACAAGAATGAATCC  1
54708                    (  207) AGACAGGGGCGAATAT  1
44886                    (  353) AGTCAACGGAACACCT  1
46540                    (  414) AGTGAACGGAGAATCG  1
47874                    (  470) AATCAACGATCAACTT  1
5546                     (  370) ATACAACAACCAAACC  1
46794                    (  440) AGACAACAGAAATACA  1
49425                    (  405) CGACAACACTGAACGT  1
47346                    (  228) AGTCAGCAGCCAAGCA  1
48988                    (  183) AGAGAACAGGAGACCT  1
3687                     (  204) ACACAACACCGACTCC  1
42588                    (  174) ATACTGCGGTAAATCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 9.2057 E= 7.7e+000
   181   -216  -1081  -1081
   -28   -216    140   -117
   153  -1081  -1081    -17
 -1081    184   -106  -1081
   181  -1081  -1081   -217
   163  -1081    -48  -1081
 -1081    165     -6  -1081
   104  -1081     94  -1081
     4    -58    126  -1081
   -28     65   -206     42
    53    -16     74  -1081
   172   -216   -206  -1081
   153   -216   -106   -217
  -128     16   -206    115
  -228    142   -106    -58
   -69    -58   -106    115
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 18 E= 7.7e+000
 0.944444  0.055556  0.000000  0.000000
 0.222222  0.055556  0.611111  0.111111
 0.777778  0.000000  0.000000  0.222222
 0.000000  0.888889  0.111111  0.000000
 0.944444  0.000000  0.000000  0.055556
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.777778  0.222222  0.000000
 0.555556  0.000000  0.444444  0.000000
 0.277778  0.166667  0.555556  0.000000
 0.222222  0.388889  0.055556  0.333333
 0.388889  0.222222  0.388889  0.000000
 0.888889  0.055556  0.055556  0.000000
 0.777778  0.055556  0.111111  0.055556
 0.111111  0.277778  0.055556  0.555556
 0.055556  0.666667  0.111111  0.166667
 0.166667  0.166667  0.111111  0.555556
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
A[GA][AT]CAA[CG][AG][GA][CTA][AGC]AA[TC]CT
--------------------------------------------------------------------------------

Time 10.40 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42588                            2.81e-01  173_[+3(5.74e-05)]_311
31722                            7.79e-03  370_[+3(6.28e-07)]_114
51407                            8.24e-03  75_[+3(7.03e-07)]_409
46762                            5.76e-05  122_[+2(1.26e-06)]_286_\
    [+3(2.49e-06)]_60
46794                            1.23e-03  439_[+3(1.73e-05)]_4_[+2(5.59e-06)]_\
    25
37393                            2.57e-14  4_[+1(4.80e-12)]_166_[+2(5.93e-08)]_\
    75_[+3(1.05e-06)]_53_[+3(7.33e-05)]_133
47346                            2.42e-06  50_[+1(9.28e-09)]_156_\
    [+3(2.35e-05)]_257
49425                            4.98e-04  17_[+2(8.71e-05)]_179_\
    [+2(2.98e-06)]_176_[+3(2.03e-05)]_80
55128                            4.24e-11  196_[+3(5.58e-08)]_11_\
    [+2(3.58e-06)]_14_[+1(3.89e-09)]_226
5546                             2.72e-04  126_[+2(1.71e-06)]_227_\
    [+3(1.37e-05)]_115
3687                             1.31e-01  203_[+3(4.40e-05)]_281
45260                            2.17e-05  88_[+1(5.16e-09)]_391
35472                            2.57e-14  4_[+1(4.80e-12)]_166_[+2(5.93e-08)]_\
    76_[+3(1.05e-06)]_53_[+3(7.33e-05)]_132
54708                            1.69e-12  179_[+1(1.97e-10)]_6_[+3(5.37e-06)]_\
    67_[+2(2.38e-08)]_195
46540                            2.23e-05  133_[+2(5.27e-07)]_264_\
    [+3(7.63e-06)]_71
44646                            1.97e-01  500
48988                            6.57e-05  182_[+3(3.61e-05)]_71_\
    [+2(1.63e-07)]_215
47049                            1.47e-03  60_[+3(6.28e-07)]_424
47874                            8.26e-02  469_[+3(1.37e-05)]_15
44886                            1.88e-05  194_[+2(1.28e-07)]_142_\
    [+3(7.06e-06)]_132
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************