********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/92/92.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
21350                    1.0000    500  48558                    1.0000    500
48799                    1.0000    500  51519                    1.0000    500
16210                    1.0000    500  48536                    1.0000    500
46148                    1.0000    500  47453                    1.0000    500
44307                    1.0000    500  45055                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/92/92.seqs.fa -oc motifs/92 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.225 G 0.229 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.225 G 0.229 T 0.281
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 10 llr = 105 E-value = 7.0e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :2111:a:51::
pos.-specific     C  :4:5:::a1:::
probability       G  :181:9:::3:9
matrix            T  a31391::46a1

         bits    2.1        *
                 1.9 *     **  *
                 1.7 *    ***  **
                 1.5 *    ***  **
Relative         1.3 *   ****  **
Entropy          1.1 * * ****  **
(15.2 bits)      0.9 * * ****  **
                 0.6 * * ********
                 0.4 * **********
                 0.2 ************
                 0.0 ------------

Multilevel           TCGCTGACATTG
consensus             T T    TG
sequence              A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
48536                       484  5.23e-07 CGAAGCAGTG TTGCTGACTTTG CGAAC
16210                        91  1.38e-06 GCCTTTGGAT TTGTTGACTTTG TAGCGTTTTC
51519                         1  2.42e-06          . TAGTTGACAGTG AAAAACGGAT
45055                        12  3.57e-06 GCCAGCTCGG TCGCTGACATTT TAAATGTTTA
48558                       223  3.57e-06 TTCTTCAAAT TCGCTTACATTG ACTGTGAATA
44307                       169  6.43e-06 CCGCTGTTTG TTGTTGACAATG CTTATTCATC
46148                       253  1.20e-05 GCCGTTGCGA TATCTGACAGTG AGTGCACAGA
21350                        64  1.30e-05 CTACCCACAT TCGATGACCGTG GTGTAGCCAG
48799                       173  1.57e-05 TTTTGGACAC TGACTGACTTTG ACATGACTCT
47453                       374  1.79e-05 ACTATAAATT TCGGAGACTTTG TGTCCATGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48536                             5.2e-07  483_[+1]_5
16210                             1.4e-06  90_[+1]_398
51519                             2.4e-06  [+1]_488
45055                             3.6e-06  11_[+1]_477
48558                             3.6e-06  222_[+1]_266
44307                             6.4e-06  168_[+1]_320
46148                             1.2e-05  252_[+1]_236
21350                             1.3e-05  63_[+1]_425
48799                             1.6e-05  172_[+1]_316
47453                             1.8e-05  373_[+1]_115
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=10
48536                    (  484) TTGCTGACTTTG  1
16210                    (   91) TTGTTGACTTTG  1
51519                    (    1) TAGTTGACAGTG  1
45055                    (   12) TCGCTGACATTT  1
48558                    (  223) TCGCTTACATTG  1
44307                    (  169) TTGTTGACAATG  1
46148                    (  253) TATCTGACAGTG  1
21350                    (   64) TCGATGACCGTG  1
48799                    (  173) TGACTGACTTTG  1
47453                    (  374) TCGGAGACTTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 8.93074 E= 7.0e-001
  -997   -997   -997    183
   -40     83   -119      9
  -140   -997    180   -149
  -140    115   -119      9
  -140   -997   -997    168
  -997   -997    197   -149
   192   -997   -997   -997
  -997    215   -997   -997
    92   -117   -997     51
  -140   -997     39    109
  -997   -997   -997    183
  -997   -997    197   -149
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 7.0e-001
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.400000  0.100000  0.300000
 0.100000  0.000000  0.800000  0.100000
 0.100000  0.500000  0.100000  0.300000
 0.100000  0.000000  0.000000  0.900000
 0.000000  0.000000  0.900000  0.100000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.100000  0.000000  0.400000
 0.100000  0.000000  0.300000  0.600000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.900000  0.100000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[CTA]G[CT]TGAC[AT][TG]TG
--------------------------------------------------------------------------------




Time  0.88 secs.

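The Motif 1 log-odds matrix and the "(15.2 bits)" relative entropy figure appear to follow from the letter-probability matrix and the background frequencies listed in the command line summary. The Python sketch below illustrates that relationship under two assumptions that are not stated in this report: that each score is round(100 * log2(p / background)), and that the relative entropy is the sum of p * log2(p / background) over all positions and letters. The names `BACKGROUND`, `MOTIF1`, `log_odds` and `relative_entropy` are hypothetical. Zero probabilities are returned as `None` rather than guessing at the large negative floor value printed in the report.

```python
import math

# Background frequencies copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.265, "C": 0.225, "G": 0.229, "T": 0.281}
LETTERS = "ACGT"

# Motif 1 letter-probability matrix (one row per position, columns A C G T).
MOTIF1 = [
    [0.0, 0.0, 0.0, 1.0],
    [0.2, 0.4, 0.1, 0.3],
    [0.1, 0.0, 0.8, 0.1],
    [0.1, 0.5, 0.1, 0.3],
    [0.1, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.9, 0.1],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.1, 0.0, 0.4],
    [0.1, 0.0, 0.3, 0.6],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.9, 0.1],
]


def log_odds(p, bg):
    """Assumed scoring rule: round(100 * log2(p / bg)); None when p is zero."""
    if p == 0.0:
        return None
    return round(100 * math.log2(p / bg))


def relative_entropy(matrix, background):
    """Sum of p * log2(p / bg) over every position and letter, in bits."""
    total = 0.0
    for row in matrix:
        for p, letter in zip(row, LETTERS):
            if p > 0.0:
                total += p * math.log2(p / background[letter])
    return total


scores = [[log_odds(p, BACKGROUND[l]) for p, l in zip(row, LETTERS)]
          for row in MOTIF1]
entropy_bits = relative_entropy(MOTIF1, BACKGROUND)

print(scores[0])               # first position of the motif
print(round(entropy_bits, 1))  # compare with the "(15.2 bits)" label
```

Under these assumptions a few entries come out one unit away from the printed matrix (for example -41 against the printed -40 at position 2), presumably because the report was generated from unrounded frequencies.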
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 10 llr = 101 E-value = 2.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  9:3:a:32259:
pos.-specific     C  1419:a:::4::
probability       G  :42:::18:::9
matrix            T  :241::6:8111

         bits    2.1      *
                 1.9     **
                 1.7    ***     *
                 1.5 *  ***    **
Relative         1.3 *  *** *  **
Entropy          1.1 *  *** ** **
(14.6 bits)      0.9 *  *** ** **
                 0.6 ** *********
                 0.4 ** *********
                 0.2 ** *********
                 0.0 ------------

Multilevel           ACTCACTGTAAG
consensus             GA   AAAC
sequence              TG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
51519                        59  5.81e-07 AGATGGATTC ACGCACTGTAAG TGTCAACTAC
46148                       386  9.03e-07 TAGAATACAT ACTCACAGTCAG CGATCTCGAT
48558                       382  1.37e-06 CAATCACACC ATTCACTGTCAG CTACGCACAA
47453                        89  9.76e-06 TCAGACATTG ATACACTATAAG AAGCGACATA
45055                       152  1.08e-05 TAGCTAACAG ACATACTGTCAG GATCGGTATC
21350                       451  1.08e-05 ACCGTCCCTC CGTCACAGTCAG TGTTTTGCCC
44307                       290  1.76e-05 TTTGCTCACA AGGCACTAAAAG TCCTTGGCGG
16210                       305  2.03e-05 TATCTCACGA AGACACGGTTAG TGTTTGTGGT
48536                       158  2.57e-05 GGCACTAGAC AGTCACTGAAAT TCCTCTCCAC
48799                       199  2.94e-05 GACTCTGAGT ACCCACAGTATG GTCCATTTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
51519                             5.8e-07  58_[+2]_430
46148                               9e-07  385_[+2]_103
48558                             1.4e-06  381_[+2]_107
47453                             9.8e-06  88_[+2]_400
45055                             1.1e-05  151_[+2]_337
21350                             1.1e-05  450_[+2]_38
44307                             1.8e-05  289_[+2]_199
16210                               2e-05  304_[+2]_184
48536                             2.6e-05  157_[+2]_331
48799                             2.9e-05  198_[+2]_290
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=10
51519                    (   59) ACGCACTGTAAG  1
46148                    (  386) ACTCACAGTCAG  1
48558                    (  382) ATTCACTGTCAG  1
47453                    (   89) ATACACTATAAG  1
45055                    (  152) ACATACTGTCAG  1
21350                    (  451) CGTCACAGTCAG  1
44307                    (  290) AGGCACTAAAAG  1
16210                    (  305) AGACACGGTTAG  1
48536                    (  158) AGTCACTGAAAT  1
48799                    (  199) ACCCACAGTATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 8.93074 E= 2.0e+001
   177   -117   -997   -997
  -997     83     80    -49
    18   -117    -20     51
  -997    200   -997   -149
   192   -997   -997   -997
  -997    215   -997   -997
    18   -997   -119    109
   -40   -997    180   -997
   -40   -997   -997    151
    92     83   -997   -149
   177   -997   -997   -149
  -997   -997    197   -149
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 2.0e+001
 0.900000  0.100000  0.000000  0.000000
 0.000000  0.400000  0.400000  0.200000
 0.300000  0.100000  0.200000  0.400000
 0.000000  0.900000  0.000000  0.100000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.300000  0.000000  0.100000  0.600000
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.500000  0.400000  0.000000  0.100000
 0.900000  0.000000  0.000000  0.100000
 0.000000  0.000000  0.900000  0.100000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
A[CGT][TAG]CAC[TA][GA][TA][AC]AG
--------------------------------------------------------------------------------




Time  1.71 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 7 llr = 112 E-value = 9.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1::344:a:1:1:771:1:::
pos.-specific     C  63:4:39:1411:1::741:9
probability       G  :7:3631:74:1a1:9:1711
matrix            T  3:a:::::1:96::3:3319:

         bits    2.1             *
                 1.9   *    *    *
                 1.7   *    *    *
                 1.5   *   **    *  *    *
Relative         1.3  **   **  * *  **  **
Entropy          1.1  ** * **  * * ***  **
(23.1 bits)      0.9  ** * *** * ***** ***
                 0.6 *** * ***** ***** ***
                 0.4 *********** ***** ***
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CGTCGACAGCTTGAAGCCGTC
consensus            TC AAC   G    T TT
sequence                G G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
16210                       240  8.16e-11 CCAGTGGAGT CGTCGACAGGTTGATGTCGTC TTTTCCTTGT
44307                       395  6.18e-10 CATTCCTTGA AGTCACCAGCTTGAAGCGGTC ACGTTTTACT
21350                        77  2.85e-08 ATGACCGTGG TGTAGCCAGGCAGAAGCCGGC CATTGGCTGC
48536                       463  3.92e-08 AGTTTCCATT CGTAGACATCTCGAAGCAGTG TTGCTGACTT
51519                       336  4.23e-08 ACATCTCCAG CCTCAGCAGCTTGGAGTTTTC GAGCGAATCG
45055                       100  1.13e-07 GCGGCACGCC CCTGAAGACGTTGATGCCCTC ACAGAGTAAC
48558                        59  1.45e-07 GGGTGTTCTC TGTGGGCAGATGGCAACTGTC GCTCTCATTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
16210                             8.2e-11  239_[+3]_240
44307                             6.2e-10  394_[+3]_85
21350                             2.9e-08  76_[+3]_403
48536                             3.9e-08  462_[+3]_17
51519                             4.2e-08  335_[+3]_144
45055                             1.1e-07  99_[+3]_380
48558                             1.4e-07  58_[+3]_421
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=7
16210                    (  240) CGTCGACAGGTTGATGTCGTC  1
44307                    (  395) AGTCACCAGCTTGAAGCGGTC  1
21350                    (   77) TGTAGCCAGGCAGAAGCCGGC  1
48536                    (  463) CGTAGACATCTCGAAGCAGTG  1
51519                    (  336) CCTCAGCAGCTTGGAGTTTTC  1
45055                    (  100) CCTGAAGACGTTGATGCCCTC  1
48558                    (   59) TGTGGGCAGATGGCAACTGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 10.0258 E= 9.8e+001
   -89    134   -945      2
  -945     34    164   -945
  -945   -945   -945    183
    11     93     32   -945
    69   -945    132   -945
    69     34     32   -945
  -945    193    -68   -945
   192   -945   -945   -945
  -945    -66    164    -97
   -89     93     90   -945
  -945    -66   -945    161
   -89    -66    -68    102
  -945   -945    212   -945
   143    -66    -68   -945
   143   -945   -945      2
   -89   -945    190   -945
  -945    166   -945      2
   -89     93    -68      2
  -945    -66    164    -97
  -945   -945    -68    161
  -945    193    -68   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 9.8e+001
 0.142857  0.571429  0.000000  0.285714
 0.000000  0.285714  0.714286  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.285714  0.428571  0.285714  0.000000
 0.428571  0.000000  0.571429  0.000000
 0.428571  0.285714  0.285714  0.000000
 0.000000  0.857143  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.142857  0.714286  0.142857
 0.142857  0.428571  0.428571  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.142857  0.142857  0.142857  0.571429
 0.000000  0.000000  1.000000  0.000000
 0.714286  0.142857  0.142857  0.000000
 0.714286  0.000000  0.000000  0.285714
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.714286  0.000000  0.285714
 0.142857  0.428571  0.142857  0.285714
 0.000000  0.142857  0.714286  0.142857
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.857143  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CT][GC]T[CAG][GA][ACG]CAG[CG]TTGA[AT]G[CT][CT]GTC
--------------------------------------------------------------------------------




Time  2.52 secs.

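The regular expression printed for each motif is a compact summary of the multilevel consensus, not a filter that every reported site must pass. As a hedged illustration, the Python sketch below applies the Motif 3 expression to the seven sites listed in the BLOCKS section using the standard `re` module. The names `MOTIF3_REGEX`, `MOTIF3_SITES` and `matching_sites` are hypothetical and are not part of the MEME output.

```python
import re

# Motif 3 regular expression, copied verbatim from the report.
MOTIF3_REGEX = re.compile(
    r"[CT][GC]T[CAG][GA][ACG]CAG[CG]TTGA[AT]G[CT][CT]GTC"
)

# Motif 3 sites from the BLOCKS section (sequence name -> site).
MOTIF3_SITES = {
    "16210": "CGTCGACAGGTTGATGTCGTC",
    "44307": "AGTCACCAGCTTGAAGCGGTC",
    "21350": "TGTAGCCAGGCAGAAGCCGGC",
    "48536": "CGTAGACATCTCGAAGCAGTG",
    "51519": "CCTCAGCAGCTTGGAGTTTTC",
    "45055": "CCTGAAGACGTTGATGCCCTC",
    "48558": "TGTGGGCAGATGGCAACTGTC",
}


def matching_sites(pattern, sites):
    """Return the names of sites that the pattern matches over their full length."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


matches = matching_sites(MOTIF3_REGEX, MOTIF3_SITES)
print(matches)
```

Most sites deviate from the consensus at one or more positions, so a scan that relies on the regular expression alone is stricter than the probabilistic model; the log-odds matrix is the better basis for scoring candidate sites.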
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21350                            1.16e-07  63_[+1(1.30e-05)]_1_[+3(2.85e-08)]_\
    353_[+2(1.08e-05)]_38
48558                            2.39e-08  58_[+3(1.45e-07)]_143_\
    [+1(3.57e-06)]_31_[+1(8.62e-05)]_104_[+2(1.37e-06)]_107
48799                            1.03e-03  172_[+1(1.57e-05)]_14_\
    [+2(2.94e-05)]_290
51519                            2.44e-09  [+1(2.42e-06)]_46_[+2(5.81e-07)]_\
    265_[+3(4.23e-08)]_144
16210                            1.17e-10  90_[+1(1.38e-06)]_137_\
    [+3(8.16e-11)]_44_[+2(2.03e-05)]_184
48536                            1.82e-08  157_[+2(2.57e-05)]_293_\
    [+3(3.92e-08)]_[+1(5.23e-07)]_5
46148                            2.47e-04  107_[+1(6.55e-05)]_133_\
    [+1(1.20e-05)]_121_[+2(9.03e-07)]_103
47453                            2.38e-03  88_[+2(9.76e-06)]_273_\
    [+1(1.79e-05)]_115
44307                            2.82e-09  168_[+1(6.43e-06)]_109_\
    [+2(1.76e-05)]_93_[+3(6.18e-10)]_85
45055                            1.25e-07  11_[+1(3.57e-06)]_76_[+3(1.13e-07)]_\
    31_[+2(1.08e-05)]_337
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
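Each combined block diagram alternates spacer lengths with motif occurrences, so the spacers plus the widths of the motifs shown should add up to the sequence length of 500 given in the training set table. The Python sketch below is one way to check that, assuming the `_\` line continuations have already been joined into a single string. The names `MOTIF_WIDTHS` and `diagram_length` are hypothetical; the widths themselves come from the three MOTIF header lines.

```python
import re

# Motif widths taken from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 12, 2: 12, 3: 21}

# Matches one motif token such as "[+3(2.85e-08)]" or "[+1]".
MOTIF_TOKEN = re.compile(r"\[[+-](\d+)(?:\([^)]*\))?\]")


def diagram_length(diagram, widths):
    """Add up spacer lengths and motif widths in a block diagram string."""
    total = 0
    for token in diagram.split("_"):
        if not token:
            continue
        match = MOTIF_TOKEN.fullmatch(token)
        if match:
            total += widths[int(match.group(1))]
        else:
            total += int(token)  # plain spacer length
    return total


# Three diagrams from the summary table, with continuation lines joined.
diagrams = {
    "21350": "63_[+1(1.30e-05)]_1_[+3(2.85e-08)]_353_[+2(1.08e-05)]_38",
    "48536": "157_[+2(2.57e-05)]_293_[+3(3.92e-08)]_[+1(5.23e-07)]_5",
    "51519": "[+1(2.42e-06)]_46_[+2(5.81e-07)]_265_[+3(4.23e-08)]_144",
}

lengths = {name: diagram_length(d, MOTIF_WIDTHS) for name, d in diagrams.items()}
print(lengths)
```

The diagram for 48536 shows two motif occurrences with no spacer between them, which is why the parser treats spacers as optional instead of expecting a strict spacer/motif alternation.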