********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/214/214.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47371                    1.0000    500  13931                    1.0000    500
47588                    1.0000    500  29212                    1.0000    500
22325                    1.0000    500  22332                    1.0000    500
54958                    1.0000    500  40163                    1.0000    500
45153                    1.0000    500  44072                    1.0000    500
50640                    1.0000    500  47665                    1.0000    500
47249                    1.0000    500  37758                    1.0000    500
48543                    1.0000    500  49915                    1.0000    500
49881                    1.0000    500  49422                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/214/214.seqs.fa -oc motifs/214 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 18 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 9000 N= 18 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.261 C 0.242 G 0.216 T 0.282 Background letter frequencies (from dataset with add-one prior applied): A 0.261 C 0.242 G 0.216 T 0.282 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 21 sites = 6 llr = 117 E-value = 2.2e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A ::8::::::37:72332:22: pos.-specific C ::::::a:5:::3837:::5: probability G 8a2:7a::552a::3:585:a matrix T 2::a3::a:22:::::3233: bits 2.2 * * * * 2.0 * ** * * 1.8 * * *** * * 1.5 ** * *** * * * Relative 1.3 **** *** * * * * Entropy 1.1 ********* *** * * * (28.2 bits) 0.9 ********* *** * * * 0.7 ************** **** * 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel GGATGGCTCGAGACACGGGCG consensus T GA C CAT TT sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- 
--------------------- 22332 384 3.26e-11 TGGATCATCA GGATGGCTCAAGACCCTGTCG AACTTGAACG 22325 384 3.26e-11 TGGATCATCA GGATGGCTCAAGACCCTGTCG AACTTGAACG 37758 241 3.87e-11 TTCTTGTGTT GGATTGCTGGAGCCGCGGGTG TGAGCTGTGC 50640 241 3.87e-11 TTCTTGTGTT GGATTGCTGGAGCCGCGGGTG TGAGCTGTGC 49915 409 1.14e-08 TCTCTATCGT GGGTGGCTCTGGACAAGTGAG TACACAGCAC 47588 450 1.33e-08 TGTTGATTGC TGATGGCTGGTGAAAAAGACG CGGACCCCAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 22332 3.3e-11 383_[+1]_96 22325 3.3e-11 383_[+1]_96 37758 3.9e-11 240_[+1]_239 50640 3.9e-11 240_[+1]_239 49915 1.1e-08 408_[+1]_71 47588 1.3e-08 449_[+1]_30 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=6 22332 ( 384) GGATGGCTCAAGACCCTGTCG 1 22325 ( 384) GGATGGCTCAAGACCCTGTCG 1 37758 ( 241) GGATTGCTGGAGCCGCGGGTG 1 50640 ( 241) GGATTGCTGGAGCCGCGGGTG 1 49915 ( 409) GGGTGGCTCTGGACAAGTGAG 1 47588 ( 450) TGATGGCTGGTGAAAAAGACG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 10.9386 E= 2.2e-002 -923 -923 195 -76 -923 -923 221 -923 167 -923 -37 -923 -923 -923 -923 183 -923 -923 163 24 -923 -923 221 -923 -923 205 -923 -923 -923 -923 -923 183 -923 105 121 -923 35 -923 121 -76 135 -923 -37 -76 -923 -923 221 
-923 135 46 -923 -923 -65 178 -923 -923 35 46 63 -923 35 146 -923 -923 -65 -923 121 24 -923 -923 195 -76 -65 -923 121 24 -65 105 -923 24 -923 -923 221 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.2e-002 0.000000 0.000000 0.833333 0.166667 0.000000 0.000000 1.000000 0.000000 0.833333 0.000000 0.166667 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.666667 0.333333 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.500000 0.500000 0.000000 0.333333 0.000000 0.500000 0.166667 0.666667 0.000000 0.166667 0.166667 0.000000 0.000000 1.000000 0.000000 0.666667 0.333333 0.000000 0.000000 0.166667 0.833333 0.000000 0.000000 0.333333 0.333333 0.333333 0.000000 0.333333 0.666667 0.000000 0.000000 0.166667 0.000000 0.500000 0.333333 0.000000 0.000000 0.833333 0.166667 0.166667 0.000000 0.500000 0.333333 0.166667 0.500000 0.000000 0.333333 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- GGAT[GT]GCT[CG][GA]AG[AC]C[ACG][CA][GT]G[GT][CT]G -------------------------------------------------------------------------------- Time 2.80 secs. 
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 7 llr = 106 E-value = 3.3e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :41:::1::9:a991: pos.-specific C :3::::39::::1::1 probability G a1:36a31a19::::: matrix T :1974:3:::1::199 bits 2.2 * * * 2.0 * * * * 1.8 * * * * 1.5 * * ** ** Relative 1.3 * * * ********* Entropy 1.1 * **** ********* (21.8 bits) 0.9 * **** ********* 0.7 * **** ********* 0.4 * **** ********* 0.2 ****** ********* 0.0 ---------------- Multilevel GATTGGCCGAGAAATT consensus C GT G sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 37758 147 6.45e-10 CGGATGCTCT GATTGGTCGAGAAATT GACATTCTTC 50640 147 6.45e-10 CGGATGCTCT GATTGGTCGAGAAATT GACATTCTTC 22332 189 4.82e-09 AGTAGAAGAA GCTGGGCCGAGAAATT CGTTTGGATT 22325 189 5.36e-08 AGTAGAAGAA GCTGGGCCGAGAATTT CGTTTGGATT 47249 172 2.70e-07 TACGGGAGCC GGTTTGGGGAGAAAAT ACGTGACTTC 29212 169 4.01e-07 TTAATTGCCT GTTTTGGCGATACATT GCACCGTCAC 44072 155 6.65e-07 CTGTGAATGT GAATTGACGGGAAATC CTTCTTGATA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- 
---------------- ------------- 37758 6.4e-10 146_[+2]_338 50640 6.4e-10 146_[+2]_338 22332 4.8e-09 188_[+2]_296 22325 5.4e-08 188_[+2]_296 47249 2.7e-07 171_[+2]_313 29212 4e-07 168_[+2]_316 44072 6.6e-07 154_[+2]_330 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=16 seqs=7 37758 ( 147) GATTGGTCGAGAAATT 1 50640 ( 147) GATTGGTCGAGAAATT 1 22332 ( 189) GCTGGGCCGAGAAATT 1 22325 ( 189) GCTGGGCCGAGAATTT 1 47249 ( 172) GGTTTGGGGAGAAAAT 1 29212 ( 169) GTTTTGGCGATACATT 1 44072 ( 155) GAATTGACGGGAAATC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 10.8894 E= 3.3e-001 -945 -945 221 -945 72 24 -59 -98 -87 -945 -945 160 -945 -945 40 134 -945 -945 140 60 -945 -945 221 -945 -87 24 40 2 -945 183 -59 -945 -945 -945 221 -945 171 -945 -59 -945 -945 -945 199 -98 194 -945 -945 -945 171 -76 -945 -945 171 -945 -945 -98 -87 -945 -945 160 -945 -76 -945 160 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 3.3e-001 0.000000 0.000000 1.000000 0.000000 0.428571 0.285714 0.142857 0.142857 0.142857 0.000000 0.000000 0.857143 0.000000 0.000000 0.285714 0.714286 0.000000 0.000000 0.571429 0.428571 0.000000 0.000000 1.000000 0.000000 0.142857 0.285714 0.285714 0.285714 0.000000 0.857143 
0.142857 0.000000 0.000000 0.000000 1.000000 0.000000 0.857143 0.000000 0.142857 0.000000 0.000000 0.000000 0.857143 0.142857 1.000000 0.000000 0.000000 0.000000 0.857143 0.142857 0.000000 0.000000 0.857143 0.000000 0.000000 0.142857 0.142857 0.000000 0.000000 0.857143 0.000000 0.142857 0.000000 0.857143 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- G[AC]T[TG][GT]G[CGT]CGAGAAATT -------------------------------------------------------------------------------- Time 5.66 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 8 llr = 133 E-value = 3.5e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::::13::1::31:::::91a pos.-specific C a9::534:::461::319:5: probability G ::9:13:3:3:1:1:33:1:: matrix T :11a3368986:89a561:4: bits 2.2 2.0 * * 1.8 * * * * 1.5 **** * * * Relative 1.3 **** * ** ** * Entropy 1.1 **** *** ** ** * (23.9 bits) 0.9 **** ***** *** ** * 0.7 **** ********* *** * 0.4 **** *************** 0.2 ***** *************** 0.0 --------------------- Multilevel CCGTCATTTTTCTTTTTCACA consensus TCCG GCA CG T sequence G G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- 
--------------------- 22332 468 5.12e-11 AATTGTGGCT CCGTCATTTTTCTTTCTCATA CTGGGAGCGC 22325 468 5.12e-11 AATTGTGACT CCGTCATTTTTCTTTCTCATA CTGGGAGCGC 37758 394 3.28e-10 ATACCAAACA CCGTCCTTTTCATTTTGCACA TAGTCATTTT 50640 394 3.28e-10 ATACCAAACA CCGTCCTTTTCATTTTGCACA TAGTCATTTT 48543 90 4.51e-08 CGTTTCATTT CCGTTGCGTGTCTTTTTCGAA CAAGCCGGTT 49881 152 1.58e-07 ATTCTCTTTT CCGTGTCTTTCCATTTCTATA GTAATGGAAG 49915 460 1.90e-07 GATTGGTTGT CTGTATCTATTCCTTGTCACA GTCAATACAG 47249 92 2.18e-07 GTAAAAATCC CCTTTGTGTGTGTGTGTCACA GGATTTTCGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 22332 5.1e-11 467_[+3]_12 22325 5.1e-11 467_[+3]_12 37758 3.3e-10 393_[+3]_86 50640 3.3e-10 393_[+3]_86 48543 4.5e-08 89_[+3]_390 49881 1.6e-07 151_[+3]_328 49915 1.9e-07 459_[+3]_20 47249 2.2e-07 91_[+3]_388 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=8 22332 ( 468) CCGTCATTTTTCTTTCTCATA 1 22325 ( 468) CCGTCATTTTTCTTTCTCATA 1 37758 ( 394) CCGTCCTTTTCATTTTGCACA 1 50640 ( 394) CCGTCCTTTTCATTTTGCACA 1 48543 ( 90) CCGTTGCGTGTCTTTTTCGAA 1 49881 ( 152) CCGTGTCTTTCCATTTCTATA 1 49915 ( 460) CTGTATCTATTCCTTGTCACA 1 47249 ( 92) CCTTTGTGTGTGTGTGTCACA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 
n= 8640 bayes= 10.0755 E= 3.5e-001 -965 205 -965 -965 -965 185 -965 -117 -965 -965 202 -117 -965 -965 -965 183 -106 105 -79 -17 -6 5 21 -17 -965 63 -965 115 -965 -965 21 141 -106 -965 -965 163 -965 -965 21 141 -965 63 -965 115 -6 137 -79 -965 -106 -95 -965 141 -965 -965 -79 163 -965 -965 -965 183 -965 5 21 83 -965 -95 21 115 -965 185 -965 -117 174 -965 -79 -965 -106 105 -965 41 194 -965 -965 -965 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 3.5e-001 0.000000 1.000000 0.000000 0.000000 0.000000 0.875000 0.000000 0.125000 0.000000 0.000000 0.875000 0.125000 0.000000 0.000000 0.000000 1.000000 0.125000 0.500000 0.125000 0.250000 0.250000 0.250000 0.250000 0.250000 0.000000 0.375000 0.000000 0.625000 0.000000 0.000000 0.250000 0.750000 0.125000 0.000000 0.000000 0.875000 0.000000 0.000000 0.250000 0.750000 0.000000 0.375000 0.000000 0.625000 0.250000 0.625000 0.125000 0.000000 0.125000 0.125000 0.000000 0.750000 0.000000 0.000000 0.125000 0.875000 0.000000 0.000000 0.000000 1.000000 0.000000 0.250000 0.250000 0.500000 0.000000 0.125000 0.250000 0.625000 0.000000 0.875000 0.000000 0.125000 0.875000 0.000000 0.125000 0.000000 0.125000 0.500000 0.000000 0.375000 1.000000 0.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- CCGT[CT][ACGT][TC][TG]T[TG][TC][CA]TTT[TCG][TG]CA[CT]A -------------------------------------------------------------------------------- Time 8.53 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
    Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47371                            7.21e-01  500
13931                            3.52e-01  500
47588                            9.69e-05  449_[+1(1.33e-08)]_30
29212                            3.75e-03  168_[+2(4.01e-07)]_316
22325                            1.11e-17  188_[+2(5.36e-08)]_179_\
    [+1(3.26e-11)]_63_[+3(5.12e-11)]_12
22332                            1.10e-18  188_[+2(4.82e-09)]_179_\
    [+1(3.26e-11)]_63_[+3(5.12e-11)]_12
54958                            6.01e-01  500
40163                            3.28e-01  500
45153                            4.57e-01  500
44072                            9.25e-04  154_[+2(6.65e-07)]_307_\
    [+3(8.86e-05)]_2
50640                            1.12e-18  146_[+2(6.45e-10)]_78_\
    [+1(3.87e-11)]_132_[+3(3.28e-10)]_86
47665                            5.15e-01  500
47249                            2.24e-06  91_[+3(2.18e-07)]_59_[+2(2.70e-07)]_\
    313
37758                            1.12e-18  146_[+2(6.45e-10)]_78_\
    [+1(3.87e-11)]_132_[+3(3.28e-10)]_86
48543                            2.31e-04  89_[+3(4.51e-08)]_390
49915                            9.49e-08  408_[+1(1.14e-08)]_30_\
    [+3(1.90e-07)]_20
49881                            8.51e-04  151_[+3(1.58e-07)]_328
49422                            4.07e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************