********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/33/33.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42843                    1.0000    500  24783                    1.0000    500
36846                    1.0000    500  54863                    1.0000    500
43614                    1.0000    500  49053                    1.0000    500
31447                    1.0000    500  43964                    1.0000    500
43563                    1.0000    500  34919                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/33/33.seqs.fa -oc motifs/33 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.237 G 0.224 T 0.266
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.237 G 0.224 T 0.266
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 9 llr = 99 E-value = 1.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3::a91394711
pos.-specific     C  2:a::91161:9
probability       G  4:::1:6::28:
matrix            T  :a::::::::1:

         bits    2.2   *
                 1.9  ***
                 1.7  ***
                 1.5  *****     *
Relative         1.3  ***** *   *
Entropy          1.1  ***** ** **
(15.8 bits)      0.9  ***** ** **
                 0.6  ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GTCAACGACAGC
consensus            A     A AG
sequence             C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
43964                        39  2.96e-07 ACTCCGTATG GTCAACAACAGC CGATGAAATC
31447                       114  3.39e-07 TCCATCTCTC GTCAACGACGGC TATCGCAAGA
24783                       259  5.50e-07 ACACCATGGT CTCAACGAAAGC GCCCGGATGA
34919                       173  8.95e-07 ACCTCGCAGT GTCAACAACGGC TCCAAAATTA
36846                       352  5.58e-06 CGTTGGTGGA ATCAAAGAAAGC AACAATCATC
43563                       488  6.96e-06 TTTGCAATTA GTCAACAAAAGA C
42843                       224  1.22e-05 AATGGTTGGC ATCAGCCACAGC TACTCTGTAG
49053                       400  1.64e-05 ATGTATCCGC CTCAACGACCAC AAGTACGTAC
43614                       275  2.35e-05 AGCGTGAAGT ATCAACGCAATC ATAAGTAGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43964                               3e-07  38_[+1]_450
31447                             3.4e-07  113_[+1]_375
24783                             5.5e-07  258_[+1]_230
34919                             8.9e-07  172_[+1]_316
36846                             5.6e-06  351_[+1]_137
43563                               7e-06  487_[+1]_1
42843                             1.2e-05  223_[+1]_265
49053                             1.6e-05  399_[+1]_89
43614                             2.4e-05  274_[+1]_214
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
43964                    (   39) GTCAACAACAGC  1
31447                    (  114) GTCAACGACGGC  1
24783                    (  259) CTCAACGAAAGC  1
34919                    (  173) GTCAACAACGGC  1
36846                    (  352) ATCAAAGAAAGC  1
43563                    (  488) GTCAACAAAAGA  1
42843                    (  224) ATCAGCCACAGC  1
49053                    (  400) CTCAACGACCAC  1
43614                    (  275) ATCAACGCAATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 9.21757 E= 1.4e+001
    29     -9     99   -982
  -982   -982   -982    191
  -982    207   -982   -982
   187   -982   -982   -982
   170   -982   -101   -982
  -129    190   -982   -982
    29   -109    131   -982
   170   -109   -982   -982
    70    123   -982   -982
   129   -109     -1   -982
  -129   -982    179   -126
  -129    190   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 1.4e+001
 0.333333  0.222222  0.444444  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.111111  0.888889  0.000000  0.000000
 0.333333  0.111111  0.555556  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.444444  0.555556  0.000000  0.000000
 0.666667  0.111111  0.222222  0.000000
 0.111111  0.000000  0.777778  0.111111
 0.111111  0.888889  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GAC]TCAAC[GA]A[CA][AG]GC
--------------------------------------------------------------------------------

Time  0.90 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 14 sites = 7 llr = 89 E-value = 3.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1::1:::4:::a66
pos.-specific     C  :1:1::4:6:::11
probability       G  :9:3a:441aa:33
matrix            T  9:a4:a113:::::

         bits    2.2     *    **
                 1.9   * **   ***
                 1.7   * **   ***
                 1.5  ** **   ***
Relative         1.3 *** **   ***
Entropy          1.1 *** **   ***
(18.3 bits)      0.9 *** **   ***
                 0.6 *** **********
                 0.4 *** **********
                 0.2 **************
                 0.0 --------------

Multilevel           TGTTGTCACGGAAA
consensus               G  GGT   GG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
43964                       408  2.57e-08 TGGCATGGTC TGTGGTGACGGAAA ATGTGTCTCT
49053                       160  3.40e-07 ACTTCACGAG TGTTGTTGCGGAAG GGGAACTATG
24783                       103  4.78e-07 CTCTGAAAAA TCTTGTCACGGAAA ACATATTACA
43614                       154  5.61e-07 AGGAGTCAGA TGTGGTGGTGGAGG ATGAGGAAGA
43563                       118  6.91e-07 CGCTGCAGTG TGTAGTCTCGGAAA ATGAGCCAAA
36846                        22  2.51e-06 TCACAAGCCG TGTCGTGAGGGACA CGAACACCGA
54863                        34  4.17e-06 TGTTTGTTTA AGTTGTCGTGGAGC TGCTCATGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43964                             2.6e-08  407_[+2]_79
49053                             3.4e-07  159_[+2]_327
24783                             4.8e-07  102_[+2]_384
43614                             5.6e-07  153_[+2]_333
43563                             6.9e-07  117_[+2]_369
36846                             2.5e-06  21_[+2]_465
54863                             4.2e-06  33_[+2]_453
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=7
43964                    (  408) TGTGGTGACGGAAA  1
49053                    (  160) TGTTGTTGCGGAAG  1
24783                    (  103) TCTTGTCACGGAAA  1
43614                    (  154) TGTGGTGGTGGAGG  1
43563                    (  118) TGTAGTCTCGGAAA  1
36846                    (   22) TGTCGTGAGGGACA  1
54863                    (   34) AGTTGTCGTGGAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 4870 bayes= 9.28392 E= 3.2e+001
   -93   -945   -945    169
  -945    -73    193   -945
  -945   -945   -945    191
   -93    -73     35     69
  -945   -945    216   -945
  -945   -945   -945    191
  -945     85     93    -89
    65   -945     93    -89
  -945    127    -65     10
  -945   -945    216   -945
  -945   -945    216   -945
   187   -945   -945   -945
   107    -73     35   -945
   107    -73     35   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 7 E= 3.2e+001
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.142857  0.285714  0.428571
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.428571  0.428571  0.142857
 0.428571  0.000000  0.428571  0.142857
 0.000000  0.571429  0.142857  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.571429  0.142857  0.285714  0.000000
 0.571429  0.142857  0.285714  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TGT[TG]GT[CG][AG][CT]GGA[AG][AG]
--------------------------------------------------------------------------------

Time  1.85 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 10 llr = 98 E-value = 3.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :94213:3:::a
pos.-specific     C  3:3::49:::::
probability       G  611:92::928:
matrix            T  1:28:117182:

         bits    2.2
                 1.9            *
                 1.7     *   *  *
                 1.5  *  * * *  *
Relative         1.3  *  * * ****
Entropy          1.1  * ** ******
(14.2 bits)      0.9 ** ** ******
                 0.6 ** ** ******
                 0.4 ** ** ******
                 0.2 ************
                 0.0 ------------

Multilevel           GAATGCCTGTGA
consensus            C CA A A GT
sequence               T  G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
49053                       209  1.01e-07 GCAAGAATGT GACTGCCTGTGA GGGCGTCTCG
42843                       161  2.34e-06 ACGTTGTGAG CATTGACTGTGA GTTAGGTTGT
31447                       239  4.09e-06 CAAACAGAAT CACTGACAGTGA ACTCCTTACC
36846                       293  4.09e-06 CATCCGACAC GACTGCCAGGGA TATCCCGGTT
43563                       197  9.06e-06 AAAACAATAG GAAAGCCTGTTA CCCTATTGCC
24783                       293  1.88e-05 ATTGAGAATC TAGTGGCTGTGA GATCAGACAC
54863                        90  3.35e-05 AAAATTAAAT CAATGATAGTGA TCCATCCCTA
34919                       287  3.58e-05 GAGAAGGTAA GAAAAGCTGTGA TAATGCAAAT
43964                        18  3.84e-05 GCACCTTCAT GATTGCCTTTTA CTCCGTATGG
43614                       235  4.35e-05 GTGGTGGAAC GGATGTCTGGGA GGTTTAGCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49053                               1e-07  208_[+3]_280
42843                             2.3e-06  160_[+3]_328
31447                             4.1e-06  238_[+3]_250
36846                             4.1e-06  292_[+3]_196
43563                             9.1e-06  196_[+3]_292
24783                             1.9e-05  292_[+3]_196
54863                             3.3e-05  89_[+3]_399
34919                             3.6e-05  286_[+3]_202
43964                             3.8e-05  17_[+3]_471
43614                             4.4e-05  234_[+3]_254
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
49053                    (  209) GACTGCCTGTGA  1
42843                    (  161) CATTGACTGTGA  1
31447                    (  239) CACTGACAGTGA  1
36846                    (  293) GACTGCCAGGGA  1
43563                    (  197) GAAAGCCTGTTA  1
24783                    (  293) TAGTGGCTGTGA  1
54863                    (   90) CAATGATAGTGA  1
34919                    (  287) GAAAAGCTGTGA  1
43964                    (   18) GATTGCCTTTTA  1
43614                    (  235) GGATGTCTGGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 8.93074 E= 3.2e+002
  -997     34    142   -141
   172   -997   -116   -997
    55     34   -116    -41
   -45   -997   -997    159
  -145   -997    200   -997
    14     75    -16   -141
  -997    192   -997   -141
    14   -997   -997    140
  -997   -997    200   -141
  -997   -997    -16    159
  -997   -997    183    -41
   187   -997   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 3.2e+002
 0.000000  0.300000  0.600000  0.100000
 0.900000  0.000000  0.100000  0.000000
 0.400000  0.300000  0.100000  0.200000
 0.200000  0.000000  0.000000  0.800000
 0.100000  0.000000  0.900000  0.000000
 0.300000  0.400000  0.200000  0.100000
 0.000000  0.900000  0.000000  0.100000
 0.300000  0.000000  0.000000  0.700000
 0.000000  0.000000  0.900000  0.100000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  0.800000  0.200000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC]A[ACT][TA]G[CAG]C[TA]G[TG][GT]A
--------------------------------------------------------------------------------

Time  2.85 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42843                            3.05e-04  160_[+3(2.34e-06)]_51_\
    [+1(1.22e-05)]_265
24783                            1.42e-07  102_[+2(4.78e-07)]_142_\
    [+1(5.50e-07)]_22_[+3(1.88e-05)]_196
36846                            1.31e-06  21_[+2(2.51e-06)]_257_\
    [+3(4.09e-06)]_47_[+1(5.58e-06)]_137
54863                            1.33e-03  33_[+2(4.17e-06)]_42_[+3(3.35e-05)]_\
    399
43614                            1.01e-05  153_[+2(5.61e-07)]_53_\
    [+2(9.46e-05)]_[+3(4.35e-05)]_28_[+1(2.35e-05)]_214
49053                            1.96e-08  159_[+2(3.40e-07)]_35_\
    [+3(1.01e-07)]_179_[+1(1.64e-05)]_89
31447                            2.21e-05  113_[+1(3.39e-07)]_113_\
    [+3(4.09e-06)]_250
43964                            1.06e-08  17_[+3(3.84e-05)]_9_[+1(2.96e-07)]_\
    357_[+2(2.57e-08)]_79
43563                            1.02e-06  117_[+2(6.91e-07)]_65_\
    [+3(9.06e-06)]_279_[+1(6.96e-06)]_1
34919                            9.35e-05  172_[+1(8.95e-07)]_102_\
    [+3(3.58e-05)]_202
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
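
The regular expressions reported for each motif are a compact summary, not an exact description of the sites MEME selected. As a rough illustration, the Python sketch below tests the Motif 1 expression against the nine Motif 1 sites listed in the BLOCKS section of this report. The names `MOTIF1_REGEX`, `MOTIF1_SITES`, and `sites_matching` are hypothetical helpers introduced only for this example; they are not part of MEME or MAST.

```python
import re

# Regular expression reported for Motif 1 in this report.
MOTIF1_REGEX = re.compile(r"[GAC]TCAAC[GA]A[CA][AG]GC")

# The nine Motif 1 sites, copied from the "Motif 1 in BLOCKS format" section
# (sequence name -> site). Hypothetical container, used only for illustration.
MOTIF1_SITES = {
    "43964": "GTCAACAACAGC",
    "31447": "GTCAACGACGGC",
    "24783": "CTCAACGAAAGC",
    "34919": "GTCAACAACGGC",
    "36846": "ATCAAAGAAAGC",
    "43563": "GTCAACAAAAGA",
    "42843": "ATCAGCCACAGC",
    "49053": "CTCAACGACCAC",
    "43614": "ATCAACGCAATC",
}


def sites_matching(pattern, sites):
    """Return the names of the sequences whose whole site matches the pattern."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


matched = sites_matching(MOTIF1_REGEX, MOTIF1_SITES)
print(f"{len(matched)} of {len(MOTIF1_SITES)} sites match: {matched}")
# prints: 4 of 9 sites match: ['43964', '31447', '24783', '34919']
```

Only four of the nine sites match. In this report the expression leaves out every letter that occurs in a single site at a given position (for example the A at position 6 of the site in 36846), so a plain regex search with it would likely miss sites that a search with the position-specific scoring matrix can still score.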