********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/77/77.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46983                    1.0000    500  47273                    1.0000    500
47670                    1.0000    500  22145                    1.0000    500
38445                    1.0000    500  52343                    1.0000    500
52412                    1.0000    500  22609                    1.0000    500
49418                    1.0000    500  49791                    1.0000    500
50162                    1.0000    500  50236                    1.0000    500
23920                    1.0000    500  50413                    1.0000    500
50417                    1.0000    500  25752                    1.0000    500
50446                    1.0000    500  47860                    1.0000    500
38433                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/77/77.seqs.fa -oc motifs/77 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.251 C 0.266 G 0.227 T 0.255
Background letter frequencies (from dataset with add-one prior applied):
A 0.251 C 0.266 G 0.227 T 0.255
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  16  sites =  18  llr = 192  E-value = 5.4e-004
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  65324a182619:646
pos.-specific     C  :1745:9:724:a133
probability       G  3::31::22::::1:1
matrix            T  14:1:::::251:23:

         bits    2.1
                 1.9      *      *
                 1.7      **    **
                 1.5      **    **
Relative         1.3      ***   **
Entropy          1.1   *  ***   **
(15.4 bits)      0.9   *  ***   **  *
                 0.6 *** *********  *
                 0.4 *** ************
                 0.2 ****************
                 0.0 ----------------

Multilevel           AACCCACACATACAAA
consensus            GTAGA    CC  TTC
sequence                A          C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ----------------
25752                       196  5.81e-09 CGTACACAGG AACGCACACATACATA CATACATACA
38445                       468  1.26e-07 CCGATAGATT GACACACACATACACA TGTCCGAACA
50236                       421  2.51e-07 ACGCCTACAC ATAACACACACACACA CGGAAAGCAC
50162                       298  2.96e-07 CGCCAAATCA AAAGAACACCTACAAA CACCATGCTA
50417                       388  1.34e-06 TCCGACTCTT ATCCAACACACACCTC TGGTGTGGGT
50413                       396  1.34e-06 TCCGACTCTT ATCCAACACACACCTC TGGTGTGGGT
22145                       119  1.34e-06 CGTGCGGCCA AACTCACACTCACATA TACATAACAT
47670                       475  1.69e-06 CGTGCAGCCA TACTCACACATACATA ACATTTGACC
52343                       443  3.56e-06 ATCAGCGGAT AAAGAACGCATACTCA ATCATGTCTG
47860                       263  4.75e-06 ACATTGTCGA GTCGAACGCAAACAAA GAGTGCCTTT
46983                       374  4.75e-06 GAGTCACACG AACCAACAACAACAAA TCGTTGTCCT
49418                        14  7.40e-06 GCCAGCCTTC GCCACACACACACATC AGTGCAGTTT
49791                       424  1.38e-05 AAACGGATTC GAACGACACTTACAAC ATACACACTC
47273                       224  1.60e-05 CCTACCCCTT TTACCACAGCTACAAC AAAGCGTGGC
52412                       203  2.70e-05 CGGTGGTTCA GTCACACAATTACGAA CCGGATGAAT
23920                       305  3.61e-05 TTCGTGAACC AAACAAAAAACACTCA GTCCGTCAGA
50446                       147  4.25e-05 GTCGTCATGA ATCCAACGGACACTCG TGCCTGTTTT
22609                       404  4.97e-05 TATTTGCACC ATCGCACAGCTTCTAC TACTACTCTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25752                             5.8e-09  195_[+1]_289
38445                             1.3e-07  467_[+1]_17
50236                             2.5e-07  420_[+1]_64
50162                               3e-07  297_[+1]_187
50417                             1.3e-06  387_[+1]_97
50413                             1.3e-06  395_[+1]_89
22145                             1.3e-06  118_[+1]_366
47670                             1.7e-06  474_[+1]_10
52343                             3.6e-06  442_[+1]_42
47860                             4.7e-06  262_[+1]_222
46983                             4.7e-06  373_[+1]_111
49418                             7.4e-06  13_[+1]_471
49791                             1.4e-05  423_[+1]_61
47273                             1.6e-05  223_[+1]_261
52412                             2.7e-05  202_[+1]_282
23920                             3.6e-05  304_[+1]_180
50446                             4.3e-05  146_[+1]_338
22609                               5e-05  403_[+1]_81
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=18
25752                    (  196) AACGCACACATACATA  1
38445                    (  468) GACACACACATACACA  1
50236                    (  421) ATAACACACACACACA  1
50162                    (  298) AAAGAACACCTACAAA  1
50417                    (  388) ATCCAACACACACCTC  1
50413                    (  396) ATCCAACACACACCTC  1
22145                    (  119) AACTCACACTCACATA  1
47670                    (  475) TACTCACACATACATA  1
52343                    (  443) AAAGAACGCATACTCA  1
47860                    (  263) GTCGAACGCAAACAAA  1
46983                    (  374) AACCAACAACAACAAA  1
49418                    (   14) GCCACACACACACATC  1
49791                    (  424) GAACGACACTTACAAC  1
47273                    (  224) TTACCACAGCTACAAC  1
52412                    (  203) GTCACACAATTACGAA  1
23920                    (  305) AAACAAAAAACACTCA  1
50446                    (  147) ATCCAACGGACACTCG  1
22609                    (  404) ATCGCACAGCTTCTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9215 bayes= 9.13157 E= 5.4e-004
   128  -1081     29   -120
    99   -226  -1081     80
    41    133  -1081  -1081
   -18     55     29   -120
    82     91   -203  -1081
   199  -1081  -1081  -1081
  -218    183  -1081  -1081
   173  -1081    -45  -1081
   -59    133    -45  -1081
   128    -26  -1081    -62
  -118     55  -1081     97
   191  -1081  -1081   -220
 -1081    191  -1081  -1081
   128   -126   -203    -20
    63      6  -1081     38
   128     33   -203  -1081
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 18 E= 5.4e-004
 0.611111  0.000000  0.277778  0.111111
 0.500000  0.055556  0.000000  0.444444
 0.333333  0.666667  0.000000  0.000000
 0.222222  0.388889  0.277778  0.111111
 0.444444  0.500000  0.055556  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.055556  0.944444  0.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.166667  0.666667  0.166667  0.000000
 0.611111  0.222222  0.000000  0.166667
 0.111111  0.388889  0.000000  0.500000
 0.944444  0.000000  0.000000  0.055556
 0.000000  1.000000  0.000000  0.000000
 0.611111  0.111111  0.055556  0.222222
 0.388889  0.277778  0.000000  0.333333
 0.611111  0.333333  0.055556  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[AG][AT][CA][CGA][CA]ACAC[AC][TC]AC[AT][ATC][AC]
--------------------------------------------------------------------------------




Time  3.28 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  14  sites =   9  llr = 123  E-value = 2.0e-002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  2::61::7::391:
pos.-specific     C  7a:46a:21a:1::
probability       G  1:1:1:::9:::9:
matrix            T  ::9:2:a1::7::a

         bits    2.1
                 1.9  *   **  *   *
                 1.7  *   ** **  **
                 1.5  **  ** ** ***
Relative         1.3  **  ** ** ***
Entropy          1.1  *** ** ******
(19.7 bits)      0.9  *** ** ******
                 0.6 **** *********
                 0.4 **** *********
                 0.2 **************
                 0.0 --------------

Multilevel           CCTACCTAGCTAGT
consensus            A  CT  C  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            --------------
49791                       235  4.59e-09 GAGTTAGGTA CCTACCTAGCTAGT CACGCAATCT
47273                       264  3.21e-08 TCGCAAGCTC CCTCTCTAGCTAGT AACCCTGACC
50417                       228  3.70e-08 AGACGCAACT CCTACCTCGCTAGT TGTTAGAACG
50413                       236  3.70e-08 AGACGCAACT CCTACCTCGCTAGT TGTTAAAACG
38445                       255  2.09e-07 TGTCTCTCTC GCTATCTAGCTAGT AGTACTAGTG
50236                        63  4.59e-07 CATGAACCAG ACTACCTACCTAGT TTAGGCTGCA
22145                        20  8.01e-07 ATGCACTCAA ACGCCCTAGCAAGT GGTGCAATAC
52412                       374  1.22e-06 CATGCTGTAA CCTCGCTAGCAAAT GGCTATCTCG
22609                       438  3.48e-06 TGGAATCTTT CCTCACTTGCACGT CCACTGGTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49791                             4.6e-09  234_[+2]_252
47273                             3.2e-08  263_[+2]_223
50417                             3.7e-08  227_[+2]_259
50413                             3.7e-08  235_[+2]_251
38445                             2.1e-07  254_[+2]_232
50236                             4.6e-07  62_[+2]_424
22145                               8e-07  19_[+2]_467
52412                             1.2e-06  373_[+2]_113
22609                             3.5e-06  437_[+2]_49
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=9
49791                    (  235) CCTACCTAGCTAGT  1
47273                    (  264) CCTCTCTAGCTAGT  1
50417                    (  228) CCTACCTCGCTAGT  1
50413                    (  236) CCTACCTCGCTAGT  1
38445                    (  255) GCTATCTAGCTAGT  1
50236                    (   63) ACTACCTACCTAGT  1
22145                    (   20) ACGCCCTAGCAAGT  1
52412                    (  374) CCTCGCTAGCAAAT  1
22609                    (  438) CCTCACTTGCACGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 9253 bayes= 10.1388 E= 2.0e-002
   -18    133   -103   -982
  -982    191   -982   -982
  -982   -982   -103    180
   114     74   -982   -982
  -118    106   -103    -20
  -982    191   -982   -982
  -982   -982   -982    197
   141    -26   -982   -120
  -982   -126    197   -982
  -982    191   -982   -982
    41   -982   -982    138
   182   -126   -982   -982
  -118   -982    197   -982
  -982   -982   -982    197
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 9 E= 2.0e-002
 0.222222  0.666667  0.111111  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.111111  0.888889
 0.555556  0.444444  0.000000  0.000000
 0.111111  0.555556  0.111111  0.222222
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.666667  0.222222  0.000000  0.111111
 0.000000  0.111111  0.888889  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.888889  0.111111  0.000000  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[CA]CT[AC][CT]CT[AC]GC[TA]AGT
--------------------------------------------------------------------------------




Time  6.11 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  16  sites =   9  llr = 124  E-value = 3.8e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  61:::::::3::::8:
pos.-specific     C  :6::2:::1214:42:
probability       G  4:a482a19191a1:6
matrix            T  :3:6:8:9:3:4:4:4

         bits    2.1   *   *     *
                 1.9   *   *     *
                 1.7   *   * * * *
                 1.5   *   *** * *
Relative         1.3   * ***** * * *
Entropy          1.1 * ******* * * **
(19.9 bits)      0.9 * ******* * * **
                 0.6 ********* ******
                 0.4 ********* ******
                 0.2 ********* ******
                 0.0 ----------------

Multilevel           ACGTGTGTGAGCGCAG
consensus            GT GCG   T T TCT
sequence                      C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ----------------
50417                        22  6.43e-09 GAACAAAACA ACGTGTGTGTGCGTAT TGTCGGGGGC
50413                        41  6.43e-09 GAACAAAACA ACGTGTGTGTGCGTAT TGTCGGGGCG
25752                        57  2.10e-08 GTGTATATGT GTGTGTGTGTGTGTAT GCACGCGCTC
38445                       415  2.10e-08 GGGCTCCGTT GCGTGTGTGCGTGCAT CTCTCAAACG
50446                       356  4.73e-07 CGAAAGCAAG ACGTGGGTGGGTGCCG TTCAGTGAAG
50236                       325  6.60e-07 CCACGAACAC GCGGCTGTCAGCGCAG CATTGCAATT
47273                        59  8.07e-07 AAGTGGTACA AAGGCGGTGAGCGTAG TGCTACTACT
23920                        15  1.02e-06 TTTAGGTCCG GTGGGTGGGCGGGCAG AAAGAGCGCT
38433                       407  1.68e-06 AAAGGCCCAA ATGGGTGTGACTGGCG AATGCTTGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50417                             6.4e-09  21_[+3]_463
50413                             6.4e-09  40_[+3]_444
25752                             2.1e-08  56_[+3]_428
38445                             2.1e-08  414_[+3]_70
50446                             4.7e-07  355_[+3]_129
50236                             6.6e-07  324_[+3]_160
47273                             8.1e-07  58_[+3]_426
23920                               1e-06  14_[+3]_470
38433                             1.7e-06  406_[+3]_78
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=9
50417                    (   22) ACGTGTGTGTGCGTAT  1
50413                    (   41) ACGTGTGTGTGCGTAT  1
25752                    (   57) GTGTGTGTGTGTGTAT  1
38445                    (  415) GCGTGTGTGCGTGCAT  1
50446                    (  356) ACGTGGGTGGGTGCCG  1
50236                    (  325) GCGGCTGTCAGCGCAG  1
47273                    (   59) AAGGCGGTGAGCGTAG  1
23920                    (   15) GTGGGTGGGCGGGCAG  1
38433                    (  407) ATGGGTGTGACTGGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9215 bayes= 10.8471 E= 3.8e-001
   114   -982     97   -982
  -118    106   -982     38
  -982   -982    214   -982
  -982   -982     97    112
  -982    -26    177   -982
  -982   -982     -3    161
  -982   -982    214   -982
  -982   -982   -103    180
  -982   -126    197   -982
    41    -26   -103     38
  -982   -126    197   -982
  -982     74   -103     80
  -982   -982    214   -982
  -982     74   -103     80
   163    -26   -982   -982
  -982   -982    129     80
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 3.8e-001
 0.555556  0.000000  0.444444  0.000000
 0.111111  0.555556  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.444444  0.555556
 0.000000  0.222222  0.777778  0.000000
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.111111  0.888889  0.000000
 0.333333  0.222222  0.111111  0.333333
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.444444  0.111111  0.444444
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.444444  0.111111  0.444444
 0.777778  0.222222  0.000000  0.000000
 0.000000  0.000000  0.555556  0.444444
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[AG][CT]G[TG][GC][TG]GTG[ATC]G[CT]G[CT][AC][GT]
--------------------------------------------------------------------------------




Time  9.24 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46983                            3.87e-02  304_[+1(2.09e-05)]_53_\
    [+1(4.75e-06)]_111
47273                            1.45e-08  38_[+2(9.22e-05)]_6_[+3(8.07e-07)]_\
    149_[+1(1.60e-05)]_24_[+2(3.21e-08)]_223
47670                            3.98e-03  474_[+1(1.69e-06)]_10
22145                            2.54e-05  19_[+2(8.01e-07)]_49_[+1(3.04e-05)]_\
    20_[+1(1.34e-06)]_366
38445                            3.11e-11  254_[+2(2.09e-07)]_146_\
    [+3(2.10e-08)]_37_[+1(1.26e-07)]_17
52343                            1.06e-02  442_[+1(3.56e-06)]_42
52412                            3.08e-04  202_[+1(2.70e-05)]_155_\
    [+2(1.22e-06)]_113
22609                            2.45e-03  403_[+1(4.97e-05)]_18_\
    [+2(3.48e-06)]_49
49418                            7.84e-02  13_[+1(7.40e-06)]_471
49791                            1.77e-06  234_[+2(4.59e-09)]_175_\
    [+1(1.38e-05)]_61
50162                            6.04e-03  297_[+1(2.96e-07)]_187
50236                            3.05e-09  62_[+2(4.59e-07)]_248_\
    [+3(6.60e-07)]_80_[+1(2.51e-07)]_64
23920                            4.22e-04  14_[+3(1.02e-06)]_274_\
    [+1(3.61e-05)]_180
50413                            1.86e-11  40_[+3(6.43e-09)]_179_\
    [+2(3.70e-08)]_146_[+1(1.34e-06)]_16_[+3(6.50e-05)]_57
50417                            1.86e-11  21_[+3(6.43e-09)]_190_\
    [+2(3.70e-08)]_146_[+1(1.34e-06)]_16_[+3(6.50e-05)]_65
25752                            7.28e-09  56_[+3(2.10e-08)]_123_\
    [+1(5.81e-09)]_289
50446                            1.70e-04  146_[+1(4.25e-05)]_193_\
    [+3(4.73e-07)]_129
47860                            3.47e-02  262_[+1(4.75e-06)]_222
38433                            1.55e-02  406_[+3(1.68e-06)]_78
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************