********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/103/103.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42457                    1.0000    500  43058                    1.0000    500
47192                    1.0000    500  21519                    1.0000    500
47962                    1.0000    500  49544                    1.0000    500
43972                    1.0000    500  44399                    1.0000    500
26173                    1.0000    500  54323                    1.0000    500
12028                    1.0000    500  42702                    1.0000    500
48657                    1.0000    500  45777                    1.0000    500
50451                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/103/103.seqs.fa -oc motifs/103 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.275 C 0.242 G 0.223 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.275 C 0.242 G 0.223 T 0.260
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  15  llr = 148  E-value = 1.6e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  31418::::367
pos.-specific     C  3:69192:::3:
probability       G  14::118:a5:3
matrix            T  45:::::a:21:

         bits    2.2         *
                 1.9        **
                 1.7    * * **
                 1.5    * ****
Relative         1.3    * ****
Entropy          1.1   *******  *
(14.2 bits)      0.9   *******  *
                 0.6  ***********
                 0.4  ***********
                 0.2 ************
                 0.0 ------------

Multilevel           TTCCACGTGGAA
consensus            AGA   C  ACG
sequence             C        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
42457                       338  3.17e-07 GACGTGCGTT TTCCACGTGGCA AGGTGAGCAG
12028                       121  1.29e-06 ACCGATCGGA TGACACGTGGCA TTCAAATCCT
50451                       173  1.65e-06 TGATTACGCA CTCCACGTGAAA CTCCCGTGCC
21519                       150  1.65e-06 GTCGCCCCAG ATCCACGTGGAG AAACTCACAA
54323                       320  2.03e-06 AGACGATTAC TTCCACGTGACA CGGCCATGAC
47962                        69  3.46e-06 TGGACCCCAA TGCCGCGTGGAA GTCAACGTTC
45777                       338  4.05e-06 AGTCAACCTT CTCCACCTGGAA TAAAAAAATT
48657                        21  1.11e-05 GTGCGATACA AGACGCGTGGAA CCTTTGGAGA
42702                       341  1.29e-05 TCCATGGGAA CGACACCTGGCA GTTGACTGTG
44399                       245  1.40e-05 GCCACAAGCT AGCCACGTGTCG ACCAGCACCG
43058                       365  1.69e-05 CCGCAGAATC AAACACGTGGAA GGAAATGCGA
47192                       151  4.92e-05 CTTACTTTAC CTCCACGTGTTG ATGGCCTCGC
26173                       118  5.87e-05 CTGTTCGGAT TGACCCGTGAAG GTCGTAATCG
49544                       202  7.64e-05 GACAATGTAG TTCCAGCTGTAA GTCACAATCG
43972                        39  1.12e-04 AGATTTGATA GTAAACGTGAAA ATGGTTGGGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42457                             3.2e-07  337_[+1]_151
12028                             1.3e-06  120_[+1]_368
50451                             1.6e-06  172_[+1]_316
21519                             1.6e-06  149_[+1]_339
54323                               2e-06  319_[+1]_169
47962                             3.5e-06  68_[+1]_420
45777                               4e-06  337_[+1]_151
48657                             1.1e-05  20_[+1]_468
42702                             1.3e-05  340_[+1]_148
44399                             1.4e-05  244_[+1]_244
43058                             1.7e-05  364_[+1]_124
47192                             4.9e-05  150_[+1]_338
26173                             5.9e-05  117_[+1]_371
49544                             7.6e-05  201_[+1]_287
43972                             0.00011  38_[+1]_450
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=15
42457                    (  338) TTCCACGTGGCA  1
12028                    (  121) TGACACGTGGCA  1
50451                    (  173) CTCCACGTGAAA  1
21519                    (  150) ATCCACGTGGAG  1
54323                    (  320) TTCCACGTGACA  1
47962                    (   69) TGCCGCGTGGAA  1
45777                    (  338) CTCCACCTGGAA  1
48657                    (   21) AGACGCGTGGAA  1
42702                    (  341) CGACACCTGGCA  1
44399                    (  245) AGCCACGTGTCG  1
43058                    (  365) AAACACGTGGAA  1
47192                    (  151) CTCCACGTGTTG  1
26173                    (  118) TGACCCGTGAAG  1
49544                    (  202) TTCCAGCTGTAA  1
43972                    (   39) GTAAACGTGAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 9.60607 E= 1.6e-003
    -4     14   -174     62
  -204  -1055     84    104
    54    131  -1055  -1055
  -204    194  -1055  -1055
   154   -186    -74  -1055
 -1055    194   -174  -1055
 -1055    -28    184  -1055
 -1055  -1055  -1055    194
 -1055  -1055    216  -1055
    -4  -1055    126    -38
   113     46  -1055   -196
   142  -1055     26  -1055
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 15 E= 1.6e-003
 0.266667  0.266667  0.066667  0.400000
 0.066667  0.000000  0.400000  0.533333
 0.400000  0.600000  0.000000  0.000000
 0.066667  0.933333  0.000000  0.000000
 0.800000  0.066667  0.133333  0.000000
 0.000000  0.933333  0.066667  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.266667  0.000000  0.533333  0.200000
 0.600000  0.333333  0.000000  0.066667
 0.733333  0.000000  0.266667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TAC][TG][CA]CAC[GC]TG[GAT][AC][AG]
--------------------------------------------------------------------------------

Time  2.32 secs.

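The regular expression reported for Motif 1 is a simplification of the probability matrix: it seems to keep only letters whose probability at a position is roughly 0.2 or higher, so it is stricter than the set of sites MEME actually reported. The following Python sketch illustrates this with the standard `re` module. The site strings are copied from the BLOCKS listing; the helper names `sites_matching` and `scan` are hypothetical and not part of MEME.

```python
import re

# Regular expression reported for Motif 1.
MOTIF1_REGEX = re.compile(r"[TAC][TG][CA]CAC[GC]TG[GAT][AC][AG]")

# The 15 Motif 1 sites, keyed by sequence name (from the BLOCKS listing).
MOTIF1_SITES = {
    "42457": "TTCCACGTGGCA", "12028": "TGACACGTGGCA", "50451": "CTCCACGTGAAA",
    "21519": "ATCCACGTGGAG", "54323": "TTCCACGTGACA", "47962": "TGCCGCGTGGAA",
    "45777": "CTCCACCTGGAA", "48657": "AGACGCGTGGAA", "42702": "CGACACCTGGCA",
    "44399": "AGCCACGTGTCG", "43058": "AAACACGTGGAA", "47192": "CTCCACGTGTTG",
    "26173": "TGACCCGTGAAG", "49544": "TTCCAGCTGTAA", "43972": "GTAAACGTGAAA",
}


def sites_matching(pattern, sites):
    """Return the names of sites that the pattern matches over their full length."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


def scan(pattern, sequence):
    """Return 1-based start positions of all (possibly overlapping) matches."""
    lookahead = re.compile("(?=" + pattern.pattern + ")")
    return [m.start() + 1 for m in lookahead.finditer(sequence)]


if __name__ == "__main__":
    matched = sites_matching(MOTIF1_REGEX, MOTIF1_SITES)
    print(len(matched), "of", len(MOTIF1_SITES), "reported sites match the regex")
    # Site from sequence 42457 with its ten flanking bases on either side.
    print(scan(MOTIF1_REGEX, "GACGTGCGTT" + "TTCCACGTGGCA" + "AGGTGAGCAG"))
```

Only 8 of the 15 reported sites satisfy the expression; sites such as `GTAAACGTGAAA` (sequence 43972) carry a low-probability letter at some position. For sensitive searches, the scoring matrix (for example through MAST) is therefore a better choice than the regular expression.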
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  11  llr = 124  E-value = 6.1e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  51::9:5:::a6
pos.-specific     C  51:61a:::6::
probability       G  :2:2:::a:1:4
matrix            T  :6a2::5:a3::

         bits    2.2        *
                 1.9   *  * ** *
                 1.7   *  * ** *
                 1.5   * ** ** *
Relative         1.3   * ** ** *
Entropy          1.1   * ** ** **
(16.3 bits)      0.9 * * ********
                 0.6 * **********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ATTCACAGTCAA
consensus            C     T  T G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
42702                        57  1.64e-07 TGGATTGCCA CTTCACAGTCAA ATGATGTCGA
47962                       342  5.79e-07 ATCGCACGCA ATTCACTGTCAG TGACCTCGCA
43058                       241  1.63e-06 TTCGTTCGTG CTTGACTGTCAA TGAAGCGACC
48657                        69  2.46e-06 TGGGGAATGT CGTCACTGTCAG TCACATACAT
54323                       421  3.16e-06 ATCCTCTCCT CTTCACAGTGAA ACAAACGATT
47192                       111  3.16e-06 CGTCCGGACG ACTCACAGTCAA GCAAACCACA
43972                       377  4.92e-06 TCAAGGTACA ATTCCCTGTCAA GTAGAACAAC
45777                        75  5.47e-06 GAGAAAACAT ATTTACAGTTAA GAGATAGTTG
50451                       267  6.04e-06 ATACCGTCGA CTTTACTGTTAA TACCCGTTCC
12028                        66  6.04e-06 AAAAAAAAGA AATCACAGTCAG CCAACCGCGA
44399                        70  1.43e-05 GTCAACTTTC AGTGACAGTTAG CGAGTATACG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42702                             1.6e-07  56_[+2]_432
47962                             5.8e-07  341_[+2]_147
43058                             1.6e-06  240_[+2]_248
48657                             2.5e-06  68_[+2]_420
54323                             3.2e-06  420_[+2]_68
47192                             3.2e-06  110_[+2]_378
43972                             4.9e-06  376_[+2]_112
45777                             5.5e-06  74_[+2]_414
50451                               6e-06  266_[+2]_222
12028                               6e-06  65_[+2]_423
44399                             1.4e-05  69_[+2]_419
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=11
42702                    (   57) CTTCACAGTCAA  1
47962                    (  342) ATTCACTGTCAG  1
43058                    (  241) CTTGACTGTCAA  1
48657                    (   69) CGTCACTGTCAG  1
54323                    (  421) CTTCACAGTGAA  1
47192                    (  111) ACTCACAGTCAA  1
43972                    (  377) ATTCCCTGTCAA  1
45777                    (   75) ATTTACAGTTAA  1
50451                    (  267) CTTTACTGTTAA  1
12028                    (   66) AATCACAGTCAG  1
44399                    (   70) AGTGACAGTTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 9.73455 E= 6.1e-002
    99     91  -1010  -1010
  -159   -141    -29    129
 -1010  -1010  -1010    194
 -1010    139    -29    -52
   173   -141  -1010  -1010
 -1010    204  -1010  -1010
    99  -1010  -1010     81
 -1010  -1010    216  -1010
 -1010  -1010  -1010    194
 -1010    139   -129      7
   186  -1010  -1010  -1010
   121  -1010     70  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 6.1e-002
 0.545455  0.454545  0.000000  0.000000
 0.090909  0.090909  0.181818  0.636364
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.636364  0.181818  0.181818
 0.909091  0.090909  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.545455  0.000000  0.000000  0.454545
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.636364  0.090909  0.272727
 1.000000  0.000000  0.000000  0.000000
 0.636364  0.000000  0.363636  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AC]TTCAC[AT]GT[CT]A[AG]
--------------------------------------------------------------------------------

Time  4.80 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   6  llr = 107  E-value = 4.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :3::3::7:5:::::::3:8:
pos.-specific     C  82:2:3:2:22:::::552:5
probability       G  ::a822a:82::::73::8::
matrix            T  25::55:2228aaa3752:25

         bits    2.2   *   *
                 1.9   *   *    ***
                 1.7   *   *    ***
                 1.5   **  * *  ***    *
Relative         1.3 * **  * * ****    **
Entropy          1.1 * **  * * ******* ***
(25.7 bits)      0.9 * **  * * ******* ***
                 0.6 * ** **** ******* ***
                 0.4 ********* ***********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CTGGTTGAGATTTTGTCCGAC
consensus             A  AC        TGTA  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
49544                       277  1.31e-10 TTACGTTATG TTGGTCGAGATTTTGTCCGAT GACGAACAGG
44399                       118  4.30e-10 TTGTGGTAGC CTGGTTGAGATTTTGTTTGTC TGTCCAGCAA
54323                       365  3.34e-09 GCGATATAGG CAGCGCGAGATTTTTTTCGAC TATGGTAGCA
50451                       380  9.12e-09 GTAACGTCTA CCGGAGGTGTTTTTGTCAGAC AGATGAAACG
43058                       133  9.78e-09 AAAGTGCTAA CTGGATGATGCTTTGGTCGAT GAAATGATTC
47192                       410  2.17e-08 GTCTTCTGCC CAGGTTGCGCTTTTTGCACAT TGACAAGGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49544                             1.3e-10  276_[+3]_203
44399                             4.3e-10  117_[+3]_362
54323                             3.3e-09  364_[+3]_115
50451                             9.1e-09  379_[+3]_100
43058                             9.8e-09  132_[+3]_347
47192                             2.2e-08  409_[+3]_70
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
49544                    (  277) TTGGTCGAGATTTTGTCCGAT  1
44399                    (  118) CTGGTTGAGATTTTGTTTGTC  1
54323                    (  365) CAGCGCGAGATTTTTTTCGAC  1
50451                    (  380) CCGGAGGTGTTTTTGTCAGAC  1
43058                    (  133) CTGGATGATGCTTTGGTCGAT  1
47192                    (  410) CAGGTTGCGCTTTTTGCACAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 10.6754 E= 4.6e+001
  -923    178   -923    -64
    28    -54   -923     94
  -923   -923    216   -923
  -923    -54    190   -923
    28   -923    -42     94
  -923     46    -42     94
  -923   -923    216   -923
   128    -54   -923    -64
  -923   -923    190    -64
    86    -54    -42    -64
  -923    -54   -923    168
  -923   -923   -923    194
  -923   -923   -923    194
  -923   -923   -923    194
  -923   -923    158     36
  -923   -923     58    136
  -923    104   -923     94
    28    104   -923    -64
  -923    -54    190   -923
   160   -923   -923    -64
  -923    104   -923     94
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 4.6e+001
 0.000000  0.833333  0.000000  0.166667
 0.333333  0.166667  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.333333  0.000000  0.166667  0.500000
 0.000000  0.333333  0.166667  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.166667  0.000000  0.166667
 0.000000  0.000000  0.833333  0.166667
 0.500000  0.166667  0.166667  0.166667
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.500000  0.000000  0.500000
 0.333333  0.500000  0.000000  0.166667
 0.000000  0.166667  0.833333  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.500000  0.000000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[TA]GG[TA][TC]GAGATTTT[GT][TG][CT][CA]GA[CT]
--------------------------------------------------------------------------------

Time  7.04 secs.

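The nonzero entries of the log-odds matrices in this report appear to follow, to within rounding, `round(100 * log2(p / background))`, where `p` comes from the letter-probability matrix and the background is the frequency line in the command line summary. The Python sketch below checks this on the first two rows of Motif 3. It is an approximation under stated assumptions: the function name `log_odds_row` is hypothetical, the background is the three-decimal value printed above, and the pseudocount handling that yields the large negative scores (such as -923) for zero-probability letters is not modelled, so those entries come back as `None`.

```python
import math

# Background letter frequencies as printed in this report (three decimals).
BACKGROUND = {"A": 0.275, "C": 0.242, "G": 0.223, "T": 0.260}


def log_odds_row(probs, background=BACKGROUND):
    """Approximate one log-odds row from one probability row (order A, C, G, T).

    Returns round(100 * log2(p / bg)) per letter, or None where p is zero,
    because the prior MEME applies to such entries is not reproduced here.
    """
    row = []
    for letter, p in zip("ACGT", probs):
        if p > 0:
            row.append(round(100 * math.log2(p / background[letter])))
        else:
            row.append(None)
    return row


if __name__ == "__main__":
    # First two rows of the Motif 3 letter-probability matrix.
    print(log_odds_row([0.000000, 0.833333, 0.000000, 0.166667]))
    print(log_odds_row([0.333333, 0.166667, 0.000000, 0.500000]))
```

The two rows correspond to `-923 178 -923 -64` and `28 -54 -923 94` in the scoring matrix. Because MEME works with unrounded frequencies and a prior, an individual entry elsewhere can differ from this approximation by a unit.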
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42457                            1.13e-03  337_[+1(3.17e-07)]_151
43058                            9.83e-09  132_[+3(9.78e-09)]_87_\
    [+2(1.63e-06)]_112_[+1(1.69e-05)]_124
47192                            9.84e-08  110_[+2(3.16e-06)]_28_\
    [+1(4.92e-05)]_247_[+3(2.17e-08)]_70
21519                            8.20e-03  149_[+1(1.65e-06)]_339
47962                            2.63e-05  68_[+1(3.46e-06)]_261_\
    [+2(5.79e-07)]_147
49544                            1.72e-07  201_[+1(7.64e-05)]_63_\
    [+3(1.31e-10)]_203
43972                            5.94e-03  376_[+2(4.92e-06)]_112
44399                            3.43e-09  69_[+2(1.43e-05)]_36_[+3(4.30e-10)]_\
    106_[+1(1.40e-05)]_18_[+2(7.16e-05)]_214
26173                            2.52e-01  117_[+1(5.87e-05)]_371
54323                            9.43e-10  319_[+1(2.03e-06)]_33_\
    [+3(3.34e-09)]_35_[+2(3.16e-06)]_68
12028                            1.87e-04  65_[+2(6.04e-06)]_43_[+1(1.29e-06)]_\
    368
42702                            4.12e-05  56_[+2(1.64e-07)]_272_\
    [+1(1.29e-05)]_148
48657                            4.10e-04  20_[+1(1.11e-05)]_36_[+2(2.46e-06)]_\
    420
45777                            1.87e-04  74_[+2(5.47e-06)]_251_\
    [+1(4.05e-06)]_151
50451                            3.60e-09  172_[+1(1.65e-06)]_82_\
    [+2(6.04e-06)]_101_[+3(9.12e-09)]_100
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
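Each combined block diagram encodes a sequence as alternating spacer lengths and motif occurrences, so site start positions can be recovered by accumulating spacers and motif widths. The Python sketch below does this for one diagram. It is illustrative only: `parse_diagram` is a hypothetical helper, the widths are those reported for the three motifs in this file, and the trailing backslash is assumed to be a plain line continuation.

```python
import re

# Motif widths as reported in this file (motif number -> width).
MOTIF_WIDTHS = {1: 12, 2: 12, 3: 21}

# One diagram element: a motif occurrence like [+3(9.78e-09)] or a spacer like 132.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, strand, start, p_value), ...], total_length).

    Start positions are 1-based, matching the "Start" column of the site tables.
    """
    # Drop line-continuation backslashes and any whitespace around them.
    compact = re.sub(r"[\\\s]", "", diagram)
    position = 0
    sites = []
    for part in compact.split("_"):
        if not part:
            continue
        match = TOKEN.fullmatch(part)
        if match is None:
            raise ValueError("unrecognised diagram element: " + part)
        if match.group(4) is not None:
            position += int(match.group(4))          # spacer
        else:
            motif = int(match.group(2))
            sites.append((motif, match.group(1), position + 1, float(match.group(3))))
            position += widths[motif]                # motif occurrence
    return sites, position


if __name__ == "__main__":
    diagram = "132_[+3(9.78e-09)]_87_\\\n    [+2(1.63e-06)]_112_[+1(1.69e-05)]_124"
    sites, length = parse_diagram(diagram)
    for motif, strand, start, p_value in sites:
        print(motif, strand, start, p_value)
    print("sequence length:", length)
```

For sequence 43058 this gives starts 133, 241 and 365 for motifs 3, 2 and 1, in agreement with the per-motif site tables, and a total length of 500, in agreement with the training set listing.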