********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/457/457.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11427                    1.0000    500  2063                     1.0000    500
23249                    1.0000    500  24091                    1.0000    500
24214                    1.0000    500  24241                    1.0000    500
263903                   1.0000    500  268980                   1.0000    500
27834                    1.0000    500  3587                     1.0000    500
37549                    1.0000    500  5931                     1.0000    500
6658                     1.0000    500  9141                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/457/457.seqs.fa -oc motifs/457 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.254 C 0.246 G 0.243 T 0.258
Background letter frequencies (from dataset with add-one prior applied):
A 0.254 C 0.246 G 0.243 T 0.258
********************************************************************************

********************************************************************************
MOTIF  1 MEME    width =  16  sites =  14  llr = 164  E-value = 8.1e-006
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  4a6:9a1661581642
pos.-specific     C  6::9::81:4129:28
probability       G  ::111:13:54::41:
matrix            T  1:3:::::4:1:::3:

         bits    2.0  *   *
                 1.8  *   *
                 1.6  *   *
                 1.4  * ***      *
Relative         1.2  * ***     **  *
Entropy          1.0  * **** *  *** *
(16.9 bits)      0.8 ** **** *  *** *
                 0.6 ********** *** *
                 0.4 ************** *
                 0.2 ************** *
                 0.0 ----------------

Multilevel           CAACAACAAGAACAAC
consensus            A T    GTCGC GTA
sequence                           C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
37549                       467  3.90e-09 CACTCAAACA CAACAACAACGACAAC AACAACAACA
27834                       233  3.90e-09 ATATCACCCT CATCAACAAGAACAAC TGCTCGTCCG
268980                      463  1.74e-08 CTCCCGACGA CAACAACAACAACAGC GACGGATCAC
9141                          8  6.73e-08    CCTCAGA CATCAACGAGGACATC ATGCATCACT
11427                       476  2.66e-07 GAAGACGGTG CATCAACGTCAACGTC ACAACCGCC
3587                        407  1.33e-06 TACGATGACA CAGCAACATGAACGCA CAGTCACTCA
24241                       274  1.45e-06 TCGTCGGACA CAACAACGTGTCCAAC CCACTAACTA
5931                        251  1.91e-06 ATCATTTGTA AATCGACATGGACGTC AAACACTTTT
263903                      253  4.97e-06 CCAATAGACC AAACAAGATAGACGAC GGAGCGTAGT
23249                       477  7.02e-06 TCCACTGAAC AAACAACCACACAACC TCCCGACC
6658                        362  1.09e-05 GGAAGAGAGG CAAGGAAATGGACACC GCCATCGTTG
2063                        447  1.15e-05 CACAATAGTC AAGCAACGAGCACAGA CACACTGATA
24214                       289  1.29e-05 CACGGCAGAA AAAGAACAACACAGTC ACGACGGCTT
24091                        57  2.57e-05 GACGACACTG TAACAAACAAAACAAA TCGTAAATCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37549                             3.9e-09  466_[+1]_18
27834                             3.9e-09  232_[+1]_252
268980                            1.7e-08  462_[+1]_22
9141                              6.7e-08  7_[+1]_477
11427                             2.7e-07  475_[+1]_9
3587                              1.3e-06  406_[+1]_78
24241                             1.5e-06  273_[+1]_211
5931                              1.9e-06  250_[+1]_234
263903                              5e-06  252_[+1]_232
23249                               7e-06  476_[+1]_8
6658                              1.1e-05  361_[+1]_123
2063                              1.2e-05  446_[+1]_38
24214                             1.3e-05  288_[+1]_196
24091                             2.6e-05  56_[+1]_428
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=14
37549                    (  467) CAACAACAACGACAAC  1
27834                    (  233) CATCAACAAGAACAAC  1
268980                   (  463) CAACAACAACAACAGC  1
9141                     (    8) CATCAACGAGGACATC  1
11427                    (  476) CATCAACGTCAACGTC  1
3587                     (  407) CAGCAACATGAACGCA  1
24241                    (  274) CAACAACGTGTCCAAC  1
5931                     (  251) AATCGACATGGACGTC  1
263903                   (  253) AAACAAGATAGACGAC  1
23249                    (  477) AAACAACCACACAACC  1
6658                     (  362) CAAGGAAATGGACACC  1
2063                     (  447) AAGCAACGAGCACAGA  1
24214                    (  289) AAAGAACAACACAGTC  1
24091                    (   57) TAACAAACAAAACAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 9.52561 E= 8.1e-006
    49    122  -1045   -185
   198  -1045  -1045  -1045
   117  -1045    -77     15
 -1045    180    -77  -1045
   176  -1045    -77  -1045
   198  -1045  -1045  -1045
   -83    168   -176  -1045
   117    -78     23  -1045
   117  -1045  -1045     73
   -83     54    104  -1045
    98   -178     55   -185
   163    -20  -1045  -1045
   -83    180  -1045  -1045
   134  -1045     55  -1045
    49    -20    -77     15
   -24    168  -1045  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 14 E= 8.1e-006
 0.357143  0.571429  0.000000  0.071429
 1.000000  0.000000  0.000000  0.000000
 0.571429  0.000000  0.142857  0.285714
 0.000000  0.857143  0.142857  0.000000
 0.857143  0.000000  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.785714  0.071429  0.000000
 0.571429  0.142857  0.285714  0.000000
 0.571429  0.000000  0.000000  0.428571
 0.142857  0.357143  0.500000  0.000000
 0.500000  0.071429  0.357143  0.071429
 0.785714  0.214286  0.000000  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.642857  0.000000  0.357143  0.000000
 0.357143  0.214286  0.142857  0.285714
 0.214286  0.785714  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CA]A[AT]CAAC[AG][AT][GC][AG][AC]C[AG][ATC][CA]
--------------------------------------------------------------------------------

Time  1.62 secs.

********************************************************************************

********************************************************************************
MOTIF  2 MEME    width =  12  sites =  14  llr = 138  E-value = 1.0e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3:::71:4211:
pos.-specific     C  2:19:1a::6::
probability       G  32113::61:8a
matrix            T  289::8::731:

         bits    2.0       *    *
                 1.8       *    *
                 1.6    *  *    *
                 1.4    *  *    *
Relative         1.2  **** *    *
Entropy          1.0  *******  **
(14.2 bits)      0.8  ***********
                 0.6  ***********
                 0.4  ***********
                 0.2  ***********
                 0.0 ------------

Multilevel           ATTCATCGTCGG
consensus            GG  G  AAT
sequence             C
                     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
24241                        12  1.19e-07 CCGAGGACGG ATTCATCGTCGG AGGTGGTGGC
23249                        52  2.41e-07 TGAACGTCGT CTTCATCGTCGG AGAGTTCAAT
37549                       252  1.69e-06 TTGCTCTCAT CGTCATCGTCGG CTTGGTTGGT
5931                        189  2.43e-06 CGTCATCCCC TTTCATCATTGG GCCTCCTTTA
24214                       389  4.28e-06 ACAGCCTTCG GTTCATCATCAG TATCCAATCC
27834                       266  6.75e-06 CCGGGGAAAG ATTCATCAATGG CCATGTCACT
3587                        224  1.06e-05 TGAGATTCGT GTTCGTCGTAGG TCAGCCGTGG
24091                       222  1.41e-05 CGGCCAGATT TTTCATCATCTG TCGCAGCTCG
263903                      128  2.23e-05 CGGAGGCGGG GTTCGTCGGTGG CTGGAGAGGA
2063                        284  2.23e-05 ATTATGCAGT GTGCGTCATCGG TTCACCACCT
6658                          8  3.05e-05    TGCCACG AGTCGTCGATGG TTGTTGCTGC
9141                        336  3.72e-05 TCTGCCTTCT TTTCACCGTCAG CTCTAGCACA
268980                        8  3.72e-05    GTCTGTC CGTCAACGACGG GTGCTGACAG
11427                       299  1.67e-04 TTCATTAGGC ATCGAACGTCGG CTTCCACACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24241                             1.2e-07  11_[+2]_477
23249                             2.4e-07  51_[+2]_437
37549                             1.7e-06  251_[+2]_237
5931                              2.4e-06  188_[+2]_300
24214                             4.3e-06  388_[+2]_100
27834                             6.8e-06  265_[+2]_223
3587                              1.1e-05  223_[+2]_265
24091                             1.4e-05  221_[+2]_267
263903                            2.2e-05  127_[+2]_361
2063                              2.2e-05  283_[+2]_205
6658                              3.1e-05  7_[+2]_481
9141                              3.7e-05  335_[+2]_153
268980                            3.7e-05  7_[+2]_481
11427                             0.00017  298_[+2]_190
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=14
24241                    (   12) ATTCATCGTCGG  1
23249                    (   52) CTTCATCGTCGG  1
37549                    (  252) CGTCATCGTCGG  1
5931                     (  189) TTTCATCATTGG  1
24214                    (  389) GTTCATCATCAG  1
27834                    (  266) ATTCATCAATGG  1
3587                     (  224) GTTCGTCGTAGG  1
24091                    (  222) TTTCATCATCTG  1
263903                   (  128) GTTCGTCGGTGG  1
2063                     (  284) GTGCGTCATCGG  1
6658                     (    8) AGTCGTCGATGG  1
9141                     (  336) TTTCACCGTCAG  1
268980                   (    8) CGTCAACGACGG  1
11427                    (  299) ATCGAACGTCGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 8.93074 E= 1.0e-001
    17    -20     23    -27
 -1045  -1045    -18    161
 -1045   -178   -176    173
 -1045    192   -176  -1045
   149  -1045     23  -1045
   -83   -178  -1045    161
 -1045    202  -1045  -1045
    49  -1045    140  -1045
   -24  -1045   -176    147
  -183    139  -1045     15
   -83  -1045    169   -185
 -1045  -1045    204  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 1.0e-001
 0.285714  0.214286  0.285714  0.214286
 0.000000  0.000000  0.214286  0.785714
 0.000000  0.071429  0.071429  0.857143
 0.000000  0.928571  0.071429  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.142857  0.071429  0.000000  0.785714
 0.000000  1.000000  0.000000  0.000000
 0.357143  0.000000  0.642857  0.000000
 0.214286  0.000000  0.071429  0.714286
 0.071429  0.642857  0.000000  0.285714
 0.142857  0.000000  0.785714  0.071429
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AGCT][TG]TC[AG]TC[GA][TA][CT]GG
--------------------------------------------------------------------------------

Time  3.09 secs.
********************************************************************************

********************************************************************************
MOTIF  3 MEME    width =  18  sites =   8  llr = 121  E-value = 5.3e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :5::333::::11:41:4
pos.-specific     C  :::1::::1:::::1:::
probability       G  a:a381515:93:a3:a1
matrix            T  :5:6:6394a169:39:5

         bits    2.0 * *      *   *  *
                 1.8 * *      *   *  *
                 1.6 * *      *   *  *
                 1.4 * *    * ** ** **
Relative         1.2 * * *  * ** ** **
Entropy          1.0 *** *  * ** ** **
(21.8 bits)      0.8 *** *  * ** ** **
                 0.6 ****** ******* ***
                 0.4 ************** ***
                 0.2 ************** ***
                 0.0 ------------------

Multilevel           GAGTGTGTGTGTTGATGT
consensus             T GAAA T  G  G  A
sequence                   T       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
263903                      172  4.10e-09 CGATGGGACG GTGTGTGTGTGATGATGA TGATGTGTTA
23249                       122  2.25e-08 TTTTTGTGGC GTGGATATGTGTTGGTGT AGGTAAGGCG
2063                        177  2.56e-08 CTCCTCCAGC GAGTGTGTGTGGTGAAGA AAGCACCAGC
6658                         34  3.30e-08 TGCTGCACTC GAGTGAGTCTGTTGCTGT GTTTGCTGCG
5931                        307  3.76e-08 GTCATGCTTT GTGTGTTGTTGTTGTTGT TGTTGTTCTG
3587                         86  9.51e-08 AATTTCTTCA GAGGAATTTTGTTGTTGT GAATCAGTTG
24241                        92  1.70e-07 GCGGTTTTGT GAGCGTGTGTGGAGGTGA CAGTTTCGGT
37549                        73  3.43e-07 CCGGTGGGAG GTGTGGATTTTTTGATGG TGCATCTGCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
263903                            4.1e-09  171_[+3]_311
23249                             2.2e-08  121_[+3]_361
2063                              2.6e-08  176_[+3]_306
6658                              3.3e-08  33_[+3]_449
5931                              3.8e-08  306_[+3]_176
3587                              9.5e-08  85_[+3]_397
24241                             1.7e-07  91_[+3]_391
37549                             3.4e-07  72_[+3]_410
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=18 seqs=8
263903                   (  172) GTGTGTGTGTGATGATGA  1
23249                    (  122) GTGGATATGTGTTGGTGT  1
2063                     (  177) GAGTGTGTGTGGTGAAGA  1
6658                     (   34) GAGTGAGTCTGTTGCTGT  1
5931                     (  307) GTGTGTTGTTGTTGTTGT  1
3587                     (   86) GAGGAATTTTGTTGTTGT  1
24241                    (   92) GAGCGTGTGTGGAGGTGA  1
37549                    (   73) GTGTGGATTTTTTGATGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 6762 bayes= 9.72153 E= 5.3e-001
  -965   -965    204   -965
    98   -965   -965     96
  -965   -965    204   -965
  -965    -97      4    128
    -2   -965    162   -965
    -2   -965    -96    128
    -2   -965    104     -4
  -965   -965    -96    176
  -965    -97    104     54
  -965   -965   -965    196
  -965   -965    185   -104
  -102   -965      4    128
  -102   -965   -965    176
  -965   -965    204   -965
    56    -97      4     -4
  -102   -965   -965    176
  -965   -965    204   -965
    56   -965    -96     96
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 8 E= 5.3e-001
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.125000  0.250000  0.625000
 0.250000  0.000000  0.750000  0.000000
 0.250000  0.000000  0.125000  0.625000
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.125000  0.500000  0.375000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.875000  0.125000
 0.125000  0.000000  0.250000  0.625000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.375000  0.125000  0.250000  0.250000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.375000  0.000000  0.125000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[AT]G[TG][GA][TA][GAT]T[GT]TG[TG]TG[AGT]TG[TA]
--------------------------------------------------------------------------------

Time  4.65 secs.
********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11427                            4.32e-05  108_[+3(7.06e-05)]_349_\
    [+1(2.66e-07)]_9
2063                             1.81e-07  176_[+3(2.56e-08)]_89_\
    [+2(2.23e-05)]_151_[+1(1.15e-05)]_38
23249                            1.60e-09  51_[+2(2.41e-07)]_58_[+3(2.25e-08)]_\
    55_[+3(7.61e-05)]_264_[+1(7.02e-06)]_8
24091                            2.02e-03  56_[+1(2.57e-05)]_149_\
    [+2(1.41e-05)]_267
24214                            9.61e-04  288_[+1(1.29e-05)]_84_\
    [+2(4.28e-06)]_100
24241                            1.27e-09  11_[+2(1.19e-07)]_68_[+3(1.70e-07)]_\
    164_[+1(1.45e-06)]_211
263903                           1.58e-08  127_[+2(2.23e-05)]_32_\
    [+3(4.10e-09)]_63_[+1(4.97e-06)]_42_[+1(7.44e-05)]_174
268980                           2.10e-05  7_[+2(3.72e-05)]_404_[+1(7.08e-05)]_\
    23_[+1(1.74e-08)]_22
27834                            6.32e-07  232_[+1(3.90e-09)]_17_\
    [+2(6.75e-06)]_223
3587                             4.30e-08  85_[+3(9.51e-08)]_22_[+2(7.90e-05)]_\
    86_[+2(1.06e-05)]_171_[+1(1.33e-06)]_78
37549                            1.16e-10  72_[+3(3.43e-07)]_161_\
    [+2(1.69e-06)]_86_[+2(3.05e-05)]_105_[+1(3.90e-09)]_18
5931                             6.59e-09  188_[+2(2.43e-06)]_50_\
    [+1(1.91e-06)]_40_[+3(3.76e-08)]_12_[+2(5.18e-05)]_152
6658                             2.88e-07  7_[+2(3.05e-05)]_14_[+3(3.30e-08)]_\
    310_[+1(1.09e-05)]_123
9141                             1.37e-05  7_[+1(6.73e-08)]_312_[+2(3.72e-05)]_\
    153
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
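
Note on reading the matrices: the non-zero entries of each log-odds matrix in this file appear to equal 100 times the base-2 logarithm of the corresponding letter probability divided by the background frequency, rounded to the nearest integer. The Python sketch below checks that relation on the first two rows of Motif 1. The function name log_odds_row and the scale parameter are hypothetical names introduced for this illustration, and the formula is an assumption inferred from the numbers reported here rather than taken from the MEME source. Zero probabilities are left as None, because the large negative scores in the file (-1045, -965) depend on pseudocounts that this sketch does not model.

```python
import math

# Background frequencies copied from the "Background letter frequencies" line.
BACKGROUND = {"A": 0.254, "C": 0.246, "G": 0.243, "T": 0.258}
LETTERS = "ACGT"  # column order implied by "ALPHABET= ACGT"


def log_odds_row(probs, scale=100):
    """Approximate one log-odds row from one letter-probability row.

    Assumption: score = round(scale * log2(p / background)).
    Zero probabilities are returned as None (pseudocounts are not modelled).
    """
    row = []
    for letter, p in zip(LETTERS, probs):
        if p <= 0.0:
            row.append(None)
        else:
            row.append(round(scale * math.log2(p / BACKGROUND[letter])))
    return row


# First two rows of the Motif 1 letter-probability matrix.
motif1_rows = [
    [0.357143, 0.571429, 0.000000, 0.071429],
    [1.000000, 0.000000, 0.000000, 0.000000],
]

for probs in motif1_rows:
    print(log_odds_row(probs))
```

For these two rows the sketch gives 49, 122, -185 and 198, the same as the non-zero entries in the first two rows of the Motif 1 log-odds matrix. Because the background frequencies are printed to only three decimals, other entries may differ from the file by one unit.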