********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/423/423.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42685                    1.0000    500  38008                    1.0000    500
49270                    1.0000    500  41478                    1.0000    500
41604                    1.0000    500  48827                    1.0000    500
39407                    1.0000    500  38076                    1.0000    500
45944                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/423/423.seqs.fa -oc motifs/423 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.291 C 0.221 G 0.204 T 0.284
Background letter frequencies (from dataset with add-one prior applied):
A 0.291 C 0.221 G 0.204 T 0.284
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  21  sites =   7  llr = 118  E-value = 1.1e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:4971a::::674311::3:
pos.-specific     C  :16::1:11611:1:3::44:
probability       G  99:137::::93117::::3:
matrix            T  :::::::994::13:69a6:a

         bits    2.3
                 2.1
                 1.8       *          *  *
                 1.6 **    *   *      *  *
Relative         1.4 ** *  *** *   *  *  *
Entropy          1.1 ** ********   * **  *
(24.4 bits)      0.9 ***********   * *** *
                 0.7 ************* * *** *
                 0.5 ************* *******
                 0.2 ************* *******
                 0.0 ---------------------

Multilevel           GGCAAGATTCGAAAGTTTTCT
consensus              A G    T G TAC  CA
sequence                                G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
48827                       398  2.09e-11 CAGTCAACCT GGCAAGATTCGAAAGCTTTAT CCTTCGTTTG
45944                       394  2.57e-11 CAGTCAACCT GGCAAGATTTGAAAGTTTTAT CCTTCGTTCG
42685                       378  5.99e-09 AGTCGGTCCA GGAAAGATTCGATTACTTTGT ACACGGAAAA
49270                       129  2.32e-08 AGTTGGTTTC GGCAGCATTCGGAAATATTGT GTTATTGCGT
38076                         8  3.07e-08    GTACTTT GCAAGAATTTGGATGTTTCCT GACTGTTAGT
38008                        86  4.53e-08 ATCAATCGAC GGCAAGACTCCAGGGATTCCT TGGGTGGGGG
41478                       402  1.16e-07 GCACACCATA AGAGAGATCTGCACGTTTCCT CTCGCGCACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48827                             2.1e-11  397_[+1]_82
45944                             2.6e-11  393_[+1]_86
42685                               6e-09  377_[+1]_102
49270                             2.3e-08  128_[+1]_351
38076                             3.1e-08  7_[+1]_472
38008                             4.5e-08  85_[+1]_394
41478                             1.2e-07  401_[+1]_78
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=7
48827                    (  398) GGCAAGATTCGAAAGCTTTAT  1
45944                    (  394) GGCAAGATTTGAAAGTTTTAT  1
42685                    (  378) GGAAAGATTCGATTACTTTGT  1
49270                    (  129) GGCAGCATTCGGAAATATTGT  1
38076                    (    8) GCAAGAATTTGGATGTTTCCT  1
38008                    (   86) GGCAAGACTCCAGGGATTCCT  1
41478                    (  402) AGAGAGATCTGCACGTTTCCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4320 bayes= 9.11073 E= 1.1e-001
  -102   -945    207   -945
  -945    -63    207   -945
    56    137   -945   -945
   156   -945    -51   -945
   130   -945     48   -945
  -102    -63    180   -945
   178   -945   -945   -945
  -945    -63   -945    159
  -945    -63   -945    159
  -945    137   -945     59
  -945    -63    207   -945
    97    -63     48   -945
   130   -945    -51    -99
    56    -63    -51      1
    -2   -945    180   -945
  -102     37   -945    101
  -102   -945   -945    159
  -945   -945   -945    181
  -945     96   -945    101
    -2     96     48   -945
  -945   -945   -945    181
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 1.1e-001
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.142857  0.142857  0.714286  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.571429  0.000000  0.428571
 0.000000  0.142857  0.857143  0.000000
 0.571429  0.142857  0.285714  0.000000
 0.714286  0.000000  0.142857  0.142857
 0.428571  0.142857  0.142857  0.285714
 0.285714  0.000000  0.714286  0.000000
 0.142857  0.285714  0.000000  0.571429
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.428571  0.000000  0.571429
 0.285714  0.428571  0.285714  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
GG[CA]A[AG]GATT[CT]G[AG]A[AT][GA][TC]TT[TC][CAG]T
--------------------------------------------------------------------------------

Time  0.83 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  21  sites =   4  llr = 88  E-value = 3.3e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::a:3a::::::58:8::
pos.-specific     C  ::8:a:::::3:::333a3a:
probability       G  3::3::a::::88a:3::::8
matrix            T  8a38:::8:a833:8:::::3

         bits    2.3     * *      *   * *
                 2.1     * *      *   * *
                 1.8  *  *** **   *   * *
                 1.6  *  *** **   *   * *
Relative         1.4  ** *** ** ***   * **
Entropy          1.1 ******* ******* *****
(31.8 bits)      0.9 *************** *****
                 0.7 *************** *****
                 0.5 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TTCTCAGTATTGGGTAACACG
consensus            G TG   A  CTT CCC C T
sequence                            G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
45944                       248  2.62e-13 GCAACAATAT TTCTCAGTATTGGGTAACACG ACTCTCTCCA
48827                       249  2.62e-13 GCAACAATAT TTCTCAGTATTGGGTAACACG ACTCTCTCCA
38076                       201  5.73e-10 GTCATCAAAT GTCGCAGAATTGGGTGCCCCT TTGCCCCACA
41604                        62  6.32e-10 TTTGATAAGG TTTTCAGTATCTTGCCACACG AAAGACTATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45944                             2.6e-13  247_[+2]_232
48827                             2.6e-13  248_[+2]_231
38076                             5.7e-10  200_[+2]_279
41604                             6.3e-10  61_[+2]_418
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=4
45944                    (  248) TTCTCAGTATTGGGTAACACG  1
48827                    (  249) TTCTCAGTATTGGGTAACACG  1
38076                    (  201) GTCGCAGAATTGGGTGCCCCT  1
41604                    (   62) TTTTCAGTATCTTGCCACACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4320 bayes= 10.0755 E= 3.3e+000
  -865   -865     29    140
  -865   -865   -865    181
  -865    176   -865    -18
  -865   -865     29    140
  -865    218   -865   -865
   178   -865   -865   -865
  -865   -865    229   -865
   -22   -865   -865    140
   178   -865   -865   -865
  -865   -865   -865    181
  -865     18   -865    140
  -865   -865    187    -18
  -865   -865    187    -18
  -865   -865    229   -865
  -865     18   -865    140
    78     18     29   -865
   137     18   -865   -865
  -865    218   -865   -865
   137     18   -865   -865
  -865    218   -865   -865
  -865   -865    187    -18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 3.3e+000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.250000  0.750000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.500000  0.250000  0.250000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[TG]T[CT][TG]CAG[TA]AT[TC][GT][GT]G[TC][ACG][AC]C[AC]C[GT]
--------------------------------------------------------------------------------

Time  1.59 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  15  sites =   9  llr = 104  E-value = 1.3e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  7::891a1:3:::4:
pos.-specific     C  :98::::3:2:6:17
probability       G  1:12:9:272:::2:
matrix            T  211:1::332a4a23

         bits    2.3
                 2.1
                 1.8      **   * *
                 1.6  *   **   * *
Relative         1.4  *  ***   * *
Entropy          1.1  ****** * * * *
(16.6 bits)      0.9  ****** * *** *
                 0.7 ******* * *** *
                 0.5 ******* * *** *
                 0.2 ********* *** *
                 0.0 ---------------

Multilevel           ACCAAGACGATCTAC
consensus            T  G   TTC T GT
sequence                    G G   T
                              T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
38076                       259  3.40e-08 CTCCAGGTTA ACCAAGAGGGTTTAC TTATGTTGTT
45944                       225  1.12e-07 GGTTTGCAGA ACCAAGACTATCTGC AACAATATTT
48827                       226  1.12e-07 GGTTTGCAGA ACCAAGACTATCTGC AACAATATTT
41478                       229  1.77e-06 GATATGAACT TCCGAGACGATTTAT TTTGATATGT
39407                       113  2.16e-06 CACGTTCCAA ACGAAGATGGTTTAT TTCACTTTTC
49270                       357  4.12e-06 TTATTGATCG ACCATGAAGCTCTTC CGTTTCACTA
41604                       185  7.69e-06 GCCCATTCCG ACTAAGATTTTTTTC ATATGATTCA
38008                       139  8.14e-06 CGTAAAAGTT TTCGAGAGGTTCTAC TGTTATTCCG
42685                        12  1.87e-05 CCGGTGAGAT GCCAAAATGCTCTCT TCCGTGACAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38076                             3.4e-08  258_[+3]_227
45944                             1.1e-07  224_[+3]_261
48827                             1.1e-07  225_[+3]_260
41478                             1.8e-06  228_[+3]_257
39407                             2.2e-06  112_[+3]_373
49270                             4.1e-06  356_[+3]_129
41604                             7.7e-06  184_[+3]_301
38008                             8.1e-06  138_[+3]_347
42685                             1.9e-05  11_[+3]_474
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=9
38076                    (  259) ACCAAGAGGGTTTAC  1
45944                    (  225) ACCAAGACTATCTGC  1
48827                    (  226) ACCAAGACTATCTGC  1
41478                    (  229) TCCGAGACGATTTAT  1
39407                    (  113) ACGAAGATGGTTTAT  1
49270                    (  357) ACCATGAAGCTCTTC  1
41604                    (  185) ACTAAGATTTTTTTC  1
38008                    (  139) TTCGAGAGGTTCTAC  1
42685                    (   12) GCCAAAATGCTCTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4374 bayes= 9.05641 E= 1.3e+001
   120   -982    -88    -35
  -982    201   -982   -135
  -982    181    -88   -135
   142   -982     12   -982
   161   -982   -982   -135
  -138   -982    212   -982
   178   -982   -982   -982
  -138     59     12     23
  -982   -982    171     23
    20      1     12    -35
  -982   -982   -982    181
  -982    133   -982     64
  -982   -982   -982    181
    61    -99     12    -35
  -982    159   -982     23
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 9 E= 1.3e+001
 0.666667  0.000000  0.111111  0.222222
 0.000000  0.888889  0.000000  0.111111
 0.000000  0.777778  0.111111  0.111111
 0.777778  0.000000  0.222222  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.111111  0.000000  0.888889  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.111111  0.333333  0.222222  0.333333
 0.000000  0.000000  0.666667  0.333333
 0.333333  0.222222  0.222222  0.222222
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.555556  0.000000  0.444444
 0.000000  0.000000  0.000000  1.000000
 0.444444  0.111111  0.222222  0.222222
 0.000000  0.666667  0.000000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[AT]CC[AG]AGA[CTG][GT][ACGT]T[CT]T[AGT][CT]
--------------------------------------------------------------------------------

Time  2.44 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42685                            2.93e-06  11_[+3(1.87e-05)]_351_\
    [+1(5.99e-09)]_102
38008                            3.40e-06  85_[+1(4.53e-08)]_32_[+3(8.14e-06)]_\
    347
49270                            3.08e-06  128_[+1(2.32e-08)]_207_\
    [+3(4.12e-06)]_129
41478                            7.26e-06  228_[+3(1.77e-06)]_158_\
    [+1(1.16e-07)]_78
41604                            2.16e-07  61_[+2(6.32e-10)]_102_\
    [+3(7.69e-06)]_301
48827                            9.34e-20  225_[+3(1.12e-07)]_8_[+2(2.62e-13)]_\
    128_[+1(2.09e-11)]_82
39407                            8.57e-03  112_[+3(2.16e-06)]_373
38076                            4.90e-14  7_[+1(3.07e-08)]_172_[+2(5.73e-10)]_\
    37_[+3(3.40e-08)]_227
45944                            1.14e-19  224_[+3(1.12e-07)]_8_[+2(2.62e-13)]_\
    125_[+1(2.57e-11)]_86
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************