********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/379/379.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
20617                    1.0000    500  21409                    1.0000    500
22568                    1.0000    500  22688                    1.0000    500
23296                    1.0000    500  24655                    1.0000    500
25558                    1.0000    500  263381                   1.0000    500
3502                     1.0000    500  35798                    1.0000    500
41505                    1.0000    500  4467                     1.0000    500
5788                     1.0000    500  6045                     1.0000    500
6400                     1.0000    500  7527                     1.0000    500
8180                     1.0000    500  8762                     1.0000    500
931                      1.0000    500  9968                     1.0000    500
bd1077                   1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/379/379.seqs.fa -oc motifs/379 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       21    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10500    N=              21
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.250 C 0.228 G 0.251 T 0.270
Background letter frequencies (from dataset with add-one prior applied):
A 0.250 C 0.228 G 0.251 T 0.270
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =   21   sites =   9   llr = 150   E-value = 1.3e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :8226:43:681711a41:a8
pos.-specific     C  8:842937a2:8189::89:2
probability       G  11:2111::22:2:::211::
matrix            T  11:11:1::::1:1::3::::

         bits    2.1         *
                 1.9         *      *   *
                 1.7      *  *     **  **
                 1.5      *  *     **  **
Relative         1.3   *  *  * *   **  ***
Entropy          1.1 ***  * ** ** *** ****
(24.1 bits)      0.9 ***  * ** ****** ****
                 0.6 ***  * ********* ****
                 0.4 *** ** **************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CACCACACCAACACCAACCAA
consensus              AAC CA CG G   T   C
sequence                G     G      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
22568                       478  5.96e-11 ACAATCAAAG CACCCCACCAACACCAAGCAA CA
bd1077                      287  2.76e-10 AGTTCATCAC CAAGACAACAACGCCAACCAA TTCGCCTCAA
8180                        287  2.76e-10 AGTTCATCAC CAAGACAACAACGCCAACCAA TTCGCCTCAA
6045                        342  1.04e-09 TCGGTACGCC TACTACACCCACACCATCCAA CGAACGAACG
20617                       450  2.53e-08 TCAAGAGAGA CACAACGCCGACACAATCGAA CGACGCCAAT
9968                        236  3.16e-08 CCCAGCCTTG CTCCACCACAGCACCAGACAC ATGACAGGGC
8762                        398  3.92e-08 GATCGTATTA GACAGCCCCGATACCAGCCAA CCCCTCCCGT
5788                        462  1.34e-07 CCCGCCCAGC CACCTGCCCCACCACAACCAC GACGATACCA
41505                       166  1.51e-07 CTTCGCCCGT CGCCCCTCCAGAATCATCCAA CGCCTTCACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22568                               6e-11  477_[+1]_2
bd1077                            2.8e-10  286_[+1]_193
8180                              2.8e-10  286_[+1]_193
6045                                1e-09  341_[+1]_138
20617                             2.5e-08  449_[+1]_30
9968                              3.2e-08  235_[+1]_244
8762                              3.9e-08  397_[+1]_82
5788                              1.3e-07  461_[+1]_18
41505                             1.5e-07  165_[+1]_314
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
22568                    (  478) CACCCCACCAACACCAAGCAA  1
bd1077                   (  287) CAAGACAACAACGCCAACCAA  1
8180                     (  287) CAAGACAACAACGCCAACCAA  1
6045                     (  342) TACTACACCCACACCATCCAA  1
20617                    (  450) CACAACGCCGACACAATCGAA  1
9968                     (  236) CTCCACCACAGCACCAGACAC  1
8762                     (  398) GACAGCCCCGATACCAGCCAA  1
5788                     (  462) CACCTGCCCCACCACAACCAC  1
41505                    (  166) CGCCCCTCCAGAATCATCCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 10080 bayes= 10.9766 E= 1.3e-004
  -982   177  -118  -128
   164  -982  -118  -128
   -17   177  -982  -982
   -17    96   -18  -128
   115    -4  -118  -128
  -982   196  -118  -982
    83    55  -118  -128
    41   155  -982  -982
  -982   213  -982  -982
   115    -4   -18  -982
   164  -982   -18  -982
  -117   177  -982  -128
   141  -104   -18  -982
  -117   177  -982  -128
  -117   196  -982  -982
   200  -982  -982  -982
    83  -982   -18    30
  -117   177  -118  -982
  -982   196  -118  -982
   200  -982  -982  -982
   164    -4  -982  -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 1.3e-004
 0.000000  0.777778  0.111111  0.111111
 0.777778  0.000000  0.111111  0.111111
 0.222222  0.777778  0.000000  0.000000
 0.222222  0.444444  0.222222  0.111111
 0.555556  0.222222  0.111111  0.111111
 0.000000  0.888889  0.111111  0.000000
 0.444444  0.333333  0.111111  0.111111
 0.333333  0.666667  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.555556  0.222222  0.222222  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.111111  0.777778  0.000000  0.111111
 0.666667  0.111111  0.222222  0.000000
 0.111111  0.777778  0.000000  0.111111
 0.111111  0.888889  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.444444  0.000000  0.222222  0.333333
 0.111111  0.777778  0.111111  0.000000
 0.000000  0.888889  0.111111  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.777778  0.222222  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CA[CA][CAG][AC]C[AC][CA]C[ACG][AG]C[AG]CCA[ATG]CCA[AC]
--------------------------------------------------------------------------------

Time  4.29 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =   12   sites =  16   llr = 164   E-value = 1.2e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :25:821913:3
pos.-specific     C  ::3:::::2:1:
probability       G  a8:928916894
matrix            T  ::21::::1::3

         bits    2.1
                 1.9 *
                 1.7 *     **
                 1.5 *  *  **  *
Relative         1.3 ** ***** **
Entropy          1.1 ** ***** **
(14.8 bits)      0.9 ** ***** **
                 0.6 ******** **
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GGAGAGGAGGGG
consensus              C      A A
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
8762                        295  6.18e-08 GGGAACAAGA GGAGAGGAGGGG TGCGGTTTTG
24655                        40  6.14e-07 ATGAGGAGGA GGTGAGGAGGGA TGTTCGCTCG
6400                        324  1.02e-06 AGAGGGTGGA GGAGAGGAGAGA GGGAGGAGTC
22688                       369  1.02e-06 AAACTCTTCG GGAGGGGAGGGG GCAGTCTTCC
23296                       338  1.80e-06 TGGCATCGTT GGAGAGGAGGCG TTGGAGGGTG
263381                      405  4.66e-06 TTGGCGTTTG GGAGAGGGGGGG TGCTGCGTTT
22568                       309  5.17e-06 AACGAGTGGC GGCGAGGAGGCT GGTTATGAGC
5788                        330  6.82e-06 TGGTGGATAG GGCGAGGATGGA TGCTGAGGTG
3502                         21  6.82e-06 CGGTATATTG GGCTAGGAGGGT GGCAATGGCA
bd1077                      435  1.24e-05 CGAACGCAAC GGCGAAGACGGT CCATCAGTGC
8180                        435  1.24e-05 CGAACGCAAC GGCGAAGACGGT CCATCAGTGC
35798                       223  1.73e-05 CACATCATCT GAAGAGGACAGA GGTCTGGTGT
21409                       164  2.74e-05 TGACCGCAGC GAAGGGGAAGGG ATGTCCATCT
7527                        179  3.34e-05 GTGTGTTGTT GGTTGGGAGGGA GATCGATTGA
931                         213  4.59e-05 GACAGGGGTA GAAGAGAAGAGG TAGTATTTAG
20617                       368  4.59e-05 AAGTGACAGC GGTGAAGAAAGG CCGGCTCAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8762                              6.2e-08  294_[+2]_194
24655                             6.1e-07  39_[+2]_449
6400                                1e-06  323_[+2]_165
22688                               1e-06  368_[+2]_120
23296                             1.8e-06  337_[+2]_151
263381                            4.7e-06  404_[+2]_84
22568                             5.2e-06  308_[+2]_180
5788                              6.8e-06  329_[+2]_159
3502                              6.8e-06  20_[+2]_468
bd1077                            1.2e-05  434_[+2]_54
8180                              1.2e-05  434_[+2]_54
35798                             1.7e-05  222_[+2]_266
21409                             2.7e-05  163_[+2]_325
7527                              3.3e-05  178_[+2]_310
931                               4.6e-05  212_[+2]_276
20617                             4.6e-05  367_[+2]_121
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=16
8762                     (  295) GGAGAGGAGGGG  1
24655                    (   40) GGTGAGGAGGGA  1
6400                     (  324) GGAGAGGAGAGA  1
22688                    (  369) GGAGGGGAGGGG  1
23296                    (  338) GGAGAGGAGGCG  1
263381                   (  405) GGAGAGGGGGGG  1
22568                    (  309) GGCGAGGAGGCT  1
5788                     (  330) GGCGAGGATGGA  1
3502                     (   21) GGCTAGGAGGGT  1
bd1077                   (  435) GGCGAAGACGGT  1
8180                     (  435) GGCGAAGACGGT  1
35798                    (  223) GAAGAGGACAGA  1
21409                    (  164) GAAGGGGAAGGG  1
7527                     (  179) GGTTGGGAGGGA  1
931                      (  213) GAAGAGAAGAGG  1
20617                    (  368) GGTGAAGAAAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10269 bayes= 10.0616 E= 1.2e-002
 -1064 -1064   199 -1064
   -42 -1064   169 -1064
   100    45 -1064   -53
 -1064 -1064   180  -111
   170 -1064   -42 -1064
   -42 -1064   169 -1064
  -200 -1064   190 -1064
   191 -1064  -201 -1064
  -100   -28   131  -211
     0 -1064   158 -1064
 -1064   -87   180 -1064
    32 -1064    80   -11
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 1.2e-002
 0.000000  0.000000  1.000000  0.000000
 0.187500  0.000000  0.812500  0.000000
 0.500000  0.312500  0.000000  0.187500
 0.000000  0.000000  0.875000  0.125000
 0.812500  0.000000  0.187500  0.000000
 0.187500  0.000000  0.812500  0.000000
 0.062500  0.000000  0.937500  0.000000
 0.937500  0.000000  0.062500  0.000000
 0.125000  0.187500  0.625000  0.062500
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.312500  0.000000  0.437500  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GG[AC]GAGGAG[GA]G[GAT]
--------------------------------------------------------------------------------

Time  8.15 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =   15   sites =   8   llr = 116   E-value = 2.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::3::1:::4:1
pos.-specific     C  8:a3668:14a::a3
probability       G  ::::3::::6::::6
matrix            T  3a:8113a8::a6::

         bits    2.1   *       *  *
                 1.9  **    *  ** *
                 1.7  **    *  ** *
                 1.5  **    *  ** *
Relative         1.3 ***   **  ** *
Entropy          1.1 ****  ** *****
(20.9 bits)      0.9 **************
                 0.6 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CTCTCCCTTGCTTCG
consensus            T  CGAT  C  A C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
25558                       366  5.96e-09 ATGGTCATCA CTCTCACTTGCTTCG TCTCTGGCGG
41505                       146  2.08e-08 CTCCTCCCTC CTCTCCTTTCCTTCG CCCGTCGCCC
bd1077                      231  3.63e-08 TGTACATCGT TTCTCCCTTCCTACG GTGCACCATG
8180                        231  3.63e-08 TGTACATCGT TTCTCCCTTCCTACG GTGCACCATG
3502                        451  8.60e-08 TGAGTTTTGC CTCCGCCTTGCTTCC TCCAAAAAAT
7527                        346  2.91e-07 CTGCAGCCTC CTCTCTCTAGCTACG GTTGGGATTT
20617                       428  4.65e-07 CTTTGCCCTT CTCCGACTTGCTTCA AGAGAGACAC
263381                      111  8.36e-07 AATCGGTATA CTCTTCTTCGCTTCC TGATTCATGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25558                               6e-09  365_[+3]_120
41505                             2.1e-08  145_[+3]_340
bd1077                            3.6e-08  230_[+3]_255
8180                              3.6e-08  230_[+3]_255
3502                              8.6e-08  450_[+3]_35
7527                              2.9e-07  345_[+3]_140
20617                             4.7e-07  427_[+3]_58
263381                            8.4e-07  110_[+3]_375
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=8
25558                    (  366) CTCTCACTTGCTTCG  1
41505                    (  146) CTCTCCTTTCCTTCG  1
bd1077                   (  231) TTCTCCCTTCCTACG  1
8180                     (  231) TTCTCCCTTCCTACG  1
3502                     (  451) CTCCGCCTTGCTTCC  1
7527                     (  346) CTCTCTCTAGCTACG  1
20617                    (  428) CTCCGACTTGCTTCA  1
263381                   (  111) CTCTTCTTCGCTTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 10206 bayes= 10.316 E= 2.6e-001
  -965   172  -965   -11
  -965  -965  -965   189
  -965   213  -965  -965
  -965    13  -965   147
  -965   145    -1  -111
     0   145  -965  -111
  -965   172  -965   -11
  -965  -965  -965   189
  -100   -87  -965   147
  -965    72   131  -965
  -965   213  -965  -965
  -965  -965  -965   189
    58  -965  -965   121
  -965   213  -965  -965
  -100    13   131  -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 8 E= 2.6e-001
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.625000  0.250000  0.125000
 0.250000  0.625000  0.000000  0.125000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.125000  0.000000  0.750000
 0.000000  0.375000  0.625000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.375000  0.000000  0.000000  0.625000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.250000  0.625000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CT]TC[TC][CG][CA][CT]TT[GC]CT[TA]C[GC]
--------------------------------------------------------------------------------

Time 12.15 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20617                            1.83e-08  367_[+2(4.59e-05)]_48_\
    [+3(4.65e-07)]_7_[+1(2.53e-08)]_30
21409                            9.63e-02  163_[+2(2.74e-05)]_325
22568                            5.92e-09  308_[+2(5.17e-06)]_157_\
    [+1(5.96e-11)]_2
22688                            1.87e-03  368_[+2(1.02e-06)]_120
23296                            7.29e-03  337_[+2(1.80e-06)]_151
24655                            3.79e-03  39_[+2(6.14e-07)]_449
25558                            1.66e-04  365_[+3(5.96e-09)]_120
263381                           6.89e-05  110_[+3(8.36e-07)]_279_\
    [+2(4.66e-06)]_84
3502                             9.11e-06  20_[+2(6.82e-06)]_210_\
    [+2(2.63e-05)]_196_[+3(8.60e-08)]_35
35798                            7.69e-03  147_[+3(9.89e-05)]_60_\
    [+2(1.73e-05)]_266
41505                            1.73e-07  145_[+3(2.08e-08)]_5_[+1(1.51e-07)]_\
    314
4467                             2.53e-01  500
5788                             6.37e-06  223_[+2(9.24e-05)]_94_\
    [+2(6.82e-06)]_120_[+1(1.34e-07)]_18
6045                             2.94e-05  341_[+1(1.04e-09)]_16_\
    [+1(9.85e-05)]_101
6400                             2.90e-03  323_[+2(1.02e-06)]_165
7527                             3.03e-05  178_[+2(3.34e-05)]_155_\
    [+3(2.91e-07)]_140
8180                             7.67e-12  230_[+3(3.63e-08)]_41_\
    [+1(2.76e-10)]_127_[+2(1.24e-05)]_54
8762                             3.25e-08  294_[+2(6.18e-08)]_91_\
    [+1(3.92e-08)]_82
931                              2.25e-01  212_[+2(4.59e-05)]_276
9968                             1.56e-04  235_[+1(3.16e-08)]_244
bd1077                           7.67e-12  230_[+3(3.63e-08)]_41_\
    [+1(2.76e-10)]_127_[+2(1.24e-05)]_54
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************