********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/77/77.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10232                    1.0000    500  10424                    1.0000    500
21425                    1.0000    500  21768                    1.0000    500
21977                    1.0000    500  22848                    1.0000    500
24956                    1.0000    500  25315                    1.0000    500
261973                   1.0000    500  263784                   1.0000    500
268404                   1.0000    500  30919                    1.0000    500
37123                    1.0000    500  4589                     1.0000    500
4946                     1.0000    500  6062                     1.0000    500
7043                     1.0000    500  8348                     1.0000    500
9301                     1.0000    500
********************************************************************************


********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/77/77.seqs.fa -oc motifs/77 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.235 G 0.231 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.235 G 0.231 T 0.268
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 9 llr = 153 E-value = 1.4e-004
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::1:11:::2:111:::27:
pos.-specific     C  :3141:1:::1:::::2::::
probability       G  1:9:29143::a167:1a4:9
matrix            T  97:47:767a7:832a7:331

         bits    2.1            *     *
                 1.9          * *   * *
                 1.7   *  *   * *   * *  *
                 1.5 * *  *   * *   * *  *
Relative         1.3 * *  *   * *   * *  *
Entropy          1.1 ***  * *** *   * * **
(24.5 bits)      0.8 *** ** *** ** ** * **
                 0.6 ****** *********** **
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TTGCTGTTTTTGTGGTTGGAG
consensus             C TG  GG A  TT C TT
sequence                               A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
21977                        41  5.79e-11 ATTCTTGTCA TTGCTGAGTTTGTGGTTGGAG AGAACGTGTC
4589                        139  4.30e-10 GTTGCGGTGG TTGCGGTGGTTGTGGTTGATG GGATGTGCGT
22848                       194  1.75e-09 TTCTCCTCCC TCCCTGTTGTTGTGGTTGAAG TATCCTTTCG
261973                      195  7.50e-09 TTTAGCTGGA TCGTCGTTTTTGTTGTTGGAT TCTTGACCAT
8348                        125  8.90e-09 CGGCGTTTGG TTGTGGTGTTCGTGTTCGTAG CGGTCGGTTG
21425                         2  1.46e-08          C TTGATATTTTAGTTGTTGTAG ACCTTACGGT
263784                      315  1.58e-08 GTTGCGTGTT GTGTTGTGTTTGAGTTTGGTG TTCGGTTTCG
37123                       237  4.31e-08 CACGGTGTAT TCGTTGGTGTAGTAGTCGGTG ATGTTGGTGT
7043                         40  7.97e-08 AAGATGCAGT TTGCTGCTTTTGGTATGGTAG TGGTGTGTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21977                             5.8e-11  40_[+1]_439
4589                              4.3e-10  138_[+1]_341
22848                             1.7e-09  193_[+1]_286
261973                            7.5e-09  194_[+1]_285
8348                              8.9e-09  124_[+1]_355
21425                             1.5e-08  1_[+1]_478
263784                            1.6e-08  314_[+1]_165
37123                             4.3e-08  236_[+1]_243
7043                                8e-08  39_[+1]_440
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
21977                    (   41) TTGCTGAGTTTGTGGTTGGAG  1
4589                     (  139) TTGCGGTGGTTGTGGTTGATG  1
22848                    (  194) TCCCTGTTGTTGTGGTTGAAG  1
261973                   (  195) TCGTCGTTTTTGTTGTTGGAT  1
8348                     (  125) TTGTGGTGTTCGTGTTCGTAG  1
21425                    (    2) TTGATATTTTAGTTGTTGTAG  1
263784                   (  315) GTGTTGTGTTTGAGTTTGGTG  1
37123                    (  237) TCGTTGGTGTAGTAGTCGGTG  1
7043                     (   40) TTGCTGCTTTTGGTATGGTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 10.1179 E= 1.4e-004
  -982   -982   -105    173
  -982     50   -982    131
  -982   -108    194   -982
  -126     92   -982     73
  -982   -108     -5    131
  -126   -982    194   -982
  -126   -108   -105    131
  -982   -982     94    105
  -982   -982     53    131
  -982   -982   -982    190
   -26   -108   -982    131
  -982   -982    211   -982
  -126   -982   -105    153
  -126   -982    127     31
  -126   -982    153    -27
  -982   -982   -982    190
  -982     -8   -105    131
  -982   -982    211   -982
   -26   -982     94     31
   133   -982   -982     31
  -982   -982    194   -127
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 1.4e-004
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.111111  0.888889  0.000000
 0.111111  0.444444  0.000000  0.444444
 0.000000  0.111111  0.222222  0.666667
 0.111111  0.000000  0.888889  0.000000
 0.111111  0.111111  0.111111  0.666667
 0.000000  0.000000  0.444444  0.555556
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.222222  0.111111  0.000000  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.000000  0.111111  0.777778
 0.111111  0.000000  0.555556  0.333333
 0.111111  0.000000  0.666667  0.222222
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.222222  0.111111  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.000000  0.444444  0.333333
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.000000  0.888889  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
T[TC]G[CT][TG]GT[TG][TG]T[TA]GT[GT][GT]T[TC]G[GTA][AT]G
--------------------------------------------------------------------------------

Time 3.45 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 13 llr = 172 E-value = 9.1e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  5953:592222:5154348:2
pos.-specific     C  3:1:95:78:22393255:58
probability       G  21451:11:7282::411:1:
matrix            T  ::12:::::13:1:2:1:24:

         bits    2.1
                 1.9
                 1.7     *        *
                 1.5  *  * * *    *      *
Relative         1.3  *  * * *  * *      *
Entropy          1.1  *  *** *  * *    * *
(19.1 bits)      0.8  *  ****** * *    * *
                 0.6  *  ****** * *   ****
                 0.4 ********** * ********
                 0.2 ********** **********
                 0.0 ---------------------

Multilevel           AAAGCCACCGTGACAACCACC
consensus            C GA A A AACC CGAATT
sequence             G  T      C   TC
                               G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
21425                       105  3.89e-10 AGGTGTATGT GAGACCACCGTGACACCCACC ATCAGTCCAC
268404                      359  1.47e-08 GAAGAAGCAA AAAACAAACACGACAACCACC AAAAAGACCC
261973                       48  3.04e-08 TACCACCAGT AAAGCCACCGGGGCCGAGACC AAGCCGTCAT
9301                        414  1.95e-07 AAGCTCGAAC CAAGCAACCGTCGCACAATCC AACGGTGCAC
4946                        416  1.95e-07 AACACTCCGC GAGACAACCAGCCCTGCCATC ATTGTCATAC
30919                       329  2.14e-07 AGATTACGAA CAGTGCACCGAGCCAAACATC ACGGCAGAGC
24956                        30  5.90e-07 ATGGTACATC AACGCCAACGCGACAACATCA ACATGCCTCA
8348                        356  6.90e-07 TGGCCGAGAA AAGGCAACAGCGAACGAAATC ATTGGATACA
21977                       317  7.45e-07 AGTCTAATGT GAGACCAGAGAGACCACAATC TCCTACGAGA
10424                       100  8.04e-07 AAGAGGTGTG CAAGCAACCAAGCCAAGAAGC CAAAGGGAAA
10232                       221  9.33e-07 TCAATATGTT CAAGCAGCCGTCTCTCCCACC TCGACCTCCA
25315                       200  1.00e-06 CCAACAGTCA AGATCCACCGTGACCGTCTTC ACCTGAACAC
6062                         44  4.67e-06 CGAAGTCCTC AATTCCAACTGGCCTGCCACA CCCAGCTTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21425                             3.9e-10  104_[+2]_375
268404                            1.5e-08  358_[+2]_121
261973                              3e-08  47_[+2]_432
9301                              1.9e-07  413_[+2]_66
4946                              1.9e-07  415_[+2]_64
30919                             2.1e-07  328_[+2]_151
24956                             5.9e-07  29_[+2]_450
8348                              6.9e-07  355_[+2]_124
21977                             7.5e-07  316_[+2]_163
10424                               8e-07  99_[+2]_380
10232                             9.3e-07  220_[+2]_259
25315                               1e-06  199_[+2]_280
6062                              4.7e-06  43_[+2]_436
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=13
21425                    (  105) GAGACCACCGTGACACCCACC  1
268404                   (  359) AAAACAAACACGACAACCACC  1
261973                   (   48) AAAGCCACCGGGGCCGAGACC  1
9301                     (  414) CAAGCAACCGTCGCACAATCC  1
4946                     (  416) GAGACAACCAGCCCTGCCATC  1
30919                    (  329) CAGTGCACCGAGCCAAACATC  1
24956                    (   30) AACGCCAACGCGACAACATCA  1
8348                     (  356) AAGGCAACAGCGAACGAAATC  1
21977                    (  317) GAGACCAGAGAGACCACAATC  1
10424                    (  100) CAAGCAACCAAGCCAAGAAGC  1
10232                    (  221) CAAGCAGCCGTCTCTCCCACC  1
25315                    (  200) AGATCCACCGTGACCGTCTTC  1
6062                     (   44) AATTCCAACTGGCCTGCCACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 9.20752 E= 9.1e-001
    80     39      0  -1035
   179  -1035   -158  -1035
    80   -161     74   -180
    21  -1035    100    -22
 -1035    197   -158  -1035
    80    120  -1035  -1035
   179  -1035   -158  -1035
   -20    156   -158  -1035
   -79    185  -1035  -1035
   -20  -1035    158   -180
   -20     -3      0     20
 -1035     -3    174  -1035
    80     39    -58   -180
  -179    197  -1035  -1035
    80     39  -1035    -22
    53     -3     74  -1035
    21    120   -158   -180
    53    120   -158  -1035
   153  -1035  -1035    -22
 -1035    120   -158     52
   -79    185  -1035  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 13 E= 9.1e-001
 0.461538  0.307692  0.230769  0.000000
 0.923077  0.000000  0.076923  0.000000
 0.461538  0.076923  0.384615  0.076923
 0.307692  0.000000  0.461538  0.230769
 0.000000  0.923077  0.076923  0.000000
 0.461538  0.538462  0.000000  0.000000
 0.923077  0.000000  0.076923  0.000000
 0.230769  0.692308  0.076923  0.000000
 0.153846  0.846154  0.000000  0.000000
 0.230769  0.000000  0.692308  0.076923
 0.230769  0.230769  0.230769  0.307692
 0.000000  0.230769  0.769231  0.000000
 0.461538  0.307692  0.153846  0.076923
 0.076923  0.923077  0.000000  0.000000
 0.461538  0.307692  0.000000  0.230769
 0.384615  0.230769  0.384615  0.000000
 0.307692  0.538462  0.076923  0.076923
 0.384615  0.538462  0.076923  0.000000
 0.769231  0.000000  0.000000  0.230769
 0.000000  0.538462  0.076923  0.384615
 0.153846  0.846154  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[ACG]A[AG][GAT]C[CA]A[CA]C[GA][TACG][GC][AC]C[ACT][AGC][CA][CA][AT][CT]C
--------------------------------------------------------------------------------

Time 6.71 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 8 llr = 131 E-value = 2.2e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :4::11:::11::11:::3::
pos.-specific     C  8131:::::3:1:::11:::1
probability       G  3::363a:4491:966185:9
matrix            T  :58636:a63:8a:33833a:

         bits    2.1       *
                 1.9       **    *      *
                 1.7       **    *      *
                 1.5       **  * **     **
Relative         1.3 *     **  * **   * **
Entropy          1.1 * *   *** * **   * **
(23.7 bits)      0.8 * *   *** **** *** **
                 0.6 * ******* ******** **
                 0.4 ********* ***********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CTTTGTGTTGGTTGGGTGGTG
consensus            GACGTG  GC    TT TA
sequence                      T        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
30919                        47  1.03e-11 TGGATCGTGC CATTGTGTTGGTTGTGTGGTG TACTGTGATG
10232                       402  4.68e-10 TTGGTGATGA CTTTTTGTTCGTTGGTTGATG CCGTGTTGAA
263784                      285  2.50e-09 CAGCAACCAT CATGGAGTTTGTTGGGTTGTG TTGCGTGTTG
7043                        295  1.42e-08 ATTTGCTGTC CATTGGGTGAGCTGGCTGGTG GTTGGGCGAT
22848                       341  3.98e-08 GGCTCTCTCT GTCGATGTTGGTTGAGTGTTG GAGGCCTCCC
6062                        132  7.91e-08 GATTGTGGAT CCCTGTGTGTGTTGGTGGGTC AGTTTGCGTG
21768                       241  7.91e-08 TTTCTCTATC GTTCTGGTGGGTTGTGTTTTG AAGGAGGCTA
24956                       170  1.45e-07 CCAATCCTTT CTTTGTGTTCAGTAGGCGATG TACCCAAACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
30919                               1e-11  46_[+3]_433
10232                             4.7e-10  401_[+3]_78
263784                            2.5e-09  284_[+3]_195
7043                              1.4e-08  294_[+3]_185
22848                               4e-08  340_[+3]_139
6062                              7.9e-08  131_[+3]_348
21768                             7.9e-08  240_[+3]_239
24956                             1.4e-07  169_[+3]_310
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=8
30919                    (   47) CATTGTGTTGGTTGTGTGGTG  1
10232                    (  402) CTTTTTGTTCGTTGGTTGATG  1
263784                   (  285) CATGGAGTTTGTTGGGTTGTG  1
7043                     (  295) CATTGGGTGAGCTGGCTGGTG  1
22848                    (  341) GTCGATGTTGGTTGAGTGTTG  1
6062                     (  132) CCCTGTGTGTGTTGGTGGGTC  1
21768                    (  241) GTTCTGGTGGGTTGTGTTTTG  1
24956                    (  170) CTTTGTGTTCAGTAGGCGATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 10.891 E= 2.2e+000
  -965    167     11   -965
    50    -91   -965     90
  -965      9   -965    148
  -965    -91     11    122
  -109   -965    144    -10
  -109   -965     11    122
  -965   -965    211   -965
  -965   -965   -965    190
  -965   -965     70    122
  -109      9     70    -10
  -109   -965    192   -965
  -965    -91    -88    148
  -965   -965   -965    190
  -109   -965    192   -965
  -109   -965    144    -10
  -965    -91    144    -10
  -965    -91    -88    148
  -965   -965    170    -10
    -9   -965    111    -10
  -965   -965   -965    190
  -965    -91    192   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 2.2e+000
 0.000000  0.750000  0.250000  0.000000
 0.375000  0.125000  0.000000  0.500000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.125000  0.250000  0.625000
 0.125000  0.000000  0.625000  0.250000
 0.125000  0.000000  0.250000  0.625000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.375000  0.625000
 0.125000  0.250000  0.375000  0.250000
 0.125000  0.000000  0.875000  0.000000
 0.000000  0.125000  0.125000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.000000  0.875000  0.000000
 0.125000  0.000000  0.625000  0.250000
 0.000000  0.125000  0.625000  0.250000
 0.000000  0.125000  0.125000  0.750000
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.125000  0.875000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][TA][TC][TG][GT][TG]GT[TG][GCT]GTTG[GT][GT]T[GT][GAT]TG
--------------------------------------------------------------------------------

Time 9.89 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10232                            1.18e-09  220_[+2(9.33e-07)]_160_\
    [+3(4.68e-10)]_78
10424                            1.29e-02  99_[+2(8.04e-07)]_380
21425                            4.73e-10  1_[+1(1.46e-08)]_82_[+2(3.89e-10)]_\
    375
21768                            9.20e-05  240_[+3(7.91e-08)]_239
21977                            1.21e-09  40_[+1(5.79e-11)]_255_\
    [+2(7.45e-07)]_163
22848                            3.83e-09  193_[+1(1.75e-09)]_21_\
    [+1(4.58e-05)]_84_[+3(3.98e-08)]_139
24956                            3.30e-06  29_[+2(5.90e-07)]_119_\
    [+3(1.45e-07)]_310
25315                            6.59e-03  199_[+2(1.00e-06)]_280
261973                           3.52e-09  47_[+2(3.04e-08)]_26_[+2(1.73e-05)]_\
    13_[+2(5.10e-05)]_45_[+1(7.50e-09)]_285
263784                           1.53e-09  284_[+3(2.50e-09)]_9_[+1(1.58e-08)]_\
    165
268404                           4.71e-04  358_[+2(1.47e-08)]_63_\
    [+2(7.93e-06)]_37
30919                            6.47e-11  46_[+3(1.03e-11)]_261_\
    [+2(2.14e-07)]_151
37123                            9.73e-05  236_[+1(4.31e-08)]_94_\
    [+1(5.56e-05)]_128
4589                             4.60e-07  76_[+1(7.16e-05)]_41_[+1(4.30e-10)]_\
    168_[+3(3.06e-05)]_152
4946                             1.94e-03  415_[+2(1.95e-07)]_12_\
    [+2(6.65e-05)]_31
6062                             2.26e-06  43_[+2(4.67e-06)]_67_[+3(7.91e-08)]_\
    348
7043                             4.71e-08  39_[+1(7.97e-08)]_234_\
    [+3(1.42e-08)]_185
8348                             1.46e-07  124_[+1(8.90e-09)]_210_\
    [+2(6.90e-07)]_124
9301                             2.35e-03  413_[+2(1.95e-07)]_66
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************