********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/372/372.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10724                    1.0000    500  17228                    1.0000    500
18846                    1.0000    500  20880                    1.0000    500
22121                    1.0000    500  23753                    1.0000    500
25193                    1.0000    500  25234                    1.0000    500
261140                   1.0000    500  261500                   1.0000    500
3020                     1.0000    500  36088                    1.0000    500
37988                    1.0000    500  4540                     1.0000    500
6551                     1.0000    500  728                      1.0000    500
797                      1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/372/372.seqs.fa -oc motifs/372 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.275 C 0.233 G 0.243 T 0.249
Background letter frequencies (from dataset with add-one prior applied):
A 0.275 C 0.233 G 0.243 T 0.249
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =  10  llr = 154  E-value = 1.6e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  159:73:2182:1::86:5:1
pos.-specific     C  821a35a8525a459:37:78
probability       G  1::::1::3:::2312::111
matrix            T  :3:::1::1:3:32::1342:

         bits    2.1    *  *    *
                 1.9    *  *    *
                 1.7    *  *    *  *
                 1.5   **  *    *  *
Relative         1.3   **  ** * *  ** *
Entropy          1.1 * *** ** * *  ** *  *
(22.2 bits)      0.8 * *** ** * *  ** * **
                 0.6 * *** ** *** ********
                 0.4 ************ ********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CAACACCCCACCCCCAACACC
consensus             T  CA AGCT TG GCTTT
sequence              C        A GT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
3020                        452  1.60e-12 ACCGCCACGG CAACACCCCATCCCCAACACC GAGTACCAAC
261500                      449  8.03e-10 GCGAAAACAA CAACAACCGACCACCACCACC CTTAATATTG
6551                        468  2.37e-08 CAAATCCATC CCACCACCCCCCCTCAACTTC TTCACCACAA
797                         480  2.85e-08 CAACTGCCAC CTACCCCCCATCCCCGATACG
25234                       141  4.42e-08 ACTACTCTAA CAACATCATACCCGCAATACC AACTAAACAA
4540                        389  9.88e-08 TTGCCAACGC CACCAACACAACTGCAATACC CGTACTCTTC
37988                       453  9.88e-08 GCAGCATCTA CAACACCCACCCTTGACCTCC TCCCCTTGAA
10724                       475  1.89e-07 ATCTAATACA ATACACCCGATCGGCACCTCA CCACC
25193                       397  2.16e-07 GACACATGAG GCACACCCCAACTCCATCTGC TGTAATCACA
22121                       443  2.31e-07 CGGCTTTCTA CTACCGCCGACCGCCGACGTC GTCGTTTGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3020                              1.6e-12  451_[+1]_28
261500                              8e-10  448_[+1]_31
6551                              2.4e-08  467_[+1]_12
797                               2.9e-08  479_[+1]
25234                             4.4e-08  140_[+1]_339
4540                              9.9e-08  388_[+1]_91
37988                             9.9e-08  452_[+1]_27
10724                             1.9e-07  474_[+1]_5
25193                             2.2e-07  396_[+1]_83
22121                             2.3e-07  442_[+1]_37
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=10
3020                     (  452) CAACACCCCATCCCCAACACC  1
261500                   (  449) CAACAACCGACCACCACCACC  1
6551                     (  468) CCACCACCCCCCCTCAACTTC  1
797                      (  480) CTACCCCCCATCCCCGATACG  1
25234                    (  141) CAACATCATACCCGCAATACC  1
4540                     (  389) CACCAACACAACTGCAATACC  1
37988                    (  453) CAACACCCACCCTTGACCTCC  1
10724                    (  475) ATACACCCGATCGGCACCTCA  1
25193                    (  397) GCACACCCCAACTCCATCTGC  1
22121                    (  443) CTACCGCCGACCGCCGACGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 9.92248 E= 1.6e-003
  -146    178   -128   -997
    86    -22   -997     27
   171   -122   -997   -997
  -997    210   -997   -997
   135     37   -997   -997
    13    110   -128   -132
  -997    210   -997   -997
   -46    178   -997   -997
  -146    110     30   -132
   154    -22   -997   -997
   -46    110   -997     27
  -997    210   -997   -997
  -146     78    -28     27
  -997    110     30    -32
  -997    195   -128   -997
   154   -997    -28   -997
   113     37   -997   -132
  -997    159   -997     27
    86   -997   -128     68
  -997    159   -128    -32
  -146    178   -128   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 1.6e-003
 0.100000  0.800000  0.100000  0.000000
 0.500000  0.200000  0.000000  0.300000
 0.900000  0.100000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.700000  0.300000  0.000000  0.000000
 0.300000  0.500000  0.100000  0.100000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.100000  0.500000  0.300000  0.100000
 0.800000  0.200000  0.000000  0.000000
 0.200000  0.500000  0.000000  0.300000
 0.000000  1.000000  0.000000  0.000000
 0.100000  0.400000  0.200000  0.300000
 0.000000  0.500000  0.300000  0.200000
 0.000000  0.900000  0.100000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.600000  0.300000  0.000000  0.100000
 0.000000  0.700000  0.000000  0.300000
 0.500000  0.000000  0.100000  0.400000
 0.000000  0.700000  0.100000  0.200000
 0.100000  0.800000  0.100000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[ATC]AC[AC][CA]C[CA][CG][AC][CTA]C[CTG][CGT]C[AG][AC][CT][AT][CT]C
--------------------------------------------------------------------------------

Time  3.00 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  15  sites =  14  llr = 155  E-value = 6.4e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :2:::22:1::6:19
pos.-specific     C  1::2:11:1::::1:
probability       G  93:744431a91971
matrix            T  :5a163378:1211:

         bits    2.1   *      *
                 1.9   *      *
                 1.7   *      **
                 1.5 * *      ** *
Relative         1.3 * *      ** * *
Entropy          1.1 * ***  * ** * *
(16.0 bits)      0.8 * ***  **** * *
                 0.6 * ***  ********
                 0.4 *****  ********
                 0.2 *****  ********
                 0.0 ---------------

Multilevel           GTTGTGGTTGGAGGA
consensus             G CGTTG   T
sequence              A   AA

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
261140                      372  8.44e-08 TGAGGATATT GGTGTGTTTGGTGGA AAGGATGGGG
25234                        82  1.69e-07 TTGGTTGGGC GATGGAGTTGGAGGA GACGAAACAG
23753                       377  1.92e-07 GGCGTTTGGC GTTGTGATTGGAGTA TAAGAGAGAC
25193                        84  3.70e-07 CAGTGTAAGT GTTGTTATTGGATGA GGATGACCTC
3020                         13  1.25e-06 GTCCTTCGCC GATGTGTTTGGAGAA CGATTGATTC
17228                       259  1.55e-06 TGTCACTGAT GTTGGATTCGGAGGA GGTGTAACCA
20880                       306  2.27e-06 GCGACCAGAC GGTCGTGTTGGAGGG TTGGTTCACC
37988                        64  4.59e-06 CGTGTCGTTG GATGGAGGTGGATGA CGGTGCGAAT
797                         293  4.98e-06 ATTGAAGGCT CTTCTTCTTGGTGGA CTCCAAAAAC
36088                        37  6.35e-06 CTCTCGCGAA GGTGGGCGGGGAGGA AGAGCACCGC
18846                       135  6.35e-06 AAAGTCCCCG CTTCTCAGTGGAGGA AGGGATTCAC
22121                       158  1.32e-05 GAGCATCGTT GTTGTCGTTGTTGTA TTTGCAAAAG
6551                        306  1.61e-05 ACCTCGGAGC GTTGTGGGAGGGGGG AGTAGTCAGA
728                         130  2.50e-05 TGTTAGCTTG GGTTTTTTTGGGGCA AGGCAAAGGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
261140                            8.4e-08  371_[+2]_114
25234                             1.7e-07  81_[+2]_404
23753                             1.9e-07  376_[+2]_109
25193                             3.7e-07  83_[+2]_402
3020                              1.3e-06  12_[+2]_473
17228                             1.5e-06  258_[+2]_227
20880                             2.3e-06  305_[+2]_180
37988                             4.6e-06  63_[+2]_422
797                                 5e-06  292_[+2]_193
36088                             6.4e-06  36_[+2]_449
18846                             6.4e-06  134_[+2]_351
22121                             1.3e-05  157_[+2]_328
6551                              1.6e-05  305_[+2]_180
728                               2.5e-05  129_[+2]_356
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=14
261140                   (  372) GGTGTGTTTGGTGGA  1
25234                    (   82) GATGGAGTTGGAGGA  1
23753                    (  377) GTTGTGATTGGAGTA  1
25193                    (   84) GTTGTTATTGGATGA  1
3020                     (   13) GATGTGTTTGGAGAA  1
17228                    (  259) GTTGGATTCGGAGGA  1
20880                    (  306) GGTCGTGTTGGAGGG  1
37988                    (   64) GATGGAGGTGGATGA  1
797                      (  293) CTTCTTCTTGGTGGA  1
36088                    (   37) GGTGGGCGGGGAGGA  1
18846                    (  135) CTTCTCAGTGGAGGA  1
22121                    (  158) GTTGTCGTTGTTGTA  1
6551                     (  306) GTTGTGGGAGGGGGG  1
728                      (  130) GGTTTTTTTGGGGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8262 bayes= 9.80903 E= 6.4e-002
 -1045    -70    182  -1045
   -36  -1045     23    100
 -1045  -1045  -1045    200
 -1045    -12    156   -180
 -1045  -1045     56    137
   -36    -70     56     20
   -36    -70     56     20
 -1045  -1045     23    152
  -194   -170   -176    165
 -1045  -1045    204  -1045
 -1045  -1045    193   -180
   122  -1045    -77    -22
 -1045  -1045    182    -80
  -194   -170    156    -80
   164  -1045    -77  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 14 E= 6.4e-002
 0.000000  0.142857  0.857143  0.000000
 0.214286  0.000000  0.285714  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.214286  0.714286  0.071429
 0.000000  0.000000  0.357143  0.642857
 0.214286  0.142857  0.357143  0.285714
 0.214286  0.142857  0.357143  0.285714
 0.000000  0.000000  0.285714  0.714286
 0.071429  0.071429  0.071429  0.785714
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.928571  0.071429
 0.642857  0.000000  0.142857  0.214286
 0.000000  0.000000  0.857143  0.142857
 0.071429  0.071429  0.714286  0.142857
 0.857143  0.000000  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[TGA]T[GC][TG][GTA][GTA][TG]TGG[AT]GGA
--------------------------------------------------------------------------------

Time  5.72 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  15  sites =   7  llr = 98  E-value = 1.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  3:::::::31::::3
pos.-specific     C  6969aa1a6646:97
probability       G  1:::::::1::1:::
matrix            T  :141::9::363a1:

         bits    2.1     ** *    *
                 1.9     ** *    *
                 1.7     ** *    *
                 1.5  * *****    **
Relative         1.3  * *****    ***
Entropy          1.1  *******  * ***
(20.2 bits)      0.8  *******  * ***
                 0.6 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CCCCCCTCCCTCTCC
consensus            A T     ATCT  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
25234                       225  3.34e-08 CTCCTCCAAA ACTCCCTCACTCTCC ACGCCCAACC
37988                       397  7.25e-08 ACACAAGCAG CCCCCCCCCCCTTCC GACGACGAAC
261140                      144  8.31e-08 CCATCCCTCG CCCCCCTCATTCTCA TGGCGGACGT
728                         429  1.13e-07 TGTTTTTGTG CCCTCCTCCCTCTCA GTTGCTACGA
10724                       417  2.31e-07 CACACAGAGG CCTCCCTCCCTGTTC TACCGACGTG
23753                       467  5.92e-07 TCACTTGCAT GCCCCCTCGACCTCC CGTTGGACAA
17228                       131  6.41e-07 CATGTCGTAC ATTCCCTCCTCTTCC AAATAGCATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25234                             3.3e-08  224_[+3]_261
37988                             7.2e-08  396_[+3]_89
261140                            8.3e-08  143_[+3]_342
728                               1.1e-07  428_[+3]_57
10724                             2.3e-07  416_[+3]_69
23753                             5.9e-07  466_[+3]_19
17228                             6.4e-07  130_[+3]_355
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=7
25234                    (  225) ACTCCCTCACTCTCC  1
37988                    (  397) CCCCCCCCCCCTTCC  1
261140                   (  144) CCCCCCTCATTCTCA  1
728                      (  429) CCCTCCTCCCTCTCA  1
10724                    (  417) CCTCCCTCCCTGTTC  1
23753                    (  467) GCCCCCTCGACCTCC  1
17228                    (  131) ATTCCCTCCTCTTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8262 bayes= 10.8098 E= 1.8e+001
     6    129    -76   -945
  -945    188   -945    -80
  -945    129   -945     78
  -945    188   -945    -80
  -945    210   -945   -945
  -945    210   -945   -945
  -945    -70   -945    178
  -945    210   -945   -945
     6    129    -76   -945
   -94    129   -945     20
  -945     88   -945    119
  -945    129    -76     20
  -945   -945   -945    200
  -945    188   -945    -80
     6    162   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 1.8e+001
 0.285714  0.571429  0.142857  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.571429  0.000000  0.428571
 0.000000  0.857143  0.000000  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.285714  0.571429  0.142857  0.000000
 0.142857  0.571429  0.000000  0.285714
 0.000000  0.428571  0.000000  0.571429
 0.000000  0.571429  0.142857  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.857143  0.000000  0.142857
 0.285714  0.714286  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CA]C[CT]CCCTC[CA][CT][TC][CT]TC[CA]
--------------------------------------------------------------------------------

Time  8.62 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10724                            9.31e-07  416_[+3(2.31e-07)]_43_\
    [+1(1.89e-07)]_5
17228                            2.19e-05  130_[+3(6.41e-07)]_113_\
    [+2(1.55e-06)]_227
18846                            6.10e-03  134_[+2(6.35e-06)]_351
20880                            2.07e-02  305_[+2(2.27e-06)]_180
22121                            7.11e-05  157_[+2(1.32e-05)]_270_\
    [+1(2.31e-07)]_37
23753                            2.48e-06  376_[+2(1.92e-07)]_75_\
    [+3(5.92e-07)]_19
25193                            1.71e-06  83_[+2(3.70e-07)]_298_\
    [+1(2.16e-07)]_83
25234                            1.47e-11  81_[+2(1.69e-07)]_44_[+1(4.42e-08)]_\
    63_[+3(3.34e-08)]_261
261140                           1.56e-07  143_[+3(8.31e-08)]_213_\
    [+2(8.44e-08)]_114
261500                           8.32e-06  448_[+1(8.03e-10)]_31
3020                             1.07e-10  12_[+2(1.25e-06)]_424_\
    [+1(1.60e-12)]_28
36088                            1.07e-02  36_[+2(6.35e-06)]_449
37988                            1.39e-09  63_[+2(4.59e-06)]_318_\
    [+3(7.25e-08)]_41_[+1(9.88e-08)]_27
4540                             2.79e-04  308_[+1(3.57e-06)]_59_\
    [+1(9.88e-08)]_91
6551                             1.11e-07  305_[+2(1.61e-05)]_147_\
    [+1(2.37e-08)]_12
728                              2.86e-05  129_[+2(2.50e-05)]_284_\
    [+3(1.13e-07)]_57
797                              9.28e-07  292_[+2(4.98e-06)]_148_\
    [+1(1.30e-05)]_3_[+1(2.85e-08)]
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************