Add support for automatically updating Unicode derived files

Enterprise / PostgreSQL - Peter Eisentraut [eisentraut.org] - 9 January 2020 09:08 UTC

We currently have several sets of files generated from data provided by Unicode. Each set has its own ad hoc rules and instructions for updating when a new Unicode version appears, and the updates are not done consistently.

This patch centralizes and automates the process and makes it part of the release checklist. The Unicode and CLDR versions are specified in Makefile.global.in. There is a new make target "update-unicode" that downloads all the relevant files and runs the generation script.

There is also a new script for generating the table of combining characters for ucs_wcwidth(). That table is now in a separate include file rather than hardcoded into the middle of other code. The new script is based on the one that was used to generate the table in commit d8594d123c155aeecd47fc2450f62f5100b2fbf0; that earlier script was not committed at the time.
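As a rough illustration of what such a table generator does, here is a minimal Python sketch. It is not the committed Perl script. It assumes that "combining" means a general category of Mn or Me in UnicodeData.txt and that the output is a list of inclusive code point ranges formatted as C initializers. The function names and the abbreviated sample lines are hypothetical.

```python
# Hypothetical sketch: collapse combining characters into inclusive ranges.
# Assumption: a character counts as combining if its general category
# (third field of UnicodeData.txt) is Mn or Me.

def combining_ranges(lines, categories=("Mn", "Me")):
    """Return a list of (first, last) code point ranges, in input order."""
    ranges = []
    start = prev = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        codepoint = int(fields[0], 16)
        if fields[2] not in categories:
            continue
        if start is not None and codepoint == prev + 1:
            # Adjacent to the open range: extend it.
            prev = codepoint
        else:
            # Close the open range, if any, and start a new one.
            if start is not None:
                ranges.append((start, prev))
            start = prev = codepoint
    if start is not None:
        ranges.append((start, prev))
    return ranges


def format_c_table(ranges):
    """Render the ranges as lines of a C array initializer."""
    return "\n".join("\t{0x%04X, 0x%04X}," % r for r in ranges)


# Abbreviated sample lines in UnicodeData.txt field order
# (trailing fields omitted for brevity).
SAMPLE = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L",
    "0300;COMBINING GRAVE ACCENT;Mn;230;NSM",
    "0301;COMBINING ACUTE ACCENT;Mn;230;NSM",
    "0483;COMBINING CYRILLIC TITLO;Mn;230;NSM",
]

if __name__ == "__main__":
    print(format_c_table(combining_ranges(SAMPLE)))
```

A table in this shape can be searched with a binary search over the ranges, which is presumably why keeping it sorted and collapsed matters to the consumer of the include file.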

f85a485f89 Add support for automatically updating Unicode derived files
GNUmakefile.in | 4 +
contrib/unaccent/.gitignore | 3 +
contrib/unaccent/Makefile | 19 ++
contrib/unaccent/generate_unaccent_rules.py | 10 +-
src/Makefile.global.in | 18 +-
src/backend/utils/mb/Unicode/Makefile | 3 -
src/backend/utils/mb/wchar.c | 68 +-------
src/common/unicode/.gitignore | 2 +-
src/common/unicode/Makefile | 14 +-
src/common/unicode/README | 17 +-
.../unicode/generate-unicode_combining_table.pl | 52 ++++++
src/include/common/unicode_combining_table.h | 194 +++++++++++++++++++++
src/tools/RELEASE_CHANGES | 3 +
13 files changed, 313 insertions(+), 94 deletions(-)

Upstream: git.postgresql.org

