Support for Unicode emoji sequences

This covers both sequences using Zero-Width-Joiner codepoints and
those without. Bug#39799, I hope.

* .gitignore: Add emoji-zwj.el.
* admin/notes/unicode: Add emoji-zwj-sequences.txt and
emoji-sequences.txt references.  Describe how to test after updating
to a newer Unicode version.
* admin/unidata/Makefile.in (all): Add emoji-zwj.el as a dependency.
(emoji-zwj.el): Add target plus rules for building.
(gen-clean): Add emoji-zwj.el.
* admin/unidata/README: Add emoji-zwj-sequences.txt and
emoji-sequences.txt references.
* admin/unidata/blocks.awk: Force emoji script to be used for certain
codepoints that are used by the Unicode sequences.
* admin/unidata/emoji-sequences.txt: New file.
* admin/unidata/emoji-zwj-sequences.txt: New file.
* admin/unidata/emoji-zwj.awk: New file.  Derives
composition-function-table rules from emoji-zwj-sequences.txt, and
hardcodes some rules derived manually from emoji-sequences.txt.
* etc/NEWS: Announce change.
* lisp/international/characters.el: Load the generated emoji-zwj.el.
* src/Makefile.in (emoji-zwj): New target.
(temacs): Add emoji-zwj as a dependency.
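
The derivation step described for emoji-zwj.awk can be illustrated with a
small sketch. This is not the awk script from the commit: it is a
hypothetical Python approximation that only shows how data lines in the
"code points ; type ; description # comment" layout of
emoji-zwj-sequences.txt can be parsed and grouped by their leading code
point. Grouping by leading code point is an assumption about how rules
would be keyed; the function names and the sample lines are illustrative
only.

```python
from collections import defaultdict


def parse_zwj_line(line):
    """Parse one data line; return (codepoints, type, description) or None.

    Comment-only and blank lines yield None.  The layout assumed here is
    'HEX HEX ... ; type_field ; description # trailing comment'.
    """
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) < 3:
        return None
    codepoints = [int(cp, 16) for cp in fields[0].split()]
    return codepoints, fields[1], fields[2]


def group_by_leading_codepoint(lines):
    """Map each leading code point to the sequences that start with it."""
    groups = defaultdict(list)
    for line in lines:
        parsed = parse_zwj_line(line)
        if parsed is None:
            continue
        codepoints, _type, _description = parsed
        groups[codepoints[0]].append("".join(chr(cp) for cp in codepoints))
    return dict(groups)


# Illustrative lines imitating the layout of emoji-zwj-sequences.txt.
SAMPLE_LINES = [
    "# comment lines and blank lines are skipped",
    "",
    "1F468 200D 1F469 200D 1F467 ; RGI_Emoji_ZWJ_Sequence ; family: man, woman, girl # E2.0",
    "1F468 200D 1F4BB ; RGI_Emoji_ZWJ_Sequence ; man technologist",
    "1F469 200D 1F4BB ; RGI_Emoji_ZWJ_Sequence ; woman technologist",
]

GROUPS = group_by_leading_codepoint(SAMPLE_LINES)
```

With these sample lines, GROUPS has two keys, 0x1F468 and 0x1F469; the
first maps to two sequences and the second to one.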
Robert Pluim 2021-09-20 12:41:15 +02:00
parent 0b98ea5fbe
commit de289d58a4
11 changed files with 3078 additions and 9 deletions

@@ -17,13 +17,15 @@ Emacs uses the following files from the Unicode Character Database
   . NormalizationTest.txt
   . SpecialCasing.txt
   . emoji-data.txt
+  . emoji-zwj-sequences.txt
+  . emoji-sequences.txt
   . BidiCharacterTest.txt
-First, the first 8 files need to be copied into admin/unidata/, and
+First, the first 10 files need to be copied into admin/unidata/, and
 the file https://www.unicode.org/copyright.html should be copied over
-copyright.html in admin/unidata (that file and emoji-data.txt might
-need trailing whitespace removed before they can be committed to the
-Emacs repository).
+copyright.html in admin/unidata (some of them might need trailing
+whitespace removed before they can be committed to the Emacs
+repository).
 Then Emacs should be rebuilt for them to take effect.  Rebuilding
 Emacs updates several derived files elsewhere in the Emacs source
@@ -112,6 +114,11 @@ FONT-NAME-REGEXP is checked using `string-match'."
         (princ (format "Font not matching '%s' was used for the following characters:\n%s"
                        font-name-regexp (reverse res))))))))
+Visit "emoji-zwj-sequences.txt" and "emoji-sequences.txt" with the
+rebuilt Emacs, and check that the sample sequences are composed
+properly.  Note that your emoji font might not have glyphs for the
+newest codepoints yet.
 Finally, etc/NEWS should be updated to announce the support for the
 new Unicode version.