mirror of https://github.com/masscollaborationlabs/emacs.git
synced 2025-07-03 10:53:23 +00:00
Import Unicode 12.0 data files

* admin/unidata/copyright.html:
* admin/unidata/UnicodeData.txt:
* admin/unidata/SpecialCasing.txt:
* admin/unidata/NormalizationTest.txt:
* admin/unidata/Blocks.txt:
* admin/unidata/BidiMirroring.txt:
* admin/unidata/BidiBrackets.txt: New versions from Unicode 12.0.
* admin/unidata/unidata-gen.el (unidata-gen-file):
* admin/unidata/blocks.awk (name2alias): Adapt to changes in new
data files.
* admin/notes/unicode: Update and improve instructions for
importing a new Unicode Standard.
* lisp/international/characters.el (char-width-table): Update
lists of characters according to Unicode 12.0.
* lisp/international/fontset.el (script-representative-chars): Add
characters from new scripts to 'script-representative-chars'.
(otf-script-alist): Update according to data on the MS site.
* lisp/international/mule-cmds.el (ucs-names): Update unused
ranges of codepoints according to Unicode 12.0.
* test/lisp/international/ucs-normalize-tests.el
(ucs-normalize-tests--failing-lines-part1)
(ucs-normalize-tests--failing-lines-part2): Update for the new
NormalizationTest.txt file.
* test/manual/BidiCharacterTest.txt: Update with the new version
from Unicode 12.0.
parent 4e082ce394
commit fddb915d23
15 changed files with 791 additions and 211 deletions
@@ -11,15 +11,20 @@ Emacs uses the following files from the Unicode Character Database
 
   . UnicodeData.txt
   . Blocks.txt
-  . BidiMirroring.txt
   . BidiBrackets.txt
+  . BidiCharacterTest.txt
+  . BidiMirroring.txt
   . IVD_Sequences.txt
   . NormalizationTest.txt
   . SpecialCasing.txt
-  . BidiCharacterTest.txt
 
 First, the first 7 files need to be copied into admin/unidata/, and
-then Emacs should be rebuilt for them to take effect. Rebuilding
+the file https://www.unicode.org/copyright.html should be copied over
+copyright.html in admin/unidata (that file might need trailing
+whitespace removed before it can be committed to the Emacs
+repository).
+
+Then Emacs should be rebuilt for them to take effect. Rebuilding
 Emacs updates several derived files elsewhere in the Emacs source
 tree, mainly in lisp/international/.
 
@@ -28,7 +33,10 @@ files, pay attention to any warning or error messages. In particular,
 admin/unidata/unidata-gen.el will complain if UnicodeData.txt defines
 new bidirectional attributes of characters, because unidata-gen.el,
 bidi.c and dispextern.h need to be updated in that case; failure to do
-so will cause aborts in redisplay.
+so will cause aborts in redisplay. unidata-gen.el will also complain
+if the format of the Unicode Copyright notice in copyright.html
+changed in significant ways; in that case, update the regular
+expression in unidata-gen-file used to extract the copyright string.
 
 Next, review the changes in UnicodeData.txt vs the previous version
 used by Emacs. Any changes, be it introduction of new scripts or
@@ -40,7 +48,12 @@ and see if any changes in admin/unidata/blocks.awk are required.
 
 The setting of char-width-table around line 1200 of characters.el
 should be checked against the latest version of the Unicode file
-EastAsianWidth.txt, and any discrepancies fixed.
+EastAsianWidth.txt, and any discrepancies fixed: double-width
+characters are those marked with W or F in that file. Zero-width
+characters are not taken from EastAsianWidth.txt, they are those whose
+Unicode General Category property is one of Mn, Me, or Cf, and also
+Hangul jungseong and jongseong characters (a.k.a. "Jamo medial vowels"
+and "Jamo final consonants").
 
 Any new scripts added by UnicodeData.txt will also need updates to
 script-representative-chars defined in fontset.el, and also the list
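
The width rules added to the notes in the last hunk can be sketched in Python as a rough cross-check when reviewing char-width-table by hand. This is not how Emacs builds the table (that is done in characters.el); it only illustrates the two criteria. The helper names parse_wide_ranges and is_zero_width are hypothetical, the sample lines are illustrative lines written in the EastAsianWidth.txt format, and the range U+1160..U+11FF for Hangul jungseong and jongseong is an assumption based on the Hangul Jamo block. Note also that Python's unicodedata module reflects whatever Unicode version the interpreter bundles, which may differ from the version being imported.

```python
import unicodedata

# Assumption: Hangul Jamo medial vowels (jungseong) and final consonants
# (jongseong) occupy U+1160..U+11FF in the Hangul Jamo block.
HANGUL_JUNG_JONG = range(0x1160, 0x1200)


def parse_wide_ranges(lines):
    """Return (first, last) code point pairs marked W or F."""
    ranges = []
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue  # comment-only or blank line
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2 or fields[1] not in ("W", "F"):
            continue
        first, _, last = fields[0].partition("..")
        # A single code point has no "..", so it is its own range end.
        ranges.append((int(first, 16), int(last or first, 16)))
    return ranges


def is_zero_width(cp):
    """True if cp is Mn, Me, or Cf, or a Hangul jungseong/jongseong."""
    return (unicodedata.category(chr(cp)) in ("Mn", "Me", "Cf")
            or cp in HANGUL_JUNG_JONG)


# Illustrative lines in the EastAsianWidth.txt format (not a full file).
sample = [
    "# EastAsianWidth sample",
    "0041;Na          # Lu       LATIN CAPITAL LETTER A",
    "1100..115F;W     # Lo  [96] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG FILLER",
    "FF01..FF03;F     # Po   [3] FULLWIDTH EXCLAMATION MARK..FULLWIDTH NUMBER SIGN",
]
wide = parse_wide_ranges(sample)
```

To use it against a real data file, pass the open file object of a downloaded EastAsianWidth.txt to parse_wide_ranges and compare the resulting ranges with the double-width entries in characters.el.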