Improve accuracy of character categories

* lisp/international/characters.el: Assign 'digit' category to all
the characters whose Unicode 'general-category' is Nd.

* admin/unidata/blocks.awk: Add code to assign 'symbol' category
to all characters belonging to the 'symbol' script.

* etc/NEWS: Announce the above changes.
Eli Zaretskii 2024-09-13 14:31:28 +03:00
parent 04e8ad6489
commit 7376623a24
3 changed files with 31 additions and 2 deletions
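
A minimal sketch for checking the 'digit' category described in the commit message, assuming an Emacs build that includes this change. The helper names `my-digit-category-p` and `my-general-category-nd-p` are hypothetical and not part of the commit; they only wrap the standard functions `char-category-set` and `get-char-code-property`.

```elisp
;; Hypothetical helpers for illustration; not part of this commit.

(defun my-digit-category-p (char)
  "Return t if CHAR has the `digit' category (mnemonic ?6)."
  ;; `char-category-set' returns a bool-vector indexed by category mnemonic.
  (and (aref (char-category-set char) ?6) t))

(defun my-general-category-nd-p (char)
  "Return t if the Unicode general-category of CHAR is Nd."
  (eq (get-char-code-property char 'general-category) 'Nd))

;; With this commit applied, the two predicates are expected to agree,
;; for example on U+0663 ARABIC-INDIC DIGIT THREE:
(list (my-general-category-nd-p #x0663)
      (my-digit-category-p #x0663))
```

Comparing the two predicates over a range of characters is one way to spot characters that an older build leaves without the category.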

admin/unidata/blocks.awk

@@ -278,6 +278,10 @@ END {
print " (or (memq (nth 2 elt) script-list)"
print " (setq script-list (cons (nth 2 elt) script-list))))"
print " (set-char-table-extra-slot char-script-table 0 (nreverse script-list)))"
-print "\n"
-print "(provide 'charscript)"
+print "\n(map-char-table"
+print " (lambda (ch script)"
+print " (and (eq script 'symbol)"
+print " (modify-category-entry ch ?5)))"
+print " char-script-table)"
+print "\n(provide 'charscript)"
}
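
The Lisp emitted by the awk fragment above walks `char-script-table` and gives the category with mnemonic '5' to every entry whose script is 'symbol'. A rough sketch of how one might check the result for a single character follows; the helper names are hypothetical, and U+2192 is only a sample character assumed here to be a reasonable test case.

```elisp
;; Hypothetical helpers for illustration; not part of this commit.

(defun my-symbol-script-p (char)
  "Return t if `char-script-table' maps CHAR to the `symbol' script."
  (eq (aref char-script-table char) 'symbol))

(defun my-symbol-category-p (char)
  "Return t if CHAR has the `symbol' category (mnemonic ?5)."
  (and (aref (char-category-set char) ?5) t))

;; On a build that includes this commit, the two values are expected
;; to match for any character; U+2192 RIGHTWARDS ARROW is a sample.
(list (my-symbol-script-p #x2192)
      (my-symbol-category-p #x2192))
```

The generated code can pass the key straight to `modify-category-entry` because `map-char-table` may call its function with either a single character or a cons `(FROM . TO)` range, and `modify-category-entry` accepts both forms.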

etc/NEWS

@@ -339,6 +339,18 @@ That convention was: '(error &rest ARGS)'.
** The 'rx' category name 'chinese-two-byte' must now be spelled correctly.
An old alternative name (without the first 'e') has been removed.
---
** All the digit characters now have the 'digit' category.
All the characters whose Unicode general-category is Nd now have the
'digit' category, whose mnemonic is '6'. This includes both ASCII and
non-ASCII digit characters.
---
** All the symbol characters now have the 'symbol' category.
All the characters that belong to the 'symbol' script (according to
'char-script-table') now have the 'symbol' category, whose mnemonic is
'5'.
* Lisp Changes in Emacs 31.1

lisp/international/characters.el

@@ -849,6 +849,19 @@ with L, LRE, or LRO Unicode bidi character type.")
;; Fixme: syntax for symbols &c
)
;; Symbols and digits
;;; Each character whose script is 'symbol' gets the symbol category,
;;; see charscript.el.
;;; Each character whose Unicode general-category is Nd gets the digit
;;; category:
(let ((table (unicode-property-table-internal 'general-category)))
  (when table
    (map-char-table (lambda (key val)
                      (if (eq val 'Nd)
                          (modify-category-entry key ?6)))
                    table)))
(let ((pairs
'("⁅⁆" ; U+2045 U+2046
"⁽⁾" ; U+207D U+207E