Add textsec support for confusable characters

* admin/notes/unicode: Note the confusables.txt file.
* admin/unidata/Makefile.in (${unidir}/uni-confusable.el):
Generate the confusable file.

* admin/unidata/README (https): Add confusables.txt.

* admin/unidata/confusables.txt: New file.

* admin/unidata/unidata-gen.el (unidata-gen-confusable): Parse the
confusables.txt file.

* lisp/international/textsec.el (textsec-ascii-confusable-p)
(textsec-unconfuse-string): New functions.
Lars Ingebrigtsen 2022-01-18 09:57:43 +01:00
parent 65c9f57856
commit 19fefea1ca
7 changed files with 9702 additions and 2 deletions

@@ -19,6 +19,7 @@ Emacs uses the following files from the Unicode Character Database
. ScriptExtensions.txt
. Scripts.txt
. SpecialCasing.txt
+. confusables.txt
. emoji-data.txt
. emoji-zwj-sequences.txt
. emoji-sequences.txt
@@ -27,7 +28,7 @@ Emacs uses the following files from the Unicode Character Database
Emacs also uses the file emoji-test.txt which should be imported from
the Unicode's Public/emoji/ directory.
-First, the first 13 files and emoji-test.txt need to be copied into
+First, the first 14 files and emoji-test.txt need to be copied into
admin/unidata/, and the file https://www.unicode.org/copyright.html
should be copied over copyright.html in admin/unidata (some of them
might need trailing whitespace removed before they can be committed to