Recognise ß properly as a lower-case letter (bug#11309)

ß was incorrectly treated as a caseless character and thus not matched
by the regexp [[:lower:]] (or, in case-folding mode, [[:upper:]]).
The reason is that the upcase table maps it to itself, which can be
remedied by mapping it to ẞ (U+1E9E) instead.  Doing so does not
affect upcasing since the special-uppercase property maps it to SS.

* lisp/international/characters.el (tbl): Map ß to ẞ in the upcase
table.
* test/src/regex-emacs-tests.el (regexp-eszett): Uncomment previously
failing tests.  Add checks to make sure that case transformations
remain valid.
Mattias Engdegård 2020-12-09 13:27:16 +01:00
parent 445ab5cce9
commit beebd2a85e
2 changed files with 19 additions and 5 deletions

@ -759,7 +759,14 @@ with L, LRE, or LRO Unicode bidi character type.")
(funcall map-unicode-property 'uppercase
(lambda (lc uc) (aset up lc uc) (aset up uc uc)))
(funcall map-unicode-property 'lowercase
(lambda (uc lc) (aset down uc lc) (aset down lc lc))))))
(lambda (uc lc) (aset down uc lc) (aset down lc lc)))
;; Override the Unicode uppercase property for ß, since we are
;; using our case tables for determining the case of a
;; character (see uppercasep and lowercasep in buffer.h).
;; The special-uppercase property of ß ensures that it is
;; still upcased to SS per the usual convention.
(aset up ?ß ?ẞ))))
;; Clear out the extra slots so that they will be recomputed from the main
;; (downcase) table and upcase table. Since we're side-stepping the usual

@ -834,6 +834,13 @@ This evaluates the TESTS test cases from glibc."
(ert-deftest regexp-eszett ()
"Test matching of ß and ẞ."
;; Sanity checks.
(should (equal (upcase "ß") "SS"))
(should (equal (downcase "ß") "ß"))
(should (equal (capitalize "ß") "Ss")) ; undeutsch...
(should (equal (upcase "ẞ") "ẞ"))
(should (equal (downcase "ẞ") "ß"))
(should (equal (capitalize "ẞ") "ẞ"))
;; ß is a lower-case letter (Ll); ẞ is an upper-case letter (Lu).
(let ((case-fold-search nil))
(should (equal (string-match "ß" "ß") 0))
@ -842,8 +849,8 @@ This evaluates the TESTS test cases from glibc."
(should (equal (string-match "ẞ" "ẞ") 0))
(should (equal (string-match "[[:alpha:]]" "ß") 0))
;; bug#11309
;;(should (equal (string-match "[[:lower:]]" "ß") 0))
;;(should (equal (string-match "[[:upper:]]" "ß") nil))
(should (equal (string-match "[[:lower:]]" "ß") 0))
(should (equal (string-match "[[:upper:]]" "ß") nil))
(should (equal (string-match "[[:alpha:]]" "ẞ") 0))
(should (equal (string-match "[[:lower:]]" "ẞ") nil))
(should (equal (string-match "[[:upper:]]" "ẞ") 0)))
@ -854,8 +861,8 @@ This evaluates the TESTS test cases from glibc."
(should (equal (string-match "ẞ" "ẞ") 0))
(should (equal (string-match "[[:alpha:]]" "ß") 0))
;; bug#11309
;;(should (equal (string-match "[[:lower:]]" "ß") 0))
;;(should (equal (string-match "[[:upper:]]" "ß") 0))
(should (equal (string-match "[[:lower:]]" "ß") 0))
(should (equal (string-match "[[:upper:]]" "ß") 0))
(should (equal (string-match "[[:alpha:]]" "ẞ") 0))
(should (equal (string-match "[[:lower:]]" "ẞ") 0))
(should (equal (string-match "[[:upper:]]" "ẞ") 0))))
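
The effect of the change can also be inspected by hand in a running Emacs. The following is a minimal sketch, not part of the commit: the function name `my-eszett-case-info' is hypothetical, and it assumes an Emacs build that already contains this change. It collects the upcase-table entry for ß, its special-uppercase property, the two character-class matches with case folding off, and the result of `upcase'.

```elisp
;; Illustrative sketch only; `my-eszett-case-info' is a made-up name.
(require 'case-table)

(defun my-eszett-case-info ()
  "Return a list describing how Emacs currently treats ß."
  (let ((up (case-table-get-table (standard-case-table) 'up))
        (case-fold-search nil))
    (list
     ;; Entry for ß in the upcase table.  According to the commit
     ;; message this was ß itself before the change and ẞ after it.
     (aref up ?ß)
     ;; The special-uppercase property, which the commit message says
     ;; is what makes `upcase' produce SS.
     (get-char-code-property ?ß 'special-uppercase)
     ;; Character-class matches, with case folding disabled.
     (string-match "[[:lower:]]" "ß")
     (string-match "[[:upper:]]" "ß")
     ;; Upcasing a string goes through the special-uppercase property,
     ;; so it should be unaffected by the table entry.
     (upcase "ß"))))
```

Evaluating (my-eszett-case-info) in *scratch* or with M-: shows all five values at once. If the first element is still ?ß, the build predates this change and the [[:lower:]] match is expected to fail as described in bug#11309.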