Document textsec

* doc/lispref/elisp.texi (Top): Add menu.
* doc/lispref/text.texi (Text): Add menu.
(Suspicious Text): New node.
* lisp/international/textsec-check.el (textsec-check): Adjust doc string.
2022-01-20 08:38:16 +01:00
commit 2a3edd1e0a
parent 7cfc0bd6a9
4 changed files with 108 additions and 21 deletions
--- a/doc/lispref/elisp.texi
+++ b/doc/lispref/elisp.texi
@@ -1228,6 +1228,7 @@ Text
 * Decompression::           Dealing with compressed data.
 * Base 64::                 Conversion to or from base 64 encoding.
 * Checksum/Hash::           Computing cryptographic hashes.
+* Suspicious Text::         Determining whether a string is suspicious.
 * GnuTLS Cryptography::     Cryptographic algorithms imported from GnuTLS.
 * Database::                Interacting with an SQL database.
 * Parsing HTML/XML::        Parsing HTML and XML.
--- a/doc/lispref/text.texi
+++ b/doc/lispref/text.texi
@@ -59,6 +59,7 @@ the character after point.
 * Decompression::    Dealing with compressed data.
 * Base 64::          Conversion to or from base 64 encoding.
 * Checksum/Hash::    Computing cryptographic hashes.
+* Suspicious Text::  Determining whether a string is suspicious.
 * GnuTLS Cryptography:: Cryptographic algorithms imported from GnuTLS.
 * Database::         Interacting with an SQL database.
 * Parsing HTML/XML:: Parsing HTML and XML.
@@ -4943,6 +4944,80 @@ It should be somewhat more efficient on larger buffers than
 @c according to what we find useful.
 @end defun

+@node Suspicious Text
+@section Suspicious Text
+
+Emacs can display data from many external sources, like mail and web
+pages.  Attackers may attempt to confuse the user reading this data by
+using obfuscated @acronym{URL}s or email addresses, and tricking the
+user into visiting a web page they didn't intend to visit, or sending
+an email to the wrong address.
+
+This usually involves using characters from scripts that visually look
+like @acronym{ASCII} characters (i.e., are homoglyphs), but there are
+also other techniques used, like using bidirectional overrides, or
+having an @acronym{HTML} link text that says one thing, while the
+underlying @acronym{URL} points somewhere else.
+
+To help identify these @dfn{suspicious strings}, Emacs provides a
+library to do a number of checks.  (See
+@url{https://www.unicode.org/reports/tr39/} for the rationale behind
+the checks that are available.)  Packages that present data that might
+be suspicious should use this library.
+
+@vindex textsec-check
+@defun textsec-check object type
+This function is the high-level interface function that packages
+should use.  It respects the @code{textsec-check} user option, which
+allows the user to disable the checks.
+
+This function checks @var{object} to see if it looks suspicious when
+interpreted as a thing of @var{type}.  The available types are:
+
+@table @code
+@item domain
+Check whether a domain (e.g., @samp{www.gnu.org}) looks suspicious.
+
+@item url
+Check whether an @acronym{URL} (e.g., @samp{http://gnu.org/foo/bar})
+looks suspicious.
+
+@item link
+Check whether an @acronym{HTML} link (e.g., @samp{<a
+href='http://gnu.org'>fsf.org</a>}) looks suspicious.  In this case,
+@var{object} should be a @code{cons} cell where the @code{car} is the
+@acronym{URL} and the @code{cdr} is the link text.  The link is deemed
+suspicious if the link text contains a domain name, and that domain
+name points to something other than the @acronym{URL}.
+
+@item email-address
+Check whether an email address (e.g., @samp{foo@@example.org}) looks
+suspicious.
+
+@item local-address
+Check whether the local part of an email address (the bit before the
+@samp{@@} sign) looks suspicious.
+
+@item name
+Check whether a name (used in an email address header) looks suspicious.
+
+@item email-address-header
+Check whether a full RFC2822 email address header (e.g.,
+@samp{=?utf-8?Q?=C3=81?= <foo@@example.com>}) looks suspicious.
+@end table
+
+If @var{object} is suspicious, this function will return a string that
+explains why it is suspicious.  If @var{object} is not suspicious, it
+returns @code{nil}.
+@end defun
+
+If the text is suspicious, the application should mark the suspicious
+text with the @code{textsec-suspicious} face, and make the explanation
+returned by @code{textsec-check} available to the user.  The
+application might also prompt the user before taking any action on a
+suspicious string (like sending an email to a suspicious email
+address).
+
 @node GnuTLS Cryptography
 @section GnuTLS Cryptography
 @cindex MD5 checksum
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -960,6 +960,7 @@ The input must be encoded text.

 * Lisp Changes in Emacs 29.1

+--
 ** New function 'bidi-string-strip-control-characters'.
 This utility function is meant for displaying strings when it's
 essential that there's no bidirectional context.
@@ -1007,6 +1008,32 @@ This event is sent when a user performs a pinch gesture on a touchpad,
 which is comprised of placing two fingers on the touchpad and moving
 them towards or away from each other.

+** Text security and suspiciousness
+
+++
+*** New library textsec.el.
+This library contains a number of checks for whether a string is
+"suspicious".  This usually means that the string contains characters
+that have glyphs that can be confused with other, more commonly used
+glyphs, or contain bidirectional (or other) formatting characters that
+may be used to confuse a user.
+
+++
+*** New user option 'textsec-check'.
+If non-nil (which is the default), Emacs packages that are vulnerable
+to attackers trying to confuse the users will use the textsec library
+to mark suspicious text.  For instance shr/eww will mark suspicious
+URLs and links, and Gnus will mark suspicious From addresses, and
+Message will query the user if the user is sending mail to a
+suspicious address.  If this variable is nil, these checks aren't
+performed.
+
+++
+*** New function 'textsec-check'.
+This is the main function Emacs applications should be using to check
+whether a string is suspicious.  It heeds the 'textsec-check' user
+option.
+
 ** Keymaps and key definitions

 +++
--- a/lisp/international/textsec-check.el
+++ b/lisp/international/textsec-check.el
@@ -39,13 +39,13 @@ If nil, these checks are disabled."
  "Face used to highlight suspicious strings.")

 ;;;###autoload
-(defun textsec-check (string type)
-  "Test whether STRING is suspicious when considered as TYPE.
-If STRING is suspicious, a string explaining the possible problem
+(defun textsec-check (object type)
+  "Test whether OBJECT is suspicious when considered as TYPE.
+If OBJECT is suspicious, a string explaining the possible problem
 is returned.

 Available types include `url', `link', `domain', `local-address',
-`name', `email-address', and `email-address-headers'.
+`name', `email-address', and `email-address-header'.

 If the `textsec-check' user option is nil, these checks are
 disabled, and this function always returns nil."
@@ -55,23 +55,7 @@ disabled, and this function always returns nil."
    (let ((func (intern (format "textsec-%s-suspicious-p" type))))
      (unless (fboundp func)
        (error "%s is not a valid function" func))
-      (funcall func string))))
-
-;;;###autoload
-(defun textsec-propertize (string type)
-  "Test whether STRING is suspicious when considered as TYPE.
-If STRING is suspicious, text properties will be added to the
-string to mark it as suspicious, and with tooltip texts that says
-what's suspicious about it.  Otherwise STRING is returned
-verbatim.
-
-See `texsec-check' for further information about TYPE."
-  (let ((warning (textsec-check string type)))
-    (if (not warning)
-        string
-      (propertize string
-                  'face 'textsec-suspicious
-                  'help-echo warning))))
+      (funcall func object))))

 (provide 'textsec-check)
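
Usage note: this commit removes `textsec-propertize' and documents that applications should call `textsec-check' themselves, mark suspicious text with the `textsec-suspicious' face, and make the explanation available to the user. A minimal sketch of how a caller might do that follows. It assumes Emacs 29.1 or later; the helper name `my-textsec-mark' is hypothetical and not part of Emacs, and using `help-echo' to surface the explanation is only one possible choice (it mirrors what the removed function did).

```elisp
(require 'textsec-check)   ; provides `textsec-check' and the `textsec-suspicious' face

;; Hypothetical helper, not part of Emacs: roughly what the removed
;; `textsec-propertize' did, written on the caller's side.
(defun my-textsec-mark (string type)
  "Return STRING, marked as suspicious if `textsec-check' flags it as TYPE.
The explanation returned by `textsec-check' is attached as a tooltip."
  (let ((warning (textsec-check string type)))
    (if warning
        (propertize string
                    'face 'textsec-suspicious
                    'help-echo warning)
      string)))

;; Plain string checks take the string itself as OBJECT.
(my-textsec-mark "www.gnu.org" 'domain)

;; A `link' check takes a cons cell: the car is the URL, the cdr is
;; the link text.  The result is an explanation string or nil.
(textsec-check (cons "http://gnu.org" "fsf.org") 'link)
```

If the `textsec-check' user option is nil, `textsec-check' always returns nil, so the helper returns its argument unchanged and no separate test of the option is needed in calling code.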