Better doc fix for Bug#6283.
searching.texi (Regexp Special): Remove obsolete information about matching non-ASCII characters, and suggest using char classes (Bug#6283).
parent 2c3a3c1d03
commit ba3bf1d951
2 changed files with 13 additions and 18 deletions
@@ -1,7 +1,8 @@
 2010-06-02 Chong Yidong <cyd@stupidchicken.com>
 
-* searching.texi (Regexp Special): Replace "octal 377"
-with "#o377" (Bug#6283).
+* searching.texi (Regexp Special): Remove obsolete information
+about matching non-ASCII characters, and suggest using char
+classes (Bug#6283).
 
 2010-05-30 Juanma Barranquero <lekktu@gmail.com>
 

@@ -362,7 +362,7 @@ the two brackets are what this character alternative can match.
 
 Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
 @samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
-(including the empty string), from which it follows that @samp{c[ad]*r}
+(including the empty string). It follows that @samp{c[ad]*r}
 matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.
 
 You can also include character ranges in a character alternative, by
@@ -400,21 +400,11 @@ is @samp{@var{c}..?\377}, the other is @samp{@var{c1}..@var{c2}}, where
 @var{c1} is the first character of the charset to which @var{c2}
 belongs.
 
-You cannot always match all non-@acronym{ASCII} characters with the
-regular expression @code{"[\200-\377]"}. This works when searching a
-unibyte buffer or string (@pxref{Text Representations}), but not in a
-multibyte buffer or string, because many non-@acronym{ASCII}
-characters have codes above @code{#o377}. However, the regular
-expression @code{"[^\000-\177]"} does match all non-@acronym{ASCII}
-characters (see below regarding @samp{^}), in both multibyte and
-unibyte representations, because only the @acronym{ASCII} characters
-are excluded.
-
-A character alternative can also specify named
-character classes (@pxref{Char Classes}). This is a POSIX feature whose
-syntax is @samp{[:@var{class}:]}. Using a character class is equivalent
-to mentioning each of the characters in that class; but the latter is
-not feasible in practice, since some classes include thousands of
+A character alternative can also specify named character classes
+(@pxref{Char Classes}). This is a POSIX feature whose syntax is
+@samp{[:@var{class}:]}. Using a character class is equivalent to
+mentioning each of the characters in that class; but the latter is not
+feasible in practice, since some classes include thousands of
 different characters.
 
 @item @samp{[^ @dots{} ]}
@@ -432,6 +422,10 @@ A complemented character alternative can match a newline, unless newline is
 mentioned as one of the characters not to match. This is in contrast to
 the handling of regexps in programs such as @code{grep}.
 
+You can specify named character classes, just like in character
+alternatives. For instance, @samp{[^[:ascii:]]} matches any
+non-@acronym{ASCII} character. @xref{Char Classes}.
+
 @item @samp{^}
 @cindex beginning of line in regexp
 When matching a buffer, @samp{^} matches the empty string, but only at the