Initial revision
parent
19061fd414
commit
cc6d0d2c94
2 changed files with 1456 additions and 0 deletions
765
lispref/customize.texi
Normal file
|
@ -0,0 +1,765 @@
|
|||
@c -*-texinfo-*-
|
||||
@c This is part of the GNU Emacs Lisp Reference Manual.
|
||||
@c Copyright (C) 1997, 1998 Free Software Foundation, Inc.
|
||||
@c See the file elisp.texi for copying conditions.
|
||||
@setfilename ../info/customize
|
||||
@node Customization, Loading, Macros, Top
|
||||
@chapter Writing Customization Definitions
|
||||
|
||||
This chapter describes how to declare customization groups, variables,
|
||||
and faces. We use the term @dfn{customization item} to include all
|
||||
three of those. This chapter has few examples, but please look at the file
|
||||
@file{cus-edit.el}, which contains many declarations you can learn from.
|
||||
|
||||
@menu
|
||||
* Common Keywords::
|
||||
* Group Definitions::
|
||||
* Variable Definitions::
|
||||
* Face Definitions::
|
||||
* Customization Types::
|
||||
@end menu
|
||||
|
||||
@node Common Keywords
|
||||
@section Common Keywords for All Kinds of Items
|
||||
|
||||
All three kinds of customization declarations (for groups, variables,
|
||||
and faces) accept keyword arguments for specifying various information.
|
||||
This section describes some keywords that apply to all three.
|
||||
|
||||
All of these keywords, except @code{:tag}, can be used more than once in
|
||||
a given item. Each use of the keyword has an independent effect. The
|
||||
keyword @code{:tag} is an exception because any given item can only
|
||||
display one name.
|
||||
|
||||
@table @code
|
||||
@item :group @var{group}
|
||||
Put this customization item in group @var{group}. When you use
|
||||
@code{:group} in a @code{defgroup}, it makes the new group a subgroup of
|
||||
@var{group}.
|
||||
|
||||
If you use this keyword more than once, you can put a single item into
|
||||
more than one group. Displaying any of those groups will show this
|
||||
item. Be careful not to overdo this!
|
||||
|
||||
@item :link @var{link-data}
|
||||
Include an external link after the documentation string for this item.
|
||||
This is a sentence containing an active field which references some
|
||||
other documentation.
|
||||
|
||||
There are three alternatives you can use for @var{link-data}:
|
||||
|
||||
@table @code
|
||||
@item (custom-manual @var{info-node})
|
||||
Link to an Info node; @var{info-node} is a string which specifies the
|
||||
node name, as in @code{"(emacs)Top"}. The link appears as
|
||||
@samp{[manual]} in the customization buffer.
|
||||
|
||||
@item (info-link @var{info-node})
|
||||
Like @code{custom-manual} except that the link appears
|
||||
in the customization buffer with the Info node name.
|
||||
|
||||
@item (url-link @var{url})
|
||||
Link to a web page; @var{url} is a string which specifies the URL. The
|
||||
link appears in the customization buffer as @var{url}.
|
||||
@end table
|
||||
|
||||
You can specify the text to use in the customization buffer by adding
|
||||
@code{:tag @var{name}} after the first element of the @var{link-data};
|
||||
for example, @code{(info-link :tag "foo" "(emacs)Top")} makes a link to
|
||||
the Emacs manual which appears in the buffer as @samp{foo}.
|
||||
|
||||
An item can have more than one external link; however, most items have
|
||||
none at all.
|
||||
|
||||
@item :load @var{file}
|
||||
Load file @var{file} (a string) before displaying this customization
|
||||
item. Loading is done with @code{load-library}, and only if the file is
|
||||
not already loaded.
|
||||
|
||||
@item :require @var{feature}
|
||||
Require feature @var{feature} (a symbol) when installing a value for
|
||||
this item (an option or a face) that was saved using the customization
|
||||
feature. This is done by calling @code{require}.
|
||||
|
||||
The most common reason to use @code{:require} is when a variable enables
|
||||
a feature such as a minor mode, and just setting the variable won't have
|
||||
any effect unless the code which implements the mode is loaded.
|
||||
|
||||
@item :tag @var{name}
|
||||
Use @var{name}, a string, instead of the item's name, to label the item
|
||||
in customization menus and buffers.
|
||||
@end table
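
As a rough sketch of how the common keywords combine, the following
declaration puts one option into two groups, gives it a tag, and
attaches two links. The option @code{foo-enable}, the group and
feature @code{foo}, and the URL are hypothetical names used only for
illustration; @code{editing} is assumed to be one of the standard groups.

@example
;; Hypothetical option; keyword values are evaluated, hence the quotes.
(defcustom foo-enable nil
  "*Non-nil means the hypothetical Foo feature is enabled."
  :type 'boolean
  ;; Label shown instead of the symbol name.
  :tag "Enable Foo"
  ;; Two uses of :group put the option in both groups.
  :group 'foo
  :group 'editing
  ;; Appears as [manual] in the customization buffer.
  :link '(custom-manual "(emacs)Top")
  ;; :tag replaces the URL as the visible link text.
  :link '(url-link :tag "Foo home page" "http://www.example.org/foo/")
  ;; Require the feature when a saved value is installed.
  :require 'foo)
@end example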
|
||||
|
||||
@node Group Definitions
|
||||
@section Defining Custom Groups
|
||||
|
||||
Each Emacs Lisp package should have one main customization group which
|
||||
contains all the options, faces and other groups in the package. If the
|
||||
package has a small number of options and faces, use just one group and
|
||||
put everything in it. When there are more than twelve or so options and
|
||||
faces, then you should structure them into subgroups, and put the
|
||||
subgroups under the package's main customization group. It is ok to
|
||||
have some of the options and faces in the package's main group alongside
|
||||
the subgroups.
|
||||
|
||||
The package's main or only group should be a member of one or more of
|
||||
the standard customization groups. Type @kbd{C-h p} to display a
|
||||
list of finder keywords; then choose some of them and add your group to each
|
||||
of them, using the @code{:group} keyword.
|
||||
|
||||
The way to declare new customization groups is with @code{defgroup}.
|
||||
|
||||
@tindex defgroup
|
||||
@defmac defgroup group members doc [keyword value]...
|
||||
Declare @var{group} as a customization group containing @var{members}.
|
||||
Do not quote the symbol @var{group}. The argument @var{doc} specifies
|
||||
the documentation string for the group.
|
||||
|
||||
The argument @var{members} can be an alist whose elements specify
|
||||
members of the group; however, normally @var{members} is @code{nil}, and
|
||||
you specify the group's members by using the @code{:group} keyword when
|
||||
defining those members.
|
||||
|
||||
@ignore
|
||||
@code{(@var{name} @var{widget})}. Here @var{name} is a symbol, and
|
||||
@var{widget} is a widget for editing that symbol. Useful widgets are
|
||||
@code{custom-variable} for editing variables, @code{custom-face} for
|
||||
editing faces, and @code{custom-group} for editing groups.
|
||||
@end ignore
|
||||
|
||||
In addition to the common keywords (@pxref{Common Keywords}), you can
|
||||
use this keyword in @code{defgroup}:
|
||||
|
||||
@table @code
|
||||
@item :prefix @var{prefix}
|
||||
If the name of an item in the group starts with @var{prefix}, then the
|
||||
tag for that item is constructed (by default) by omitting @var{prefix}.
|
||||
|
||||
One group can have any number of prefixes.
|
||||
@end table
|
||||
@end defmac
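
For illustration, here is a minimal sketch of a package with one main
group and one subgroup, following the layout recommended above. The
names @code{foo} and @code{foo-faces} are hypothetical; @code{tools}
and @code{faces} are assumed to be standard groups.

@example
;; Main group for a hypothetical package, placed under a standard group.
;; MEMBERS is nil; members join later through their own :group keyword.
(defgroup foo nil
  "Options for the hypothetical Foo package."
  :group 'tools)

;; Subgroup: listed under `foo' and under the standard `faces' group.
(defgroup foo-faces nil
  "Faces used by the hypothetical Foo package."
  :group 'foo
  :group 'faces)
@end example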
|
||||
|
||||
The @code{:prefix} feature is currently turned off, which means that
|
||||
@code{:prefix} currently has no effect. We did this because we found
|
||||
that discarding the specified prefixes often led to confusing names for
|
||||
options. This happened because the people who wrote the @code{defgroup}
|
||||
definitions for various groups added @code{:prefix} keywords whenever
|
||||
they made logical sense---that is, whenever they saw that there was a
|
||||
common prefix for the option names in a library.
|
||||
|
||||
In order to obtain good results with @code{:prefix}, it is necessary to
|
||||
check the specific effects of discarding a particular prefix, given the
|
||||
specific items in a group and their names and documentation. If the
|
||||
resulting text is not clear, then @code{:prefix} should not be used in
|
||||
that case.
|
||||
|
||||
It should be possible to recheck all the customization groups, delete
|
||||
the @code{:prefix} specifications which give unclear results, and then
|
||||
turn this feature back on, if someone would like to do the work.
|
||||
|
||||
@node Variable Definitions
|
||||
@section Defining Customization Variables
|
||||
|
||||
Use @code{defcustom} to declare user-editable variables.
|
||||
|
||||
@tindex defcustom
|
||||
@defmac defcustom option value doc [keyword value]...
|
||||
Declare @var{option} as a customizable user option variable that
|
||||
defaults to @var{value}. Do not quote @var{option}. @var{value} should
|
||||
be an expression to compute the value; it will be evaluated on more
|
||||
than one occasion.
|
||||
|
||||
If @var{option} is void, @code{defcustom} initializes it to @var{value}.
|
||||
|
||||
The argument @var{doc} specifies the documentation string for the variable.
|
||||
|
||||
The following additional keywords are defined:
|
||||
|
||||
@table @code
|
||||
@item :type @var{type}
|
||||
Use @var{type} as the data type for this option. It specifies which
|
||||
values are legitimate, and how to display the value.
|
||||
@xref{Customization Types}, for more information.
|
||||
|
||||
@item :options @var{list}
|
||||
Specify @var{list} as the list of reasonable values for use in this
|
||||
option.
|
||||
|
||||
Currently this is meaningful only when type is @code{hook}. The
|
||||
elements of @var{list} are functions that you might want to use
|
||||
as elements of the hook value. The user is not actually restricted to
|
||||
using only these functions, but they are offered as convenient
|
||||
alternatives.
|
||||
|
||||
@item :version @var{version}
|
||||
This option specifies that the variable's default value was changed in
|
||||
Emacs version @var{version}. For example,
|
||||
|
||||
@example
|
||||
(defcustom foo-max 34
|
||||
"*Maximum number of foo's allowed."
|
||||
:type 'integer
|
||||
:group 'foo
|
||||
:version "20.3")
|
||||
@end example
|
||||
|
||||
@item :set @var{setfunction}
|
||||
Specify @var{setfunction} as the way to change the value of this option.
|
||||
The function @var{setfunction} should take two arguments, a symbol and
|
||||
the new value, and should do whatever is necessary to update the value
|
||||
properly for this option (which may not mean simply setting the option
|
||||
as a Lisp variable). The default for @var{setfunction} is
|
||||
@code{set-default}.
|
||||
|
||||
@item :get @var{getfunction}
|
||||
Specify @var{getfunction} as the way to extract the value of this
|
||||
option. The function @var{getfunction} should take one argument, a
|
||||
symbol, and should return the ``current value'' for that symbol (which
|
||||
need not be the symbol's Lisp value). The default is
|
||||
@code{default-value}.
|
||||
|
||||
@item :initialize @var{function}
|
||||
@var{function} should be a function used to initialize the variable when
|
||||
the @code{defcustom} is evaluated. It should take two arguments, the
|
||||
symbol and value. Here are some predefined functions meant for use in
|
||||
this way:
|
||||
|
||||
@table @code
|
||||
@item custom-initialize-set
|
||||
Use the variable's @code{:set} function to initialize the variable. Do
|
||||
not reinitialize it if it is already non-void. This is the default
|
||||
@code{:initialize} function.
|
||||
|
||||
@item custom-initialize-default
|
||||
Always use @code{set-default} to initialize the variable, even if some
|
||||
other @code{:set} function has been specified.
|
||||
|
||||
@item custom-initialize-reset
|
||||
Even if the variable is already non-void, reset it by calling the
|
||||
@code{:set} function using the current value (returned by the
|
||||
@code{:get} method).
|
||||
|
||||
@item custom-initialize-changed
|
||||
Like @code{custom-initialize-reset}, except use @code{set-default}
|
||||
(rather than the @code{:set} function) to initialize the variable if it
|
||||
is not bound and has not been set already.
|
||||
@end table
|
||||
|
||||
@item :require @var{feature}
|
||||
If the user saves a customized value for this item, then Emacs should do
|
||||
@code{(require @var{feature})} after installing the saved value.
|
||||
|
||||
The place to use this feature is for an option that turns on the
|
||||
operation of a certain feature. Assuming that the package is coded to
|
||||
check the value of the option, you still need to arrange for the package
|
||||
to be loaded. That is what @code{:require} is for.
|
||||
@end table
|
||||
@end defmac
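
The keywords above might be combined as in the following sketch. All
the @code{foo-} names are hypothetical, including the helper
@code{foo-restart-timer}, which is only called if it happens to be
defined. The first form pairs a @code{:set} function with
@code{custom-initialize-default}, so that the side effect runs when the
user changes the option but not when the @code{defcustom} itself is
evaluated. The second form offers two existing functions as convenient
hook members.

@example
(defcustom foo-refresh-interval 60
  "*Seconds between refreshes in the hypothetical Foo package."
  :type 'integer
  :group 'foo
  ;; Called with the option symbol and the new value.
  :set (lambda (symbol value)
         (set-default symbol value)
         ;; Hypothetical helper; skipped if the package is not loaded.
         (if (fboundp 'foo-restart-timer)
             (foo-restart-timer)))
  ;; Initialize with plain `set-default', bypassing the :set function.
  :initialize 'custom-initialize-default)

(defcustom foo-mode-hook nil
  "*Hook run when entering the hypothetical Foo mode."
  :type 'hook
  ;; Suggestions only; the user may add any other function.
  :options '(turn-on-auto-fill turn-on-font-lock)
  :group 'foo)
@end example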
|
||||
|
||||
@ignore
|
||||
Use @code{custom-add-option} to specify that a specific function is
|
||||
useful as a member of a hook.
|
||||
|
||||
@defun custom-add-option symbol option
|
||||
To the variable @var{symbol} add @var{option}.
|
||||
|
||||
If @var{symbol} is a hook variable, @var{option} should be a hook
|
||||
member. For other types of variables, the effect is undefined.
|
||||
@end defun
|
||||
@end ignore
|
||||
|
||||
Internally, @code{defcustom} uses the symbol property
|
||||
@code{standard-value} to record the expression for the default value,
|
||||
and @code{saved-value} to record the value saved by the user with the
|
||||
customization buffer. The @code{saved-value} property is actually a
|
||||
list whose car is an expression which evaluates to the value.
|
||||
|
||||
@node Face Definitions
|
||||
@section Defining Faces
|
||||
|
||||
Faces are declared with @code{defface}.
|
||||
|
||||
@tindex defface
|
||||
@defmac defface face spec doc [keyword value]...
|
||||
Declare @var{face} as a customizable face that defaults according to
|
||||
@var{spec}. Do not quote the symbol @var{face}.
|
||||
|
||||
@var{doc} is the face documentation.
|
||||
|
||||
@var{spec} should be an alist whose elements have the form
|
||||
@code{(@var{display} @var{atts})} (see below). When @code{defface}
|
||||
executes, it defines the face according to @var{spec}, then uses any
|
||||
customizations saved in the @file{.emacs} file to override that
|
||||
specification.
|
||||
|
||||
In each element of @var{spec}, @var{atts} is a list of face attributes
|
||||
and their values. The possible attributes are defined in the variable
|
||||
@code{custom-face-attributes}.
|
||||
|
||||
The @var{display} part of an element of @var{spec} determines which
|
||||
frames the element applies to. If more than one element of @var{spec}
|
||||
matches a given frame, the first matching element is the only one used
|
||||
for that frame.
|
||||
|
||||
If @var{display} is @code{t} in a @var{spec} element, that element
|
||||
matches all frames. (This means that any subsequent elements of
|
||||
@var{spec} are never used.)
|
||||
|
||||
Alternatively, @var{display} can be an alist whose elements have the
|
||||
form @code{(@var{characteristic} @var{value}@dots{})}. Here
|
||||
@var{characteristic} specifies a way of classifying frames, and the
|
||||
@var{value}s are possible classifications which @var{display} should
|
||||
apply to. Here are the possible values of @var{characteristic}:
|
||||
|
||||
@table @code
|
||||
@item type
|
||||
The kind of window system the frame uses---either @code{x}, @code{pc}
|
||||
(for the MS-DOS console), @code{w32} (for MS Windows 9X/NT), or
|
||||
@code{tty}.
|
||||
|
||||
@item class
|
||||
What kinds of colors the frame supports---either @code{color},
|
||||
@code{grayscale}, or @code{mono}.
|
||||
|
||||
@item background
|
||||
The kind of background---either @code{light} or @code{dark}.
|
||||
@end table
|
||||
|
||||
If an element of @var{display} specifies more than one
|
||||
@var{value} for a given @var{characteristic}, any of those values
|
||||
is acceptable. If an element of @var{display} has elements for
|
||||
more than one @var{characteristic}, then @emph{each} characteristic
|
||||
of the frame must match one of the values specified for it.
|
||||
@end defmac
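
As a sketch of how @var{spec} is matched against a frame, the
hypothetical face below uses one element for color frames with a light
background, one for color frames with a dark background, and a final
@code{t} element that catches every other frame. The face name, the
group @code{foo}, and the colors are assumptions chosen for
illustration.

@example
(defface foo-highlight-face
  '(;; Both characteristics must match for this element to apply.
    (((class color) (background light))
     (:foreground "blue"))
    (((class color) (background dark))
     (:foreground "cyan"))
    ;; `t' matches all frames, so it must come last.
    (t (:underline t)))
  "Face for highlighted text in the hypothetical Foo package."
  :group 'foo)
@end example

Since only the first matching element is used, the order of the
elements matters: placing the @code{t} element first would hide the
other two.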
|
||||
|
||||
Internally, @code{defface} uses the symbol property
|
||||
@code{face-defface-spec} to record the face attributes specified in
|
||||
@code{defface}, @code{saved-face} for the attributes saved by the user
|
||||
with the customization buffer, and @code{face-documentation} for the
|
||||
documentation string.
|
||||
|
||||
@node Customization Types
|
||||
@section Customization Types
|
||||
|
||||
When you define a user option with @code{defcustom}, you must specify
|
||||
its @dfn{customization type}. That is a Lisp object which indicates (1)
|
||||
which values are legitimate and (2) how to display the value in the
|
||||
customization buffer for editing.
|
||||
|
||||
You specify the customization type in @code{defcustom} with the
|
||||
@code{:type} keyword. The argument of @code{:type} is evaluated; since
|
||||
types that vary at run time are rarely useful, normally it is a quoted
|
||||
constant. For example:
|
||||
|
||||
@example
|
||||
(defcustom diff-command "diff"
|
||||
"*The command to use to run diff."
|
||||
:type 'string
|
||||
:group 'diff)
|
||||
@end example
|
||||
|
||||
In general, a customization type is a list whose first element
|
||||
is a symbol, one of the customization type names defined in the
|
||||
following sections. After this symbol come a number of arguments,
|
||||
depending on the symbol. Some of the type symbols do not use any
|
||||
arguments; those are called @dfn{simple types}.
|
||||
|
||||
In between the type symbol and its arguments, you can optionally
|
||||
write keyword-value pairs. @xref{Type Keywords}.
|
||||
|
||||
For a simple type, if you do not use any keyword-value pairs, you can
|
||||
omit the parentheses around the type symbol. The above example does
|
||||
this, using just @code{string} as the customization type.
|
||||
But @code{(string)} would mean the same thing.
|
||||
|
||||
@menu
|
||||
* Simple Types::
|
||||
* Composite Types::
|
||||
* Splicing into Lists::
|
||||
* Type Keywords::
|
||||
@end menu
|
||||
|
||||
@node Simple Types
|
||||
@subsection Simple Types
|
||||
|
||||
This section describes all the simple customization types.
|
||||
|
||||
@table @code
|
||||
@item sexp
|
||||
The value may be any Lisp object that can be printed and read back. You
|
||||
can use @code{sexp} as a fall-back for any option, if you don't want to
|
||||
take the time to work out a more specific type to use.
|
||||
|
||||
@item integer
|
||||
The value must be an integer, and is represented textually
|
||||
in the customization buffer.
|
||||
|
||||
@item number
|
||||
The value must be a number, and is represented textually in the
|
||||
customization buffer.
|
||||
|
||||
@item string
|
||||
The value must be a string, and the customization buffer shows just the
|
||||
contents, with no @samp{"} characters or quoting with @samp{\}.
|
||||
|
||||
@item regexp
|
||||
The value must be a string which is a valid regular expression.
|
||||
|
||||
@item character
|
||||
The value must be a character code. A character code is actually an
|
||||
integer, but this type shows the value by inserting the character in the
|
||||
buffer, rather than by showing the number.
|
||||
|
||||
@item file
|
||||
The value must be a file name, and you can do completion with
|
||||
@kbd{M-@key{TAB}}.
|
||||
|
||||
@item (file :must-match t)
|
||||
The value must be a file name for an existing file, and you can do
|
||||
completion with @kbd{M-@key{TAB}}.
|
||||
|
||||
@item directory
|
||||
The value must be a directory name, and you can do completion with
|
||||
@kbd{M-@key{TAB}}.
|
||||
|
||||
@item symbol
|
||||
The value must be a symbol. It appears in the customization buffer as
|
||||
the name of the symbol.
|
||||
|
||||
@item function
|
||||
The value must be either a lambda expression or a function name. When
|
||||
it is a function name, you can do completion with @kbd{M-@key{TAB}}.
|
||||
|
||||
@item variable
|
||||
The value must be a variable name, and you can do completion with
|
||||
@kbd{M-@key{TAB}}.
|
||||
|
||||
@item boolean
|
||||
The value is boolean---either @code{nil} or @code{t}.
|
||||
@end table
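
For illustration, here are a few hypothetical options that each use one
of the simple types. The option names, default values, and the group
@code{foo} are assumptions, not part of any real package.

@example
;; A file name; M-TAB completes it in the customization buffer.
(defcustom foo-data-file "~/.foo-data"
  "*File where the hypothetical Foo package keeps its data."
  :type 'file
  :group 'foo)

;; A string that must be a valid regular expression.
(defcustom foo-comment-regexp "^#"
  "*Regexp matching comment lines in Foo data files."
  :type 'regexp
  :group 'foo)

;; Either nil or t.
(defcustom foo-verbose nil
  "*Non-nil means Foo reports its progress in the echo area."
  :type 'boolean
  :group 'foo)
@end example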
|
||||
|
||||
@node Composite Types
|
||||
@subsection Composite Types
|
||||
|
||||
When none of the simple types is appropriate, you can use composite
|
||||
types, which are built from simple types. Here are several ways of doing
|
||||
that:
|
||||
|
||||
@table @code
|
||||
@item (restricted-sexp :match-alternatives @var{criteria})
|
||||
The value may be any Lisp object that satisfies one of @var{criteria}.
|
||||
@var{criteria} should be a list, and each element should be
|
||||
one of these possibilities:
|
||||
|
||||
@itemize @bullet
|
||||
@item
|
||||
A predicate---that is, a function of one argument that returns non-@code{nil}
|
||||
if the argument fits a certain type. This means that objects of that type
|
||||
are acceptable.
|
||||
|
||||
@item
|
||||
A quoted constant---that is, @code{'@var{object}}. This means that
|
||||
@var{object} is an acceptable value.
|
||||
@end itemize
|
||||
|
||||
For example,
|
||||
|
||||
@example
|
||||
(restricted-sexp :match-alternatives (integerp 't 'nil))
|
||||
@end example
|
||||
|
||||
@noindent
|
||||
allows integers, @code{t} and @code{nil} as legitimate values.
|
||||
|
||||
The customization buffer shows all legitimate values using their read
|
||||
syntax, and the user edits them textually.
|
||||
|
||||
@item (cons @var{car-type} @var{cdr-type})
|
||||
The value must be a cons cell, its @sc{car} must fit @var{car-type}, and
|
||||
its @sc{cdr} must fit @var{cdr-type}. For example, @code{(cons string
|
||||
symbol)} is a customization type which matches values such as
|
||||
@code{("foo" . foo)}.
|
||||
|
||||
In the customization buffer, the @sc{car} and the @sc{cdr} are
|
||||
displayed and edited separately, each according to the type
|
||||
that you specify for it.
|
||||
|
||||
@item (list @var{element-types}@dots{})
|
||||
The value must be a list with exactly as many elements as the
|
||||
@var{element-types} you have specified; and each element must fit the
|
||||
corresponding @var{element-type}.
|
||||
|
||||
For example, @code{(list integer string function)} describes a list of
|
||||
three elements; the first element must be an integer, the second a
|
||||
string, and the third a function.
|
||||
|
||||
In the customization buffer, each element is displayed and edited
|
||||
separately, according to the type specified for it.
|
||||
|
||||
@item (vector @var{element-types}@dots{})
|
||||
Like @code{list} except that the value must be a vector instead of a
|
||||
list. The elements work the same as in @code{list}.
|
||||
|
||||
@item (choice @var{alternative-types}@dots{})
|
||||
The value must fit at least one of @var{alternative-types}.
|
||||
For example, @code{(choice integer string)} allows either an
|
||||
integer or a string.
|
||||
|
||||
In the customization buffer, the user selects one of the alternatives
|
||||
using a menu, and can then edit the value in the usual way for that
|
||||
alternative.
|
||||
|
||||
Normally the strings in this menu are determined automatically from the
|
||||
choices; however, you can specify different strings for the menu by
|
||||
including the @code{:tag} keyword in the alternatives. For example, if
|
||||
an integer stands for a number of spaces, while a string is text to use
|
||||
verbatim, you might write the customization type this way,
|
||||
|
||||
@smallexample
|
||||
(choice (integer :tag "Number of spaces")
|
||||
(string :tag "Literal text"))
|
||||
@end smallexample
|
||||
|
||||
@noindent
|
||||
so that the menu offers @samp{Number of spaces} and @samp{Literal text}.
|
||||
|
||||
@item (const @var{value})
|
||||
The value must be @var{value}---nothing else is allowed.
|
||||
|
||||
The main use of @code{const} is inside of @code{choice}. For example,
|
||||
@code{(choice integer (const nil))} allows either an integer or
|
||||
@code{nil}. @code{:tag} is often used with @code{const}.
|
||||
|
||||
@item (function-item @var{function})
|
||||
Like @code{const}, but used for values which are functions. This
|
||||
displays the documentation string of the function @var{function}
|
||||
as well as its name.
|
||||
|
||||
@item (variable-item @var{variable})
|
||||
Like @code{const}, but used for values which are variable names. This
|
||||
displays the documentation string of the variable @var{variable} as well
|
||||
as its name.
|
||||
|
||||
@item (set @var{elements}@dots{})
|
||||
The value must be a list and each element of the list must be one of the
|
||||
@var{elements} specified. This appears in the customization buffer as a
|
||||
checklist.
|
||||
|
||||
@item (repeat @var{element-type})
|
||||
The value must be a list and each element of the list must fit the type
|
||||
@var{element-type}. This appears in the customization buffer as a
|
||||
list of elements, with @samp{[INS]} and @samp{[DEL]} buttons for adding
|
||||
more elements or removing elements.
|
||||
@end table
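
The composite types can be nested. As a sketch, the two hypothetical
options below use @code{choice} with @code{const}, and @code{repeat}
with @code{cons}. The option names, tags, and the group @code{foo} are
illustrative assumptions.

@example
;; nil, an integer, or a string; each alternative gets a menu label.
(defcustom foo-indentation nil
  "*How the hypothetical Foo package indents continuation lines.
nil means do not indent, an integer is a number of spaces,
and a string is inserted verbatim."
  :type '(choice (const :tag "No indentation" nil)
                 (integer :tag "Number of spaces")
                 (string :tag "Literal text"))
  :group 'foo)

;; An alist: a list of any length whose elements are (STRING . SYMBOL).
(defcustom foo-mode-alist '(("txt" . text-mode))
  "*Alist mapping file name extensions to major mode symbols."
  :type '(repeat (cons (string :tag "Extension")
                       (symbol :tag "Mode")))
  :group 'foo)
@end example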
|
||||
|
||||
@node Splicing into Lists
|
||||
@subsection Splicing into Lists
|
||||
|
||||
The @code{:inline} feature lets you splice a variable number of
|
||||
elements into the middle of a list or vector. You use it in a
|
||||
@code{set}, @code{choice} or @code{repeat} type which appears among the
|
||||
element-types of a @code{list} or @code{vector}.
|
||||
|
||||
Normally, each of the element-types in a @code{list} or @code{vector}
|
||||
describes one and only one element of the list or vector. Thus, if an
|
||||
element-type is a @code{repeat}, that specifies a list of unspecified
|
||||
length which appears as one element.
|
||||
|
||||
But when the element-type uses @code{:inline}, the value it matches is
|
||||
merged directly into the containing sequence. For example, if it
|
||||
matches a list with three elements, those become three elements of the
|
||||
overall sequence. This is analogous to using @samp{,@@} in the backquote
|
||||
construct.
|
||||
|
||||
For example, to specify a list whose first element must be @code{t}
|
||||
and whose remaining elements should be zero or more of @code{foo} and
|
||||
@code{bar}, use this customization type:
|
||||
|
||||
@example
|
||||
(list (const t) (set :inline t foo bar))
|
||||
@end example
|
||||
|
||||
@noindent
|
||||
This matches values such as @code{(t)}, @code{(t foo)}, @code{(t bar)}
|
||||
and @code{(t foo bar)}.
|
||||
|
||||
When the element-type is a @code{choice}, you use @code{:inline} not
|
||||
in the @code{choice} itself, but in (some of) the alternatives of the
|
||||
@code{choice}. For example, to match a list which must start with a
|
||||
file name, followed either by the symbol @code{t} or two strings, use
|
||||
this customization type:
|
||||
|
||||
@example
|
||||
(list file
|
||||
(choice (const t)
|
||||
(list :inline t string string)))
|
||||
@end example
|
||||
|
||||
@noindent
|
||||
If the user chooses the first alternative in the choice, then the
|
||||
overall list has two elements and the second element is @code{t}. If
|
||||
the user chooses the second alternative, then the overall list has three
|
||||
elements and the second and third must be strings.
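
The same idea applies to @code{repeat}. The following sketch, which
uses a hypothetical option @code{foo-command} in a hypothetical group
@code{foo}, describes a flat list that starts with a program name and
continues with any number of argument strings.

@example
(defcustom foo-command '("foo" "-q")
  "*Program to run for the hypothetical Foo package, then its arguments."
  :type '(list (string :tag "Program")
               ;; Without :inline, the arguments would form a sublist.
               (repeat :inline t (string :tag "Argument")))
  :group 'foo)
@end example

@noindent
With @code{:inline}, values such as @code{("foo")} and
@code{("foo" "-q" "-v")} fit this type; without it, the value would
need the shape @code{("foo" ("-q" "-v"))}.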
|
||||
|
||||
@node Type Keywords
|
||||
@subsection Type Keywords
|
||||
|
||||
You can specify keyword-argument pairs in a customization type after the
|
||||
type name symbol. Here are the keywords you can use, and their
|
||||
meanings:
|
||||
|
||||
@table @code
|
||||
@item :value @var{default}
|
||||
This is used for a type that appears as an alternative inside of
|
||||
@code{choice}; it specifies the default value to use, at first, if and
|
||||
when the user selects this alternative with the menu in the
|
||||
customization buffer.
|
||||
|
||||
Of course, if the actual value of the option fits this alternative, it
|
||||
will appear showing the actual value, not @var{default}.
|
||||
|
||||
@item :format @var{format-string}
|
||||
This string will be inserted in the buffer to represent the value
|
||||
corresponding to the type. The following @samp{%} escapes are available
|
||||
for use in @var{format-string}:
|
||||
|
||||
@table @samp
|
||||
@ignore
|
||||
@item %[@var{button}%]
|
||||
Display the text @var{button} marked as a button. The @code{:action}
|
||||
attribute specifies what the button will do if the user invokes it;
|
||||
its value is a function which takes two arguments---the widget which
|
||||
the button appears in, and the event.
|
||||
|
||||
There is no way to specify two different buttons with different
|
||||
actions; but perhaps there is no need for one.
|
||||
@end ignore
|
||||
|
||||
@item %@{@var{sample}%@}
|
||||
Show @var{sample} in a special face specified by @code{:sample-face}.
|
||||
|
||||
@item %v
|
||||
Substitute the item's value. How the value is represented depends on
|
||||
the kind of item, and (for variables) on the customization type.
|
||||
|
||||
@item %d
|
||||
Substitute the item's documentation string.
|
||||
|
||||
@item %h
|
||||
Like @samp{%d}, but if the documentation string is more than one line,
|
||||
add an active field to control whether to show all of it or just the
|
||||
first line.
|
||||
|
||||
@item %t
|
||||
Substitute the tag here. You specify the tag with the @code{:tag}
|
||||
keyword.
|
||||
|
||||
@item %%
|
||||
Display a literal @samp{%}.
|
||||
@end table
|
||||
|
||||
@item :button-face @var{face}
|
||||
Use face @var{face} for text displayed with @samp{%[@dots{}%]}.
|
||||
|
||||
@item :button-prefix
|
||||
@itemx :button-suffix
|
||||
These specify the text to display before and after a button.
|
||||
Each can be:
|
||||
|
||||
@table @asis
|
||||
@item @code{nil}
|
||||
No text is inserted.
|
||||
|
||||
@item a string
|
||||
The string is inserted literally.
|
||||
|
||||
@item a symbol
|
||||
The symbol's value is used.
|
||||
@end table
|
||||
|
||||
@item :doc @var{doc}
|
||||
Use @var{doc} as the documentation string for this item.
|
||||
|
||||
@item :tag @var{tag}
|
||||
Use @var{tag} (a string) as the tag for this item.
|
||||
|
||||
@item :help-echo @var{motion-doc}
|
||||
When you move to this item with @code{widget-forward} or
|
||||
@code{widget-backward}, it will display the string @var{motion-doc}
|
||||
in the echo area.
|
||||
|
||||
@item :match @var{function}
|
||||
Specify how to decide whether a value matches the type. @var{function}
|
||||
should be a function that accepts two arguments, a widget and a value;
|
||||
it should return non-@code{nil} if the value is acceptable.
|
||||
|
||||
@ignore
|
||||
@item :indent @var{columns}
|
||||
Indent this item by @var{columns} columns. The indentation is used for
|
||||
@samp{%n}, and automatically for group names, for checklists and radio
|
||||
buttons, and for editable lists. It affects the whole of the
|
||||
item except for the first line.
|
||||
|
||||
@item :offset @var{columns}
|
||||
An integer indicating how many extra spaces to indent the subitems of
|
||||
this item. By default, subitems are indented the same as their parent.
|
||||
|
||||
@item :extra-offset
|
||||
An integer indicating how many extra spaces to add to this item's
|
||||
indentation, compared to its parent.
|
||||
|
||||
@item :notify
|
||||
A function called each time the item or a subitem is changed. The
|
||||
function is called with two or three arguments. The first argument is
|
||||
the item itself, the second argument is the item that was changed, and
|
||||
the third argument is the event leading to the change, if any.
|
||||
|
||||
@item :menu-tag
|
||||
Tag used in the menu when the widget is used as an option in a
|
||||
@code{menu-choice} widget.
|
||||
|
||||
@item :menu-tag-get
|
||||
Function used for finding the tag when the widget is used as an option
|
||||
in a @code{menu-choice} widget. By default, the tag used will be either the
|
||||
@code{:menu-tag} or @code{:tag} property if present, or the @code{princ}
|
||||
representation of the @code{:value} property if not.
|
||||
|
||||
@item :validate
|
||||
A function which takes a widget as an argument, and returns @code{nil} if the
|
||||
widget's current value is valid for the widget. Otherwise, it should
|
||||
return the widget containing the invalid data, and set that widget's
|
||||
@code{:error} property to a string explaining the error.
|
||||
|
||||
You can use the function @code{widget-children-validate} for this job;
|
||||
it tests that all children of @var{widget} are valid.
|
||||
|
||||
@item :tab-order
|
||||
Specify the order in which widgets are traversed with
|
||||
@code{widget-forward} or @code{widget-backward}. This is only partially
|
||||
implemented.
|
||||
|
||||
@enumerate a
|
||||
@item
|
||||
Widgets with tabbing order @code{-1} are ignored.
|
||||
|
||||
@item
|
||||
(Unimplemented) When on a widget with tabbing order @var{n}, go to the
|
||||
next widget in the buffer with tabbing order @var{n+1} or @code{nil},
|
||||
whichever comes first.
|
||||
|
||||
@item
|
||||
When on a widget with no tabbing order specified, go to the next widget
|
||||
in the buffer with a positive tabbing order, or @code{nil}.
|
||||
@end enumerate
|
||||
|
||||
@item :parent
|
||||
The parent of a nested widget (e.g. a @code{menu-choice} item or an
|
||||
element of an @code{editable-list} widget).
|
||||
|
||||
@item :sibling-args
|
||||
This keyword is only used for members of a @code{radio-button-choice} or
|
||||
@code{checklist}. The value should be a list of extra keyword
|
||||
arguments, which will be used when creating the @code{radio-button} or
|
||||
@code{checkbox} associated with this item.
|
||||
@end ignore
|
||||
@end table
|
691
lispref/nonascii.texi
Normal file
|
@ -0,0 +1,691 @@
|
|||
@c -*-texinfo-*-
|
||||
@c This is part of the GNU Emacs Lisp Reference Manual.
|
||||
@c Copyright (C) 1998 Free Software Foundation, Inc.
|
||||
@c See the file elisp.texi for copying conditions.
|
||||
@setfilename ../info/characters
|
||||
@node Non-ASCII Characters, Searching and Matching, Text, Top
|
||||
@chapter Non-ASCII Characters
|
||||
@cindex multibyte characters
|
||||
@cindex non-ASCII characters
|
||||
|
||||
This chapter covers the special issues relating to non-@sc{ASCII}
|
||||
characters and how they are stored in strings and buffers.
|
||||
|
||||
@menu
|
||||
* Text Representations::
|
||||
* Converting Representations::
|
||||
* Selecting a Representation::
|
||||
* Character Codes::
|
||||
* Character Sets::
|
||||
* Scanning Charsets::
|
||||
* Chars and Bytes::
|
||||
* Coding Systems::
|
||||
* Default Coding Systems::
|
||||
* Specifying Coding Systems::
|
||||
* Explicit Encoding::
|
||||
@end menu
|
||||
|
||||
@node Text Representations
|
||||
@section Text Representations
|
||||
@cindex text representations
|
||||
|
||||
Emacs has two @dfn{text representations}---two ways to represent text
|
||||
in a string or buffer. These are called @dfn{unibyte} and
|
||||
@dfn{multibyte}. Each string, and each buffer, uses one of these two
|
||||
representations. For most purposes, you can ignore the issue of
|
||||
representations, because Emacs converts text between them as
|
||||
appropriate. Occasionally in Lisp programming you will need to pay
|
||||
attention to the difference.
|
||||
|
||||
@cindex unibyte text
|
||||
In unibyte representation, each character occupies one byte and
|
||||
therefore the possible character codes range from 0 to 255. Codes 0
|
||||
through 127 are @sc{ASCII} characters; the codes from 128 through 255
|
||||
are used for one non-@sc{ASCII} character set (you can choose which one
|
||||
by setting the variable @code{nonascii-insert-offset}).
|
||||
|
||||
@cindex leading code
|
||||
@cindex multibyte text
|
||||
In multibyte representation, a character may occupy more than one
|
||||
byte, and as a result, the full range of Emacs character codes can be
|
||||
stored. The first byte of a multibyte character is always in the range
|
||||
128 through 159 (octal 0200 through 0237). These values are called
|
||||
@dfn{leading codes}. The first byte determines which character set the
|
||||
character belongs to (@pxref{Character Sets}); in particular, it
|
||||
determines how many bytes long the sequence is. The second and
|
||||
subsequent bytes of a multibyte character are always in the range 160
|
||||
through 255 (octal 0240 through 0377).
|
||||
|
||||
In a buffer, the buffer-local value of the variable
|
||||
@code{enable-multibyte-characters} specifies the representation used.
|
||||
The representation for a string is determined based on the string
|
||||
contents when the string is constructed.
|
||||
|
||||
@tindex enable-multibyte-characters
|
||||
@defvar enable-multibyte-characters
|
||||
This variable specifies the current buffer's text representation.
|
||||
If it is non-@code{nil}, the buffer contains multibyte text; otherwise,
|
||||
it contains unibyte text.
|
||||
|
||||
@strong{Warning:} do not set this variable directly; instead, use the
|
||||
function @code{set-buffer-multibyte} to change a buffer's
|
||||
representation.
|
||||
@end defvar
|
||||
|
||||
@tindex default-enable-multibyte-characters
|
||||
@defvar default-enable-multibyte-characters
|
||||
This variable's value is entirely equivalent to @code{(default-value
|
||||
'enable-multibyte-characters)}, and setting this variable changes that
|
||||
default value. Although setting the local binding of
|
||||
@code{enable-multibyte-characters} in a specific buffer is dangerous,
|
||||
changing the default value is safe, and it is a reasonable thing to do.
|
||||
|
||||
The @samp{--unibyte} command line option does its job by setting the
|
||||
default value to @code{nil} early in startup.
|
||||
@end defvar
|
||||
|
||||
@tindex multibyte-string-p
|
||||
@defun multibyte-string-p string
|
||||
Return @code{t} if @var{string} contains multibyte characters.
|
||||
@end defun
|
||||
|
||||
@node Converting Representations
|
||||
@section Converting Text Representations
|
||||
|
||||
Emacs can convert unibyte text to multibyte; it can also convert
|
||||
multibyte text to unibyte, though this conversion loses information. In
|
||||
general these conversions happen when inserting text into a buffer, or
|
||||
when putting text from several strings together in one string. You can
|
||||
also explicitly convert a string's contents to either representation.
|
||||
|
||||
Emacs chooses the representation for a string based on the text that
|
||||
it is constructed from. The general rule is to convert unibyte text to
|
||||
multibyte text when combining it with other multibyte text, because the
|
||||
multibyte representation is more general and can hold whatever
|
||||
characters the unibyte text has.
|
||||
|
||||
When inserting text into a buffer, Emacs converts the text to the
|
||||
buffer's representation, as specified by
|
||||
@code{enable-multibyte-characters} in that buffer. In particular, when
|
||||
you insert multibyte text into a unibyte buffer, Emacs converts the text
|
||||
to unibyte, even though this conversion cannot in general preserve all
|
||||
the characters that might be in the multibyte text. The other natural
|
||||
alternative, to convert the buffer contents to multibyte, is not
|
||||
acceptable because the buffer's representation is a choice made by the
|
||||
user that cannot simply be overridden.
|
||||
|
||||
Converting unibyte text to multibyte text leaves @sc{ASCII} characters
|
||||
unchanged. It converts the non-@sc{ASCII} codes 128 through 255 by
|
||||
adding the value @code{nonascii-insert-offset} to each character code.
|
||||
By setting this variable, you specify which character set the unibyte
|
||||
characters correspond to. For example, if @code{nonascii-insert-offset}
|
||||
is 2048, which is @code{(- (make-char 'latin-iso8859-1 0) 128)}, then
|
||||
the unibyte non-@sc{ASCII} characters correspond to Latin 1. If it is
|
||||
2688, which is @code{(- (make-char 'greek-iso8859-7 0) 128)}, then they
|
||||
correspond to Greek letters.
|
||||
|
||||
Converting multibyte text to unibyte is simpler: it performs
|
||||
logical-and of each character code with 255. If
|
||||
@code{nonascii-insert-offset} has a reasonable value, corresponding to
|
||||
the beginning of some character set, this conversion is the inverse of
|
||||
the other: converting unibyte text to multibyte and back to unibyte
|
||||
reproduces the original unibyte text.
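
As a rough illustration of the arithmetic described above, the following
sketch uses only integer operations, so it does not depend on the
current value of @code{nonascii-insert-offset}.  The names
@code{offset}, @code{unibyte-code}, @code{multibyte-code} and
@code{round-trip} are local variables chosen for this example; 2048 is
the Latin 1 offset quoted above.

@example
;; Illustrative only: model the two conversions with plain integers.
(let* ((offset 2048)
       (unibyte-code 200)
       ;; Unibyte to multibyte: add the offset.
       (multibyte-code (+ unibyte-code offset))
       ;; Multibyte to unibyte: logical-and with 255.
       (round-trip (logand multibyte-code 255)))
  (list multibyte-code round-trip))
@end example

Evaluating this form yields @code{(2248 200)}: the unibyte code 200
maps to 2248, and masking 2248 with 255 gives back 200.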
|
||||
|
||||
@tindex nonascii-insert-offset
|
||||
@defvar nonascii-insert-offset
|
||||
This variable specifies the amount to add to a non-@sc{ASCII} character
|
||||
when converting unibyte text to multibyte. It also applies when
|
||||
@code{insert-char} or @code{self-insert-command} inserts a character in
|
||||
the unibyte non-@sc{ASCII} range, 128 through 255.
|
||||
|
||||
The right value to use to select character set @var{cs} is @code{(-
|
||||
(make-char @var{cs} 0) 128)}. If the value of
|
||||
@code{nonascii-insert-offset} is zero, then conversion actually uses the
|
||||
value for the Latin 1 character set, rather than zero.
|
||||
@end defvar
|
||||
|
||||
@tindex nonascii-translate-table
|
||||
@defvar nonascii-translate-table
|
||||
This variable provides a more general alternative to
|
||||
@code{nonascii-insert-offset}. You can use it to specify independently
|
||||
how to translate each code in the range of 128 through 255 into a
|
||||
multibyte character. The value should be a vector, or @code{nil}.
|
||||
@end defvar
|
||||
|
||||
@tindex string-make-unibyte
|
||||
@defun string-make-unibyte string
|
||||
This function converts the text of @var{string} to unibyte
|
||||
representation, if it isn't already, and returns the result. If
|
||||
conversion does not change the contents, the value may be @var{string}
|
||||
itself.
|
||||
@end defun
|
||||
|
||||
@tindex string-make-multibyte
|
||||
@defun string-make-multibyte string
|
||||
This function converts the text of @var{string} to multibyte
|
||||
representation, if it isn't already, and returns the result. If
|
||||
conversion does not change the contents, the value may be @var{string}
|
||||
itself.
|
||||
@end defun
|
||||
|
||||
@node Selecting a Representation
|
||||
@section Selecting a Representation
|
||||
|
||||
Sometimes it is useful to examine an existing buffer or string as
|
||||
multibyte when it was unibyte, or vice versa.
|
||||
|
||||
@tindex set-buffer-multibyte
|
||||
@defun set-buffer-multibyte multibyte
|
||||
Set the representation type of the current buffer. If @var{multibyte}
|
||||
is non-@code{nil}, the buffer becomes multibyte. If @var{multibyte}
|
||||
is @code{nil}, the buffer becomes unibyte.
|
||||
|
||||
This function leaves the buffer contents unchanged when viewed as a
|
||||
sequence of bytes. As a consequence, it can change the contents viewed
|
||||
as characters; a sequence of two bytes which is treated as one character
|
||||
in multibyte representation will count as two characters in unibyte
|
||||
representation.
|
||||
|
||||
This function sets @code{enable-multibyte-characters} to record which
|
||||
representation is in use. It also adjusts various data in the buffer
|
||||
(including its overlays, text properties and markers) so that they
|
||||
cover or fall between the same text as they did before.
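
A minimal sketch of calling this function, assuming a scratch buffer
created with @code{with-temp-buffer}; the inserted text is arbitrary:

@example
(with-temp-buffer
  (insert "abc")
  ;; Reinterpret the same bytes as unibyte text.
  (set-buffer-multibyte nil)
  ;; The variable now records the new representation.
  enable-multibyte-characters)
@end example

Since the buffer has been made unibyte, the form should return
@code{nil}.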
|
||||
@end defun
|
||||
|
||||
@tindex string-as-unibyte
|
||||
@defun string-as-unibyte string
|
||||
This function returns a string with the same bytes as @var{string} but
|
||||
treating each byte as a character. This means that the value may have
|
||||
more characters than @var{string} has.
|
||||
|
||||
If @var{string} is unibyte already, then the value may be @var{string}
|
||||
itself.
|
||||
@end defun
|
||||
|
||||
@tindex string-as-multibyte
|
||||
@defun string-as-multibyte string
|
||||
This function returns a string with the same bytes as @var{string} but
|
||||
treating each multibyte sequence as one character. This means that the
|
||||
value may have fewer characters than @var{string} has.
|
||||
|
||||
If @var{string} is multibyte already, then the value may be @var{string}
|
||||
itself.
|
||||
@end defun
|
||||
|
||||
@node Character Codes
|
||||
@section Character Codes
|
||||
@cindex character codes
|
||||
|
||||
The unibyte and multibyte text representations use different character
|
||||
codes. The valid character codes for unibyte representation range from
|
||||
0 to 255---the values that can fit in one byte. The valid character
|
||||
codes for multibyte representation range from 0 to 524287, but not all
|
||||
values in that range are valid. In particular, the values 128 through
|
||||
255 are not valid in multibyte text. Only the @sc{ASCII} codes 0
|
||||
through 127 are used in both representations.
|
||||
|
||||
@defun char-valid-p charcode
|
||||
This returns @code{t} if @var{charcode} is valid for either one of the two
|
||||
text representations.
|
||||
|
||||
@example
|
||||
(char-valid-p 65)
|
||||
@result{} t
|
||||
(char-valid-p 256)
|
||||
@result{} nil
|
||||
(char-valid-p 2248)
|
||||
@result{} t
|
||||
@end example
|
||||
@end defun
|
||||
|
||||
@node Character Sets
|
||||
@section Character Sets
|
||||
@cindex character sets
|
||||
|
||||
Emacs classifies characters into various @dfn{character sets}, each of
|
||||
which has a name which is a symbol. Each character belongs to one and
|
||||
only one character set.
|
||||
|
||||
In general, there is one character set for each distinct script. For
|
||||
example, @code{latin-iso8859-1} is one character set,
|
||||
@code{greek-iso8859-7} is another, and @code{ascii} is another. An
|
||||
Emacs character set can hold at most 9025 characters; therefore, in some
|
||||
cases, a set of characters that would logically be grouped together are
|
||||
split into several character sets. For example, one set of Chinese
|
||||
characters is divided into seven Emacs character sets,
|
||||
@code{chinese-cns11643-1} through @code{chinese-cns11643-7}.
|
||||
|
||||
@tindex charsetp
|
||||
@defun charsetp object
|
||||
Return @code{t} if @var{object} is a character set name symbol,
|
||||
@code{nil} otherwise.
|
||||
@end defun
|
||||
|
||||
@tindex charset-list
|
||||
@defun charset-list
|
||||
This function returns a list of all defined character set names.
|
||||
@end defun
|
||||
|
||||
@tindex char-charset
|
||||
@defun char-charset character
|
||||
This function returns the name of the character
|
||||
set that @var{character} belongs to.
|
||||
@end defun
|
||||
|
||||
@node Scanning Charsets
|
||||
@section Scanning for Character Sets
|
||||
|
||||
Sometimes it is useful to find out which character sets appear in a
|
||||
part of a buffer or a string. One use for this is in determining which
|
||||
coding systems (@pxref{Coding Systems}) are capable of representing all
|
||||
of the text in question.
|
||||
|
||||
@tindex find-charset-region
|
||||
@defun find-charset-region beg end &optional unification
|
||||
This function returns a list of the character sets
|
||||
that appear in the current buffer between positions @var{beg}
|
||||
and @var{end}.
|
||||
@end defun
|
||||
|
||||
@tindex find-charset-string
|
||||
@defun find-charset-string string &optional unification
|
||||
This function returns a list of the character sets
|
||||
that appear in the string @var{string}.
|
||||
@end defun
|
||||
|
||||
@node Chars and Bytes
|
||||
@section Characters and Bytes
|
||||
@cindex bytes and characters
|
||||
|
||||
In multibyte representation, each character occupies one or more
|
||||
bytes. The functions in this section convert between characters and the
|
||||
byte values used to represent them.
|
||||
|
||||
@tindex char-bytes
|
||||
@defun char-bytes character
|
||||
This function returns the number of bytes used to represent the
|
||||
character @var{character}. In most cases, this is the same as
|
||||
@code{(length (split-char @var{character}))}; the only exception is for
|
||||
@sc{ASCII} characters, which use just one byte.
|
||||
|
||||
@example
|
||||
(char-bytes 2248)
|
||||
@result{} 2
|
||||
(char-bytes 65)
|
||||
@result{} 1
|
||||
@end example
|
||||
|
||||
This function's values are correct for both multibyte and unibyte
|
||||
representations, because the non-@sc{ASCII} character codes used in
|
||||
those two representations do not overlap.
|
||||
|
||||
@example
|
||||
(char-bytes 192)
|
||||
@result{} 1
|
||||
@end example
|
||||
@end defun
|
||||
|
||||
@tindex split-char
|
||||
@defun split-char character
|
||||
Return a list containing the name of the character set of
|
||||
@var{character}, followed by one or two byte-values which identify
|
||||
@var{character} within that character set.
|
||||
|
||||
@example
|
||||
(split-char 2248)
|
||||
@result{} (latin-iso8859-1 72)
|
||||
(split-char 65)
|
||||
@result{} (ascii 65)
|
||||
@end example
|
||||
|
||||
Unibyte non-@sc{ASCII} characters are considered as part of
|
||||
the @code{ascii} character set:
|
||||
|
||||
@example
|
||||
(split-char 192)
|
||||
@result{} (ascii 192)
|
||||
@end example
|
||||
@end defun
|
||||
|
||||
@tindex make-char
|
||||
@defun make-char charset &rest byte-values
|
||||
This function returns the character in character set @var{charset}
|
||||
identified by @var{byte-values}. This is roughly the opposite of
|
||||
@code{split-char}.
|
||||
|
||||
@example
|
||||
(make-char 'latin-iso8859-1 72)
|
||||
@result{} 2248
|
||||
@end example
|
||||
@end defun
|
||||
|
||||
@node Coding Systems
|
||||
@section Coding Systems
|
||||
|
||||
@cindex coding system
|
||||
When Emacs reads or writes a file, and when Emacs sends text to a
|
||||
subprocess or receives text from a subprocess, it normally performs
|
||||
character code conversion and end-of-line conversion as specified
|
||||
by a particular @dfn{coding system}.
|
||||
|
||||
@cindex character code conversion
|
||||
@dfn{Character code conversion} involves conversion between the encoding
|
||||
used inside Emacs and some other encoding. Emacs supports many
|
||||
different encodings, in that it can convert to and from them. For
|
||||
example, it can convert text to or from encodings such as Latin 1, Latin
|
||||
2, Latin 3, Latin 4, Latin 5, and several variants of ISO 2022. In some
|
||||
cases, Emacs supports several alternative encodings for the same
|
||||
characters; for example, there are three coding systems for the Cyrillic
|
||||
(Russian) alphabet: ISO, Alternativnyj, and KOI8.
|
||||
|
||||
@cindex end of line conversion
|
||||
@dfn{End of line conversion} handles three different conventions used
|
||||
on various systems for end of line. The Unix convention is to use the
|
||||
linefeed character (also called newline). The DOS convention is to use
|
||||
the two character sequence, carriage-return linefeed, at the end of a
|
||||
line. The Mac convention is to use just carriage-return.
|
||||
|
||||
Most coding systems specify a particular character code for
|
||||
conversion, but some of them leave this unspecified---to be chosen
|
||||
heuristically based on the data.
|
||||
|
||||
@cindex base coding system
|
||||
@cindex variant coding system
|
||||
@dfn{Base coding systems} such as @code{latin-1} leave the end-of-line
|
||||
conversion unspecified, to be chosen based on the data. @dfn{Variant
|
||||
coding systems} such as @code{latin-1-unix}, @code{latin-1-dos} and
|
||||
@code{latin-1-mac} specify the end-of-line conversion explicitly as
|
||||
well. Each base coding system has three corresponding variants whose
|
||||
names are formed by adding @samp{-unix}, @samp{-dos} and @samp{-mac}.
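
The naming rule can be sketched in Lisp as follows.  The function
@code{variant-names} is a hypothetical helper written for this
illustration; it is not part of Emacs, and it only builds the symbols
without checking that they name defined coding systems.

@example
;; Hypothetical helper: build the three variant names of BASE.
(defun variant-names (base)
  (mapcar (lambda (suffix)
            (intern (concat (symbol-name base) suffix)))
          '("-unix" "-dos" "-mac")))

(variant-names 'latin-1)
@end example

The last form returns the list
@code{(latin-1-unix latin-1-dos latin-1-mac)}.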
|
||||
|
||||
Here are Lisp facilities for working with coding systems:
|
||||
|
||||
@tindex coding-system-list
|
||||
@defun coding-system-list &optional base-only
|
||||
This function returns a list of all coding system names (symbols). If
|
||||
@var{base-only} is non-@code{nil}, the value includes only the
|
||||
base coding systems. Otherwise, it includes variant coding systems as well.
|
||||
@end defun
|
||||
|
||||
@tindex coding-system-p
|
||||
@defun coding-system-p object
|
||||
This function returns @code{t} if @var{object} is a coding system
|
||||
name.
|
||||
@end defun
|
||||
|
||||
@tindex check-coding-system
|
||||
@defun check-coding-system coding-system
|
||||
This function checks the validity of @var{coding-system}.
|
||||
If that is valid, it returns @var{coding-system}.
|
||||
Otherwise it signals an error with condition @code{coding-system-error}.
|
||||
@end defun
|
||||
|
||||
@tindex detect-coding-region
|
||||
@defun detect-coding-region start end highest
|
||||
This function chooses a plausible coding system for decoding the text
|
||||
from @var{start} to @var{end}. This text should be ``raw bytes''
|
||||
(@pxref{Specifying Coding Systems}).
|
||||
|
||||
Normally this function returns a list of coding systems that could
|
||||
handle decoding the text that was scanned. They are listed in order of
|
||||
decreasing priority, based on the priority specified by the user with
|
||||
@code{prefer-coding-system}. But if @var{highest} is non-@code{nil},
|
||||
then the return value is just one coding system, the one that is highest
|
||||
in priority.
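
A minimal sketch of a call, assuming the current buffer holds the raw
bytes to be examined; the result depends on the buffer contents and on
the user's priorities, so none is shown:

@example
;; Ask only for the highest-priority candidate for the whole buffer.
(detect-coding-region (point-min) (point-max) t)
@end example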
|
||||
@end defun
|
||||
|
||||
@tindex detect-coding-string
|
||||
@defun detect-coding-string string highest
|
||||
This function is like @code{detect-coding-region} except that it
|
||||
operates on the contents of @var{string} instead of bytes in the buffer.
|
||||
@end defun
|
||||
|
||||
@defun find-operation-coding-system operation &rest arguments
|
||||
This function returns the coding system to use (by default) for
|
||||
performing @var{operation} with @var{arguments}. The value has this
|
||||
form:
|
||||
|
||||
@example
|
||||
(@var{decoding-system} @var{encoding-system})
|
||||
@end example
|
||||
|
||||
The first element, @var{decoding-system}, is the coding system to use
|
||||
for decoding (in case @var{operation} does decoding), and
|
||||
@var{encoding-system} is the coding system for encoding (in case
|
||||
@var{operation} does encoding).
|
||||
|
||||
The argument @var{operation} should be an Emacs I/O primitive:
|
||||
@code{insert-file-contents}, @code{write-region}, @code{call-process},
|
||||
@code{call-process-region}, @code{start-process}, or
|
||||
@code{open-network-stream}.
|
||||
|
||||
The remaining arguments should be the same arguments that might be given
|
||||
to that I/O primitive. Depending on which primitive, one of those
|
||||
arguments is selected as the @dfn{target}. For example, if
|
||||
@var{operation} does file I/O, whichever argument specifies the file
|
||||
name is the target. For subprocess primitives, the process name is the
|
||||
target. For @code{open-network-stream}, the target is the service name
|
||||
or port number.
|
||||
|
||||
This function looks up the target in @code{file-coding-system-alist},
|
||||
@code{process-coding-system-alist}, or
|
||||
@code{network-coding-system-alist}, depending on @var{operation}.
|
||||
@xref{Default Coding Systems}.
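
For example, a lookup for a file-reading operation might be written as
in the sketch below.  The file name is hypothetical, and the value
returned depends on the contents of @code{file-coding-system-alist}, so
no result is shown.

@example
;; The file name, the target here, is made up for this example.
(find-operation-coding-system 'insert-file-contents
                              "/tmp/notes.txt")
@end example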
|
||||
@end defun
|
||||
|
||||
@node Default Coding Systems
|
||||
@section Default Coding Systems
|
||||
|
||||
These variables specify which coding system to use by default for
|
||||
certain files or when running certain subprograms. The idea of these
|
||||
variables is that you set them once and for all to the defaults you
|
||||
want, and then do not change them again. To specify a particular coding
|
||||
system for a particular operation, don't change these variables;
|
||||
instead, override them using @code{coding-system-for-read} and
|
||||
@code{coding-system-for-write} (@pxref{Specifying Coding Systems}).
|
||||
|
||||
@tindex file-coding-system-alist
|
||||
@defvar file-coding-system-alist
|
||||
This variable is an alist that specifies the coding systems to use for
|
||||
reading and writing particular files. Each element has the form
|
||||
@code{(@var{pattern} . @var{val})}, where @var{pattern} is a regular
|
||||
expression that matches certain file names. The element applies to file
|
||||
names that match @var{pattern}.
|
||||
|
||||
The @sc{cdr} of the element, @var{val}, should be either a coding
|
||||
system, a cons cell containing two coding systems, or a function symbol.
|
||||
If @var{val} is a coding system, that coding system is used for both
|
||||
reading the file and writing it. If @var{val} is a cons cell containing
|
||||
two coding systems, its @sc{car} specifies the coding system for
|
||||
decoding, and its @sc{cdr} specifies the coding system for encoding.
|
||||
|
||||
If @var{val} is a function symbol, the function must return a coding
|
||||
system or a cons cell containing two coding systems. This value is used
|
||||
as described above.
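
As a sketch of the first two element forms, the following adds two
entries to the front of the alist.  The file name patterns are
hypothetical, and the coding system names are taken from the examples
in this chapter; substitute whatever suits your files.

@example
;; Use latin-1 for both reading and writing files ending in ".l1".
(setq file-coding-system-alist
      (cons '("\\.l1\\'" . latin-1)
            file-coding-system-alist))

;; Decode with latin-1-dos, encode with latin-1-unix, for ".l1d".
(setq file-coding-system-alist
      (cons '("\\.l1d\\'" . (latin-1-dos . latin-1-unix))
            file-coding-system-alist))
@end example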
|
||||
@end defvar
|
||||
|
||||
@tindex process-coding-system-alist
|
||||
@defvar process-coding-system-alist
|
||||
This variable is an alist specifying which coding systems to use for a
|
||||
subprocess, depending on which program is running in the subprocess. It
|
||||
works like @code{file-coding-system-alist}, except that @var{pattern} is
|
||||
matched against the program name used to start the subprocess. The coding
|
||||
system or systems specified in this alist are used to initialize the
|
||||
coding systems used for I/O to the subprocess, but you can specify
|
||||
other coding systems later using @code{set-process-coding-system}.
|
||||
@end defvar
|
||||
|
||||
@tindex network-coding-system-alist
|
||||
@defvar network-coding-system-alist
|
||||
This variable is an alist that specifies the coding system to use for
|
||||
network streams. It works much like @code{file-coding-system-alist},
|
||||
with the difference that the @var{pattern} in an element may be either a
|
||||
port number or a regular expression. If it is a regular expression, it
|
||||
is matched against the network service name used to open the network
|
||||
stream.
|
||||
@end defvar
|
||||
|
||||
@tindex default-process-coding-system
|
||||
@defvar default-process-coding-system
|
||||
This variable specifies the coding systems to use for subprocess (and
|
||||
network stream) input and output, when nothing else specifies what to
|
||||
do.
|
||||
|
||||
The value should be a cons cell of the form @code{(@var{input-coding}
|
||||
. @var{output-coding})}. Here @var{input-coding} applies to input from
|
||||
the subprocess, and @var{output-coding} applies to output to it.
|
||||
@end defvar
|
||||
|
||||
@node Specifying Coding Systems
|
||||
@section Specifying a Coding System for One Operation
|
||||
|
||||
You can specify the coding system for a specific operation by binding
|
||||
the variables @code{coding-system-for-read} and/or
|
||||
@code{coding-system-for-write}.
|
||||
|
||||
@tindex coding-system-for-read
|
||||
@defvar coding-system-for-read
|
||||
If this variable is non-@code{nil}, it specifies the coding system to
|
||||
use for reading a file, or for input from a synchronous subprocess.
|
||||
|
||||
It also applies to any asynchronous subprocess or network stream, but in
|
||||
a different way: the value of @code{coding-system-for-read} when you
|
||||
start the subprocess or open the network stream specifies the input
|
||||
decoding method for that subprocess or network stream. It remains in
|
||||
use for that subprocess or network stream unless and until overridden.
|
||||
|
||||
The right way to use this variable is to bind it with @code{let} for a
|
||||
specific I/O operation. Its global value is normally @code{nil}, and
|
||||
you should not globally set it to any other value. Here is an example
|
||||
of the right way to use the variable:
|
||||
|
||||
@example
|
||||
;; @r{Read the file with no character code conversion.}
|
||||
;; @r{Assume CRLF represents end-of-line.}
|
||||
(let ((coding-system-for-read 'emacs-mule-dos))
|
||||
  (insert-file-contents filename))
|
||||
@end example
|
||||
|
||||
When its value is non-@code{nil}, @code{coding-system-for-read} takes
|
||||
precedence over all other methods of specifying a coding system to use for
|
||||
input, including @code{file-coding-system-alist},
|
||||
@code{process-coding-system-alist} and
|
||||
@code{network-coding-system-alist}.
|
||||
@end defvar
|
||||
|
||||
@tindex coding-system-for-write
|
||||
@defvar coding-system-for-write
|
||||
This works much like @code{coding-system-for-read}, except that it
|
||||
applies to output rather than input. It affects writing to files,
|
||||
subprocesses, and net connections.
|
||||
|
||||
When a single operation does both input and output, as do
|
||||
@code{call-process-region} and @code{start-process}, both
|
||||
@code{coding-system-for-read} and @code{coding-system-for-write}
|
||||
affect it.
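
A sketch of binding both variables around one such operation is shown
below.  It assumes a program named @code{cat} is available on the
system and picks @code{latin-1} arbitrarily; both are choices made for
this illustration.

@example
;; Send the buffer text to the program and insert what comes back,
;; encoding the output and decoding the input as latin-1.
(let ((coding-system-for-read 'latin-1)
      (coding-system-for-write 'latin-1))
  (call-process-region (point-min) (point-max) "cat" nil t))
@end example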
|
||||
@end defvar
|
||||
|
||||
@tindex last-coding-system-used
|
||||
@defvar last-coding-system-used
|
||||
All operations that use a coding system set this variable
|
||||
to the coding system name that was used.
|
||||
@end defvar
|
||||
|
||||
@tindex inhibit-eol-conversion
|
||||
@defvar inhibit-eol-conversion
|
||||
When this variable is non-@code{nil}, no end-of-line conversion is done,
|
||||
no matter which coding system is specified. This applies to all the
|
||||
Emacs I/O and subprocess primitives, and to the explicit encoding and
|
||||
decoding functions (@pxref{Explicit Encoding}).
|
||||
@end defvar
|
||||
|
||||
@tindex keyboard-coding-system
|
||||
@defun keyboard-coding-system
|
||||
This function returns the coding system that is in use for decoding
|
||||
keyboard input---or @code{nil} if no coding system is to be used.
|
||||
@end defun
|
||||
|
||||
@tindex set-keyboard-coding-system
|
||||
@defun set-keyboard-coding-system coding-system
|
||||
This function specifies @var{coding-system} as the coding system to
|
||||
use for decoding keyboard input. If @var{coding-system} is @code{nil},
|
||||
that means do not decode keyboard input.
|
||||
@end defun
|
||||
|
||||
@tindex terminal-coding-system
|
||||
@defun terminal-coding-system
|
||||
This function returns the coding system that is in use for encoding
|
||||
terminal output---or @code{nil} for no encoding.
|
||||
@end defun
|
||||
|
||||
@tindex set-terminal-coding-system
|
||||
@defun set-terminal-coding-system coding-system
|
||||
This function specifies @var{coding-system} as the coding system to use
|
||||
for encoding terminal output. If @var{coding-system} is @code{nil},
|
||||
that means do not encode terminal output.
|
||||
@end defun
|
||||
|
||||
See also the functions @code{process-coding-system} and
|
||||
@code{set-process-coding-system}. @xref{Process Information}.
|
||||
|
||||
See also @code{read-coding-system} in @ref{High-Level Completion}.
|
||||
|
||||
@node Explicit Encoding
|
||||
@section Explicit Encoding and Decoding
|
||||
@cindex encoding text
|
||||
@cindex decoding text
|
||||
|
||||
All the operations that transfer text in and out of Emacs have the
|
||||
ability to use a coding system to encode or decode the text.
|
||||
You can also explicitly encode and decode text using the functions
|
||||
in this section.
|
||||
|
||||
@cindex raw bytes
|
||||
The result of encoding, and the input to decoding, are not ordinary
|
||||
text. They are ``raw bytes''---bytes that represent text in the same
|
||||
way that an external file would. When a buffer contains raw bytes, it
|
||||
is most natural to mark that buffer as using unibyte representation,
|
||||
using @code{set-buffer-multibyte} (@pxref{Selecting a Representation}),
|
||||
but this is not required.
|
||||
|
||||
The usual way to get raw bytes in a buffer, for explicit decoding, is
|
||||
to read them from a file with @code{insert-file-contents-literally}
|
||||
(@pxref{Reading from Files}) or specify a non-@code{nil} @var{rawfile}
|
||||
argument when visiting a file with @code{find-file-noselect}.
|
||||
|
||||
The usual way to use the raw bytes that result from explicitly
|
||||
encoding text is to copy them to a file or process---for example, to
|
||||
write them with @code{write-region} (@pxref{Writing to Files}), and
|
||||
suppress encoding for that @code{write-region} call by binding
|
||||
@code{coding-system-for-write} to @code{no-conversion}.
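
The approach just described might look like the following sketch, which
assumes the current buffer already contains the encoded bytes; the
output file name is hypothetical.

@example
;; Write the bytes as they are, with no further encoding.
(let ((coding-system-for-write 'no-conversion))
  (write-region (point-min) (point-max) "/tmp/encoded.out"))
@end example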
|
||||
|
||||
@tindex encode-coding-region
|
||||
@defun encode-coding-region start end coding-system
|
||||
This function encodes the text from @var{start} to @var{end} according
|
||||
to coding system @var{coding-system}. The encoded text replaces
|
||||
the original text in the buffer. The result of encoding is
|
||||
``raw bytes.''
|
||||
@end defun
|
||||
|
||||
@tindex encode-coding-string
|
||||
@defun encode-coding-string string coding-system
|
||||
This function encodes the text in @var{string} according to coding
|
||||
system @var{coding-system}. It returns a new string containing the
|
||||
encoded text. The result of encoding is ``raw bytes.''
|
||||
@end defun
|
||||
|
||||
@tindex decode-coding-region
|
||||
@defun decode-coding-region start end coding-system
|
||||
This function decodes the text from @var{start} to @var{end} according
|
||||
to coding system @var{coding-system}. The decoded text replaces the
|
||||
original text in the buffer. To make explicit decoding useful, the text
|
||||
before decoding ought to be ``raw bytes.''
|
||||
@end defun
|
||||
|
||||
@tindex decode-coding-string
|
||||
@defun decode-coding-string string coding-system
|
||||
This function decodes the text in @var{string} according to coding
|
||||
system @var{coding-system}. It returns a new string containing the
|
||||
decoded text. To make explicit decoding useful, the contents of
|
||||
@var{string} ought to be ``raw bytes.''
|
||||
@end defun
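
As a small sketch of the string functions in this section, the
following encodes a string and decodes the result with the same coding
system.  The text and the choice of @code{latin-1} are arbitrary, and
the variable names are local to this example.

@example
(let* ((text "hello")
       ;; Encoding produces raw bytes.
       (raw (encode-coding-string text 'latin-1))
       ;; Decoding the raw bytes recovers the text.
       (back (decode-coding-string raw 'latin-1)))
  (string= text back))
@end example

For text made only of @sc{ASCII} characters, as here, the comparison
returns @code{t}.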
|