CHARACTER-PROPOSAL

This is KMP's plain-text transcription of the issues which comprise the Character Proposal. This isn't what was voted on, but it may be easier to use than the one that was, since it's not full of TeX codes.

================================================================================
Proposal 2.0.1: [Passed 03/89]
The terminology introduced in this proposal will be included in the language specification at the discretion of the editor.
Proposal 2.1.1: [Alternative A, Passed as Modified 03/89]
Remove all discussion of attributes from the language specification. Add the following discussion: ``Earlier versions of Common LISP incorporated FONT and BITS as attributes of character objects. These and other supported attributes are considered implementation-defined attributes and, if supported by an implementation, affect the action of selected functions.'' All types, constants and functions dealing with the BITS and FONT attributes are either removed or modified as follows:
* Modify CHAR=: If two characters differ in any implementation-defined attributes, then they are not CHAR=.
* Modify CHAR<: If two characters have identical implementation-defined attributes, then their ordering by CHAR< is consistent with the numerical ordering by the predicate < on their code. (Similarly for CHAR>, CHAR>= and CHAR<=.)
* Modify CHAR-EQUAL: The effect, if any, on CHAR-EQUAL of each implementation-defined attribute has to be specified as part of the definition of that attribute (and similarly for CHAR-NOT-EQUAL, CHAR-LESSP, CHAR-GREATERP, CHAR-NOT-GREATERP, CHAR-NOT-LESSP).
* Modify CHAR-UPCASE and CHAR-DOWNCASE: The effect of CHAR-UPCASE and CHAR-DOWNCASE is to preserve implementation-defined attributes.
* Modify READ: It is implementation dependent which attributes are removed from symbol names. It is implementation dependent which attributes are removed from characters within double quotes.
* Modify INTERN: It is implementation dependent which implementation-defined attributes are removed.
* Modify DIGIT-CHAR: remove the optional FONT argument.
* Modify CODE-CHAR: remove the optional FONT and BITS arguments.
* Remove CHAR-FONT-LIMIT
* Remove CHAR-BITS-LIMIT
* Remove INT-CHAR
* Remove CHAR-INT <<This removal is later rescinded by 2.1.2. See below. -kmp 2-Aug-89>>
* Remove CHAR-BITS
* Remove CHAR-FONT
* Remove MAKE-CHAR
* Remove CHAR-CONTROL-BIT
* Remove CHAR-META-BIT
* Remove CHAR-SUPER-BIT
* Remove CHAR-HYPER-BIT
* Remove CHAR-BIT
* Remove SET-CHAR-BIT
* Remove STRING-CHAR and STRING-CHAR-P
* Modify readtable: If implementation-defined attributes are supported, an implementation need not (but may) allow for such characters to have syntax descriptions in the readtable. Otherwise, all characters are representable in the readtable.
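The ordering rule above can be checked directly. The following is a minimal sketch, assuming characters with no implementation-defined attributes; the function name CODE-ORDER-CONSISTENT-P is hypothetical and not part of the proposal.

```lisp
;; Sketch only: CODE-ORDER-CONSISTENT-P is a hypothetical helper.
;; For characters with identical implementation-defined attributes,
;; CHAR< must agree with < applied to the character codes.
(defun code-order-consistent-p (a b)
  "True if CHAR< on A and B agrees with < on their codes."
  (eq (if (char< a b) t nil)
      (if (< (char-code a) (char-code b)) t nil)))

;; CHAR-UPCASE and CHAR-DOWNCASE preserve attributes, so for plain
;; characters the usual case mapping is all that is observable.
(defun upcase-matches-p (lower upper)
  "True if CHAR-UPCASE maps LOWER to a character that is CHAR= to UPPER."
  (if (char= (char-upcase lower) upper) t nil))
```

For example, (code-order-consistent-p #\a #\b) and (upcase-matches-p #\a #\A) are expected to hold for standard characters.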
Proposal 2.1.2: [Alternative B, Passed as Modified 03/89]
This is identical to all of Alternative A (above) except that the function CHAR-INT is retained. CHAR-INT returns a non-negative integer encoding the character object. The manner in which the integer is computed is implementation dependent. In contrast to SXHASH, the result is not guaranteed independent of the particular "incarnation" or "core image".
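A minimal sketch of what a caller can rely on from CHAR-INT. The helper name CHAR-INT-SUMMARY is hypothetical. The proposal only promises a non-negative integer; ANSI Common Lisp as later standardized adds that CHAR-INT and CHAR-CODE agree for a character with no implementation-defined attributes, and the comments below assume that later wording.

```lisp
;; Sketch only: CHAR-INT-SUMMARY is a hypothetical helper.
(defun char-int-summary (ch)
  "Return a list (CODE INT) holding CHAR-CODE and CHAR-INT of CH."
  (list (char-code ch) (char-int ch)))

;; The integer is an encoding of the character object, so it should
;; not be stored and compared across different Lisp images.
(defun same-encoding-p (c1 c2)
  "True if C1 and C2 have the same CHAR-INT encoding in this image."
  (= (char-int c1) (char-int c2)))
```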
Proposal 2.2.1: [Passed 03/89]
The discussion of standard characters is replaced by the following:
Common LISP requires all implementations to support and document a STANDARD character subrepertoire. The Common LISP standard character subrepertoire consists of a newline, #\Newline; the graphic space character #\Space; and the following additional ninety-four graphic characters or their equivalents:
Note: #\Space and #\Newline are omitted. Graphic labels and descriptions are from ISO 6937/2. The first letter of the graphic Id categorizes the character as follows: L - Latin, N - Numeric, S - Special.
Id    Glyph  Name or description      Id    Glyph  Name or description
LA01  a      small a                  ND01  1      digit 1
LA02  A      capital A                ND02  2      digit 2
LB01  b      small b                  ND03  3      digit 3
LB02  B      capital B                ND04  4      digit 4
LC01  c      small c                  ND05  5      digit 5
LC02  C      capital C                ND06  6      digit 6
LD01  d      small d                  ND07  7      digit 7
LD02  D      capital D                ND08  8      digit 8
LE01  e      small e                  ND09  9      digit 9
LE02  E      capital E                ND10  0      digit 0
LF01  f      small f                  SC03  $      dollar sign
LF02  F      capital F                SP02  !      exclamation mark
LG01  g      small g                  SP04  "      quotation mark
LG02  G      capital G                SP05  '      apostrophe
LH01  h      small h                  SP06  (      left parenthesis
LH02  H      capital H                SP07  )      right parenthesis
LI01  i      small i                  SP08  ,      comma
LI02  I      capital I                SP09  _      low line
LJ01  j      small j                  SP10  -      hyphen or minus sign
LJ02  J      capital J                SP11  .      full stop, period
LK01  k      small k                  SP12  /      solidus
LK02  K      capital K                SP13  :      colon
LL01  l      small l                  SP14  ;      semicolon
LL02  L      capital L                SP15  ?      question mark
LM01  m      small m                  SA01  +      plus sign
LM02  M      capital M                SA03  <      less-than sign
LN01  n      small n                  SA04  =      equals sign
LN02  N      capital N                SA05  >      greater-than sign
LO01  o      small o                  SM01  #      number sign
LO02  O      capital O                SM02  %      percent sign
LP01  p      small p                  SM03  &      ampersand
LP02  P      capital P                SM04  *      asterisk
LQ01  q      small q                  SM05  @      commercial at
LQ02  Q      capital Q                SM06  [      left square bracket
LR01  r      small r                  SM07  \      reverse solidus
LR02  R      capital R                SM08  ]      right square bracket
LS01  s      small s                  SM11  {      left curly bracket
LS02  S      capital S                SM13  |      vertical bar
LT01  t      small t                  SM14  }      right curly bracket
LT02  T      capital T                SD13  `      grave accent
LU01  u      small u                  SD15  ^      circumflex accent
LU02  U      capital U                SD19  ~      tilde
LV01  v      small v
LV02  V      capital V
LW01  w      small w
LW02  W      capital W
LX01  x      small x
LX02  X      capital X
LY01  y      small y
LY02  Y      capital Y
LZ01  z      small z
LZ02  Z      capital Z
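The repertoire listed above can be checked against an implementation. This is a minimal sketch; the variable *STANDARD-GRAPHIC-CHARACTERS* and the function STANDARD-REPERTOIRE-OK-P are hypothetical names introduced for illustration.

```lisp
;; Sketch only: both names below are hypothetical.
;; The ninety-four graphic characters of the standard subrepertoire,
;; excluding #\Space and #\Newline.
(defparameter *standard-graphic-characters*
  (concatenate 'string
               "abcdefghijklmnopqrstuvwxyz"
               "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
               "1234567890"
               "$!\"'(),_-./:;?+<=>#%&*@[\\]{|}`^~"))

(defun standard-repertoire-ok-p ()
  "True if the list has 94 distinct members, all of them standard characters."
  (and (= 94 (length *standard-graphic-characters*))
       (= 94 (length (remove-duplicates *standard-graphic-characters*)))
       (every #'standard-char-p *standard-graphic-characters*)
       (standard-char-p #\Space)
       (standard-char-p #\Newline)
       t))
```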
Proposal 2.3.1: [Passed as Modified 03/89]
The following type definitions are added:
Define BASE-CHARACTER as (UPGRADED-ARRAY-ELEMENT-TYPE 'STANDARD-CHAR) and EXTENDED-CHARACTER as type (AND CHARACTER (NOT BASE-CHARACTER)). Characters of type BASE-CHARACTER are referred to as ``base characters''. Characters of type EXTENDED-CHARACTER are referred to as ``extended characters.''
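A minimal sketch of how these types partition CHARACTER. Note that ANSI Common Lisp as later standardized spells the types BASE-CHAR and EXTENDED-CHAR rather than BASE-CHARACTER and EXTENDED-CHARACTER; the sketch uses the later spellings so that it runs in a conforming implementation. The function names are hypothetical.

```lisp
;; Sketch only: CLASSIFY-CHARACTER and UPGRADED-STANDARD-CHAR-IS-BASE-P
;; are hypothetical helpers. BASE-CHAR is the later ANSI spelling of
;; the proposal's BASE-CHARACTER.
(defun classify-character (ch)
  "Return :STANDARD, :BASE or :EXTENDED for the character CH."
  (cond ((typep ch 'standard-char) :standard)
        ((typep ch 'base-char) :base)
        (t :extended)))

(defun upgraded-standard-char-is-base-p ()
  "True if upgrading STANDARD-CHAR yields a type equivalent to BASE-CHAR."
  (let ((upgraded (upgraded-array-element-type 'standard-char)))
    (and (subtypep upgraded 'base-char)
         (subtypep 'base-char upgraded)
         t)))
```

Whether any character classifies as :EXTENDED depends on the implementation; some implementations make every character a base character.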
Proposal 2.3.2: [Passed 03/89]
The STRING type is defined as a union type. More precisely, a string is a specialized vector whose elements are of type CHARACTER or a subtype of CHARACTER. STRING used as a type specifier for object creation means (VECTOR CHARACTER).
Proposal 2.3.3: [Passed as Modified 03/89]
The following string subtypes are distinguished with standardized names.
* BASE-STRING is equivalent to (VECTOR BASE-CHARACTER). Strings of type BASE-STRING are referred to as ``base strings.''
* BASE-STRING is valid as a type specifier that abbreviates.
Proposal 2.3.4: [Passed as Modified 03/89]
Define SIMPLE-STRING as a union type. A simple string is a specialized simple one-dimensional array whose elements are of type CHARACTER or a subtype of CHARACTER. SIMPLE-STRING used as a type specifier for object creation means (SIMPLE-ARRAY CHARACTER (size)).
Proposal 2.3.5: [Passed as Modified 03/89]
The following simple string subtypes are distinguished with standardized names:
* SIMPLE-BASE-STRING is equivalent to (SIMPLE-ARRAY BASE-CHARACTER (*)). SIMPLE-BASE-STRING is a subtype of BASE-STRING.
* SIMPLE-BASE-STRING is valid as a type specifier that abbreviates.
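The union-type reading of STRING and SIMPLE-STRING in 2.3.2 through 2.3.5 can be observed with TYPEP. This is a minimal sketch; DESCRIBE-STRING-TYPE is a hypothetical helper, and BASE-CHAR is the later ANSI spelling of BASE-CHARACTER.

```lisp
;; Sketch only: DESCRIBE-STRING-TYPE is a hypothetical helper.
(defun describe-string-type (s)
  "Return a property list saying which string types the object S belongs to."
  (list :string (typep s 'string)
        :simple-string (typep s 'simple-string)
        :base-string (typep s 'base-string)
        :simple-base-string (typep s 'simple-base-string)))

;; A simple vector specialized to base characters belongs to all four
;; types, because STRING and SIMPLE-STRING are unions over the
;; subtypes of CHARACTER.
(defun make-demo-base-string ()
  "Return a fresh three-element simple base string."
  (make-array 3 :element-type 'base-char :initial-element #\a))
```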
Proposal 2.3.6: [Passed 03/89]
Extend the MAKE-STRING function to allow an ELEMENT-TYPE keyword argument:
* MAKE-STRING size &KEY :initial-element :element-type [Function]
This returns a simple string of length SIZE, each of whose characters has been initialized to the :INITIAL-ELEMENT argument. If an :INITIAL-ELEMENT argument is not specified, then the string will be initialized in an implementation-dependent way. The :ELEMENT-TYPE argument names the type of the elements of the string; a string is constructed of the most specialized type that can accommodate elements of the given type. If :ELEMENT-TYPE is omitted, the type CHARACTER is the default.
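A minimal sketch of the extended MAKE-STRING. The function name MAKE-FILLED-STRINGS is hypothetical, and BASE-CHAR is the later ANSI spelling of BASE-CHARACTER.

```lisp
;; Sketch only: MAKE-FILLED-STRINGS is a hypothetical helper.
(defun make-filled-strings (size)
  "Return two strings of length SIZE filled with #\\x:
one with the default element type, one specialized to base characters."
  (values (make-string size :initial-element #\x)
          (make-string size :initial-element #\x
                            :element-type 'base-char)))
```

Both results compare equal under STRING=; they may differ only in the element type of the underlying array.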
Proposal 2.4.1: [Passed 03/89]
Common LISP character codes are composed from a character script and a character label. The convention by which a character label and character script compose a character code is implementation dependent.
Proposal 2.4.2: [Passed as Modified 06/89]
An implementation must document the scripts it supports. For each script supported the documentation must include at least the following:
* Character Labels, Glyphs, and Descriptions. Character labels must be uniquely named using only Latin capital letters A-Z, hyphen and digits 0-9.
* Effect of CHAR-UPCASE and CHAR-DOWNCASE.
* Reader canonicalization and format directives. Note: Any mechanisms by which the READ function treats distinct characters as equivalent.
* Effect of character predicates. In particular,
  - CHAR-EQUAL and other case-insensitive character predicates.
  - ALPHA-CHAR-P
  - LOWER-CASE-P
  - UPPER-CASE-P
  - BOTH-CASE-P
  - GRAPHIC-CHAR-P
  - ALPHANUMERICP
* Interaction with File I/O. In particular, the coded character sets (e.g., ISO8859/1-1987) and external encoding schemes supported are documented.
Proposal 2.4.3: [Passed as Modified 06/89]
Every character repertoire name is a type specifier and a subtype of type CHARACTER.
Proposal 2.5.2: [Passed as Modified 06/89]
Add an additional keyword argument to OPEN and a new function to query external file format:
* :EXTERNAL-FORMAT keyword argument on OPEN which specifies an implementation recognized scheme for representing characters in files.
The default value is :DEFAULT and is implementation defined but must support the base characters.
If the argument is not recognized by the implementation, an error is signalled. This argument is provided for input, output, and bidirectional streams. It is an error to write a character which cannot be represented using the given file format. (This excludes the #\Newline character. Implementations must provide appropriate line division behavior for all character streams.)
* STREAM-EXTERNAL-FORMAT stream [Function]
STREAM-EXTERNAL-FORMAT returns the implementation recognized format of the specified file.
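A minimal sketch of the :EXTERNAL-FORMAT argument and the query function. The function name WRITE-AND-REPORT-FORMAT and any path passed to it are hypothetical; only :DEFAULT is used, since every other format name is implementation defined.

```lisp
;; Sketch only: WRITE-AND-REPORT-FORMAT is a hypothetical helper.
(defun write-and-report-format (path text)
  "Write TEXT to PATH using the default external format.
Return whatever STREAM-EXTERNAL-FORMAT reports for the open stream."
  (with-open-file (out path :direction :output
                            :if-exists :supersede
                            :if-does-not-exist :create
                            :element-type 'character
                            :external-format :default)
    (write-string text out)
    ;; The returned object is implementation defined.
    (stream-external-format out)))
```

The value returned is only meaningful to the implementation that produced it; portable code can pass it back to OPEN but should not inspect it.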
Proposal 2.5.4: [Alternative A, Passed 06/89]
The default for the :ELEMENT-TYPE argument of OPEN is CHARACTER.
Proposal 2.5.6: [Passed as Modified 06/89]
Modify the following functions:
* WITH-OUTPUT-TO-STRING. A new keyword argument :ELEMENT-TYPE is added which defaults to CHARACTER. If a string argument is provided, the :ELEMENT-TYPE argument is ignored. A string argument of NIL means no initial string argument is provided. If no string argument is provided, WITH-OUTPUT-TO-STRING produces a stream that accepts all characters of the indicated type and returns a string of the indicated element type.
* MAKE-STRING-OUTPUT-STREAM. A new keyword argument :ELEMENT-TYPE is added which defaults to CHARACTER. MAKE-STRING-OUTPUT-STREAM returns an output stream that accepts all characters of the indicated type and returns (via GET-OUTPUT-STREAM-STRING) a string of the indicated type.
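A minimal sketch of both modified operators. The function names are hypothetical, and BASE-CHAR is the later ANSI spelling of BASE-CHARACTER. Note that the NIL in the WITH-OUTPUT-TO-STRING binding stands for "no string argument", as described above.

```lisp
;; Sketch only: both function names are hypothetical.
(defun collect-base-string (items)
  "Concatenate the strings in ITEMS through a base-character string stream."
  (with-output-to-string (out nil :element-type 'base-char)
    (dolist (item items)
      (write-string item out))))

(defun collect-via-stream (items)
  "Concatenate the strings in ITEMS using MAKE-STRING-OUTPUT-STREAM."
  (let ((out (make-string-output-stream :element-type 'character)))
    (dolist (item items)
      (write-string item out))
    (get-output-stream-string out)))
```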
Proposal 2.5.7: [Passed as Modified 06/89]
Add the following function:
* FILE-STRING-LENGTH file-stream object [Function]
FILE-STRING-LENGTH returns a non-negative integer which represents the difference between what (FILE-POSITION file-stream) would be after writing the OBJECT and its current value, or NIL if this cannot be determined. OBJECT must be a string or character.
This return value depends on the current state of the stream, that is, two calls to FILE-STRING-LENGTH with the same stream and object may return different values.
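A minimal sketch comparing the prediction from FILE-STRING-LENGTH with the observed change in FILE-POSITION. The function name PREDICTED-AND-ACTUAL-ADVANCE and the path passed to it are hypothetical. Either value may be NIL when the implementation cannot determine it.

```lisp
;; Sketch only: PREDICTED-AND-ACTUAL-ADVANCE is a hypothetical helper.
(defun predicted-and-actual-advance (path text)
  "Write TEXT to PATH. Return two values: the advance predicted by
FILE-STRING-LENGTH and the advance observed via FILE-POSITION."
  (with-open-file (out path :direction :output
                            :if-exists :supersede
                            :if-does-not-exist :create)
    (let ((predicted (file-string-length out text))
          (before (file-position out)))
      (write-string text out)
      (let ((after (file-position out)))
        (values predicted
                ;; FILE-POSITION may return NIL if it cannot tell.
                (and before after (- after before)))))))
```

The units of both values are those of FILE-POSITION for the stream, which need not be characters when the external format uses a multi-byte encoding.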
Misc effects on CLtL...
Proposal 2.6.1: [Passed 03/89]
Chapter 2 Data Types (Page 12)
Replace: provides for a rich character set, including ways to represent characters of various type styles.
with: provides support for international language characters as well as characters used in specialized arenas, e.g., mathematics.
Proposal 2.6.2: [Passed as Modified 03/89]
Chapter 2 Symbols (Page 25)
Clarify: A symbol may have any character in its print name.
Proposal 2.6.3: [Passed 03/89]
Chapter 10 Symbols (Page 163)
Replace: It is ordinarily not permitted to alter a symbol's print name.
with: It is an error to alter a symbol's print name.
Proposal 2.6.4: [Passed 03/89]
Chapter 10 The Print Name (Page 168)
Replace: It is an extremely bad idea to modify a string being used as the print name of a symbol.
with: It is an error to modify a string being used as the print name of a symbol.
Proposal 2.6.5: [Passed 03/89]
Chapter 14 Simple Sequence Functions (Page 249, make-sequence)
Append: If type STRING is specified, the result is equivalent to MAKE-STRING.
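A minimal sketch of the MAKE-SEQUENCE equivalence. The function name BLANK-LINE is hypothetical.

```lisp
;; Sketch only: BLANK-LINE is a hypothetical helper.
(defun blank-line (n)
  "Return a string of N spaces built with MAKE-SEQUENCE."
  (make-sequence 'string n :initial-element #\Space))

;; The same request expressed through MAKE-STRING, for comparison.
(defun blank-line-via-make-string (n)
  "Return a string of N spaces built with MAKE-STRING."
  (make-string n :initial-element #\Space))
```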