Cleanup Issue PATHNAME-COMPONENT-CASE

Status
passed, as amended here, Jun 89 X3J13
Category
CHANGE
References
Pathnames (pp410-413), MAKE-PATHNAME (p416), PATHNAME-HOST (p417), PATHNAME-DEVICE (p417), PATHNAME-DIRECTORY (p417), PATHNAME-NAME (p417), PATHNAME-TYPE (p417)
Related issues
PATHNAME-WILD

Problem Description

Issues of alphabetic case in pathnames are a major source of problems. In some file systems, the customary case is lowercase, in some uppercase, in some mixed. In some file systems, case matters, in others it does not.

There are two kinds of pathname case portability problems: moving programs from one Common Lisp to another, and moving pathname component values from one file system to another. To solve the first problem, all Common Lisp implementations that support a particular file system must use compatible representations for pathname component values. To solve the second problem, there must be a common representation for the least common denominator pathname component values that exist on all interesting file systems.

This desire for a common representation directly conflicts with the desire among programmers who only use one file system to work with the local conventions and not think about issues of porting to other file systems. The common representation cannot be the same as every local convention, since they vary.

In the current anarchy of pathname component case conventions:

(NAMESTRING (MAKE-PATHNAME :NAME "FOO" :TYPE "LISP")) will produce "foo.lisp" in some Unix Common Lisp implementations and will produce "FOO.LISP" in other Unix Common Lisp implementations.

(NAMESTRING (MAKE-PATHNAME :NAME "foo" :TYPE "lisp")) will produce "FOO.LISP" in some Tops-20 Common Lisp implementations and will produce "^Vf^Vo^Vo.^Vl^Vi^Vs^Vp" in other Tops-20 Common Lisp implementations.

Problems like this make it difficult to use MAKE-PATHNAME for much of anything without corrective (non-portable) code.

Other problems occur in merging. For example, (NAMESTRING (MERGE-PATHNAMES (MAKE-PATHNAME :HOST "MY-TOPS-20" :NAME "FOO") (PARSE-NAMESTRING "MY-UNIX:x.lisp"))) should probably return "MY-TOPS-20:FOO.LISP", but in some implementations it might instead return "MY-TOPS-20:FOO.^Vl^Vi^Vs^Vp".

Problems like this make it difficult to use any merging primitives for much of anything without corrective (non-portable) code.

Proposal (KEYWORD-ARGUMENT)

Add a keyword argument :CASE to MAKE-PATHNAME, PATHNAME-HOST, PATHNAME-DEVICE, PATHNAME-DIRECTORY, PATHNAME-NAME, and PATHNAME-TYPE. The possible values for the argument are :COMMON and :LOCAL.

:LOCAL means strings input to MAKE-PATHNAME or output by PATHNAME-xxx follow the local file system's conventions for alphabetic case. Strings given to MAKE-PATHNAME will be used exactly as written if the file system supports both cases. If the file system only supports one case, the strings will be translated to that case.

:COMMON means strings input to MAKE-PATHNAME or output by PATHNAME-xxx follow this common convention:

  - all uppercase means to use a file system's customary case.
  - all lowercase means to use the opposite of the customary case.
  - mixed case represents itself.

The second and third bullets exist so that translation from local to common and back to local is information-preserving.
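The :COMMON convention can be sketched as a small string translation. This is only an illustration, assuming a case-sensitive file system; the names STRING-CASE-KIND and COMMON-TO-LOCAL and the CUSTOMARY-CASE parameter are hypothetical and are not part of the proposal.

```lisp
;; Hypothetical helpers illustrating the :COMMON convention.
;; Assumes a case-sensitive file system whose customary case is
;; either :UPPER or :LOWER.

(defun string-case-kind (string)
  "Classify STRING as :UPPER, :LOWER, or :MIXED.
A string with no cased characters counts as :MIXED, so it is never changed."
  (let ((has-upper (some #'upper-case-p string))
        (has-lower (some #'lower-case-p string)))
    (cond ((and has-upper (not has-lower)) :upper)
          ((and has-lower (not has-upper)) :lower)
          (t :mixed))))

(defun common-to-local (string customary-case)
  "Translate STRING from the common convention to local case."
  (ecase (string-case-kind string)
    ;; All uppercase: use the file system's customary case.
    (:upper (if (eq customary-case :upper) string (string-downcase string)))
    ;; All lowercase: use the opposite of the customary case.
    (:lower (if (eq customary-case :upper) string (string-upcase string)))
    ;; Mixed case represents itself.
    (:mixed string)))
```

Under these assumptions the same function also maps local back to common, because applying it twice with the same CUSTOMARY-CASE returns the original string. That is the information-preserving property the second and third bullets are meant to provide.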

The default is :LOCAL.

Namestrings always use local file system case conventions.

MERGE-PATHNAMES and TRANSLATE-WILD-PATHNAME map customary case in the input pathnames into customary case in the output pathname.

Implications of the proposal:

Unix is case-sensitive and prefers lowercase, so it translates between common and local by inverting the case of non-mixed-case strings.

Tops-20 is case-sensitive and prefers uppercase, so it uses identical representations for common and local.

VAX/VMS is upper-case-only (that is, the file system translates all file name arguments to upper case), so it translates common to local by upcasing, and translates local to common with no change.

Macintosh is case-insensitive and prefers lowercase, so it translates between common and local by inverting the case of non-mixed-case strings, and ignores case in EQUAL of two pathnames.
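The four cases above can be collected into one sketch. It is illustrative only: the file system keywords (:UNIX, :TOPS-20, :VAX-VMS, :MACINTOSH) and the function names are hypothetical, and case-insensitive comparison of Macintosh pathnames is not modelled.

```lisp
;; Hypothetical model of the per-file-system translations described above.

(defun invert-unless-mixed (string)
  "Invert the case of STRING if it is all one case; otherwise return it as is."
  (let ((has-upper (some #'upper-case-p string))
        (has-lower (some #'lower-case-p string)))
    (cond ((and has-upper (not has-lower)) (string-downcase string))
          ((and has-lower (not has-upper)) (string-upcase string))
          (t string))))

(defun common-to-local (string file-system)
  (ecase file-system
    ((:unix :macintosh) (invert-unless-mixed string)) ; customary lowercase
    (:tops-20 string)                                 ; customary uppercase
    (:vax-vms (string-upcase string))))               ; upper-case-only

(defun local-to-common (string file-system)
  (ecase file-system
    ((:unix :macintosh) (invert-unless-mixed string))
    ((:tops-20 :vax-vms) string)))

(defun carry-component (string from to)
  "Move a component from file system FROM to file system TO by way of the
common form, so that customary case maps to customary case."
  (common-to-local (local-to-common string from) to))
```

With this model, (carry-component "lisp" :unix :tops-20) goes through the common form "LISP" and stays "LISP" on Tops-20, which is the customary-to-customary mapping that the proposal asks of MERGE-PATHNAMES.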

Test Cases/Examples

  (PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/foo.lisp")
                 :CASE :COMMON)                                 => "FOO"
  (PATHNAME-NAME (PARSE-NAMESTRING "MY-TOPS-20:<ME>FOO.LISP")
                 :CASE :COMMON)                                 => "FOO"
  (PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/foo.lisp")
                 :CASE :LOCAL)                                  => "foo"
  (PATHNAME-NAME (PARSE-NAMESTRING "MY-TOPS-20:<ME>FOO.LISP")
                 :CASE :LOCAL)                                  => "FOO"
  (PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/TeX.lisp")
                 :CASE :COMMON)                                 => "TeX"
  (PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/TeX.lisp")
                 :CASE :LOCAL)                                  => "TeX"
  (NAMESTRING (MAKE-PATHNAME :HOST "MY-UNIX" :NAME "FOO"
                             :CASE :COMMON))                    => "MY-UNIX:foo"
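The following usage sketch shows how a program might rely on the proposed interface. It assumes an implementation that accepts the :CASE argument as proposed; the function name PORTABLE-SOURCE-PATHNAME is hypothetical, and the strings printed depend on the default host's file system.

```lisp
;; Hypothetical helper: build a pathname from components written in the
;; common convention, leaving the choice of local case to the implementation.
(defun portable-source-pathname (name)
  (make-pathname :name name :type "LISP" :case :common))

;; Inspect the same component under both conventions.
(let ((p (portable-source-pathname "FOO")))
  (format t "common:     ~S~%" (pathname-name p :case :common))
  (format t "local:      ~S~%" (pathname-name p :case :local))
  (format t "namestring: ~S~%" (namestring p)))
```

The point of the design is that the caller writes "FOO" and "LISP" once, in the common convention, and never has to test which file system is in use.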

Rationale

This does not solve the whole pathname problem, but it does improve the situation for a clearly defined set of very common problems. Together with the other pathname proposals, the behavior of pathnames should be sufficiently consistent across Common Lisp implementations and across file systems to allow portability of pathname-manipulating programs.

The current situation where different implementations talk about the *same* file system in different ways will be corrected by this and some of the other pathname proposals.

Upper case is chosen as the common case for no better reason than consistency with Lisp symbols.

The :CASE keyword argument provides access to both common and local conventions without introducing any new functions. The default is :LOCAL, as stated in the proposal, which matches the behavior of the several existing implementations that already act as if :CASE :LOCAL had been specified (see Current Practice); portable programs request :COMMON explicitly.

Current Practice

There are no known implementations of exactly what is proposed. Symbolics Genera uses common case normally, and provides a way to access the local case (called "raw") that in practice is rarely used. Symbolics Genera's own file system is case-insensitive and uses lower case as the customary case, but transparent network access is available to file systems using all known case conventions.

Several Common Lisp implementations behave as if :CASE :LOCAL were specified (but accept no :CASE argument).

Cost to Implementors

The :CASE feature is easily added, but some implementations may have to change the default behavior when :CASE is not specified. No implementation need change its internal representation, nor the way pathnames print, just the interface functions listed above.

Cost to Users

Technically, this change is upward compatible.

In practice, because the existing CLtL spec is so poor, nearly everyone relies heavily on implementation-specific behavior; there is little other choice. As such, any change is almost certain to break lots of programs, in usually superficial but nevertheless important ways. However, if we really make the pathname facility more portable, the user community may be willing to bear the consequences of these changes.

Cost of Non-Adoption

We would be contributing to the perpetuation of the existing fiasco of a pathname system.

Performance Impact

None.

Benefits

One step closer to a usable pathname system.

Aesthetics

Anything that simplifies the user model of pathnames is an improvement.

Discussion

Some people would rather use lowercase as the common case. The decision is essentially arbitrary. Everywhere else in Common Lisp where case matters, uppercase was chosen.

It has been proposed that the Common Lisp specification should include specifications of the exact behavior of pathnames for several popular operating systems, so that multiple implementations for those operating systems would be compatible with each other. This proposal does that for alphabetic case.

Some people want the default for :CASE to be :LOCAL instead of :COMMON. The proposal as stated above specifies :LOCAL as the default.

There should probably be a remark somewhere that says that portable programs shouldn't expect to be able to create and/or access distinct files whose pathname components differ only in case.

Edit History