Cleanup Issue PATHNAME-CANONICAL-TYPE

Status
For Internal Discussion
Category
ADDITION
References
MAKE-PATHNAME (p416)
Related issues
PATHNAME-COMPONENT-CASE

Problem Description

The pathname type of ``Lisp'' and ``Compiled Lisp'' files varies widely from implementation to implementation.

"LSP" is common on Vax VMS. "lisp" is generally used for the Symbolics file system. "l" and "lisp" are common on Unix. Some Lisp implementations use customized extensions such as "cl" or even "jcl" (eg, for "Joe's CL").

It would be useful to probe the existence of either a source or a binary file, but that cannot currently be done portably. Furthermore, it would be useful to create certain standard kinds of files in a system-independent fashion.

A common desire, for example, is to do

  (DEFUN FILE-NEEDS-TO-BE-COMPILED (FILE)
    (LET ((SOURCE (PROBE-FILE (MERGE-PATHNAMES FILE (MAKE-PATHNAME :TYPE ???))))
          (BINARY (PROBE-FILE (MERGE-PATHNAMES FILE (MAKE-PATHNAME :TYPE ???)))))
      ... (FILE-WRITE-DATE SOURCE) ... (FILE-WRITE-DATE BINARY) ...))

The problem is that there's nothing portable to put in the ??? positions.

Indeed, depending on the host (ie, file system) of the pathname, the type might need to differ even in the same Lisp implementation. For example, Symbolics Genera stores its source files in names like "foo.l" on Unix, "FOO.LSP" on VMS, etc.

Proposal (NEW-CONCEPT)

In addition to the normal strings and keywords currently allowed as fillers of the TYPE field of a pathname, allow other keywords which designate ``canonical types''.

A canonical type is translated to a real type by MAKE-PATHNAME, so that (PATHNAME-TYPE (MAKE-PATHNAME :TYPE canonical-type)) is always a string.

  Introduce a new function PATHNAME-CANONICAL-TYPE which returns the canonical
  type of an argument pathname, or the type if there is no canonical type.
  For example,
    (PATHNAME-CANONICAL-TYPE (MAKE-PATHNAME :TYPE :LISP)) => :LISP
  [This information may be explicitly represented as an additional slot, or
  computed on demand using a lookup table, as the implementor prefers.]

Define the following standard types:

  :LISP    ``Lisp'' (source) file
  :BIN     ``Compiled Lisp'' (object) file

Permit implementations to extend the set of canonical type names.
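As a rough sketch of what the translation amounts to, the following emulates it in portable code and uses it to fill in the ??? positions of FILE-NEEDS-TO-BE-COMPILED. The names *TYPE-TABLE* and PHYSICAL-TYPE, the file system keywords :UNIX and :VMS, and the explicit FILE-SYSTEM argument are assumptions made for illustration only; under the proposal MAKE-PATHNAME itself would perform the translation according to the pathname's host. The type strings are merely examples taken from this writeup.

```lisp
;; Hypothetical table: file system -> alist of (canonical-type . type-string).
;; The strings are illustrative, not prescribed by the proposal.
(defparameter *type-table*
  '((:unix (:lisp . "lisp") (:bin . "fasl"))
    (:vms  (:lisp . "LSP")  (:bin . "FAS")))
  "Illustrative mapping from canonical types to physical type strings.")

(defun physical-type (canonical-type file-system)
  "Return the type string for CANONICAL-TYPE on FILE-SYSTEM, or signal an error."
  (or (cdr (assoc canonical-type (cdr (assoc file-system *type-table*))))
      (error "No ~S type is known for file system ~S."
             canonical-type file-system)))

(defun file-needs-to-be-compiled (file &optional (file-system :unix))
  "True if FILE has a source file and no binary at least as new as it."
  (flet ((probe (canonical-type)
           ;; Same merge order as in the problem description: the
           ;; canonical type only supplies a default for a missing type.
           (probe-file
            (merge-pathnames
             file
             (make-pathname :type (physical-type canonical-type file-system))))))
    (let ((source (probe :lisp))
          (binary (probe :bin)))
      (and source
           (or (null binary)
               (let ((source-date (file-write-date source))
                     (binary-date (file-write-date binary)))
                 ;; Recompile if either date is unavailable.
                 (or (null source-date)
                     (null binary-date)
                     (> source-date binary-date))))))))
```

With the proposal adopted, the calls to PHYSICAL-TYPE would simply be replaced by :LISP and :BIN, and the FILE-SYSTEM argument would disappear.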

Test Cases

  (PATHNAME-TYPE (MAKE-PATHNAME :TYPE :LISP))
   => "LSP" 	    ;Typically, on VMS
   => "l" or "lisp" ;Typically, on Unix
   => "L" or "LISP" ;Typically, on Unix 
		    ; (assuming PATHNAME-COMPONENT-CASE:CANONICALIZE adopted)
   ...etc.

  (PATHNAME-TYPE (MAKE-PATHNAME :TYPE :BIN))
   => "FAS" 	    ;eg, VAXLISP
   => "BIN"	    ;eg, Symbolics file system
   ...etc.

  (PATHNAME-CANONICAL-TYPE (MAKE-PATHNAME :TYPE :LISP)) => :LISP

  (PATHNAME-CANONICAL-TYPE (MAKE-PATHNAME :TYPE "LSP"))
   => :LISP	    ;eg, VAXLISP
   => "LSP"	    ;eg, Unix

Rationale

This is a useful subset of the functionality already available in at least one implementation.

Current Practice

Symbolics Genera implements this proposal.

Cost to Implementors

The cost of implementing these proposed features is very slight.

MAKE-PATHNAME would have to change to coerce its :TYPE argument in implementations where it does not do so already. PATHNAME-CANONICAL-TYPE can be implemented as a fairly straightforward lookup.
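A minimal sketch of such a lookup, consistent with the last test case above, might look like the following. The names *KNOWN-TYPES* and CANONICAL-TYPE-OF, the file system keywords, and the choice of a case-insensitive comparison for VMS are all assumptions for illustration; a real PATHNAME-CANONICAL-TYPE would take a pathname and consult its host rather than a FILE-SYSTEM argument.

```lisp
;; Hypothetical table: file system -> (comparison-function . mappings),
;; where each mapping is (canonical-type type-string ...).
(defparameter *known-types*
  '((:vms  string-equal (:lisp "LSP")      (:bin "FAS"))
    (:unix string=      (:lisp "l" "lisp") (:bin "fasl")))
  "Illustrative reverse-lookup data; the strings come from this writeup.")

(defun canonical-type-of (type file-system)
  "Return the canonical type that TYPE corresponds to on FILE-SYSTEM,
or TYPE itself if there is none."
  (let* ((entry (cdr (assoc file-system *known-types*)))
         (test (first entry)))
    (if (stringp type)
        (or (first (find-if (lambda (mapping)
                              (member type (rest mapping) :test test))
                            (rest entry)))
            type)
        ;; NIL, :WILD and the like have no canonical type.
        type)))

;; (canonical-type-of "LSP" :vms)  returns :LISP
;; (canonical-type-of "LSP" :unix) returns "LSP"
```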

Cost to Users

None. This change is upward compatible.

Cost of Non-Adoption

It would continue to be hard to portably name files when their types differed from file system to file system.

Benefits

The cost of non-adoption would be avoided.

Aesthetics

Some programs would be able to abstract away from the particulars of the host file system entirely. Some people believe this would be a definite improvement in aesthetics.

Discussion

Note that different Lisp implementations that share the same file system need not, and perhaps should not, agree on the same type string for the canonical type :BIN. That is, if I store source files on VAX VMS and compile them for use under both Symbolics Genera and VAXLISP, then it is both appropriate and useful that VAXLISP :BIN files be named "something.FAS" and Genera :BIN files be named "something.BIN", since then they won't clobber each other.

Pitman supports PATHNAME-CANONICAL-TYPE:NEW-CONCEPT.

-------

Summary of discussion on CL-Cleanup:

GZ suggested :COMPILED-LISP as a better name than :BIN. Masinter thought :SOURCE-LISP might be better than :LISP. Either of these would be gratuitously incompatible with Symbolics Genera, which already implements canonical types, but they are otherwise not technically unreasonable and are probably something we should discuss.

Sandra Loosemore offered the following revealing piece of code from her work and asked why we couldn't just do this.

  (defvar *binary-file-type*
    #+Symbolics                      (make-pathname :type "bin")
    #+(and dec common vax (not ultrix)) (make-pathname :type "FAS")
    #+(and dec common vax ultrix)    (make-pathname :type "fas")
    #+pcls                           (make-pathname :type "b")
    #+KCL                            (make-pathname :type "o")
    #+Xerox                          (make-pathname :type "dfasl")
    #+(and Lucid MC68000)            (make-pathname :type "lbin")
    #+(and Lucid VAX VMS)            (make-pathname :type "vbin")
    #+excl                           (make-pathname :type "fasl")
    #+system::cmu                    (make-pathname :type "sfasl")
    #+PRIME                          (make-pathname :type "pbin")
    #+HP                             (make-pathname :type "b")
    #+TI                             (make-pathname :type "xfasl")
    "The default file type for compiled files.")

The reason is that some implementations (e.g., Symbolics) deal with more than one file system type -- and properly the information varies with the file system type, not with the implementation. [Since most implementations have only one associated file system type, this may not be obvious, but it's quite obvious on a Symbolics machine that you vary the extension name based on the host file system's requirements.]

Moon suggested a compromise where *compile-file-output-type* (his name for Sandra's *binary-file-type*) existed but could be either a canonical type or a physical type.
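One way to read this compromise is sketched below: a string value is used as is, while a canonical type is translated per file system. Only the variable name comes from the discussion; the helper COMPILE-FILE-OUTPUT-TYPE-STRING, the table *BIN-TYPE-STRINGS*, and the file system keywords are hypothetical, and the strings are just examples.

```lisp
(defvar *compile-file-output-type* :bin
  "Either a canonical type (a keyword) or a physical type (a string).")

;; Hypothetical table of physical types for :BIN, keyed by file system.
(defparameter *bin-type-strings* '((:unix . "fasl") (:vms . "FAS")))

(defun compile-file-output-type-string (file-system)
  "Resolve *COMPILE-FILE-OUTPUT-TYPE* to a type string for FILE-SYSTEM."
  (let ((type *compile-file-output-type*))
    (etypecase type
      ;; A physical type is independent of the file system.
      (string type)
      ;; A canonical type is looked up for the given file system.
      ((eql :bin)
       (or (cdr (assoc file-system *bin-type-strings*))
           (error "No :BIN type is known for file system ~S." file-system))))))
```

A user on a single file system could then bind the variable to a string, while code that spans several file systems would leave it as :BIN.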

Masinter worries about the PATHNAME-CANONICAL-TYPE part of the proposal being forced to be heuristic in some cases. [Will any alternative be any less heuristic? -kmp]

 Moon wanted the following example to be guaranteed to work:
   (PATHNAME-CANONICAL-TYPE (PATHNAME "foo.lisp")) => :LISP
 where of course the string is implementation-dependent.  That is,
 PATHNAME-CANONICAL-TYPE must produce a canonical type even when the
 pathname was not constructed from a canonical type, but instead came
 from user typein, the TRUENAME function, the DIRECTORY function,
 or some similar source, when the pathname's type is one that a
 canonical type maps into.

Moon also thought it would be nice to have a facility for users (in addition to implementations) to extend the set of canonical type names, since users may well have their own types of files. However, he admitted that the difficulty is that in any system that supports multiple file systems, it has to be complex enough to allow specification of separate mappings for each file system, which in turn requires a way to name file system types. [At this point, we probably don't have time left in our schedule to produce such a facility. -kmp]
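To make the difficulty concrete, here is a minimal sketch of what a user-level facility might have to look like. Nothing here is part of the proposal: REGISTER-CANONICAL-TYPE, USER-TYPE-STRING, the :TEXT type and its strings, and the keywords naming file system types are all hypothetical. Note that the sketch has to invent names for file system types, which is exactly the piece that is not standardized.

```lisp
(defvar *user-canonical-types* (make-hash-table :test #'equal)
  "Maps (file-system canonical-type) lists to physical type strings.")

(defun register-canonical-type (canonical-type mappings)
  "Define CANONICAL-TYPE.  MAPPINGS is an alist of (file-system . type-string),
with one entry per file system on which the type is meaningful."
  (loop for (file-system . type-string) in mappings
        do (setf (gethash (list file-system canonical-type)
                          *user-canonical-types*)
                 type-string))
  canonical-type)

(defun user-type-string (canonical-type file-system)
  "Return the type string for CANONICAL-TYPE on FILE-SYSTEM, or NIL if none."
  (values (gethash (list file-system canonical-type) *user-canonical-types*)))

;; A user-defined type with a separate mapping for each file system.
(register-canonical-type :text '((:unix . "txt") (:vms . "TXT")))
```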

Edit History