Why directories?

May 8th, 2007

I had this crazy idea the other day that I can’t get out of my head. It’s a sort of Copernican revolution, and perhaps plain wrong, so I won’t blame you if you disagree.

I was thinking it might be for the best if we could scrap directories in the filesystem. A regular file would then be linked to children, along with its stream. In other words, merge directories and files into one type. I suppose Dennis Ritchie thought about these things and implemented directories as a separate type from regular files because they ARE a separate type. Regular files are streams of data. Directories are lists of files, both regular and not. I should thank my lucky stars that the two are polymorphic enough to be listed alongside each other when I type ls.

From a user’s perspective, though, it is sort of bothersome. I remember an old version of Word asking me if I wanted to link to a picture or include the picture data in the file. Now, if Word files were implemented as directories, we wouldn’t need to let the application worry about such things (assuming FAT had symlinks). I could cd mywordfile.doc and inspect the contents to see for myself whether it contained images or links to images. Database implementations almost always use a directory per database; why not every other program? Usually I want to create a rich (as in treelike) document, not a stream, but the application saves my data as a stream. Shouldn’t the OS handle this?
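To make the "cd into the document and look" idea a little more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the layout of mywordfile.doc as a directory, the file names inside it, and the helper names classify_parts and build_demo_document. It is not how Word or any real format works, and it assumes a filesystem where creating symlinks is allowed.

```python
import os
import tempfile
from pathlib import Path


def classify_parts(doc_dir):
    """Map each entry of a directory-style document to 'embedded' or 'linked'."""
    result = {}
    for entry in sorted(Path(doc_dir).iterdir()):
        # A symlink means the document only points at the data;
        # anything else is stored inside the document itself.
        result[entry.name] = "linked" if entry.is_symlink() else "embedded"
    return result


def build_demo_document(root):
    """Create a hypothetical document directory with one embedded and one linked image."""
    root = Path(root)
    external = root / "holiday.png"          # picture living outside the document
    external.write_bytes(b"fake image data")
    doc = root / "mywordfile.doc"            # the "document" is just a directory
    doc.mkdir()
    (doc / "text.txt").write_text("Dear reader...")
    (doc / "logo.png").write_bytes(b"more fake image data")
    os.symlink(external, doc / "holiday.png")  # linked, not copied
    return doc


with tempfile.TemporaryDirectory() as tmp:
    parts = classify_parts(build_demo_document(tmp))
    for name, kind in parts.items():
        print(f"{name}: {kind}")
```

The point is only that the link-or-embed question becomes something any tool can answer with an ordinary directory listing, rather than something only the application knows.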

Positives of blurring regular files and directories

  • One less first-class data type. Our filesystems now have at least three: regular files, directories, and symlinks. Hardlinks are not really first class. What I am proposing is to scrap the concept of directories and hardlink files in a hierarchical manner. I would never want to lose symlinks.
  • No more non-standard formats for embedded images, etc.
  • No more tar. One less step to do before or after transporting data.
  • Fewer multipart-mixed sections in email because we could attach one file that contained the others.
  • Less XML on disk, more on the wire. Perhaps if it were less of a chore to transmit a directory people would use them more often. We would see configuration files, hierarchical in nature like MS registry files, that use the OS instead of XML.
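The last point, configuration that uses the OS hierarchy instead of XML, can be approximated today with plain directories. The following is a minimal Python sketch under my own assumptions: each setting is a small text file, each group of settings is a directory, and the function names write_tree and read_tree as well as the sample settings are hypothetical.

```python
import tempfile
from pathlib import Path


def write_tree(path, settings):
    """Store nested dicts as directories and leaf values as small text files."""
    path = Path(path)
    if isinstance(settings, dict):
        path.mkdir(parents=True, exist_ok=True)
        for key, value in settings.items():
            write_tree(path / key, value)
    else:
        path.write_text(str(settings))


def read_tree(path):
    """Read a directory tree back into nested dicts; files become string values."""
    path = Path(path)
    if path.is_dir():
        return {child.name: read_tree(child) for child in sorted(path.iterdir())}
    return path.read_text().strip()


# Hypothetical settings, used only to exercise the round trip.
settings = {"editor": {"font": "Courier", "size": "12"}, "proxy": "none"}

with tempfile.TemporaryDirectory() as tmp:
    config_root = Path(tmp) / "config"
    write_tree(config_root, settings)
    round_tripped = read_tree(config_root)
    print(round_tripped)
```

Note that every value comes back as a string, and that each tiny file still occupies at least one block on most filesystems, which is one of the costs listed further down.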

Negatives and Unknowns

  • We can already accomplish this by using a convention like index.html, where the directory is made to contain data of its own as if it were a regular file.
  • The transport problem is largely solved by tar, and I do mean largely. The output of tar --help is 244 lines. Maybe email clients should open tar files instead of expecting us to include all the components of a document via multipart-mixed.
  • Possibly lots of small files eating up blocks.
  • Large number of open file handles.
  • How would file permissions work? Same as they do now?
  • Dennis Ritchie was probably right with our current implementation. The best way to avoid a leaky abstraction is to avoid abstraction. Directories and regular files are different enough that they warrant exposing different types to the user.
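The index.html convention from the first point in this list is easy to sketch in userland. The Python below is a rough illustration, not a proposal for a real API: it treats every path as a node with both a stream and children, using a reserved file name for the directory's own data. The names INDEX_NAME, read_stream and list_children are assumptions of mine.

```python
import tempfile
from pathlib import Path

INDEX_NAME = "index.html"  # reserved name holding a directory's own data (assumed)


def read_stream(node):
    """Return the node's own data: file contents, or the index file of a directory."""
    node = Path(node)
    if node.is_dir():
        index = node / INDEX_NAME
        return index.read_bytes() if index.is_file() else b""
    return node.read_bytes()


def list_children(node):
    """Return child names; regular files have none, and the index file is hidden."""
    node = Path(node)
    if not node.is_dir():
        return []
    return sorted(c.name for c in node.iterdir() if c.name != INDEX_NAME)


with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp) / "site"
    site.mkdir()
    (site / INDEX_NAME).write_bytes(b"<h1>home</h1>")
    (site / "about.html").write_bytes(b"<h1>about</h1>")

    home_stream = read_stream(site)          # a "directory" with data of its own
    home_children = list_children(site)
    leaf_stream = read_stream(site / "about.html")
    leaf_children = list_children(site / "about.html")
    print(home_stream, home_children, leaf_stream, leaf_children)
```

With a wrapper like this, callers never ask whether something is a file or a directory, which is roughly the merged type described above. The catch is that it is only a convention: nothing stops another program from treating index.html as an ordinary child.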
