Readme.txt vs. README.txt (2015) (softwareengineering.stackexchange.com)
139 points by joshcsimmons on April 11, 2024 | 88 comments


The degree to which this is so pervasive and standard--so much so that Github renders your README by default--is kind of an artifact of the open source movement. Which also brought a lot of similar standardized files, as the need came up (LICENSE, COPYING, INSTALL, CONTRIBUTING, etc.).

During the sharing, hacking, free software period it was a lot less standard and I saw plenty of alternate casing for README and "non-standard" spellings like 00README, Readme_first, etc. When everyone started sharing their open source projects in a standard way (Github and its ilk, along with the rise of CTAN/CPAN-like registries for all kinds of software) the notion of the standard README got a lot more... standard.


Isn't the urban legend around this that it mimics the "EAT ME" and "DRINK ME" messages from Alice in Wonderland by Lewis Carroll?

The missing space is probably from UNIX preferences.


DOS era software sometimes had it in the form of "READ.ME".

Note the assumption that any real user is able to read text files after classifying them as such no matter what their “extension” is. Also seen in countless "ORDER.FRM", "EXAMPLE.001", "EXAMPLE.002", and so on, not to mention all the ".DOC" files which had nothing to do with Microsoft Word.


back then 8+3 was a limitation on filenames.

8 for the "name" and 3 for the extension. ah, the legacy of bit-counting when hardware was THE limitation on software


README has 6 letters though, so README or README.TXT would have been perfectly OK.

READ.ME just looks cool, same thing as with internet domains.

Edit: I even remember seeing this somewhere as a child and admiring how clever it was (Commander Keen maybe?).

And the capitalization thing in this case is of course also a DOS thing (as opposed to the newer traditions pointed out by another comment).

Tangent: on DOS and Windows, filenames without a file extension suffix were just weird. I mostly expected them to contain binary data :D


> READ.ME just looks cool, same thing as with internet domains.

http://read.me/ I'm not interested, but it definitely looks cool


Out of necessity, ad-hoc metadata formats appeared quite soon.

Say you're connecting to a BBS with A LOT of files. Unless you already know that "SEX459W.ZIP" and "SEX454.ZIP" are Windows and DOS versions of (fictitious) “Super Extractor 4.5”, you'll spend a lot of time figuring it out. People who pay no attention to file categories can also get intrigued without a cause. Therefore, full names and descriptions were stored in sidecar files, and were processed to form complete file listings (to download and study offline). Sometimes operators personally reviewed the software, adding interesting opinions about users who had it on their computers, sometimes those were simple excerpts from release notes.

In some cases, the metadata was automatically appended to archives (as standard comments). Later, in the era of Rich Formats, WinZip even allowed arbitrary HTML in ZIP comments, and automatically loaded them into an IE frame instead of a regular text box when opening such an archive. Obviously, that novelty didn't last long.


You just sparked a memory of FILE_ID.DIZ, a standard file included in zip archives during the BBS era.

Pretty sure the .DIZ stood for “Description In Zip”


Now we have almost the opposite in a lot of situations - software feels like it's holding hardware that's faster than ever back. See the comparison of Windows 2000 vs. Windows 11


Found this web site which feels old, and describes this legend.

http://catb.org/~esr/jargon/html/R/README-file.html

In fact, the revision history last shows an update in 2003.

http://catb.org/~esr/jargon/html/revision-history.html


> Found this web site which feels old, and describes this legend.

The web site is old, but it's merely the current incarnation of something much older: “The Jargon File (hereafter referred to as ‘jargon-1’ or ‘the File’) was begun by Raphael Finkel at Stanford in 1975.”

http://catb.org/~esr/jargon/html/revision-history.html


I didn't know but this is going to be my canon now


Yes that's right. There used to be quite a cult around Alice in Wonderland amongst Computer Scientists. Carroll (or Dodgson as his real name was) was also a mathematician.


README files are the first thing I read in any source code repository. I’m noticing a trend of less useful readmes in projects. Often they are bare or provide outdated and incorrect information. This makes the introduction to the source code harder, and raises the barrier for new contributors.


Absolutely. Over the years countless projects have missed contributions from me because of a bad README.


the worst readmes are those that say "this is a fork of blablabla" Ok, but what do you do? what have you changed? Need any dependencies? How do I build?

at the very least just copy the old readme and add a note on top. And please please keep it updated


This made me feel very old. To me it’s 100% logical and not a mystery - while I fully understand that it’s super-weird for people born after 1990


Ha, agreed. I was training a new help desk employee a couple years ago (22 years old) and dated the shit out of myself explaining why I don't like spaces or special characters in file names and directories and why we want to try to keep them as short as possible.


Back in the day that would mark you as a DOS pleb rather than a member of the glorious Macintosh master race.


I thoroughly enjoyed the early days of my career/learning as a DOS pleb!


So did I, but Macintosh people looked at it from an angle of "Why would you not put spaces in file names? They're meant to be human readable, and you will click on files more than you will type them in anyway." They were right, in the end. The future was less evenly distributed in the 80s.


dark days


I still instruct employees to never use spaces or weird characters in filenames because they are a pain when I run into them in a CLI.

I don't foresee that advice ever changing lol


Sorry, 100% expected, sure, 100% normal, fine, 100% unremarkable, ok, but 100% logical?


I remember this for other informational text files going back to the 90s. The caps made it stand out in the list of files so it was easier to find first.

Another example is FILE_ID.DIZ. Just learned the extension is for description in zipfile.

https://en.wikipedia.org/wiki/FILE_ID.DIZ


They also often make it come first when the filenames are sorted, since capital letters come first in the ASCII table.


Initially in the 1980s it was to distinguish it visually from the other entries in the directory to make it more likely it is the first file a newbie will read.


Uppercase also comes first in ASCII, so a naive sort will show README before rEADME
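
For anyone who wants to see the "naive sort" effect directly, here is a minimal Python sketch. The file names are made up for illustration, and Python's default string sort (by code point, which matches ASCII order for these names) stands in for whatever an old `ls` did:

```python
# Hypothetical directory listing; the names are illustrative only.
names = ["main.c", "rEADME", "util.c", "README", "Makefile"]

# Default sort compares code points, so 'A'-'Z' (65-90) come before 'a'-'z' (97-122).
naive = sorted(names)
print(naive)  # ['Makefile', 'README', 'main.c', 'rEADME', 'util.c']

# A case-insensitive sort loses the effect: README no longer floats to the top.
folded = sorted(names, key=str.casefold)
print(folded)  # ['main.c', 'Makefile', 'rEADME', 'README', 'util.c']

print(ord("R"), ord("r"))  # 82 114
```

Note that Makefile also lands near the top here, purely because of its capital M.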


On *nix systems README was always the first file in the directory because everyone tended to use lower case for all their other filenames.

I believe "Makefile" tended to be capitalized too, since you generally wanted to look in there to see how to build the thing.

I don't think anyone wrote that down, it's just how things worked out. If anything it probably was inspired by Alice in Wonderland (Eat Me/Drink Me).


In an alternate timeline, we could have named them MAKEME instead of Makefile.


plan9/9front uses mkfile.


This doesn't make any sense. All filenames were uppercase in DOS so having README in uppercase wouldn't distinguish it visually.


This particular question is not about the origin of README files (the earliest documented is from 1974 on TOPS-10) but about why README remains upper case on mixed-case systems (e.g. Unix). Neither has anything to do with MS-DOS. ASCIIbetical order is a good answer (certainly the reason I named things that way) although it doesn't entirely explain README over ReadMe.


I claim the README convention came from Unix.


This one seems pretty obvious to those of us around computers and BBS and FTP in the nineties. On a side note, I made the pinky-thumb phone gesture to my 10 year old daughter a few days ago, and she didn't have a clue... Nor does she understand how my music collection plays without the internet. She just assumes my hacker skills have reached such level of Zen nirvana actual internet access is just a trifling detail and not a requirement.


I actually didn't see the answer I was looking for in there: command line/file system environments used to be all caps and there WERE NO lowercase letters for a long time. I remember when MS-DOS (whatever version) first began using/allowed lowercase filenames and it was like stepping into the future


"Historically it wasn't possible to make lowercase `readme` files" doesn't explain why, after transitioning to systems which allowed separate case in file names, `README` files remained uppercase while most other text files now have lowercase names.


README is the only file that was both included with everything then, and is still now


I remember the MS-DOS variations.. README.DOC (before .DOC applied only to MS Word files), README.TXT, READ.ME, README.1ST, README.NOW...


1ST_README also


that filename wouldn't work in MSDOS - it breaks the 8.3 limit


Thanks, I mistyped. 1ST_READ.ME
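
A rough way to check names like these against the 8.3 limit, sketched in Python. The helper `fits_8_3` is hypothetical and deliberately simplified: it only checks the lengths (up to 8 for the name, up to 3 for the extension, at most one dot) and ignores DOS's character restrictions and reserved device names:

```python
def fits_8_3(filename: str) -> bool:
    """Simplified 8.3 length check (assumption: lengths only, no character rules)."""
    base, _dot, ext = filename.partition(".")
    if "." in ext:  # more than one dot cannot be an 8.3 name
        return False
    return 1 <= len(base) <= 8 and len(ext) <= 3


# Names mentioned in this thread.
for candidate in ["README", "README.TXT", "READ.ME", "README.1ST",
                  "1ST_README", "1ST_READ.ME", "LISEZMOI.1ER"]:
    print(f"{candidate:14} {fits_8_3(candidate)}")
```

Only 1ST_README fails: its name part is 10 characters long.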


It's a coincidence I just looked at that Q&A a couple nights ago. I had never heard of the "HACKING" file before but I like it. Is there a list somewhere of commonly used README-esque files used in open source projects?


INSTALL and LICENSE are the important ones. Maybe also BUGS (i.e., how to report bugs) and CONTRIBUTORS.


And NEWS or CHANGES or similar.


I think install instructions belong in README. It's always nice to see "pip install foo" in a readme letting me know it's properly packaged and not something I'll have to compile myself or something just to try it.


Always lovely to see a CHANGELOG.md


Even more lovely to see an ARCHITECTURE.md


In the linked answer? It had most of GNU's coding standard quoted.


The GNU coding standards only list three: README, INSTALL, and COPYING.


I used to have HACKING but now I try to go for extremely obvious code with nothing people aren't already familiar with, plus GitHub Wiki.

Next step is cleaning stuff up in a way that would make automatic documentation generation useful.


I dunno about a list, but contributing, code-of-conduct, and license are all pretty common.


CONTRIBUTING


I worked for a French company in the 90s and the 8.3 dos name was LISEZMOI.1ER


Incidentally, the answer "So it appears first when sorted by ASCII (where capital letters appear before small letters)" is insufficient - not only would "Readme" be sorted pretty close to "README", but LICENSE, COPYING, INSTALL, BUGS, CONTRIBUTING, AUTHORS, NEWS would all end up even before it.
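
This objection is easy to check with a short Python sketch. The file list is an assumption: the all-caps names from the comment above plus two made-up lowercase source files, sorted by code point (equivalent to ASCII order here):

```python
# All-caps names from the comment above, plus two hypothetical source files.
files = ["README", "Readme", "LICENSE", "COPYING", "INSTALL", "BUGS",
         "CONTRIBUTING", "AUTHORS", "NEWS", "main.c", "util.c"]

ordered = sorted(files)  # plain code-point order
for position, name in enumerate(ordered):
    print(position, name)

# README comes after every other all-caps file and is immediately followed by "Readme".
print(ordered.index("README"), ordered.index("Readme"))  # 7 8
```

So the sort order groups the capitalized files ahead of the lowercase ones, but within that group README is last, not first.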


Does anyone know how to get en_US.UTF-8 sorting with diaereses and stuff while still keeping capitalized stuff up front? I use LC_COLLATE=C to get this, but then characters with diaereses don't sort correctly.

I tried to hack my way around fixing this a few years ago, but after many hours gave up.


To do that you would have to define a custom locale in /usr/share/i18n/locales and define a custom LC_COLLATE section. See https://man7.org/linux/man-pages/man5/locale.5.html You can read /usr/share/i18n/locales/iso14651_t1_common as a (very) complex example.

But, full disclosure: I have never done it myself.


Yeah, that's what I tried, but was never able to get that to work. I don't know, it's been years and I had kind of given up on it.
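
This doesn't fix ls(1) or define a locale, but the ordering being asked for can at least be sketched as a sort key in Python. Everything here is an assumption for illustration: the helper names are hypothetical, the file names are made up, and stripping combining marks is only a crude stand-in for real en_US.UTF-8 collation:

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop combining marks (e.g. the diaeresis) after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def caps_first_key(name: str):
    """Names starting with an uppercase letter sort first; the rest ignore accents and case."""
    starts_upper = name[:1].isupper()
    return (not starts_upper, strip_marks(name).casefold(), name)


names = ["zebra.txt", "README", "über.txt", "Makefile", "apple.txt", "ärger.txt"]

plain = sorted(names)                       # code-point order: accented names land after 'z'
hybrid = sorted(names, key=caps_first_key)  # caps up front, accents sorted with their base letter

print(plain)
print(hybrid)
```

With the key function, Makefile and README stay at the top while ärger.txt sorts next to apple.txt instead of after zebra.txt.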


README.TXT or README.1ST have long been used when distributing software; I'm assuming it's just to follow that trend. FILE_ID.DIZ files were all caps as well.


Randomly Explains Ambiguous Development Methodologies Endlessly


I still habitually use README.txt and/or README.md even though none of my environments require the upper case.

README.TXT vs README.MD:

https://stackoverflow.com/questions/8655937/what-is-the-diff...


One example of a situation when the answer "It is the way it is because this is the way it is" is correct.


Huh, I always assumed it was because upper case characters come before lower case ones, but in the temporal sense. Just a holdover from the days before lower case was an option, like FORTRAN was for a while.

The sorting trick is neat, though.


LC_COLLATE=C is still something I set on my linux installs.


My favorite configuration is just LC_CTYPE=en_US.UTF-8 and unsetting all other LC_* variables and unsetting LANG as well. That way you have full UTF-8 support, except ls(1) will sort the files as I want with uppercase coming before lowercase.


Agreed, for several reasons. Fun fact - any regex using the common idiom [a-z] will break on certain locales when certain characters like "y" are present.


all caps is (was?) typically considered SHOUTING so README gets your attention


> Avoid using more complicated markup languages like HTML in the README file, however, because it should be convenient to read on a text-only terminal

And that's a rule that README.md files on Github can hardly follow these days.


I kinda thought it was because it was easy to tab autocomplete.


This trend happened before autocomplete was common. If you logged into a commercial unix machine in the 90s it would not have autocomplete out of the box.

I assumed it was to sort at the top of 'ls' output, which is one of the answers on stack exchange.


Autocomplete was very common in the 90s — installing bash or tcsh was generally one of the very first things one did on a commercial UNIX that didn’t ship with a decent shell.


I'm having a hard time finding people writing about this, because so many people write about shells from a more recent perspective. According to this, tcsh introduced completion in 1983.

https://developer.ibm.com/tutorials/l-linux-shells/

Early on it used esc instead of tab.

I also found in the bash changelog from 1996 there are fixes to crashing bugs in file name completion.

However, I stand by what I said. It was common for people not to know about filename completion. It was common to ship without it by default. Most clued in people installed bash on those systems. But not everybody.


How is "README" easier to tab-autocomplete than "Readme"?


Weren't all file names all caps at one point?


Yes but that's not the reason. Back in the 90's shareware scene long before Github, an overwhelming number of files would be downloaded to your computer and then you'd be left to wonder where to start. Making a file all caps draws your attention to it and lets you know where the author intends you to start. Today, Github opens that file for you on the front page of every repo, so it doesn't matter so much anymore.


Yes. And even with lowercase, when all you have is ASCII, all caps becomes a reliable/x-platform way to add emphasis. The question is a fun watercooler topic but the accepted answer is among the various reasons the convention arose - it probably doesn't have a simple answer that's The Answer.


It was an act of desperation. People wouldn't read docs or sanely named information, so people would jokingly title something "readme" or "important" or "mefirst". It was kind of seen as pathetic originally, to have to be so bad at your job you needed to include a README file. Those were much more arrogant times. I still remember seeing repos that actually said it was a good idea to have a README file in them and being bemused.

Now I love them, but I've grown to love simple, "dumb" solutions.


I use README.org (emacs). Sadly, npmjs.com doesn’t consider it as a readme file.


And LICENSE has no file type!


People forget the simtel20 archives, with its 00_INDEX.TXT and AAAREADME.TXT


I also use 0README.txt so it shows up first when sorting by name.


As an aside, Neal Stephenson wrote a book called REAMDE


Because really it should have been ReAdMeGoDDaMMiT so people rtfm :)


I worked with a rather conservative person who didn't like "README" as he thought that computers should never be aggressive or yell at their pilots. Smart guy, but had some strange tendencies.


I can see the first guy writing a README, so desperate because people would ask obvious questions, ignoring a file called readme :)


This was my thought too, it's borne of frustration.


Shortly after, same guy invented "RTFM"...


Plausible and gave me a good chuckle.

Thanks.



