 ----------------------------------
 | The New Plain Text Style Guide |
 ----------------------------------

======================================================================
DISCLAIMER: Nothing in this guide should be taken as a requirement. While some of this material reflects official standards as published by various agencies, even those are not compulsory. Much of this is simply what is already in use around the world, and will be quickly recognized by the average Internet user.
======================================================================

INTRODUCTION

Most folks have seen or heard of something called a "style guide" or "manual of style." These things are usually published for select groups of folks writing for a specific audience. Literary types will often choose the MODERN LANGUAGE STYLE MANUAL, historians will probably be stuck with something from Strunk and White, the sciences have their guides, and so on.

The Internet is no different. While the Internet is subject to regulation from various types of governments, as well as the self-enforcing standards of Internet professional organizations, most of what crosses through The Ether may or may not adhere to any rules at all. Many of us serious "Netizens" like it that way. Still, it helps if most users tend to be on the same sheet of music. There is no attempt here to press some agenda as to what *should* be, but rather to describe in general what *is*.

There is one other motive to consider: publication. That is, if you want others to read what you write, it helps to use a format accessible to as many computers as possible. Keep in mind that there are an awful lot of really old computers still in use, often as the only thing available in some places. What follows is your best shot at reaching them all. Hopefully the reader will consider this not too difficult to follow.

You will find that many serious Netizens are involved in computers because they don't get along all that well with their fellow humans. Internet Land is their home. As computers have become increasingly accessible and usable to those who do not share their depth of technical interest, their domain is being flooded with "idiots" who just don't grasp the subtleties of Netiquette and e-language. By the way, "e-" anything usually means "electronic." Thus, an e-book is a book in electronic format -- a computer file -- rather than on printed paper.

The average Internet user is often at a loss to even begin picking through the mass of human knowledge available on the Internet. Those who wish to get the most from their Internet use, without becoming hard-core "techies" or "geeks" or whatever, can probably benefit from a simple guide. Hopefully that's what this is.

----------------------------------------------------------------------

BASIC CONCEPTS

The basic reason for all this is that storing information, especially documents, on a computer makes life easier for most people. We can carry the equivalent of the entire Library of Congress in a modern laptop computer with room to spare. We can write our documents far more easily, change them even more easily, and send them to all our friends at once. They are easier to keep safe from loss and damage, and it's easier to find the exact phrase or words we want when we review them. The computer does most of the work; we need only be creative enough to produce them and make sense of them.

However, there are significant differences between reading words on paper and reading them from a glowing screen. We'll skip all the technical stuff and simply note that it is already known from years of study what works best. Let's make use of that information. This is not about using your computer to publish printed material, so I won't delve too much into that, but we need to highlight a few facts about something called "word processing."

The average computer user knows that one of the nifty things a computer does for us is make it easy to produce very nicely formatted pages through a printer connected to that computer. We know that word processors can do at least what a typewriter used to do, plus we get to wait until we are all finished before the ink goes on the paper. As on most typewriters, we can do things to enhance text by, for example, underlining the letters. A few typewriters could do boldface type, and a very few could do italics. Word processors can do all that and more, including making various typefaces, different sizes, different colors, adding pictures, and so on.

Of course, nowadays underlining is considered old-fashioned. All the various academic, computer and Internet standards use italics where the old typewriters used underlining. Part of the reason is that underlining is reserved for a special use: in webpages it marks text that is linked to an outside document, or something elsewhere within the document. The link offers related information to the part that is underlined. Keep in mind this is voluntary, and you'll see folks ignoring this standard.

Back to word processing: In order to do all those things with our typed words, the word processing program adds codes to the file so that it will display and print these enhancements. These added codes are not displayed or printed themselves, but are used to change the letters that are displayed or printed. We refer to these codes as "formatting." We say that our file full of words is "formatted." We speak of the particular "format" used to store these files.

Not everyone in the world will have access to the same word processor that you use. Each of the different word processors out there uses its own special codes. For that reason, we have a select few "formats" commonly used for e-documents. As long as everyone with whom you share computer files has the same software, it's no problem. What are the chances?

Out of the billions of people who have access to the Internet, the vast majority are using computers that run some version of "Windows." But there are several million who use a Macintosh. Quite a few folks use something like Linux, Unix, or a related system. In fact, most of the computers that make the Internet possible don't run Windows. It is considered very rude to assume everyone will be using the same software, and thus be able to use documents with the same format. Besides, those word processing formats are designed for printing, not for reading on a computer.

Word processing files are identified by an ending of 3 letters: something.doc, something.wpd, something.rtf, and so forth. The formats approved for Internet use include plain text (files ending in .txt), HTML (.htm & .html), and the newer XML (.xml).

Of the few formats considered proper for e-documents, plain text is the most efficient. That is, the size of the files for plain text is the smallest of all. Sometimes it's less than 10% the size of something formatted for printing. Smaller files pass through the Internet much faster. Even better, plain text is less likely to suffer damage while being sent over the Internet. For email, you should consider restricting yourself to plain text whenever possible.

----------------------------------------------------------------------

PLAIN TEXT ENHANCEMENTS

Does that mean we have to give up *all* enhancements? Well, obviously not, since that last sentence has an example of enhancement. And you probably already noticed some similar enhancements above. Let's look at what's possible.

The fancy technical term for plain text you'll often see is ASCII, which stands for "American Standard Code for Information Interchange." This was one of the earliest standards for storing information on computers. Way back when only governments, colleges and huge corporations could afford computers there were severe limits on what they could do.

The number of codes one could type from a keyboard was limited to 128. Of those, 33 were used for the computer codes and commands, and the other 95 were used for various characters that were displayed on the screen. Well, we have 26 letters in the English alphabet, and each of those has an upper and lower case, so that takes up 52, leaving 43 for the ten digits, the space, punctuation and popular symbols.

That was the standard for quite a few years, and folks simply got used to it. Computer programs were designed to work with that system, and what we call the Internet was built around that.

In order to get around these limitations, certain conventions were developed. The first concept is called "bracketing." That means we can enhance our plain text by making sure that whatever we do to fancy it up has a start point and an end point. Think of it as you would quotation marks: everything between the first and last mark is a part of the quotation. We bracket our quotations with double quotation marks in America. Folks in the UK tend to use single quotation marks more often.

If you want to signify that a word or two should be in boldface type, put asterisks on each end: *bold*. For underlining or italics put the underscore mark at each end: _italics_. Recently, some software has begun to signify italics with the forward slash symbol: /italic/. You are likely to see any or all of these at various times when people want to emphasize one or more words. Obviously, if the number of words between the brackets is very long, the whole thing loses its punch. Use them sparingly. Don't, as some do, fill every space between words with the bracketing marks: _this_is_not_proper_.

Using all capital letters traditionally signifies SHOUTING. People who type in all caps all the time are considered rude, especially in message formats (e-mail, instant messaging, etc.). Still, it shows the flexibility of the system, in making room for drama. Whole plays have been written in ASCII.

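For the curious, the character counts given earlier in this section can be checked with a few lines of Python. This is only a sketch using the standard string module; the variable names are made up for illustration, and the "others" group is simply whatever is left over once the letters are taken out.

```python
import string

# ASCII defines codes 0 through 127.
all_codes = range(128)

# Control codes: 0-31 plus 127 (delete). These are never displayed.
control = [c for c in all_codes if c < 32 or c == 127]

# Displayable characters: 32 (space) through 126 (tilde).
displayable = [chr(c) for c in all_codes if 32 <= c <= 126]

# Upper- and lower-case English letters.
letters = [ch for ch in displayable if ch in string.ascii_letters]

# The ten digits, counted separately so we can see what remains.
digits = [ch for ch in displayable if ch in string.digits]

# Everything displayable that is not a letter.
others = [ch for ch in displayable if ch not in string.ascii_letters]

print(len(control), len(displayable), len(letters), len(others))
# prints: 33 95 52 43
print(len(digits), len(others) - len(digits))
# prints: 10 33  (the 33 are the space plus 32 punctuation marks)
```
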
All caps are also used as the simplest form of subheading in longer documents. It's also a good way to signify proper titles of things like books, movies, and so forth. I know it gets confusing, because most people use underlines in typing, or italics in printing for that, but the use is terribly inconsistent. You'll even see some articles wrongly using quotation marks for that. Italics are _way_ overused. It's best to use them for emphasis or foreign words.

Brackets unique to Internet communications are the angle brackets, which look like arrow heads to some. They are properly called the "greater than" and "less than" symbols. Any time you see them in text, they should signify something related to the various Internet protocols. For example, you should put e-mail addresses inside them: <someone@example.com>. However, if you expect them to show up anywhere that involves a browser or e-mail client, you'll need to add the protocol prefix so that it's handled properly: <mailto:someone@example.com>. It's also the norm for webpage addresses when they appear in a text, sometimes called a "URL" (Uniform Resource Locator): <URL: http://www.example.com/>. Notice the space between "URL:" and the address.

----------------------------------------------------------------------

SPACE USE ISSUES

If all we did was type short notes to each other, we've covered most of what there is to know. The rest is not so simple. You'll find that a large number of people don't observe good formatting for much the same reason they don't use formal English in conversation. Complete slobs aren't likely to have read this guide in the first place, so I'll assume the reader wants to know, even if they don't use these rules.

Text for screen display should be in "block format" -- that is, all the paragraphs line up with the left edge of the screen, and are separated by one blank line between them. Each paragraph should be fewer than 40 lines of text. That's because it's hard to read paragraphs on a computer screen when they go on and on below the current screen view. It's one thing to save old documents in ASCII format with paragraphs that run 100 lines or more, but avoid writing that way if you can.

The subject of much debate still is line wrapping -- breaking off the line of text, and putting it on the next line. A good rule of thumb is that it doesn't matter much for short documents that might be as long as two screens of text display. For longer ones, it's easier to read if they break at some distinct point. That point is about 75 characters, or 75 columns if you prefer. Some e-mail software will break it off at odd points anyway. What you send looks nothing like what they get on the other end, with lines of all different lengths. Most of the time 72 is safe, 70 for extra safety.

Of course, another advantage is that you can add fancy touches to wrapped text, such as centering a line, drawing boxes around things, and so forth. You can also do indentations for what are known as "blockquotes." If you've ever written a paper for English composition class, you may know that any quote that takes up more than 3 lines of text in a single paragraph should be set off in an indented paragraph by itself. The Internet standard for indenting is in 3-space increments, so it would look like this:

   This is an indented paragraph. Some may use the term "hanging
   indent" to note that the indentation sort of hangs in place for
   the length of the paragraph. The Internet term is "blockquote"
   because it is a longer quote set off as a separate block of text.
   Good text editors will keep the indentation going for you.

As you might guess, the same sort of rule applies to the concept of outlining, where sub-paragraphs are indented. The 3-space rule continues for each level below, each in increments of 3. Aside from the varied schemes for numbering of outlines, there is an e-standard for simple bullets.

 - The first level is a single hyphen.

    = The next level is an equal sign.

       + The final level will be a plus sign.

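If you would rather let a program do the counting, the wrapping and 3-space indentation described above can be sketched with Python's standard textwrap module. Treat this as one possible approach, not a standard: the function names, the WIDTH of 70 and the marker layout (space, marker, space) are my assumptions drawn from the numbers and examples in this guide.

```python
import textwrap

WIDTH = 70   # assumed right margin: the guide's "extra safety" figure
INDENT = 3   # the guide's 3-space indentation increment


def blockquote(text, level=1, width=WIDTH):
    """Wrap text with every line indented 3 spaces per level."""
    pad = " " * (INDENT * level)
    return textwrap.fill(text, width=width,
                         initial_indent=pad, subsequent_indent=pad)


def bullet(text, marker="-", level=1, width=WIDTH):
    """Wrap a bulleted item: marker sits inside the 3-space margin."""
    outer = " " * (INDENT * (level - 1))
    first = outer + " " + marker + " "   # space, marker, space
    rest = outer + " " * INDENT          # later lines align with the text
    return textwrap.fill(text, width=width,
                         initial_indent=first, subsequent_indent=rest)


quote = ("The Internet term is blockquote because it is a longer quote "
         "set off as a separate block of text.")
print(blockquote(quote))
print()
print(bullet("The first level is a single hyphen."))
print(bullet("The next level is an equal sign.", marker="=", level=2))
print(bullet("The final level will be a plus sign.", marker="+", level=3))
```

Changing WIDTH to 72 or 75 is all it takes to try the other margins mentioned above.
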
Very often you will see asterisks for bullets, especially if there is only one level of bulleting. If any deeper levels than 3 are necessary, rely on some other scheme, and avoid indenting for any of it.

Please note the placement of the bullets: format the text with the 3-space block indentation, but place the bullet in the middle of the spaces on the first line. That is: space, marker, space, text. With numbers it looks like this:

1. Numbered paragraphs. They should be formatted as a blockquote,
   with all the text together behind the 3-space margin. The number
   is at the left margin, followed by a period, or right parenthesis
   in some cases, followed by a space before the text begins.

Every paragraph, including bulleted or numbered ones, should be separated by one blank line between them. On longer documents, subheadings may have two blank lines above them, and chapters would have three. More than that is too much. Excess "white space" is a no-no.

In fact, do not hit the space bar twice between sentences. One may see that still required for "old-school" typed documents, but it is now obsolete. Nor should you indent the first line of any paragraph. There is a specific reason for that: an indented first line is the visual clue for footnotes.

When reading paper documents, it is not too hard to jump back and forth to the bottom of the page, or even to the back page for endnotes. That's pretty clumsy on a computer screen. Endnotes in e-text are for things you don't want the reader to see until last. Anything that genuinely qualifies as footnote material needs to be placed immediately below the paragraph where the reference appears. In the paragraph, since we can't have little numbers that are superscripted above the line, we put our number in square brackets, like so: [3]. Because of line wrapping, to keep it from getting separated from its associated text, attach it to the end of the last word, or the punctuation mark at the end of text being footnoted.

Immediately below the paragraph[1], indent the first line of the footnote (3 spaces of course) and type the same referencing mark -- the square bracket around the number. Hit the space bar and type the footnote, with all following lines flush left. An example appears below this paragraph.

   [1] It must be noted that if there is a blockquote or bulleted
list attached to the paragraph, then the footnote would follow all
of that, plus any additional regularly placed text, until the
paragraph is actually closed.

Do not use asterisks for footnotes in e-text, as they are needed to mark bold text.

----------------------------------------------------------------------

PUNCTUATION

Aside from the standard uses for the English language, there are a few conventions unique to e-text punctuation.

The primary difference is in the placement of quotation marks. We are all quite used to the type-setter's trick of placing the period at the end of a sentence under the double quotation marks. For American typewriter products, the closest we could come was to put the period before the final quotation marks. That was not the standard prior to the days of fancy type-setting. Originally, punctuation went outside the quotes, unless it really was a part of the quote. It is too much of an ingrained habit for most people, so do what is comfortable for you.

British usage treats quotation marks as bracketing, and you'll often see periods and commas outside the quotation marks. Many long-time computer users also do it that way. If there is a strong likelihood you'll need to reformat the ASCII file for printing, it's a lot of work to shift them back and forth.

Too often we use quotation marks to set off something that we want our readers to type someplace, such as when giving instructions. If I am writing a tutorial on how to enter a password into a computer prompt, I might end up with this:

   At the prompt, type in "password."

Is the period part of the password? If it is, I should have put another period at the end of the sentence. To avoid confusion, I'll do this:

   At the prompt, type in "password".

We've already noted that the British also tend to use single quotation marks where Americans use double. Pick one and be consistent; readers will figure it out. For Americans, we use single quotes for one purpose only: quotations within quotations, called "nested quotes":

   The man said, "She told me, 'Leave that alone!'"

Here I need to say a word to Windows users about "smart quotes." While the subject of this article is ASCII text, the world of Windows will often use an extended character set that goes beyond ASCII; you may see it called ANSI (the older Windows code pages) or Unicode. That allows for additional symbols to be used that are missing from mere ASCII, including the characters of languages such as Chinese. This is becoming the new standard, but there are far too many computers not there yet.

Among the most troublesome symbols coming from Windows computers are the specific opening and closing quotation marks, nicely shaped and different from each other. The writing software is supposed to be smart enough to know which to use and display for which end of the quote. They look really nice in print. They are seldom properly rendered on computers that don't run Windows. If the software you are using to compose in ASCII format automatically displays these nifty quotation marks, you will have to learn to work around that. Some editors will give you the option to stop it. Please learn to break the habit.

The same goes for several other often-used symbols. There is a standard for each one in ASCII. The copyright symbol looks like this: (c) or (C). The trademark symbol is (TM) or (tm). For a registered trademark, (R).

Another common problem is the use of ellipses and dashes. The main point is to avoid taking up unnecessary space and to avoid visual display problems with line wrapping.

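If your editor won't stop producing smart quotes and fancy symbols, the substitutions described above can be done after the fact. Here is a minimal Python sketch, assuming the text has already been read in as Unicode; the table and the function names are hypothetical, and the table covers only the quotation marks and the three symbols this guide mentions.

```python
# Hypothetical lookup table: fancy character -> plain ASCII stand-in.
ASCII_EQUIVALENTS = {
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark (also the apostrophe)
    "\u00a9": "(c)",  # copyright sign
    "\u2122": "(TM)", # trademark sign
    "\u00ae": "(R)",  # registered trademark sign
}


def to_plain_ascii(text):
    """Swap each listed character for its ASCII stand-in."""
    for fancy, plain in ASCII_EQUIVALENTS.items():
        text = text.replace(fancy, plain)
    return text


def non_ascii_chars(text):
    """List any characters outside 7-bit ASCII that are still present."""
    return sorted({ch for ch in text if ord(ch) > 127})


sample = "\u201cLeave that alone!\u201d \u00a9 2003 Widget\u2122"
cleaned = to_plain_ascii(sample)
print(cleaned)
print(non_ascii_chars(cleaned))
```

Anything the second function still reports is a character the table does not know about, and needs a decision from the writer rather than a blind replacement.
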
Sometimes you'll see an ellipsis to indicate that the writer sort of tapers off without finishing the thought, or perhaps to indicate a dramatic pause in verbal dialogue. Normally an ellipsis is 3 periods between words to indicate something cut out of quoted material. Usually what's cut out was not important for the discussion, but we assume you haven't left out anything that matters, so as to be actually misquoting. If such an ellipsis falls at the end of a sentence, then use it like a single punctuation mark. If it breaks the middle of a sentence, let it stand on its own, with a space on either side: something ... like this.

Only use four periods when you would in any other setting: to signify that more than one sentence was left out between quoted portions, or if the lapse occurs between two or more paragraphs. In blockquoting, simply start a new indented paragraph if the material is drawn from a separate paragraph in the source. Make your four-period ellipses rare.

Dashes need to be distinguished from hyphens. We might want to break hyphenated words manually at the end of the line for line wrapping, and that's okay. If what stands between two words is a dash, use a double hyphen. As with the mid-sentence ellipsis, it should always have a space before and after. Dashes are most useful in setting off parenthetical comments, such as -- surprise! -- this. They can also be used to signify an abrupt change in thought -- after which the sentence ends. (I know, it would have been properly a comma, but I couldn't think of any better way to demonstrate the idea.)

If you have a famous quote set off as a blockquote, and you want to attribute the source immediately, then use the tilde: ~. In e-text, the tilde normally signifies that what follows should be considered a signature. If I make an editorial notation in some document written by someone else, I would enclose the comment in square brackets, and at the end, a tilde followed by my initials.

   [Editorial note. ~ JEH]

Aside from footnotes, you should assume square brackets also indicate something not in the original text, but perhaps implied, guessed, or simply necessary to make sense of the text, and other editorial notations. For example, a question mark or exclamation point in square brackets is a subjective reaction the writer has to something in the material they are quoting.

----------------------------------------------------------------------

OTHER MARKINGS AND ASCII ART

By now the reader has noticed the horizontal dividers made by repeating hyphens from one margin to another. That's simply a matter of personal style, but is generally understood by readers. When using such lines, the double line (equal signs) serves as a major separator; for example, to mark off the document header lines: the title, author's name and e-mail, and the date. When you write an article, those three are a basic minimum. If you intend to protect your rights to the material, some sort of copyright notice needs to appear at the end.

For various levels of headings and sub-headings, you can underline the titles with a dashed line on the line below, or double line with equal signs. You can break up a story with centered lines of various characters:

   tilde       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   asterisk    ********************************
   plus sign   ++++++++++++++++++++++++++++++++

Depending on your needs and your creativity, there are many ways to use marking for boxes, banners, and so forth.

         /----------------------\
     \---|  THIS IS MY BANNER!  |---/
      >  \----------------------/  <
   /----\|                      |/----\

Hyphens, equal signs, and underscores make horizontal lines. The "pipe" is the simplest vertical line, usually the backslash key (\) shifted. The slashes can be used for fancy corners, and so on. Then, be prepared to enter lots of empty spaces to place things where you want them.

There is a massive collection of something called ASCII Art available on the Internet.
While it is possible to find software that can interpret two dimensional images into very vivid representations using ASCII characters, a great deal of this art is created by individuals pecking away until they come up with something clever. Simply conduct an Internet search with the words "ascii art" and see where it leads you.

----------------------------------------------------------------------

SOFTWARE FOR THE TASK

In the modern world of the graphical computer display, there's just not a lot of market for software that serves as a "word processor" for plain text files. Most people who work with such files learned it when only the most primitive software was available. Very often one ends up using something designed more for writing computer code, which usually starts out as ASCII. Still, there are some good programs out there that are free for the private user.

For the majority, who run Windows, you can download one of a handful of free programs. One of my favorites was NoteTab Lite. It could be set up to break text lines wherever you liked, and could "reformat" any paragraph that had been edited so that the lines are back within your right margin limit, and short lines are merged in properly. On anything that runs a Unix-like system, including Linux, the best so far is NEdit. Having never used Mac for much of anything, I can't make any recommendation.

At any rate, some few word processors support plain text as a useful format, but complain about it. Most people are better off not using them for ASCII. Just about the most important feature is being able to set your right margin for line wrapping. Second to that is the ability to reformat a paragraph after editing, so that the lines are re-wrapped to the right length. A third issue is whether the program can automatically center a line of text for you, without all those typewriter calculations. A final useful feature is being able to shift whole words or sections of text between upper and lower case at one stroke. Much of the rest is up to the writer.

======================================================================

Feedback gratefully received.

Ed Hurst
Updated: 21 January 2003

COPYRIGHT NOTICE: People of honor need no copyright laws; they are only too happy to give credit where credit is due. Others will ignore copyright laws whenever they please. If you are of the latter, please note what Moses said about dishonorable behavior -- "be sure your sin will find you out" (Numbers 32:23).