WordStar for DOS (All Versions)

The Why and Wherefore (of ASCII)

Introduction

The following is an article written by Dan Strychalski in response to a request for him to explain keyboard command history with particular relevance to WordStar-style commands. His article describes the underlying basis of all keyboard command systems outside of IBM mainframes and Microsoft operating systems. The system he describes is ASCII, which provides common cross-platform commands as well as the better-known character table. Why Microsoft decided to lead its users into a dead-end, proprietary, and awkward command scheme rather than embracing and expanding on the cross-platform keyboard command standard, only they can know.

Over to Dan...

I have been asked, privately, to discuss keyboarding history and the advantages of the WordStar keyboard command set. I have decided to respond publicly. It seems like a good idea to do this while my post on ASCII and RTF is fresh in people's minds.

I trust I need not go into the physical virtues of the keystrokes or the conceptual advantages of WordStar's division of functions. Most people here have seen Rob Sawyer's treatment, and he's a hard act to follow.

Nor will I attempt to describe what I, if I had to come up with a ranking, would probably list as these keystrokes' second-greatest advantage: what I call the Exhilaration Factor. Outside of Unix circles, the only people I've ever known to speak of using the keyboard as a joy are WordStar users. And what percentage are we of all the people, other than those in Unix circles, who must use the keyboard day in and day out?

Millions never had the chance to experience that joy. Not because they didn't have the chance to use WordStar, but because they didn't have the chance to hold down Ctrl and press a letter key.
Those keystrokes could have been used for any commands at all, but in all big-name mass-market software other than WordStar they were used for virtually no commands at all until every big software company but one was dead or dying. We speak of "WordStar keystrokes," but the term is a misnomer. Ctrl-A through Ctrl-Z, as well as six combinations of Ctrl with symbol keys, belong to no organization or product. They are the ASCII/ECMA-6/ISO 8859/Unicode command keystrokes. They generate patterns of bits that correspond to the decimal numbers 0 to 31. Think about that: zero to thirty-one. Machines designed to operate on the basis of patterns of bits that correspond to numbers ought to make use of those patterns, don't you think? Those patterns were intended by the creators and approvers of ASCII — a 75% or greater majority of the industry, market, and special-interest players participating in the formal standards process over the past four decades — to be recognized as non-printing control/command/function codes in all information interchange between computer systems and between computers and peripherals. A keyboard is a peripheral. On IBM-type micros it is so tightly integrated into the system that it stretches the definition of the term "peripheral," but a peripheral it is — and in combination with the machine's BIOS, it is to a great extent an ASCII peripheral. ASCII is a work of art. That so few people appreciate this is mainly because of the so-called ASCII tables in manuals for MS/PC DOS programs. Most of these tables (including that in the WordStar 7.0d manual) are, it pains me to report, wrong: they show printing characters for codes that ASCII says should not print, and they show characters and codes outside the ASCII range. Microsoft itself, in recent years at least, has circumspectly labeled such tables "character charts." 
These non-ASCII tables and charts usually give the codes in decimal notation (that is, the familiar base-ten numbering system) and are arranged in two columns, making it impossible to perceive the structure of the standard. Let's look at that structure by examining an ASCII table of the kind found outside Microsoft's sphere of influence:
The number to the left of lowercase a (second row down, second column from the right) is read "six-one," or strictly speaking, "six-one hex," "hex" meaning hexadecimal, or base sixteen, notation. In running text it would be written 61h, 61H, 0x61, or in any of several other forms. What it means is six times sixteen (ninety-six) plus one — in other words, ninety-seven. Big deal, you say. Yes, big deal, I say. A formal, non-proprietary, consensus-based standard is always a very big deal. Thousands of people work on it. Hundreds of organizations vote on it. A lowercase a resides in a computer's memory (or shoots through a printer cable or communication line) as a group of eight electrical charges. Each charge can have either one voltage level, interpreted as the digit 1, or another, interpreted as the digit 0. We assign each charge a fixed place value from the binary notation system. This allows the eight charges to express a value from zero to two hundred and fifty-five. (To avoid confusion, I will use words for decimal notation as much as possible. Notice that I don't say "decimal values." A value is a value is a value; decimal, hex, and binary are simply different ways of writing the same values.) The charge with the greatest place value is often referred to as the top or highest (or "most significant") bit. Let's arrange the bits of our binary-coded ninety-seven, or 61h if you prefer, vertically from highest to lowest and see what we get:
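For readers who would like to check the arithmetic, here is a minimal Python sketch. It is not part of Dan's article, and the helper name `place_values` is made up for this illustration. It lists each of the eight bits in the code for lowercase a next to its place value, from the highest bit to the lowest, and then adds up the place values of the bits that are set.

```python
def place_values(code):
    """Return (place value, bit) pairs for an 8-bit code, highest bit first."""
    return [(1 << n, (code >> n) & 1) for n in range(7, -1, -1)]

code = ord("a")                      # the code for lowercase a
rows = place_values(code)

for value, bit in rows:
    print(f"{value:>3}  {bit}")

# Adding up the place values of the bits that are set gives the code back.
total = sum(value for value, bit in rows if bit)
print(f"total: {total} (hex {total:02X})")
```

The last line should report ninety-seven, written 61 in hex, matching the six-times-sixteen-plus-one reading given above.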
*In everyday decimal notation; words would make this table too wide.

Easy to understand, but not very interesting. To make it more interesting, let's write the bits horizontally in groups of four, with the most significant bit at the left, and compare them with the bit patterns for 16h (one-six hex, twenty-two decimal), 11h (one-one hex, seventeen decimal), and 66h (six-six hex, one hundred and two decimal):
Hey! Comparing hex to four-place binary notation, we see that a hex 6 always corresponds to 0110 binary, and a hex 1 always corresponds to 0001 binary. With a little practice, a glance at any hex digit tells us the exact state of four electrical or magnetic charges — four bits — in memory, on a disk or cable, or wherever. Two digits tell us the state of every bit in an eight-bit byte. And most people think computer engineers use hex just to show off and make things hard for the rest of us.... Interesting? I hope so. Important? No! What is important is knowing what you are looking at when you see hex notation, and why it is used. Like recognizing French when you see it, even if you can't read it. So don't memorize the following digits and bit patterns; just be aware of them:
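The correspondence between hex digits and four-bit patterns can be sketched in a few lines of Python. This is an illustration added for this page rather than anything from the original article, and the names `nibbles` and `digit_table` are hypothetical. It splits each of the four codes compared above into two groups of four bits, then lists all sixteen hex digits with their patterns.

```python
def nibbles(code):
    """Split an 8-bit code into two 4-bit patterns, high half first."""
    return f"{code >> 4:04b}", f"{code & 0x0F:04b}"

# The four codes compared in the text.
for code in (0x61, 0x16, 0x11, 0x66):
    high, low = nibbles(code)
    print(f"{code:02X}h = {high} {low}")

# Every hex digit with its four-bit pattern.
digit_table = {f"{d:X}": f"{d:04b}" for d in range(16)}
for digit, bits in digit_table.items():
    print(digit, bits)
```

Each hex digit always maps to the same four bits wherever it appears, which is why two hex digits are enough to describe a whole eight-bit byte.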
(Yes, the use of letters in hex can be confusing ["A" means 1010, but 1010 doesn't encode the letter A — huh?!?]. It gets worse! It sorts itself out magically, however, if you use hex regularly. Really.) We see something both interesting and important when we scan across the table to the left. Let's look again at the binary form of the code for lowercase a — 61h, ninety-seven decimal, call it what you wish — and compare it with that for uppercase A — 41h, or sixty-five decimal:
Let's also look at the codes for lowercase z and uppercase Z:
Notice that only the third bit from the left changes when we go from lowercase to uppercase or vice versa. This greatly simplifies the design of sorting routines, keyboards, printers, and more. A similar mechanism is at work between the fourth and third columns of the ASCII table, where we find the Arabic numerals and most punctuation marks. The majority of computer keyboards used to be laid out to take advantage of this. The benefits to manufacturers and consumers of letting the shift keys work by changing a single bit hardly need elaboration. (EBCDIC, which IBM tried mightily to get adopted in place of ASCII, has similar benefits, but for punched cards only. The creators of ASCII had a sneaking suspicion that punched cards would not be the input method of the future.) These are not the only places in the table where a mechanism of this kind is at work. (If you already see where I'm heading, please don't ruin it for the others.) You may have noticed that four pairs of adjacent columns in the ASCII table contain similar elements. Behold how the second and third bits from the left allow devices and programs to distinguish among the four groups — control codes, punctuation/symbol/numeral codes, uppercase character codes, and lowercase character codes:
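Both mechanisms can be tried out with a short Python sketch. It is an added illustration under simple assumptions, not code from the article: `CASE_BIT`, `GROUPS`, `flip_case`, and `group` are names invented here, and the group labels follow the article's wording even though the letter columns also hold a few symbols. The sketch flips the third bit from the left to change a letter's case, and reads the second and third bits from the left to decide which of the four groups a code belongs to.

```python
CASE_BIT = 0b0010_0000   # third bit from the left in an eight-bit byte

# Group labels as used in the article, keyed by the two group bits.
GROUPS = {
    0b00: "control",
    0b01: "punctuation/symbol/numeral",
    0b10: "uppercase",
    0b11: "lowercase",
}

def flip_case(letter):
    """Toggle the case of an ASCII letter by flipping one bit."""
    return chr(ord(letter) ^ CASE_BIT)

def group(code):
    """Name the group from the second and third bits from the left."""
    return GROUPS[(code >> 5) & 0b11]

for ch in "aAzZ":
    print(f"{ch}  {ord(ch):08b}  flips to {flip_case(ch)}")

for code in (0x08, ord("7"), ord("H"), ord("h")):
    print(f"{code:02X}h  {code:08b}  {group(code)}")
```

Note that `flip_case` is only meaningful for the letters A to Z and a to z; applied to other codes it still flips the bit, but the result is not a case change.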
This basic scheme is also used in the 128 additional assignments made in ISO 8859-1 (so-called Latin 1). Processing power being cheap these days, you might think it no longer matters how many bits we need to change or check to perform an operation, but it does, and using a scheme like this has countless other advantages as well. (The artist in me also finds it stunningly elegant, but I doubt it was adopted for aesthetic reasons.) There's not much of a point to making all computers and peripherals use the same codes for printing characters if they all use different codes for things like tab, carriage return, new line, new page, and so forth. In addition, other non-printing codes are necessary for controlling transmission, "escaping" to a different mode, separating records and files, canceling, deleting, etc. EBCDIC and many other schemes that existed before ASCII put such functions at or near the low end, the beginning of the available "code space," and the creators of ASCII followed this model. They also thought it would be a good idea if these codes could be sent easily from the keyboard — and nobody, except perhaps IBM, thought it would be best for all concerned if this required every keyboard to have sixteen or seventeen extra keys. (IBM was powerful enough not to care if the standard increased product complexity and cost. I suspect they kind of liked the idea. Customers and smaller vendors had a somewhat different perspective. The customers and smaller vendors won, but only with help from the U.S. government. Guess what? Something similar is happening now. Whose side are you on?) Let's look again at a few lines of our original ASCII table, adding the groups' bit patterns and names, taking out extraneous material, scanning from right to left, and adding the relevant lines from a key of the kind that accompanies better ASCII tables everywhere:
bs  Backspace
ht  Horizontal tab
nl  New line (formerly lf, line feed)
vt  Vertical tab
np  New page (formerly ff, form feed)
cr  Carriage return

Does it take a sharp eye to see that by zeroing two of the eight bits produced by any letter key you get a code from the first two columns of the table? We saw that the shift keys change one bit to zero, and that it makes a great deal of sense for this to be so. Seems it would make sense to have another modifier key that changes two bits to zero and gives us... hmmm, what was the first group labeled? "Control"? Hmmm. HMMM....

Sharp-eyed 'Starpeople, if they'd had such a table (instead of the Microsoft kind) lying around, might have had occasion to get curious about it. (These things happen, even to the most non-technical people, one example being the former free-lance translator and accidental computer user who sixteen years later is writing this article.) They might have noticed:
They might have looked at the table and thought, "Hmmm... lowercase h, uppercase H, `bs'.... What does it say `bs' is? Backspace? Hey! No wonder Ctrl-H is Backspace and ^Ph does an overprint! Hmmm... lowercase i, uppercase I, `ht'.... Lessee, what's `ht'? Horizontal tab? Yeah! Ctrl-I is Tab, and ^Pi inserts a real tab code! Hmmm... j, J, ^J; `nl' is `new line', and Ctrl-Pj starts a new line. Hmmm... l, L, ^L; betcha `np' is `new page'. Yup... and `cr' has to be `carriage return', considering the way Ctrl-M and Ctrl-Pm work. And so it is! Ve-e-e-ery interesting." Then they might have experimented and found that no matter what kind of computer they might be using, Ctrl-H works like Backspace, Ctrl-I works like Tab, and Ctrl-M works like Return/Enter on the operating system's command line, in BBSes' log-on screens, and in direct communication with modems and such, and that no matter what kind of printer they might be using, a ^L code sent straight from the keyboard or a file out through the printer cable makes the printer advance to a new page. Looking at that part of the table — h-H-^H/bs, i-I-^I/ht, j-J-^J/nl, l-L-^L/np, m-M-^M/cr — they might have thought, "It looks like these keystrokes I'm using are pretty basic to the way computers work. Ctrl plus a letter key generates a code with a value from one to twenty-six that is always interpreted as a command, and Ctrl-P followed by the same letter puts that same code into a file for interpretation during printing. Some of these codes do the same thing or pretty much the same thing all the time, and what they do seems to be based on this standard that, from what I've heard, is common to all computers and peripherals except IBM mainframe equipment. Most of the codes do different things in different contexts, though. I wonder why that is...." Looking at the table, they would have seen that ASCII gives most of the other control codes functions unrelated to keyboards and printing. "Ah-ha! 
This was designed not just for keyboards and printers but for all kinds of things," they might have thought. "I wouldn't be surprised if it says right in the standard that programs can use these codes, and the keystrokes that produce them, in any way that makes sense, as long as they don't print." This surmise would have been 100% correct. "Well, the way WordStar uses them sure makes sense to me! And it looks like they exist and can be used in all the same ways on every computer from an Apple II to a Cray! Wow! The keystrokes I'm using are part and parcel of the standard that makes everything work together. So there will always be programs like WordStar — though some people might be unscrupulous enough to write programs that ignore these keystrokes. Wouldn't that be a low-down thing to do?" Wouldn't it, now?
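The relationship between letter keys and control codes described in this article can be modelled in a few lines of Python. This is a hedged sketch added for illustration, not something from Dan's text: `CTRL_MASK`, `KEY`, and `ctrl` are hypothetical names, and real keyboards and terminal drivers do this work in hardware or system software rather than in a script. The model simply zeroes the two group bits of a letter's code and looks the result up in the key quoted earlier.

```python
CTRL_MASK = 0b0001_1111   # keeps the low five bits, zeroing the two group bits

# Abbreviations and names as given in the article's key.
KEY = {
    0x08: ("bs", "Backspace"),
    0x09: ("ht", "Horizontal tab"),
    0x0A: ("nl", "New line"),
    0x0B: ("vt", "Vertical tab"),
    0x0C: ("np", "New page"),
    0x0D: ("cr", "Carriage return"),
}

def ctrl(letter):
    """Return the code a Ctrl-letter combination generates under this model."""
    return ord(letter) & CTRL_MASK

for letter in "hijklm":
    code = ctrl(letter)
    abbrev, name = KEY[code]
    print(f"{letter} {letter.upper()} ^{letter.upper()}  {code:02X}h  {abbrev}  {name}")
```

Under this model the lowercase and uppercase forms of a letter give the same control code, and the letters a through z give the values one through twenty-six, as the article says.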
Hsinchu, Taiwan, R.O.C.

Dan advises that "redistribution [of the article is] encouraged, attribution [is] appreciated."
FAQ: Q2026
See also:
Q2003: WordStar 3 Keyboard Command Set
Q2023: WordStar's Document and Non-Document Modes Demystified
Q2027: What is the difference between ASCII and RTF?
WordStar a Writer's Word Processor - by Robert J Sawyer