WordStar for DOS (All Versions)

What is the difference between ASCII and RTF?

ASCII stands for American Standard Code for Information Interchange. This is a standard developed collaboratively by major vendors and purchasers of digital products in a process coordinated by the American National Standards Institute (ANSI), the European Computer Manufacturers' Association (ECMA), the International Organization for Standardization (ISO), and other standards bodies. Since its first official adoption in a form close to that we know today by ECMA in 1965, ASCII has been reviewed, refined, and reapproved by these bodies' memberships at regular intervals. It also serves as the basis for, and is included lock, stock, and barrel in, related standards such as ISO 8859-1 (also known as Latin 1, though that term is also applied to a non-ISO scheme) and Unicode.

Computers handle everything in what can be described as numeric form. ASCII assigns numeric values to the indispensable elements of text common to all writing systems based on the Latin script: the space character, common punctuation marks, symbols common in commerce and programming, the ten Arabic numerals, and the 26 letters of the alphabet in lowercase and uppercase. These elements are assigned to the values 32 through 126. In addition, ASCII defines the values 0 through 31 and 127 as control codes with no assigned graphic characters; this is because such codes are necessary for sending, displaying, editing, and storing text in usable form. Among the control functions assigned to values in this range are Carriage Return, Backspace, Tab, and Delete.

Why only 0 through 127? Well, computers don't really handle numbers at all. They operate on groups of electrical charges, where the charges can have only two voltage levels and each kind of group (the kind depending on the group's purpose) has a fixed size.
We interpret the two possible voltage levels as the digits 1 and 0 and refer to them as binary digits, or bits; we decide on a size for the groups that will hold the values representing elements of text, and we refer to these as bytes.

When ASCII was created, not all computers had bytes. Some used five bits for each text element (that is, they used five-bit bytes), some six, and some eight, and some handled text only by workarounds, if at all. However, the creators of the standard judged - correctly - that before long the eight-bit byte would become ubiquitous.

Now, with one bit you can represent two values, 0 and 1; with two bits you can represent four values; with three bits you can represent eight values, and so on; the total doubles with each additional bit. With seven bits you can represent 128 values (0 through 127, the way it's usually done), and with eight bits you can represent 256.

The people creating ASCII were defining their own future, and nobody wanted to use up all eight bits right at the outset. For one thing, some equipment needed to use one bit for error checking; for another, 128 values could cover all the characters and functions everyone considered absolutely indispensable; and finally, no model for using all 256 values had yet appeared.

Formal, non-proprietary standards are nothing if not flexible, because they must be approved by competing vendors and extremely demanding, large-volume customers. The creators of ASCII used seven bits, specifying that vendors could use the eighth bit any way they wanted, in the full expectation that some vendor's scheme would become widespread enough to be adopted as the next standard. (They also specified that the control codes could be used in any way appropriate to the operational context and the communicating devices - and that, my friends, is what made possible the WordStar keyboard command set.)
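The bit arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and even parity is assumed as one possible way equipment might have used the eighth bit for error checking; the article does not say which scheme any particular device used.

```python
def values_for_bits(n_bits: int) -> int:
    """Number of distinct values that n_bits bits can represent."""
    return 2 ** n_bits


def with_even_parity(code: int) -> int:
    """Pack a seven-bit code into an eight-bit byte.

    The eighth (high) bit is set only when needed to make the total
    number of 1 bits even. Even parity is an assumption chosen for
    illustration, not something ASCII itself requires.
    """
    if not 0 <= code <= 127:
        raise ValueError("not a seven-bit value")
    ones = bin(code).count("1")
    return code | ((ones % 2) << 7)


# The total doubles with each additional bit.
for n in range(1, 9):
    print(n, "bits:", values_for_bits(n), "values")

# 'A' is 65, which fits in seven bits; the eighth bit is free for parity.
print(format(ord("A"), "07b"))                    # 1000001
print(format(with_even_parity(ord("A")), "08b"))  # 01000001
print(format(with_even_parity(ord("C")), "08b"))  # 11000011
```

Seven bits give 128 values and eight give 256, so every ASCII value leaves the high bit of an eight-bit byte unused, which is exactly the room a parity bit (or, later, an eight-bit character set) can occupy.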
Eventually DEC's Multinational Character Set became the preferred eight-bit encoding method in Europe, and in 1987 it was adopted almost unchanged as ISO 8859-1. It has all the ASCII assignments in its lower half - the values 0 through 127 - and the characters that the European market wanted in its upper half - the values 128 through 255 - the half purposely left unassigned by leaving the eighth bit free in ASCII.

In short, ASCII is an encoding method that uses seven bits of an eight-bit byte to encode 95 printing characters and 33 control functions.

So what's RTF? Well, it's an ASCII-based markup language. Raw RTF code contains only ASCII values (0 through 127), and therefore only the characters specifically given values in ASCII. RTF is similar in some ways to HTML. In HTML, we put special markers between left and right angle brackets, and HTML-speaking programs interpret these as commands to format the affected text as a title, a heading, a quote, a list item, or whatever. RTF's markers are control words introduced by a backslash and grouped within left and right curly brackets (also known as braces), and RTF is much, much more specific regarding what the interpreting program is to do.

We use the term ASCII in a strict sense and a loose sense. Strictly speaking, an RTF file is an ASCII file, because it contains only bytes with the values 0 through 127 (and probably not all of those). Usually, however, when we say "ASCII file," we mean it in the loose sense of a file containing only ASCII values and meant to be treated entirely as directly human-readable text - a file that can be dumped directly to a terminal or printer with results that any literate person can read and find nothing strange about.
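As a rough illustration of the ranges and the "strict sense" described above, here is a small Python sketch. The helper names and both sample byte strings are hypothetical, written for this example only; the RTF fragment is a tiny hand-made one, not taken from any real document.

```python
def is_printing(code: int) -> bool:
    """Values 32 through 126: space, punctuation, digits, letters."""
    return 32 <= code <= 126


def is_control(code: int) -> bool:
    """Values 0 through 31, plus 127: control functions."""
    return 0 <= code <= 31 or code == 127


def is_strict_ascii(data: bytes) -> bool:
    """True if every byte has a value from 0 through 127."""
    return all(byte <= 127 for byte in data)


printing = [c for c in range(128) if is_printing(c)]
control = [c for c in range(128) if is_control(c)]
print(len(printing), len(control))  # 95 33

# Hypothetical samples: a minimal RTF fragment and a Latin-1 string
# containing an accented letter (byte value 0xE9, above 127).
RTF_SAMPLE = rb"{\rtf1\ansi Hello, {\b world}!}"
LATIN1_SAMPLE = b"caf\xe9"

print(is_strict_ascii(RTF_SAMPLE))     # True
print(is_strict_ascii(LATIN1_SAMPLE))  # False
```

Note that a check like this can only test the strict sense. Whether a file is an "ASCII file" in the loose sense depends on how it is meant to be read, which no byte-level test can decide: the RTF sample passes the strict test, yet dumped to a terminal it would show its markup rather than clean text.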
Dan advises that "redistribution [of the article is] encouraged, attribution [is] appreciated."

See also: The Why and Wherefore (of ASCII) - a more detailed but very readable description of ASCII and the command set it includes.
FAQ: Q2027
See also: Q2026: The Why & Wherefore (of ASCII) - an explanation of ASCII and how it relates to the WordStar (and other) command sets