Monday, October 14, 2013

Why Microsoft Word Must Die

I hate Microsoft Word. I want Microsoft Word to die. I hate Microsoft Word with a burning, fiery passion. I hate Microsoft Word the way Winston Smith hated Big Brother. Our reasons are, alarmingly, not dissimilar ...

Microsoft Word is a tyrant of the imagination, a petty, unimaginative, inconsistent dictator that is ill-suited to any creative writer's use. Worse: it is a near-monopolist, dominating the word processing field. Its pervasive near-monopoly status has brainwashed software developers to such an extent that few can imagine a word processing tool that exists as anything other than a shallow imitation of the Redmond Behemoth. But what exactly is wrong with it?

I've been using word processors and text editors for nearly 30 years. There was an era before Microsoft Word's dominance when a variety of radically different paradigms for text preparation and formatting competed in an open marketplace of ideas. One early and particularly effective combination was the idea of a text file, containing embedded commands or macros, that could be edited with a programmer's text editor (such as ed or teco or, later, vi or emacs) and subsequently fed to a variety of tools: offline spelling checkers, grammar checkers, and formatters like Scribe, troff, and LaTeX that produced a binary page image that could be downloaded to a printer.

These tools were fast, powerful, elegant, and extremely demanding of the user. As the first 8-bit personal computers appeared (largely consisting of the Apple II and the rival CP/M ecosystem), programmers tried to develop a hybrid tool called a word processor: a screen-oriented editor that hid the complex and hostile printer control commands from the author, replacing them with visible highlight characters on screen and revealing them only when the user told the program to "reveal codes". Programs like WordStar led the way, until WordPerfect took the market in the early 1980s by adding the ability to edit two or more files at the same time in a split screen view.

Then, in the late 1970s and early 1980s, research groups at MIT and Xerox's Palo Alto Research Center began to develop the tools that fleshed out the graphical user interface of workstations like the Xerox Star and, later, the Apple Lisa and Macintosh (and finally the Johnny-come-lately imitator, Microsoft Windows). An ongoing war broke out between two factions. One faction wanted to take the classic embedded-codes model, and update it to a graphical bitmapped display: you would select a section of text and mark it as "italic" or "bold" and the word processor would embed the control codes in the file and, when the time came to print the file, it would change the font glyphs being sent to the printer at that point in the sequence. But another group wanted to use a far more powerful model: hierarchical style sheets. In a style sheet system, units of text -- words, or paragraphs -- are tagged with a style name, which possesses a set of attributes which are applied to the text chunk when it's printed.
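To make the difference between the two models concrete, here is a minimal sketch in Python. It is an illustration only, not how Word or any real word processor stores documents: the names `ITALIC_ON`, `ITALIC_OFF`, `render_codes`, and `Style` are all hypothetical, invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional

# --- Model 1: embedded control codes (hypothetical markers) ---
ITALIC_ON = "\x01"   # assumed marker meaning "switch italic on"
ITALIC_OFF = "\x02"  # assumed marker meaning "switch italic off"


def render_codes(tokens):
    """Walk the token stream in order, toggling state as codes are met.

    Returns a list of (text, italic) pairs. Formatting is a side effect
    of position in the stream; the text itself carries no meaning.
    """
    italic = False
    runs = []
    for token in tokens:
        if token == ITALIC_ON:
            italic = True
        elif token == ITALIC_OFF:
            italic = False
        else:
            runs.append((token, italic))
    return runs


# --- Model 2: hierarchical style sheets ---
@dataclass
class Style:
    name: str
    parent: Optional["Style"] = None
    attrs: dict = field(default_factory=dict)

    def resolve(self):
        """Merge attributes from the root style down to this one."""
        inherited = self.parent.resolve() if self.parent else {}
        return {**inherited, **self.attrs}


# Example data: a stream with inline codes, and a two-level style tree.
stream = ["He said ", ITALIC_ON, "never", ITALIC_OFF, " again."]
body = Style("Body", attrs={"font": "Times", "size": 12, "italic": False})
quote = Style("Quote", parent=body, attrs={"italic": True, "indent": 2})
tagged_paragraph = ("Call me Ishmael.", quote)  # text tagged with a style name
```

In the first model, changing how emphasis looks means finding and editing every code in the stream. In the second, changing an attribute on `body` changes every paragraph tagged with `body` or any style descended from it, because attributes are looked up through the hierarchy at print time.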

Microsoft was a personal computer software company in the early 1980s, mostly notable for their BASIC interpreter and MS-DOS operating system. Steve Jobs approached Bill Gates to write applications for the new Macintosh system in 1984, and Bill agreed. One of his first jobs was to organize the first true WYSIWYG word processor for a personal computer -- Microsoft Word for Macintosh. Arguments raged internally: should it use control codes, or hierarchical style sheets? In the end, the decree went out: Word should implement both formatting paradigms. Even though they're fundamentally incompatible and you can get into a horrible mess by applying simple character formatting to a style-driven document, or vice versa. Word was in fact broken by design, from the outset -- and it only got worse from there.

Over the late 1980s and early 1990s Microsoft grew into a behemoth with a near-monopoly position in the world of software. One of its tactics became known (and feared) throughout the industry: embrace and extend. If confronted with a successful new type of software, Microsoft would purchase one of the leading companies in the sector and then throw resources at integrating their product into Microsoft's own ecosystem, if necessary dumping it at below cost in order to drive rivals out of business. Microsoft Word grew by acquiring new subsystems: mail merge, spelling checkers, grammar checkers, outline processing. All of these were once successful cottage industries with a thriving community of rival product vendors striving to produce better products that would capture each other's market share. But one by one, Microsoft moved into each sector and built one of the competitors into Word, thereby killing the competition and stifling innovation. Microsoft killed the outline processor on Windows, stalled development of the grammar checking tool, and stifled spelling checkers. There is an entire graveyard of once-hopeful new software ecosystems, and its name is Microsoft Word.

As the product grew, Microsoft deployed their embrace-and-extend tactic to force users to upgrade, locking them into Word, by changing the file format the program used on a regular basis. Early versions of Word interoperated well with rivals such as WordPerfect, importing and exporting other programs' file formats. But as Word's domination became established, Microsoft changed the file format repeatedly -- with Word 95, Word 97, in 2000, and again in 2003 and more recently. Each new version of Word defaulted to writing a new format of file which could not be parsed by older copies of the program. If you had to exchange documents with anyone else, you could try to get them to send and receive RTF -- but for the most part casual business users never really got the hang of different file formats in the "Save As ..." dialog, and so if you needed to work with others you had to pay the Microsoft Danegeld on a regular basis, even if none of the new features were any use to you. The .doc file format was also obfuscated, deliberately or intentionally: rather than a parseable document containing formatting and macro metadata, it was effectively a dump of the in-memory data structures used by Word, with pointers to the subroutines that provided formatting or macro support. And "fast save" made the picture worse, by appending a journal of changes to the application's in-memory state. To parse a .doc file you virtually have to write a mini-implementation of Microsoft Word. This isn't a data file format: it's a nightmare! In the 21st century they tried to improve the picture by replacing it with an XML schema ... but somehow managed to make things worse, by using XML tags that referred to callbacks in the Word codebase, rather than representing actual document semantics. It's hard to imagine a corporation as large and [usually] competently-managed as Microsoft making such a mistake by accident ...

This planned obsolescence is of no significance to most businesses, for the average life of a business document is less than 6 months. But some fields demand document retention. Law, medicine, and literature are all areas where the life expectancy of a file may be measured in decades, if not centuries. Microsoft's business practices are inimical to the interests of these users.

Nor is Microsoft Word easy to use. Its interface is convoluted and baroque, making the easy difficult and the difficult nearly impossible to achieve. It guarantees job security for the guru, not transparency for the zen adept who wishes to focus on the task in hand, not the tool with which the task is to be accomplished. It imposes its own concept of how a document should be structured upon the writer, a structure best suited to business letters and reports (the tasks for which it is used by the majority of its users). Its proofing tools and change tracking mechanisms are baroque, buggy, and inadequate for true collaborative document preparation; its outlining and tagging facilities are piteously primitive compared to those required by a novelist or thesis author; and the procrustean dictates of its grammar checker would merely be funny if the ploddingly sophomoric business writing style it mandates were not so widespread.

But this isn't why I want Microsoft Word to die.

The reason I want Word to die is that until it does, it is unavoidable. I do not write novels using Microsoft Word. I use a variety of other tools, from Scrivener (a program designed for managing the structure and editing of large compound documents, which works in a manner analogous to a programmer's integrated development environment if Word were a basic text editor) to classic text editors such as Vim. But somehow, the major publishers have been browbeaten into believing that Word is the sine qua non of document production systems. They have warped and corrupted their production workflow into using Microsoft Word .doc files as their raw substrate, even though this is a file format ill-suited for editorial or typesetting chores. And they expect me to integrate myself into a Word-centric workflow, even though it's an inappropriate, damaging, and laborious tool for the job. It is, quite simply, unavoidable. And worse, by its very prominence, we become blind to the possibility that our tools for document creation could be improved. It has held us back for nearly 25 years already; I hope we will find something better to take its place soon.
