O'Reilly Book Excerpts: Linux Desktop Hacks
Hacking the Linux Desktop, Part 2
Editor's note: If you didn't get enough Linux tweaks last week from O'Reilly's Linux Desktop Hacks, here are two more hacks from the book to satiate your hacking needs.
View Microsoft Word Documents in a Terminal
Avoid the load time of OpenOffice.org and view Microsoft Word documents in a terminal.
The simplest way to view a Microsoft Word document in a terminal is to use the catdoc program. catdoc converts a Word document to plain text, which does little or nothing to preserve the formatting of the original. Obviously, it's nearly impossible to view a Word document in a terminal exactly the way it would look in Word. Heck, competing word processors have trouble importing Word documents without upsetting the formatting, and they have the advantage of being graphical desktop applications. But this hack is still a vast improvement over the popular catdoc program, because it preserves at least some of the formatting of the original document by converting the Word document to HTML.
You'll need both the wvWare set of file conversion utilities and the hybrid web browser/pager w3m, along with a little scripting magic to view Word documents in a terminal or console while retaining at least some of the original formatting.
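Before going further, it can help to check whether both tools are already installed. The following is a minimal sketch; the function name missing_tools is made up for this illustration, and it assumes the executables are named wvWare and w3m, as in the commands used throughout this hack.

```shell
#!/bin/bash
# Hypothetical helper: print the name of each requested tool that is
# NOT found on the current PATH. Prints nothing if all are present.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check for the two programs this hack relies on.
missing_tools wvWare w3m
```

If the last command prints nothing, both programs are available; otherwise it lists whichever one still needs to be installed.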
wv, the All-Purpose Word Converter
There is a way to retain at least some of the original formatting while printing the document to the screen. For this, you need a set of utilities under the name of wvWare. You can find the home page for wvWare at http://wvware.sourceforge.net. Packages of wvWare are readily available for almost all Linux distributions, although the package name is usually just wv. For example, if you don't already have it installed on your system, you can install wv in Debian Linux with this command:
# apt-get install wv
Users of the yum package manager can install the RPM version of wv with this command:
# yum install wv
w3m, the All-Purpose Web Browser/Pager
That's not all you need for this hack. You also need a popular pager/browser called w3m. Packages of w3m should be available for most Linux distributions, and the package name is usually w3m. For example, you can install w3m in Debian Linux with this command:
# apt-get install w3m
Users of the yum package manager can install the RPM version of w3m with this command:
# yum install w3m
The w3m program is unusual in that it is a web browser that works like a pager--that is, you can pipe text into w3m and use w3m to simply page back and forth through the text. Some versions of w3m even render graphics in a frame-buffer console without an X Window System desktop running.
You can combine the two utilities to get the desired result of viewing a Word document in a terminal. Use wvWare to convert a Microsoft Word document to HTML format, and then pipe the output into the w3m pager to view it. Here's the full command you need to make it work (this command assumes wvHtml.xml is stored in the /usr/lib/wv directory, which might not be the case on your Linux system):
$ wvWare -x /usr/lib/wv/wvHtml.xml document.doc | w3m -T text/html
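Because the location of wvHtml.xml varies between systems, a small search can save some guessing. This is only a sketch: the function name find_wv_config is invented here, and every directory other than /usr/lib/wv in the example call is an assumption about where a distribution might put the file, not a documented location.

```shell
#!/bin/bash
# Hypothetical helper: print the path of the first wvHtml.xml found in
# the directories given as arguments; return non-zero if none has it.
find_wv_config() {
    local dir
    for dir in "$@"; do
        if [ -f "$dir/wvHtml.xml" ]; then
            printf '%s\n' "$dir/wvHtml.xml"
            return 0
        fi
    done
    return 1
}

# Example call. Only /usr/lib/wv comes from the text above; the other
# directories are guesses and may not exist on your system.
find_wv_config /usr/lib/wv /usr/share/wv /usr/local/lib/wv /usr/local/share/wv \
    || echo "wvHtml.xml not found in the directories checked" >&2
```

Whatever path this prints is the one to pass to the -x option of wvWare.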
That's a lot of typing every time you want to view a Word document, so turn it into a script called viewdoc to make it easier to use in the future. Log in as root and use your favorite editor to create the following script:
#!/bin/bash
wvWare -x /usr/lib/wv/wvHtml.xml "$1" 2>/dev/null | w3m -T text/html
Note the one subtle addition, 2>/dev/null. This simply redirects any error messages to the twilight zone so that they do not interfere with the presentation of the Word document. Store it as /usr/local/bin/viewdoc and make the script executable with this command:
# chmod +x /usr/local/bin/viewdoc
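If you would rather not install a script as root, the same pipeline can be wrapped in a shell function, for instance in your ~/.bashrc. The version below is a sketch rather than the book's script: it adds two sanity checks, the WV_CONFIG variable is a hypothetical override for the wvHtml.xml location, and the return codes are arbitrary choices.

```shell
#!/bin/bash
# Location of the wvWare HTML configuration file. WV_CONFIG is a
# hypothetical override; the default is the path assumed in this hack.
WV_CONFIG="${WV_CONFIG:-/usr/lib/wv/wvHtml.xml}"

# viewdoc as a shell function: check the argument, then convert the
# Word document to HTML and page through it with w3m.
viewdoc() {
    if [ "$#" -ne 1 ]; then
        echo "usage: viewdoc file.doc" >&2
        return 2
    fi
    if [ ! -r "$1" ]; then
        echo "viewdoc: cannot read $1" >&2
        return 1
    fi
    # Quoting "$1" keeps filenames with spaces intact.
    wvWare -x "$WV_CONFIG" "$1" 2>/dev/null | w3m -T text/html
}
```

Quoting the filename matters here because Word documents often have spaces in their names, and the early checks report a usage or file error instead of opening an empty w3m session.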
Now all you have to do to view a Word document in a text console or terminal is issue this command:
$ viewdoc document.doc
Not only does this technique preserve at least some of the formatting of a Word document, but also, hyperlinks are live and you can activate them to visit the URL from within the w3m viewer you're using to view the document. Figure 7-3 shows an example of a Word document viewed with w3m. Note both the bold headings and the live link to http://www.bootsplash.de/files.
Figure 7-3. A Word document viewed in HTML text format