Convert Word 2010 To HTML

If you run your own website, you may be interested in converting Word documents to HTML. HTML (Hypertext Markup Language) is the language that web pages are written in. This isn’t an HTML tutorial, so we won’t look at how to code in HTML, only at how to convert Word to HTML.

Some people use Microsoft Word to write content for their website because it’s such a good word processor, and then convert it to a web page. Others simply want to convert existing documents to HTML, for publication on their website. Whatever the reason for wanting to convert Word documents to web pages, the methods to do so are the same.

Save As HTML

The first method is to save your document as HTML: click the File tab > Save As, and then change the Save as type to Web Page (*.htm; *.html).


While you are still in the Save As window, change the title of the web page by clicking the Change Title button and typing in something meaningful.

The web page title appears in your web browser’s title bar and helps users tell browser sessions apart, which is especially useful when many sessions are open.
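The title you type ends up in the saved file's <title> element. If you want to check what title a saved page carries without opening a browser, a small script can read it. The following is a minimal sketch using only Python's standard library; the names TitleParser and page_title and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from HTMLParser.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        return "".join(self._parts).strip()


def page_title(html_text):
    """Return the page title, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title


# Hypothetical page content, standing in for a file saved from Word.
sample = (
    "<html><head><title>Quarterly Sales Report</title></head>"
    "<body><p>Hello</p></body></html>"
)
print(page_title(sample))
```

To use it on a real file, read the file's text first and pass that text to page_title.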

When you click Save in the Save As window, a file with the .htm extension is saved. When you open this file in a web browser (like Firefox or Internet Explorer), you will see a web page that represents the Word document you saved.

In Windows Explorer, a Word-generated web page looks slightly different from a hand-coded one. The file generated by Word displays the Word icon, whereas a page coded by hand in an HTML editor displays the icon of the default web browser.

Although this method works, the resulting file is invariably very large. For example, I saved a very simple Word document as HTML and Word created a .htm file that was 26 KB. When I coded the same page by hand, the equivalent file was 1 KB! That’s a big difference.
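To give a feel for why a hand-coded page can be so small, here is a rough sketch that builds a minimal page and measures it. The page content and the helper name size_in_kb are assumptions for illustration, not the actual document used in the comparison above.

```python
# A hypothetical hand-coded page for a very simple document.
minimal_page = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>My Document</title>
</head>
<body>
<h1>My Document</h1>
<p>This is a very simple document.</p>
</body>
</html>
"""


def size_in_kb(text, encoding="utf-8"):
    """Return the size of the text in kilobytes once encoded to bytes."""
    return len(text.encode(encoding)) / 1024


print(f"{size_in_kb(minimal_page):.2f} KB")
```

The whole page comes to well under 1 KB, because it contains nothing except the structure and the text itself.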

Word To HTML Converters

Some Word to HTML converters will convert your Word document to HTML without all the unnecessary garbage that Word loads into your .htm file. One such converter is Word 2 Clean HTML. Copy the contents of your Word document (place the cursor anywhere in the document, press Ctrl+A, then press Ctrl+C), paste them into the Word 2 Clean HTML box, and then click the convert to clean html button.

The result box contains the converted HTML, with no unnecessary garbage. Copy it into the source of your web page. The only problem with this method, and it is only a small one, is that the end result isn’t a complete web page. You still need to create your own web page and paste the results into its source.
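If you convert documents often, the paste-into-a-page step can be scripted. The sketch below wraps a converted fragment in a bare-bones page; the template, the function name wrap_fragment, and the sample fragment are all assumptions, so adjust them to match your own site's layout.

```python
from html import escape

# A deliberately minimal page skeleton; a real site would add its own
# stylesheet links, navigation, and so on.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>{title}</title>
</head>
<body>
{body}
</body>
</html>
"""


def wrap_fragment(fragment, title):
    """Place an HTML fragment inside a complete page with the given title."""
    # The title is plain text, so escape characters such as & and <.
    # The fragment is already HTML and is inserted unchanged.
    return PAGE_TEMPLATE.format(title=escape(title), body=fragment.strip())


# Hypothetical output copied from a converter's result box.
fragment = "<h1>Meeting Notes</h1>\n<p>Budget & planning.</p>"
print(wrap_fragment(fragment, "Meeting Notes"))
```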

Other tools, known as “Word HTML Cleaners”, take the badly coded HTML file that Word produces and clean it up for you. Word HTML Cleaner, for example, takes a web page and strips out all the Word nastiness. The difference between this solution and the previous one is that you have to get Word to produce the web page first and then get the tool to clean it up.
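To illustrate the kind of clutter such a cleaner removes, here is a crude sketch based on regular expressions. It is not how any particular tool works; the function name clean_word_html and the sample markup are assumptions, and a real cleaner (or a proper HTML parser) would handle many more cases.

```python
import re


def clean_word_html(html_text):
    """Strip some common Word-specific markup from an HTML string."""
    # Remove comments, including conditional ones such as
    # <!--[if gte mso 9]> ... <![endif]-->.
    text = re.sub(r"<!--.*?-->", "", html_text, flags=re.DOTALL)
    # Remove leftover conditional markers like <![if !supportLists]>
    # and <![endif]>, keeping whatever sits between them.
    text = re.sub(r"<!\[(?:if|endif)[^>]*>", "", text)
    # Remove namespaced Office tags such as <o:p> and </o:p>.
    text = re.sub(r"</?[A-Za-z][\w-]*:[^>]*>", "", text)
    # Remove class attributes whose value starts with "Mso".
    text = re.sub(r"""\s+class=(["']?)Mso\w*\1""", "", text)
    # Remove quoted inline style attributes. This drops ALL inline
    # styles, not only the mso- ones, which is a simplification.
    text = re.sub(r"""\s+style=(["']).*?\1""", "", text, flags=re.DOTALL)
    return text


# Hypothetical snippet in the style of Word's "Save as Web Page" output.
sample = (
    "<p class=MsoNormal style='mso-bidi-font-weight:normal'>"
    "Hello <b>world</b><o:p></o:p></p>"
    "<!--[if gte mso 9]><xml><w:WordDocument></w:WordDocument></xml><![endif]-->"
)
print(clean_word_html(sample))
```

Regular expressions are fragile on real-world HTML, so treat this as a demonstration of what gets stripped rather than a replacement for a dedicated cleaner.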