John Kaster

Behind the Screen

Tidying up HTML exported from MS Word


One of my pet peeves is the extremely verbose, over-enthusiastic markup MS Word generates when you export a document, even with the plainest HTML Word supports. Fortunately, both Delphi for .NET and C#Builder include HTML Tidy as one of the HTML reformatting options in the WYSIWYG HTML and ASP.NET designer. HTML Tidy has a great option that removes just about all of those annoying extra tags.

So I turn this feature on by selecting Tools | Options | HTML Tidy Options and setting “Source document is from Word 2000” to “Yes”. I then load the exported HTML document in question and select Edit | HTML Tidy | Format Document while editing the HTML source. (You must be editing the source for the HTML Tidy menu to be enabled.)
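
If you don't have the IDE handy, roughly the same cleanup can be scripted against the stand-alone command-line tidy. This is only a sketch: it assumes a `tidy` executable is on your PATH, and the helper names `build_tidy_command` and `tidy_word_export` are made up for this example. The `--word-2000 yes` setting is the command-line counterpart of the “Source document is from Word 2000” option; I have not checked that the IDE passes exactly these settings.

```python
import shutil
import subprocess


def build_tidy_command(source, target):
    """Build the argument list for a command-line tidy run.

    Hypothetical helper for this sketch, not part of HTML Tidy itself.
    """
    return [
        "tidy",
        "-quiet",               # suppress the banner and summary text
        "--word-2000", "yes",   # strip the surplus markup Word adds
        "-output", target,      # write the cleaned HTML to this file
        source,
    ]


def tidy_word_export(source, target):
    """Run tidy on a Word-exported HTML file and return its exit code."""
    # Assumption: a stand-alone tidy binary is installed and on PATH.
    if shutil.which("tidy") is None:
        raise FileNotFoundError("tidy executable not found on PATH")
    result = subprocess.run(build_tidy_command(source, target))
    # tidy exits with 0 (no problems), 1 (warnings) or 2 (errors)
    return result.returncode
```

Calling `tidy_word_export("report.html", "report-clean.html")` (both file names are placeholders) would leave the original export untouched and write the tidied copy next to it.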

I use this technique for almost every Word document I receive that needs to be converted into HTML, and I hope this tip saves you some tedium as well!


Written by John Kaster

June 4, 2004 at 10:15 pm

2 Responses


  1. Is there more information available for HTML Tidy in Delphi?
    I have recently written a Delphi binding for the current CVS version of HTML Tidy.

    Robert Marquardt

    January 15, 2007 at 5:26 am

  2. Hi Robert,
    We used TidyPas to integrate HTML Tidy into the IDE; it can be found here:

    We’ve been moving away from HTML Tidy because of its lack of support for ASP.NET.


    Steve Trefethen

    January 15, 2007 at 11:09 am
