Before Starting
Before you begin, it is important to know a few things:
Be Careful
This is just a warning: instantiating Word on your server consumes valuable memory and resources. However, if you restrict usage of this technique, build it into a component, or pass the work off to another server, the following code will help a lot.
The Problem
A friend of mine had a few hundred Word docs to convert to HTML for web use. I helped by writing a simple Visual Basic app: he pointed it at the directory where the Word docs were located, and it iterated through each one, creating the HTML doc. This worked great, but then his users started asking him to convert documents as they were created, and he kept complaining about having to break up his day converting Word docs. I basically created this function to shut him up. Given how the documents were created, the workflow looked like this:
1 - User Created Document
2 - User Uploaded Document to the Server via Intranet Screen
3 - User Clicked on Convert to HTML Next to the Document Icon
4 - Server Converted Word to HTML
5 - Server Saved HTML to Database
At this time, the newly created content was available to all web site users (in this case intranet and extranet users) via the normal web interface.
The Solution
The following code covers the fourth step...
The work is done by a function call made after the Word document has been uploaded to the server. All you need is the path to the Word document and the path where you want the HTML saved.
Function WordToHTML(strWordDoc, strHTMLDoc)
    On Error Resume Next
    '*******************************************************
    'PURPOSE: takes Word docs and converts them to HTML
    'INPUT:   strWordDoc 'full path, name and extension of Word doc
    '         strHTMLDoc 'path, name and ext. for HTML doc
    'RETURNS: "Success" or the description of the error
    'FUTURE:
    '   create function to pull out just the body contents
    '   of the doc and save to database for content
    '   abstraction
    'AUTHOR:  Scott Hand
    'HISTORY: 3/1/99 created, tested and implemented
    '         9/24/99 modified for new version of Word
    '*******************************************************
    Dim objWord
    Set objWord = CreateObject("Word.Application")
    objWord.Visible = False
    objWord.Documents.Open strWordDoc 'path, filename and extension
    'check errors (this should be another routine)
    If Err.Number <> 0 Then
        WordToHTML = Err.Description
        'shut Word down so a hidden instance is not left running
        objWord.Quit
        Set objWord = Nothing
    Else
        Dim FileFormat
        Dim LockComments
        Dim Password
        Dim AddToRecentFiles
        Dim WritePassword
        Dim ReadOnlyRecommended
        Dim EmbedTrueTypeFonts
        Dim SaveNativePictureFormat
        Dim SaveFormsData
        Dim SaveAsAOCELetter
        'was 104; converter numbers vary by install
        '(the built-in wdFormatHTML is 8 in Word 2000 and later)
        FileFormat = 106
        LockComments = True
        Password = ""
        AddToRecentFiles = False
        WritePassword = ""
        ReadOnlyRecommended = False
        EmbedTrueTypeFonts = False
        SaveNativePictureFormat = True
        SaveFormsData = False
        SaveAsAOCELetter = False
        objWord.ActiveDocument.SaveAs strHTMLDoc, FileFormat, LockComments, _
            Password, AddToRecentFiles, WritePassword, ReadOnlyRecommended, _
            EmbedTrueTypeFonts, SaveNativePictureFormat, SaveFormsData, _
            SaveAsAOCELetter
        'check errors (this should be another routine)
        If Err.Number <> 0 Then
            Response.Write Err.Description & "<br>"
            WordToHTML = Err.Description
            'close without saving (0 = wdDoNotSaveChanges) and shut Word down
            objWord.ActiveDocument.Close 0
            objWord.Quit
            Set objWord = Nothing
        Else
            'Close/Kill
            objWord.ActiveDocument.Close
            objWord.Quit
            Set objWord = Nothing
            'check errors (this should be another routine)
            If Err.Number <> 0 Then
                WordToHTML = Err.Description
            Else
                WordToHTML = "Success"
            End If 'closing/killing Word Object Error Handling
        End If 'creating active doc and converting Error Handling
    End If 'opening Word Object Error Handling
End Function
Real quick: if you think that ANY code you write might be reused for any reason, it is best to create a global include file and place the code in a function or subroutine. I call mine Suitcase.asp, put it in a Suitcase subdirectory off the root, and refer to it by including it.
In this case, the function does the conversion from Word to HTML and then returns a value back noting the success or failure.
You call the function with the following code:
Dim conversionProcess
conversionProcess = WordToHTML("c:\temp\myWord.doc", "c:\temp\myHTML.htm")
Response.Write "<br><br>" & conversionProcess & "<br><br>"
Instead of the Response.Write, you would want to check the value of conversionProcess and proceed... It will contain either "Success" or the description of an error that has occurred.
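A minimal sketch of that check, assuming the same example paths as the call shown earlier. The two branches are placeholders for whatever your workflow does next (saving the HTML to the database, for example).

```vbscript
Dim conversionProcess
conversionProcess = WordToHTML("c:\temp\myWord.doc", "c:\temp\myHTML.htm")

If conversionProcess = "Success" Then
    'placeholder: carry on with the next step, e.g. save the HTML to the database
    Response.Write "The document was converted.<br>"
Else
    'conversionProcess holds the error description returned by WordToHTML
    Response.Write "Conversion failed: " & Server.HTMLEncode(conversionProcess) & "<br>"
End If
```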
That's it... You can change the SaveAs parameters (FileFormat, LockComments, and so on) to fit your particular needs. They are pretty easy to figure out.
Error Handling
I placed all the error handling code in the function; it basically checks for an error and returns the error description to the caller. You REALLY want to create a routine to handle all errors. I have a checkErrors routine that checks the error object and performs a few actions if there is an error (write to database, inform user (pop-up window), e-mail support staff, etc.).
Remember
Don't open this up to all your users unless you want to possibly bring down your server. Make sure you analyze your workflow and decide who needs to actually perform this type of work and restrict access to them.
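Going back to error handling for a moment: a routine like the checkErrors one described above might look roughly like this. It is only a sketch, not the actual routine. The strWhere parameter and the choice to simply write the message to the page are assumptions, and a real version would also log to the database or e-mail support. It relies on the caller running under On Error Resume Next, so that Err still holds the pending error when the routine is called.

```vbscript
'Hypothetical helper, not the author's actual routine.
'Returns True if an error was pending, False otherwise.
Function checkErrors(strWhere)
    checkErrors = False
    If Err.Number <> 0 Then
        checkErrors = True
        'assumption: reporting to the page is enough for this sketch
        Response.Write "Error " & Err.Number & " in " & strWhere & ": " & _
            Server.HTMLEncode(Err.Description) & "<br>"
        Err.Clear 'reset so the next check starts clean
    End If
End Function

'Usage inside a routine that runs under On Error Resume Next:
'    objWord.Documents.Open strWordDoc
'    If checkErrors("opening the Word document") Then ...
```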
Something else you can do is change the FileFormat value and see what else you can create. You can convert your Word doc programmatically to any of the formats listed in the Save as type: drop-down box of Word's Save As dialog.
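As a rough guide, here are a few of Word's built-in save format values. VBScript does not know the Word constant names, so this sketch declares them by hand. Formats supplied by installed converters get their own numbers, which can differ from machine to machine, so treat anything outside this list as something to check on your own server.

```vbscript
'Built-in Word save formats (WdSaveFormat), declared by hand
'because VBScript cannot see Word's own constants.
Const wdFormatDocument = 0   'Word document
Const wdFormatText = 2       'plain text
Const wdFormatRTF = 6        'Rich Text Format
Const wdFormatHTML = 8       'HTML (Word 2000 and later)

'Inside WordToHTML, swap the hard-coded number for one of these, e.g.:
Dim FileFormat
FileFormat = wdFormatRTF
```

Remember to change the extension of the output path to match the format you pick.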
What Else
You can also do this with other applications, such as Excel and PowerPoint…