Word 2003 XML Viewer Control v1.0 released


I was cleaning up my backyard and found this control I never finished. So I did. Here is Word 2003 XML Viewer Control v1.0, just in case somebody needs it. It's an ASP.NET 2.0 Web server control that displays arbitrary Microsoft Word 2003 XML documents (aka WordML aka WordprocessingML) on the Web, so people who don't have Microsoft Office 2003 installed can browse documents using only a browser.

The control renders Word 2003 XML documents by transforming their content to HTML, preserving styling and extracting images. Both Internet Explorer and Firefox are supported.

Word 2003 XML Viewer Control is a Web version of the Microsoft Word 2003 XML Viewer tool and uses the same WordML to HTML transformation stylesheet, thus providing the same rendering quality.

The control is free and open source; download it here, and find the documentation here.

I'm doing an interesting trick with images in this control. The problem is that in WordML images are embedded in the document, so they need to be extracted when transforming to HTML, and I wanted to avoid writing images to the file system. So the trick is to extract each image while generating the HTML (via XSLT), assign it a GUID, put it into the session, and generate an <img> src attribute that requests the image by that GUID. Then, when the browser renders the HTML, it requests the images by GUID and a custom HTTP handler gets them from the session.
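The flow above can be sketched in a few lines. This is only an illustration in Python, not the control's actual code (which is ASP.NET plus XSLT): the function names, the plain dict standing in for the session, and the `id` query parameter are all assumptions made for the example. The only things taken from the post are the `image.ashx` handler path and the idea of keying extracted images by GUID.

```python
import base64
import uuid
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# WordML 2003 namespace; embedded images live in w:binData as base64 text.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def extract_images(wordml: str, session: Dict[str, bytes],
                   handler_path: str = "image.ashx") -> List[str]:
    """Store each embedded image in `session` under a fresh GUID.

    Returns the <img> tags that point back at the handler.
    The "?id=" parameter name is hypothetical.
    """
    tags = []
    root = ET.fromstring(wordml)
    for bin_data in root.iter("{%s}binData" % W_NS):
        image_bytes = base64.b64decode(bin_data.text or "")
        image_id = str(uuid.uuid4())
        session[image_id] = image_bytes  # nothing is written to disk
        tags.append('<img src="%s?id=%s" />' % (handler_path, image_id))
    return tags


def handle_image_request(image_id: str,
                         session: Dict[str, bytes]) -> Optional[bytes]:
    """What the HTTP handler does: look the image up by GUID in the session."""
    return session.get(image_id)
```

The trade-off is that images live only as long as the session does, which is fine for viewing a document but means the generated HTML cannot be saved and reopened later.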

Having an HTTP handler in an ASP.NET control posed another problem - how do you register the HTTP handler in Web.config automatically? AFAIK there is no out-of-the-box solution for this, but happily I found one that covers the major use case. Here is the relevant piece of documentation:

When you add the first Word 2003 XML Viewer Control to your Web project, you should see the following confirmation dialog: "Do you want to automatically register the HttpHandler needed by this control in the web.config?". You must answer Yes to allow the control to register its image handler in the Web.config. If you don't answer Yes, or if you add the control outside Design mode, you have to add the following definition to the Web.config in the <system.web> section:
<httpHandlers>
   <add path="image.ashx" verb="*" type="XMLLab.WordXMLViewer.ImageHandler, XMLLab.WordXMLViewer" />
</httpHandlers>

Yep, the hint is Design mode. I'll post about this trick tomorrow.

The usage is simple - just drop the control onto a page and set its "DocumentSource" property (the Word 2003 XML file you want to show).

I deliberately named this control "Word 2003 XML Viewer Control" to avoid confusion. But I'll update it to support Word 2007 as soon as there is a solution to the Word 2007 to HTML transformation problem.

Any comments are welcome. Enjoy.



10 Comments

The URLs are broken.
Is the code still free source?
Where can I download it?

Hello Oleg,

Your web control is perfect, I have used it many times and can say it is very stable.

I extended your control to support templates: in the WordML I placed merge fields, and then mapped them to the data:

This way I am using the control as a template engine.

Nicolay

Dear Oleg Tkachenko,
I wanted to know if you have started working on an XSL stylesheet for Word 2007 to HTML conversion.
I'm facing issues with tab-before and tab-after spaces in Word 2007. In Word 2003, this information was stored in the Word XML as wx:tabBefore and wx:tabAfter. This is no longer available. Also, with lists, the actual list label was stored as wx:t. This information is not available in Word 2007 XML, and Word automatically shows the value at run time. I wanted to know your thoughts on retrieving (or generating) these values with an XSL stylesheet. I'm having a tough time generating values like "First, Second", "One, Two" etc.
I appreciate your help.

Best Regards,
Vasu

Hi Oleg,

I need a control or a way to embed the viewer in a WinForms application. Any idea on how to get that done? I specifically want the paper view of the document to be shown, with headers and footers. Thanks in advance!

Hi, I'm using a customised XSLT to transform Word 2003 XML to HTML, but the problem is that I do get the position and size of the pictures, yet the pictures themselves are not displayed. How can I solve this problem? The project is in VB.NET. In the XSLT I have put this for extraction of pictures:

Help will be greatly appreciated. Thanks.

Any kind of documents? That's a bold goal.
Well, you need to be able to handle any kind of document and convert it to HTML. For Word 2003 you can use this control, but for others - no idea.


Hi,
I am working on a web application where I want to create a document viewer control which can display any kind of document. Can anyone suggest which component to use, if any?

Thanks & Regards
Bharath Reddy VasiReddy

Dear Oleg Tkachenko,
Do you have an XSLT that does the reverse of what this control does? I have XHTML text (rich text from an ASP.NET web page) and I want to show the rich text in a Word document generated via WordML. I am unable to find documentation on this.

Thanks for your help in advance.

Franck, yes, you can use that picture for the book.

Contact me at oleg@<this domain name> if you need the original image.

Dear Oleg Tkachenko,
I am a French researcher in ecology, and with two colleagues, from England and from the Czech Republic, we are writing a scientific (non-commercial) book on population dynamics and conservation biology. We are looking for an illustration with a goldfish, and we saw your nice photo http://tkachenko.com/cs/photos/fauna/picture99.aspx and were wondering whether we would be allowed to use it. I am posting this request here because I have not found your email address; I hope you can read it.
Please do not hesitate to contact us for any information (you can find the content of the book here: http://www.entu.cas.cz/berec/alleebook.php).
Thank you very much in advance.
Yours sincerely,
Dr Franck Courchamp

