January 9, 2006

WordML2HTML with support for images stylesheet updated

Almost 2 years ago I published a post, "Transforming WordML to HTML: Support for Images", showing how to hack the Microsoft WordML2HTML stylesheet to support images. People kept telling me it doesn't support some weird image formats or header images. Moreover, I realized it had a bug and didn't work with .NET 2.0. So I finally updated that damn stylesheet.

This time I took a different Microsoft WordML2HTML stylesheet as a base: the one that comes with the Word 2003 XML Viewer tool. I think it's a better one. Anyway, I added a couple of templates to it, so images now get decoded and saved externally, and headers and footers are processed too (only the odd-page header/footer per section, to be precise).

Note: this stylesheet uses embedded C# script to decode images, so it only works with .NET XSLT processors such as XslTransform (.NET 1.1) or XslCompiledTransform (.NET 2.0). You can also run it with the nxslt/nxslt2 command line tool.

Here is a small demo. We start with a Word 2003 document with images in the body and header.
Magic XSLT transformation:

    nxslt2 test.xml wordml2html-.NET-script.xslt -o test.html

produces test.html and a directory containing decoded images.
Download the stylesheet at the XML Lab downloads page. Any comments are welcome.
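For readers who want to see what the image decoding step amounts to without reading the embedded C# script, here is a rough Python sketch. It is not the stylesheet's code. It assumes that images are stored as base64 text inside w:binData elements whose w:name attribute looks like wordml://02000001.jpg, which is how Word 2003 writes them as far as I know. The function name extract_images and the output layout are made up for illustration.

```python
import base64
import os
import tempfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by Word 2003 "Save as XML" documents.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def extract_images(wordml_text, out_dir):
    """Decode every w:binData element and save it under out_dir.

    Hypothetical helper: returns a dict mapping each original w:name
    value to the path of the file that was written.
    """
    os.makedirs(out_dir, exist_ok=True)
    root = ET.fromstring(wordml_text)
    saved = {}
    for bin_data in root.iter(f"{{{W_NS}}}binData"):
        name = bin_data.get(f"{{{W_NS}}}name", "")
        # Keep only the last segment of e.g. "wordml://02000001.jpg".
        file_name = os.path.basename(name.replace("\\", "/"))
        if not file_name or not bin_data.text:
            continue  # nothing usable to save
        path = os.path.join(out_dir, file_name)
        with open(path, "wb") as f:
            # b64decode ignores the line breaks Word puts in the payload.
            f.write(base64.b64decode(bin_data.text))
        saved[name] = path
    return saved


if __name__ == "__main__":
    # Tiny made-up document with a fake image payload.
    payload = base64.b64encode(b"not a real image").decode("ascii")
    sample = (
        f'<w:wordDocument xmlns:w="{W_NS}"><w:body><w:p><w:r><w:pict>'
        f'<w:binData w:name="wordml://02000001.jpg">{payload}</w:binData>'
        f'</w:pict></w:r></w:p></w:body></w:wordDocument>'
    )
    with tempfile.TemporaryDirectory() as tmp:
        for name, path in extract_images(sample, tmp).items():
            print(name, "->", os.path.basename(path))
```

A real conversion also has to rewrite the image references in the generated HTML to point at the saved files, which this sketch leaves out.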
Comments

When I try to use the transform with the linked xml file, I get this exception: "Attribute and namespace nodes cannot be added to the parent element after a text, comment, pi, or sub-element node has already been added." Anyone have any idea as to why I would get this exception? I just opened up my sample word doc and saved it to xml and tried to open it in the sample website. The xml file is here: http://www.sendspace.com/file/045o89 And the full exception is here:
Posted by: Seth at May 1, 2008 6:46 PM

It should be able to support the mwf file...
Posted by: DReTeN at December 12, 2007 9:41 PM

Biju, can you send me a sample WordML file?
Posted by: Oleg Tkachenko at September 29, 2007 4:22 PM

Hi, The line between "Header text and image" and the image has been suppressed after the conversion. Any help/insight on this will be much appreciated.
Posted by: Biju at September 27, 2007 7:14 AM

Thanks for your great job! I am wondering the same thing about text boxes. Could you please post a reply?
Posted by: Orhan at September 7, 2007 1:15 AM

Sorry, busy. I'll look at this problem this weekend.
Posted by: Oleg Tkachenko at June 28, 2007 11:05 PM

Hi, in my last post, I gave you the link to download the sample code. Did you get the chance to look at it?

Alok, can you provide a minimal sample?
Posted by: Oleg Tkachenko at June 7, 2007 4:42 PM

If my WordML document has blank lines, after converting it through following command The line between "Header text and image" and the image has been suppressed after the conversion. Any help/insight on this will be much appreciated.
Posted by: Alok Tayal at June 7, 2007 12:36 PM

Sorry, what is Word Templates?
Posted by: Oleg Tkachenko at February 11, 2007 6:39 PM

The XSLT is awesome, thank you!! Do you plan to add support for Word Templates into it?
Posted by: J at February 10, 2007 2:00 AM

Interesting. )
Posted by: maximum at December 7, 2006 8:37 AM

Christian, have you tried the latest stylesheet?
Posted by: Oleg Tkachenko at September 28, 2006 4:17 PM

The conversion to HTML is generally ok, but it doesn't save the pictures. I have tried to debug nxslt2, but with no luck. I cannot run it in debug mode at all.
Posted by: Christian at September 1, 2006 11:41 AM

Vassilij, probably it doesn't.
Posted by: Oleg Tkachenko at June 28, 2006 10:22 PM
Listed below are links to weblogs that reference this post:
Signs on the Sand: WordML2HTML with support for images stylesheet updated from XSLT:Blog[@author = 'M. David Peterson']/Code-of-the-Day
Transforming WordML to HTML: Support for Images from Signs on the Sand