Bugzilla – Bug 820
HtmlParser does not properly expose SVG namespace, startPrefixMapping not called
Last modified: 2013-09-23 22:35:22 CEST
When using HtmlParser 1.3 with Saxon 9.2 or 9.3, SVG elements are not put in the SVG namespace when the XSLT stylesheet simply does a deep copy (copy-of) of all child nodes of the document node. As the problem does not occur with other versions of Saxon, such as Saxon 6.5.5 or Saxon 9.1, I first raised it on the Saxon help mailing list, see http://sourceforge.net/mailarchive/forum.php?thread_name=4D67BFF0.3070507%40arcor.de&forum_name=saxon-help. Michael Kay says the problem is not a Saxon bug but rather a bug in HtmlParser, which does not call http://www.saxproject.org/apidoc/org/xml/sax/ContentHandler.html#startPrefixMapping%28java.lang.String,%20java.lang.String%29 before calling startElement, as mandated by http://www.saxproject.org/namespaces.html. I am therefore filing this bug against HtmlParser, and I will attach minimal files that demonstrate the problem.

To reproduce, run Saxon 9.3.0.4 from the command line, e.g.:

java -cp htmlparser-1.3.jar;saxon9he.jar net.sf.saxon.Transform -xsl:sheet.xsl -s:input.html -x:nu.validator.htmlparser.sax.InfosetCoercingHtmlParser
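To illustrate the SAX contract in question, here is a minimal sketch of a possible workaround filter. It is not part of HtmlParser or Saxon: the class name DefaultNamespaceFixer and its replay() helper are hypothetical, and the sketch assumes that the parser reports non-null namespace URIs and that only unprefixed elements (such as 'svg' and 'circle') need a default-namespace declaration. The filter sits between the parser and the consumer and emits the missing startPrefixMapping/endPrefixMapping pair whenever an unprefixed element arrives in a namespace other than the default namespace currently in scope.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical workaround: synthesizes missing default-namespace mappings. */
class DefaultNamespaceFixer extends XMLFilterImpl {
    static final String XHTML = "http://www.w3.org/1999/xhtml";
    static final String SVG = "http://www.w3.org/2000/svg";

    // Default namespace in scope for each open element (innermost first).
    private final Deque<String> defaults = new ArrayDeque<>();
    // Whether this filter itself declared the mapping for each open element.
    private final Deque<Boolean> synthesized = new ArrayDeque<>();
    // Default namespace the upstream parser declared for the next element, if any.
    private String pending;

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        if (prefix.isEmpty()) {
            pending = uri;
        }
        super.startPrefixMapping(prefix, uri);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        String inScope = pending != null ? pending : (defaults.isEmpty() ? "" : defaults.peek());
        pending = null;
        boolean unprefixed = qName.indexOf(':') < 0;
        boolean fix = unprefixed && !uri.equals(inScope);
        if (fix) {
            // The parser skipped the declaration, so announce it before the element.
            super.startPrefixMapping("", uri);
            inScope = uri;
        }
        defaults.push(inScope);
        synthesized.push(fix);
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(uri, localName, qName);
        defaults.pop();
        if (synthesized.pop()) {
            super.endPrefixMapping("");
        }
    }

    /** Replays an event sequence like the one in this report and logs what comes out. */
    static List<String> replay() {
        List<String> log = new ArrayList<>();
        DefaultNamespaceFixer fixer = new DefaultNamespaceFixer();
        fixer.setContentHandler(new DefaultHandler() {
            @Override public void startPrefixMapping(String p, String u) { log.add("map '" + p + "' -> " + u); }
            @Override public void endPrefixMapping(String p) { log.add("unmap '" + p + "'"); }
            @Override public void startElement(String u, String l, String q, Attributes a) { log.add("start " + q); }
            @Override public void endElement(String u, String l, String q) { log.add("end " + q); }
        });
        Attributes none = new AttributesImpl();
        try {
            fixer.startDocument();
            fixer.startPrefixMapping("", XHTML);
            fixer.startElement(XHTML, "html", "html", none);
            fixer.startElement(SVG, "svg", "svg", none); // no mapping announced upstream
            fixer.endElement(SVG, "svg", "svg");
            fixer.endElement(XHTML, "html", "html");
            fixer.endPrefixMapping("");
            fixer.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        replay().forEach(System.out::println);
    }
}
```

In a real pipeline such a filter would be given the HtmlParser instance as its parent via setParent and then handed to the consumer as the XMLReader. Whether Saxon's -x option can load a wrapper like this directly has not been checked here, and the proper fix remains for the parser itself to report the mappings.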
Created attachment 190: example HTML input document with embedded SVG elements
Created attachment 191: example stylesheet to run with Saxon against the HTML input document

The stylesheet simply outputs a comment with some debugging information and a deep copy of the child nodes of the document node. I would expect the SVG elements in the HTML input (i.e. the 'svg' and the 'circle' elements) to be output in the SVG namespace, yet the result I get is as follows:

<?xml version="1.0" encoding="UTF-8"?><!--Processed with XSLT version 2.0 by XSLT processor SAXON 9.3.0.4 from Saxonica.--><html xmlns="http://www.w3.org/1999/xhtml" lang="en"><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>HTML(5) parsing and XSLT transformation test</title>
</head>
<body>
<h1 class="heading">HTML(5) parsing and XSLT transformation test</h1>
<p> This is paragraph 1 with some inline SVG:
<svg with="100" height="100">
<circle cx="50" cy="50" r="20" fill="green"/>
</svg>
</p><p>This is the next paragraph.
</p>
</body></html>

So the SVG elements end up in the XHTML namespace, which is not the desired result with HTML(5) parsing.
I hope I have given all the information needed. I am also adding a link, http://markmail.org/thread/bpxhih7omu7pz2wa, to the MarkMail view of the thread on the Saxon help list, as that is easier to use than the SourceForge mail archive.