NOTE: The current preferred location for bug reports is the GitHub issue tracker.
Bug 820 - HtmlParser does not properly expose SVG namespace, startPrefixMapping not called
Status: NEW
Product: Validator.nu
Classification: Unclassified
Component: HTML parser
Version: HEAD
Hardware: All All
Importance: P2 normal
Assigned To: Nobody
Depends on:
Blocks:
Reported: 2011-02-25 18:57 CET by Martin Honnen
Modified: 2013-09-23 22:35 CEST (History)
Attachments
example HTML input document with embedded SVG elements (459 bytes, text/html)
2011-02-25 18:59 CET, Martin Honnen
example stylesheet to run with Saxon against the HTML input document (530 bytes, application/xml)
2011-02-25 19:03 CET, Martin Honnen

Description Martin Honnen 2011-02-25 18:57:02 CET
When using HtmlParser 1.3 with Saxon 9.2 or 9.3, SVG elements are not put in the SVG namespace when the XSLT stylesheet simply does a deep copy (copy-of) of all child nodes of the document node.
As the problem does not occur with other versions of Saxon, such as Saxon 6.5.5 or Saxon 9.1, I first raised it on the Saxon help mailing list; see http://sourceforge.net/mailarchive/forum.php?thread_name=4D67BFF0.3070507%40arcor.de&forum_name=saxon-help.

Michael Kay says the problem is not a Saxon bug, but rather a bug in the HtmlParser which does not call http://www.saxproject.org/apidoc/org/xml/sax/ContentHandler.html#startPrefixMapping%28java.lang.String,%20java.lang.String%29 before calling startElement, as mandated by http://www.saxproject.org/namespaces.html.
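
For illustration, the event order that the SAX namespace rules call for can be observed with any namespace-aware XMLReader. The following is only a minimal sketch using the JDK's built-in XML parser on a well-formed XHTML+SVG string, not HtmlParser itself; the class name PrefixMappingTrace and its record method are hypothetical names made up for this example. Presumably the same handler could be attached to the HtmlParser class to check whether startPrefixMapping is reported there, but that is an assumption not tested here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of HtmlParser or Saxon): records the
// namespace-related SAX events in the order the parser reports them.
public class PrefixMappingTrace {

    /** Parses the given XML string and returns the recorded events in order. */
    public static List<String> record(String xml) throws Exception {
        final List<String> events = new ArrayList<>();

        // Use the JDK's default SAX parser with namespace processing enabled.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startPrefixMapping(String prefix, String uri) {
                events.add("startPrefixMapping('" + prefix + "', " + uri + ")");
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                events.add("startElement({" + uri + "}" + localName + ")");
            }
        });

        reader.parse(new InputSource(new StringReader(xml)));
        return events;
    }

    public static void main(String[] args) throws Exception {
        // Well-formed XML stand-in for an HTML document with inline SVG.
        String xml = "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
                   + "<svg xmlns='http://www.w3.org/2000/svg'><circle r='20'/></svg>"
                   + "</body></html>";
        for (String event : record(xml)) {
            System.out.println(event);
        }
    }
}
```

With a parser that follows the SAX namespace rules, the startPrefixMapping event for the SVG namespace is reported before the startElement event for the 'svg' element; a consumer that builds its namespace declarations from startPrefixMapping events depends on that order.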

Therefore I am filing this bug against HtmlParser.

I will attach minimal files that demonstrate the problem.

To reproduce, run Saxon 9.3.0.4 from the command line, for example (the ';' classpath separator is for Windows):

java -cp htmlparser-1.3.jar;saxon9he.jar net.sf.saxon.Transform -xsl:sheet.xsl -s:input.html -x:nu.validator.htmlparser.sax.InfosetCoercingHtmlParser
Comment 1 Martin Honnen 2011-02-25 18:59:07 CET
Created attachment 190
example HTML input document with embedded SVG elements
Comment 2 Martin Honnen 2011-02-25 19:03:43 CET
Created attachment 191
example stylesheet to run with Saxon against the HTML input document

The stylesheet simply outputs a comment with some debugging information, followed by a deep copy of the child nodes of the document node. I would expect the SVG elements in the HTML input (i.e. the 'svg' and 'circle' elements) to be output in the SVG namespace, yet the result I get is as follows:

<?xml version="1.0" encoding="UTF-8"?><!--Processed with XSLT version 2.0 by XSLT processor SAXON 9.3.0.4 from Saxonica.--><html xmlns="http://www.w3.org/1999/xhtml" lang="en"><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>HTML(5) parsing and XSLT transformation test</title>
</head>
<body>
<h1 class="heading">HTML(5) parsing and XSLT transformation test</h1>
<p>
This is paragraph 1 with some inline SVG:
<svg with="100" height="100">
  <circle cx="50" cy="50" r="20" fill="green"/>
</svg>
</p><p>This is the next paragraph.
</p>


</body></html>

So the SVG elements end up in the XHTML namespace, which is not the desired result with HTML(5) parsing.
Comment 3 Martin Honnen 2011-02-25 19:55:58 CET
I hope I have given all the information needed. I am also adding a link, http://markmail.org/thread/bpxhih7omu7pz2wa, to the markmail view of the thread on the Saxon help list, as that is easier to use than the sourceforge mail archive.