Bugzilla – Bug 65
Show document outline
Last modified: 2013-04-24 19:25:17 CEST
Show the outline of (X)HTML documents so that a human can verify that it makes sense. Pending Hixie's reformulation of the outline algorithm.
Got a patch from a contributor. Working on integrating it now.
I forgot to ask: If this is client-side JS, how does the client-side JS get a document tree to work with?
(In reply to comment #2) > I forgot to ask: If this is client-side JS, how does the client-side JS get a > document tree to work with? I did that by feeding the textarea value to DOMParser.parseFromString. But I've abandoned the client-side thing. It was just a kludge. I have it implemented completely on the server side now. You can test it now at http://qa-dev.w3.org:8888/ There were a couple of bugs in the contributed code that, in a couple of cases, caused it to produce an outline that didn't conform to the spec (and that didn't match output from other implementations), but I have already debugged and fixed those. So it now passes every test I've thrown at it. The code contributions I received (MIT-licensed) were for a class that builds an outline object, and another class that emits an HTML representation from that outline object. I integrated the outline-builder code by putting it into an additional reader that gets called on parsing of the document source, and I integrated the HTML emitter into the servlet code so that it gets called only if the output format is HTML. I did that without touching any of the MessageEmitter code, because that's really not necessary for the HTML case. I think it might be good to also make the outline available for JSON output and XML output, but I haven't written working code to do that yet. The JSON and XML outline emitters would need to be hooked into the MessageEmitter code. I messed around a bit with trying to create some and hook them in there, but I get failures further downstream, in the nu.validator.json.JsonHandler code and the XML serializer code. I guess I'd need to get more familiar with that code before I could fix the problems. Anyway, I have a patch for the outline builder and HTML emitter ready for review, and can get it to you any time. I just didn't want to pile it on you.
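For readers following along, here is a rough sketch of the "outline object plus HTML emitter" split described above. It is not the contributed code: the class name OutlineHtmlSketch, the Section type, and the nested ordered-list markup are all assumptions made for illustration, and the sketch does not implement the outline algorithm itself, only the rendering of an already-built outline.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: renders an already-built outline as nested <ol> lists.
final class OutlineHtmlSketch {
    // Hypothetical outline node: a heading text plus nested subsections.
    static final class Section {
        final String heading;
        final List<Section> children = new ArrayList<>();

        Section(String heading) {
            this.heading = heading;
        }
    }

    // Escape the characters that are significant in HTML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Emit one <ol> per nesting level; an empty level emits nothing.
    static String render(List<Section> sections) {
        if (sections.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder("<ol>");
        for (Section s : sections) {
            sb.append("<li>")
              .append(escape(s.heading))
              .append(render(s.children))
              .append("</li>");
        }
        return sb.append("</ol>").toString();
    }
}
```

Keeping the emitter separate from the builder is what would let a JSON or XML emitter consume the same outline object later.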
Another thing I should mention: The way I implemented this, it requires some (minor) changes to the HTML parser code and also to the XML parser code (Aelfred2 SAXDriver). The reason is that it stores and retrieves the outline using the setProperty and getProperty methods of the reader, so those need to be made to recognize the property name for the outline (for which I just have the code using "http://validator.nu/properties/document-outline"). Short of also turning the outline builder into an output-format-neutral outline emitter (which would not make sense), that's the only way I could see to pass the outline from the step where it's built (by the additional reader wrapper) to the step where it needs to be rendered. So the backend handling is different from that for the sorta similar "Show source" case, but that's kind of expected: the source emitter is not format-specific (that is, for HTML vs JSON vs XML output, the source output is all the same), while the outline emitter needs to be format-specific (different between HTML and JSON output especially).
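To make the property-based handoff concrete, here is a minimal sketch of a reader that recognizes the outline property name. It is an assumption-laden illustration, not the actual patch: OutlinePropertyReader is a hypothetical name, and it subclasses the standard SAX XMLFilterImpl rather than touching the HTML parser or the Aelfred2 SAXDriver.

```java
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical wrapper, for illustration only: it stores the outline itself
// and leaves every other property to the default XMLFilterImpl handling.
class OutlinePropertyReader extends XMLFilterImpl {
    static final String OUTLINE_PROPERTY =
            "http://validator.nu/properties/document-outline";

    private Object outline;

    @Override
    public void setProperty(String name, Object value)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        if (OUTLINE_PROPERTY.equals(name)) {
            outline = value; // remember the outline for the rendering step
            return;
        }
        super.setProperty(name, value);
    }

    @Override
    public Object getProperty(String name)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        if (OUTLINE_PROPERTY.equals(name)) {
            return outline;
        }
        return super.getProperty(name);
    }
}
```

With no parent reader set, XMLFilterImpl rejects any other property name with SAXNotRecognizedException, which is why each reader has to be taught the outline property explicitly.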
That’s weird. Why not have an outline builder as a ContentHandler outside the parsers?
(In reply to comment #5) > That’s weird. Why not have an outline builder as a ContentHandler outside the > parsers? Maybe just because I wasn't sure how to hook it into the existing VerifierServletTransaction code as a ContentHandler and have that actually read the document input. Is there an existing example of another filter class that's hooked into the VerifierServletTransaction code as a ContentHandler?
See the way baseUriTracker is set up with CombineContentHandler for the pattern that doesn’t require parser changes and works with both parsers:
    if (baseUriTracker == null) {
        wiretap.setWiretapContentHander(recorder);
    } else {
        wiretap.setWiretapContentHander(new CombineContentHandler(
                recorder, baseUriTracker));
    }
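As a rough illustration of the pattern suggested here, the sketch below fans SAX events out to two handlers so that an outline-related handler can listen outside the parser. TeeContentHandler, HeadingCollector, and TeeDemo are hypothetical names and only stand in for the idea; this is not the validator's CombineContentHandler, and collecting heading element names is far simpler than building a real outline.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in: forwards every SAX event to two handlers in order.
final class TeeContentHandler implements ContentHandler {
    private final ContentHandler a;
    private final ContentHandler b;

    TeeContentHandler(ContentHandler a, ContentHandler b) {
        this.a = a;
        this.b = b;
    }

    public void setDocumentLocator(Locator l) { a.setDocumentLocator(l); b.setDocumentLocator(l); }
    public void startDocument() throws SAXException { a.startDocument(); b.startDocument(); }
    public void endDocument() throws SAXException { a.endDocument(); b.endDocument(); }
    public void startPrefixMapping(String p, String u) throws SAXException {
        a.startPrefixMapping(p, u); b.startPrefixMapping(p, u);
    }
    public void endPrefixMapping(String p) throws SAXException { a.endPrefixMapping(p); b.endPrefixMapping(p); }
    public void startElement(String u, String l, String q, Attributes at) throws SAXException {
        a.startElement(u, l, q, at); b.startElement(u, l, q, at);
    }
    public void endElement(String u, String l, String q) throws SAXException {
        a.endElement(u, l, q); b.endElement(u, l, q);
    }
    public void characters(char[] c, int s, int n) throws SAXException {
        a.characters(c, s, n); b.characters(c, s, n);
    }
    public void ignorableWhitespace(char[] c, int s, int n) throws SAXException {
        a.ignorableWhitespace(c, s, n); b.ignorableWhitespace(c, s, n);
    }
    public void processingInstruction(String t, String d) throws SAXException {
        a.processingInstruction(t, d); b.processingInstruction(t, d);
    }
    public void skippedEntity(String n) throws SAXException { a.skippedEntity(n); b.skippedEntity(n); }
}

// Hypothetical listener: records h1..h6 element names in document order.
final class HeadingCollector extends DefaultHandler {
    final List<String> headings = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (localName.matches("h[1-6]")) {
            headings.add(localName);
        }
    }
}

final class TeeDemo {
    // Parses the XML once while both `other` and a HeadingCollector listen.
    static List<String> collectHeadings(String xml, ContentHandler other) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so localName is populated
        XMLReader reader = factory.newSAXParser().getXMLReader();
        HeadingCollector collector = new HeadingCollector();
        reader.setContentHandler(new TeeContentHandler(other, collector));
        reader.parse(new InputSource(new StringReader(xml)));
        return collector.headings;
    }
}
```

Because the extra handler only listens to events, nothing in either parser has to know that it exists.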
(In reply to comment #7) > See the way baseUriTracker is set up with CombineContentHandler for the pattern > that doesn’t require parser changes and works with both parsers. I've looked at that code now, but it seems like more than we need for the outline case. It has since occurred to me that the outline could be stored as an attribute of the request rather than as a property of the reader. So that's what I've switched to: request.setAttribute("http://validator.nu/properties/document-outline", (Deque<Section>) currentOutlinee.outline) to store it, and outline = (Deque<Section>) request.getAttribute("http://validator.nu/properties/document-outline") to get it back. So no changes to the parsers are needed.
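The store-and-retrieve step described above can be sketched as follows. To keep the example self-contained, a plain Map stands in for the servlet request's attribute table; that substitution, the class name OutlineAttributeSketch, and the Section type are assumptions for illustration only. In the servlet, request.setAttribute and request.getAttribute would take the place of put and get.

```java
import java.util.Deque;
import java.util.Map;

// Hypothetical sketch: a Map plays the role of the request's attributes.
final class OutlineAttributeSketch {
    static final String OUTLINE_KEY =
            "http://validator.nu/properties/document-outline";

    // Hypothetical stand-in for the outline's section type.
    static final class Section {
        final String heading;

        Section(String heading) {
            this.heading = heading;
        }
    }

    // Called where the outline is built.
    static void storeOutline(Map<String, Object> requestAttributes, Deque<Section> outline) {
        requestAttributes.put(OUTLINE_KEY, outline);
    }

    // Called where the outline is rendered. Attributes are untyped, so the
    // cast back to Deque<Section> is unchecked; returns null if nothing was stored.
    @SuppressWarnings("unchecked")
    static Deque<Section> loadOutline(Map<String, Object> requestAttributes) {
        return (Deque<Section>) requestAttributes.get(OUTLINE_KEY);
    }
}
```

The rendering side should be prepared for a null result, since a request that never reached the outline builder has no such attribute.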
https://bitbucket.org/validator/validator/commits/83a8589459852dc52a2d096f80cfaeeeae301234