NOTE: The current preferred location for bug reports is the GitHub issue tracker.
Bug 482 - consider including content-model, etc., informational stuff in GNU output (not just in HTML output)
Status: RESOLVED INTENTIONAL
Product: Validator.nu
Classification: Unclassified
Component: Web service formats
Version: HEAD
Hardware: All All
Importance: P2 enhancement
Assigned To: Nobody
Duplicates: 483 484
Depends on:
Blocks:
Reported: 2009-04-15 10:26 CEST by Michael[tm] Smith
Modified: 2013-07-12 05:26 CEST (History)
Description Michael[tm] Smith 2009-04-15 10:26:55 CEST
About the content-model/contexts/attributes details that v.nu scrapes from the spec and includes in particular error messages -- it would be useful to have that information available in GNU output as well (though I recognize it'd be some significant work to implement).
Comment 1 Michael[tm] Smith 2009-04-15 11:59:52 CEST
*** Bug 483 has been marked as a duplicate of this bug. ***
Comment 2 Michael[tm] Smith 2009-04-15 12:00:11 CEST
*** Bug 484 has been marked as a duplicate of this bug. ***
Comment 3 Henri Sivonen 2009-04-16 16:03:37 CEST
Any suggestions on how to convert the HTML fragments into plain text and how to prefix the lines in the GNU format?
Comment 4 Michael[tm] Smith 2009-04-17 10:03:49 CEST
(In reply to comment #3)
> Any suggestions on how to convert the HTML fragments into plain text and how to
> prefix the lines in the GNU format?

As for how to convert the HTML fragments, I suppose the easiest and quickest thing to do would be to write a custom converter/serializer to handle just the subset of HTML element names that are used in the spec fragments -- which seems basically just to be <dl>, <dt>, <dd>, <a>, and <code>.

I think the GNU-format converter/serializer wouldn't need to do anything at all with <code>, nor with <a>. The <dt> content ends with a colon character to introduce the list items, and because the colon is used as a field separator in GNU format, I think the converter would need to change that colon to something else. Maybe just a space and dash?

I think the rest of it would come down to just normalizing the line breaks in the HTML fragments into single spaces, generating a comma character after the contents of each <dd>, and then generating a period after the contents of the <dl> (at the closing tag). And then, preferably, emitting it as a separate "info" message instead of as part of the error (see bug 485).

So for the following case:

http://dev.w3.org/html5/tests/validation/full/invalid/unknown-attribute/link.html

...the output would be:

"link.html":5.1-5.44: error: Attribute “bar” not allowed on element “link” at this point.
"link.html":5.1-5.44: info: Element-specific attributes for element link - Global attributes, href, rel, media, hreflang, type, sizes. Also, the title attribute has special semantics on this element.
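
For illustration only, the conversion described above could be sketched roughly as follows. This is not the validator's actual code (v.nu is written in Java); it is a minimal Python sketch using the standard-library html.parser module. The class name FragmentToGnu, the helper to_gnu_info, and the sample fragment are all hypothetical, and the real spec fragments may be structured differently. One interpretation is assumed: commas are placed between <dd> items rather than after the last one, so that the closing period does not follow a comma.

```python
import re
from html.parser import HTMLParser


class FragmentToGnu(HTMLParser):
    """Flatten a <dl> spec fragment into one line of plain text (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.heading = ""    # flattened text of the <dt>
        self.items = []      # flattened text of each <dd>
        self.current = None  # "dt", "dd", or None when outside both
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self.current = tag
            self.buffer = []
        # <dl>, <a> and <code> need no action: their text simply passes through.

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self.current:
            return  # closing </a>, </code>, </dl>: nothing to do
        # Normalize line breaks and runs of whitespace into single spaces.
        text = re.sub(r"\s+", " ", "".join(self.buffer)).strip()
        if tag == "dt":
            # The colon is a field separator in GNU format, so replace the
            # trailing colon of the <dt> with a space and a dash.
            if text.endswith(":"):
                text = text[:-1].rstrip() + " -"
            self.heading = text
        else:
            self.items.append(text)
        self.current = None

    def result(self):
        body = ", ".join(self.items)
        # End with a period unless the last item already supplies one.
        return f"{self.heading} {body}" + ("" if body.endswith(".") else ".")


def to_gnu_info(file_name, position, fragment):
    """Build a GNU-style "info" line from an HTML fragment (assumed line layout)."""
    parser = FragmentToGnu()
    parser.feed(fragment)
    parser.close()
    return f'"{file_name}":{position}: info: {parser.result()}'


# Hypothetical input, loosely modelled on the link example above.
SAMPLE = (
    '<dl><dt>Element-specific attributes for element '
    '<a href="#link"><code>link</code></a>:</dt>\n'
    "<dd>Global attributes</dd>\n"
    "<dd><code>href</code></dd>\n"
    "<dd><code>rel</code></dd></dl>"
)

if __name__ == "__main__":
    print(to_gnu_info("link.html", "5.1-5.44", SAMPLE))
```

Run against the sample fragment, this prints a single line of the form "link.html":5.1-5.44: info: Element-specific attributes for element link - Global attributes, href, rel. Colons that appear anywhere other than at the end of the <dt> are left alone here, so a real implementation would need to decide how to handle those.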