Bugzilla – Bug 494
Empty class is marked as invalid
Last modified: 2009-05-01 04:36:49 CEST
Yet empty classes are valid (just nonsense). Sample:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
Created attachment 85 [details]
patch with proposed change
This patch sets the allowed value of the class attribute in the whattf XHTML10 schema to a list of zero or more xsd:NMTOKENS (it was set to one or more). That technically matches the HTML4 and XHTML1.0 specs (modulo the ambiguity that those specs don't clearly say whether the class value can actually be empty), but HTML4 and XHTML1.0 DTDs actually use CDATA as the allowed value for class. So if we wanted to match actual practice, the whattf XHTML10 schema should instead just set the allowed value to the text.datatype pattern (which is defined as "text").
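To make the three candidate definitions concrete, here is a minimal Python sketch (not the schema itself) comparing what each would accept as a class value. The function names are hypothetical, and the NMTOKEN pattern is an ASCII-only approximation; real XML NMTOKENs also allow many non-ASCII name characters.

```python
import re

# ASCII-only approximation of an XML NMTOKEN (assumption for illustration).
NMTOKEN_ASCII = re.compile(r"[A-Za-z0-9._:-]+")


def tokens(value):
    """Split a list-typed attribute value on XML white space."""
    return re.findall(r"[^ \t\r\n]+", value)


def is_nmtokens(value, min_count):
    """True if value is a list of at least min_count NMTOKENs."""
    parts = tokens(value)
    return len(parts) >= min_count and all(
        NMTOKEN_ASCII.fullmatch(p) for p in parts
    )


def check_class(value):
    """Report which of the three candidate definitions accept the value."""
    return {
        "one_or_more_nmtokens": is_nmtokens(value, 1),   # old schema behaviour
        "zero_or_more_nmtokens": is_nmtokens(value, 0),  # first patch
        "text": True,                                    # CDATA-like: anything
    }
```

Under this sketch, an empty value is rejected only by the one-or-more rule, while a value such as "foo!" is rejected by both NMTOKEN-based rules and accepted only by the text rule.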
Created attachment 86 [details]
just cd'ed up to my base validator dir and ran diff from there so that it would show the full path to the file (to make it clear where the change is)
The quantifier inside the comment should be +. Other than that, looks better than what's in currently, but is there any spec basis even for nmtokens of any number as opposed to any string?
XHTML Modularization uses nmtokens, but this schema claims to be an XHTML 1.0 schema, and XHTML 1.0 uses CDATA, hence it should be CDATA.
(In reply to comment #3)
> The quantifier inside the comment should be +.
oops. I'll fix that.
> Other than that, looks better
> than what's in currently, but is there any spec basis even for nmtokens of any
> number as opposed to any string?
As Geoffrey noted, the XHTML 1.0 schema uses CDATA.
As far as the spec itself - http://www.w3.org/TR/html401/struct/global.html#adef-class - that specifies the datatype for @class as "cdata-list". The spec never defines what a "cdata-list" is; that instance of "cdata-list" is just a hyperlink to the definition of CDATA in the spec. The text following says, "Multiple class names must be separated by white space characters."
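As a rough illustration of the "separated by white space characters" rule, a class value can be split into names as below. This is only a sketch: the function name is hypothetical, and the character set is my reading of the HTML 4.01 definition of white space (space, tab, form feed, zero-width space, plus CR and LF line breaks), so treat it as an assumption.

```python
import re

# Assumed HTML 4.01 white space set: space, tab, form feed,
# zero-width space (U+200B), carriage return, line feed.
HTML4_WHITESPACE = re.compile("[ \t\f\u200b\r\n]+")


def class_names(value):
    """Return the class names in a class attribute value.

    Any non-white-space run counts as a name, which mirrors the
    CDATA reading: no NMTOKEN restriction is applied.
    """
    return [name for name in HTML4_WHITESPACE.split(value) if name]
```

With this reading, an empty class attribute simply yields an empty list of names rather than an error.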
The non-normative attribute quick reference that's an appendix to the spec - http://www.w3.org/TR/html4/index/attributes.html - gives the datatype as CDATA and has the comment "space-separated list of classes".
I notice that the HTML4 DTD and spec never actually use NMTOKENS anyway. I guess that's because in DTDs, there's no way to state that an attribute value can be zero or more NMTOKENS. So if they had used NMTOKENS, it would have meant that the attribute value could not be empty (because NMTOKENS is defined as 1 or more NMTOKEN). So I'd speculate that when the old HTML WG put together XHTML 1.1, they probably figured they were correcting that deficiency or something.
Anyway, it does seem clear at this point that the XHTML1 schema should use the equivalent of CDATA. So I'll change the patch.
Created attachment 92 [details]
patch changes datatype of class to RelaxNG "text" primitive
Created attachment 93 [details]
validator r313 http://svn8.cvsdude.com/vvc/whattf/validator?view=revision&revision=313