Bugzilla – Bug 760
Wrong recognition of CDO-CDC pairs in <style> element when a CSS property name contains a HYPHEN.
Last modified: 2010-12-03 11:09:45 CET
Overview:
Wrong recognition of CDO-CDC pairs in <style> element when a CSS property name contains a HYPHEN.

Steps to Reproduce:
1) Go to http://html5.validator.nu/
2) Select "Text Field" and enter the following example HTML5 file. Note that it contains a CSS property "font-family" with a hyphen in its name.

<!DOCTYPE html>
<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<title>CDO-CDC Test</title>
<style>
<!--
body {
font-family: serif;
}
-->
</style>
</head>
<body>
</body>
</html>

3) Validate!

Actual Results:
Error: The text content of element style was not in the required format: Content contains the character sequence <!-- without a later occurrence of the character sequence -->.
From line 12, column 7; to line 12, column 14
-->↩ </style>↩ </h
Syntax of text content with CDO-CDC pair: Any text content that contains the character sequence "<!--" followed by a later occurrence of the character sequence "-->".

Expected Results:
The document validates according to the specified schema(s).

Build Date & Platform:
Current http://html5.validator.nu/ and also the validator.nu compiled locally using https://whattf.svn.cvsdude.com/syntax/trunk/ revision 564.

Additional Builds and Platforms:
HTML5 validator on http://validator.w3.org/ does not detect this error.

Additional Information:
The cause of this bug is in syntax/relaxng/datatype/java/src/org/whattf/datatype/CdoCdcPair.java.

The hyphen in "font-family" changes the state from HAS_CDO to HAS_CDO_AND_HYPHEN ('-' == c). But the state is kept in HAS_CDO_AND_HYPHEN even after seeing many non-hyphen characters ("else" in case HAS_CDO_AND_HYPHEN). So when a real CDC is found, the state changes from HAS_CDO_AND_HYPHEN to HAS_CDO_AND_DOUBLE_HYPHEN ('-' == c), then back to HAS_CDO ("else" in case HAS_CDO_AND_DOUBLE_HYPHEN), then to HAS_CDO ("else" in case HAS_CDO), and then an error.

Here is a patch that I think will fix it.
The second hunk in "case HAS_CDO_AND_DOUBLE_HYPHEN:" is for the case of <!-- comment ---> (note the three hyphens in the CDC). The HTML5 spec does not forbid this, but it is bad CSS. If the second hunk is not appropriate, please pick up only the first hunk.

#########################################################################
Index: relaxng/datatype/java/src/org/whattf/datatype/CdoCdcPair.java
===================================================================
--- relaxng/datatype/java/src/org/whattf/datatype/CdoCdcPair.java (revision 564)
+++ relaxng/datatype/java/src/org/whattf/datatype/CdoCdcPair.java (working copy)
@@ -88,12 +88,15 @@
                         state = State.HAS_CDO_AND_DOUBLE_HYPHEN;
                         continue;
                     } else {
+                        state = State.HAS_CDO;
                         continue;
                     }
                 case HAS_CDO_AND_DOUBLE_HYPHEN:
                     if ('>' == c) {
                         state = State.DATA;
                         continue;
+                    } else if ('-' == c) {
+                        continue;
                     } else {
                         state = State.HAS_CDO;
                         continue;
#########################################################################
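
For anyone who wants to reproduce the state transitions without building the validator, here is a minimal standalone sketch of the scanner as described in this report. It is not the actual CdoCdcPair.java source: the class name CdoCdcSketch, the method isValid, the helper restart, and the "patched" flag are hypothetical, and the names of the four states before HAS_CDO are assumed. Only the HAS_CDO* states and their transitions are taken from the description and the patch above.

```java
// Hypothetical sketch of the CDO-CDC scanner described in this report;
// it is NOT the real CdoCdcPair.java.
final class CdoCdcSketch {

    // HAS_CDO* names come from the report; the first four names are assumed.
    private enum State {
        DATA, LESS_THAN, LESS_THAN_BANG, LESS_THAN_BANG_HYPHEN,
        HAS_CDO, HAS_CDO_AND_HYPHEN, HAS_CDO_AND_DOUBLE_HYPHEN
    }

    // Returns true if every "<!--" in text is followed by a later "-->".
    // patched == false mimics the behavior reported in this bug;
    // patched == true applies both hunks of the patch.
    static boolean isValid(CharSequence text, boolean patched) {
        State state = State.DATA;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (state) {
                case DATA:
                    if (c == '<') {
                        state = State.LESS_THAN;
                    }
                    break;
                case LESS_THAN:
                    state = (c == '!') ? State.LESS_THAN_BANG : restart(c);
                    break;
                case LESS_THAN_BANG:
                    state = (c == '-') ? State.LESS_THAN_BANG_HYPHEN : restart(c);
                    break;
                case LESS_THAN_BANG_HYPHEN:
                    state = (c == '-') ? State.HAS_CDO : restart(c);
                    break;
                case HAS_CDO:
                    if (c == '-') {
                        state = State.HAS_CDO_AND_HYPHEN;
                    }
                    break;
                case HAS_CDO_AND_HYPHEN:
                    if (c == '-') {
                        state = State.HAS_CDO_AND_DOUBLE_HYPHEN;
                    } else if (patched) {
                        // First hunk: a lone hyphen is forgotten again.
                        state = State.HAS_CDO;
                    }
                    break;
                case HAS_CDO_AND_DOUBLE_HYPHEN:
                    if (c == '>') {
                        state = State.DATA;
                    } else if (!(patched && c == '-')) {
                        // Second hunk: when patched, extra hyphens keep the
                        // state; anything else falls back to HAS_CDO.
                        state = State.HAS_CDO;
                    }
                    break;
            }
        }
        // An open CDO without a closing CDC is the error case.
        return state != State.HAS_CDO
                && state != State.HAS_CDO_AND_HYPHEN
                && state != State.HAS_CDO_AND_DOUBLE_HYPHEN;
    }

    // After a failed "<!--" prefix, a '<' may start a new attempt.
    private static State restart(char c) {
        return (c == '<') ? State.LESS_THAN : State.DATA;
    }

    public static void main(String[] args) {
        String css = "<!--\nbody { font-family: serif; }\n-->";
        System.out.println(isValid(css, false)); // false (the reported error)
        System.out.println(isValid(css, true));  // true (after the patch)
    }
}
```

With patched set to false, the hyphen in "font-family" leaves the scanner stuck in HAS_CDO_AND_HYPHEN, so the first hyphen of the real "-->" is read as the second one and the closing ">" is missed. With patched set to true, the same input ends in DATA, and "<!-- comment --->" is accepted as well.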
Created attachment 183: patch from neomjp
Thanks very much. I've applied both parts of the patch, and it is now deployed at http://html5.validator.nu/ and http://validator.nu/. Please test again and let me know if you see any problems.
I tested again and confirmed that I get the expected result. Thank you for picking up this patch.
https://bitbucket.org/validator/syntax/changeset/13886b1a52ab
https://bitbucket.org/validator/syntax/changeset/d9f048a305d8