Bug 641 - U+EDC in URL triggers error "Bad value ໜ for attribute href on element a: COMPATIBILITY_CHARACTER in PATH."
Status: RESOLVED FIXED
Product: Validator.nu
Classification: Unclassified
Component: Datatype library
Version: HEAD
Hardware: All All
Importance: P2 normal
Assigned To: Michael[tm] Smith
Reported: 2009-09-04 01:00 CEST by Aryeh Gregor
Modified: 2009-12-01 02:49 CET (History)

Description Aryeh Gregor 2009-09-04 01:00:48 CEST
Try validating the following document:

<!doctype html>
<title>Test</title>
<p><a href=ໜ>Bad URL?</a>

where ໜ is replaced by the Unicode character U+EDC LAO HO NO.  The validator gives the following error:

Error: Bad value ໜ for attribute href on element a: COMPATIBILITY_CHARACTER in PATH.
From line 3, column 4; to line 3, column 13
title>↩<p><a href=ໜ>Bad UR
Syntax of IRI reference:
Any URL. For example: /hello, #canvas, or http://example.org/.

I've followed the bread crumbs from HTML 5 to Web Addresses to RFC 3987, and as far as I can tell, the entire range %xA0-D7FF is permitted in ucschar, thus iunreserved, thus ipchar, thus isegment*, etc., and so should be valid in IRIs, and thus HTML 5 URLs.  If I'm wrong, the error message here should at least be more helpful.
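The range check above can be sketched in a few lines. This is only an illustration, assuming the ucschar ranges as listed in RFC 3987 section 2.2; the names UCSCHAR_RANGES and is_ucschar are made up for this example and are not part of Validator.nu or its IRI library.

```python
# ucschar ranges from RFC 3987, section 2.2 (inclusive code point bounds).
UCSCHAR_RANGES = (
    [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
    # Planes 1 through 13: %x10000-1FFFD ... %xD0000-DFFFD
    + [(plane << 16, (plane << 16) | 0xFFFD) for plane in range(1, 14)]
    # Plane 14 starts later: %xE1000-EFFFD
    + [(0xE1000, 0xEFFFD)]
)


def is_ucschar(ch: str) -> bool:
    """Return True if the single character ch falls in the ucschar production."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UCSCHAR_RANGES)


if __name__ == "__main__":
    # U+0EDC LAO HO NO lies inside %xA0-D7FF.
    print(is_ucschar("\u0edc"))
```

By this check U+0EDC is a ucschar, so the generic syntax alone does not exclude it from a path segment.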

This is the second of the two apparent validator bugs I've found that block validation of www.wikipedia.org as HTML 5.  See also bug 640.
Comment 1 Henri Sivonen 2009-09-17 12:25:38 CEST
A compatibility character in an IRI is a SHOULD violation (RFC 3987, section 7.5, third paragraph). Validator.nu configures the IRI checking library to treat SHOULD violations as errors.
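For illustration, here is one way to see why U+0EDC counts as a compatibility character. This is a sketch only: it assumes "compatibility character" means a character with a compatibility decomposition in the Unicode Character Database, which may not match exactly what the IRI checking library tests, and is_compatibility_character is a hypothetical helper name.

```python
import unicodedata


def is_compatibility_character(ch: str) -> bool:
    """Return True if ch has a compatibility decomposition.

    In the Unicode Character Database, compatibility mappings carry a
    tag such as <compat> or <font>, so the decomposition string starts
    with "<". Canonical decompositions have no tag.
    """
    return unicodedata.decomposition(ch).startswith("<")


if __name__ == "__main__":
    ch = "\u0edc"  # LAO HO NO
    print(unicodedata.decomposition(ch))
    print(is_compatibility_character(ch))
    # NFKC normalization replaces a compatibility character with its mapping.
    print(unicodedata.normalize("NFKC", ch) != ch)
```

Under this assumption the character is legal per the ucschar grammar yet still falls under the RFC's "should avoid" advice, which is where the two readings in this bug diverge.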
Comment 2 Henri Sivonen 2009-09-17 15:15:53 CEST
But chapter 7 of the RFC is informative even though it says "should"...
Comment 3 Michael[tm] Smith 2009-12-01 02:45:35 CET
I checked in a fix for this; you can test it at http://qa-dev.w3.org:8888/

This case now causes a warning to be reported, instead of an error.