Bugzilla – Bug 867
confusing (incorrect?) error message for page served with «Content-Type: text/html; charset="UTF-8"» HTTP header
Last modified: 2011-10-26 08:55:13 CEST
Running a page served with a «Content-Type: text/html; charset="UTF-8"» HTTP header through the validator causes the following error message to be emitted: [[ The encoding "utf-8" is not the preferred name of the character encoding in use. The preferred name is utf-8. (Charmod C024) ]] The cause is apparently that the htmlparser code does a case-insensitive comparison of the preferred value «utf-8» against the value of the charset parameter in the Content-Type header. Instead of treating that charset value as a quoted string, it treats it as a single literal token «"UTF-8"», with the quotes as part of the token. Because of the quotes, that token doesn't case-insensitively match «utf-8», so the comparison fails.
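The suspected failure mode can be reproduced in a few lines. This is a minimal Python sketch for illustration only, not the htmlparser code (which is not shown in this report); the name `naive_matches_preferred` is hypothetical.

```python
PREFERRED = "utf-8"

def naive_matches_preferred(charset_value: str) -> bool:
    """Hypothetical check: case-insensitive comparison against the preferred
    name, treating the raw parameter value as one literal token."""
    return charset_value.lower() == PREFERRED

# Unquoted token: differs from the preferred name only by case.
print(naive_matches_preferred('UTF-8'))
# Quoted string passed through unchanged: the quotes are part of what is compared.
print(naive_matches_preferred('"UTF-8"'))
```

Under these assumptions the first call matches and the second does not, which is consistent with a message complaining that the name is not the preferred one.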
Having quotes in the parameter is bogus on the HTTP layer. Thus, the value extracted from the HTTP header includes the quotes. Any ideas how to give a more obvious message about this?
Why is it bogus?
(In reply to comment #2)
> Why is it bogus?

Oops, right. Double quotes aren't bogus. I was confused by my vague recollection of quote oddities in MIME. The oddity is that single quotes aren't quotes. Need to locate the code that extracts the parameter value from the HTTP header...
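For reference, one possible shape for that extraction step, as a rough Python sketch rather than the validator's actual code: `extract_charset` and the `_PARAM` pattern are hypothetical names, and the pattern is a simplification of the HTTP grammar. It unwraps a double-quoted value (undoing backslash escapes) and leaves single quotes alone, since they are ordinary token characters.

```python
import re

# Hypothetical, simplified pattern: ";name=value", where value is either a
# double-quoted string (with backslash escapes) or a bare token.
_PARAM = re.compile(r';\s*([^=;\s]+)\s*=\s*("(?:[^"\\]|\\.)*"|[^;\s]*)')

def extract_charset(content_type: str):
    """Return the charset parameter of a Content-Type value, or None."""
    for name, value in _PARAM.findall(content_type):
        if name.lower() == "charset":
            if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
                # Quoted string: drop the surrounding double quotes and
                # undo backslash escapes inside it.
                value = re.sub(r'\\(.)', r'\1', value[1:-1])
            # Single quotes are not quotes here, so they are kept as-is.
            return value
    return None

print(extract_charset('text/html; charset="UTF-8"'))
print(extract_charset("text/html; charset='UTF-8'"))
```

With the quotes removed before the comparison, a value such as «"UTF-8"» would differ from «utf-8» only by case, while «'UTF-8'» would still be reported with its single quotes intact.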