Bug 200 - Bring things more in line with the requirements of actual web content
Status: NEW
Product: Validator.nu
Classification: Unclassified
Component: HTML parser
Version: HEAD
Hardware: All All
Importance: P2 normal
Assigned To: Nobody
URL: http://svn.whatwg.org/webapps/source?...
Depends on:
Blocks:
Reported: 2008-05-23 13:23 CEST by Henri Sivonen
Modified: 2009-11-23 17:16 CET (History)


Description Henri Sivonen 2008-05-23 13:23:51 CEST
Index: source
===================================================================
--- source	(revision 1673)
+++ source	(revision 1674)
@@ -28321,7 +28321,8 @@
   characters, U+000D CARRIAGE RETURN (CR) characters, or U+000D
   CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs.</p>
 
-  <p class="note">This is a willful double violation of RFC2046.</p>
+  <p class="note">This is a willful double violation of RFC2046. <a
+  href="#refsRFC2046">[RFC2046]</a></p>
 
   <p>The first line of an application cache manifest must consist of
   the string "CACHE", a single U+0020 SPACE character, the string
@@ -31553,18 +31554,9 @@
 
   <ol>
 
-   <li><p>Skip characters in <var title="">s</var> up to and including
-   the first U+003B SEMICOLON (<code title="">;</code>)
-   character.</p></li>
-
-   <li><p>Skip any U+0009, U+000A, U+000B, U+000C, U+000D, or U+0020
-   characters (i.e. spaces) that immediately follow the
-   semicolon.</p></li>
-
-   <li><p>If the next seven characters are not a case-insensitive<!--
-   XXX ASCII--> match for 'charset', return nothing.</p></li>
-   <!-- XXX technically, we should skip to the next MIME parameter,
-   but the question is, do browsers do that? -->
+   <li><p>Find the first seven characters in <var title="">s</var>
+   that are a case-insensitive<!-- XXX ASCII--> match for the word
+   'charset'. If no such match is found, return nothing.</p>
 
    <li><p>Skip any U+0009, U+000A, U+000B, U+000C, U+000D, or U+0020
    characters that immediately follow the word 'charset' (there might
@@ -31590,11 +31582,14 @@
      <dd><p>Return the string between this character and the next
      earliest occurrence of this character.</dd>
 
+
      <dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
      <dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
+     <dt>If there is no next character</dt>
 
      <dd><p>Return nothing.</dd>
 
+
      <dt>Otherwise</dt>
 
      <dd><p>Return the string from this character to the first U+0009,
@@ -31607,6 +31602,8 @@
 
   </ol>
 
+  <p class="note">The above algorithm is a willful violation of the
+  HTTP specification. <a href="#refsRFC2616">[RFC2616]</a></p>
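
For reference, the revised extraction steps in the diff above can be sketched in Python roughly as follows. This is an illustration only, not code from the spec or from the Validator.nu parser. The function name extract_charset is hypothetical. The diff elides the steps between "skip spaces after 'charset'" and the quote handling, so the check for an equals sign is an assumption, as is treating a semicolon or the end of the string as a terminator for an unquoted value (the hunk is cut off after U+0009).

```python
import re

# The six characters the diff lists as spaces:
# U+0009, U+000A, U+000B, U+000C, U+000D, U+0020.
SPACE_CHARS = "\t\n\x0b\x0c\r "

# re.ASCII keeps the case-insensitive match ASCII-only, as the
# "XXX ASCII" comment in the diff suggests is intended.
_CHARSET = re.compile("charset", re.IGNORECASE | re.ASCII)


def extract_charset(s):
    """Return the charset value found in s, or None for "return nothing"."""
    # Find the first case-insensitive match for 'charset' anywhere in s.
    # The old steps that required a preceding semicolon are gone.
    match = _CHARSET.search(s)
    if match is None:
        return None
    i = match.end()

    # Skip spaces that immediately follow the word 'charset'.
    while i < len(s) and s[i] in SPACE_CHARS:
        i += 1

    # ASSUMPTION (elided in the diff): an equals sign is required here,
    # followed by optional spaces.
    if i >= len(s) or s[i] != "=":
        return None
    i += 1
    while i < len(s) and s[i] in SPACE_CHARS:
        i += 1

    # New case added by the diff: there is no next character.
    if i >= len(s):
        return None

    # Quoted value: return the text between the matching quotes,
    # or nothing if the quote is unmatched.
    if s[i] in "\"'":
        close = s.find(s[i], i + 1)
        if close == -1:
            return None
        return s[i + 1:close]

    # Otherwise: read up to the first space character.
    # ASSUMPTION: a semicolon or the end of s also ends the value.
    j = i
    while j < len(s) and s[j] not in SPACE_CHARS and s[j] != ";":
        j += 1
    return s[i:j]
```

Under these assumptions, "text/html charset=utf-8" (no semicolon before the parameter) yields "utf-8", which the pre-1674 steps would have rejected; that leniency is the behaviour the added note describes as a willful violation of RFC 2616.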