NOTE: The current preferred location for bug reports is the GitHub issue tracker.
Bug 96 - Make the processing of '<' characters in attributes while doing the encoding-scan parse match the behaviour of the main parser. Also, various minor editorial fixes.
Status: RESOLVED FIXED
Product: Validator.nu
Classification: Unclassified
Component: HTML parser
Version: HEAD
Hardware: All All
Importance: P2 normal
Assigned To: Henri Sivonen
URL: http://svn.whatwg.org/webapps/source?...
Depends on:
Blocks:
Reported: 2008-03-03 13:09 CET by Nobody
Modified: 2008-03-20 16:11 CET (History)
Description Nobody 2008-03-03 13:09:59 CET
Index: source
===================================================================
--- source	(revision 1264)
+++ source	(revision 1265)
@@ -35597,14 +35597,14 @@
 
        </dd>
 
-       <dt>A sequence of bytes starting with: 0x3C, 0x4D or 0x6D, 0x45 or 0x65, 0x54 or 0x74, 0x41 or 0x61, and finally one of 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20 (case-insensitive ASCII '&lt;meta' followed by a space)</dt>
+       <dt>A sequence of bytes starting with: 0x3C, 0x4D or 0x6D, 0x45 or 0x65, 0x54 or 0x74, 0x41 or 0x61, and finally one of 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20, 0x2F (case-insensitive ASCII '&lt;meta' followed by a space or slash)</dt>
        <dd>
 
         <ol>
 
          <li><p>Advance the <var title="">position</var> pointer so
-         that it points at the next 0x09, 0x0A, 0x0B, 0x0C, 0x0D, or
-         0x20 byte (the one in sequence of characters matched
+         that it points at the next 0x09, 0x0A, 0x0B, 0x0C, 0x0D,
+         0x20, or 0x2F byte (the one in sequence of characters matched
          above).</p></li>
 
          <li><p><span title="concept-get-attributes-when-sniffing">Get
@@ -35672,12 +35672,7 @@
          <li><p>Advance the <var title="">position</var> pointer so
          that it points at the next 0x09 (ASCII TAB), 0x0A (ASCII LF),
          0x0B (ASCII VT), 0x0C (ASCII FF), 0x0D (ASCII CR), 0x20
-         (ASCII space), 0x3E (ASCII '>'), 0x3C (ASCII '&lt;')
-         byte.</p></li>
-
-         <li><p>If the pointer points to a 0x3C (ASCII '&lt;') byte, then
-         return to the first step in the overall "two step"
-         algorithm.</p></li>
+         (ASCII space), or 0x3E (ASCII '>') byte.</p></li>
 
          <li><p>Repeatedly <span
          title="concept-get-attributes-when-sniffing">get an
@@ -35726,13 +35721,8 @@
      <li><p>If the byte at <var title="">position</var> is one of 0x09
      (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C (ASCII FF),
      0x0D (ASCII CR), 0x20 (ASCII space), or 0x2F (ASCII '/') then
-     advance <var title="">position</var> to the next byte and start
-     over.</p></li>
-
-     <li><p>If the byte at <var title="">position</var> is 0x3C (ASCII
-     '&lt;'), then move <var title="">position</var> back to the
-     previous byte, and stop looking for an attribute. There isn't
-     one.</p></li>
+     advance <var title="">position</var> to the next byte and redo
+     this substep.</p></li>
 
      <li><p>If the byte at <var title="">position</var> is 0x3E (ASCII
      '>'), then stop looking for an attribute. There isn't
@@ -35760,8 +35750,7 @@
 
        <dd>Jump to the step below labelled <em>spaces</em>.</dd>
 
-       <dt>If it is 0x2F (ASCII '/'), 0x3C (ASCII '&lt;'), or 0x3E
-       (ASCII '&gt;')</dt>
+       <dt>If it is 0x2F (ASCII '/') or 0x3E (ASCII '>')</dt>
 
        <dd>Stop looking for an attribute. The attribute's name is the
        value of <var title="">attribute name</var>, its value is the
@@ -35853,7 +35842,7 @@
 
        </dd>
 
-       <dt>If it is 0x3C (ASCII '&lt;'), or 0x3E (ASCII '&gt;')</dt>
+       <dt>If it is 0x3E (ASCII '>')</dt>
 
        <dd>Stop looking for an attribute. The attribute's name is the
        value of <var title="">attribute name</var>, its value is the
@@ -35884,8 +35873,8 @@
       <dl class="switch">
 
        <dt>If it is 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII
-       VT), 0x0C (ASCII FF), 0x0D (ASCII CR), 0x20 (ASCII space), 0x3C
-       (ASCII '&lt;'), or 0x3E (ASCII '&gt;')</dt>
+       VT), 0x0C (ASCII FF), 0x0D (ASCII CR), 0x20 (ASCII space), or
+       0x3E (ASCII '>')</dt>
 
        <dd>Stop looking for an attribute. The attribute's name is the
        value of <var title="">attribute name</var> and its value is the
@@ -36002,7 +35991,7 @@
   U+FFFD REPLACEMENT CHARACTERs. Any occurrences of such characters is
   a <span>parse error</span>.</p>
 
-  <p>Any occurances of any characters in the ranges U+0001 to U+0008,
+  <p>Any occurrences of any characters in the ranges U+0001 to U+0008,
   <!-- space characters allowed --> U+000E to U+001F, <!-- ASCII
   allowed --> U+007F <!--to U+0084, (U+0085 NEL not allowed),
   U+0086--> to U+009F, U+D800 to U+DFFF <!-- surrogates not allowed
@@ -41159,12 +41148,12 @@
 
   <p><dfn id="escapingString">Escaping a string</dfn> (for the
   purposes of the algorithm above) consists of replacing any
-  occurances of the "<code title="">&amp;</code>" character by the
-  string "<code title="">&amp;amp;</code>", any occurances of the
+  occurrences of the "<code title="">&amp;</code>" character by the
+  string "<code title="">&amp;amp;</code>", any occurrences of the
   "<code title="">&lt;</code>" character by the string "<code
-  title="">&amp;lt;</code>", any occurances of the "<code
+  title="">&amp;lt;</code>", any occurrences of the "<code
   title="">&gt;</code>" character by the string "<code
-  title="">&amp;gt;</code>", and any occurances of the "<code
+  title="">&amp;gt;</code>", and any occurrences of the "<code
   title="">&quot;</code>" character by the string "<code
   title="">&amp;quot;</code>".</p>
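
For illustration only, here is a minimal Python sketch of the two encoding-scan steps changed by the patch above: matching '<meta' followed by a space or slash, and the first substep of getting an attribute, which now skips whitespace and '/' without treating '<' specially. The function names, the SKIPPABLE constant, and the handling of end-of-input are assumptions made for this sketch; they do not come from the spec text or from the Validator.nu parser.

```python
# Bytes named in the revised text: TAB, LF, VT, FF, CR, space, and '/'.
# (Hypothetical constant name, chosen for this sketch.)
SKIPPABLE = frozenset(b"\t\n\x0b\x0c\r /")


def matches_meta_start(data: bytes, position: int) -> bool:
    """Return True if data[position:] begins with case-insensitive
    ASCII '<meta' followed by a space character or a slash."""
    chunk = data[position:position + 6]
    return (
        len(chunk) == 6
        and chunk[:5].lower() == b"<meta"
        and chunk[5] in SKIPPABLE
    )


def skip_to_attribute_start(data: bytes, position: int) -> int:
    """Advance past whitespace and '/' bytes, as in the revised first
    substep of 'get an attribute'. A '<' byte no longer stops the scan
    here; it is simply not skippable. Stopping at the end of the input
    is an assumption of this sketch."""
    while position < len(data) and data[position] in SKIPPABLE:
        position += 1
    return position
```

With this sketch, `matches_meta_start(b"<meta/charset=x>", 0)` is true under the revision 1265 wording, whereas the revision 1264 wording would not have matched the slash.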