Bugzilla – Bug 216
Shun UTF-32. Make it slightly clearer what 'UTF-16' means.
Last modified: 2008-05-28 14:47:04 CEST
Index: source
===================================================================
--- source (revision 1700)
+++ source (revision 1701)
@@ -31031,13 +31031,15 @@
 <tbody>
 <tr>
 <td>FE FF
- <td>UTF-16BE BOM <!-- followed by a character --> or UTF-32LE BOM
+ <td>UTF-16BE BOM <!-- followed by a character --><!-- nobody uses this: or UTF-32LE BOM -->
 <tr>
 <td>FF FE
 <td>UTF-16LE BOM <!-- followed by a character -->
+<!-- nobody uses this
 <tr>
 <td>00 00 FE FF
 <td>UTF-32BE BOM
+-->
 <!-- this one is redundant with the one above
 <tr>
 <td>FF FE 00 00
@@ -31055,8 +31057,6 @@
 <p>...then the sniffed type of the resource is "text/plain".</p>
- <p class="big-issue">Should we remove UTF-32 from the above?</p>
-
 </li>
 <li><p>Otherwise, if any of the first <var title="">n</var> bytes
@@ -39803,6 +39803,11 @@
 <p>Support for UTF-32 is not recommended. This encoding is rarely
 used, and frequently misimplemented.</p>
+ <p class="note">This specification does not make any attempt to
+ support UTF-32 in its algorithms; support and use of UTF-32 can thus
+ lead to unexpected behavior in implementations of this
+ specification.</p>
+
 <h5>Preprocessing the input stream</h5>
@@ -39886,7 +39891,8 @@
 <ol>
- <li>If the new encoding is UTF-16, change it to UTF-8.</li>
+ <li>If the new encoding is a UTF-16 encoding, change it to
+ UTF-8.</li>
 <li>If the new encoding is identical or equivalent to the encoding
 that is already being used to interpret the input stream, then set
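
For illustration only, the behaviour that results from the patch above can be sketched in Python. This is a rough sketch and not part of the specification or the patch: the names `BOM_TABLE`, `sniff_bom` and `adjust_new_encoding` are hypothetical, the table contains only the two rows that stay active in the hunk shown (any rows outside the hunk are not reproduced), and the set of labels treated as "a UTF-16 encoding" is an assumption.

```python
from typing import Optional

# Rows left active by revision 1701 in the hunk shown above.
# The UTF-32 rows are commented out by the patch, so they are absent here.
# Assumption: rows outside the quoted hunk are not reproduced.
BOM_TABLE = [
    (b"\xfe\xff", "UTF-16BE"),
    (b"\xff\xfe", "UTF-16LE"),
]

# Assumption: the labels counted as "a UTF-16 encoding" for this sketch.
UTF16_LABELS = {"UTF-16", "UTF-16BE", "UTF-16LE"}


def sniff_bom(head: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if no row matches."""
    for bom, name in BOM_TABLE:
        if head.startswith(bom):
            return name
    return None


def adjust_new_encoding(new_encoding: str) -> str:
    """Sketch of the reworded step: any UTF-16 encoding is changed to UTF-8."""
    if new_encoding.upper() in UTF16_LABELS:
        return "UTF-8"
    return new_encoding
```

Under these assumptions, a stream beginning with FF FE 00 00 matches the UTF-16LE row, which is why the patch leaves that UTF-32 row marked as redundant, while a stream beginning with 00 00 FE FF no longer matches any row in this sketch.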