Bugzilla – Bug 622
Reduce amount of NamedCharacters generated code
Last modified: 2009-08-07 17:10:01 CEST
The C++ code generated by GenerateNamedCharactersCpp.java contains four similar segments of 2100+ lines (initializing NAME_*, VALUE_*, NAMES, and VALUES). We can reduce the repetition between these segments by consolidating all the named character reference data into a single file that can be #included four times. This would make the structure of nsHtml5NamedCharacters.cpp easier to understand, and reduce C++ code size by many thousands of lines.
Created attachment 110: Generating and #including nsHtml5NamedCharactersInclude.h four times

Each line of nsHtml5NamedCharactersInclude.h is a macro call that expands according to how the #including file has defined NAMED_CHARACTER_REFERENCE. This file can be #included four different times with four different interpretations of NAMED_CHARACTER_REFERENCE.

According to my measurements, the generated-code savings are huge:

    ~/dev/debug/parser/html % hg diff . | diffstat
     nsHtml5NamedCharacters.cpp      |17121 ----------------------------------------
     nsHtml5NamedCharactersInclude.h | 2162 +++++
     2 files changed, 2184 insertions(+), 17099 deletions(-)

The LINE_PATTERN-related changes were necessary because the HTML content of http://www.w3.org/TR/html5/named-character-references.html seems to have changed a bit (no more newlines between <tr>s, slightly different spacing).
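For readers unfamiliar with the repeated-inclusion approach described in this comment, here is a minimal sketch of the idea, not the code from the attachment. To stay in one translation unit it uses a list macro (the hypothetical NAMED_CHARACTER_LIST) as a stand-in for #including nsHtml5NamedCharactersInclude.h. The three sample entries, the three-argument form of NAMED_CHARACTER_REFERENCE, the char16_t type, and the findNamedCharacter helper are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for nsHtml5NamedCharactersInclude.h: one macro call
// per named character reference (index, name, code unit). In the patch these
// lines live in a separate file that is #included once per pass.
#define NAMED_CHARACTER_LIST(ENTRY) \
  ENTRY(0, "amp;", 0x0026)          \
  ENTRY(1, "lt;", 0x003C)           \
  ENTRY(2, "gt;", 0x003E)

// Pass 1: define one NAME_n array per entry.
#define NAMED_CHARACTER_REFERENCE(N, NAME, VALUE) \
  static const char NAME_##N[] = NAME;
NAMED_CHARACTER_LIST(NAMED_CHARACTER_REFERENCE)
#undef NAMED_CHARACTER_REFERENCE

// Pass 2: define one VALUE_n array per entry.
#define NAMED_CHARACTER_REFERENCE(N, NAME, VALUE) \
  static const char16_t VALUE_##N[] = { VALUE };
NAMED_CHARACTER_LIST(NAMED_CHARACTER_REFERENCE)
#undef NAMED_CHARACTER_REFERENCE

// Pass 3: collect the NAME_n arrays into the NAMES table.
#define NAMED_CHARACTER_REFERENCE(N, NAME, VALUE) NAME_##N,
static const char* const NAMES[] = {
  NAMED_CHARACTER_LIST(NAMED_CHARACTER_REFERENCE)
};
#undef NAMED_CHARACTER_REFERENCE

// Pass 4: collect the VALUE_n arrays into the VALUES table.
#define NAMED_CHARACTER_REFERENCE(N, NAME, VALUE) VALUE_##N,
static const char16_t* const VALUES[] = {
  NAMED_CHARACTER_LIST(NAMED_CHARACTER_REFERENCE)
};
#undef NAMED_CHARACTER_REFERENCE

constexpr std::size_t NAMED_CHARACTER_COUNT = sizeof(NAMES) / sizeof(NAMES[0]);

// Both tables come from the same list, so they cannot drift apart.
static_assert(sizeof(VALUES) / sizeof(VALUES[0]) == NAMED_CHARACTER_COUNT,
              "NAMES and VALUES must have the same number of entries");

// Illustrative linear lookup (not part of the patch): returns the first code
// unit for a name, or 0 if the name is not in the table.
char16_t findNamedCharacter(const char* name) {
  assert(name != nullptr);
  for (std::size_t i = 0; i < NAMED_CHARACTER_COUNT; ++i) {
    if (std::strcmp(NAMES[i], name) == 0) {
      return VALUES[i][0];
    }
  }
  return 0;
}
```

Because every pass reads the same entry list, adding or changing a named character means touching one line instead of four separate generated segments.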
Created attachment 112: Updated to reflect changes from bug 623

This patch needs to be applied in translator-src with patch level -p1. It should apply cleanly on top of http://bugzilla.validator.nu/attachment.cgi?id=111&action=edit
Checked in. Thanks!