Bugzilla – Bug 622
Reduce amount of NamedCharacters generated code
Last modified: 2009-08-07 17:10:01 CEST
The C++ code generated by GenerateNamedCharactersCpp.java contains four similar segments of 2100+ lines (initializing NAME_*, VALUE_*, NAMES, and VALUES). We can reduce the repetition between these segments by consolidating all the named character reference data into a single file that can be #included four times. This would make the structure of nsHtml5NamedCharacters.cpp easier to understand, and reduce C++ code size by many thousands of lines.
Created attachment 110
Generating and #including nsHtml5NamedCharactersInclude.h four times
Each line of nsHtml5NamedCharactersInclude.h is a macro call that expands according to how the #including file has defined NAMED_CHARACTER_REFERENCE. This file can be #included four different times with four different interpretations of NAMED_CHARACTER_REFERENCE.
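For readers unfamiliar with the technique, here is a minimal sketch of the idea, not the actual generated code. To keep it in one file, the data lines live in a hypothetical DEMO_NAMED_CHARACTER_DATA macro instead of a separate header; in the patch they sit in nsHtml5NamedCharactersInclude.h and each pass is an #include. The three-argument layout (index, name, code point), the three sample entries, the plain C types, and demo_lookup are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for the generated include file: one macro call per entity.
// Argument layout (index, name, code point) is assumed for this sketch.
#define DEMO_NAMED_CHARACTER_DATA \
  NAMED_CHARACTER_REFERENCE(0, "amp", 0x0026) \
  NAMED_CHARACTER_REFERENCE(1, "gt", 0x003E) \
  NAMED_CHARACTER_REFERENCE(2, "lt", 0x003C)

// Pass 1: emit one NAME_<index> constant per entry.
#define NAMED_CHARACTER_REFERENCE(N, NAME, VALUE) \
  static const char NAME_##N[] = NAME;
DEMO_NAMED_CHARACTER_DATA
#undef NAMED_CHARACTER_REFERENCE

// Pass 2: emit one VALUE_<index> constant per entry.
#define NAMED_CHARACTER_REFERENCE(N, NAME, VALUE) \
  static const unsigned int VALUE_##N = VALUE;
DEMO_NAMED_CHARACTER_DATA
#undef NAMED_CHARACTER_REFERENCE

// Pass 3: collect the name constants into the NAMES table.
#define NAMED_CHARACTER_REFERENCE(N, NAME, VALUE) NAME_##N,
static const char* const NAMES[] = { DEMO_NAMED_CHARACTER_DATA };
#undef NAMED_CHARACTER_REFERENCE

// Pass 4: collect the value constants into the VALUES table.
#define NAMED_CHARACTER_REFERENCE(N, NAME, VALUE) VALUE_##N,
static const unsigned int VALUES[] = { DEMO_NAMED_CHARACTER_DATA };
#undef NAMED_CHARACTER_REFERENCE

static const std::size_t DEMO_COUNT = sizeof(NAMES) / sizeof(NAMES[0]);

// Both tables come from the same data, so they cannot drift apart.
static_assert(sizeof(NAMES) / sizeof(NAMES[0]) ==
                  sizeof(VALUES) / sizeof(VALUES[0]),
              "NAMES and VALUES must have the same number of entries");

// Hypothetical helper: linear search returning the code point, or 0.
unsigned int demo_lookup(const char* name) {
  assert(name != nullptr);
  for (std::size_t i = 0; i < DEMO_COUNT; ++i) {
    if (std::strcmp(NAMES[i], name) == 0) {
      return VALUES[i];
    }
  }
  return 0;
}
```

With this layout, adding an entity means adding one line to the data, and all four expansions pick it up.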
According to my measurements, the generated-code savings are huge, a net reduction of about 14,900 lines:
~/dev/debug/parser/html % hg diff . | diffstat
nsHtml5NamedCharacters.cpp |17121 ----------------------------------------
nsHtml5NamedCharactersInclude.h | 2162 +++++
2 files changed, 2184 insertions(+), 17099 deletions(-)
The LINE_PATTERN-related changes were necessary because the HTML content of
seems to have changed a bit (no more newlines between <tr>s, slightly different spacing).
Created attachment 112
Updated to reflect changes from bug 623
This patch needs to be applied in translator-src with patch level -p1.
It should apply cleanly on top of http://bugzilla.validator.nu/attachment.cgi?id=111&action=edit
Checked in. Thanks!