Ticket #151 (new defect)
Opened 4 months ago
SAX2 parser doesn't define standard entities
| Reported by: | candlerb | Owned by: | ser |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | SAX2 | Version: | 3.1.7 |
| Severity: | normal | Keywords: | |
| Cc: | | Ruby version: | 1.8.6 |
| Operating system: | Linux | | |
Description
$ ruby -vrrexml/rexml -e 'p REXML::VERSION,PLATFORM'
ruby 1.8.6 (2008-03-03 patchlevel 114) [i686-linux]
"3.1.7.2"
"i686-linux"
See http://www.w3.org/TR/2006/REC-xml-20060816/#sec-predefined-ent
"(amp, lt, gt, apos, quot) ... All XML processors MUST recognize these entities whether they are declared or not."
However, the REXML SAX2 parser initialises @entities = {}, and does not recognise the mandatory ones:
require 'rexml/parsers/sax2parser'
source = <<EOS
<foo>
  Testing &amp; &lt; &gt; &apos; &quot;
</foo>
EOS
l = Object.new
def l.method_missing(*args)
  p args
end
p = REXML::Parsers::SAX2Parser.new(source)
p.listen(l)
p.parse
Result:
[:start_document]
[:start_element, nil, "foo", "foo", {}]
[:progress, 5]
[:characters, "\n  Testing &amp; &lt; &gt; &apos; &quot;\n"]
[:progress, 46]
[:end_element, nil, "foo", "foo"]
[:progress, 5]
[:characters, "\n"]
[:progress, 0]
[:end_document]
The application could initialise @entities itself, but there appears to be no accessor to do this, so you'd have to mess around with instance_variable_set or instance_eval.
Perhaps @entities should be initialised from DocType::DEFAULT_ENTITIES? This would of course break all applications which depend on the current behaviour. See also #150
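
To make the suggestion concrete, here is a minimal standalone sketch of what seeding the entity table with the five predefined entities could look like. It does not use or patch REXML. The names PREDEFINED_ENTITIES and expand_entities are hypothetical, invented for illustration, and the expansion logic is an assumption about a reasonable approach rather than a description of what the SAX2 parser does internally.

```ruby
# Hypothetical sketch, not REXML's actual implementation.
# The five entities that every XML processor must recognise,
# whether or not the document declares them.
PREDEFINED_ENTITIES = {
  'amp'  => '&',
  'lt'   => '<',
  'gt'   => '>',
  'apos' => "'",
  'quot' => '"'
}.freeze

# Expand named entity references in text using a Hash of
# name => replacement string. References whose name is not in
# the table are left untouched. gsub makes a single pass, so the
# output of one expansion is never scanned again
# (for example "&amp;lt;" becomes "&lt;", not "<").
def expand_entities(text, entities = PREDEFINED_ENTITIES)
  text.gsub(/&([A-Za-z_][\w.-]*);/) do |ref|
    entities.fetch(Regexp.last_match(1), ref)
  end
end

raw = "\n  Testing &amp; &lt; &gt; &apos; &quot;\n"

# With the predefined table, the references are resolved.
p expand_entities(raw)

# With an empty table (the current @entities = {} situation),
# the references pass through unchanged.
p expand_entities(raw, {})
```

Passing the table as a parameter keeps the second call equivalent to the behaviour reported above, which makes it easy to compare the two side by side.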
