Ticket #151 (new defect)

Opened 4 months ago

SAX2 parser doesn't define standard entities

Reported by: candlerb
Owned by: ser
Priority: normal
Milestone:
Component: SAX2
Version: 3.1.7
Severity: normal
Keywords:
Cc:
Ruby version: 1.8.6
Operating system: Linux

Description

$ ruby -vrrexml/rexml -e 'p REXML::VERSION,PLATFORM'
ruby 1.8.6 (2008-03-03 patchlevel 114) [i686-linux]
"3.1.7.2"
"i686-linux"

See http://www.w3.org/TR/2006/REC-xml-20060816/#sec-predefined-ent

"(amp, lt, gt, apos, quot) ... All XML processors MUST recognize these entities whether they are declared or not."

However, the REXML SAX2 parser initialises @entities = {} and so does not recognise the mandatory ones:

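require 'rexml/parsers/sax2parser'

# Document that uses only the five predefined entities
source = <<EOS
<foo>
  Testing &amp; &lt; &gt; &apos; &quot;
</foo>
EOS

# Catch-all listener: prints every SAX2 event it receives
l = Object.new
def l.method_missing(*args)
  p args
end

parser = REXML::Parsers::SAX2Parser.new(source)
parser.listen(l)
parser.parse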

Result:

[:start_document]
[:start_element, nil, "foo", "foo", {}]
[:progress, 5]
[:characters, "\n  Testing &amp; &lt; &gt; &apos; &quot;\n"]
[:progress, 46]
[:end_element, nil, "foo", "foo"]
[:progress, 5]
[:characters, "\n"]
[:progress, 0]
[:end_document]

The application could initialise @entities itself, but there appears to be no accessor to do this, so you'd have to mess around with instance_variable_set or instance_eval.
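
For reference, a minimal sketch of that instance_variable_set workaround. It assumes the parser looks entity names up in @entities and substitutes the stored value as a plain string; that format is a guess based on the behaviour above, not a documented interface. PREDEFINED_ENTITIES, sax2_parser_with_predefined_entities and TextCollector are hypothetical names made up for this example.

```ruby
require 'rexml/parsers/sax2parser'
require 'rexml/sax2listener'

# The five predefined entities from the XML spec, as replacement strings.
# ASSUMPTION: @entities maps entity name => replacement string.
PREDEFINED_ENTITIES = {
  'amp'  => '&',
  'lt'   => '<',
  'gt'   => '>',
  'apos' => "'",
  'quot' => '"'
}

# Hypothetical helper: builds a SAX2 parser and pre-seeds its private
# @entities hash, because there is no public accessor for it.
def sax2_parser_with_predefined_entities(source)
  parser = REXML::Parsers::SAX2Parser.new(source)
  parser.instance_variable_set(:@entities, PREDEFINED_ENTITIES.dup)
  parser
end

# Listener that collects character data; SAX2Listener supplies no-op
# implementations of all the other callbacks.
class TextCollector
  include REXML::SAX2Listener

  def initialize
    @chunks = []
  end

  def characters(text)
    @chunks << text
  end

  def text
    @chunks.join
  end
end

collector = TextCollector.new
parser = sax2_parser_with_predefined_entities(
  "<foo>Testing &amp; &lt; &gt; &apos; &quot;</foo>"
)
parser.listen(collector)
parser.parse
puts collector.text
```

If the assumption holds, the characters event should carry the decoded text rather than the raw references. This reaches into parser internals, so it may stop working if the instance variable is renamed or its value format changes.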

Perhaps @entities should be initialised from DocType::DEFAULT_ENTITIES? This would of course break any application that depends on the current behaviour. See also #150.
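
To illustrate the compatibility concern, here is a small sketch of one way an application could break: if it already decodes the predefined entities itself in its characters handler, a parser that starts decoding them as well would cause the text to be decoded twice. PREDEFINED and decode_predefined are hypothetical stand-ins for such application code and are not part of REXML.

```ruby
# Hypothetical application-side decoder, written to compensate for the
# parser passing entity references through untouched.
PREDEFINED = {
  'amp' => '&', 'lt' => '<', 'gt' => '>', 'apos' => "'", 'quot' => '"'
}

def decode_predefined(text)
  # Single pass: each predefined reference is replaced exactly once.
  text.gsub(/&(amp|lt|gt|apos|quot);/) { PREDEFINED[$1] }
end

raw   = 'a &amp;lt; b'           # XML source for the literal text "a &lt; b"
once  = decode_predefined(raw)   # decoded by the application only
twice = decode_predefined(once)  # decoded by the parser and then the application

puts once   # a &lt; b
puts twice  # a < b
```

The second result differs from the intended text, which is the kind of silent change that applications relying on the current behaviour would see.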
