Ticket #78 (closed defect: fixed)

Opened 2 years ago

Last modified 2 years ago

String element values that look like numbers are treated as numbers, causing XPath queries to fail

Reported by: jos@…
Owned by: ser
Priority: normal
Milestone: 3.1.6
Component: DOM
Version: 3.1.3
Severity: normal
Keywords:
Cc:
Ruby version: 1.8.4
Operating system: Unix

Description

The following code snippet demonstrates the problem:

require "rexml/document"
include REXML

doc = <<EOT
<root>
    <element>
        <tag>123</tag>
    </element>
    <element>
        <tag>123a</tag>
    </element>
</root>
EOT

xmlDoc = Document.new(doc)

["//element[tag='123']/tag", "//element[tag='123a']/tag"].each do |query|
  puts "BEGIN"
  XPath.each(xmlDoc, query) { |element|
    puts element.text
  }
  puts "END"
end

# According to http://b-cage.net/code/web/xpath-evaluator.html,
# this should print:
# BEGIN
# 123
# END
# BEGIN
# 123a
# END
# Instead it prints:
# BEGIN
# END
# BEGIN
# 123a
# END


Suggested patch:

--- functions.rb.orig   Wed Aug  2 12:24:24 2006
+++ functions.rb        Wed Aug  2 12:39:38 2006
@@ -330,8 +330,8 @@
       else
         str = string( object )
         #puts "STRING OF #{object.inspect} = #{str}"
-        if str =~ /^\d+/
-          object.to_s.to_f
+        if str =~ /^\d+$/
+          str.to_f
         else
           (0.0 / 0.0)
         end

It looks like str should be used instead of object: in this case object still carries the surrounding tags, whereas the string call returns the value inside the tags, which is what is wanted here.

I am not sure whether the $ is correct for numbers that contain a decimal point; maybe the other cases already handle those. Without the $, however, the fix is incomplete, because a value such as 123a still matches the pattern.
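To illustrate the point about the $, here is a minimal sketch that runs the two patterns from the patch against a few sample strings. The sample strings and the helper name accepted are made up for this illustration and are not part of REXML.

```ruby
# Patterns copied from the suggested patch; the samples are illustrative only.
unanchored = /^\d+/
anchored   = /^\d+$/
samples    = ["123", "123a", "1.5"]

# Hypothetical helper: return the samples a pattern treats as numeric.
def accepted(pattern, samples)
  samples.select { |s| s =~ pattern }
end

puts "unanchored accepts: #{accepted(unanchored, samples).inspect}"
puts "anchored accepts:   #{accepted(anchored, samples).inspect}"

# String#to_f reads a leading number and ignores whatever follows it.
puts "<tag>123</tag>".to_f   # prints 0.0, the markup hides the digits
puts "123a".to_f             # prints 123.0, the trailing "a" is dropped
```

The unanchored pattern accepts all three samples, while the anchored one accepts only "123". The anchored pattern also rejects "1.5", which is the decimal-point concern raised above.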

Change History

Changed 2 years ago by josb

It looks like ticket 60 partially addressed this, but the change submitted there doesn't fix this particular case. Amended patch against the trunk:

--- functions.rb.orig   Wed Aug  2 13:14:33 2006
+++ functions.rb        Wed Aug  2 13:10:57 2006
@@ -330,7 +330,7 @@
       else
         str = string( object )
         #puts "STRING OF #{object.inspect} = #{str}"
-        if str =~ /^-?\.?\d/
+        if str =~ /^-?\.?\d+$/
           str.to_f
         else
           (0.0 / 0.0)
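
As a rough check of what the amended pattern changes, the sketch below compares the trunk pattern with the amended one on a handful of strings. It looks at the regular expressions in isolation; the sample strings and the helper name split_by are assumptions for this illustration, and other branches of the real function may treat some of these inputs differently.

```ruby
# Patterns copied from the amended patch; the samples are illustrative only.
before  = /^-?\.?\d/
after   = /^-?\.?\d+$/
samples = ["123", "-5", ".5", "123a", "1.5", "1e3"]

# Hypothetical helper: split samples into [accepted, rejected] for a pattern.
def split_by(pattern, samples)
  samples.partition { |s| s =~ pattern }
end

ok, bad = split_by(before, samples)
puts "before: accepts #{ok.inspect}, rejects #{bad.inspect}"

ok, bad = split_by(after, samples)
puts "after:  accepts #{ok.inspect}, rejects #{bad.inspect}"
```

The trunk pattern accepts every sample, including "123a". The amended pattern accepts "123", "-5" and ".5" and rejects the rest, so on its own it also turns away "1.5".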

Changed 2 years ago by ser

  • status changed from new to assigned

Changed 2 years ago by ser

  • status changed from assigned to closed
  • resolution set to fixed

Fixed by changeset:1239.

NOTE: this also fixes what is technically another bug in REXML. REXML's XPath parser used to accept exponential notation in numbers. The XPath spec defines precisely what a number is, and scientific notation is not part of that definition, so REXML no longer accepts it.
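
For reference, the XPath 1.0 grammar defines Number as Digits ('.' Digits?)? | '.' Digits, and the number() function additionally allows a leading minus sign and surrounding whitespace. The following is a sketch of a string-to-number conversion that follows that definition. It is not the code from changeset:1239, and the name xpath_number is hypothetical.

```ruby
# Hypothetical helper, not a REXML method: convert a string the way the
# XPath 1.0 number() function describes, returning NaN for anything else.
def xpath_number(str)
  # Optional whitespace, optional minus, then the Number production:
  #   Digits ('.' Digits?)? | '.' Digits
  pattern = /\A[ \t\r\n]*-?(?:\d+(?:\.\d*)?|\.\d+)[ \t\r\n]*\z/
  str =~ pattern ? str.to_f : (0.0 / 0.0)
end

puts xpath_number("123")    # prints 123.0
puts xpath_number("1.5")    # prints 1.5
puts xpath_number("123a")   # prints NaN
puts xpath_number("1e3")    # prints NaN, exponents are not XPath numbers
puts "1e3".to_f             # prints 1000.0, Ruby itself accepts exponents
```

The sketch uses \A and \z rather than ^ and $ because in Ruby the latter match at line boundaries, not only at the start and end of the string.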
