Ticket #37 (new enhancement)
Encodings should be able to suggest a preference order for conversion.
| Reported by: | ser | Owned by: | ser |
|---|---|---|---|
| Priority: | normal | Milestone: | 3.1.8 |
| Component: | DOM | Version: | 3.1.2 |
| Severity: | normal | Keywords: | |
| Cc: | | Ruby version: | 1.8.2 |
| Operating system: | Linux | | |
Description
From Nobuyoshi Nakada:
Currently, REXML prefers the iconv module for converting encodings. Iconv can be useful for general purposes but, to be frank, is half-done, I guess. In many cases, particular conversion engines would be much preferable if available. Also, the nkf module, a bundled library, can already deal with Japanese characters in UTF-8 as well as in other encodings, so I'd like to give priority to nkf over uconv, which is not bundled.
And he provides a patch:
Index: ruby-ruby_1_8/lib/rexml/encodings/EUC-JP.rb
===================================================================
RCS file: /cvs/ruby/src/ruby/lib/rexml/encodings/EUC-JP.rb,v
retrieving revision 1.6.2.1
diff -U2 -p -r1.6.2.1 EUC-JP.rb
--- ruby-ruby_1_8/lib/rexml/encodings/EUC-JP.rb 19 May 2005 03:51:53 -0000 1.6.2.1
+++ ruby-ruby_1_8/lib/rexml/encodings/EUC-JP.rb 31 Oct 2005 04:31:52 -0000
@@ -1,12 +1,27 @@
-require 'uconv'
-
module REXML
module Encoding
- def decode_eucjp(str)
- Uconv::euctou8(str)
- end
+ begin
+ require 'uconv'
+
+ def decode_eucjp(str)
+ Uconv::euctou8(str)
+ end
+
+ def encode_eucjp content
+ Uconv::u8toeuc(content)
+ end
+ rescue LoadError
+ require 'nkf'
+
+ EUCTOU8 = '-Ewm0'
+ U8TOEUC = '-Wem0'
- def encode_eucjp content
- Uconv::u8toeuc(content)
+ def decode_eucjp(str)
+ NKF.nkf(EUCTOU8, str)
+ end
+
+ def encode_eucjp content
+ NKF.nkf(U8TOEUC, content)
+ end
end
Index: ruby-ruby_1_8/lib/rexml/encodings/SHIFT-JIS.rb
===================================================================
RCS file: /cvs/ruby/src/ruby/lib/rexml/encodings/SHIFT-JIS.rb,v
retrieving revision 1.2.2.3
diff -U2 -p -r1.2.2.3 SHIFT-JIS.rb
--- ruby-ruby_1_8/lib/rexml/encodings/SHIFT-JIS.rb 19 May 2005 10:08:11 -0000 1.2.2.3
+++ ruby-ruby_1_8/lib/rexml/encodings/SHIFT-JIS.rb 31 Oct 2005 04:31:52 -0000
@@ -1,12 +1,27 @@
-require 'uconv'
-
module REXML
module Encoding
- def decode_sjis content
- Uconv::sjistou8(content)
- end
+ begin
+ require 'uconv'
+
+ def decode_sjis content
+ Uconv::sjistou8(content)
+ end
+
+ def encode_sjis(str)
+ Uconv::u8tosjis(str)
+ end
+ rescue LoadError
+ require 'nkf'
+
+ SJISTOU8 = '-Swm0'
+ U8TOSJIS = '-Wsm0'
- def encode_sjis(str)
- Uconv::u8tosjis(str)
+ def decode_sjis(str)
+ NKF.nkf(SJISTOU8, str)
+ end
+
+ def encode_sjis content
+ NKF.nkf(U8TOSJIS, content)
+ end
end
Iconv is still preferable to the pure-Ruby encoding mechanisms, so this solution isn't acceptable. Each encoding needs to be able to try a series of encoding options based on library availability, and choose the best.
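One way to picture "try a series of options and choose the best" is a per-encoding list of candidate libraries, walked in preference order until one loads. The following is only a minimal sketch under stated assumptions: `EncodingPreference`, `first_available`, and `EUC_JP_CANDIDATES` are hypothetical names that do not exist in REXML, and the order shown (nkf, then uconv, then iconv) is illustrative rather than a decided policy. The conversion calls are the ones from the patch above, plus `Iconv.conv(to, from, str)`.

```ruby
# Hypothetical helper, not part of REXML: pick the first conversion backend
# whose library can be loaded, trying candidates in preference order.
module EncodingPreference
  # Each candidate is [library_name, decoder, encoder].
  # Returns the first candidate whose library loads, or nil if none do.
  def self.first_available(candidates)
    candidates.each do |candidate|
      begin
        require candidate[0]
        return candidate
      rescue LoadError
        next # library is not installed; fall through to the next preference
      end
    end
    nil
  end
end

# Preference order for EUC-JP, assumed here for illustration only.
# The lambdas refer to NKF, Uconv, and Iconv lazily, so they are only
# safe to call once the corresponding library has been loaded.
EUC_JP_CANDIDATES = [
  ['nkf',   lambda { |s| NKF.nkf('-Ewm0', s) },
            lambda { |s| NKF.nkf('-Wem0', s) }],
  ['uconv', lambda { |s| Uconv.euctou8(s) },
            lambda { |s| Uconv.u8toeuc(s) }],
  ['iconv', lambda { |s| Iconv.conv('UTF-8', 'EUC-JP', s) },
            lambda { |s| Iconv.conv('EUC-JP', 'UTF-8', s) }]
]

# nil here means no candidate library is installed, and the caller would
# have to fall back to some other mechanism (for example, pure Ruby).
chosen = EncodingPreference.first_available(EUC_JP_CANDIDATES)
```

With a table like this, changing the preference for an encoding would mean reordering its candidate list rather than nesting further `begin`/`rescue LoadError` blocks in each encoding file.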
