Friday, February 5, 2010

Hash.from_xml and Multiple Elements with Same Name

I've seen references to ActiveSupport's Hash.from_xml method as an easy way to parse XML into hashes. I assumed it would overwrite a key when the same element name appeared more than once in the same context. It doesn't, which is a good thing: repeated elements are collected into an array.

Some examples:

Hash.from_xml "<a><b><tags><tag>apple</tag><tag>orange</tag></tags></b></a>"
produces the hash:
{"a"=>{"b"=>{"tags"=>{"tag"=>["apple", "orange"]}}}}
and
Hash.from_xml "<a><b><a>1</a><b>apple</b><a>2</a><b>banana</b></b></a>"
produces the hash:
{"a"=>{"b"=>{"a"=>["1", "2"], "b"=>["apple", "banana"]}}}

Please note that well-designed XML should usually not use unencapsulated key/value pairs like:

<keypairs><key>key1</key><value>value1</value><key>key2</key><value>value2</value></keypairs>
I have had to consume XML like this very recently, though.
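If you do end up with XML like that, here is a minimal sketch of putting the pairs back together. It assumes from_xml returns the same shape as the second example above (one array of keys, one array of values, both in document order), and pairs_from is a hypothetical helper name, not part of ActiveSupport.

```ruby
# Hypothetical helper: rebuild key/value pairs from the parallel arrays
# found under "keypairs". Assumes "key" and "value" are both arrays of the
# same length and in document order.
def pairs_from(keypairs)
  Hash[keypairs["key"].zip(keypairs["value"])]
end

# Shape assumed for the <keypairs> document above, written out by hand so
# the sketch runs without ActiveSupport.
parsed = {
  "keypairs" => {
    "key"   => ["key1", "key2"],
    "value" => ["value1", "value2"]
  }
}

pairs = pairs_from(parsed["keypairs"])
puts pairs.inspect
```

This only works when keys and values strictly alternate and there are at least two pairs. With a single pair, both entries come back as plain strings rather than arrays, so zip would not apply.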

Note that XML attributes can be lost as noted in ticket #1598, but perhaps that will be fixed in a future version of ActiveSupport/Rails.

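If you get "NoMethodError: undefined method `from_xml' for Hash:Class", ActiveSupport has not been loaded. Rails loads it for you. If you're not using Rails, you need to:

require 'rubygems'
require 'active_support'
# On ActiveSupport 3 and later, the core extensions are no longer loaded by the line above, so also:
# require 'active_support/core_ext/hash/conversions'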
After playing with REXML and XmlSimple a little, this seems to be one of the easier ways to consume XML in Ruby/JRuby if you want to work with most of the data supplied in the XML (and if you either don't need XML attributes or ticket #1598 gets fixed). However, it isn't always that great. Depending on the XML, an element comes back as an array when it repeats but as a single string or hash when it appears only once, so you can end up with code full of respond_to? :each and respond_to? :keys checks. So it is not a solution for every case, but it sometimes provides a fairly simple one.
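One way to cut down on those respond_to? checks is to normalize a value into an array before iterating over it. The sketch below uses a hypothetical helper, as_list, and hand-written hashes in the shape I would expect from_xml to return for one <tag> versus two. Kernel#Array is not a drop-in replacement here, because it turns a hash into an array of key/value pairs.

```ruby
# Hypothetical helper: always return an array, whatever shape the parsed
# value has (missing, single string or hash, or already an array).
def as_list(value)
  case value
  when nil   then []
  when Array then value
  else [value]
  end
end

# Assumed shapes for <tags> holding one <tag> and two <tag> elements.
one  = { "tags" => { "tag" => "apple" } }
many = { "tags" => { "tag" => ["apple", "orange"] } }

[one, many].each do |doc|
  as_list(doc["tags"]["tag"]).each { |tag| puts tag }
end
```

If your version of ActiveSupport provides Array.wrap, it should behave the same way as this helper.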
