SAX-based XPath
Gary Shea

2005-08-30, 7:02 pm

I'm posting this in hope of getting some API suggestions.

I'm building a native stream-based Ruby XPath processor (or whatever it
would be called) in order to parse some gigabyte-scale XML files at
work. It will accept multiple XPath expressions and output events (SAX
for now) matching the union of the XPath expressions.

It currently only works with absolute, non-wildcarded, predicate-less
default-axis XPath expressions:

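require 'rexml/parsers/sax2parser'

# XmlFilter is the library under development; the filter forwards matching
# events to its listener.
filter = XmlFilter::XPathFilter.new
filter.listener = XmlFilter::RecordingListener.new
filter.xpath = '/a/b/c'

parser = REXML::Parsers::SAX2Parser.new(File.open('some_file_path.xml'))
parser.listen(filter)  # SAX2Parser#listen is a regular method, not a setter
parser.parse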
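
For readers wondering how a filter limited to absolute, non-wildcarded, predicate-less paths might work: one plausible approach is to keep a stack of the currently open element names and forward events only while inside an element whose path equals the target. The sketch below assumes that approach; it is not the actual XmlFilter code. PathStackFilter and EventRecorder are hypothetical names, the callback signatures mirror REXML's SAX2Listener, and the events are fed in by hand so the example runs without an XML file.

```ruby
# Hypothetical sketch: forwards SAX-style events only while the parser is
# inside an element whose absolute path equals the configured XPath.
class PathStackFilter
  attr_accessor :listener

  def initialize(xpath)
    # '/a/b/c' -> ['a', 'b', 'c']; only absolute child-axis paths are handled.
    @target = xpath.split('/').reject(&:empty?)
    @stack = []        # names of the currently open elements
    @match_depth = nil # stack depth at which the current match began
  end

  # Callback names and arguments mirror REXML's SAX2Listener.
  def start_element(uri, localname, qname, attributes)
    @stack.push(qname)
    @match_depth ||= @stack.size if @stack == @target
    listener.start_element(uri, localname, qname, attributes) if @match_depth
  end

  def characters(text)
    listener.characters(text) if @match_depth
  end

  def end_element(uri, localname, qname)
    listener.end_element(uri, localname, qname) if @match_depth
    @match_depth = nil if @match_depth == @stack.size
    @stack.pop
  end
end

# Minimal recorder standing in for the RecordingListener mentioned above.
class EventRecorder
  attr_reader :events

  def initialize
    @events = []
  end

  def start_element(_uri, _localname, qname, _attributes)
    @events << [:start, qname]
  end

  def characters(text)
    @events << [:text, text]
  end

  def end_element(_uri, _localname, qname)
    @events << [:end, qname]
  end
end

# Hand-fed events for: <a><b><c>hi<d/></c><x>no</x></b></a>
filter = PathStackFilter.new('/a/b/c')
filter.listener = EventRecorder.new
filter.start_element(nil, 'a', 'a', {})
filter.start_element(nil, 'b', 'b', {})
filter.start_element(nil, 'c', 'c', {})
filter.characters('hi')
filter.start_element(nil, 'd', 'd', {})
filter.end_element(nil, 'd', 'd')
filter.end_element(nil, 'c', 'c')
filter.start_element(nil, 'x', 'x', {})
filter.characters('no')
filter.end_element(nil, 'x', 'x')
filter.end_element(nil, 'b', 'b')
filter.end_element(nil, 'a', 'a')
recorded = filter.listener.events
```

Only the c element and everything nested inside it reach the recorder; the x sibling is dropped. Memory use depends on nesting depth rather than file size, which is the point of streaming.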

This interface needs to be extended a little to work with multiple XPath
expressions, maybe:

filter.xpath = ['/a/b/c', '/d/e/f']

Any suggestions for a more Ruby-esque way to do it?
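
One possible shape for this, sketched purely as an illustration: accept strings, arrays, or a mix, and keep the single-expression setter working by normalizing with Array(). XPathSet and its methods are hypothetical names, not part of XmlFilter.

```ruby
# Hypothetical configuration object; not the actual XmlFilter API.
class XPathSet
  attr_reader :xpaths

  def initialize(*xpaths)
    @xpaths = []
    add(*xpaths)
  end

  # Accepts strings, arrays of strings, or a mix of both.
  def add(*xpaths)
    @xpaths.concat(xpaths.flatten.map(&:to_s)).uniq!
    self # returning self allows chaining
  end
  alias << add

  # Keeps the original single-expression setter working:
  # Array('/a/b/c') and Array(['/a/b/c']) both give a one-element array.
  def xpath=(value)
    @xpaths = Array(value).map(&:to_s).uniq
  end
end

set = XPathSet.new('/a/b/c')
set << '/d/e/f'
set.add(['/g/h', '/a/b/c']) # the duplicate '/a/b/c' is dropped
```

With this shape, filter.xpath = '/a/b/c' and filter.xpath = ['/a/b/c', '/d/e/f'] would both be valid, so existing callers would not need to change.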

Gary




Robert Klemme

2005-08-31, 3:59 am

Gary Shea wrote:
> I'm posting this in hope of getting some API suggestions.
>
> I'm building a native stream-based Ruby XPath processor (or whatever
> it would be called) in order to parse some gigabyte-scale XML files at
> work. It will accept multiple XPath expressions and output events
> (SAX for now) matching the union of the XPath expressions.
>
> It currently only works with absolute, non-wildcarded, predicate-less
> default-axis XPath expressions:
>
> filter = XmlFilter::XPathFilter.new
> filter.listener = XmlFilter::RecordingListener.new
> filter.xpath = '/a/b/c'
>
> parser = REXML::Parsers::SAX2Parser.new(File.open('some_file_path.xml'))
> parser.listen(filter)
> parser.parse
>
> This interface needs to be extended a little to work with multiple
> XPath expressions, maybe:
>
> filter.xpath = ['/a/b/c', '/d/e/f']
>
> Any suggestions for a more Ruby-esque way to do it?


It seems you could simplify the interface a bit (or add a method) along
the lines of REXML so you can do

File.open('some_file_path.xml') do |io|
  XmlFilter::XPathFilter.parse(io, '/a/b/c', '/d/e/f') do |event, filter|
    # process event
  end
end

I'm unsure about the "filter" block parameter, but it might be useful to
know the matching filter criterion. What do you think?
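
To make the suggestion concrete, here is a rough sketch of how such a block-based entry point could be layered on REXML's SAX2Parser. Everything under BlockFilterSketch is hypothetical, including the decision to yield the matching XPath string as the second block argument and to represent events as small arrays. It assumes the same restrictions as above (absolute, child-axis paths only), and the rexml library must be installed.

```ruby
require 'rexml/parsers/sax2parser'
require 'rexml/sax2listener'
require 'stringio'

module BlockFilterSketch
  # Tracks the open-element path and yields events inside matching elements.
  class Listener
    include REXML::SAX2Listener # no-op defaults for callbacks not overridden

    def initialize(xpaths, &block)
      # ['a', 'b', 'c'] => '/a/b/c', so a match is a single hash lookup.
      @targets = xpaths.to_h { |xp| [xp.split('/').reject(&:empty?), xp] }
      @stack = []
      @matched = nil # [xpath, depth] while inside a matching element
      @block = block
    end

    def start_element(_uri, _localname, qname, attributes)
      @stack.push(qname)
      xpath = @targets[@stack]
      @matched ||= [xpath, @stack.size] if xpath
      emit([:start_element, qname, attributes])
    end

    def characters(text)
      emit([:characters, text])
    end

    def end_element(_uri, _localname, qname)
      emit([:end_element, qname])
      @matched = nil if @matched && @matched[1] == @stack.size
      @stack.pop
    end

    private

    def emit(event)
      @block.call(event, @matched[0]) if @matched
    end
  end

  # Yields [event, matching_xpath] for the union of the given expressions.
  def self.parse(io, *xpaths, &block)
    parser = REXML::Parsers::SAX2Parser.new(io)
    parser.listen(Listener.new(xpaths, &block))
    parser.parse
  end
end

xml = '<a><b><c>one</c></b><d>two</d><x>skip</x></a>'
seen = []
BlockFilterSketch.parse(StringIO.new(xml), '/a/b/c', '/a/d') do |event, xpath|
  seen << [event, xpath]
end
```

Passing the matching expression to the block lets the caller dispatch on it, for example with a case statement, without having to track the element path a second time.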

Kind regards

robert



