| markn@ieee.org 2007-01-30, 6:56 pm |
| I get a fair number of emails asking about compressed databases.
Random access of compressed data is interesting, and there are a few
papers and sample programs on the web, but not a lot.
I saw this article and it looks interesting. Does anyone know more about
the approach this company is using?
http://www.techworld.com/features/i...&FeatureID=3137
Ninety percent data compression at VW - and it's not de-dupe
Structured data - there's the rub
Chris Mellor January 30, 07
It's possible to get a 90 percent reduction in the space taken up by
structured data - without using de-duplication technology.
[blah blah blah]
SAND's compression technology
SAND's compression rate is about 85-90 percent. It uses column-based
data compression technology that allows it to store relational data in
what is essentially a pre-indexed format, alleviating the requirement
for storing or building indexes at restore time. This by itself
significantly reduces the overall storage needed for a SAND database.
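
(To make the "pre-indexed format" idea concrete, here is a minimal Python sketch. The article does not say how SAND actually lays out its data, so this is only one plausible reading: each column is stored as a map from distinct value to the row ids that hold it, so the stored form doubles as the index. The names encode_column, rows_equal and decode_column are made up for illustration.)

```python
from collections import defaultdict


def encode_column(values):
    """Store a column as {distinct value: [row ids]} plus the row count.

    Assumption: this inverted layout is a stand-in for a "pre-indexed"
    column format, not SAND's documented design.
    """
    postings = defaultdict(list)
    for row_id, value in enumerate(values):
        postings[value].append(row_id)
    return dict(postings), len(values)


def rows_equal(encoded, value):
    """Answer `WHERE col = value` straight from the stored form."""
    postings, _ = encoded
    return postings.get(value, [])


def decode_column(encoded):
    """Rebuild the original column, e.g. for a full restore."""
    postings, row_count = encoded
    column = [None] * row_count
    for value, row_ids in postings.items():
        for row_id in row_ids:
            column[row_id] = value
    return column


colours = ["red", "blue", "red", "red", "green", "blue"]
encoded = encode_column(colours)
red_rows = rows_equal(encoded, "red")
```

(Each distinct value is stored once and no separate index has to be built after a restore. A real system would presumably also compress the row-id lists, for example with bitmaps or delta coding, which this sketch leaves out.)
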
Column-based storage also significantly improves data compression:
each column of data, being made up of a single data type, can be
compressed much more efficiently than rows of data that are by
definition made up of many different data types. SAND can select the
best optimized compression strategy for each data type, and thus
further reduce the data footprint.
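
(A rough sketch of what "select the best compression strategy for each data type" could look like. The two encoders, delta coding for integers and run-length coding for everything else, are my own picks for illustration; the article does not name the schemes SAND uses, and compress_column / decompress_column are hypothetical helpers.)

```python
def rle_encode(values):
    """Run-length encode: collapse repeats into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def rle_decode(runs):
    return [value for value, count in runs for _ in range(count)]


def delta_encode(numbers):
    """Keep the first number, then only the difference to the previous one."""
    if not numbers:
        return []
    return [numbers[0]] + [b - a for a, b in zip(numbers, numbers[1:])]


def delta_decode(deltas):
    numbers, total = [], 0
    for delta in deltas:
        total += delta
        numbers.append(total)
    return numbers


def compress_column(values):
    """Pick an encoder from the column's type (an illustrative rule only)."""
    if all(isinstance(v, int) for v in values):
        return ("delta", delta_encode(values))
    return ("rle", rle_encode(values))


def decompress_column(packed):
    scheme, payload = packed
    return delta_decode(payload) if scheme == "delta" else rle_decode(payload)


table = {
    "order_id": [1001, 1002, 1003, 1004],
    "status": ["open", "open", "open", "shipped"],
}
packed = {name: compress_column(column) for name, column in table.items()}
```

(Because each column holds a single type, the sequential ids shrink to small deltas and the repetitive status values shrink to a couple of runs. Interleaved in row order, neither pattern would be visible to the encoder.)
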
Column-based storage also allows SAND DNA to more rapidly process
archival queries: reporting tools can either directly query a SAND
repository using the subset of the ANSI SQL language currently
supported, or the necessary data can be rapidly restored to an
operational data store and queried using the full complement of SQL
commands. This is in contrast to the majority of archiving systems
that only allow access to summary data unless a full database
restoration process has been undertaken.
|
| Mark Nelson - http://marknelson.us