Digital Reef aims for data classification scalability
By Beth Pariseau | Mar 10, 2009
Digital Reef Inc. has introduced its self-named data classification and unstructured data management software, which the company claims offers a more scalable approach than previous enterprise offerings.
For scalability, Digital Reef has broken its software into three layers. Each layer can run on clusters of hosts, according to Brian Giuffrida, Digital Reef's vice president of marketing and business development. The access tier can be tunneled through a firewall if necessary; the service tier acts as a job router for requests that come into the system; and the analytics tier processes data. Within the analytics tier, the software also checks large jobs to see whether they need to be load-balanced or restarted.
"It scales to the needs of a true enterprise data store," Giuffrida said.
Giuffrida claims Digital Reef's data classification and search algorithms are more advanced than previous generations of classification products. The product contains a "similarity engine" that can identify overall contextual similarity of files in a repository rather than simply matching keywords.
"So an oil-exploration company investigating a failure mechanism on a particular offshore oil rig could find solutions to the problem written on other rigs where the components are named differently," Giuffrida said.
The similarity engine also allows the identification of duplicate or near-duplicate files without depending on specific keywords.
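Digital Reef has not disclosed how its similarity engine works. As a rough illustration of what keyword-independent near-duplicate detection can look like in general, the sketch below compares files by overlapping word sequences ("shingles") and scores each pair with Jaccard similarity. This is a common textbook technique, not the vendor's algorithm; the function names, the shingle size and the 0.5 threshold are all assumptions chosen for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word sequences in text (hypothetical helper)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.5, k=3):
    """Return (name, name, score) for every pair scoring at or above threshold.

    docs maps a file name to its text. The threshold is an assumed value,
    not one taken from any product.
    """
    sets = {name: shingles(text, k) for name, text in docs.items()}
    names = sorted(sets)
    pairs = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            score = jaccard(sets[x], sets[y])
            if score >= threshold:
                pairs.append((x, y, score))
    return pairs


if __name__ == "__main__":
    docs = {
        "a": "the pump on the rig failed during the night shift",
        "b": "the pump on the rig failed during the day shift",
        "c": "quarterly sales figures rose sharply in europe",
    }
    for x, y, score in near_duplicates(docs):
        print(x, y, round(score, 2))
```

A comparison like this needs no predefined keyword list, but it catches only surface overlap. It would not match documents that describe the same problem in different vocabulary, which is the contextual similarity Giuffrida says the product provides.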
Giuffrida said the software could be used for business intelligence and data mining by identifying patterns on its own, rather than relying on users to manually classify data or train artificial intelligence to do so.
"We want to show customers what they don't know they don't know about their file repositories," he said.
The software moves data only if users or policies require it to do so. It doesn't perform automated migration, although it can initiate data migration processes for litigation holds if necessary. Giuffrida said Digital Reef plans to take the engine a step further in releases later this year, with automated tiered storage features and classification of multimedia files on the roadmap.
Arun Taneja, founder and consulting analyst at Hopkinton, Mass.-based Taneja Group, said the technology looks promising on paper, but he has seen similar claims from data classification vendors before.
"Scalability has absolutely, without question, been an issue for data classification products," he said. "The proof of the pudding is in the eating—scalability has been a hallmark for all of these products when they were first presented to me."
Taneja said scalability issues often come to light only after enterprise customers expose a new product like this to real-world environments.








