Dems score with better data

15.11.2006

Netezza, which makes the technology used by the DNC, is part of a new generation of data warehousing companies that use commodity hardware and software, such as Seagate hard drives, Intel processors, and hardened Linux operating systems, to create low-cost, fast data warehouse appliances, according to Donald Feinberg of Gartner.

Like incumbent data warehouse players such as Teradata (part of NCR), Netezza uses distributed database intelligence, in which data filtering, processing, and analysis are done on the same device that stores the data.

"They have code running on the hard drive, so you can parallelize the queries and do them as fast as you can lift the data off the hard drive. Fundamentally it results in a two order of magnitude improvement in speed," said Rich Zimmerman, IISi's CTO.
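The idea Zimmerman describes, filtering data where it is stored and running the same query across many drives at once, can be illustrated with a minimal Python sketch. This is not Netezza's implementation: the function names, the thread pool, and the sample voter records are all assumptions made for illustration, and in-memory lists stand in for the individual drives.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(rows, predicate):
    """Filter rows where they are stored; only matching rows leave the 'drive'."""
    return [row for row in rows if predicate(row)]


def parallel_query(partitions, predicate):
    """Send the same predicate to every partition in parallel and merge the results."""
    workers = max(1, len(partitions))  # the pool needs at least one worker
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in partition order
        results = pool.map(lambda part: scan_partition(part, predicate), partitions)
        return [row for part in results for row in part]


# Hypothetical voter records spread across three storage units
partitions = [
    [{"id": 1, "state": "OH"}, {"id": 2, "state": "PA"}],
    [{"id": 3, "state": "OH"}, {"id": 4, "state": "FL"}],
    [{"id": 5, "state": "PA"}, {"id": 6, "state": "OH"}],
]

ohio = parallel_query(partitions, lambda row: row["state"] == "OH")
```

Because each partition is filtered before anything is merged, the coordinating step only handles rows that already match, which is the property the appliance approach relies on.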

Parallelizing database queries is nothing new. What is new, said Feinberg, is running parallel queries on inexpensive hardware and software, such as Linux and PostgreSQL, while matching what high-end vendors like Teradata can offer. Appliance-based products like Netezza's Performance Server are also easier to maintain, requiring fewer staff and keeping the cost of implementing and running the data warehouse low, he said.

The DNC's data warehouse project grew out of an effort to improve on the organization's 2004 voter targeting project, which was roundly criticized for providing state-level organizations with inaccurate data in a close race against a well-organized Republican opposition.