Beware of tests that use simulated data to validate product performance, researchers warn

10.11.2010
WALTHAM, MASS. -- Testing against simulated data may not be worth much, depending on how closely the simulation matches real-world data, researchers told attendees at an IEEE conference on homeland security.

Info-assurance products such as anomaly detectors check only a certain set of characteristics within the data, and if those characteristics aren’t accurately simulated, the results will lack validity, says John DeVale, a researcher at Johns Hopkins Applied Physics Lab who presented the research at the 2010 IEEE International Conference on Technologies for Homeland Security.

As a result, anomaly detectors tested against simulated data and found effective may prove ineffective when deployed in real networks monitoring real traffic, he says. They may actually have trouble operating in real-world environments or generate blizzards of false alarms, he says.

“Dealing with false alarms is manpower intensive,” DeVale says. “You can’t look at them all, so you want a detector to have a low false-alarm rate.”

The solution is to recognize what data subset the anomaly detector looks for and make sure that subset is accurately represented in the simulation, he says. “If assumptions are wrong, you get problems,” DeVale says. “It’s more difficult than people realize.”

If two different detectors seeking different data characteristics look at the same simulated data, the first might seem to work well because of a good matchup between what it seeks and what was represented in the data, he says. The second might not test well because it seeks characteristics not represented in the data.

The second detector might actually work better against real-world data, he says, but testers would be unable to tell from the experiments using simulated data. That could lead vendors, for example, to develop a product that doesn’t work well in the real world while abandoning work on the product that would have done well, DeVale says. “It might shut down a valid research area or you might get false confidence that the detector really works,” he says. “The second detector could be better, but you’d never know it because the test was flawed.”
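The two-detector scenario can be illustrated with a toy sketch in Python. Everything in it is a hypothetical assumption made for illustration, not DeVale's data or method: the "real" signal is a smooth sine wave, the "simulation" is the same values in shuffled order (so the overall value distribution is reproduced exactly but the smoothness over time is lost), and the two detectors, `value_alarms` and `jump_alarms`, are invented stand-ins with arbitrary thresholds.

```python
import math
import random

def value_alarms(series, limit):
    """Hypothetical detector 1: flags readings whose magnitude exceeds a limit."""
    return sum(1 for x in series if abs(x) > limit)

def jump_alarms(series, max_step):
    """Hypothetical detector 2: flags changes between consecutive readings
    that exceed a maximum step size."""
    return sum(1 for a, b in zip(series, series[1:]) if abs(b - a) > max_step)

# Assumed "real" data: a smooth, slowly varying signal with no anomalies.
real = [math.sin(2 * math.pi * t / 200) for t in range(1000)]

# Assumed "simulation": identical values, shuffled. The distribution of values
# matches the real data exactly, but the time ordering is not reproduced.
simulated = real[:]
random.Random(0).shuffle(simulated)

# Neither series contains an anomaly, so every alarm is a false alarm.
print("value detector, real:     ", value_alarms(real, 0.99))
print("value detector, simulated:", value_alarms(simulated, 0.99))
print("jump detector,  real:     ", jump_alarms(real, 0.1))
print("jump detector,  simulated:", jump_alarms(simulated, 0.1))
```

Under these assumptions the value-based detector raises the same number of alarms on both series, because the simulation captures the one characteristic it examines. The jump-based detector raises none on the real signal but many on the simulation, so a test run only on the simulated data would make it look far worse than it is.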

Designing tests that match salient characteristics of test data to the anomaly detection products being tested means more work. “It adds complexity to it, but that’s better than being blind to it,” DeVale says.

He presented research that used fairly simple and restricted data sets, one pertaining to altitude and speed of airplanes landing and one pertaining to commands sent to orbiting deep-space probes. The problem of simulation is much more complex if the data set is Internet traffic, he says. “Cyber data is harder to look at,” he says.
