Data Analysis & Simulation

How To Speed Up The Distribution Fitting Process?

Tuesday, December 30th, 2008

Since fitting probability distributions to large data sets can be a time-consuming task, we are currently researching the possibility of using multi-core processors to make EasyFit work faster. Over the past several years, the major processor manufacturers have been promoting multi-core technology in the desktop processor market. Multiple cores on a single chip allow for a better price/performance ratio on a range of tasks; however, existing software must be updated accordingly to take full advantage of this type of hardware.

We have modified the original distribution fitting algorithm to utilize all cores available on a system, and used it to fit distributions to a simulated set of 200,000 data points. In a series of tests on an Intel dual-core processor, the new algorithm ran almost twice as fast as the version currently used in EasyFit, a performance increase of up to 90%. These are very good results, and we will definitely be including this feature in the next release of EasyFit.
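EasyFit's internal fitting algorithm is not public, so the sketch below only illustrates the general idea behind this kind of parallelization: fitting each candidate distribution to the data is an independent unit of work, so the candidates can be fitted in separate worker processes and scored side by side. The candidate set, the closed-form maximum-likelihood fits, and the Kolmogorov-Smirnov scoring are illustrative choices, not EasyFit's actual method, and the example uses only the Python standard library.

```python
# Sketch: fit several candidate distributions in parallel worker
# processes and pick the one with the smallest Kolmogorov-Smirnov
# (KS) distance to the empirical CDF. Illustrative only.
import math
import random
from concurrent.futures import ProcessPoolExecutor

def ks_statistic(sample, cdf):
    # KS distance between the empirical CDF and a fitted CDF
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )

def fit_candidate(args):
    # One independent unit of work: fit a single candidate
    # distribution by closed-form MLE and score the fit.
    name, sample = args
    n = len(sample)
    if name == "normal":
        mu = sum(sample) / n
        sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / n)
        cdf = lambda x: 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))
    elif name == "exponential":
        rate = n / sum(sample)
        cdf = lambda x: 1 - math.exp(-rate * x) if x > 0 else 0.0
    else:
        raise ValueError(f"unknown candidate: {name}")
    return name, ks_statistic(sample, cdf)

def best_fit_parallel(sample, candidates=("normal", "exponential")):
    # Each candidate is fitted in its own process, so a dual-core
    # CPU can work on two candidates at once.
    with ProcessPoolExecutor() as pool:
        results = pool.map(fit_candidate, [(c, sample) for c in candidates])
    return min(results, key=lambda r: r[1])

if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.expovariate(0.5) for _ in range(5000)]
    print(best_fit_parallel(data))
```

On exponentially distributed input the exponential candidate should win with a small KS score. With only a handful of candidates the process-per-candidate split shown here is the simplest decomposition; with a single expensive fit, one would instead parallelize inside the optimization loop.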

On a related note, last week we were contacted by a customer regarding our upcoming Simulation & Probabilistic Analysis SDK. They need to analyze large volumes of data, and from their description of the problem we estimated that a typical analysis would take up to 20 hours on a modern PC. With the new distribution fitting algorithm, the same analysis should take less than 12 hours on a dual-core CPU, and even less on the quad-core processors popular in the server space. In a decision-making environment where several hours can mean the difference between profit and loss, this is a very important improvement.
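The time estimates above can be checked with Amdahl's law. Treating the reported dual-core result (almost 2x, i.e. a speedup of about 1.9) as the measured speedup, one can back out the parallelizable fraction of the workload and project the runtime on more cores. The 1.9x figure is taken from our tests above; everything else is straightforward arithmetic.

```python
# Project multi-core runtimes from one measured speedup via Amdahl's law.

def amdahl_speedup(p, cores):
    # Speedup when a fraction p of the work parallelizes over `cores` cores
    return 1.0 / ((1.0 - p) + p / cores)

def parallel_fraction(observed_speedup, cores):
    # Invert Amdahl's law to estimate p from a measured speedup
    return (cores / (cores - 1.0)) * (1.0 - 1.0 / observed_speedup)

p = parallel_fraction(1.9, 2)          # ~0.95 from the dual-core tests
dual = 20.0 / amdahl_speedup(p, 2)     # ~10.5 hours, under the 12-hour estimate
quad = 20.0 / amdahl_speedup(p, 4)     # ~5.8 hours on a quad-core CPU
print(round(p, 3), round(dual, 1), round(quad, 1))
```

The dual-core projection of roughly 10.5 hours is consistent with the "less than 12 hours" estimate quoted to the customer, which leaves some margin for overhead.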
