Can EasyFit be used to analyze time series data? We recently received this question from a customer, and to answer it we will shed some light on the differences between probabilistic analysis and time series analysis.
When dealing with time series data, your input is usually a set of (time, value) pairs representing consecutive measurements taken at equally spaced time intervals. The goal of time series analysis is to identify the nature of the process represented by your data, and to use that knowledge to forecast future values of the time series being analyzed.
A widespread application of such an analysis is weather forecasting: for more than a century, hundreds of weather stations around the world have recorded important parameters such as air temperature, wind speed, precipitation, and snowfall. Based on these data, scientists build models reflecting seasonal weather changes (depending on the time of year) as well as global trends – for example, the temperature change over the last 50 years. These models are used to provide weather forecasts for government and commercial organizations. In a typical forecast, the predicted values are not assigned a probability: “In May, the maximum daily air temperature is expected to be 22 degrees Celsius.”
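To make the idea of a point forecast concrete, here is a minimal sketch that fits a straight-line trend to a short series by ordinary least squares and extrapolates one step ahead. The data are made up for illustration, the helper name `fit_linear_trend` is hypothetical, and a linear trend is only an assumption; real time series models also handle seasonality and other effects.

```python
# Hypothetical (time, value) measurements at equally spaced intervals.
times = [0, 1, 2, 3, 4, 5]
values = [10.0, 12.1, 13.9, 16.2, 18.0, 19.8]


def fit_linear_trend(ts, ys):
    """Ordinary least-squares fit of ys = intercept + slope * ts."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return slope, intercept


slope, intercept = fit_linear_trend(times, values)

# Point forecast for the next time step: a single number, with no
# probability attached to it.
forecast = intercept + slope * 6
print(f"forecast for t=6: {forecast:.1f}")
```

Note that the time variable is essential here: the forecast depends on the order in which the values were observed.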
In contrast to predictions based on time series analysis, probabilistic analysis gives you not just a single forecast value, but a probabilistic model that accounts for uncertainty. In this scenario, you obtain a continuous range of values with probabilities assigned to them. Of course, for real-world applications it is more practical to deal with specific values, so probabilistic models are used to obtain predictions at fixed probability levels. Continuing the above example, a forecast might look like: “In May, the maximum daily air temperature will not exceed 22 degrees Celsius with 95% probability.”
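As a rough illustration of a prediction at a fixed probability level, the sketch below fits a normal distribution to a handful of temperature readings using Python's standard `statistics.NormalDist` and reads off the 95% level. The readings are invented, and the normal distribution is only an assumption made for simplicity; in practice you would compare several candidate distributions and pick the one that fits best.

```python
from statistics import NormalDist

# Hypothetical daily maximum temperatures in May, in degrees Celsius.
may_max_temps = [18.2, 19.5, 21.0, 17.8, 20.3, 22.1, 19.0, 20.8, 18.9, 21.4]

# Fit a normal distribution (assumed model) from the sample mean
# and sample standard deviation.
model = NormalDist.from_samples(may_max_temps)

# Value that the temperature stays below with 95% probability
# under the fitted model (the 95th percentile).
level_95 = model.inv_cdf(0.95)

# The reverse question: probability of staying below a given value.
prob_below_22 = model.cdf(22.0)

print(f"95% level: {level_95:.1f} degrees Celsius")
print(f"P(temperature < 22): {prob_below_22:.2f}")
```

The fitted model answers both kinds of questions: which value corresponds to a given probability, and which probability corresponds to a given value.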
So can distribution fitting be useful when analyzing time series data? The answer depends on the goals of your analysis, that is, on what kind of information you want to derive from your data. If you want to understand how the predicted values relate to their probabilities, you should fit distributions to your data (just keep in mind that in this case the “time” variable will be unused). On the other hand, if you need to identify seasonal patterns or global trends in your data, you should go with the “classical” time series analysis methods.
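The remark about the unused time variable can be checked with a small sketch: fitting a distribution to the values in their original order and to the same values in shuffled order gives the same model. The series below is invented, and the normal distribution is again just an assumed model for illustration.

```python
import random
from statistics import NormalDist

# Hypothetical (day, temperature) series.
series = [(1, 18.2), (2, 19.5), (3, 21.0), (4, 17.8), (5, 20.3),
          (6, 22.1), (7, 19.0), (8, 20.8), (9, 18.9), (10, 21.4)]

# Distribution fitting only looks at the values; the time stamps are dropped.
values_in_time_order = [value for _, value in series]

# Shuffle a copy with a fixed seed so the example is reproducible.
shuffled = values_in_time_order[:]
random.Random(0).shuffle(shuffled)

fit_ordered = NormalDist.from_samples(values_in_time_order)
fit_shuffled = NormalDist.from_samples(shuffled)

# Both fits have the same parameters (up to floating-point rounding).
print(fit_ordered.mean, fit_ordered.stdev)
print(fit_shuffled.mean, fit_shuffled.stdev)
```

This is exactly why distribution fitting cannot reveal seasonal patterns or trends: any information carried by the ordering of the measurements is discarded.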