Optical transport networks are the backbone of modern communication systems. During the planning phase, not all information about the network is available, so it is very difficult for the network designer to estimate the capital expenditure (CAPEX) needed for the network. In such incomplete-information scenarios, statistical models are used for the estimates. Several statistical models have been proposed by researchers. However, there was no model for the link lengths… read the full case study
Have you ever been busy with your statistics homework, coursework, or a Ph.D. thesis requiring the use of probability laws, and wondered if there was an easy way to do the calculations?
Well, you can spend a few evenings at the nearest café with a cup of coffee and a cake, trying to do the math yourself.
Or, you can buy a monthly subscription license for EasyFit and have your calculations done in minutes.
And you will still have money left to enjoy a tall cup of Caffè Latte.
If you have been recently evaluating EasyFit and find it useful for your short or medium term projects, but cannot justify the upfront cost of the Perpetual License, today we have very good news for you: at just $1 a day or less, you can use the fully functional version of the product without the need to purchase the Perpetual License.
You can subscribe for a minimum of one month, and the monthly price drops as you subscribe for a longer term, going from around $0.80 a day for a three-month license, down to less than 55 cents a day when purchasing the annual license (see detailed pricing).
Note that the subscription fee is non-recurring: once your license expires, your credit card will not be charged, but you can still continue using the product to open your existing EasyFit project files and view the analysis results obtained during the subscription term.
Every year, a huge number of goods are traded internationally and declared at Customs. Accordingly, there are large incoming and outgoing message flows that need to be processed. At the Dutch Customs, this mostly takes place electronically. In today’s global economy, which runs 24 hours a day, 7 days a week, properly and continuously working systems have become fundamental. As such, the presence of constantly available automated systems used to handle declaration processes is highly important for Customs and businesses… read the full case study
Naturally, when dealing with a particular probability distribution that fits many of your data sets well, one day you will want to learn more about that distribution. What is so specific to it that it works for your data? And, moreover, how can you interpret the distribution parameters?
The good news is: for many probability distributions, the meaning of their parameters is described in the scientific literature. The classic example is the Normal distribution, which has two parameters: σ (scale) and μ (location). The parameterization of this distribution is pretty easy to understand: as you change the location parameter, the probability density graph moves along the x-axis, while changing the scale parameter affects how wide or narrow the graph is.
However, for quite a few distributions, modifying the parameters and observing how the graphs change will be of little help for your understanding of what those parameters indicate. One such distribution is the Lognormal model, defined as “a continuous probability distribution of a random variable whose logarithm is normally distributed.” What does that mean exactly? For better understanding, compare the CDFs of the Normal and Lognormal distributions:
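Normal distribution CDF: $F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$
Lognormal distribution CDF: $F(x) = \Phi\left(\frac{\ln x - \mu}{\sigma}\right)$
Here $\Phi$ denotes the CDF of the standard Normal distribution.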
As you can see, the Normal model is “included” in the Lognormal in such a way that in the Lognormal model, ln(x) has the Normal distribution with the same parameters (σ, μ) as the original Lognormal distribution. And this is the key point to understand: the parameters of the Lognormal model are not the “pure” scale and location (pretty intuitive in the Normal model), but rather the scale and location of the included Normal distribution.
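The relationship between the two CDFs can be checked numerically. The following is a minimal Python sketch using only the standard library; it is not EasyFit code, and the function names `normal_cdf` and `lognormal_cdf` and the parameter values are chosen here purely for illustration.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of the Normal distribution with location mu and scale sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def lognormal_cdf(x, mu, sigma):
    """CDF of the Lognormal distribution: the Normal CDF evaluated at ln(x)."""
    if x <= 0:
        return 0.0  # the Lognormal model is defined for positive x only
    return normal_cdf(math.log(x), mu, sigma)

# Illustrative parameter values (assumptions, not taken from any data set)
mu, sigma = 1.0, 0.5

# mu is the location of ln(x), so the median of x itself is exp(mu), not mu
median = math.exp(mu)
print("Lognormal CDF at exp(mu):", lognormal_cdf(median, mu, sigma))
print("Normal CDF at mu:        ", normal_cdf(mu, mu, sigma))
```

Both printed values should be one half: μ marks the center of ln(x), while the center of x itself sits at exp(μ).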
The same logic applies to the Gamma and Log-Gamma pair of distributions. The classical Gamma has two parameters, α (shape) and β (scale), and the Gamma CDF is as follows:
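Gamma distribution CDF: $F(x) = \frac{\gamma(\alpha,\, x/\beta)}{\Gamma(\alpha)}$, where $\gamma(\alpha, z) = \int_0^z t^{\alpha-1} e^{-t}\,dt$ is the lower incomplete Gamma function.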
The shape parameter indicates the form of the Gamma PDF graph, while the scale factor affects the spread of the curve. Similarly to the Gamma model, the Log-Gamma distribution has two parameters with the same names (α, β), but its CDF has the form:
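Log-Gamma distribution CDF: $F(x) = \frac{\gamma(\alpha,\, \ln(x)/\beta)}{\Gamma(\alpha)}$, where $\gamma(\alpha, z) = \int_0^z t^{\alpha-1} e^{-t}\,dt$ is the lower incomplete Gamma function.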
Just as in the Normal and Lognormal analogy, in the Log-Gamma model, ln(x) has the Gamma distribution with the same parameters (α, β). These cannot be treated as the “pure” Log-Gamma shape and scale; they are the shape and scale of the included Gamma model.
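The Gamma and Log-Gamma pair can be sketched the same way. The Python standard library has no incomplete Gamma function, so the sketch below computes it from its power series; this is an illustrative implementation chosen here, not the algorithm EasyFit uses, and all function names are hypothetical.

```python
import math

def regularized_lower_gamma(a, x, tol=1e-14, max_terms=10000):
    """P(a, x) = gamma(a, x) / Gamma(a), evaluated by the power series
    gamma(a, x) = x^a * e^(-x) * sum_{n>=0} x^n / (a * (a+1) * ... * (a+n)).
    Adequate for moderate x; not tuned for very large arguments."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def gamma_cdf(x, alpha, beta):
    """CDF of the Gamma distribution with shape alpha and scale beta."""
    return regularized_lower_gamma(alpha, x / beta)

def log_gamma_cdf(x, alpha, beta):
    """CDF of the Log-Gamma distribution: the Gamma CDF evaluated at ln(x)."""
    if x <= 1:
        return 0.0  # ln(x) must be positive for the Gamma model to apply
    return gamma_cdf(math.log(x), alpha, beta)

# Illustrative parameters: the two CDFs agree when y = ln(x)
alpha, beta = 2.0, 1.5
y = 3.0
print(gamma_cdf(y, alpha, beta), log_gamma_cdf(math.exp(y), alpha, beta))
```

With α = 1 the Gamma model reduces to the Exponential distribution, which gives a simple sanity check: `gamma_cdf(x, 1.0, beta)` should match `1 - exp(-x / beta)`.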
There are dozens of different probability distributions out there, and even if you use only a couple of them on a daily basis, sometimes it can be hard to remember the meaning of all the parameters. That is why we decided to include a little feature in EasyFit that helps you keep your memory fresh: when moving the mouse pointer over a distribution parameter edit box, EasyFit displays a pop-up hint indicating the meaning of that particular parameter:
Distribution parameter hint
Using this feature, you can better focus on your core analysis rather than the technical details like the ones outlined in this article.
Some time ago, we covered the use of probability distributions and related Excel worksheet functions available in EasyFitXL. When dealing with probability data in Excel, most of the time, you would use those functions to set up your calculations to be performed directly within your workbooks. This approach works well for applications where you need to perform typical probability analysis based on different input data: you modify the data, and Excel recalculates the entire worksheet and updates the associated results.
However, for more advanced applications, you might need to implement complex logic requiring many IF statements, which will make your worksheets too complicated. Of course, you can still use the IF worksheet function, but in practice you will want to keep your workbooks as simple as possible so that you can easily get back to your analysis in a month. That is where the built-in Visual Basic for Applications (VBA) programming language comes in handy: with little programming knowledge, using the VBA functions available in EasyFitXL as well as in the EasyFit SDK, you can create feature-rich probability analysis and Monte Carlo simulation applications implementing logic of any degree of complexity.
Even though both EasyFitXL and the SDK include a variety of VBA functions, these software packages differ in the feature sets they offer. Initially, EasyFitXL was designed as an Excel add-in that brings the visual distribution fitting feature of EasyFit to Excel. Of course, we could not ignore the integration and data analysis automation capabilities of Excel, so we came up with the following philosophy for EasyFitXL: visually fit distributions to data in Excel, and use the results in the most convenient way – either visually, in your worksheets, or in your VBA applications. That is why the VBA functions offered by EasyFitXL allow you to evaluate the most common distribution functions (PDF, CDF etc.), calculate distribution statistics (mean, variance…), and generate random numbers from any probability distribution you choose as the model for your data.
On the other hand, the Simulation & Probabilistic Analysis SDK was designed from the ground up as the package targeting software developers and offering a complete range of functions covering the entire feature set of EasyFit. Apart from evaluating distribution functions, calculating statistics and generating random numbers, you can do distribution fitting, perform goodness of fit tests, and even create distribution graphs – all directly from your VBA applications.
Another major difference is that, technically, the SDK offers its functionality through a set of Objects, enabling you to use the object-oriented approach to software development, which makes your work with large projects more efficient. In contrast, EasyFitXL employs a function-based (procedural) model, offering a separate VBA function for each kind of distribution function and each probability distribution, which is good for short and simple programs.
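The difference between the two styles can be illustrated with a short sketch. It is written in Python rather than VBA, and every name in it (`exponential_cdf`, `exponential_mean`, `Exponential`) is hypothetical: it does not reproduce the actual EasyFitXL functions or SDK objects, only the general shape of each approach.

```python
import math
from dataclasses import dataclass

# Style 1: function-based. One standalone function per distribution and
# per quantity; the parameters are passed in on every call.
def exponential_cdf(x, lam):
    return 1.0 - math.exp(-lam * x) if x > 0 else 0.0

def exponential_mean(lam):
    return 1.0 / lam

# Style 2: object-based. One object per distribution that stores the
# parameters once and exposes the quantities as methods.
@dataclass
class Exponential:
    lam: float

    def cdf(self, x):
        return 1.0 - math.exp(-self.lam * x) if x > 0 else 0.0

    def mean(self):
        return 1.0 / self.lam

# Both styles give the same numbers; they differ in how the code is organized.
print(exponential_cdf(2.0, 0.5), exponential_mean(0.5))
model = Exponential(lam=0.5)
print(model.cdf(2.0), model.mean())
```

In a short script the first style is less to type. In a larger program the second style lets a fitted model be passed around as a single value instead of a loose set of parameters.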
Overall, depending on your needs, you can use either EasyFitXL or the SDK to implement any kind of data analysis application, ranging from simple probability calculation programs to complex automated data analysis and Monte Carlo simulation systems.
Can EasyFit be used to analyze time series data? To answer this question, which we recently received from a customer, we will shed some light on the differences between probabilistic analysis and time series analysis.
When dealing with time series data, you usually have as an input a set of (time, value) data pairs indicating the consecutive measurements taken at equally spaced time intervals. The goal of time series analysis is to identify the nature of the process represented by your data, and use it to forecast the future values of the time series being analyzed.
A widespread application of such an analysis is weather forecasting: for more than a century, hundreds of weather stations around the world have recorded various important parameters such as air temperature, wind speed, precipitation, and snowfall. Based on these data, scientists build models reflecting seasonal weather changes (depending on the time of the year) as well as global trends – for example, temperature change during the last 50 years. These models are used to provide weather forecasts for government and commercial organizations. In a typical forecast, the predicted values are not assigned a probability: “In May, the maximum daily air temperature is expected to be 22 degrees Celsius.”
In contrast to the predictions based on time series analysis, when performing probabilistic analysis, you get not just a single value as a forecast, but a probabilistic model that accounts for uncertainty. In this scenario, you would obtain a continuous range of values with their associated probabilities. Of course, for real world applications, it is more practical to deal with specific values, so the probabilistic models are used to obtain predictions at fixed probability levels. Considering the above example, a forecast might look like: “In May, the maximum daily air temperature will not exceed 22 degrees Celsius with 95% probability.”
So can distribution fitting be useful when analyzing time series data? The answer depends on the goals of your analysis – i.e. what kind of information you want to derive from your data. If you want to understand the connection between the predicted values and the probability, you should fit distributions to your data (just keep in mind that in this case the “time” variable will be unused). On the other hand, if you need to identify seasonal patterns or global trends in your data, you should go with the “classical” time series analysis methods.
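As a rough illustration of a prediction at a fixed probability level, the Python sketch below fits a Normal model to a handful of made-up temperature readings and reads off the value that the temperature stays below with 95% probability. The data are hypothetical, and the Normal model is an assumption made only to keep the example short; in practice you would choose the distribution that fits your data best.

```python
from statistics import NormalDist

# Hypothetical maximum daily temperatures in degrees Celsius.
# Only the values are used; the dates of the measurements are ignored.
temps = [18.5, 21.0, 19.2, 22.4, 20.1, 23.0, 19.8, 21.7, 20.6, 22.1]

# Fit a Normal model from the sample mean and standard deviation
model = NormalDist.from_samples(temps)

# The 95% quantile: the temperature is below this value with 95% probability
level_95 = model.inv_cdf(0.95)

print(f"mean = {model.mean:.2f}, stdev = {model.stdev:.2f}")
print(f"95% of maximum daily temperatures are expected below {level_95:.2f}")
```

Note that the order of the readings plays no role here, which is exactly why this kind of analysis cannot reveal seasonal patterns or trends.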
From time to time, we receive emails from our customers asking what parameter estimation methods are implemented in EasyFit to carry out distribution fitting. When designing EasyFit, we were striving for a good balance between the accuracy and speed of calculations. That is why we decided to use the Method of Moments (MOM) for those models that allow for easy use of this method. Some examples of such distributions include the Chi-Squared, Exponential, two-parameter Gamma, and Logistic models. However, for many other distributions, the Method of Moments does not yield closed form expressions for parameter estimates, and in such cases EasyFit uses the Maximum Likelihood Estimation (MLE) method. In addition, for some distributions used in specific industries, such as the Wakeby model, EasyFit employs the Method of L-Moments (LMOM). You can find a detailed list of supported distributions and estimation methods used on our website.
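To show why the Method of Moments is fast for such models, here is a minimal Python sketch of the closed-form estimates for the Exponential and two-parameter Gamma distributions. It is not EasyFit's implementation: the function names are hypothetical, the data are made up, and the use of the population variance (dividing by n) rather than the sample variance is an assumption.

```python
from statistics import fmean, pvariance

def fit_exponential_mom(data):
    """Exponential model: match the mean, 1 / lambda = sample mean."""
    return 1.0 / fmean(data)

def fit_gamma_mom(data):
    """Two-parameter Gamma model: match the mean (alpha * beta) and
    the variance (alpha * beta^2), then solve for alpha and beta."""
    m = fmean(data)
    v = pvariance(data, mu=m)
    alpha = m * m / v   # shape
    beta = v / m        # scale
    return alpha, beta

# Hypothetical positive-valued sample
sample = [2.1, 3.4, 1.9, 4.2, 2.8, 3.1, 2.5, 3.9]

lam = fit_exponential_mom(sample)
alpha, beta = fit_gamma_mom(sample)
print(f"Exponential: lambda = {lam:.4f}")
print(f"Gamma: alpha = {alpha:.4f}, beta = {beta:.4f}")
```

Each estimate is a direct formula in the sample mean and variance, with no iteration, whereas Maximum Likelihood Estimation generally requires numerical optimization.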
We recently released a new version of our SDK. In this update, we added a new property that lets you obtain the current licensing status of the SDK – for instance, you can determine whether the SDK is currently running in trial mode (using the Evaluation License), and if so, how many days are left until the evaluation period expires.
Consider the following scenario: you are building an application with a modular structure that, apart from its core feature set, provides some additional functionality through a number of modules, or add-ins, which can be installed and enabled on an optional basis. Now, suppose one of these modules uses the simulation or distribution fitting features of the SDK, and you want to give your users an ability to evaluate it prior to making a purchase decision. The new version of the SDK lets you easily integrate this logic into your applications, allowing you to create more flexible solutions that better meet your customers’ needs.
Because risk and uncertainty are a part of literally all areas of our life, with finance being one of the most important, scientifically based risk management methods are gaining more and more popularity among finance industry professionals. Currency fluctuations affect all businesses dealing with multiple currencies, so having at least some degree of certainty about future exchange rates can be a significant success factor for any international enterprise. A wide range of currency forecasting methods has been developed; however, not many of them can claim to be reliable in the long run: most algorithms only work for a short period of time, and need to be tweaked as the market conditions change.
Brijen Hathi, a Research Fellow at the Planetary & Space Sciences Research Institute, performs his own research in the field and publishes the results in the Currency Forecasting Blog. The forecasting methodology employed by Mr. Hathi is in part based on the same techniques used in probabilistic risk analysis. As with most modern forecasting methods, in this approach, he uses historical data to predict the future, but the big difference here is that he also assigns specific probabilities to the predictions. For example, for a US-based company doing business in the UK, it doesn’t really matter what the exact GBP/USD exchange rate is going to be during the next 30 days, as long as it stays within a specific interval with a high probability (95% or more). Recently Mr. Hathi published an article highlighting the use of EasyFit to model the pricing probability of the Pound Sterling versus the US Dollar from historical data. It is fascinating to see how EasyFit is being used in what we believe is a truly scientific approach to data analysis, and we hope to see new developments in this area soon.