Optical transport networks are the backbone of modern communication systems. During the planning phase, not all information about the network is available, so it is very difficult for the network designer to estimate the capital expenditure (CAPEX) the network requires. In such incomplete-information scenarios, statistical models are used for the estimates. Researchers have proposed several such models. However, there was no model for the link lengths…
Every year, a huge number of goods is traded internationally and declared at Customs. Accordingly, there are large incoming and outgoing message flows that need to be processed. At the Dutch Customs, this mostly takes place electronically. In today’s global economy, which runs 24 hours a day, 7 days a week, properly and continuously working systems have become fundamental. As such, constantly available automated systems for handling declaration processes are highly important for Customs and businesses…
Naturally, when a particular probability distribution fits many of your data sets well, one day you will want to learn more about that distribution. What is so specific to it that it works for your data? And, moreover, how can you interpret the distribution parameters?
The good news is: for many probability distributions, the meaning of their parameters is described in the scientific literature. The classic example is the Normal distribution, which has two parameters: σ (scale) and μ (location). The parameterization of this distribution is pretty easy to understand: as you change the location parameter, the probability density graph moves along the x-axis, while changing the scale parameter affects how wide or narrow the graph is.
However, for quite a few distributions, modifying the parameters and observing how the graphs change will be of little help for your understanding of what those parameters indicate. One such distribution is the Lognormal model, defined as “a continuous probability distribution of a random variable whose logarithm is normally distributed.” What does that mean exactly? For better understanding, compare the CDFs of the Normal and Lognormal distributions:
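The effect of the two Normal parameters can also be checked numerically. The snippet below is a minimal sketch using Python's standard `statistics.NormalDist`; the helper name `describe_normal` and the parameter values are made up for illustration.

```python
from statistics import NormalDist

def describe_normal(mu, sigma):
    """Return (mode, peak density, width of the central 95% interval)."""
    dist = NormalDist(mu, sigma)
    # Width of the interval that holds the central 95% of the probability.
    width_95 = dist.inv_cdf(0.975) - dist.inv_cdf(0.025)
    # The density of a Normal distribution peaks at its location mu.
    return mu, dist.pdf(mu), width_95

# Illustrative parameter values only.
for mu, sigma in [(0.0, 1.0), (5.0, 1.0), (0.0, 2.0)]:
    mode, peak, width = describe_normal(mu, sigma)
    print(f"mu={mu}, sigma={sigma}: mode={mode}, "
          f"peak={peak:.4f}, 95% width={width:.4f}")
```

Changing μ moves the peak without changing the width, while doubling σ doubles the width and halves the peak height.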
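Normal distribution CDF: F(x) = Φ((x − μ) / σ)
Lognormal distribution CDF: F(x) = Φ((ln(x) − μ) / σ)
Here Φ denotes the CDF of the standard Normal distribution.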
As you can see, the Normal model is “included” in the Lognormal in such a way that, in the Lognormal model, ln(x) has the Normal distribution with the same parameters (σ, μ) as the original Lognormal distribution. And this is the key point to understand: the parameters of the Lognormal model are not the “pure” scale and location (pretty intuitive in the Normal model), but rather the scale and location of the included Normal distribution.
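To make the relationship concrete, here is a minimal Python sketch that computes the Lognormal CDF by evaluating the Normal CDF at ln(x). The function name `lognormal_cdf` and the parameter values are illustrative assumptions, not anything taken from EasyFit.

```python
import math
from statistics import NormalDist

def lognormal_cdf(x, mu, sigma):
    """CDF of the Lognormal model: the Normal(mu, sigma) CDF evaluated at ln(x)."""
    if x <= 0:
        return 0.0  # the Lognormal model has no probability mass at or below zero
    return NormalDist(mu, sigma).cdf(math.log(x))

# Illustrative parameter values only.
mu, sigma = 1.0, 0.5

# Because ln(x) is Normal(mu, sigma), the Lognormal median is exp(mu), not mu.
median = math.exp(mu)
print(lognormal_cdf(median, mu, sigma))

# One "sigma" above mu on the log scale corresponds to x = exp(mu + sigma).
print(lognormal_cdf(math.exp(mu + sigma), mu, sigma), NormalDist().cdf(1.0))
```

The second line of output prints the same value twice, which is the point: μ and σ describe ln(x), not x itself.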
The same logic applies to the Gamma and Log-Gamma pair of distributions. The classical Gamma has two parameters: α (shape), β (scale), and the Gamma CDF is as follows:
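Gamma distribution CDF: F(x) = γ(α, x / β) / Γ(α), where γ is the lower incomplete gamma function and Γ is the gamma function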
The shape parameter indicates the form of the Gamma PDF graph, while the scale parameter affects the spread of the curve. Like the Gamma model, the Log-Gamma distribution has two parameters with the same names (α, β), but its CDF has the form:
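Log-Gamma distribution CDF: F(x) = γ(α, ln(x) / β) / Γ(α), where γ is the lower incomplete gamma function and Γ is the gamma function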
Just as in the Normal and Lognormal analogy, in the Log-Gamma model ln(x) has the Gamma distribution with the same parameters (α, β). These cannot be treated as the “pure” Log-Gamma shape and scale; they are the shape and scale of the included Gamma model.
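The same idea can be checked by simulation. The sketch below assumes that a Log-Gamma variate can be generated as exp(G), where G is Gamma-distributed with shape α and scale β; it then confirms that the logarithms of the samples have the Gamma mean αβ and variance αβ². The helper name `sample_log_gamma`, the seed, and the parameter values are illustrative.

```python
import math
import random

def sample_log_gamma(alpha, beta, n, seed=0):
    """Draw n Log-Gamma variates as exp(G), with G ~ Gamma(shape=alpha, scale=beta)."""
    rng = random.Random(seed)
    # random.gammavariate takes the shape first and the scale second.
    return [math.exp(rng.gammavariate(alpha, beta)) for _ in range(n)]

# Illustrative parameter values only.
alpha, beta = 2.0, 0.5
xs = sample_log_gamma(alpha, beta, 20000)

# Taking logarithms recovers the included Gamma model.
logs = [math.log(x) for x in xs]
mean_log = sum(logs) / len(logs)
var_log = sum((v - mean_log) ** 2 for v in logs) / (len(logs) - 1)

print("mean of ln(x):", mean_log, "vs alpha*beta =", alpha * beta)
print("variance of ln(x):", var_log, "vs alpha*beta^2 =", alpha * beta ** 2)
```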
There are dozens of different probability distributions out there, and even if you use only a couple of them on a daily basis, sometimes it can be hard to remember the meaning of all the parameters. That is why we decided to include a little feature in EasyFit that helps you keep your memory fresh: when moving the mouse pointer over a distribution parameter edit box, EasyFit displays a pop-up hint indicating the meaning of that particular parameter:
Distribution parameter hint
Using this feature, you can better focus on your core analysis rather than the technical details like the ones outlined in this article.
From time to time, we receive emails from our customers asking what parameter estimation methods are implemented in EasyFit to carry out distribution fitting. When designing EasyFit, we were striving for a good balance between the accuracy and speed of calculations. That is why we decided to use the Method of Moments (MOM) for those models that allow for easy use of this method. Some examples of such distributions include the Chi-Squared, Exponential, two-parameter Gamma, and Logistic models. However, for many other distributions, the Method of Moments does not yield closed-form expressions for the parameter estimates, and in such cases EasyFit uses Maximum Likelihood Estimation (MLE). In addition, for some distributions used in specific industries, such as the Wakeby model, EasyFit employs the Method of L-Moments (LMOM). You can find a detailed list of supported distributions and the estimation methods used on our website.
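As an illustration of why the Method of Moments is fast for these models, here is a minimal Python sketch of the textbook closed-form estimates for the two-parameter Gamma and the Exponential distributions. It is not EasyFit's implementation; the function names, the seed, and the simulated data are assumptions made for the example.

```python
import random
from statistics import fmean, variance

def gamma_mom(data):
    """Method-of-moments estimates (alpha, beta) for the two-parameter Gamma.

    Matches the sample mean and variance to the Gamma moments:
    mean = alpha * beta, variance = alpha * beta**2.
    """
    m = fmean(data)
    v = variance(data)  # sample variance (n - 1 in the denominator)
    return m * m / v, v / m

def exponential_mom(data):
    """Method-of-moments estimate of the Exponential rate: mean = 1 / lambda."""
    return 1.0 / fmean(data)

# Simulated data with known (illustrative) parameters, to see what is recovered.
rng = random.Random(42)
gamma_data = [rng.gammavariate(3.0, 2.0) for _ in range(20000)]
expo_data = [rng.expovariate(0.5) for _ in range(20000)]

alpha_hat, beta_hat = gamma_mom(gamma_data)
rate_hat = exponential_mom(expo_data)
print("Gamma estimates:", alpha_hat, beta_hat)
print("Exponential rate estimate:", rate_hat)
```

Each estimate is a single pass over the data with no iterative optimization, which is the speed advantage the paragraph above refers to.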
Over the last five years, we have been adding new features to EasyFit mostly with business users in mind, but thanks to the nature of the product and the special academic pricing, it has become quite popular in the academic community. A quick Google search reveals numerous research papers referring to EasyFit; to name just a few:
- “Co-evolution of Social and Affiliation Networks” (University of Maryland, USA)
- “Power laws in top wealth distributions: evidence from Canada” (Brock University, Canada)
- “Duration of Coherence Intervals in Electrical Brain Activity in Perceptual Organization” (RIKEN Brain Science Institute, Japan)
- “Resource Management Schemes for Mobile Ad hoc Networks” (National University of Singapore, Singapore)
- “Modelling the diffusion of innovation management theory using S-curves” (University of London, UK)
It is pleasing to see EasyFit helping researchers in such diverse disciplines get their job done in a more efficient way.
Over the past five years, we have been selling our distribution fitting software through Plimus Inc., the U.S.-based company responsible for processing credit card orders and sending license keys to customers who purchase our products. What we like about Plimus is that, apart from making the ordering process secure and smooth for our users, the company is also very responsive to queries, which is especially important when dealing with people’s money.
However, in some countries there are state regulations preventing customers (mostly government organizations and academic institutions) from ordering software online. To make our distribution fitting products available to users in those countries, we are in the process of creating a network of international distributors. Specifically, this year we have partnered with one Chinese, one Mexican, and three Taiwanese software resellers, who now offer our products through their local distribution channels.
Click here to watch the quick Flash demo showing how to fit probability distributions using EasyFit and apply the best fitting distribution to perform specific calculations – for instance, make estimates using the quantile function, and calculate probabilities.
The data set used in this demo consists of maximum daily wind gust speeds recorded at Station TPLM2 located in the Atlantic Ocean during 2005-2007. This station is owned and maintained by the National Data Buoy Center, and measures wind speed, air temperature, sea temperature, and other data used for weather forecasting.
NOAA defines a wind gust as “a sudden, brief increase in speed of the wind” which usually lasts for less than 20 seconds. The relatively rare but very high wind gusts cause the most damage, which is why they are of more interest than the average daily wind speeds. In essence, wind gusts are extreme events, so it is no surprise that the Generalized Extreme Value distribution fits the data very well:
(the x axis units are m/s)
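Once a Generalized Extreme Value model has been fitted, the quantile function and the CDF answer questions such as "which gust speed is exceeded on only 1% of days?" and "how likely is a gust above 25 m/s?". The Python sketch below uses the standard GEV formulas with shape k, scale σ, and location μ, in the sign convention where k > 0 gives a heavy upper tail (some software uses the opposite sign). The parameter values are hypothetical and are not the values fitted to the TPLM2 data in the demo.

```python
import math

def gev_cdf(x, k, sigma, mu):
    """CDF of the Generalized Extreme Value distribution."""
    z = (x - mu) / sigma
    if k == 0:
        return math.exp(-math.exp(-z))  # Gumbel limit
    t = 1.0 + k * z
    if t <= 0:
        # Outside the support: below the lower bound (k > 0)
        # or above the upper bound (k < 0).
        return 0.0 if k > 0 else 1.0
    return math.exp(-t ** (-1.0 / k))

def gev_quantile(p, k, sigma, mu):
    """Inverse CDF: the value x with gev_cdf(x) == p, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k == 0:
        return mu - sigma * math.log(-math.log(p))
    return mu + sigma / k * ((-math.log(p)) ** (-k) - 1.0)

# Hypothetical parameters (wind gust speed in m/s), for illustration only.
k, sigma, mu = 0.1, 3.0, 12.0

gust_99 = gev_quantile(0.99, k, sigma, mu)     # exceeded on 1% of days
p_exceed_25 = 1.0 - gev_cdf(25.0, k, sigma, mu)  # P(daily max gust > 25 m/s)
print("99th percentile gust:", gust_99)
print("P(gust > 25 m/s):", p_exceed_25)
```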