Nuts and bolts
What forecasting models are you using?
Addressing this question is tricky for two reasons: first, our forecasting technology is a core intellectual property (IP) asset that we are not willing to disclose in detail; second, our technology is complex and comes with many models. That being said, Lokad leverages a well-known body of theory called statistical learning theory. This theory encompasses most modern forecasting methods such as Support Vector Regression, Bayesian Networks, mixture or boosting methods and meta-heuristics including neural networks or genetic algorithms… We don’t dismiss the plain old classics either: linear autoregression, moving average, (double, triple) exponential smoothing, Box-Jenkins, Holt-Winters, ARMA, ARIMA. Yet, those classics are typically very weak when it comes to leveraging correlations between time-series.
How accurate are your forecasts?
Forecasting accuracy depends heavily on the specific dataset being considered. We have encountered situations where 0.5% error was considered poor (such as nation-wide hourly electricity consumption forecasts 24h ahead), and other situations where 80% error was considered excellent (such as a one-of-a-kind promotional operation performed during a product launch). Accuracy heavily depends on the horizon - the further ahead the forecasts, the less accurate they are - but it also heavily depends on the aggregation level - the more aggregated the forecasts, the more accurate they are.
Forecasting competitions, do you have any academic validation of your technology?
Plenty of data mining competitions take place every year. At Lokad, we typically keep an eye on those events, and we routinely benchmark our forecasting technology against those competition datasets when the data is relevant to Lokad (we only process time-series, not images or customer profiles for example). However, as of today, we have not yet observed any public data mining competition that we believe to be deeply representative of the challenges we face on a daily basis. Academic datasets tend to be small - fewer than a few hundred time-series - with long time-series - hundreds of data points per time-series. This is nearly the opposite of what we typically observe in retail: thousands if not millions of time-series, but very short ones because products are short-lived. That being said, Lokad typically performs well in those competitions, and very well if you take into account that, with Lokad, results are obtained out of the box, with no expertise required to produce them.
Do you evaluate the accuracy of your forecasts?
Yes, we do. Precise quantitative measurements of the forecast accuracy achieved with our forecasting technology represent about half of our core technology. Without getting too much into the details, let’s say it is a big challenge, not only to produce models that actually fit your data, but models that also happen to be really good on the data you don’t have yet, i.e. future data. See also Overfitting: when accuracy measure goes wrong. The typical daily task of the Lokad R&D team consists of running our forecasting engine over client datasets again and again, measuring forecast errors and trying to reduce them. A noticeable aspect of our technology is that you don’t only get forecasts: for each forecasted value, you also get the expected accuracy of this value, expressed as a MAPE (mean absolute percentage error). Hence, you don’t have to wait to discover after the fact that a forecast was unreliable; Lokad gives you the information upfront so that you can adjust your strategy accordingly.
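To make the idea of measuring accuracy on data the model has not seen more concrete, here is a minimal sketch of a holdout evaluation scored with MAPE. It is an illustration only, not Lokad's evaluation procedure: the function names, the naive "repeat the last value" forecaster and the sample figures are all assumptions made for the example.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent.

    Periods with a zero actual are skipped, because the percentage
    error is undefined for them.
    """
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actuals are zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def holdout_mape(series, holdout, forecaster):
    """Fit on all but the last `holdout` points, then score on those points."""
    train, test = series[:-holdout], series[-holdout:]
    return mape(test, forecaster(train, holdout))


def naive_forecaster(train, horizon):
    """Placeholder model (hypothetical): repeat the last observed value."""
    return [train[-1]] * horizon


# Hypothetical weekly sales; the last two weeks are hidden from the model.
weekly_sales = [100, 110, 105, 120, 115, 125, 130, 120]
error = holdout_mape(weekly_sales, holdout=2, forecaster=naive_forecaster)
print(f"Holdout MAPE: {error:.2f}%")
```

Scoring only on the held-out points is what guards against overfitting: a model that merely memorizes the training data gains nothing on this measure.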
How much historical data do you need?
There is no minimum requirement for the amount of historical data. That being said, Lokad delivers a statistical technology, hence the more historical data, the more accurate the forecasts. In practice, 2 years of historical data is considered good, and 3 years or more is considered excellent. If you have less than 1 year of historical data, then Lokad will not be able to refine forecasts through seasonality, which is an important pattern for many businesses. Also, to leverage seasonality, Lokad doesn’t need more than 1 year of history on every single time-series (i.e. product sales); we only need a couple of time-series with more than 1 year of history to establish the seasonality profiles that exist in your business. For startups and emerging businesses, Lokad can be used right from the beginning. Indeed, we deliver not only forecasts but the expected forecast accuracy as well, which gives you a way to quantify the uncertainty. The first forecasts typically come with very high error levels, and they gradually improve over time.
General patterns
Macro trends (ex: financial crisis), how are they handled?
We believe there are two typical misunderstandings about macro-trends. First, macro-trends can only be leveraged to refine demand forecasts if those macro-trends themselves can be accurately forecasted. If banks had been able to forecast the financial crisis, there would not have been a crisis in the first place. Forecasting macro-trends is typically much harder than forecasting demand for your average product, so it’s frequently a rather intractable option. Second, a recession of -3% per year is considered a big macro-trend, but in practice it amounts to roughly a -0.06% impact at the weekly level (-3% spread over 52 weeks). In comparison, we routinely observe product sales varying by 20% from one week to the next. Lokad is best-suited for short-term forecasts, and looking a few weeks ahead, macro-trends are typically dwarfed by microeconomic factors such as promotions, cannibalization, advertising campaigns, … In conclusion, Lokad typically ignores most macro-trends, and in our experience, it’s the only reasonable option for 99% of the situations.
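The weekly figure quoted above can be checked with a few lines of arithmetic. The sketch below assumes 52 weeks per year and shows both a simple linear split and a compounded one; the variable names are ours.

```python
annual_change = -0.03   # a -3% per year recession
weeks_per_year = 52     # assumption: 52 weeks in a year

# Linear approximation: spread the yearly change evenly over the weeks.
linear_weekly = annual_change / weeks_per_year

# Compounded version: the weekly rate that yields -3% after 52 weeks.
compound_weekly = (1 + annual_change) ** (1 / weeks_per_year) - 1

print(f"linear:   {linear_weekly:.4%} per week")
print(f"compound: {compound_weekly:.4%} per week")
```

Both variants round to about -0.06% per week, which is tiny next to week-to-week sales swings of 20%.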
Seasonality, trend, how are they handled?
We auto-detect calendar-based patterns. You don’t need to tell Lokad that a product is seasonal; seasonality is a frequent pattern natively addressed by our forecasting technology. As a matter of fact, seasonality is a lot more complex than most people expect. In our eyes, there is no single seasonality but many cyclical patterns that interact in multiple ways. There is the yearly seasonality, the day-of-the-week effect, the pay-check effect at the monthly level, the quasi-yearly seasonality such as Mother’s Day, celebrated on the 2nd Sunday of May in the USA, … Moreover, when considering sales forecasts at the Point of Sale level, the cyclical patterns of the products get combined with the cyclical patterns of the Point of Sale itself. Indeed, each Point of Sale has a more or less unique environment which generates its own demand patterns. Hence, seasonality isn’t just about providing some YES/NO flag; it’s a rather complex set of interdependent patterns. The good news is that Lokad manages this complexity for you.
Easter, Ramadan, Mother’s Day and other quasi-seasonal events?
Some calendar patterns are, in Lokad’s jargon, quasi-seasonal: patterns repeat themselves from one year to the next, but they are not strictly annual in the sense of the Gregorian calendar (also known as the Western or Christian calendar). Easter, Ramadan, Chinese New Year and Mother’s Day are all examples of quasi-seasonal patterns. Lokad auto-detects quasi-seasonal patterns, so you don’t need to dedicate any specific effort to handling them. As with classical seasonality, Lokad primarily relies on multiple time-series analysis: it detects time-series that share similar quasi-seasonal patterns in order to refine the pattern analysis.
Product Life Cycles and product launches, how are they handled?
Most consumer goods go through a life cycle. Products are launched, grow, wither and are finally phased out of the market. Lokad can forecast sales at launch, provided that the launch date is given. Obviously, when a product is about to be launched, there is no sales data available for this very product to support the forecast. Yet, contrary to classical forecasting toolkits, Lokad is not just about classical time-series forecasting. In particular, products can be described through tags. A tag can represent just about any property of the product: category, sub-category, family, brand, color, size, … In order to forecast the sales of a product being launched, Lokad analyzes historical launches of similar products, and similarities are evaluated based on the tags provided for each product. We apply the same principle to other life cycle patterns.
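As a rough illustration of how tags can stand in for missing sales history, the sketch below scores past launches by tag overlap and averages the first-period sales of the closest ones. This is not Lokad's algorithm, which is not disclosed: the Jaccard similarity, the weighted average, the function names and the sample data are all assumptions chosen for the example.

```python
def jaccard(tags_a, tags_b):
    """Share of tags two products have in common (0.0 to 1.0)."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def launch_forecast(new_tags, past_launches, k=2):
    """Similarity-weighted average of the k most similar past launches.

    `past_launches` is a list of (tags, first_period_sales) pairs.
    Returns None when no past launch shares any tag with the new product.
    """
    scored = sorted(
        ((jaccard(new_tags, tags), sales) for tags, sales in past_launches),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total_weight = sum(weight for weight, _ in scored)
    if total_weight == 0:
        return None
    return sum(weight * sales for weight, sales in scored) / total_weight


# Hypothetical launch history: (tags, units sold in the first week).
history = [
    ({"shoes", "brand-a", "red"}, 120),
    ({"shoes", "brand-a", "blue"}, 100),
    ({"hats", "brand-b", "red"}, 30),
]
estimate = launch_forecast({"shoes", "brand-a", "green"}, history)
print(estimate)
```

Here the two shoe launches dominate the estimate, while the unrelated hat launch carries no weight because it shares no tag with the new product.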
Intermittent / low volume products, how are they handled?
If you have a product that is sold once a year, well, there is little that can be done as far as statistical forecasting is concerned. In practice, having 1 unit in store or zero is rather a marketing choice. However, between this extreme slow-mover case and your top sellers, there is a whole gray area of products that are sold infrequently but still frequently enough to require inventory optimization. Most classical forecasting toolkits behave poorly against intermittent sales. At Lokad, we have put a lot of effort into this demand pattern because many businesses, such as eCommerce, heavily rely on the long tail to reach profitability. Yet slow movers, unless carefully managed, can generate even more inventory than top sellers. In order to deal with slow movers, we suggest using probabilistic forecasting.
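To show why a probabilistic view helps with slow movers, the sketch below compares the average weekly demand with an empirical quantile of the same history. It is a deliberately simple stand-in for probabilistic forecasting, not Lokad's method; the function name, the percent-based service level and the sample history are assumptions.

```python
def empirical_quantile(history, level_percent):
    """Smallest observed demand d such that at least `level_percent`
    percent of the historical periods had demand <= d."""
    if not history:
        raise ValueError("history must not be empty")
    ordered = sorted(history)
    # Integer ceiling of level_percent * n / 100, kept free of float rounding.
    rank = (level_percent * len(ordered) + 99) // 100
    return ordered[max(0, rank - 1)]


# Hypothetical weekly unit sales of a slow mover over 20 weeks.
weekly_units = [0, 0, 1, 0, 0, 2, 0, 0, 0, 1, 0, 3, 0, 0, 1, 0, 0, 0, 2, 0]

mean_demand = sum(weekly_units) / len(weekly_units)
median_demand = empirical_quantile(weekly_units, 50)
p90_demand = empirical_quantile(weekly_units, 90)

print(mean_demand, median_demand, p90_demand)
```

The mean (0.5 units) and the median (0 units) both suggest holding almost nothing, whereas covering 90% of the observed weeks takes 2 units. A single-number forecast hides that spread; a distribution makes it explicit.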
Weather, how is it handled?
In certain businesses, such as grocery stores, weather is a very important demand factor. As of today, Lokad is not leveraging weather forecasts as input in our forecasting technology. However, this item is part of our mid-term roadmap. Our goal is not only to support weather inputs, but to make the process largely automated, so that it would require near-zero effort from our clients to benefit from the extra accuracy.
Demand artifacts
Lost sales caused by stock-outs, how are they handled?
Sales do not equal demand. A stock-out is an artifact that distorts sales away from the original demand. Indeed, a stock-out causes sales to drop while demand remains steady. Contrary to classical forecasting toolkits, with Lokad, you don’t need to alter or tweak your historical data in an attempt to express the sales that would have occurred if no stock-outs had taken place. Instead, events can be used to indicate when stock-outs took place. Stock-out information is used to more accurately estimate all the patterns that would otherwise have been impacted (seasonality, trend, …). If stock-outs are not flagged as such with events, Lokad filters those patterns as noise. Keeping track of stock-outs is nice to have, but not a requirement to get started with Lokad.
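The effect of flagging stock-outs can be seen on a toy example. The sketch below, which is only an illustration and not how Lokad processes events, estimates average daily demand twice: once treating stock-out days as genuine zero-demand days, and once excluding the flagged days. The function name and the data are assumptions.

```python
def mean_daily_demand(sales, stockout_days=frozenset()):
    """Average daily sales, ignoring days flagged as stock-outs.

    `stockout_days` holds the zero-based indexes of the flagged days.
    """
    observed = [units for day, units in enumerate(sales)
                if day not in stockout_days]
    if not observed:
        raise ValueError("no days left after removing stock-outs")
    return sum(observed) / len(observed)


# Hypothetical week of sales; the shelf was empty on days 3 and 4.
daily_sales = [10, 12, 11, 0, 0, 9, 11]
flagged = {3, 4}

unflagged_estimate = mean_daily_demand(daily_sales)
flagged_estimate = mean_daily_demand(daily_sales, flagged)

print(unflagged_estimate, flagged_estimate)
```

Without the flags, the two empty-shelf days drag the estimate down to about 7.6 units per day; with them, the estimate is 10.6, much closer to the demand seen on days when the product was available.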
Exceptional sales, how are they handled?
Depending on your industry, your business might face exceptional sales. Since those sales are, well, exceptional in size, they are also usually rather straightforward to spot with a purely statistical approach. So, we suggest not tweaking your historical data to clean up those exceptional sales. First, it’s probably a waste of time; second, exceptional sales themselves may carry valuable information that helps forecast demand. That said, Lokad cannot forecast individual future exceptional sales - which may depend on the outcome of a negotiation for example. If there is a known exceptional sale ahead, we suggest manually overriding the Lokad forecasts with this extra piece of information.
Aggregation, top-down or bottom-up?
Some companies forecast demand at the level of groups or families and then split those forecasts to reach individual products. This is a product top-down forecasting method. The same idea can be applied to forecast frequency: some companies forecast first at the weekly level, and then apply day-of-the-week coefficients. In this case, this is a frequency top-down forecasting method. The other way around, bottom-up, weekly forecasts can be produced by summing daily forecasts. At Lokad, we suggest adjusting your forecasts to match your operational needs as closely as possible: if supply chain needs weekly forecasts for each product, then request weekly forecasts for each product from Lokad. Requesting daily forecasts and then summing those forecasts will not improve your accuracy. Following the same idea, letting Lokad forecast sales at the product group level, and then manually splitting the forecasts for each SKU is a poor idea, because a significant forecast error is likely to be introduced through the split itself. Internally, Lokad relies on many aggregation/disaggregation algorithms, and we typically like to leverage the most fine-grained data available. For example, we do leverage daily sales data to deliver monthly forecasts. Indeed, a month may come with 4 or 5 weekends, which significantly impacts most retail businesses. As usual, you don’t have to worry about the aggregation level; Lokad handles your requirements.
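The remark about months having 4 or 5 weekends is easy to verify with the standard library. The sketch below counts how often a given weekday falls in a month; the function name is ours, and the two sample months are arbitrary.

```python
import calendar
from datetime import date


def count_weekday(year, month, weekday):
    """Number of times `weekday` (calendar.MONDAY .. calendar.SUNDAY)
    occurs in the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return sum(
        1
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() == weekday
    )


saturdays_feb_2021 = count_weekday(2021, 2, calendar.SATURDAY)
saturdays_may_2021 = count_weekday(2021, 5, calendar.SATURDAY)
print(saturdays_feb_2021, saturdays_may_2021)
```

February 2021 had 4 Saturdays while May 2021 had 5, so a business that makes a large share of its sales on Saturdays would see noticeably different monthly totals even with unchanged daily demand. This is one reason daily data is useful even when the forecast itself is monthly.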