From probabilistic forecasting to deep learning
Partly by chance, it turns out that deep learning is heavily geared toward probabilistic forecasts by design. The motivation for this perspective was, however, entirely unrelated to supply chain concerns. Deep learning algorithms favor optimization built on top of a probabilistic / Bayesian perspective, with metrics such as cross-entropy, because these metrics yield large gradient values that are especially well suited to stochastic gradient descent, the “one” algorithm that makes deep learning possible.
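As a back-of-the-envelope illustration of this point, consider a binary demand event predicted through a sigmoid output. Under cross-entropy, the gradient with respect to the logit remains large whenever the model is confidently wrong, whereas a squared-error loss sees its gradient vanish in exactly that situation. The sketch below is purely illustrative, not Lokad's actual code:

```python
# Minimal sketch: why cross-entropy pairs well with stochastic gradient
# descent. For a sigmoid output p = sigmoid(z), the gradient of the
# cross-entropy loss w.r.t. the logit z is (p - y), while the gradient of
# the squared error is (p - y) * p * (1 - p): the latter vanishes exactly
# when the model is confidently wrong.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

y = 1.0            # observed outcome
z = -4.0           # logit of a confidently wrong prediction
p = sigmoid(z)     # predicted probability, about 0.018

grad_cross_entropy = p - y                  # about -0.98: strong learning signal
grad_squared_error = (p - y) * p * (1 - p)  # about -0.017: almost no signal

print(grad_cross_entropy, grad_squared_error)
```

Running it yields a cross-entropy gradient of about -0.98 against a squared-error gradient of about -0.017: steep gradients are precisely what stochastic gradient descent needs to make progress.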
In the specific case of supply chains, it happens that the foundations of deep learning are fully aligned with the actual business requirements!
Beyond the hype of artificial intelligence
Artificial intelligence - powered by deep learning in practice - has been the buzzword of the year in 2017. Claims are bold, enthralling and, well, fuzzy. From Lokad’s vantage point, we observe that most of these enterprise AI technologies are not living up to expectations. Very few companies can secure over half a billion USD in funding, as Instacart did, to assemble a world-class deep learning team in order to successfully tackle a supply chain challenge.
With this release, Lokad is making AI-grade forecasting technology accessible to any reasonably “digitalized” company. Obviously, the whole thing is still powered by historical supply chain data, so the data must be accessible to Lokad, but our technology requires zero deep learning expertise. Unlike virtually every other “enterprise” AI technology, Lokad does not rely on manual feature engineering. As far as our clients are concerned, the upgrade from our previous probabilistic forecasts to deep learning will be seamless. Lokad is the first software company to provide turnkey AI-grade forecasting technology, accessible to tiny one-person ecommerces and yet scaling up to the largest supply chain networks, which can include thousands of locations and a million product references.
The age of GPU computing
(*) For ultra-small datasets, our 5th gen forecasting engine is actually slower, and takes a few more minutes - which is largely inconsequential in practice.
Product launches and promotions
Our 5th generation forecasting engine brings substantial improvements to hard forecasting situations, most notably product launches and promotions. From our perspective, product launches, albeit very difficult, remain a tad easier than promotion forecasts. The difference in difficulty is driven by the quality of the historical data, which is invariably lower for promotions than for product launches. Promotion data gets better over time once the proper quality assurance processes are in place.
In particular, we see deep learning as a massive opportunity for fashion brands that struggle with product launches dominating their sales: launching a new product isn’t the exception, it’s the rule. Color and size variants then vastly inflate the number of SKUs, making the situation even more complex.
Our forecasting FAQ
Which forecasting models are you using?
Our deep forecasting engine uses a single model built on deep learning principles. Unlike classic statistical models, this model features tens of millions of trainable parameters, about 1000 times more than our previous, most complex, non-deep machine learning model. Deep learning dramatically outperforms older machine learning approaches (random forests, gradient boosted trees). Yet, it’s worth noting that those older machine learning approaches were already outperforming all the time-series classics (Box-Jenkins, ARIMA, Holt-Winters, exponential smoothing, etc.).
Do you learn from your forecasting mistakes?
Yes. The statistical training process - which ultimately generates the deep learning model - leverages all the historical data that is available to Lokad, through a process known as backtesting. Thus, the more historical data is available to the model, the more opportunities the model has to learn from its own mistakes.
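For readers curious about what backtesting means in practice, here is a minimal walk-forward sketch; the `fit` and `forecast` functions are hypothetical placeholders, not part of Lokad's API:

```python
# Minimal backtesting sketch (illustrative, not Lokad's implementation):
# the history is re-truncated at a series of past thresholds, the model is
# asked to forecast beyond each threshold, and its forecasts are compared
# against the actuals that follow. Days are integer indices here.

def backtest(history, thresholds, horizon, fit, forecast):
    """history: list of (day, demand) pairs sorted by day."""
    errors = []
    for t in thresholds:
        past = [(d, x) for d, x in history if d < t]
        future = [x for d, x in history if t <= d < t + horizon]
        model = fit(past)                       # train on the truncated history
        predicted = forecast(model, t, horizon)
        errors.append(sum(abs(p - a) for p, a in zip(predicted, future)))
    return errors  # each entry is a "mistake" the model can learn from
```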
Does your forecasting engine handle seasonality, trends, days of week?
Yes, the forecasting engine handles all the common cyclicities, and even the quasi-cyclicities, whose importance is frequently underestimated. Under the hood, the deep learning model intensively uses a multiple time-series approach, leveraging the cyclicities observed across other products to improve the forecasting accuracy of any one given product. Naturally, two products may share the same seasonality but not the same day-of-week pattern; the model is capable of capturing this distinction. Also, one of the major upsides of deep learning is its capacity to properly capture the variability of the seasonality itself. Indeed, a season may start earlier or later depending on external variables, such as the weather, and those variations are detected and reflected in our forecasts.
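As a rough intuition of how such cyclicities can be exposed to a model as shared inputs, here is an illustrative sketch; the feature names are our own, not Lokad's:

```python
# Illustrative sketch: encoding common cyclicities as input features so a
# single model can share seasonality across products while still learning
# product-specific day-of-week patterns.
from datetime import date

def calendar_features(d: date) -> dict:
    return {
        "day_of_week": d.weekday(),          # weekly cyclicity
        "week_of_year": d.isocalendar()[1],  # yearly seasonality
        "day_of_month": d.day,               # monthly quasi-cyclicity (e.g. paydays)
    }

print(calendar_features(date(2017, 12, 4)))
```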
What data do you need?
As was the case with our previous generation of forecasting technology, in order to forecast demand, the forecasting engine needs to be provided - at the very least - with the daily historical demand; providing a disaggregated order history is even better. As far as the length of the history is concerned, the longer, the better. While no seasonality can be detected with less than 2 years of history, we consider 3 years of history to be good, and 5 years excellent. In order to forecast lead times, the engine typically requires the purchase orders to contain both the order dates and the delivery dates. Specifying your product or SKU attributes also helps to considerably refine the forecasts. In addition, providing your stock levels is very helpful for delivering a first meaningful stock analysis to you.
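To make this concrete, here is a hypothetical example of the minimal tables involved; the exact layout expected by Lokad depends on your integration, so treat this as an illustration only:

```python
# Hypothetical example of the minimal inputs: a disaggregated order history,
# one entry per order line, plus purchase orders for lead time forecasting.
orders = [
    # (order_date, product_id, quantity)
    ("2017-11-28", "SKU-001", 2),
    ("2017-11-28", "SKU-042", 1),
    ("2017-11-29", "SKU-001", 5),
]

purchase_orders = [
    # (order_date, delivery_date, product_id, quantity) -> lead times
    ("2017-11-01", "2017-11-19", "SKU-001", 50),
]
```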
Can you forecast my Excel sheet?
As a rule of thumb, if all of your data fits into one Excel sheet, then we usually cannot do much for you - and, to be honest, nobody else can either. Spreadsheet data is typically aggregated per week or per month, and most of the historical information ends up being lost through such aggregation. Moreover, such a spreadsheet is unlikely to contain much information about the categories and hierarchies that apply to your products. Our forecasting engine leverages all the data you have, and a test on a tiny sample is not going to give satisfying results.
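A tiny example makes the information loss obvious: two radically different weeks of daily demand collapse into the same weekly total.

```python
# Tiny illustration of the information lost by aggregation: two very
# different daily demand patterns become indistinguishable once summed.
week_a = [10, 10, 10, 10, 10, 10, 10]  # steady demand
week_b = [0, 0, 0, 70, 0, 0, 0]        # a single spike (e.g. a bulk order)
assert sum(week_a) == sum(week_b) == 70
```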
What about stock-outs and promotions?
Both stock-outs and promotions introduce bias into historical sales. Since the goal is to forecast demand, not sales, this bias needs to be taken into account. One frequent - but incorrect - way of dealing with these events consists of rewriting the history to fill in the gaps and truncate the peaks. We don’t like this approach, because it amounts to feeding forecasts back into the forecasting engine, which can result in major overfitting problems. Instead, our engine natively supports “flags” that indicate where the demand has been censored or inflated.
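To give an idea of what such flags look like in the historical data, here is an illustrative layout; the column names are hypothetical:

```python
# Illustrative sketch of "flag" columns: instead of rewriting history, each
# period is annotated to tell the engine whether the observed sales were
# censored (stock-out) or inflated (promotion).
history = [
    # (day, sales, stockout_flag, promo_flag)
    ("2017-10-01", 12, False, False),
    ("2017-10-02",  3, True,  False),  # demand censored by a stock-out
    ("2017-10-03", 41, False, True),   # demand inflated by a promotion
]
```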
Do you forecast new products?
Yes, we do. However, in order to forecast new products, the engine requires the launch dates of the other, “older” products, as well as their historical demand at the time of their launch. Specifying some of your product categories and/or a product hierarchy is also advised. The engine forecasts new products by auto-detecting the “older” products that can be considered comparable to the new ones. However, as no demand has yet been observed for the new items, the forecasts fully rely on the attributes associated with them.
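As a rough intuition - and not a description of Lokad's actual method - matching by shared attributes can be sketched as follows:

```python
# Assumed logic for illustration only: match a new product to comparable
# "older" products through shared attribute tags, so the demand observed
# at their launch time can inform the new product's forecast.
def comparables(new_attrs: set, catalog: dict, k: int = 3):
    """catalog maps product_id -> set of attribute tags."""
    scored = sorted(
        catalog.items(),
        key=lambda item: len(new_attrs & item[1]),  # count of shared attributes
        reverse=True,
    )
    return [pid for pid, _ in scored[:k]]

catalog = {
    "SKU-001": {"shoes", "leather", "brown"},
    "SKU-042": {"shoes", "canvas", "white"},
    "SKU-077": {"belt", "leather", "brown"},
}
print(comparables({"shoes", "leather", "black"}, catalog))
```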
Do you use external data to refine the forecasts?
We can use competitive pricing data, typically obtained through 3rd party companies that specialize in web scraping, for example. Web traffic data can also be used, and possibly acquired, to enrich the historical data and further boost statistical accuracy. In practice, the biggest bottleneck in using external data sources isn’t the Lokad forecasting engine - which is fairly capable - but setting up and maintaining a high-quality data pipeline attached to those external data sources.