00:00:07 Potential use cases of differentiable programming in supply chains.
00:00:31 Applying differentiable programming to retail stores and customer perspectives.
00:02:56 Game-changing property of differentiable programming in modeling customer behavior.
00:06:05 Differentiable programming’s impact on warehouse operations and forecasting future demand.
00:07:11 Smoothing warehouse shipment curves with differentiable programming.
00:09:38 The importance of serving clients on time and its impact on the supply chain.
00:10:40 Differentiable programming’s role in modeling complex supply chain networks.
00:13:01 Quality control and imperfections in production systems.
00:14:17 Applying differentiable programming to model uncertainties in the pharma industry.
00:16:00 Differentiable programming and its advantages in sparse data situations.
00:17:41 Patent expiration as an example of applying differentiable programming in the pharma industry.
00:19:57 Embracing complexity and addressing key business drivers through differentiable programming.
00:21:59 Balancing simplicity and complexity in models based on business requirements.
00:22:42 Differentiable programming as an evolution of Lokad’s programmatic approach, benefits and opportunities for clients.

Summary

In this interview episode, host Kieran Chandler and Joannes Vermorel, Lokad’s founder, explore the applications of differentiable programming and its impact on supply chain management. Traditional time series models struggle with cannibalization and substitution, while differentiable programming offers a customer-centric approach that considers customer desires and needs for better-informed decision-making. This approach can lead to more accurate demand forecasting, inventory management, and warehousing optimization. Differentiable programming addresses complex multi-echelon challenges and accounts for production imperfections, making it suitable for various industries. Vermorel emphasizes that differentiable programming allows businesses to incorporate domain knowledge into machine learning models, resulting in more accurate, efficient, and tailored solutions for specific problems.

Extended Summary

In this episode of the interview series on differentiable programming, host Kieran Chandler and Joannes Vermorel, founder of Lokad, discuss the potential use cases and consequences of applying this technology to supply chains, starting with retail stores. Differentiable programming has the potential to improve supply chain management by addressing issues that traditional time series approaches cannot handle effectively.

One of the key issues in supply chain management is the challenge of modeling cannibalization and substitution, which are particularly significant in industries such as luxury, fast fashion, and food retail. Traditional time series models struggle to account for these factors, often resorting to duct tape solutions that are unsatisfying and far from optimal.

Differentiable programming offers a fresh approach to these problems by focusing on the customer’s perspective, rather than solely on the product. It allows supply chain managers to consider factors such as customer desires, needs, and their likelihood to pick or not pick items given the current assortment and stock availability. This customer-centric approach provides a more accurate and nuanced understanding of the retail environment, leading to better-informed decision-making.

The game-changing aspect of differentiable programming lies in its ability to model customer affinities towards specific products in the catalog. Writing this behavior as a procedural process allows supply chain managers to consider various factors, such as how novelty drives customer purchases and how repeat clients are unlikely to buy the same product again. These insights can lead to more accurate demand forecasting and inventory management.

For example, in a bookstore setting, a customer who buys a book is highly unlikely to purchase the same title again on their next visit. Traditional time series models struggle to account for this behavior, while differentiable programming can directly model these individual purchasing decisions. This leads to a more accurate understanding of customer demand and the lifecycle of products.

Differentiable programming enables supply chain managers to model the behavior of clients who visit the store regularly, such as those driven by novelty. This approach can help predict how popular new products will be and when the demand for them will decrease. Unlike time series models, which rely on indirect ways of modeling these patterns, differentiable programming offers a more direct and accurate solution.

Vermorel explains that differentiable programming allows for more accurate modeling of customer behavior at the point of sale. Traditional statistical models had difficulty incorporating even basic insights about customer behavior, making it challenging for them to learn from scratch. Differentiable programming, on the other hand, offers a more direct way to understand what’s happening in the store and can be easily integrated into machine learning models.

When it comes to warehousing, differentiable programming can help optimize and smooth the flow of products. Warehouses often face input/output capacity issues, and it would be ideal if small adjustments could be made to shipping schedules to avoid colliding shipments and reduce the need for temporary staff. Traditional optimization techniques struggled with this problem because it involved both learning and optimization aspects. Differentiable programming, however, can handle the vast number of variables involved in this process, making it possible to optimize the shipment of millions of SKUs and address subtle interaction effects.

At the manufacturing level, differentiable programming can help address complex multi-echelon challenges. Traditional approaches tended to focus on specific nodes within the supply chain and aimed for high service levels for certain products. However, Vermorel argues that what truly matters is whether the finished goods are available to customers, rendering much of the intermediate steps in the supply chain irrelevant. Differentiable programming allows for more accurate modeling of the complex network of parts and assemblies within the supply chain, which ultimately helps serve clients better and on time.

Additionally, differentiable programming can help account for imperfections in the production system, such as quality control issues. In industries like pharmaceuticals, where living organisms are involved in producing advanced drugs, production batches might be lost due to biological processes. Differentiable programming can account for these losses and help optimize the overall production process.

Vermorel explains that the pharmaceutical industry deals with high levels of uncertainty due to the nature of their processes. For example, if a problem arises in a batch of cultures, the entire batch is likely to be lost, which is different from the automotive industry where only a small fraction of parts might fail quality control. Traditional machine learning models might struggle with this level of uncertainty, as they may not have enough relevant historical data to predict outcomes accurately.

Differentiable programming offers an alternative by allowing businesses to incorporate their domain knowledge directly into the machine learning model. Vermorel emphasizes that differentiable programming is not about throwing large amounts of data at an AI system but rather making the most out of sparse and valuable data. For instance, in the pharmaceutical industry, the impact of patent expiration on drug pricing is a well-known phenomenon. Differentiable programming enables the incorporation of this knowledge into the model, improving its accuracy and efficiency.

The versatility of differentiable programming makes it suitable for various industries, each with unique challenges. Vermorel uses the example of the automotive aftermarket, where the compatibility between vehicle parts and specific vehicle models is crucial. Ignoring this aspect in a simplistic model could lead to suboptimal results, while differentiable programming can help capture these essential business drivers.

Despite the complexity of differentiable programming, Vermorel argues that businesses should not shy away from embracing it. While simpler models might work, they often do so at the expense of accuracy and a thorough understanding of the business. Differentiable programming allows for a more tailored approach that can address specific problems and situations.

Differentiable programming represents an evolution of Lokad’s programmatic approach to supply chain optimization. It enables companies to incorporate their domain knowledge into their machine learning models, leading to leaner execution and improved performance in terms of accuracy. Differentiable programming offers businesses the opportunity to revisit existing problems and develop scalable solutions that better address their unique challenges.

Full Transcript

Kieran Chandler: Today, we’re going to conclude our short series by looking a little bit more at some of its potential use cases and the far-reaching consequences this can have when applied to a supply chain. So, Joannes, what are some of the problems which we can improve our approach upon by using differentiable programming? And let’s start maybe with retail stores, you know, the point of contact with the customers.

Joannes Vermorel: Right now, pretty much everything that is done in supply chain takes the time series perspective, where you have a product and you observe unit sales, demand being purchased, or service depending on what kind of shop you’re running. Obviously, a shop in aerospace isn’t the same thing as a shop for a fast fashion store, but the idea is that the angle is kind of the time series angle per product. The issue with this perspective, for example, is that things like cannibalization and substitution, which are very strong for luxury, fast fashion, or even food retail, are super hard to model. In many cases, they barely exist at all. Differentiable programming gives you an angle to directly tackle the problem from the customer’s perspective, saying, “Well, I have a population of clients who walk into my store and have desires and needs, and they are going to pick or not pick things that are exposed to them, considering the present assortment and the present stock availability in the store.” That’s very interesting because, through something like differentiable programming, we can operate at a level that is not the level of a time series on top of the product references listed in the store. We can take the customer perspective, and that’s quite game-changing. Our experience with the time series perspective is that, usually, the best you can do is just apply duct tape to your numerical models so that they are not too broken when facing cannibalization and substitution, but it’s not very satisfying. It’s duct tape at best.

Kieran Chandler: So let’s maybe recap on what we’ve discussed in previous episodes. What is that game-changing property that you mentioned there that makes this all possible?

Joannes Vermorel: With a differentiable programming approach, you can literally model the fact that a client has a specific affinity to any product in your catalog, and you can write a procedural process for that. For example, let’s say I have clients who come back to my store, and maybe those clients are driven by novelty. How do I model something as simple as, once people came into my store to buy a book, by definition, they are not going to buy the same book when they come back? They are only going to buy another title, not the same one. From a classical, time-series perspective, it’s nearly impossible to factor in something as basic as that, that a returning client who walks into your bookstore once per month is not going to repurchase the same product. So, if you see a surge of demand for a book, chances are that the demand is going to be extinguished by the fact that if all your routine clients buy this new, popular book, then by definition, once they come back, they will not be buying it again. Obviously, you can model that with a time series perspective using a life cycle effect, where you introduce a new product, it has a spike at the very beginning at launch, and then demand only decreases. But that’s a very indirect way of modeling the problem. A much more direct approach is to use different

Kieran Chandler: Your software allows you to accurately model what is happening in the store in a much more direct way than was previously possible. Can you explain how this changes the way statistical models are used in supply chain optimization?

Joannes Vermorel: With differentiable programming, it becomes easier to inject basic insights about customer behavior into statistical models. This means that the models don’t have to learn everything from scratch with zero business insights, which was a difficult task before.
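As an editorial illustration of the bookstore example, here is a minimal sketch in plain Python. It is not Lokad's implementation. It assumes a fixed pool of regular customers who each visit once a month and buy the new title with an unknown probability `p`, but only if they do not already own it. The function names `expected_sales` and `fit_purchase_probability`, the pool size, and the sales history are all hypothetical. The gradient is derived by hand here; a differentiable programming framework would typically compute it automatically.

```python
def expected_sales(pool, p, months):
    """Expected unit sales per month when no regular buys the title twice."""
    sales = []
    remaining = float(pool)  # regulars who do not own the book yet
    for _ in range(months):
        sold = remaining * p  # only non-owners can still buy
        sales.append(sold)
        remaining -= sold
    return sales


def fit_purchase_probability(pool, observed, steps=2000, lr=0.1):
    """Fit p by gradient descent on the squared error of sales fractions."""
    p = 0.5  # arbitrary starting guess (assumption)
    for _ in range(steps):
        grad = 0.0
        for t, y in enumerate(observed):
            pred = p * (1.0 - p) ** t  # fraction of the pool buying in month t
            # Hand-derived d(pred)/dp; the second term vanishes for t = 0.
            dpred = (1.0 - p) ** t - t * p * (1.0 - p) ** (t - 1)
            grad += 2.0 * (pred - y / pool) * dpred
        p = min(max(p - lr * grad, 1e-3), 1.0 - 1e-3)  # keep p inside (0, 1)
    return p


# Hypothetical history: 1000 regulars, generated with a purchase probability of 0.3.
observed = expected_sales(1000, 0.3, 6)
fitted_p = fit_purchase_probability(1000, observed)
# Demand keeps fading as more regulars already own the book.
forecast = expected_sales(1000, fitted_p, 12)
print(round(fitted_p, 3), [round(s) for s in forecast[:4]])
```

The structural insight (a returning client does not repurchase the same title) is written into the model, so only one parameter has to be learned from the data. The demand spike and its decay then follow from the mechanism itself rather than from a separately fitted life cycle curve.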

Kieran Chandler: How does differentiable programming help in the warehousing aspect of the supply chain? Is it mainly about forecasting future demand?

Joannes Vermorel: Differentiable programming can also help with challenges at the warehouse level, such as smoothing the flow of shipments. Warehouses often face input-output capacity problems, and one solution is to intelligently organize shipments to avoid colliding shipments and reduce pressure on logistic platforms. By making small adjustments to the shipment schedule, operations can be smoother, easier, and cheaper to manage, reducing the need for temporary workforce and the operational complexity that comes with it.
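The idea of making small adjustments to a shipment schedule can be sketched as a toy optimization in plain Python. This is an editorial illustration under strong assumptions, not the method used at Lokad: a single warehouse, a handful of days, a quadratic penalty on daily volume to discourage peaks, and a quadratic penalty on deviating from the original plan. The function name `smooth_shipments`, the parameter `deviation_weight`, and the planned quantities are hypothetical.

```python
def smooth_shipments(plan, deviation_weight=1.0, steps=200, lr=0.1):
    """Flatten daily volumes while staying close to the plan and keeping the total."""
    x = [float(v) for v in plan]
    for _ in range(steps):
        # Gradient of sum(x^2) + deviation_weight * sum((x - plan)^2).
        grad = [2.0 * xi + 2.0 * deviation_weight * (xi - pi)
                for xi, pi in zip(x, plan)]
        mean_grad = sum(grad) / len(grad)
        # Removing the mean of the gradient keeps the total shipped quantity unchanged.
        x = [xi - lr * (gi - mean_grad) for xi, gi in zip(x, grad)]
    return x


plan = [120, 40, 200, 30, 110]  # hypothetical units per day
smoothed = smooth_shipments(plan)
print([round(v, 1) for v in smoothed])
```

With the default settings each day ends up halfway between its planned volume and the average day, so the peak shrinks while the total shipped stays the same. The default step size is only suitable for small values of `deviation_weight`. A realistic version would also have to learn demand and handle millions of variables, which is the part the interview says traditional methods struggled with.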

Kieran Chandler: Was it difficult to achieve this level of optimization with existing techniques?

Joannes Vermorel: With existing techniques, it was difficult to combine learning and optimization. When you have thousands of products and hundreds of clients, you end up with millions of variables to optimize, and traditional optimization methods can’t handle this complexity. Differentiable programming allows for better optimization in these situations, even with many subtle interactions, such as the need to ship more products to a store if it’s closer to running out of stock.

You have a feedback loop between what you decide and what you forecast. With a more traditional perspective, we could do this sort of optimization, but it was a lot more tedious because we had to do stage analysis. Fundamentally, it was very hard to factor in all those feedback loops that exist in the system.

Kieran Chandler: Okay, and if we take the final step back in that supply chain and look at things now from a manufacturing level, how does differentiable programming help us with these multi-echelon challenges?

Joannes Vermorel: The point is even more acute when you go to the realm of multi-echelon optimization. Most of what is happening on every single node is, in a way, inconsequential in the sense that it’s an artifact. You do not care about the stock availability at random points in your complex network of parts and assemblies that end up as the finished goods. The only point of the graph that really matters is whether you’re serving your client on time, which is only a question that is relevant for the finished goods.

What about all the dependency graph that you have behind? The fact that what is happening at every intermediate step, this complex graph of dependencies where you have your bill of materials generating this graph, is fundamentally irrelevant. It’s an artifact that only matters from the perspective of whether you are, at the very end of the process, serving your clients.

By the way, that’s getting back to my criticism of DDMRP a couple of weeks ago. You can adopt a binary scoring scheme on this graph and say that for certain nodes you want to achieve a high service level, but a high service level for a product does not matter if your clients do not care about it because they are not purchasing that product. The only thing they care about is whether the finished goods that you’re selling are available or not.

Differentiable programming helps you model much more accurately what is happening in this network. You have some steps that can have probabilistic times or not. You might have steps where a certain fraction of the flow does not pass quality control. Obviously, if you have a perfect supply chain, 100% of the flow passes quality control. So if you have 100 items you need to supply, you will have 100 finished goods that flow after the machining step. But sometimes your production system is imperfect, some items fail quality control, and you lose some quantities.

For example, in pharma, when they have very advanced biological processes, you might lose a batch of production because it’s a culture of cells to produce the more advanced drugs. Despite decades of efforts, when you work with living organisms producing the chemical compounds that you want to extract and be part of your drug, it’s very hard to have a process that is completely 100% reliable. It’s not like machining in the automotive industry.

Kieran Chandler: So, is this where the idea of modeling those outcomes that are not completely deterministic comes in?

Joannes Vermorel: Yes, but also the fact that you can have very specific insights on the type of problems that you can have. For example, in the pharma industry, if you have a problem, you’re most likely going to lose the entire batch of cultures that you have in the plant. It’s not going to be like machining in the automotive industry where one part out of ten thousand doesn’t pass quality control. If you’re in pharma and you have cultures that generate certain types of chemical compounds, you might lose the entire batch if there’s a problem.
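The difference between the two failure modes can be made concrete with a short calculation, given here as an editorial sketch in plain Python. It assumes the same hypothetical failure rate in both cases so that only the structure of the loss differs: either each unit fails independently, as in the machining example, or the whole batch is lost at once, as in the cell culture example. The function names and the figures are illustrative and do not come from the interview.

```python
import math


def yield_stats_per_unit(units, failure_rate):
    """Mean and standard deviation of good units when each unit fails independently."""
    mean = units * (1.0 - failure_rate)
    std = math.sqrt(units * failure_rate * (1.0 - failure_rate))  # binomial
    return mean, std


def yield_stats_whole_batch(units, failure_rate):
    """Mean and standard deviation of good units when the batch is all-or-nothing."""
    mean = units * (1.0 - failure_rate)
    std = units * math.sqrt(failure_rate * (1.0 - failure_rate))  # scaled Bernoulli
    return mean, std


units, failure_rate = 10_000, 0.01  # hypothetical figures
unit_mean, unit_std = yield_stats_per_unit(units, failure_rate)
batch_mean, batch_std = yield_stats_whole_batch(units, failure_rate)
print(unit_mean, round(unit_std, 1))
print(batch_mean, round(batch_std, 1))
```

Both cases have the same expected yield, but the standard deviation of the all-or-nothing case is larger by a factor of the square root of the batch size. A model that only learns an average loss rate would miss exactly the kind of uncertainty described here.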

Kieran Chandler: That completely changes the type of uncertainty, and you can try to learn that from the data, but it’s difficult because you might not have 20 years of relevant data. You’re kind of making the problem harder than it should be because you would like to be able to express this sort of insight, the physical reality of your business, directly into the model. So the probabilistic approach is very good, but my point is, what about having an approach like differentiable programming where you can frame the problems that you’re trying to learn so that you directly steer your machine learning algorithms toward the very specific sort of uncertainty that you expect to find because you know a lot of things about your network? And it can be game-changing because suddenly, you need a lot less data to be super efficient.

Joannes Vermorel: Absolutely. The real virtue lies in this idea of programming. It’s not an AI that you could just throw data at and say “learn”; this is kind of the opposite of that. It’s saying that data is sparse, and I want to be very accurate, so I need to make the most of the data that I have. This is not like Google trying to analyze a billion web pages; we don’t have infinite data. Data is sparse and erratic, and it’s very valuable because we don’t have that many data points. So we really need to make the most of it. For example, if we want to get back to pharma and make very insightful strategic forecasts, there is the whole thing about patent expiration.

Patent expiration is driving big pharma. You have a product, a patented drug, and then when the patent expires, there is the risk that competitors enter your market at a lower price and compete with you, forcing you to lower your prices as well, which can significantly reduce your margin. This patent expiration thing is completely obvious for anybody familiar with pharma, and it has been driving innovation and the activity of big pharmaceutical companies for decades. If you expect that a machine learning algorithm is going to rediscover on its own this mechanism of patent expiration, that’s a little bit insane. In contrast, differentiable programming is like a tool for supply chain scientists to say, well, I know I have this patent expiration thing. What I do not know is exactly what is the likelihood of competitors entering the arena and competing with us on price. And what I do not know exactly is how this thing is going to play out for us if suddenly we have to halve the quantities that we sell just because other competitors enter and we maintain all the same fixed costs.

If I maintain the same production capacity, then I have many costs that are completely flat and do not depend on the quantity I produce, and thus, if I have competitors that enter the market, the effect on my margins can be completely nonlinear. So you’re correct; it’s completely about being able to model the key insights that are specific from one vertical to another vertical by programming them into the machine learning model.
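A small worked example shows why the effect on margins is nonlinear when fixed costs stay flat. It is an editorial sketch in plain Python with hypothetical figures. The function names `profit` and `expected_profit` and the entry probability `p_entry` are assumptions for illustration; in the approach described above, the structure would be written by hand and quantities such as `p_entry` would be the unknowns to learn.

```python
def profit(quantity, price, unit_cost, fixed_cost):
    """Profit when part of the cost base does not depend on the quantity sold."""
    return quantity * (price - unit_cost) - fixed_cost


def expected_profit(p_entry, profit_without_entry, profit_with_entry):
    """Blend the two scenarios by the probability that competitors enter."""
    return (1.0 - p_entry) * profit_without_entry + p_entry * profit_with_entry


# Hypothetical figures: competitor entry halves the quantity sold, same fixed costs.
before = profit(quantity=100, price=10.0, unit_cost=2.0, fixed_cost=600.0)
after = profit(quantity=50, price=10.0, unit_cost=2.0, fixed_cost=600.0)
blended = expected_profit(0.25, before, after)
print(before, after, blended)
```

In this toy case, halving the quantity does not halve the profit: it turns a profit of 200 into a loss of 200, because the fixed cost of 600 is spread over fewer units.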

Kieran Chandler: And the problem many people might have with differentiable programming is that it’s fairly complex in some parts. Are we sometimes using a sledgehammer to crack a nut, and are there more simplistic techniques that we could still use?

Joannes Vermorel: You can always use more simplistic techniques, but I think the key question that clients should ask themselves is, if you run a complex supply chain, can you really decide to ignore the complexity of the business you’re operating in? For example, if you are selling automotive parts on an e-commerce platform and you’re servicing car owners, can you really ignore the problems that you have, like mechanical compatibilities between vehicles and parts? The fact is that the real clients are not the people who come to buy car parts on your website; they are their vehicles. So the vehicle is the ultimate client of those parts, and you have at the core of the demand a problem of mechanical compatibility. If you have many parts that are perfect substitutes because they are all mechanically compatible with some vehicle, it’s a super important aspect of your business. What I’m saying is that this is an example where you need to embrace this because it’s really the core of your business. A simplistic approach that just ignores the part-vehicle compatibility challenge, which is completely crucial when you’re thinking about automotive aftermarket, can work but at the expense of being incredibly crude business-wise.

In terms of the sledgehammer, you should not be using fancy tech for the sake of fancy tech. What I’m saying is that if you are using something that just ignores the key business driver of your business, then whatever model you have is incredibly simplistic, and don’t expect that a fancy numerical solution will actually solve the business problem that you have if your numerical recipe starts by ignoring this business angle altogether. My point is that you should be as simple as possible, but no simpler than what your business actually requires.
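The point that the vehicle is the real client can be illustrated with a minimal sketch in plain Python. It is an editorial example, not a description of any production model. It assumes that demand originates from a fleet of vehicles with a common replacement rate, and that each vehicle's need is split evenly across the compatible parts currently in stock. The even split, the function name `part_demand`, and all identifiers and figures are hypothetical.

```python
def part_demand(fleet, need_rate, compatibility, in_stock):
    """Derive demand per part from vehicle-level needs and part-vehicle compatibility."""
    demand = {part: 0.0 for part in compatibility}
    for vehicle, count in fleet.items():
        # Parts that fit this vehicle and are available are treated as substitutes.
        candidates = [part for part, vehicles in compatibility.items()
                      if vehicle in vehicles and part in in_stock]
        if not candidates:
            continue  # the need goes unserved; lost demand is not modeled here
        share = count * need_rate / len(candidates)
        for part in candidates:
            demand[part] += share
    return demand


fleet = {"model_a": 1000, "model_b": 500}  # hypothetical vehicle counts
compatibility = {"pad_x": {"model_a"}, "pad_y": {"model_a", "model_b"}}

both_available = part_demand(fleet, 0.1, compatibility, {"pad_x", "pad_y"})
pad_x_stockout = part_demand(fleet, 0.1, compatibility, {"pad_y"})
print(both_available, pad_x_stockout)
```

When `pad_x` is out of stock, its demand moves to `pad_y` instead of vanishing. A per-part time series would record that as an unexplained drop on one reference and an unexplained surge on another.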

Kieran Chandler: If we start concluding today, previously at Lokad, we had a very much programmatic approach. What’s the big change that differentiable programming is giving us, and how can companies adapt to take advantage of it?

Joannes Vermorel: Differentiable programming is indeed more of the same programmatic approach that has been the motor of Lokad for a long time. This programmatic approach now goes into the core of our machine learning technology. It was already at the core of our big data platform, with mechanisms to do big data processing such as simple filtering, aggregation, and typical pre-processing, data cleaning, and so on. That was already completely programmatic, but the machine learning core was slightly rigid. With deep learning, we were already a lot more flexible than with the previous generation, but this is a new stage. For our clients, I believe it’s the opportunity to revisit many problems and situations where we, in the past, had done a lot of duct tape. When you don’t have something that is flexible, you kind of duct tape the thing by having clever tricks, but they are not naturally as scalable as we want them to be. They might be a bit crude and approximate the business insight in suboptimal ways. Here, it’s an opportunity to revisit that and do pretty much the same thing, but in a way that is leaner in terms of execution and more performant in terms of accuracy when we count in euros or dollars of error, business-wise.

Kieran Chandler: Great, well thanks for your time today. That’s everything for our mini-series on differentiable programming. We’ll be back next week with another episode on a new topic, but until then, thanks for watching. Goodbye.