#!/usr/bin/env python
# coding: utf-8

# # Block 1: Linear programming: the diet problem
# ### Alfred Galichon (NYU & Sciences Po)
# ## 'math+econ+code' masterclass on optimal transport and economic applications
#
# © 2018-2021 by Alfred Galichon. Past and present support from NSF grant DMS-1716489, ERC grant CoG-866274, and contributions by Jules Baudet, Pauline Corblet, Gregory Dannay, and James Nesbit are acknowledged.
#
# #### With python code examples
# ## Linear programming: duality
#
# ### Learning objectives
#
# * Linear programming duality
# * Economic interpretation of the dual
# * Numerical computation
#
# ### References
#
# * Galichon, *Optimal Transport Methods in Economics*. Appendix B.
# * Stigler (1945), The cost of subsistence. *Journal of Farm Economics*.
# * Dantzig (1990), The diet problem. *Interfaces*.
# * Complements:
#     * Gale (1960), *The Theory of Linear Economic Models*.
#     * Vohra (2011), *Mechanism Design: A Linear Programming Approach*.
#
# ### The diet problem
#
# During World War II, engineers in the US Army wondered how to feed their personnel at minimal cost, leading to what is now called the **optimal diet problem**.
#
# * Nutritionists have identified a number of vital nutrients (calories, protein, calcium, iron, etc.) that matter for a person's health, and have determined the minimum daily intake of each nutrient.
# * For each basic food (pasta, butter, bread, etc.), nutritionists have characterized its content in each of the various nutrients.
# * Each food has a unit cost, and the problem is to find the optimal diet, i.e. the combination of foods that meets the minimum intake of each nutrient and achieves minimal cost.
#
# The problem was taken on by G. Stigler, who published a paper about it in 1945, giving a first heuristic solution: a diet that costs $\$39.93$ per year in $1939$ dollars. Later (in $1947$) it was one of the first applications of G.B. Dantzig's method (the simplex algorithm), which provided the exact solution ($\$39.67$). At the time it took $120$ man-days to perform this computation; today the computer will perform it for us in a fraction of a second.
#
# However, don't try this diet at home! Dantzig did so and almost died from it...
#
# ## Motivation
#
# ### A look at the data
#
# Our dataset was directly taken from Stigler's article. It is a .csv file called `StiglerData1939.txt`.
Have a look at the diet problem data:

# In[5]:

import scipy.sparse as sp
import pandas as pd
import numpy as np

thepath = 'https://raw.githubusercontent.com/math-econ-code/mec_optim_2021-01/master/data_mec_optim/lp_stigler-diet/'
filename = 'StiglerData1939.txt'
thedata = pd.read_csv(thepath + filename, sep='\t')
thedata = thedata.dropna(how='all')

# Our dataset has the nutritional content of $77$ commodities, and in the final row, the daily minimum requirement of each of the nutrients.

# In[6]:

thedata.head()

# ### The diet problem
#
# Problem setup:
#
# * Assume there are nutrients $i\in\left\{ 1,...,I\right\}$ (calories, protein, calcium, iron, etc.) that matter for a person's health, in such a way that the minimum daily intake of nutrient $i$ should be $d_{i}$.
# * Nutrients do not come as standalone elements, but are combined into various foods. One dollar worth of food $j\in\left\{ 1,...,J\right\}$ yields a quantity $N_{ij}$ of nutrient $i\in\left\{1,...,I\right\}$.
#
# The problem is to find the diet that achieves the minimum intake of each nutrient at the cheapest price. If $q\in\mathbb{R}^{J}$ is a vector such that $q_{j}\geq0$ is the quantity (in dollars) of food $j$ purchased, the quantity of nutrient $i$ ingested is $\sum_{j=1}^{J}N_{ij}q_{j}$, and the cost of the diet is $\sum_{j=1}^{J}q_{j}$. Letting $1_J$ be the column vector of ones of size $J$, the optimal diet is therefore given by
#
# \begin{align*}
# \min_{q\geq 0} & ~1_J^{\top}q\\
# s.t.~ & Nq\geq d.\nonumber
# \end{align*}
#
# Before we tackle this problem, let's look at the more general formulation of a linear programming problem in standard form.
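# To fix ideas, here is the formulation above on a tiny made-up instance (the numbers below are purely illustrative, not from Stigler's data), solved with `scipy.optimize.linprog`. Since `linprog` expects constraints of the form $A_{ub}q\leq b_{ub}$, we pass $-N$ and $-d$:

```python
import numpy as np
from scipy.optimize import linprog

# Made-up toy instance: 2 nutrients, 3 foods.
# N[i, j] = amount of nutrient i per dollar of food j; d = minimum daily intakes.
N = np.array([[2.0, 1.0, 3.0],
              [1.0, 2.0, 1.0]])
d = np.array([4.0, 3.0])

# min 1_J' q  s.t.  N q >= d, q >= 0, rewritten as -N q <= -d for linprog
res = linprog(c=np.ones(3), A_ub=-N, b_ub=-d, bounds=(0, None))
print(res.x, res.fun)  # optimal diet and its cost
```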
# ## A Crash Course on Linear Programming
#
# ### Linear programming in standard form
#
# Let $c\in\mathbb{R}^{n}$, $d\in\mathbb{R}^{m}$, $A$ be a $m\times n$ matrix, and consider the following problem
#
# \begin{align}
# V_{P} = \max_{x\in\mathbb{R}_{+}^{n}} & \, c^{\top} x \\
# s.t.~Ax & = d
# \end{align}
#
# This problem is a *linear programming problem*, as the objective function, namely $x\rightarrow c^{\top}x$, is linear, and as the constraints, namely $x\in\mathbb{R}_{+}^{n}$ and $Ax=d$, are also linear (or more accurately, affine). This problem is called the *primal program*, for reasons to be explained soon. The $x$'s that satisfy the constraints are called *feasible solutions*; the solutions of the primal problem are called *optimal solutions*.
#
# #### Remarks
#
# * The previous diet problem can be reformulated into this form - why?
# * A problem does not necessarily have a feasible solution (e.g. if $A=0$ and $d\neq0$), in which case (by convention) $V_{P}=-\infty$.
# * The problem may be unbounded (e.g. if $A=0$, $d=0$, and $c$ has a positive entry), in which case $V_{P}=+\infty$.
#
# ### Duality
#
# There is a powerful tool called *duality* which provides much insight into the analysis of the primal problem. The idea is to rewrite the problem as
#
# \begin{align*}
# V_{P}=\max_{x\in\mathbb{R}_{+}^{n}}\left\{ c^{\top}x+L_{P}\left( d-Ax\right) \right\}
# \end{align*}
#
# where $L_{P}\left(z\right)$ is a penalty function whose value is zero if the constraint is met, that is if $z=0$, and $-\infty$ if it is not, namely if $z\neq0$. The simplest choice of such penalty function is given by $L_{P}\left( z\right) =\min_{y\in\mathbb{R}^{m}}\left\{ z^{\top}y\right\}$. One has
#
# \begin{align*}
# V_{P}=\max_{x\in\mathbb{R}_{+}^{n}}\min_{y\in\mathbb{R}^{m}}\left\{c^{\top}x+\left( d-Ax\right) ^{\top}y\right\} .
# \end{align*}
#
# However, the minimax inequality $\max_{x}\min_{y}\leq\min_{y}\max_{x}$ always holds, thus
#
# \begin{align*}
# V_{P} & \leq\min_{y\in\mathbb{R}^{m}}\max_{x\in\mathbb{R}_{+}^{n}}\left\{ c^{\top}x+\left( d-Ax\right) ^{\top}y\right\} =\min_{y\in\mathbb{R}^{m}}\max_{x\in\mathbb{R}_{+}^{n}}\left\{ x^{\top}\left( c-A^{\top}y\right) +d^{\top}y\right\} \\
# & \leq\min_{y\in\mathbb{R}^{m}}\left\{ d^{\top}y+L_{D}\left( c-A^{\top}y\right) \right\} =:V_{D}
# \end{align*}
#
# where $L_{D}\left(z\right) = \max_{x\in\mathbb{R}_{+}^{n}}\left\{x^{\top}z\right\}$ is equal to $0$ if $z\in\mathbb{R}_{-}^{n}$, and to $+\infty$ if not. Therefore, the value $V_{D}$ is expressed by the *dual program*
#
# \begin{align}
# V_{D}=\min_{y\in\mathbb{R}^{m}} & \, d^{\top}y, \\
# s.t.~A^{\top}y & \geq c
# \end{align}
#
# and the weak duality inequality $V_{P}\leq V_{D}$ holds.
#
# It turns out that as soon as either the primal or the dual program has an optimal solution, then both programs have an optimal solution and the values of the two programs coincide, so the weak duality inequality becomes an equality $V_{P}=V_{D}$, called strong duality. Further, if $x^{\ast}\in\mathbb{R}_{+}^{n}$ is an optimal primal solution, and $y^{\ast}\in\mathbb{R}^{m}$ is an optimal dual solution, then complementary slackness holds, that is $x_{i}^{\ast}>0$ implies $\left( A^{\top}y^{\ast}\right) _{i}=c_{i}$.
#
# ### Duality theorem
#
# We summarize these results into the following statement.
#
# ---
# **Theorem.** In the setting described above:
#
# 1. The weak duality inequality holds:
#
# \begin{align}
# V_{P}\leq V_{D}.
# \end{align}
#
# 2. As soon as the primal or the dual program has an optimal solution, then both programs have an optimal solution, and strong duality holds:
#
# \begin{align}
# V_{P}=V_{D}.
# \end{align}
#
# 3.
If $x^{\ast}\in\mathbb{R}_{+}^{n}$ is an optimal primal solution, and $y^{\ast}\in\mathbb{R}^{m}$ is an optimal dual solution, then complementary slackness holds:
#
# \begin{align}
# x_{i}^{\ast}>0\text{ implies }\left( A^{\top}y^{\ast}\right) _{i}=c_{i}.
# \end{align}
#
# ---
#
# ## Back to the diet problem
#
# Recall the optimal diet problem
#
# \begin{align*}
# \min_{q\geq0} & \, c^{\top}q\\
# s.t.~ & Nq\geq d,
# \end{align*}
#
# where here $c=1_J$. This problem has minimax formulation $\min_{q\geq0}\max_{\pi\geq0}\left\{ c^{\top}q+d^{\top}\pi-q^{\top}N^{\top}\pi\right\}$, so the dual is
#
# \begin{align*}
# \max_{\pi\geq0} & \, d^{\top}\pi\\
# s.t.~ & N^{\top}\pi\leq c.
# \end{align*}
#
# Complementary slackness yields:
#
# * $q_{j}>0$ implies $\left( N^{\top}\pi\right) _{j}=c_{j}$; that is, if natural food $j$ is actually purchased, then the prices of its synthetic and natural versions coincide.
# * $\pi_{i}>0$ implies $\left( Nq\right) _{i}=d_{i}$; that is, if nutrient $i$ has a positive price, then the natural diet has the "just right" amount of it.
#
# ## Interpretation of duality
#
# Imagine that there is a new firm called Nutrient Shoppe, which sells raw nutrients. Let $\pi_{i}$ be the price of nutrient $i$. Consumers purchase raw nutrients and can generate "synthetic foods"; the cost of the synthetic diet is $d^{\top}\pi$. The cost of the synthetic version of food $j$ is $\sum_{i=1}^{I}N_{ij}\pi_{i}=\left(N^{\top}\pi\right)_{j}$, so the dual constraint means that each "synthetic food" is more affordable than its natural counterpart.
#
# Strong duality means that it is possible to price the nutrients so that the synthetic foods are cheaper than the natural ones, in such a way that the price of the synthetic diet equals the price of the natural diet.
#
# ## Numerically solving the diet problem
#
# To solve the primal problem we need to construct the objects $N$ and $d$. $c$ is simply a vector of ones whose size is the number of commodities. $N$ is the matrix of the amount of each nutrient in each commodity.
$d$ is the vector of required daily allowances of each nutrient.

# In[7]:

commodity = (thedata['Commodity'].to_numpy())[:-1]
intake = thedata.iloc[:-1, 4:].fillna(0).transpose().to_numpy()
allowance = thedata.iloc[-1, 4:].fillna(0).transpose()
allowance

# ### Finding a solver
#
# There are many solvers available. Some are open-source, some are commercial, even though the latter often come for free for academic use. `scipy` has a linear programming solver, `scipy.optimize.linprog`, which we don't recommend except for small problems.
#
# Actually, with a problem of the (tiny) size that we are discussing in this introduction, the choice of the solver does not matter: all of them will return a solution within a fraction of a second. But later on, we shall deal with problems with hundreds of thousands of variables or constraints, and the choice of the right linear programming solver will become crucial. We will need *large scale* solvers, which can handle sparse constraint matrices (often the case, as we shall see).
#
# See a benchmark of large scale solvers at:
#
# http://plato.asu.edu/ftp/lpsimp.html
#
# As we can see, there is a factor of 100 between the average running time of the best performer (COPT, a commercial solver) and the worst performer of the list (GLPK, an open-source solver). [Gurobi](http://www.gurobi.com/), also a commercial solver with a free academic license, is among the best performers, and widely available in Python, R, C++ and many other languages. We shall use it as our linear programming solver of choice for the entirety of this course.
#
# To access Gurobi from a docker container, you need a token server license. The system administrator of your university can help you set one up. For the purpose of this course, the Gurobi team has graciously provided us with educational licenses so that you can start working immediately. Please keep in mind that these licenses can only be used for the purpose of this course.
They should not be communicated to anyone else. They expire January 31, 2021.
#
# If you do not have access to a full Gurobi license, or if you are running this notebook from Colab, you may still install Gurobi using `pip`, with limited functionality (namely, up to 2,000 variables and 2,000 constraints).
#
# Assuming we've installed the Gurobi solver and sorted out the license, let's first load the Gurobi library.

# In[22]:

import gurobipy as grb

# Define the model and call the solver:

# In[23]:

m = grb.Model('optimalDietMatrix')
x = m.addMVar(shape=commodity.shape)  # one nonnegative variable per commodity
m.setObjective(x.sum(), grb.GRB.MINIMIZE)
constr = m.addConstr(sp.csr_matrix(intake) @ x >= np.array(allowance), name='c')
m.optimize()

# We are after the optimal solution `x`, the dual solution `pi` and the value function `objval`.

# In[24]:

x = m.getAttr('X')
pi = m.getAttr('Pi')

# In[25]:

print('***Optimal solution***')
total = 0
thelist = []
for i, thecom in enumerate(commodity):
    if x[i] > 1e-8:
        total += x[i] * 365
        thelist.append([thecom, x[i] * 365])
thelist.append(['Total cost (optimal):', total])
pd.DataFrame(thelist)

# As promised, we achieve the minimum cost bundle at $\$39.67$ per year in $1939$ dollars. We can compare this to Stigler's solution, which was:
#
# |Food| Annual Quantities| Annual Cost|
# | ---------- | ------------------ | ------------ |
# | Wheat Flour | 370 lb.| \$13.33 |
# | Evaporated Milk | 57 cans | \$3.84 |
# |Cabbage| 111 lb. |\$4.11|
# |Spinach| 23 lb. |\$1.85|
# |Dried Navy Beans| 285 lb. | \$16.80|
# |Total Annual Cost|   | \$39.93 |
#
# **Exercise**. Recover the solution with `scipy`'s linear programming solver, namely `scipy.optimize.linprog`.
# In[31]:

# answer:
from scipy.optimize import linprog as lp

res = lp(np.ones(len(intake[0])), A_ub=-intake, b_ub=-allowance)
x = res.get('x')
total = 0
diet = []
for i, com in enumerate(commodity):
    if x[i] > 1e-8:
        total += x[i] * 365
        diet.append([com, x[i] * 365])
print(pd.DataFrame(diet))
print('Total cost: ' + str(total))
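# Finally, the duality theorem and the complementary slackness conditions can be verified numerically on a small made-up instance (illustrative numbers, not Stigler's data). With the HiGHS method, `linprog` reports the dual values of the inequality constraints in `res.ineqlin.marginals`; since we pass the constraints as $-Nq\leq-d$, the nutrient shadow prices are $\pi=-{}$`marginals`:

```python
import numpy as np
from scipy.optimize import linprog

# Made-up toy diet instance: 2 nutrients, 3 foods, unit costs c = 1
N = np.array([[2.0, 1.0, 3.0],
              [1.0, 2.0, 1.0]])
d = np.array([4.0, 3.0])
c = np.ones(3)

# Primal: min c'q  s.t.  N q >= d, q >= 0
res = linprog(c, A_ub=-N, b_ub=-d, bounds=(0, None), method='highs')
q = res.x
pi = -res.ineqlin.marginals  # shadow prices of the nutrients, pi >= 0

# Strong duality: cost of the natural diet = value of the synthetic diet
assert abs(c @ q - d @ pi) < 1e-8

# Complementary slackness:
# q_j > 0 implies the synthetic and natural prices of food j coincide
for j in range(len(c)):
    if q[j] > 1e-8:
        assert abs((N.T @ pi)[j] - c[j]) < 1e-8
# pi_i > 0 implies the intake of nutrient i is exactly d_i
for i in range(len(d)):
    if pi[i] > 1e-8:
        assert abs((N @ q)[i] - d[i]) < 1e-8
print('strong duality and complementary slackness verified')
```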