Notebook

Block 4: optimal assignments¶

Alfred Galichon (NYU & Sciences Po)¶

'math+econ+code' masterclass on optimal transport and economic applications¶

© 2018-2021 by Alfred Galichon. Past and present support from NSF grant DMS-1716489, ERC grant CoG-866274, and contributions by Jules Baudet, Pauline Corblet, Gregory Dannay, and James Nesbit are acknowledged.

With python code examples¶

Optimal Transport I: Discrete Transport¶

Learning Objectives¶

Optimal assignment problem
Pairwise stability, Walrasian equilibrium
Computation

References¶

Galichon, Optimal Transport Methods in Economics, Ch. 3
Roth, Sotomayor(1990). Two-Sided Matching. Cambridge.
Koopmans and Beckmann (1957). "Assignment problems and the location of economic activities." Econometrica.
Shapley and Shubik (1972). The assignment game I: The core." IJGT.
Becker (1993). A Treatise of the Family. Harvard.
Gretsky, Ostroy, and Zame (1992). "The nonatomic assignment model." Economic Theory.
Burkard, Dell'Amico, and Martello (2012). Assignment Problems. SIAM.
Dupuy and Galichon (2014). "Personality traits and the marriage market." JPE.

Motivation¶

Optimal Transport¶

Consider the problem of assigning a possibly infinite number of workers and firms.

Each worker should work for one firm, and each firm should hire one worker.
Workers and firms have heterogenous characteristics; let $x\in \mathcal{X}$ and $y\in\mathcal{Y}$ be the characteristics of workers and firms respectively.
Workers and firms are in equal mass, which is normalized to one. The distribution of worker's types is $P$, and the distribution of the firm's types is $Q$, where $P$ and $Q$ are probability measures on $\mathcal{X}$ and $\mathcal{Y}$.

It is assumed that if a worker $x$ matches with a firm $y$, the total output generated is $\Phi_{xy}$. The questions are then:

Optimality: what is the optimal assignment in the sense that it maximizes the overal output generated?
Equilibrium: what are the equilibrium assignment and the equilibrium wages?
Efficiency: do these two notions coincide?

The same tools have been used by Gary Becker to study the heterosexual marriage market, where $x$ is the man's characteristics, and $y$ is the woman's characteristics, and "wages" are replaced by "transfers".

Data¶

In this block, we shall take a first look at marriage data (while a worker-firm example will be seen in next block). Dupuy and Galichon (JPE, 2014) study a marriage dataset where, in addition to usual socio-demographic variables (such as education and age), measures of personality traits are reported.

The literature on quantitative psychology argues that one can capture relatively well an individual's personality along five dimensions, the "big 5" - consciousness, extraversion, agreableness, emotional stability, autonomy - assessed though a standardized questionaire.
In addition to this, we observed a (self-assessed) measure of health, risk-aversion, education, height and body mass index = weight in kg/ (height in m)$^2$. In total, the available characteristics $x_{i}$ of man $i$ and $y_{j}$ of woman $j$ are both $10$-dimensional vectors.
It is assumed that the surplus of interaction is given by $\Phi\left(x_{i},y_{j}\right) =x_{i}^{\intercal}Ay_{j}$, where $A$ is a given $10 \times 10$ matrix. (later in this course, we'll see how to estimate $A$ based on matched marital data).

Today, we solve a central planner's problem (a stylized version of the problem OKCupids would solve): given a population of men and a population of women, how do we mutually assign these in order to:

maximize matching surplus
attain a (hopefully) stable assignment.

In [3]:

import pandas as pd
import numpy as np
import scipy.sparse as spr
import os
# !python -m pip install -i https://pypi.gurobi.com gurobipy ## only if Gurobi not here
import gurobipy as grb

In [4]:

thepath = os.getcwd()

#thepath = os.path.join(os.getcwd(),'data_mec_optim/marriage_personality-traits/')
thepath = 'https://raw.githubusercontent.com/math-econ-code/mec_optim_2021-01/master/data_mec_optim/marriage_personality-traits/'

data_X = pd.read_csv(thepath + "Xvals.csv")
data_Y = pd.read_csv(thepath + "Yvals.csv")
affdf = pd.read_csv(thepath + "affinitymatrix.csv")

In [5]:

data_X.head()

Out[5]:

	educm	heightm	BMIm	healthm	consm	extram	agreem	emom	autom	riskym
0	2	186	28.905075	3	-0.752877	-0.360787	-0.711276	-0.291031	0.840217	0.479437
1	2	176	27.440599	3	0.345542	-0.805524	-0.251796	-0.305475	-0.064454	0.030303
2	3	187	23.163374	3	-0.759678	0.898007	-0.029462	-0.672859	-0.961691	-0.556598
3	1	184	29.241493	2	-0.455688	-1.053375	-0.041612	0.436133	0.121873	0.992084
4	1	174	23.781214	4	-1.440239	1.163730	0.293750	-0.538922	0.782285	-1.401034

The Discrete Monge-Kantorovich Theorem¶

Assume that the type spaces $\mathcal{X}$ and $\mathcal{Y}$ are finite, so $\mathcal{X=}\left\{ 1,...,N\right\} $, and $\mathcal{Y}=\left\{1,...,M\right\}$.

The mass of workers of type $x$ is $p_{x}$; the mass of jobs of type $y$ is $q_{y}$, with $\sum_{x}p_{x}=\sum_{y}q_{y}=1$.

Let $\pi_{xy}$ be the mass of workers of type $x$ assigned to jobs of type $y$. Every worker is busy and every job is filled, thus

\begin{align*} \pi\in\mathcal{M}\left(p,q\right) :=\left\{ \pi_{xy}\geq 0, \sum_{y\in\mathcal{Y}}\pi_{xy}=p_{x}\text{ and }\sum_{x\in\mathcal{X}}\pi _{xy}=q_{y} \right\}. \end{align*}

(Note that this formulation allows for mixing, i.e. it allows for $\pi_{xy}>0$ and $\pi_{xy^{\prime}}>0$ to hold simultaneously with $y\neq y^{\prime}$.)

Assume the economic output created when assigning worker $x$ to job $y$ is $\Phi_{xy}$. Hence, the optimal assignment is

\begin{align*} \max_{\pi\geq0} & \sum_{xy}\pi_{xy}\Phi_{xy} \\ s.t.~ & \sum_{y\in\mathcal{Y}}\pi_{xy}=p_{x} \left[u_{x}\right] \\ & \sum_{x\in\mathcal{X}}\pi_{xy}=q_{y}~\left[v_{y}\right] \end{align*}

Duality¶

Theorem

The value of the primal problem coincides with the value of the dual problem

\begin{align*} \min_{u,v} & \sum_{x\in\mathcal{X}}p_{x}u_{x}+\sum_{y\in\mathcal{Y}} q_{y}v_{y}\\ s.t. & u_{x}+v_{y}\geq\Phi_{xy}~\left[\pi_{xy}\geq0\right] \end{align*}
Both the primal and the dual problems have optimal solutions. If $\pi$ is a solution to the primal problem and $\left(u,v\right)$ is a solution to the dual problem, then by complementary slackness,

\begin{align*} \pi_{xy}>0\text{ implies }u_{x}+v_{y}=\Phi_{xy}. \end{align*}

Proof. The proof follows from the min-cost flow duality result, but let us rewrite it anyway.

The value of the primal problem can be written as $\max_{\pi\geq0}\min_{u,v}S\left( \pi,u,v\right)$, where

\begin{align*} S\left( \pi,u,v\right) :=\sum_{xy}\pi_{xy}\Phi_{xy}+\sum_{x\in\mathcal{X}}u_{x}(p_{x}-\sum_{y\in\mathcal{Y}}\pi_{xy})+\sum_{y\in\mathcal{Y}}v_{y}(q_{y}-\sum_{x\in\mathcal{X}}\pi_{xy}) \end{align*}

but by the minmax theorem, this value is equal to $\min_{u,v}\max_{\pi\geq 0}S\left( \pi,u,v\right)$, which is the value of the dual problem.
Follows by noting that, for a primal solution $\pi$ and a dual solution $\left( u,v\right) $, then $S\left( \pi,u,v\right) =\sum_{xy}\pi_{xy} \Phi_{xy}$. $\blacksquare$

Remarks. Note that this result is the min-cost flow duality theorem in the bipartite case, as seen in lecture $2$, after setting transportation cost through $xy\in\mathcal{X}\times\mathcal{Y}$ to $c_{xy}=-\Phi_{xy}$, and $n_{t}=-p_{t}1\left\{ t\in\mathcal{X}\right\} +q_{t}\mathbf{1}\left\{ t\in\mathcal{Y}\right\}$. We see various new interpretations of the result.

The following statements are equivalent:

$\pi$ is an optimal solution to the primal problem, and $\left(

u,v\right) $ is an optimal solution to the dual problem, and

(i) $\pi\in M\left( p,q\right) $
(ii) $u_{x}+v_{y}\geq\Phi_{xy}$
(iii) $\pi_{xy}>0$ implies $u_{x}+v_{y}\leq\Phi_{xy}$.

We saw the direct implication. But the converse is easy: take $\pi$ and $\left( u,v\right) $ satisfying (i)--(iii), Then one has

\begin{align*} dual\leq\sum_{x}p_{x}u_{x}+\sum_{y}q_{y}v_{y}=\sum_{xy}\pi_{xy}\left( u_{x}+v_{y}\right) \leq\sum_{xy}\pi_{xy}\Phi_{xy}\leq primal \end{align*}

but by the MK duality theorem, both ends coincide. Thus $\pi$ is optimal for the primal and $\left( u,v\right) $ for the dual.

Unassigned Agents¶

A important variant of the problem exists with $\sum_{x\in\mathcal{X}}p_{x}\neq\sum_{y\in\mathcal{Y}}q_{y}$ and the primal constraints become inequality constraints. The duality then becomes

\begin{align*} \begin{array} [c]{rrr} \max_{\pi\geq0}\sum\pi_{xy}\Phi_{xy} & = & \min_{u,v}\sum_{x\in\mathcal{X} }p_{x}u_{x}+\sum_{y\in\mathcal{Y}}q_{y}v_{y}\\ s.t.~\sum_{y\in\mathcal{Y}}\pi_{xy}\leq p_{x} & & u\geq0,~v\geq0 \\ \sum_{x\in\mathcal{X}}\pi_{xy}\leq q_{y} & & u_{x}+v_{y}\geq\Phi_{xy} \end{array} \end{align*}

Pairwise stability¶

In a marriage context, an important concept is stability:

An outcome is a vector $\left( \pi,u,v\right) $, where $u_{x}$ and $v_{y}$ are $x$'s and $y$'s payoffs, and $\pi$ is a matching that is

\begin{align*} \pi\in\mathcal{M}\left( p,q\right). \end{align*}

A pair $xy$ is blocking if $x$ and $y$ can find a way of sharing their joint surplus $\Phi_{xy}$ in such a way that $x$ gets more than $u_{x}$ and $y$ gets more than $v_{y}$. Hence there is no blocking pair if and only if for every $x$ and $y$, one has

\begin{align*} u_{x}+v_{y}\geq\Phi_{xy}. \end{align*}

If $x$ and $y$ are actually matched, their utilities $u_{x}$ and $v_{y}$ need to be feasible, i.e. the above inequality should be saturated. Hence

\begin{align*} \pi_{xy}>0\text{ implies }u_{x}+v_{y}=\Phi_{xy}. \end{align*}

Definition:

A matching that satisfies primal feasbilitity, no blocking, and complementary slackness is called a stable matching.

As it turns out, these conditions are precisely the conditions that express complementarity slackness in the Monge-Kantorovich problem. Therefore, outcome$\left( \pi,u,v\right) $ is stable if and only if $\pi$ is a solution to the primal problem, and $\left( u,v\right) $ is a solution to the dual problem.

Wage market interpretation 1¶

Back to the workers / firms interpretation and assume for now that workers are indifferent between any two firms that offer the same salary. We argue that $u\left( x\right) $ can be interpreted as the equilibrium wage of worker $x$, while $v\left( y\right) $ can be interpreted as the equilibrium profit of firm $y$. Indeed:

Proposition If $\left(u,v\right)$ is a solution to the dual of the Kantorovich problem, then

\begin{align*} u_{x} & =\sup_{y\in\mathcal{Y}}\left( \Phi_{xy}-v_{y}\right) \label{conjug1}\\ v_{y} & =\sup_{x\in\mathcal{X}}\left( \Phi_{xy}-u_{x}\right) . \label{conjug2} \end{align*}

Therefore, $u_{x}$ can be interpreted as equilibrium wage of worker $x$, and $v_{y}$ as equilibrium profit of firm $y$. In this interpretation, all workers get the same wage at equilibrium.

Wage market interpretation 2¶

Assume now that if a worker of type $x$ works for a firm of type $y$ for wage $w_{xy}$, then gets $\alpha_{xy}+w_{xy}$, where $\alpha_{xy}$ is the nonmonetary payoff associated with working with a firm of type $y$. The firm's profit is $\gamma_{xy}-w_{xy}$, where $\gamma_{xy}$ is the economic output.

If an employee of type $x$ matches with a firm of type $y$, they generate joint surplus $\Phi_{xy}$, given by%

\begin{align*} \Phi_{xy}=\underset{\text{employee's payoff}}{\underbrace{\alpha_{xy}+w_{xy}}}+\underset{\text{firm's payoff}}{\underbrace{\gamma_{xy}-w_{xy}}}=\alpha_{xy}+\gamma_{xy} \end{align*}

which is independent from $w$.

Workers choose firms which maximize their utility, i.e. solve

\begin{align*} u_{x}=\max_{y}\left\{ \alpha_{xy}+w_{xy}\right\} \end{align*}

and $u_{x}=\alpha_{xy}+w_{xy}$ if $x$ and $y$ are matched. Similarly, the indirect payoff vector of firms is

\begin{align*} v_{y}=\max_{x}\left\{ \gamma_{xy}-w_{xy}\right\} \end{align*}

and, again, $v_{y}=\gamma_{xy}-w_{xy}$ if $x$ and $y$ are matched.

As a result,

\begin{align*} u_{x}+v_{y}\geq\alpha_{xy}+\gamma_{xy}=\Phi_{xy} \end{align*}

and equality holds if $x$ and $y$ are matched. Thus, if $w_{xy}$ is an equilibrium wage, then the triple $\left( \pi,u,v\right)$ where $\pi$ is the corresponding matching, and $u_{x}$ and $v_{y}$ are defined by this and this defines a stable outcome.

Conversely, let $\left(\pi,u,v\right)$ be a stable outcome. Then let $\bar{w}_{xx}$ and \b{w}$_{xy}$ be defined by

\begin{align*} \bar{w}_{xy}=u_{x}-\alpha_{xy}\text{ and }w^l_{xy}=\gamma_{xy}-v_{y}. \end{align*}

One has $\bar{w}_{xy}\geq b^l_{xy}$. Any $w_{xy}$ such that $\bar{w}_{xy}\geq w_{xy}\geq b^l_{xy}$ is an equilibrium wage. Indeed, $\pi_{xy}>0$ implies $\bar{w}_{xy} \geq b^l_{xy}$, thus this and this hold. Given $u$ and $v$, $w_{xy}$ is uniquely defined on the equilibrium path (ie. when $x$ and $y$ are such that $\pi_{xy}>0$), but there are multiple choices of $w$ outside the equilibrium path.

Note that all workers of the same type get the same indirect utility, but not necessarly the same wage.

Application¶

We postulate that the form of the surplus function is \begin{align*} \Phi_{ij}=x_{i}^{\intercal} Ay_{j} \end{align*} where $x_{i}$ and $y_{j}$ are the 10-dimensional characteristics of man $i$ and woman $j$, and the form of $A$, a 10x10 matrix, is given (it is stored in the file affinitymatrix.csv). Again, we'll see later how to solve the econometrics problem of estimating $A$.

In [6]:

nbcar = 10
nobs = 10
affmat = affdf.iloc[0:nbcar,1:nbcar+1].to_numpy()

In [7]:

sdX = data_X.std().to_numpy()
sdY = data_Y.std().to_numpy()
mX = data_X.mean().to_numpy()
mY = data_Y.mean().to_numpy()

Xvals = ((data_X-mX)/sdX).to_numpy()[0:nobs, :]
Yvals = ((data_Y-mY)/sdY).to_numpy()[0:nobs, :]

In [8]:

nobs = Xvals.shape[0]
Phi = Xvals @ affmat @ Yvals.T

This problem of computation of the Optimal Assignment Problem, more specifically of $\left(\pi,u,v\right)$, is arguably the most studied problem in Computer Science, and dozens, if not hundreds of algorithms exist, whose running time is polynomial in $\max\left(n,m\right)$, typically a power less than three of the latter.

Famous algorithms include: the Hungarian algorithm (Kuhn-Munkres); Bertsekas' auction algorithm; Goldberg and Kennedy's pseudoflow algorithm. For more on these, see the book by Burkard, Dell'Amico, and Martello, and a introductory presentation in http://www.assignmentproblems.com/doc/LSAPIntroduction.pdf.

Here, we will show how to solve the problem with the help of a Linear Programming solver used as a black box; our challenge here will be to carefully set up the constraint matrix as a sparse matrix in order to let a large scale Linear Programming solvers such as Gurobi recognize and exploit the sparsity of the problem.

Let $\Pi$ and $\Phi$ be the matrices with typical elements $\left(\pi_{xy}\right) $ and $\left( \Phi_{xy}\right) $. We let $p$, $q$, $u$,$v$, and $1$ the column vectors with entries $\left( p_{x}\right)$, $\left( q_{y}\right) $, $\left( u_{x}\right) $, $\left( v_{y}\right)$, and $1$, respectively. The optimal assignment problem

\begin{align*} \max_{\pi\geq0} & \sum_{xy}\pi_{xy}\Phi_{xy} \\ s.t.~ & \sum_{y\in\mathcal{Y}}\pi_{xy}=p_{x}~\left[ u_{x}\right]\\ & \sum_{x\in\mathcal{X}}\pi_{xy}=q_{y}~\left[ v_{y}\right] \end{align*}

Can be rewritten writes using matrix algebra as

\begin{align*} & \max_{\Pi\geq0}Tr\left( \Pi^{\top}\Phi\right) \\ & \Pi1_{M}=p\\ & 1_{N}^{\top}\Pi=q^{\top}. \end{align*}

In [9]:

obj = Phi.flatten()

Recall that if $A$ is a $M\times p$ matrix and $B$ a $N\times q$ matrix, and if $vec(A)$ vectorizes in the row-major order (i.e. concatenates the rows of $A$) then the Kronecker product $A\otimes B$ of $A$ and $B$ is a $mn\times pq$ matrix such that

\begin{align*} vec\left( AXB^{\top}\right) =\left( A\otimes B\right) vec\left( X\right). \label{VecAndKronecker} \end{align*}

In python, $A\otimes B$ is implemented by sparse.kron(A,B) of the library sparse of scipy.

The first constraint $I_{N}\Pi1_{M}=p$, therefore vectorizes under the row-major order as

\begin{align*} \left( I_{N} \otimes1_{M}^{\top}\right) vec\left( \Pi\right) =vec\left( p\right), \end{align*}

and similarly, the second constraint $1_{N}^{\top}\Pi I_{M}=q^{\top}$, vectorizes as

\begin{align*} \left( 1_{N}^{\top} \otimes I_{M}\right) vec\left( \Pi\right) =vec\left( q\right) . \end{align*}

Note that the matrix $I_{N} \otimes1_{M}^{\top}$ is of size $N\times NM$, and the matrix $ 1_{N}^{\top} \otimes I_{M}$ is of size $M\times NM$; hence the full matrix involved in the left-hand side of the constraints is of size $\left( N+M\right) \times NM$. In spite of its large size, this matrix is sparse. In python, the identity matrix $I_{N}$ is coded as scipy.sparse.identity(N).

In [10]:

N = Phi.shape[0]
M = Phi.shape[1]

A1 = spr.kron(spr.identity(N), np.ones(M))
A2 = spr.kron(np.ones(N),spr.identity(M))
A = spr.vstack([A1, A2])

p = 1/nobs * np.ones(nobs)
q = 1/nobs * np.ones(nobs)
d = np.concatenate((p,q), axis = None)

Setting $z=vec\left( \Pi\right)$, the Linear Programming problem then becomes

\begin{align*} & \max_{z\geq0}vec\left( \Phi\right) ^{\top}z\label{LPvectorized}\\ s.t.~ & \left( 1_{M}^{\top}\otimes I_{N}\right) z=vec\left( p\right) \nonumber\\ & \left( I_{M}\otimes1_{N}^{\top}\right) z=vec\left( q^{\top}\right) \nonumber \end{align*}

which is ready to be passed on to a linear programming solver such as Gurobi.

A LP solver typically computes programs of the form

\begin{align*} & \max_{z\geq0}c^{\top}z\label{standardLP}\\ & s.t.~Az=d.\nonumber \end{align*}

In [11]:

m=grb.Model('Marriage')
x = m.addMVar(shape=len(obj), name="x")
m.setObjective(obj @ x, grb.GRB.MAXIMIZE)
# we minus 1 for the nodes because of python's 0 indexing
rhs = d
m.addConstr(A @ x == rhs, name="Constr")
m.optimize()

Academic license - for non-commercial use only - expires 2022-11-14
Using license file c:\gurobi911\gurobi.lic
Gurobi Optimizer version 9.1.1 build v9.1.1rc0 (win64)
Thread count: 6 physical cores, 12 logical processors, using up to 12 threads
Optimize a model with 20 rows, 100 columns and 200 nonzeros
Model fingerprint: 0x32e2b3fc
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [6e-03, 3e+00]
  Bounds range     [0e+00, 0e+00]
  RHS range        [1e-01, 1e-01]
Presolve time: 0.00s
Presolved: 20 rows, 100 columns, 200 nonzeros

Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    2.8601661e+31   8.600000e+31   2.860166e+01      0s
      22    1.0705334e+00   0.000000e+00   0.000000e+00      0s

Solved in 22 iterations and 0.01 seconds
Optimal objective  1.070533447e+00

In [12]:

if m.status == grb.GRB.Status.OPTIMAL:
    solution = np.array(m.getAttr('x')).reshape(N,M)
    pi = m.getAttr('pi')

Who does man $1$ match with?

In [13]:

print('Woman', np.argwhere(solution[0,:] != 0)[0][0] + 1)

Woman 7