Notebook for the course on Computational Social Choice, demonstrating how to use SAT-solving technology to reason about voting rules. Prepared by Ulle Endriss, ILLC, University of Amsterdam (November 2023).
You need to be able to run Jupyter Notebooks for Python 3 and you need to have the PySAT toolkit installed on your machine.
The recommended way of using this notebook is to first restart the kernel and then run the entire notebook cell by cell. With the exception of two cells near the end of the notebook, this should not generate any error messages. Then go to any of the cells labelled "Try it!" and see what happens when you make changes to those cells and run them again.
Beware that some of the code is only explained on the slides and not here. Note that I followed to a 'minimalist' coding philosophy: trying to write as little code as possible and trying to make that code as easy to understand as possible. So this is neither maximally efficient (but easily fast enough for our present needs) nor maximally robust (but it works fine when used as intended).
We first import some extra functionality from standard Python modules, so we can easily (a) compute the factorial of a given number, (b) work with the list of all permutations of a given list, and (c) randomly shuffle a given list.
from math import factorial
from itertools import permutations
from random import shuffle
We want to use SAT-solving technology to reason about voting rules. In this section, we provide the methods required to make this technology available to us. Specifically, we define methods to check whether a given formula is satisfiable, to enumerate all models of a satisfiable formula, and to compute a minimally unsatisfiable subset (MUS) of an unsatisfiable set of clauses. Our methods are simple wrappers around more sophisticated versions of the same methods provided by PySAT. Note that none of this is specific to our intended application to social choice theory.
A SAT solver is a program that can determine whether a given formula of propositional logic is satisfiable. Recall that a formula $\varphi$ is satisfiable if there exists an assignment of truth values to the propositional variables occurring in $\varphi$ for which $\varphi$ evaluates to true. Formulas must be provided in conjunctive normal form (CNF). Recall that a formula in CNF is a conjunction ('and') of clauses, with a clause being a disjunction ('or') of literals. A literal, in turn, is either a propositional variable or the negation of a propositional variable.
In our code, we use positive integers to represent positive literals (i.e., propositional variables), negative integers to represent negative literals (i.e., negations of propositional variables), lists of integers to represent clauses, and lists of such lists to represent formulas in CNF. Thus, for example, the list [[1,-3,2],[-2]]
represents the formula $(p_1 \lor \neg p_3 \lor p_2) \land (\neg p_2)$. This is known as the DIMACS format.
We first import the required functionality from PySAT. The first import makes available the solver Glucose 3.0 (any of the other solvers should work just as well). The second makes available a module for working with weighted formulas in CNF, which is required for the implementation of the MUS extractor provided by means of the third import.
from pysat.solvers import Glucose3
from pysat.formula import WCNF
from pysat.examples.musx import MUSX
The method solve()
implemented below offers easy access to the corresponding method of our chosen SAT solver. If you apply this method to a formula $\varphi$ in CNF, it will return the string UNSATISFIABLE
in case $\varphi$ is unsatisfiable, and otherwise a model that satisfies $\varphi$. Any such a model is presented in the form of a list of positive and negative integers, indicating which propositional variables must be set to true and which must be set to false.
def solve(cnf):
solver = Glucose3()
for clause in cnf: solver.add_clause(clause)
if solver.solve():
return solver.get_model()
else:
return('UNSATISFIABLE')
Try it! You can change the CNF formula (i.e., the list of lists of integers) in the example below to test different formulas for their satisfiability.
solve([[1,2], [1,-2], [-1,2], [-1,-2]])
The method enumModels()
can be used to enumerate all satisfying models of a given formula in CNF. Keep in mind that this is a demanding operation and that the number of models can be huge. Note that the object returned by enumModels()
is an iterator (not a list). A simple way of inspecting it is to use the list()
method on the object returned by enumModels()
.
def enumModels(cnf):
solver = Glucose3()
for clause in cnf: solver.add_clause(clause)
return solver.enum_models()
Try it!
models = enumModels([[-1,-2,3], [2], [-3,1]])
print(list(models))
The method getMUS()
can be used to compute an MUS for a given formula in CNF (which should be unsatisfiable). Note that a given formula might have more than one MUS, and getMUS()
does not necessarily return the smallest MUS (because computing a cardinality-minimal MUS is a much more demanding operation).
def getMUS(cnf):
wcnf = WCNF()
for clause in cnf: wcnf.append(clause, 1)
mus = MUSX(wcnf,verbosity=0).compute()
return list(cnf[i-1] for i in mus)
Try it! The list we obtain represents an unsatisfiable set of clauses that becomes satisfiable as soon as we remove even a single clause from it.
getMUS([[-1,2], [1], [-3,4], [-2], [3,-4]])
In this section, we set up everything we need to represent (a) voters, (b) alternatives, (c) preferences of a single voter over the alternatives, and (d) profiles bundling the preferences of all voters. Observe that $n$ voters and $m$ alternatives, there are $m!$ possible preferences and $m!^n$ possible profiles.
We first fix n
, the number of voters, and m
, the number of alternatives. For much of what we are going to do, we shall assume n = 2
and m = 3
. Keep in mind that the number of objects we are dealing with grows exponentially in n
and superexponentially in m
, so things are unlikely to work well (or at all) for even moderately larger numbers.
n = 2
m = 3
The set of voters is simply the set of integers from 0
to n-1
.
def allVoters():
return range(n)
Similarly, the set of alternatives is simply the set of integers from 0
to m-1
.
def allAlternatives():
return range(m)
It will be convenient to also represent preferences and profiles as integers (just think of some enumeration of all preferences and assign integers to them, starting with 0
, and similarly for profiles). The method below can be used to return the set of all such integers representing profiles (we happen to not require a corresponding method for preferences).
def allProfiles():
return range(factorial(m) ** n)
Try it!
ps = allProfiles()
print(list(ps))
The next three methods allow us to retrieve lists of voters, alternatives, and profiles, respectively, that meet a certain condition. Such a condition can be any function mapping objects of the appropriate types to a boolean value.
def voters(condition):
return [i for i in allVoters() if condition(i)]
def alternatives(condition):
return [x for x in allAlternatives() if condition(x)]
def profiles(condition):
return [r for r in allProfiles() if condition(r)]
Try it! For example, using lambda expressions in Python to define suitable (boolean) conditions, we can retrieve the list of all alternatives that are different from alternative 1
.
alternatives(lambda x : x!=1)
Method preference()
can be used to retrieve the number representing the preference order reported by a given voter i
in a given profile r
. The way I have chosen to implement this is to think of the numbers representing profiles as being written down in an m!
-ary number system in which each digit represents the preference order of one voter (with voters being listed in inverse order, for technical convenience). For example, for n=2
and m=3
, profiles are 2-digit numbers with digits ranging from 0
to 5
. In this case, profile 50
(representing the number $5 \cdot 3!^1 + 0\cdot 3!^0 = 30$ in the decimal system) is the profile in which the second voter (corresponding to the first digit) has the last possible preference order (say, 2>1>0
) and the first voter (corresponding to the second digit) has the first possible preference order (say, 0>1>2
).
def preference(i, r):
base = factorial(m)
return ( r % (base ** (i+1)) ) // (base ** i)
Method iVariants()
can be used to check whether profiles r1
and r2
are i
-variants, i.e., to check whether in the two profiles all voters, except possibly voter i
, report the same preferences.
def iVariants(i, r1, r2):
return all(preference(j,r1) == preference(j,r2)
for j in voters(lambda j : j!=i))
Method preflist()
can be used to retrieve a list-representation of the preference order of a given voter i
in a given profile r
. This is implemented by generating the list of all permutations of the alternatives and then picking the $k$th element of that big list, where $k$ is obtained using preference(i,r)
.
def preflist(i, r):
preflists = list(permutations(allAlternatives()))
return preflists[preference(i,r)]
We now have everything in place to implement the top-level methods we will actually need to reason about preferences and profiles. First among them is prefers()
, which we can use to check whether voter i
prefers alternative x
to alternative y
in profile r
.
def prefers(i, x, y, r):
mylist = preflist(i, r)
return mylist.index(x) < mylist.index(y)
Try it! Let's check, for example, whether the second voter (1
) prefers the first alternative (0
) to the third alternative (2
) in profile number 30
.
prefers(1,0,2,30)
Method top()
can be used to check whether voter i
has alternative x
at the top of her preference order in profile r
.
def top(i, x, r):
mylist = preflist(i, r)
return mylist.index(x) == 0
Try it!
top(1,2,30)
The next two methods are simple pretty-print methods for preferences and profiles, respectively.
def strPref(i, r):
return ''.join(str(x) for x in preflist(i,r))
def strProf(r):
return '(' + ','.join(strPref(i,r) for i in allVoters()) + ')'
Try it!
strProf(30)
Method posLiteral()
can be used to return the positive literal corresponding to the statement that in profile r
it is the case that alternative x
is a winner (but not necessarily the only one). Method negLiteral()
returns the corresponding negative literal.
def posLiteral(r, x):
return r * m + x + 1
def negLiteral(r, x):
return (-1) * posLiteral(r, x)
Method strLiteral()
provide a simple pretty-print method for literals, and strClause()
extends this to clauses (i.e., to lists of literals).
def strLiteral(lit):
r = (abs(lit) - 1) // m
x = (abs(lit) - 1) % m
return ('not ' if lit<0 else '') + strProf(r) + '->' + str(x)
def strClause(clause):
return ' or '.join(strLiteral(lit) for lit in clause)
Try it!
strLiteral(-108)
Method cnfAtLeastOne()
returns the CNF encoding the statement that a voting rule must always return at least one winner.
def cnfAtLeastOne():
cnf = []
for r in allProfiles():
cnf.append([posLiteral(r,x) for x in allAlternatives()])
return cnf
Try it!
print(cnfAtLeastOne())
Method cnfResolute()
returns the CNF encoding the requirement that our voting rule must be resolute.
def cnfResolute():
cnf = []
for r in allProfiles():
for x in allAlternatives():
for y in alternatives(lambda y : x < y):
cnf.append([negLiteral(r,x), negLiteral(r,y)])
return cnf
The following axioms encode various axioms we might want our voting rule to satisfy. Refer to the slide deck for explanation.
def cnfStrategyProof():
cnf = []
for i in allVoters():
for r1 in allProfiles():
for r2 in profiles(lambda r2 : iVariants(i,r1,r2)):
for x in allAlternatives():
for y in alternatives(lambda y : prefers(i,x,y,r1)):
cnf.append([negLiteral(r1,y), negLiteral(r2,x)])
return cnf
def cnfSurjective():
cnf = []
for x in allAlternatives():
cnf.append([posLiteral(r,x) for r in allProfiles()])
return cnf
def cnfNonDictatorial():
cnf = []
for i in allVoters():
clause = []
for r in allProfiles():
for x in alternatives(lambda x : top(i,x,r)):
clause.append(negLiteral(r,x))
cnf.append(clause)
return cnf
def most(bools):
return sum(bools) > len(list(bools)) / 2
def cnfMajoritarian():
cnf = []
for x in allAlternatives():
for r in profiles(lambda r : most(list(top(i,x,r) for i in allVoters()))):
cnf.append([posLiteral(r,x)])
return cnf
The dictionary axiom
associates names of axioms (i.e., strings) with methods generating the formulas encoding those axioms. This will be useful later on when we want to determine which axiom was responsible for a given clause to have been generated. (For convenience and by a slight abuse of terminology, we are including the requirements of always returning at least one winner and to be resolute in our list of axioms here.)
axiom = {
'AtLeastOne' : cnfAtLeastOne(),
'Resolute' : cnfResolute(),
'StrategyProof' : cnfStrategyProof(),
'Surjective' : cnfSurjective(),
'NonDictatorial' : cnfNonDictatorial(),
'Majoritarian' : cnfMajoritarian()
}
For a given clause or CNF, respectively, the next two methods allow us to display a simple explanation for how this formula has been generated.
def explainClause(clause):
reason = next((k for k in axiom if clause in axiom[k]), 'None')
print(reason + ': ' + strClause(clause))
def explainCNF(cnf):
for clause in cnf: explainClause(clause)
Try it!
explainClause([-10,-17])
Try it! We are now ready to generate the CNF corresponding to the Gibbard-Satterthwaite Theorem.
cnf = ( cnfAtLeastOne() + cnfResolute() + cnfStrategyProof() + cnfSurjective() + cnfNonDictatorial() )
len(cnf)
Try it! Let's ask the SAT solver to check whether the CNF we just generated is satisfiable. If it is not, we have found (an instance of) an impossibility theorem.
solve(cnf)
Try it! If our CNF indeed is unsatisfiable, then we can try to compute an MUS for it.
mus = getMUS(cnf)
len(mus)
Try it! The CNF below corresponds to the theorem by Geist and Peters on the incompatibility of strategyproofness and majoritarianism (which you can think of as a weakened version of the Gibbard-Satterthwaite Theorem). Recall that here the impossibility only kicks in once we set n = 3
.
cnf = ( cnfAtLeastOne() + cnfResolute() + cnfStrategyProof() + cnfMajoritarian() )
solve(cnf)
Try it! For n=2
and m=3
, the next cell will produce an error message, given that for these parameters the CNF we generated earlier in fact is satisfiable (so does not have an MUS). The intended use is for n=3
and m=3
.
mus = getMUS(cnf)
len(mus)
Try it! Sometimes we can find a smaller MUS by first shuffling our CNF.
shuffle(cnf)
mus = getMUS(cnf)
len(mus)
Try it! Let's print the last MUS we generated.
print(mus)
Try it! Now let's retrieve explanations for how the clauses in our MUS have been generated in the first place. Provided the MUS is reasonably short, it should be easy to transform this into a human-readable proof.
explainCNF(mus)