#!/usr/bin/env python # coding: utf-8 #
#
#
#
#
Crypto Trading and

Arbitrage Identification Strategies
#
#
March 29th @ Slash Hackathon 2019
#
# #
Questions? Contact: roman@celo.org
#
#
#
#
#
#
#

Table of Contents

#
# # Representations of a Market # Assume that there is a market with $N$ currencies. Also assume for simplicity that every possible pair can be traded (not a necessary assumption for the depicted representations). The exchange rates can be displayed by a source-to-target exchange rate matrix. Let us generate and analyze an exemplary such matrix and translate it into a graph representation. # In[1]: # set things up and create button # In[2]: # set some parameters, generate exchange rate matrix and the corresponding graph N = 6 # number of currencies max_spread_pct = 0.05 # maximum bid-ask spread in pct of bid, 0.05 for 5% # ## Exchange Rate Matrix Representation # The exchange rate matrix takes the current best bids and best asks in the market and generates a matrix of multipliers that shows how many units of the target currency one would receive for a unit of the source currency. # In[3]: # generate and display random exchange rate matrix # ## Directed Graph Representation # The above exchange rate matrix can be translated into a directed graph with nodes representing the currencies and the edge weights set to the exchange rate matrix multipliers. # In[4]: # display corresponding directed graph # An arbitrage is a cylce for which the **cumulative product** of the edge weights in an exchange rate graph is bigger than one. # ## Log-Transformed Representations # There are many well known graph algorithms that deal with cycle lengths - the **cumulative sum** of the edge weights. Fortunately, one can easily translate a cumulative product optimization problem into a convex cumulative sum optimzation problem by taking logs. To see this, note that for $a>0 1: break v = p[int(v)] return list(reversed(cycle)) # In[7]: # Check if arbitrage cycle exists using Bellman-Ford. If so, output an arbitrage cycle # ## Finding an "Optimal" Set of Cycles # Identifying the cycle with an arbitrary number of vertices with the **largest** arbitrage multiplier is NP-hard in general. Finding this cycle with many tradable pairs thus quickly becomes computationally infeasible and one has to resort to approximate/heuristic algorithms. # # Observe that finding the cycle with the largest multiplier (i.e. the cycle with the most negative cycle length in the graph with negative-log-transformed weights) of all cycles that visit each vertex exactly once is equivalent to the (asymmetric) [Travelling Salesman Problem](https://en.wikipedia.org/wiki/Travelling_salesman_problem) - a well known NP-hard problem. The Travelling Salesman Problem (TSP) with $n+1$ cities can be formulated as an integer linear program for a distance matrix $C$ as follows # \begin{align} # \min &\sum_{i=0}^n \sum_{j\ne i,j=0}^nc_{ij}x_{ij} && \\ # & 0 \le x_{ij} \le 1 && i,j=0, \cdots, n \\ # & u_{i} \in \mathbf{Z} && i=0, \cdots, n \\ # & \sum_{i=0,i\ne j}^n x_{ij} = 1 && j=0, \cdots, n \\ # & \sum_{j=0,j\ne i}^n x_{ij} = 1 && i=0, \cdots, n \\ # &u_i-u_j +nx_{ij} \le n-1 && 1 \le i \ne j \le n # \end{align} # # # # The TSP is only **NP-hard because we are asking for a single roundtrip solution** that covers all cities / currencies (the last two sets of constraints in the above formulation). But **subtours are fine in the arbitrage finding task** and even skipping some assets completely (a self-cylce) is not a problem! # If we remove the "no-subtours" constaints of the TSP and also account for the diagonal of $X$ in the objective function, we arrive at the linear programming relaxation of the [assignment problem](https://en.wikipedia.org/wiki/Assignment_problem) # \begin{align} # \min &\sum_{i=0}^n \sum_{j=0}^nc_{ij}x_{ij} && \\ # & 0 \le x_{ij} \le 1 && i,j=0, \cdots, n \\ # & \sum_{i=0}^n x_{ij} = 1 && j=0, \cdots, n \\ # & \sum_{j=0}^n x_{ij} = 1 && i=0, \cdots, n \\ # \end{align} # This linear program always has an integer solution (due to the totally unimodular linear constraint matrix) and thus solves the assignment problem. You can use a very specialized solver for the assignment problem (this notebook uses [this one](https://developers.google.com/optimization/assignment/simple_assignment) - extremely fast and exact solutions!) or just solve the linear program above - [CVXPY](https://www.cvxpy.org/) makes this very easy. # In[8]: # Alternative: Solve Assignment Problem with CVXPY X = Variable((len(C_ln),len(C_ln))) iota = np.ones((len(C_ln),1)) constraints = [X <= 1, X >= 0, X*iota == 1, np.transpose(iota)*X == 1] # Form objective. obj = Minimize(sum(multiply(X, C_ln))) # Form and solve problem. prob = Problem(obj, constraints) prob.solve() # Returns the optimal value. print ("status:", prob.status) print ("optimal value", prob.value) X = pd.DataFrame(X.value) display(X.round(3)) # In[9]: # Find best set of arbitrage cycles by solving the assignment problem # ## Beyond Arbitrage Cycles - Maximize wrt. Quantities (EXPERIMENTAL!) # Assume that we start out with one quantity in each currency. Let us find the set of trades that maximize the quantity in currency $k$ without changing the other quantities. For two matrices $A$ and $B$ of the same dimensions, let $A\circ B$ denote the element-wise product (also denoted [Hadamard product](https://en.wikipedia.org/wiki/Hadamard_product_(matrices))). # # Let $C$ denote the exchange rate matrix (not log-transformed) and $Q$ a matrix of the same size as $C$ where the entry $Q_{i,j}$ denotes the number of $i$ coins that flow from currency $i$ to currency $j$. Let $\mathbb{1}$ denote a vector of ones of size $N\times 1$, then vector of outflows from the respective currency inventories is # # \begin{align} # \text{outflows} = Q \mathbb{1}. # \end{align} # The vector of inflows (also in currency $i$ units) is # # \begin{align} # \text{inflows} = (Q \circ C)' \mathbb{1} # \end{align} # and hence the delta inventory is # # \begin{align} # \Delta\! I = (Q \circ C)' \mathbb{1} - Q \mathbb{1}. # \end{align} # # Let $I_{init}$ denote the initial inventory the maximizre # # \begin{align} # \max_{Q} \Delta\! I_k && \\ # & \Delta\! I_{\neq k} =0 && \text{| No change in inventory in other assets}\\ # & \Delta\! I_k \leq 1000 && \text{| Problem would be unbounded in case of arb-cycle}\\ \\ # \end{align} # # The nice thing about this setup is that it allows to easily incorperate new constraints like the blocksize of the observed quotes. The drawback is that this is numerically extremely challenging! # In[10]: # Maximize inventory # # Live Arbitrage Identification # In this part, we are looking for live arbitrage opportunities on some centralized exchanges. The main tool we use for this is [CCXT](https://github.com/ccxt/ccxt) - an open source JavaScript / Python / PHP cryptocurrency trading library with support for more than 130 crypto exchanges. # # # The Binance fee structure can be found [here](https://www.binance.com/en/fee/schedule). # In[41]: # set some parameters # In[54]: # set things up and create button # In[55]: # create real time exchange rate matrix # In[56]: # Check if arbitrage cycle exists using Bellman-Ford. If so, output an arbitrage cycle # In[57]: # Find best set of arbitrage cycles by solving the assignment problem # # Practical Challenges of Arbitrage Trading # Aspects to consider: # * Speed, speed, speed! # * Timing issues # * Costs of inventory (opportunity cost, market risk, custody risk) # * Available liqudity # * Precision of quantities # * Error handling # * ...