2 Reconstruction algorithm
2.1 Order
Consider a definition given in (Bouchaud and Bonart 2018):
An order is a commitment, declared at a given submission time, to buy or sell a given volume of an asset at no worse than a given price. An order \(x\) is thus described by four attributes:
- its sign (or direction) \(\epsilon = \pm 1\), (\(\epsilon_x = +1\) for buy orders; \(\epsilon_x = −1\) for sell orders),
- its price \(p_x\) ,
- its volume \(v_x > 0\), and
- its submission time \(t_x\).
We introduce the succinct notation \(x := (\epsilon_x , p_x ,v_x, t_x )\).
(Bouchaud and Bonart 2018) then continues with a description of what they call The Trade-Matching Algorithm:
Whenever a trader submits a buy (respectively, sell) order \(x\), an LOB’s trade-matching algorithm checks whether it is possible for \(x\) to match to existing sell (respectively,buy)orders \(y\) such that \(p_y \leq p_x\) (respectively, \(p_y \geq p_x\)). If so, the matching occurs immediately and the relevant traders perform a trade for the agreed amount at the agreed price. Any portion of \(x\) that does not match instead becomes active at the price \(p_x\), and it remains active until either it matches to an incoming sell (respectively, buy) order or it is cancelled. Cancellation usually occurs when the owner of an order no longer wishes to offer a trade at the stated price.
Above definition and description are incomplete:
According to the definition, an order \(x\) seems to be an immutable entity. But according to the Trade-Matching algorithm, a portion of \(x\) may become ‘active’ and it is not clear how the quality of being ‘active’ is different from the mere existence of submitted orders,
At what price the trade is performed - \(p_x\) or \(p_y\) - is not explicitly stated,
It is not stated what will happen if \(v_x \leq v_y\), etc.
Surprisingly, the concept of order and operations of a continous double auction appear to be rather complicated things if one attempts to define them rigorously!
Hereafter is assumed that a single asset is traded and a single currency is used.
Definition 2.1 (Order) Suppose that:
- \(i \in \mathbb{N}\) is called an unique order identification,
\(\mathcal{T^i}\) is a set of intervals \(\mathcal{T^i} = \{T_k^i\}_{k=0}^{K^i}\) called order intervals where each \(T_k^i\) is either a proper interval with endpoints \(s_k^i \in \mathbb{R}_{>0}\), \(e_k^i \in \mathbb{R}_{>0} \cup \{+\infty\}\), \(s_k^i < e_k^i\) or a degenerate interval \(\{s_k^i\}, s_k^i \in \mathbb{R}_{>0}\) and \(\forall m \neq n : T_m^i \cap T_n^i = \emptyset\),
\(\chi_{T_k^i} : \mathbb{R} \mapsto \{0,1\}\) is an indicator function of intervals in \(\mathcal{T^i}\): \(\chi_{T_k^i}(t) = \begin{cases} 1, & t \in T_k^i \\ 0, & \text{otherwise} \end{cases}\),
\(\mathcal{P}^i = \{p_k^i\}_{k=0}^{K^i}\) is a set of constants \(p_k^i \in \mathbb{R}_{>0}\) called order prices,
\(\mathcal{V}^i = \{v_k^i\}_{k=0}^{K^i}\) is a set of constants called order volumes, such that either \(v_k^i \in \mathbb{R}_{<0}\) or \(v_k^i \in \mathbb{R}_{>0}\),
Sets \(\mathcal{P}^i\) and \(\mathcal{V}^i\) satisfy the following condition \(\forall k \in [0, \ldots, N_i) : (p_k^i, v_k^i) \neq (p_{k+1}^i, v_{k+1}^i)\),
Whenever it is clear from the context, we may also use the word order for:
- An unique order identification \(i\) number
- An order as defined in (Bouchaud and Bonart 2018) (see above)
- The value of tuple \((i, p^i(t_0), v^i(t_0))\) at the particular time \(t_0\)
Now we can define
2.2 Order Book
Definition 2.11 (Ideal Order Book) Suppose we are given an order book \(\mathcal{O}\). Then \(\mathcal{O}\) is called Ideal Order Book (IOB) if the following conditions are satisfied:
- Instantaneous Match Execution (IME) \[ \lambda \Big( \{t : b_{\mathcal{O}}(t) \geq a_{\mathcal{O}}(t)\}\Big) = 0 \tag{2.1} \] where \(\lambda(X)\) is the Lebesgue measure of set \(X\).
- Volume Conservation (VC) \[ \forall t \in \{t : b_{\mathcal{O}}(t) \geq a_{\mathcal{O}}(t)\} : \sum_{\mathbf{o}^b(t) \in \mathcal{M}_{\text{bid}}(t)} f^b(t) = \sum_{\mathbf{o}^s(t) \in \mathcal{M}_{\text{ask}}(t)} |f^s(t)| \tag{2.2} \]
To make these definitions intuitevely clear, let’s consider how IOB \(\mathcal{O}\) would evolve under some order placement policy \(\mathcal{P}\) in a typical scenario of placing several buy and sell orders. Note that our policy \(\mathcal{P}\) given IOB retuns IOB. It is not the case in general and another order placement policy could retrun order book which is not IOB.
A buy order is submitted at time \(t_1\) with the price \(p_1\) and volume \(v_1\) and is transformed by \(\mathcal{P}\) into \(\mathbf{o}^1(t)\):
- \(\mathbf{o}^1(t)\) is added to \(\mathcal{O}\): \[ \mathbf{o}^1(t) = \left[ \begin{eqnarray} \mathcal{T}^1 & = & \{ [t_1, +\infty) \} \\ \mathcal{P}^1 & = & \{ p_1 \} \\ \mathcal{V}^1 & = & \{ v_1 \} \end{eqnarray} \right. \]
A sell order with price \(p_2 < p_1\) and volume \(|v_2| < |v_1|\) is submitted and transformed into \(\mathbf{o}^2(t)\), matched with \(\mathbf{o}^1(t)\) so the latter is partially filled at time \(t_2\). All these actions are performed by \(\mathcal{P}\). Note that the right endpoint \(t_2\) below is included:
- \(\mathbf{o}^1(t)\) is changed: \[ \mathbf{o}^1(t) = \left[ \begin{eqnarray} \mathcal{T}^1 & = & \{ [t_1, t_2], (t_2, +\infty] \} \\ \mathcal{P}^1 & = & \{ p_1, p_1 \} \\ \mathcal{V}^1 & = & \{ v_1, v_1 - v_2 \} \end{eqnarray} \right. \]
- \(\mathbf{o}^2(t)\) is added to \(\mathcal{O}\): \[ \mathbf{o}^2(t) = \left[ \begin{eqnarray} \mathcal{T}^2 & = & \{ {t_2} \} \\ \mathcal{P}^2 & = & \{ p_2 \} \\ \mathcal{V}^2 & = & \{ v_2 \} \end{eqnarray} \right. \]
- A sell order with price \(p_3 < p_1\) and volume \(v_3\) is submitted by the same trader as the one who submitted \(\mathbf{o}^1(t)\). The order is transformed into \(\mathbf{o}^3(t)\) and \(\mathbf{o}^1(t)\) is cancelled (note that the right endpoint \(t_3\) below is not included) due to self-trade prevention policy, which is part of \(\mathcal{P}\), at time \(t_3\) (as, for example, CME Globex would do by default):
- \(\mathbf{o}^1(t)\) is changed: \[ \mathbf{o}^1(t) = \left[ \begin{eqnarray} \mathcal{T}^1 & = & \{ [t_1, t_2], (t_2, t_3) \} \\ \mathcal{P}^1 & = & \{ p_1, p_1 \} \\ \mathcal{V}^1 & = & \{ v_1, v_1 - v_2 \} \end{eqnarray} \right. \]
- \(\mathbf{o}^2(t)\) remains the same,
- \(\mathbf{o}^3(t)\) is added to \(\mathcal{O}\): \[ \mathbf{o}^3(t) = \left[ \begin{eqnarray} \mathcal{T}^3 & = & \{ [t_3, +\infty) \} \\ \mathcal{P}^3 & = & \{ p_3 \} \\ \mathcal{V}^3 & = & \{ v_3 \} \end{eqnarray} \right. \]
2.3 Quotes and Trades Streams
Formally speaking, the reconstruction algorithm is supposed to produce Ideal Order Book from non-ideal data - Quotes and Trades Streams - provided by an exchange. The effort required varies by an exchange. For example MOEX provides data of close to ideal quality while Bitfinex does not report unique order identifications for trades and omits market orders altogether.
Definition 2.12 (Quote Event) A tuple \(\mathcal{S}_Q^i[k] = (t_k^i, i, k, p_k^i, v_k^i, e^k_i)\) where
- \(t_k^i \in \mathbb{R}_{> 0}\) is a timestamp,
- \(i \in \mathbb{N}\) is an identification number of the quote,
- \(k \in \mathbb{N}\) is a quote event number,
- \(p_k^i \in \mathbb{R}_{\geq 0}\) is a quote price at \(t_k^i\),
- \(v_k^i \in \mathbb{R}\) is a quote volume since \(t_k^i\),
- \(e_k^i \in \{ A, C, D \}\) is an event type: \(A\) stands for addition to an order book event, \(C\) stands for change event and \(D\) stands for deletion from an order book,
Definition 2.13 (Quote) A set of quote event tuples \(\mathcal{S}_Q^i =\{\mathcal{S}_Q^i[k] \}_{k=1}^{K^i} = \{(t_k^i, i, k, p_k^i, v_k^i, e^k_i)\}_{k=1}^{K^i}\) where
- \(k_1 < k_2 \implies t_{k_1}^i \leq t_{k_2}^i\),
- all \(v_k^i\) are either non-negative or non-positive
is called quote \(i\). If all \(v_k^i \leq 0\) then the quote is a sell quote. Otherwise it is a buy quote.
Table 3.1 in section Fragments of Quotes and Trades Streams shows a fragment of ETHUSD Quotes Stream generated by Bitstamp during 1 second
while table 3.3 shows similar fragment generated by Bitfinex. They should be taken as a raw dataset for illustrations of reconstruction algorithm performance hereafter.
Definition 2.16 (Trade) A tuple \(\mathcal{S}_T[j] = (t_j, j, a_j, r_j, d_j, p_j, v_j)\) where
- \(t_j \in \mathbb{R}_{>0}\) is a timestamp of the trade,
- \(j \in \mathbb{N}\) is a unque trade identification number,
- \(a_j \in \mathbb{N} \cup \{0\}\) is an optional taker (aggressing) quote identification number: if \(a_j =`\{0\}\) then the number is not known,
- \(r_j \in \mathbb{N} \cup \{0\}\) is an optional maker (resting) quote identification number: if \(a_j =\{0\}\) then the number is not known,
- \(d_j \in \{-1, +1\}\) is the side of the maker quote (if \(d_j = -1\) then ‘ask’, otherwise ‘bid’)
- \(p_j \in \mathbb{R}_{>0}\) is a trade price,
- \(v_j \in \mathbb{R}_{\neq 0}\) is a trade volume
Definition 2.17 (Trades Stream) A set of trades \[\mathcal{S}_T = \{\mathcal{S}_T[j]\}_{j=1}^{J} = \{(t_j, j, a_j, r_j, d_j, p_j, v_j)\}_{j=0}^{J}\] where
- \(j_1 \neq j_2 \implies \ (a_{j_1}, r_{j_1}) \neq (a_{j_2}, r_{j_2})\)
- \(j_1 < j_2 \implies t_{j_1} \leq t_{j_2}\)
is called a Trades Stream.
Table 3.2 in section Fragments of Quotes and Trades Streams shows a fragment of ETHUSD Trades Stream generated by Bitstamp during 1 second while table 3.4 shows similar fragment generated by Bitfinex.
A typical exchange gives us some kind of event stream and, separately, some kind of trade stream with various level of details.
As it is shown below, most of quote events may be uniquely utilized for construction of an empirical order since those quote events can be caused by unique actions. The only ambiguous exception is a volume decreasing quote event which may originate from
- A single fill
- Multiple fills (i.e. several taker orders matched with a single maker order simultaneously)
- A cancellation
- Combination of the above (i.e. Immediate or Cancel Order)
These ambiguous volume decreasing quote events need, well, to be disambiguated before an empirical order can be constructed from the quote.
2.4 Coupling
2.4.1 Makers
Definition 2.20 (Linkable Makers) Suppose that Quotes Stream \(\mathcal{S}_Q\) and Trades Stream \(\mathcal{S}_T\) are given. For each trade \(\mathcal{S}_T[j] \in \mathcal{S}_T\) as defined by 2.16, a (possibly empty) set \(\mathcal{L}_R[\mathcal{S}_T[j]] = \{\mathcal{S}_Q^r[k]\}_{r, k}\) of a volume-decreasing quote events \(\mathcal{S}_Q^r[k] \in \mathcal{S}_Q^r \in \mathcal{S}_Q\) as defined by 2.14 that satisfy the following conditions:
- \(\text{sgn}(v_k^r) = d_j\), i.e. \(\mathcal{S}_Q^r\) is a resting (maker) quote for trade \(j\),
- \(r = r_j \lor r = 0\), i.e if trade \(j\) contains maker quote identification number \(r_j > 0\) then \(\mathcal{S}_Q^r[k]\) must belong to quote \(r_j\); otherwise it may belong to any other quote satisfying the other conditions of this definition,
- \(t_k^r - t_j < \tau\), quote event \(\mathcal{S}_Q^r[k]\) is no more than \(\tau \in \mathbb{R}_{>0}\) seconds later than trade \(j\) and delay tolerance \(\tau\) is usually small,
- \(p_k^r = p_j\),
- \(\begin{cases} \Delta v_k^i \geq v_j, & k > 1 \\ e^i_k = D, & k = 1 \end{cases}\), i.e. the volume of trade \(j\) is smaller that the volume decrease of quote \(\mathcal{S}_Q^r\) at time \(t_k^r\) or volume decrease is not defined and event type is a deletion,
is called linkable makers (‘resting’ quotes) for trade \(j\). The set \(\mathcal{L}_R = \bigcup_j \mathcal{L}_R[\mathcal{S}_T[j]]\) is called linkable makers.
Suppose that Quotes Stream \(\mathcal{S}_Q\) and Trades Stream \(\mathcal{S}_T\) are given. For each trade \(\mathcal{S}_T[j] \in \mathcal{S}_T\) as defined by 2.16, a set \(\mathcal{H}_R[\mathcal{S}_T[j]]\) defined as follows \[ \mathcal{H}_R[\mathcal{S}_T[j]] = \mathcal{H}_R[(t_j, j, a_j, r_j, d_j, p_j, v_j)] = \{(t_j, j, 1, p_j, v_j, D)\}\] is called hidden maker for trade \(j\). The tuple \(\{(t_j, j, 1, p_j, v_j, D)\} = \{\mathcal{S}_Q^j[1]\}\) is called a hidden quote event for trade \(j\). The set \(\mathcal{H}_R = \bigcup_j \mathcal{H}_R[\mathcal{S}_T[j]]\) is called hidden makers.
If for some \(j\) the cardinality of \(\mathcal{L}_R[\mathcal{S}_T[j]]\) is greater than one then there are more than one different maker coupling \(\mathbf{l}_R(\mathcal{S}_T[j])\) of Trade Stream \(\mathcal{S}_T\) to Quotes Stream \(\mathcal{S}_Q\).
The condition \(k = 1 \land e^i_k = D\) in defintion 2.24 means that Quotes Stream has a deletion event for order \(i\) without previous addition and/or change event(s). The volume decrease of the quote \(i\) event \(1\) is not defined and the reasonable decision in this situation is to assume that it is equal to the volume of the linked trade.
\(\DeclareMathOperator*{\argmax}{arg\,max}\)
Definition 2.26 (Optimal Maker Coupling) Suppose that Quotes Stream \(\mathcal{S}_Q\) and Trades Stream \(\mathcal{S}_T\) are given. The maker coupling \[ \mathbf{l}_R^* = \argmax_{\argmax_{\mathbf{l}_R} \mathscr{Q}_V(\mathbf{l}_R)}\, \mathscr{Q}_T(\mathbf{l}_R) \] is called the optimal maker copling.
2.4.2 Takers
Bitfinex does not provide quote events for takers while MOEX and Bitstamp do provide.
Definition 2.27 (Linkable Takers) Suppose that Quotes Stream \(\mathcal{S}_Q\) and Trades Stream \(\mathcal{S}_T\) are given. For each trade \(\mathcal{S}_T[j] \in \mathcal{S}_T\) as defined by 2.16, a non-empty set of a volume-decreasing quote events \(\mathcal{S}_Q^i[k] \in \mathcal{S}_Q^i \in \mathcal{S}_Q\) as defined by 2.14 that satisfy the following conditions:
- \(|t_j - t_k^i| < \tau\), where \(\tau\) is a constant,
- \(i = a_j\) if the trade contains taker quote identification number,
- \(p_k^i d_j \leq p_j d_j\),
- \(\text{sgn}(v_k^i) = -d_j\),
- \(|v_k^i - v_{k-1}^i| \geq v_j\)
is called linkable takers for trade \(j\).
TBD
2.5 Order Intervals Reconstruction
To reconstruct an empirical order \(\hat{\mathbf{o}}^{i}(t)\) as defined by 2.18 from the Quotes Stream \(\mathcal{S}_Q^i\) as defined by 2.15 one needs to define the set of intervals \(\mathcal{T^i}\) since order prices \(\mathcal{P}^i\) and order volumes \(\mathcal{V}^i\) are unambiguously defined by the Quotes Stream \(\mathcal{S}_Q^i\).
TBD
2.6 Enforcement of IME condition in EOB
TBD
2.7 Enforcement of VC condition in EOB
TBD