According to [1]:

Power laws appear to describe histograms of relevant financial fluctuations, such as fluctuations in stock price, trading volume and the number of trades. Surprisingly, the exponents that characterize these power laws are similar for different types and sizes of markets, for different market trends and even for different countries—suggesting that a generic theoretical basis may underlie these phenomena.

Let’s check whether these exponents are similar for cryptocurrencies too. Spoiler: not always.

Trades

In [1] we read:

Empirical studies … show that the distribution of trading volume \(V_t\) obeys a … power law: \[ P(V_t > x) \sim x^{-\zeta_V}\] with \(\zeta_V \approx 1.5\), while the number of trades \(N_t\) obeys: \[ P(N_t > x) \sim x^{-\zeta_N}\] with \(\zeta_N \approx 3.4\)
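To make the meaning of such an exponent concrete, here is a minimal sketch that recovers a known exponent from synthetic data, using the same rank-based tail distribution and log-log line fit applied to the real data in this article. The sample size, the seed and the value of `zeta_true` are assumptions chosen purely for illustration.

```r
set.seed(42)                      # fixed seed so the sketch is reproducible
zeta_true <- 1.5                  # assumed tail exponent of the synthetic data
n <- 1e5

# Inverse-transform sampling: if U ~ Uniform(0, 1) then X = U^(-1/zeta)
# satisfies P(X > x) = x^(-zeta) for x >= 1
x <- runif(n)^(-1 / zeta_true)

# Empirical tail distribution from ranks: the i-th largest value is
# reached or exceeded by i observations out of n
x_sorted <- sort(x, decreasing = TRUE)
prob <- seq_len(n) / n

# In log-log coordinates a power law is a straight line with slope -zeta
fit <- lm(log10(prob) ~ log10(x_sorted))
zeta_hat <- -unname(coef(fit)[2])
zeta_hat
```

With a sample this large, `zeta_hat` should land close to `zeta_true`; on real data only the tail is expected to follow a power law, which is why the fits in this article are restricted to large values of \(x\).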

First, let’s load all trades for the LTCUSD pair traded on Bitstamp from 2019-02-01 00:00:00+03 to 2019-09-17 00:00:00+03:

trades <- obadiah::trades(con, 
                          '2019-02-01 00:00:00+03', 
                          '2019-09-17 00:00:00+03', 
                          'Bitstamp', 
                          'LTCUSD', 
                          tz='Europe/Moscow') %>%
  select(-maker.event.id, -taker.event.id)
kable(head(trades))
| timestamp | price | volume | direction | maker | taker | exchange.trade.id |
|---|---|---|---|---|---|---|
| 2019-02-01 00:04:08.3971 | 31.28 | 0.5851052 | buy | 2814151777 | 2814151901 | 82145385 |
| 2019-02-01 00:17:46.8451 | 31.26 | 72.1700000 | buy | 2814188413 | 2814190362 | 82145682 |
| 2019-02-01 00:17:47.5039 | 31.26 | 25.4422370 | sell | 2814190362 | 2814190392 | 82145683 |
| 2019-02-01 00:17:49.1047 | 31.19 | 7.2058347 | sell | 2814190401 | 2814190371 | 82145684 |
| 2019-02-01 00:18:57.6547 | 31.27 | 0.3000000 | buy | 2814190510 | 2814193203 | 82145698 |
| 2019-02-01 00:42:32.9018 | 31.19 | 2.1663516 | sell | 2814264503 | 2814266233 | 82146060 |

We need only real trades, i.e. those with exchange.trade.id set, and only the timestamp, volume, price, direction and exchange.trade.id columns:

trades <- trades %>% 
  filter(!is.na(exchange.trade.id)) %>% 
  select(timestamp, volume, price, direction, exchange.trade.id) 
kable(head(trades))
| timestamp | volume | price | direction | exchange.trade.id |
|---|---|---|---|---|
| 2019-02-01 00:04:08.3971 | 0.5851052 | 31.28 | buy | 82145385 |
| 2019-02-01 00:17:46.8451 | 72.1700000 | 31.26 | buy | 82145682 |
| 2019-02-01 00:17:47.5039 | 25.4422370 | 31.26 | sell | 82145683 |
| 2019-02-01 00:17:49.1047 | 7.2058347 | 31.19 | sell | 82145684 |
| 2019-02-01 00:18:57.6547 | 0.3000000 | 31.27 | buy | 82145698 |
| 2019-02-01 00:42:32.9018 | 2.1663516 | 31.19 | sell | 82146060 |

There are 621,605 real trades in total. Let’s calculate the trading volume and the number of trades per 900-second period:

by_period <- trades %>%
  mutate(period_end=ceiling_date(timestamp, paste0(900, " seconds"))) %>% 
  mutate(volume=replace_na(volume, 0), price=replace_na(price, 0)) %>%
  group_by(period_end) %>%
  summarize(volume=sum(volume), number.of.trades=sum(!is.na(timestamp)))

knitr::kable(head(by_period))
| period_end | volume | number.of.trades |
|---|---|---|
| 2019-02-01 00:19:00 | 0.5851052 | 1 |
| 2019-02-01 00:32:00 | 104.8180717 | 3 |
| 2019-02-01 00:33:00 | 0.3000000 | 1 |
| 2019-02-01 00:57:00 | 3.9536516 | 2 |
| 2019-02-01 01:00:00 | 1.9475032 | 1 |
| 2019-02-01 01:01:00 | 477.4679622 | 2 |
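The period_end values printed above (00:19:00, 00:32:00, 00:33:00) do not fall on quarter-hour boundaries, so it may be worth checking how ceiling_date() interprets a multi-unit period such as "900 seconds" in the installed lubridate version. For comparison, here is a minimal base-R sketch that rounds timestamps up to the next multiple of 900 seconds and aggregates per period. The helper name `ceiling_period` and the sample trades are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: round POSIXct timestamps up to the next multiple of
# `period` seconds since the Unix epoch (exact multiples stay unchanged)
ceiling_period <- function(ts, period = 900, tz = "UTC") {
  secs <- ceiling(as.numeric(ts) / period) * period
  as.POSIXct(secs, origin = "1970-01-01", tz = tz)
}

# Illustrative trades (assumed values, loosely modelled on the table above)
sample_trades <- data.frame(
  timestamp = as.POSIXct(c("2019-02-01 00:04:08",
                           "2019-02-01 00:17:46",
                           "2019-02-01 00:18:57"), tz = "UTC"),
  volume = c(0.5, 72, 0.3)
)

period_end <- ceiling_period(sample_trades$timestamp)
key <- format(period_end, "%Y-%m-%d %H:%M:%S")

# Total volume and number of trades per 900-second period
volume_by_period <- tapply(sample_trades$volume, key, sum)
trades_by_period <- tapply(sample_trades$volume, key, length)
```

Whether the tail exponents reported here are sensitive to the alignment of the periods is not something this sketch settles; it only shows one way to obtain bins aligned to 15-minute boundaries.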

We can now calculate the complementary cumulative distribution function (tail distribution) of the trading volume per period, \(P(V_t > x)\):

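V <- by_period %>% 
  arrange(-volume) %>%                                # largest volume first
  mutate(r=row_number()) %>%                          # rank = number of periods with at least this volume
  mutate(prob=r/max(r), sigma=volume/sd(volume)) %>%  # empirical tail probability; volume in units of its sd
  filter(sigma >= 0.1)

# Fit a straight line to the tail (sigma >= 10) in log-log coordinates
s_i <- coef(lm(log10(prob) ~ log10(sigma), V %>% filter(sigma >=10)))
zeta_V <- -s_i[2]                                     # the exponent is minus the fitted slope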

\(x\) is measured in units of sample standard deviation of volume: \(\sigma_V =\)sd(volume) \(\approx\) 287.26.
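A short derivation may help explain why measuring \(x\) in units of \(\sigma_V\) does not affect the exponent, and why the exponent can be read off as minus the slope of a line fitted in log-log coordinates. Assume that in the tail \(P(V_t > x) \approx C x^{-\zeta_V}\), where \(C\) is a constant introduced here only for illustration. Then for \(y = x / \sigma_V\):

\[ P\left(\frac{V_t}{\sigma_V} > y\right) = P(V_t > \sigma_V y) \approx C \sigma_V^{-\zeta_V} y^{-\zeta_V} \]

\[ \log_{10} P\left(\frac{V_t}{\sigma_V} > y\right) \approx \log_{10}\left(C \sigma_V^{-\zeta_V}\right) - \zeta_V \log_{10} y \]

The rescaling changes only the intercept, while the slope with respect to \(\log_{10} y\) stays equal to \(-\zeta_V\).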

ggplot(V ,  aes(sigma, prob)) + 
  geom_point(size=0.1) + 
  scale_y_log10(TeX('$P(V_t > x)$'),
                breaks=c(0.001, 0.02, 0.1, 1),
                labels=scales::percent) +
  scale_x_log10(TeX('$\\frac{x}{\\sigma_V}$'),
                breaks=c(0.1, 1, 2, 10, 40),
                limits=c(0.1, 40),
                labels=scales::number_format(0.1)) +
  geom_abline(slope=-zeta_V, intercept=s_i[1], colour="blue") + 
  annotate("text",
           x=10,
           y=0.01, 
           label=TeX(paste0('$\\zeta_V=',round(zeta_V,2),'$'), output = "character"),
           parse=TRUE)
#> Warning: Removed 4 rows containing missing values (geom_point).

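We see that \(\zeta_V=3.17 \neq 1.5\): the tail of the volume distribution for this pair decays much faster than the value reported in [1].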

For the number of trades per period, \(P(N_t > x)\), we have:

N <- by_period %>% 
  arrange(-number.of.trades) %>%
  mutate(r=row_number()) %>%
  mutate(prob=r/max(r), sigma=number.of.trades/sd(number.of.trades), category='N') %>%
  filter(sigma > 0.1)

s_i <- coef(lm(log10(prob) ~ log10(sigma), N %>% filter(sigma >=9)))
zeta_N <- -s_i[2]

Again, \(x\) is measured in units of sample standard deviation of number.of.trades: \(\sigma_N =\)sd(number.of.trades) \(\approx\) 9.43.

ggplot(N ,  aes(sigma, prob)) + geom_point(size=0.1) + 
  scale_y_log10(TeX('$P(N_t > x)$'),
                breaks=c(0.001, 0.01, 0.1, 1),
                labels=scales::percent) + 
  scale_x_log10(TeX('$\\frac{x}{\\sigma_N}$'),
                breaks=c(0.1, 1, 2, 10, 40),
                limits=c(0.1, 40),
                labels=scales::number_format(0.01)) +
  geom_abline(slope=-zeta_N, intercept=s_i[1], colour="blue") +
  annotate("text",
           x=10, 
           y=0.01, 
           label=TeX(paste0('$\\zeta_N=',round(zeta_N,2),'$'), output = "character"),
           parse=TRUE)
#> Warning: Removed 1 rows containing missing values (geom_point).

We see that \(\zeta_N=3.62 \approx 3.4\).

Returns

In [1] we read:

Define \(p_t\) as the price of a given stock and the stock price ‘return’ \(r_t\) as the change of the logarithm of stock price in a given time interval \(\Delta t\), \(r_t \equiv \ln p_t - \ln p_{t -\Delta t}\). The probability that a return has an absolute value larger than \(x\) is found empirically to be: \[ P(| r_t | > x) \sim x^{-\zeta_r} \tag{1}\] with \(\zeta_r \approx 3\). The ‘inverse cubic law’ of equation (1) is rather ‘universal’, holding over as many as 80 standard deviations for some stock markets, with \(\Delta t\) ranging from one minute to one month, across different sizes of stocks, different time periods, and also for different stock market indices.
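As a small worked example of this definition, the sketch below computes log returns and their absolute values for a short price series. The prices are illustrative values assumed to be sampled at a fixed interval \(\Delta t\); they are not a result of the analysis.

```r
# Illustrative prices p_t sampled at a fixed interval (assumed values)
price <- c(31.215, 31.220, 31.200, 31.170, 31.230, 31.240)

# r_t = ln p_t - ln p_{t - dt}; diff() drops the first observation,
# which has no predecessor
r <- diff(log(price))

# The power law is stated for the absolute value of the return
abs_r <- abs(r)
signif(abs_r, 4)
```

For small price changes the log return is close to the relative price change, so a move from 31.170 to 31.230 gives a return of roughly 0.19%.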

Let’s load the spreads at the end of every 900-second interval for the LTCUSD pair traded on Bitstamp from 2019-02-01 00:00:00+03 to 2019-09-17 00:00:00+03. Since there are too many of them to load at once, we will process them in weekly time ranges:

ranges <- tibble(s=with_tz(seq(ymd_hms('2019-02-01 00:00:00+03'), ymd_hms('2019-09-17 00:00:00+03'), by="1 week"), tz="Europe/Moscow"))
ranges$e <- c(tail(ranges$s, -1),ymd_hms('2019-09-17 00:00:00+03'))
kable(head(ranges))
| s | e |
|---|---|
| 2019-02-01 | 2019-02-08 |
| 2019-02-08 | 2019-02-15 |
| 2019-02-15 | 2019-02-22 |
| 2019-02-22 | 2019-03-01 |
| 2019-03-01 | 2019-03-08 |
| 2019-03-08 | 2019-03-15 |

We will use the mid-price (the average of the best bid and best ask prices) as a proxy for \(p_t\):

by_period <- ranges %>% rowwise() %>% do({
  obadiah::spread(con, .$s, .$e, 'Bitstamp', 'LTCUSD', frequency=900, tz='Europe/Moscow') %>% 
    mutate(price=(best.ask.price+best.bid.price)/2) %>% # mid-price
    select(timestamp, price)
  }) %>%  
  ungroup() %>% 
  fill(price) %>%
  mutate(abs_r=abs(log(price) - log(lag(price))))  %>% 
  filter(!is.na(abs_r)) 

kable(head(by_period))
| timestamp | price | abs_r |
|---|---|---|
| 2019-02-01 00:15:00 | 31.215 | 0.0006405 |
| 2019-02-01 00:30:00 | 31.220 | 0.0001602 |
| 2019-02-01 00:45:00 | 31.200 | 0.0006408 |
| 2019-02-01 01:00:00 | 31.170 | 0.0009620 |
| 2019-02-01 01:15:00 | 31.230 | 0.0019231 |
| 2019-02-01 01:30:00 | 31.240 | 0.0003202 |

R <- by_period %>%
  arrange(-abs_r) %>%
  mutate(r=row_number()) %>%
  mutate(prob=r/max(r), sigma=abs_r/sd(abs_r)) %>%
  filter(sigma >= 0.1)

s_i <- coef(lm(log10(prob) ~ log10(sigma), R %>% filter(sigma >=11)))
zeta_r <- -s_i[2]

\(x\) is measured in units of sample standard deviation of abs_r: \(\sigma_r =\)sd(abs_r) \(\approx\) 0.0046.

ggplot(R ,  aes(sigma, prob)) + 
  geom_point(size=0.1) + 
  scale_y_log10(TeX('$P(|r_t| > x)$'),
                breaks=c(0.001, 0.02, 0.1, 1),
                labels=scales::percent) +
  scale_x_log10(TeX('$\\frac{x}{\\sigma_r}$'),
                breaks=c(0.1, 1, 2, 10, 40),
                limits=c(0.1, 40),
                labels=scales::number_format(0.1)) +
  geom_abline(slope=-zeta_r, intercept=s_i[1], colour="blue") + 
  annotate("text",
           x=10,
           y=0.01, 
           label=TeX(paste0('$\\zeta_r=',round(zeta_r,2),'$'), output = "character"),
           parse=TRUE)

Again we see that \(\zeta_r=3.23 \approx 3\).

References

[1] Gabaix, X., Gopikrishnan, P., Plerou, V. and Stanley, H. E. (2003). A theory of power-law distributions in financial market fluctuations. Nature 423, 267–270.