Created
September 12, 2018 02:09
-
-
Save raddy/4084ade3d3a82911d43df9375f9697f4 to your computer and use it in GitHub Desktop.
Sample from a distribution with known skew and kurtosis
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import statsmodels.sandbox.distributions.extras as extras | |
import scipy.interpolate as interpolate | |
import numpy as np | |
def generate_normal_four_moments(mu, sigma, skew, kurt, size=1000, sd_wide = 10): | |
f = extras.pdf_mvsk([mu, sigma, skew, kurt]) | |
x = np.linspace(mu - sd_wide * sigma, mu + sd_wide * sigma, num=500) | |
y = [f(i) for i in x] | |
yy = np.cumsum(y) / np.sum(y) | |
inv_cdf = interpolate.interp1d(yy, x, fill_value="extrapolate") | |
rr = np.random.rand(size) | |
return inv_cdf(rr) |
"Thanks for sharing! Could you explain what is the parameter "sd_wide" used for?"
He's using it in the np.linspace function, so basically he's creating evenly spaced numbers that are mean plus/minus 10 times variance, then applying the pdf_mvsk function on each point on that linspace.
just for anyone else who stumbles upon this. you're use of the term
sigma
is somewhat misleading. the function asks forv
(variance). should probably be sigma ** 2.this gist comes up on searches for use of pdf_mvsk so thought i'd point this out. @raddy, feel free to disagree.
I think you're correct looking at the pdf_mvsk implementation:
N = len(mvsk)
if N < 4:
raise ValueError("Four moments must be given to "
"approximate the pdf.")
mu, mc2, skew, kurt = mvsk
totp = poly1d(1)
sig = sqrt(mc2)
if N > 2:
Dvals = _hermnorm(N + 1)
C3 = skew / 6.0
C4 = kurt / 24.0
# Note: Hermite polynomial for order 3 in _hermnorm is negative
# instead of positive
totp = totp - C3 * Dvals[3] + C4 * Dvals[4]
def pdffunc(x):
xn = (x - mu) / sig
return totp(xn) * np.exp(-xn * xn / 2.0) / np.sqrt(2 * np.pi) / sig
return pdffunc
Let me play with some numbers to sanity check!
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
For anyone who comes across this that is looking to create custom distributions for the purposes of generating Brownian motion random walks and/or Monte Carlo simulations, I adapted this function and another source to make it very simple to generate N simulations for multiple different series that exhibit mean reversion, correlation, and/or non-normal distributions:
https://github.com/mattyTokenomics/brownian-motion-simulations