
Benchmark PDF and CDF implementations; resolve warnings #1

@mdhaber

When I was last working on this, I benchmarked it against reference results generated in MATLAB with a script that was itself produced by an LLM (so I don't want to include it here).

stable_grid_tails.mat can be generated by running that script as-is. The central part of the distribution can be investigated by uncommenting the first p = ... line and commenting out the next one.

Here is my Python script, untouched since I last used it.

import time

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
# from scipy.stats._stable import f, F
from scipy.io import loadmat
stats.levy_stable.parameterization = 'S0'

def err_plot(a, b, res, ref, thresh, title, i=None):
    # den = np.where(ref != 0, ref, res)
    # err = np.where(den != 0, np.abs((res - ref) / ref), 0)
    err = np.abs((res - ref) / ref)
    if i is None:
        err_max_x = np.max(err, axis=-1)
    else:
        err_max_x = err[..., i]
    err_max_x_log10 = np.log10(err_max_x)

    err_max_x_log10[err_max_x_log10 < -16] = -16
    err_max_x_log10[err_max_x_log10 > thresh] = 5
    err_max_x_log10[np.isnan(err_max_x_log10)] = 10

    plt.pcolormesh(a[..., 0], b[..., 0], err_max_x_log10)
    plt.xlabel('a')
    plt.ylabel('b')
    plt.title(title)
    plt.colorbar()
    plt.clim(-16, 10)
    plt.show()
    return err_max_x_log10

def load_mat(filename):
    data_mat = loadmat(filename)
    p = data_mat['p']
    x = data_mat['Q']
    a = data_mat['a']
    b = data_mat['b']
    cdf = data_mat['CDF']
    pdf = data_mat['PDF']
    p = p.ravel()
    a = a.ravel()[:, np.newaxis]
    b = b.ravel()[:, np.newaxis, np.newaxis]
    p, x, a, b, cdf, pdf = np.broadcast_arrays(p, x.T, a, b, cdf.T, pdf.T)
    return p, x, a, b, cdf, pdf

p, x, a, b, cdf_ref2, pdf_ref2 = load_mat('stable_grid_tails.mat')

tic = time.perf_counter()
# res = np.exp(f(x, a, b, log=True))
res = stats.levy_stable.pdf(x, a, b)
toc = time.perf_counter()
# 289.98363619996235 in PR
# in main

err = err_plot(a, b, res, pdf_ref2, -2, 'PDF relative error')
print(np.mean(err))
print(np.median(err))
print(toc - tic)

# import matplotlib.pyplot as plt
# from scipy.stats import levy_stable as sp_levy_stable
# import numpy as np
# sp_levy_stable.parameterization = "S0"
# (alpha, beta) = (1.6, 1.0)
# xs = np.linspace(-15, 10, 100)
# plt.plot(xs, sp_levy_stable.logpdf(xs,alpha=alpha, beta=beta))
# plt.show()

# p, x, a, b, cdf_ref2, pdf_ref2 = load_mat('stable_grid_tails.mat')
# tic = time.perf_counter()
# res = F(x, a, b)
# toc = time.perf_counter()
# # 61.13875540008303 in PR
# # in main
#
# err = err_plot(a, b, res, cdf_ref2, -2, 'CDF relative error, tails')
# print(np.mean(err))
# print(np.median(err))
# print(toc - tic)
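
For reference, the sentinel mapping inside err_plot can be checked in isolation: relative errors at or below machine precision floor at -16, anything above the threshold is flagged as 5, and NaNs (e.g. from 0/0 where both result and reference are zero) become 10. A minimal, self-contained check of just that logic:

```python
import numpy as np

def log10_err(res, ref, thresh):
    # Same clipping scheme as err_plot: log10 relative error with sentinels.
    err = np.abs((res - ref) / ref)
    e = np.log10(err)
    e[e < -16] = -16      # at/below machine precision -> floor
    e[e > thresh] = 5     # above threshold -> flagged as bad
    e[np.isnan(e)] = 10   # undefined (e.g. 0/0) -> flagged as worst
    return e

ref = np.array([1.0, 1.0, 1.0, 0.0])
res = np.array([1.0 + 1e-18, 1.0 + 1e-4, 2.0, 0.0])
out = log10_err(res, ref, thresh=-2)
print(out)  # [-16., ~-4., 5., 10.]
```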

The f and F imported at the top were the pdf and cdf. Now, in this repo, we'd instead do from Stable import Stable, create a Stable random variable, and call its pdf and cdf methods.
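
The repo's actual `Stable` API may differ; the stand-in class below (wrapping scipy's levy_stable) is only a sketch of the intended call pattern, i.e. what would replace the `f`/`F` calls in the benchmark script:

```python
import numpy as np
from scipy import stats

class Stable:
    # Hypothetical stand-in for `from Stable import Stable`; the real class
    # presumably has its own implementation rather than delegating to scipy.
    def __init__(self, alpha, beta):
        self.alpha, self.beta = alpha, beta

    def pdf(self, x):
        return stats.levy_stable.pdf(x, self.alpha, self.beta)

    def cdf(self, x):
        return stats.levy_stable.cdf(x, self.alpha, self.beta)

# The benchmark would then use, e.g.:
X = Stable(alpha=1.6, beta=1.0)
print(X.pdf(0.0), X.cdf(0.0))
```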

I'd like to check that they are as accurate as I remember them being, and it would be good to figure out what to do about the warnings.
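
On the warnings: one low-effort way to take stock of them, assuming they are ordinary Python warnings (e.g. RuntimeWarning or an integration warning from quadrature), is to record them during the sweep rather than letting them scroll by. A sketch:

```python
import warnings
import numpy as np
from scipy import stats

# Record every warning raised during evaluation so each offending
# (alpha, beta, x) region can be catalogued before deciding whether
# to fix, suppress, or document it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    vals = stats.levy_stable.pdf(np.array([0.5]), 1.9, 0.5)

for w in caught:
    print(w.category.__name__, w.message)
print(vals)
```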
