graphical-sampling

graphical-sampling is a Python package for finite-population sampling, with a particular focus on graphical sampling designs, unequal inclusion probabilities, and spatially well-spread samples.

The package implements the Graphical Finite-Population Sampling (GFS) framework and its spatial extensions, including probability-balanced n-means clustering, nested spatial ordering, and intelligent search procedures for improving spatial spread while preserving prescribed first-order inclusion probabilities.

The package is designed for researchers and practitioners working in survey sampling, spatial statistics, environmental monitoring, ecological sampling, agricultural surveys, and related fields.

Main Features

- Construct fixed-size sampling designs with prescribed first-order inclusion probabilities.
- Represent sampling designs through the graphical/bar construction of GFS.
- Draw samples from the resulting design.
- Compute design properties such as:
  - first-order inclusion probabilities,
  - second-order inclusion probabilities,
  - entropy and relative entropy,
  - exact Narain–Horvitz–Thompson variance when the response variable is supplied.
- Build probability-balanced spatial clusters using FIP-balanced n-means.
- Create nested cluster-zone structures for spatial sampling.
- Evaluate spatial spread using indices such as:
  - Moran-type spatial balance,
  - Voronoi-based spread,
  - Density Disparity Index,
  - local balance measures.
- Improve sampling designs using intelligent search procedures such as Greedy Best-First Search.
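
As a rough illustration of the design properties listed above, the sketch below computes them for a tiny design written out explicitly as a mapping from samples to probabilities. It uses only the standard library and does not call graphical-sampling; the function names and the toy design are assumptions made for this example, and the package's own attributes may be defined or scaled differently.

```python
import math

def first_order(design, N):
    """pi_i = sum of p(s) over all samples s that contain unit i."""
    pi = [0.0] * N
    for sample, p in design.items():
        for i in sample:
            pi[i] += p
    return pi

def second_order(design, N):
    """pi_ij = sum of p(s) over samples containing both i and j (pi_ii = pi_i)."""
    pij = [[0.0] * N for _ in range(N)]
    for sample, p in design.items():
        for i in sample:
            for j in sample:
                pij[i][j] += p
    return pij

def entropy(design):
    """Shannon entropy of the design, -sum p(s) log p(s)."""
    return -sum(p * math.log(p) for p in design.values() if p > 0)

def nht_variance(design, y):
    """Exact variance of the Narain-Horvitz-Thompson estimator of the total.

    Requires every first-order inclusion probability to be positive.
    """
    N = len(y)
    pi = first_order(design, N)
    pij = second_order(design, N)
    return sum(
        (pij[i][j] - pi[i] * pi[j]) * y[i] * y[j] / (pi[i] * pi[j])
        for i in range(N)
        for j in range(N)
    )

# Toy fixed-size design (hypothetical): N = 4 units, samples of size n = 2.
design = {(0, 1): 0.3, (0, 2): 0.2, (1, 3): 0.1, (2, 3): 0.4}
y = [1.0, 2.0, 3.0, 4.0]

pi = first_order(design, 4)
print("First-order inclusion probabilities:", pi)
print("Entropy:", entropy(design))
print("NHT variance:", nht_variance(design, y))
```

For a fixed-size design the first-order inclusion probabilities sum to the sample size, which is a convenient sanity check on any design built this way.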

Installation

Install the package from PyPI:

pip install graphical-sampling

or install the development version from GitHub:

pip install git+https://github.com/mehdimhb/graphical-sampling.git

Then import the package in Python:

import graphical_sampling

Depending on the installed version, the main classes can also be imported directly from their submodules, for example graphical_sampling.population and graphical_sampling.design.
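
Because the available imports may differ between versions, it can help to check which version is installed. The following is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is an assumption made for this example and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

# The distribution name is the one used with pip, not the import name.
print("graphical-sampling version:", installed_version("graphical-sampling"))
```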

Basic Example

The following example constructs a finite population with spatial coordinates, unequal inclusion probabilities, and a response variable. It then builds a graphical sampling design and draws samples from it.

import numpy as np

from graphical_sampling.population import Population
from graphical_sampling.design import Design

# Reproducibility
rng = np.random.default_rng(123)

# Population size and sample size
N = 200
n = 20

# Spatial coordinates
coords = rng.random((N, 2))

# Unequal size measure, normalized internally to sum to n
weights = 0.5 + rng.random(N)

# Example response variable
y = coords[:, 0] + coords[:, 1] + rng.normal(scale=0.1, size=N)

# Create the finite population
pop = Population(
    coords=coords,
    inclusions=weights,
    variable=y,
    n=n
)

# Build a graphical sampling design
design = Design(population=pop)

# Draw five samples
samples = design.sample(num_samples=5)

print(samples)
print("Relative entropy:", design.relative_entropy)
print("NHT variance:", design.nht_variance)
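
To make the phrase "normalized internally to sum to n" concrete, the sketch below rescales a size measure into inclusion probabilities and then draws one fixed-size sample with classical systematic unequal-probability sampling. This is a stand-in technique chosen for brevity, not the GFS construction and not the package's code; both function names are hypothetical, and the normalization rule shown (proportional rescaling) is an assumption about what the package does.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def normalize_to_sample_size(weights, n):
    """Rescale positive size measures so the inclusion probabilities sum to n."""
    total = sum(weights)
    pi = [n * w / total for w in weights]
    if max(pi) > 1.0:
        raise ValueError("some inclusion probabilities exceed 1")
    return pi

def systematic_sample(pi, u):
    """Systematic sampling: select the units whose cumulative interval
    contains one of the points u, u + 1, ..., u + n - 1, with 0 <= u < 1."""
    n = round(sum(pi))
    cum = list(accumulate(pi))
    last = len(pi) - 1  # guard against floating-point shortfall in cum[-1]
    return [min(bisect_right(cum, u + j), last) for j in range(n)]

rng = random.Random(123)
weights = [0.5 + rng.random() for _ in range(200)]
pi = normalize_to_sample_size(weights, 20)
sample = systematic_sample(pi, rng.random())
print("Sum of inclusion probabilities:", sum(pi))
print("Selected units:", sample)
```

Each unit occupies an interval of length equal to its inclusion probability, so a uniformly placed start point selects it with exactly that probability, and the sample size is always n.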

Spatial Sampling with FIP-Balanced `n`-Means

The package also provides probability-balanced spatial clustering. This is useful when the aim is to form compact spatial clusters whose total inclusion probabilities are controlled exactly.

from graphical_sampling.population import Population
from graphical_sampling.design import Design
from graphical_sampling.order import Order
from graphical_sampling.clustering.fip_balanced_nmeans import FIPBalancedNMeans

# Fit FIP-balanced n-means clustering (reuses `pop` and `n` from the Basic Example)
fbn = FIPBalancedNMeans(
    n=n,
    n_init=20,
    init_clust_method="expanded"
)

fbn.fit(population=pop)

# Optionally divide each cluster into internal zones
fbn.fit_zones(
    num_zones=(2, 2),
    mode="sweep_xy"
)

# Build a spatial order from the cluster-zone structure
order = Order.from_clusters(
    population=pop,
    clusters=fbn.clusters,
    zone_strategy="snake",
    point_strategy="snake"
)

# Construct the corresponding spatial graphical design
spatial_design = Design.from_order(pop, order)

print("Moran index:", spatial_design.moran)
print("Voronoi index:", spatial_design.voronoi)
print("Density disparity:", spatial_design.density_disparity)
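
For readers unfamiliar with Voronoi-based spread, the sketch below computes the commonly used per-sample version of the measure: each population unit is assigned to its nearest sampled unit, the inclusion probabilities are summed within each Voronoi cell, and the mean squared deviation of those sums from 1 is reported. This is a generic illustration written with the standard library; the function name is hypothetical, and the package's voronoi attribute is a design-level quantity that may be defined or aggregated differently.

```python
import math

def voronoi_spread(coords, pi, sample):
    """Mean squared deviation from 1 of the inclusion-probability mass
    falling in each sampled unit's Voronoi cell (0 means perfectly spread)."""
    mass = {i: 0.0 for i in sample}
    for k, point in enumerate(coords):
        # Nearest sampled unit; ties go to the unit listed first in `sample`.
        nearest = min(sample, key=lambda i: math.dist(point, coords[i]))
        mass[nearest] += pi[k]
    return sum((m - 1.0) ** 2 for m in mass.values()) / len(sample)

# Four units on a line, each with inclusion probability 0.5 (n = 2).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
pi = [0.5, 0.5, 0.5, 0.5]

print("Well-spread sample:", voronoi_spread(coords, pi, [0, 3]))
print("Clumped sample:", voronoi_spread(coords, pi, [0, 1]))
```

Smaller values indicate that each sampled unit represents roughly one unit of inclusion-probability mass, which is the sense in which the sample is well spread.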

Intelligent Spatial Sampling

The package includes search tools for improving a sampling design while preserving design validity. These methods modify the graphical order or exchange probability mass in a controlled way, and therefore maintain the prescribed inclusion probabilities.

A typical workflow is:

1. Create a Population.
2. Build an initial design using GFS or FIP-balanced n-means clustering.
3. Choose a criterion, such as a spatial spread index or a weighted combination of indices.
4. Run an intelligent search algorithm to improve the design.
5. Use the optimized design for sampling and design-based inference.
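
The search step in the workflow above can be sketched generically. The example below runs a greedy best-first search over orderings of units, always expanding the lowest-scoring candidate seen so far and generating neighbours by swapping adjacent units. It does not use the package's search classes; the function names, the adjacent-swap neighbourhood, and the path-length criterion are all assumptions chosen to keep the example small.

```python
import heapq
import math

def path_length(order, coords):
    """Hypothetical criterion: total distance travelled when visiting units in order."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(order, order[1:]))

def greedy_best_first(start, score, max_expansions=200):
    """Greedy best-first search over orderings, minimizing `score`."""
    start = tuple(start)
    best, best_score = start, score(start)
    frontier = [(best_score, start)]  # priority queue keyed by score
    seen = {start}
    for _ in range(max_expansions):
        if not frontier:
            break
        current_score, current = heapq.heappop(frontier)
        if current_score < best_score:
            best, best_score = current, current_score
        # Neighbours: every ordering reachable by one adjacent swap.
        for i in range(len(current) - 1):
            neighbour = list(current)
            neighbour[i], neighbour[i + 1] = neighbour[i + 1], neighbour[i]
            neighbour = tuple(neighbour)
            if neighbour not in seen:
                seen.add(neighbour)
                heapq.heappush(frontier, (score(neighbour), neighbour))
    return best, best_score

coords = [(0.0, 0.0), (3.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
start = (0, 1, 2, 3)
best, best_score = greedy_best_first(start, lambda o: path_length(o, coords))
print("Initial score:", path_length(start, coords))
print("Best order:", best, "score:", best_score)
```

Only the ordering changes during the search, while the units and their inclusion probabilities are untouched, which mirrors the idea of improving a criterion without altering the prescribed first-order inclusion probabilities.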

Citation

If you use graphical-sampling, please cite the software package. If you use the spatial clustering or intelligent spatial sampling methods, please also cite the corresponding methodological paper.

Software citation

@software{graphical_sampling_2025,
  author = {Panahbehagh, Bardia and Mohebbi, Mehdi and HosseiniNasab, Amir Mohammad and Hosseini Moghadam, Mehdi},
  title = {graphical-sampling: A Python package for graphical finite-population and spatial sampling},
  year = {2025},
  url = {https://github.com/mehdimhb/graphical-sampling},
  note = {Python package}
}

Methodological papers

For the graphical finite-population sampling framework, cite:

@article{panahbehagh2026geometric,
  author = {Panahbehagh, Bardia},
  title = {Graphical Finite-Population Sampling},
  year = {2026},
  note = {Manuscript}
}

For the spatial sampling design, cite:

@article{panahbehagh2026intelligent,
  author = {Panahbehagh, Bardia and Mohebbi, Mehdi},
  title = {Intelligent n-Means Spatial Sampling},
  year = {2026},
  note = {Manuscript}
}

For the spatial spread measure, cite:

@article{panahbehagh2026spread,
  author = {Panahbehagh, Bardia and Mohebbi, Mehdi and HosseiniNasab, Amir Mohammad},
  title = {Measuring Spatial Spread via n-Means Balanced Clustering},
  year = {2026},
  note = {Manuscript}
}

Please replace the manuscript entries with the final journal citation once the papers are published.

Maintainers

Bardia Panahbehagh
Mehdi Mohebbi
Amir Mohammad HosseiniNasab
Mehdi Hosseini Moghadam

License

See the LICENSE file in the repository for the license terms before redistributing the package.
