Minimalistic stand-alone package for generating Behler–Parrinello symmetry-function descriptors for atomistic machine-learning potentials based on neural networks.
A .data file represents one structure/frame. It stores geometry, metadata,
and (optionally) the computed symmetry-function matrix.
Data files are created with pysrc/xdatcar2data.py from:
- an XDATCAR trajectory (-x/--xdatcar)
- a plain-text energy file (-e/--energies, one value per frame)
- an output folder (-o/--output)
Example:
python3 pysrc/xdatcar2data.py \
--xdatcar data/XDATCAR_ch4_lt \
--energies data/XDATEN_ch4_lt \
--output datafiles

By default this writes files named like struc_00001.data,
struc_00002.data, ... with an initially empty Symmetry Functions: block.
The C++ executable later fills that block according to the request file.
The generated .data structure contains these sections:
- Energy: eV - scalar energy for this frame
- Lattice: - 3x3 lattice vectors
- Periodicity: - periodicity flag (currently written as 3)
- Atomlist: - integer species indices per atom
- Chemical Symbols: - element symbols in index order (e.g. C H)
- Number of Bulk Atoms: - number of atoms in this frame
- Electronic Convergence: / Ionic Convergence: - metadata placeholders
- Coordinates: Cartesian - Cartesian coordinates for each atom
- Symmetry Functions: - populated after descriptor generation
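To inspect a .data file from Python, the section names listed above can be used as delimiters. The following is only a minimal sketch: it assumes each header starts a line and that the section body is whatever follows until the next header, which may not match the exact layout written by xdatcar2data.py. The function name split_sections and the sample text are hypothetical.

```python
# Section headers as listed above; the on-disk layout is an assumption.
HEADERS = [
    "Energy: eV",
    "Lattice:",
    "Periodicity:",
    "Atomlist:",
    "Chemical Symbols:",
    "Number of Bulk Atoms:",
    "Electronic Convergence:",
    "Ionic Convergence:",
    "Coordinates: Cartesian",
    "Symmetry Functions:",
]


def split_sections(text):
    """Map each header to the list of non-empty body lines that follow it."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = next((h for h in HEADERS if line.startswith(h)), None)
        if header is not None:
            current = header
            sections[current] = []
            rest = line[len(header):].strip()
            if rest:  # value written on the header line itself
                sections[current].append(rest)
        elif current is not None:
            sections[current].append(line)
    return sections


# Hypothetical miniature file, only to exercise the splitter.
sample = """Energy: eV
-24.05
Chemical Symbols:
C H
Symmetry Functions:
"""
sections = split_sections(sample)
print(sections["Energy: eV"])
print(sections["Symmetry Functions:"])
```

An empty list under Symmetry Functions: would indicate a file that the C++ executable has not yet processed.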
Notes when using your own systems:
- Keep the number of energy entries aligned with the number of trajectory frames.
- Keep atom count and ordering consistent across the trajectory.
- Multi-element systems are supported; element IDs are built dynamically from symbols found in each frame.
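The first note can be checked before running the converter. The sketch below is not part of the package; it assumes that every XDATCAR frame is introduced by a line containing "configuration=" (as in "Direct configuration=     1") and that the energy file holds one number per non-empty line. The function names and the sample inputs are hypothetical.

```python
def count_frames(xdatcar_text):
    """Count frames by their 'configuration=' marker lines."""
    return sum(1 for line in xdatcar_text.splitlines()
               if "configuration=" in line)


def read_energies(energy_text):
    """Parse the first number on every non-empty line."""
    return [float(line.split()[0])
            for line in energy_text.splitlines() if line.strip()]


def check_alignment(xdatcar_text, energy_text):
    """Return the frame count, or raise if the energy count differs."""
    n_frames = count_frames(xdatcar_text)
    n_energies = len(read_energies(energy_text))
    if n_frames != n_energies:
        raise ValueError(
            f"{n_frames} frames in trajectory but {n_energies} energies")
    return n_frames


# Hypothetical two-frame, one-atom trajectory and matching energies.
sample_xdatcar = """hypothetical system
1.0
5.0 0.0 0.0
0.0 5.0 0.0
0.0 0.0 5.0
H
1
Direct configuration=     1
0.0 0.0 0.0
Direct configuration=     2
0.1 0.0 0.0
"""
sample_energies = "-1.00\n-1.01\n"
print(check_alignment(sample_xdatcar, sample_energies))
```

For real inputs, read both files with open(...).read() and pass the text to check_alignment.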
The request file specifies which symmetry functions are calculated and with which hyperparameters. Each block header is followed by one or more parameter rows:
- G0: beta
- G1: cut-off radius
- G2: eta Rs
- G3: kappa
- G3f: kappa
- G4: eta zeta lambda
- G5: eta zeta lambda
See data/request for an example.
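To illustrate how such hyperparameters enter a descriptor, here is a minimal sketch of a radial G2 function in the commonly used Behler-Parrinello form, a Gaussian in the neighbour distance multiplied by a cosine cut-off. It assumes that form and takes the cut-off radius as an explicit argument; the definitions actually implemented by the C++ executable may differ. The names cutoff and g2 and the numerical values are illustrative only.

```python
import math


def cutoff(r, r_c):
    """Cosine cut-off: 1 at r = 0, decaying smoothly to 0 at r = r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)


def g2(distances, eta, r_s, r_c):
    """Sum of shifted Gaussians over neighbour distances, damped by the cut-off.

    distances: distances from the central atom to its neighbours
    eta, r_s:  Gaussian width and shift (the "eta Rs" row of a G2 block)
    r_c:       cut-off radius (assumed here to be passed in separately)
    """
    return sum(math.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_c)
               for r in distances)


# Illustrative values: four neighbours at the same distance from the centre.
neighbours = [1.09, 1.09, 1.09, 1.09]
print(g2(neighbours, eta=1.0, r_s=1.0, r_c=6.0))
```

Each parameter row in a request block would then correspond to one such function, evaluated for every atom.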
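- Extract one energy value per frame (e.g. from OUTCAR) into XDATEN:
  grep "y  w" OUTCAR | awk '{print $7}' > XDATEN
  The pattern contains two spaces so that it matches the per-ionic-step line "energy  without entropy=", whose seventh field is the energy(sigma->0) value.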
- Convert XDATCAR + XDATEN into a binary package (.pkg):
  python3 pysrc/xdatcar2pack.py \
    -x /path/to/XDATCAR \
    -e /path/to/XDATEN \
    -o /path/to/my_system.pkg
- Copy and edit a request file:
  cp data/request data/my_request
- Generate descriptors with the C++ executable:
  ./build/bpsfp \
    -d /path/to/my_system.pkg \
    -r data/my_request \
    -o /path/to/my_system_coeff.npz
- Train/evaluate models from the generated .npz (PyTorch utilities are in torch/perceptronium).
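Before training, it can help to see which arrays the generated .npz contains. An .npz file is a zip archive with one .npy member per array, so the names can be listed with the standard library alone. This is a small sketch; the helper name list_npz_arrays is hypothetical, and the array names written by bpsfp are not documented here, so none are assumed.

```python
import zipfile


def list_npz_arrays(path_or_file):
    """Return the array names stored in a .npz archive.

    Each array is stored as a member called '<name>.npy'; the suffix is
    stripped to recover the name under which the array was saved.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        return [name[:-len(".npy")]
                for name in archive.namelist() if name.endswith(".npy")]
```

Calling list_npz_arrays("/path/to/my_system_coeff.npz") returns the stored names; the arrays themselves can then be loaded with numpy.load on the same path.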
Perceptronium is a mixed C++ / Python software suite. The documented setup is to install a minimal system toolchain and use a dedicated Python virtual environment.
sudo apt install python3 python3-dev python3-venv build-essential cmake

Then create and activate an environment:
python3 -m venv ~/.venv-perceptronium
source ~/.venv-perceptronium/bin/activate

Install Python dependencies in that environment:
pip install torch torchvision torchaudio pyyaml matplotlib ase tqdm

Configure and build the C++ code:
cmake -S src -B build
cmake --build build -j

This produces the descriptor executable at ./build/bpsfp.