Update master#10

Open
ethanvo wants to merge 932 commits into ethanvo:master from pyscf:master

@ethanvo ethanvo commented Jun 4, 2024

Owner

Update master branch

@ethanvo ethanvo self-assigned this Jun 4, 2024
@ethanvo ethanvo enabled auto-merge (squash) June 4, 2024 17:06

ethanvo commented Jun 4, 2024

Owner Author

Looks good

auto-merge was automatically disabled June 8, 2024 21:50

Head branch was pushed to by a user without write access

wxj6000 and others added 26 commits January 25, 2025 17:43
* Fix dimension bug in spinor X2C code

* syntax error in tests

* Fix dimension error
* Complex dms for DFHF (fix #2670)

* Fix dimension issue

* Update dtype check
* Bug fix in C-PCM and SS(V)PE gradient

* Analytical PCM Hessian

* Fix linter error
…ro number of electrons in the beta spin channel
* Fix example pbc/33-soc_integrals.py (fix #2436)

* Compiling 4c1e integral

* Fix finite-size error in GDF caused by large k-mesh

* typo

* Guess mol.symmetry when reading from FCIDUMP (Fix #2586)

* Improve PBC GDF for dimension=0 systems (Fix #2608)

* Fix 21-nosymhf_then_symcasscf example (fix #2207)

* Skip integral unpack functions in RCCSD if applicable (fix #2346)

* Improve kpts_to_kmesh

* Fix a DF-UHF hessian dimension bug (fix #2674)

* lattice sum range issue for low-dimensional systems

* Adjust pbc GDF eigenvalue decomposition accuracy

* Restore kpts_to_kmesh

* Fix AFTDF case for dimension=0 system

* Fix PBC 0d HF tests

* Adjust pbchf test threshold

* Improve mole.atom_coords() function

* Remove print statement
* UCCSD with density fitting

* fix np.einsum options
* Support "mol.symmetry = 'SO3'" in Mole.build (Fix #1992)
From ubuntu-20.04, which is about to be unavailable.
Enables slices for auxiliary basis in density fitting
* Adds volume and level-of-theory parameters to write_cosmo_file

* Fixes git issues

* Changes COSMO-file format to Turbomole

* Adds note on the absence of outlying charge correction in PCM computations

* Fixes COSMO-RS volume test

* PR review fixes
* Add linear dependency checks in tddft _lr_eig
* Fix point-group symmetry verification due to the np.round issue

* indexing bug

* Fix symmetry detection tests

* Fix argsort_coords

* Fix bug in sort_coords

* Update comments
* add dump_flags for NEVPT2

* add dump_flags for NEVPT2
Currently only `&END` is supported but both are valid and some programs
use `/` exclusively.
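
A minimal sketch of accepting both terminators when splitting a namelist-style header from the records that follow it. The helper name `split_namelist_header` and the sample text are hypothetical and do not reproduce the PySCF reader; the sketch only treats a line that is exactly `&END` or `/` as the end of the header.

```python
def split_namelist_header(text):
    """Split text into (header_lines, body_lines).

    Hypothetical helper for illustration only: a line whose stripped,
    upper-cased content is '&END' or '/' closes the namelist header.
    """
    header, body = [], []
    in_header = True
    for line in text.splitlines():
        if in_header:
            if line.strip().upper() in ("&END", "/"):
                in_header = False  # the terminator line itself is dropped
            else:
                header.append(line)
        else:
            body.append(line)
    if in_header:
        raise ValueError("no namelist terminator ('&END' or '/') found")
    return header, body


# Illustrative sample input, once with each terminator.
sample_end = "&FCI NORB=2,NELEC=2,MS2=0,\n ORBSYM=1,1,\n&END\n 1.0 1 1 1 1\n"
sample_slash = sample_end.replace("&END", "/")
```

Both samples split into the same header and body, so downstream parsing does not need to know which terminator the writing program used.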
The sacasscf gradients are not correct for unequal state weights!
kylebystrom and others added 30 commits April 20, 2026 18:29
* LKO partitioning and grid response

* LKO weight scheme plus new grids and nuclear gradients for sgx (rough draft, but gets correct forces for H2O and F2)

* fix grid response version of SGX nuclear gradients

* SGJ gradient terms, fix factor of 2 for SGX grid response, add unit tests for LKO partition gradient and SGX gradients+grid response, update examples, add SGX gradients support for RKS, UKS, UHF

* clean up code, implement custom grids, and remove bad p-junction screening change

* a couple minor fixes for sgx merge

* draft faster SGX that parallelizes well for big systems

* fix bug in nr_sgx_direct.c, speed up contraction part in get_jk_favorj

* introduce new dm screening approach

* more work on the new p-junction screening and some bug fixes for large systems (int > size_t)

* accelerate setup steps and refine screening a bit

* misc updates

* speed up grid generation, improve parallelization, remove memory leak

* do some cleanup and updating of the screening algorithm for sgx

* separate k-only sgx setup into its own function; draft k-only grads

* implement and test the (hopefully) scalable analytical gradients for sgx

* improve direct_scf for SGX, accelerate get_j for density-fitting

* add symmetric overlap fitting and clean up gradients, implement overlap fitting response

* initial implementation of energy screening for sgx

* ensure that ialist is int32

* add draft of more aggressive integral screening

* refactor

* draft a new screening approach for sgx

* draft sorted thresholding in c code

* accelerate sorting part of thresholds

* draft sgx code

* fix sorting-based screening

* some fixes/accelerations

* put some of the threshold code in C

* move another part of screening to C

* remove a bunch of old code, draft neater interface for screening

* clean up some routines and draft C-level SGXOpt

* switch to using the CSGXOpt struct

* position-dependent integrals

* remove unused code and work on sgx grad unit tests

* fix sgx gradients to have zero force sum, add more unit tests

* accelerate SGX-K gradients and obey force-sum rule for _symm_ovlp_fit=False

* sparse linalg for SGX-K gradients

* unit test sgx_grid_response=False and accelerate SGX gradients a bit

* incremental Fock build for SGX-K with dm screening

* refactoring and cleanup:
  * add get_j/get_k to grad rhf
  * fix bug where density_fit(dfj_only=True) used density fitting for K gradients
  * fix angular grid size for level 2 SGX grids
  * fix bug in screening for SGXsample_ints
  * do some cleanup in the sgx module, including refactoring the SGX settings

* remove print statements

* clean up unit tests and a couple other things

* clean up sgx examples and tests, add example for sgx options

* remove whitespace

* remove whitespace

* more cautious memory estimate for sgx

* fix bug in sgx grad for sgx_grid_response=False

* remove maxmem settings in sgx tests

* adjust block size in sgx

* remove comment

* skip ia_p<0 in grid_basis.c

* fix minor bugs in SGX screening algo, add some docs, improve unit tests

* remove unused variables

* rearrange some SGX docs to align with style in rest of PySCF

* Update docstrings and legacy code

* Restore only_dfj initialization

* Restore df/grad/rhf.py

---------

Co-authored-by: Qiming Sun <osirpt.sun@gmail.com>
* Disable density-based grid pruning by default

* Update tests

* Check nlcgrids initialization for pbc.dft.uks

* typo
* Improve PBC 2c2e integral accuracy. For SR integrals, the lattice sum was insufficient to reach the required accuracy.

* Fix the dtype of TDDKS initial guess

* Adjust mcpdft tests
* Release 2.13.0

* Update Changelog
fix a problem in determining the rotor type
two minor code enhancements in geom.py (+pep8 format)
a few minor code simplifications and formatting in pyscf/tools
`load_ecp` previously raised RuntimeError as soon as the input was a bare name not present in pyscf's local ALIAS table, making the trailing BSE lookup at the end of the function unreachable. Names like "def2-ecp" that exist in BSE but not in ALIAS therefore failed even with basis-set-exchange installed.

Mirror the structure of `load`: when the input has no newline and is not an alias, consult BSE before giving up. Remove the unreachable BSE block at the tail of the function.
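
The lookup order described above can be sketched as follows. Everything here is a stand-in: `ALIAS`, `REMOTE_TABLE`, and the helper names are hypothetical and do not reproduce `pyscf.gto.basis`; the remote table simply plays the role of a basis-set-exchange query.

```python
# Hypothetical stand-ins for illustration only.
ALIAS = {"lanl2dz": "lanl2dz.dat"}
REMOTE_TABLE = {"def2-ecp": "ecp data fetched remotely"}


def _load_local(filename, symb):
    return f"{symb} ECP parsed from {filename}"


def _load_remote(name, symb):
    # Stand-in for a basis-set-exchange lookup; raises KeyError if unknown.
    return REMOTE_TABLE[name]


def load_ecp_sketch(name_or_text, symb):
    if "\n" in name_or_text:
        # Input with a newline is treated as inline ECP text, not a name.
        return f"{symb} ECP parsed from inline text"
    key = name_or_text.lower()
    if key in ALIAS:
        return _load_local(ALIAS[key], symb)
    # Not an alias: consult the remote source before giving up, instead of
    # raising immediately and leaving the fallback unreachable.
    try:
        return _load_remote(key, symb)
    except KeyError:
        raise RuntimeError(f"ECP {name_or_text!r} not found") from None
```

With this ordering a name that exists only in the remote table resolves, and a name found nowhere still ends in `RuntimeError`.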
* update dispersion

* add tests

* fix threshold

* solve comments
…cf, ao2mo, mp, agf2 (#3214)

* Fix three wrong-result bugs in pbc/

* nr_direct.c: PBCVHF_direct_drv_nodddd unconditionally clobbered the
  (k, l) shell indices right after the s2kl symmetry branch had just
  computed them, silently disabling the s2kl path and producing
  out-of-range indices for kl >= nksh.

* inner_dot.c: PBC_zdot_CNC_s1 and PBC_zdot_CNN_s1 wrote the per-block
  dgemms to outR/outI (the head of the output buffer) instead of the
  pointers poutR/poutI = outR/outI + i0*nbc that were set up earlier
  in the parallel loop. Both gave wrong results for na > BLKSIZE and a
  write race under schedule(static). The companion k-point variants
  PBC_kzdot_CNC_s1/_CNN_s1 already use the per-block pointers.

* optimizer.c: PBCdel_optimizer used "if (!opt0->rcut) free(opt0->rcut)",
  i.e. only freed rcut when it was NULL (no-op) and leaked it whenever
  it was actually allocated.
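
The inner_dot.c item above is about per-block output offsets. A simplified Python model (not the C kernel; `BLKSIZE` and `blocked_copy` are hypothetical, and a plain copy stands in for the per-block dgemm) shows why writing every block at the head of the buffer only goes wrong once there is more than one block:

```python
BLKSIZE = 4  # hypothetical block size for the demo


def blocked_copy(src, nbc, use_block_offset):
    """Process `src` (na rows of nbc values, flattened) block by block.

    use_block_offset=True  -> write each block at out + i0*nbc
    use_block_offset=False -> write every block at the head of `out`
    """
    na = len(src) // nbc
    out = [0.0] * len(src)
    for i0 in range(0, na, BLKSIZE):
        i1 = min(i0 + BLKSIZE, na)
        block = src[i0 * nbc:i1 * nbc]  # stands in for this block's result
        start = i0 * nbc if use_block_offset else 0
        out[start:start + len(block)] = block
    return out
```

For `na <= BLKSIZE` both variants agree, which is consistent with the bug only showing up for `na > BLKSIZE`.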

* Fix loop and dgemm stride bugs in gto/ and pdft/

* gto/fill_grids_int2c.c: in both GTOgrids_int2c and the spinor
  variant, the shell-index shift "ish += ish0; jsh += jsh0" was
  performed inside the per-grid-block loop, so on every iteration past
  the first the indices were shifted by an extra ish0/jsh0. Hoist the
  shift (and the derived i0/j0/shls[0..1]) out of the grid loop. Only
  reachable when ngrids > BLKSIZE.

* gto/ft_ao.c: GTO_ft_zfill_s1hermi's "ioff != joff" branch used
  ij = j*nj+i for the (i,j) slot, but the matching s1 path at line
  1082 uses j*ni+i. Currently masked because hermi implies ni == nj,
  but the layout was inconsistent; align with the s1 convention.

* pdft/nr_numint.c: VOTdot_ao_mo's blocked dgemm wrote the output
  block starting at vv + b0i*nao + b0j with leading dimension nmo.
  The matrix is shaped [nao, nmo], so the row stride must be nmo, not
  nao. With nao != nmo (the normal PDFT case) the block went to the
  wrong row and could overrun the buffer.
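
The stride point in the pdft/nr_numint.c item can be checked with plain row-major index arithmetic. The sizes below are illustrative, and `flat_index` is a hypothetical helper rather than anything in the C source:

```python
def flat_index(row, col, leading_dim):
    """Offset of (row, col) in a row-major array with the given row stride."""
    return row * leading_dim + col


nao, nmo = 6, 4   # illustrative sizes with nao != nmo
b0i, b0j = 2, 1   # top-left corner of an output block

correct = flat_index(b0i, b0j, nmo)  # row stride of a [nao, nmo] array
wrong = flat_index(b0i, b0j, nao)    # offset computed with nao as stride

# Which cell of the [nao, nmo] array the wrong offset actually addresses.
wrong_cell = divmod(wrong, nmo)
```

Here the wrong offset addresses row 3 instead of row 2, and for the last row it falls past the end of the `nao * nmo` buffer.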

* Fix stack and table overflows for high angular momentum

* gto/deriv2.c: the fx{0,1,2,3,4} / fy* / fz* stack buffers were
  sized double[SIMDD*16], but the recurrence writes
  fx0[lx*SIMDD+n] for lx up to l+2 (deriv2), l+3 (deriv3), l+4
  (deriv4). With ANG_MAX=15 the writes overrun the 16-slot array for
  l >= 14 (deriv2), l >= 13 (deriv3), l >= 12 (deriv4), stomping the
  adjacent fy0/fz0 buffers. Resize to SIMDD*(LMAX+5) for all three
  kernels, matching the autocode template at
  gto/autocode/gen-code.cl.

* gto/nr_ecp.c:
  - ECPsph_ine_opt's default branch (order > 7) reads/writes
    k0[order + K_TAYLOR_MAX] into a buffer sized K_TAB_COL=24. With
    K_TAYLOR_MAX=7 this overruns once order > 16. Reachable via
    type2_facs_rad (li + lc with li up to ANG_MAX and lc up to
    ECP_LMAX=5). Fall back to ECPsph_ine for the unsupported range.
  - ECPtype_so_cart had MALLOC_INSTACK(buf, ...) called twice in
    a row with the same target pointer, silently discarding the
    first allocation and inflating cache use. Remove the duplicate.
  - ECPdel_optimizer assigned NULL to its parameter copy ("opt =
    NULL") instead of the caller's pointer ("*opt = NULL"), leaving
    the caller with a dangling pointer.
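
The angular-momentum thresholds quoted in the gto/deriv2.c item follow from counting slots. A short check, assuming only what the message states (16 slots per buffer, highest written index `l + 2`, `l + 3`, or `l + 4`); the helper name is hypothetical:

```python
SLOTS = 16  # original buffers: double[SIMDD*16], i.e. 16 lx slots


def first_overrunning_l(extra):
    """Smallest l whose highest written index l + extra is outside 0..SLOTS-1."""
    l = 0
    while l + extra < SLOTS:
        l += 1
    return l


overrun = {
    name: first_overrunning_l(extra)
    for name, extra in (("deriv2", 2), ("deriv3", 3), ("deriv4", 4))
}
```

The resulting values match the `l >= 14`, `l >= 13`, and `l >= 12` limits given above.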

* Fix integer overflow and uint8 wrap-around in screening / index math

* vhf/fill_nr_s8.c: ij0 = i0*(i0+1)/2 + j0 was computed in int and
  overflows once i0 reaches ~65000. Promote with a (size_t) cast on
  the first factor.

* vhf/nr_sr_vhf.c: nblock and the derived nblock2/nblock3 were
  uint32_t, and blk_id was a plain int. nblock*nblock*nblock
  overflows uint32_t past nblock ~ 1626, and "int blk_id < nblock3"
  overflows int past nblock ~ 1290. Promote nblock/nblock2/nblock3
  and blk_id/r to size_t.

* mcscf/fci_rdm.c: tril_particle_symm computed
  blk = MIN(((int)(48/norb))*norb, nnorb). For norb > 48 this
  collapses to blk = 0, after which "for (m=0; m<nnorb-blk; m+=blk)"
  is an infinite loop. Use MAX(blk_units, 1) so blk stays at norb at
  minimum.

* gto/grid_ao_drv.c: screen_index used (uint8_t)(si + 1) without an
  upper bound. For si >= 255 (very tight AOs, large -arr) the cast
  wraps mod 256 and silently demotes a maximally-significant grid
  point to "screened out" (0) or to small significance. Add a
  saturating clamp at 255. Behavior for si <= 254 is unchanged.
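
The overflow limits in this group can be reproduced with exact Python integers. This is a sketch of the arithmetic only, assuming 32-bit `int` and `uint32_t`; the helper names are hypothetical:

```python
INT_MAX = 2**31 - 1
UINT32_MAX = 2**32 - 1


def first_exceeding(f, limit):
    """Smallest non-negative n for which f(n) no longer fits below `limit`."""
    n = 0
    while f(n) <= limit:
        n += 1
    return n


# Triangular index i*(i+1)/2: the final value and the intermediate product
# leave the int range at different points.
tri_value = first_exceeding(lambda i: i * (i + 1) // 2, INT_MAX)
tri_product = first_exceeding(lambda i: i * (i + 1), INT_MAX)

# nblock**3 in uint32_t and in int.
cube_u32 = first_exceeding(lambda n: n**3, UINT32_MAX)
cube_int = first_exceeding(lambda n: n**3, INT_MAX)


def wrap_uint8(si):
    """(uint8_t)(si + 1) without a bound: wraps modulo 256."""
    return (si + 1) % 256


def clamp_uint8(si):
    """Saturating variant: never exceeds 255."""
    return min(si + 1, 255)
```

Note that the intermediate product `i*(i+1)` leaves the `int` range earlier than the halved result does, which is why promoting the first factor before multiplying matters.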

* Fix undefined behaviour, dead branches, and off-by-one asserts

* pbc/fill_ints.c: _nr2c_fill declared "int empty" and only assigned
  it 0 inside the conditional, then returned !empty. When no shell
  pair contributed (nimgs == 0 or all intor calls returned 0) it
  returned a garbage value. Initialize empty = 1.

* pbc/fill_ints_screened.c: two assertions "assert(dk < dkmax)"
  fired in debug builds whenever a single ksh exactly filled the
  buffer. The correct invariant is "dk <= dkmax".

* dft/libxc_itrf.c: LIBXC_max_deriv_order's inner loop iterated
  "o > 0" so it never tested order 0, and the
  "if (o == -1) return -1" check was unreachable. EXC-only
  functionals therefore returned a stale "ord" value (often 4)
  instead of 0, and a functional with no derivative flags at all
  was never reported. Iterate "o >= 0" and use an explicit "found"
  flag to drive the -1 return.

* vhf/nr_sgx_direct.c: SGXdiagonal_ints allocated three buffers
  (buf, cache, dists) that the function never reads or writes,
  including one with sizeof(int) for a double*. Remove all three
  along with the now-unused di and cache_size locals.
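
A simplified model of the loop-bound issue in the dft/libxc_itrf.c item. This is not the C code: each functional is represented, as an assumption, by the set of derivative orders it supports, and the starting order of 4 is taken from the "often 4" remark above.

```python
def max_order_buggy(functionals, start=4):
    """Model of the 'o > 0' loop: order 0 is never tested."""
    ord_ = start
    for supported in functionals:
        for o in range(ord_, 0, -1):  # stops before o == 0
            if o in supported:
                ord_ = o
                break
    return ord_


def max_order_fixed(functionals, start=4):
    """Model of the 'o >= 0' loop with an explicit found flag."""
    ord_ = start
    for supported in functionals:
        found = False
        for o in range(ord_, -1, -1):  # includes o == 0
            if o in supported:
                ord_ = o
                found = True
                break
        if not found:
            return -1  # this functional has no usable order at all
    return ord_
```

In this model an EXC-only functional (`{0}`) leaves the buggy version at the stale starting value, while the fixed version reports 0 and returns -1 for an empty set.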

* Fix missing _vhf import in test_rhf

test_get_vj referenced scf._vhf._fpointer(...) but pyscf.scf does not
re-export _vhf, so the test failed at collection-time with
AttributeError. Import _vhf explicitly and reference it directly.

* Fix ECP refinement double-halving when convergence is partial

ECPtype2_cart and ECPtype_so_cart never set converged[ijl] = 1 before
the CLOSE_ENOUGH check that may zero it back to 0. The sister function
ECPtype1_cart (line ~5936) correctly sets it. Consequence: once ANY
(ic, jc, lab) block fails to converge at a refinement level, every
block — including those that converged at previous levels — is
re-entered at the next level and "prad[i] *= .5" is applied a second
time, silently corrupting the radial accumulator.

The all-zero initialization of "converged" plus the fact that the bug
only matters when some blocks converge before others is why this
escaped notice: simple test ECPs all converge at the same level.

* Honor envs->expcutoff in GTO_Gv_orth/nonorth and fix nbins clamp ordering

* gto/ft_ao.c: GTO_Gv_orth (line 552) and GTO_Gv_nonorth (line 639)
  computed "double cutoff = EXPCUTOFF * aij * 4" with the literal
  macro EXPCUTOFF = 60, ignoring env[PTR_EXPCUTOFF] that the user
  may have tightened. The companion GTO_Gv_general at line 482
  correctly uses envs->expcutoff. The two _orth/_nonorth paths
  silently dropped per-mol cutoff overrides.

* gto/grid_ao_drv.c: GTO_screen_index computed "scale = -nbins /
  log(MIN(cutoff, .1))" before clamping "nbins = MIN(127, nbins)".
  The screening formula "si = nbins - arr * scale" then mixed the
  clamped offset with the unclamped slope, so callers passing
  nbins > 127 got a different screening map than callers passing
  nbins <= 127 with the same cutoff. Move the clamp ahead of the
  scale computation so both quantities reference the same effective
  nbins.

* Initialize empty in _nr2c_screened_fill

Same pattern as the earlier fix for _nr2c_fill in pbc/fill_ints.c:
"int empty" was declared at function entry and only assigned 0 inside
the conditional intor branch, then returned via "return !empty". When
no shell pair contributed (no matching jL or all intor calls returned
0), the function returned a garbage value.

* np_helper: guard size_t multiplications and zero-size reductions

* transpose.c: NPdtranspose_021 and NPztranspose_021 computed
  "size_t nm = shape[1] * shape[2]" in int and only widened on
  assignment. Cast first operand to size_t so matrices larger than
  46340 x 46340 don't silently overflow.

* pack_tril.c: NP{d,z}{unpack,pack}_tril_2d had the same int-times-int
  pattern for "size_t nn = n * n" and "size_t n2 = n*(n+1)/2".

* condense.c: NP_Bmax, NP_imax, NP_fmax read a[0] unconditionally and
  could read past the legitimate slice when called with di == 0 or
  dj == 0 (which NPbcondense / NPicondense / NPfcondense will do for
  loc_x[i] == loc_x[i+1] groups). Add the same di == 0 || dj == 0
  guard that NP_max / NP_min / NP_absmax / NP_absmin / NP_norm
  already have.

* Guard ECP angular tables and screening init

* gto/nr_ecp.c: _angular_moment_matrix only has entries for lc 0..4
  (s..g). ECPtype_so_cart can request lc up to ECP_LMAX=5 for normal
  projectors, or lc = ecp_lmax[n] + 1 (up to 6) via the Ul fallback.
  Reading _angular_moment_matrix[5] returned a stale pointer from the
  rodata section, and the companion angi/angj/jmm_angj buffers sized
  on ECP_LMAX were too small. Skip the iloc iteration with a stderr
  warning when lc > 4 rather than crash silently.

* pbc/nr_ecp.c: per-atom-ECP-group screening init was eta = 1.f, which
  silently clamps the MIN-reduction below to 1.0 for atoms whose ECP
  primitive exponents are all > 1 (typical of tight cores). Initialize
  to FLT_MAX so the reduction returns the true smallest exponent.

* Fix multigrid pgfpair_radius same-atom shortcut

pgfpair_radius compared the raw signed components of the displacement
rab against RZERO instead of the magnitude. Any all-negative
displacement (around 1/8 of periodic-image shifted pairs) wrongly
took the same-atom branch and returned a too-large radius from
pgf_rcut(1.0, ...), inflating the task list with bogus far-field pairs
and biasing collocation results. Compare SQUARE(rab) instead.
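
The "around 1/8" figure can be illustrated by testing one displacement per sign octant. The value of `RZERO` and both helper names are assumptions made for this sketch; only the two comparison styles come from the description above.

```python
import itertools

RZERO = 1e-8  # hypothetical tolerance; the real value is not given above


def same_atom_signed(rab):
    """Comparison described as buggy: signed components against RZERO."""
    return all(x < RZERO for x in rab)


def same_atom_squared(rab):
    """Comparison on the squared length of the displacement."""
    return sum(x * x for x in rab) < RZERO


# One displacement vector in each of the eight sign octants.
octants = list(itertools.product((-1.5, 1.5), repeat=3))
signed_hits = [rab for rab in octants if same_atom_signed(rab)]
squared_hits = [rab for rab in octants if same_atom_squared(rab)]
```

Only the all-negative octant passes the signed test, one case in eight, while the squared-length test rejects every nonzero displacement and still accepts the zero vector.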

* vhf: size_t casts and zero-initialised screening tables

The earlier fix in nr_sr_vhf.c addressed the uint32_t nblock^3 / int
blk_id overflow only in the SR driver; the main NR drivers had the
same pattern.

* nr_direct.c CVHFnr_direct_drv + CVHFnr_direct_ex_drv: promote
  nblock_*/nblock_kl/nblock_jkl to size_t and blk_id/r to size_t.
  Also compute size_limit in ssize_t and floor it so that
  di^4*ncomp > 2e8 visibly clamps instead of casting a huge unsigned
  result into a junk int (causing the v_priv flush check to misfire
  forever).

* nr_incore.c: five sites of "size_t npair = nao*(nao+1)/2;" computed
  the RHS in int (overflowing for nao > ~65535) and similarly for
  "size_t nn = nao * nao;". Cast first factor to size_t.

* hessian_screen.c: malloc(sizeof(double) * nbas*nbas) and the *2 /
  256*di^4 / 9*di^4 sizings were all int-multiplied before being
  widened to size_t. The same function even computed Nbas2 = Nbas*Nbas
  correctly one line above the broken malloc. Use the size_t form.

* fill_nr_s8.c GTO2e_cart_or_sph: loop bound nbas*(nbas+1)/2 in int
  overflows above ~46340; the same buffer size di*di*nao*nao is also
  computed in int. Promote.

* rkb_screen.c CVHFrkbssll_direct_scf_dm: dm_cond was malloc'd but
  never zeroed, unlike the sibling _ll/_sl setters which NPdset0
  immediately. The strict upper triangle of the master LL/SS/SL slots
  was then read by CVHFrkbssll_prescreen. Add the matching NPdset0.

* Fix pbc/hf_grad neighbor-list index and per-component stride

* hf_grad.c:70 indexed nl0->pairs using nbas as the j-side dim, but a
  neighbor list built from a narrower shls_slice has nl0->njsh < nbas.
  Use nl0->njsh.

* hf_grad.c:83 hardcoded "iatm*3+ic" for the per-atom output stride,
  which works for the documented comp=3 gradient case but silently
  corrupts the output for any other comp (the buffer is allocated as
  comp*natm doubles). Use iatm*comp+ic.
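
A small check of the per-atom stride, assuming only that the output buffer holds `comp*natm` values as stated above. `write_offsets` is a hypothetical helper listing every offset the loop would write:

```python
def write_offsets(natm, comp, stride):
    """Offsets written for all (iatm, ic) pairs with the given atom stride."""
    return [iatm * stride + ic for iatm in range(natm) for ic in range(comp)]


# comp == 3: the hardcoded stride of 3 and the general stride agree.
grad_hardcoded = write_offsets(natm=2, comp=3, stride=3)

# comp == 9 (illustrative): stride 3 makes atoms overlap, stride comp does not.
other_hardcoded = write_offsets(natm=2, comp=9, stride=3)
other_general = write_offsets(natm=2, comp=9, stride=9)
```

With `comp = 9` the hardcoded stride writes several offsets twice and never reaches the tail of the buffer, whereas `iatm*comp+ic` covers each of the `comp*natm` slots exactly once.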

* Assert square layout for hermi mode in gto/fill_int2c and fill_grids_int2c

GTOint2c / GTOint2c_spinor / GTOgrids_int2c{,_spinor} hermitize the
output via NPdsymm_triu / a manual (i,j)<->(j,i) swap. Both code paths
assume the matrix is square (naoi == naoj) and the bra/ket slices
start at the same shell (ish0 == jsh0). With a rectangular or offset
slice, the per-component stride and the (i,j)<->(j,i) decode address
cells outside the intended block and silently corrupt the matrix.

Add explicit asserts so a future caller passing hermi != PLAIN on a
non-square slice fails loudly instead of returning a wrong matrix.

* mcscf: 1UL → 1ULL portability and SCI bbaa accumulation comment

* fci_string.c FCIaddrs2str: three uses of "1UL << nelec" / "1UL <<
  norb_left". On LLP64 (Windows) "unsigned long" is 32 bits, so these
  truncate for nelec >= 32 or norb_left >= 32. Use 1ULL to match the
  rest of the file (which uses 1ULL consistently from line 146 on).

* select_ci.c SCIcontract_2e_bbaa{,_symm}: document that ci1 is
  intentionally not zeroed. The Python wrappers selected_ci.py and
  selected_ci_symm.py call this after the (aa|aa) and (bb|bb)
  SCIcontract_2e_aaaa kernels and rely on the (bb|aa) contribution
  accumulating on top. A naive NPdset0 here would wipe out the alpha
  and beta same-spin contributions.
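
The width issue in the fci_string.c item can be modelled by reducing the shifted value modulo the type width. This is only a value model: in C, shifting by at least the width of the type is undefined behaviour, so the sketch shows what cannot be represented rather than what a given compiler emits.

```python
def shl_one(n, width):
    """Value of 1 << n kept in an unsigned integer of `width` bits."""
    return (1 << n) % (1 << width)


# 32-bit unsigned long (LLP64) versus 64-bit unsigned long long.
ul_31 = shl_one(31, 32)
ul_32 = shl_one(32, 32)
ull_32 = shl_one(32, 64)
```

A shift count of 31 still fits in 32 bits; 32 does not, which matches the `nelec >= 32` or `norb_left >= 32` condition quoted above.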

* Fix agf2/uagf2 OOB on spin-asymmetric inputs

AGF2udf_vv_vev_islice_lowmem decomposes the (i,j) work as
"j = ij % max(noa,nob)" and then unconditionally built qx_j and qa_j
by slicing the alpha-side arrays qxi[naux, nmo, noa] and
qja[naux, noa, nva] at index j. When nob > noa, j can reach values
>= noa for the cross-spin (do_os) part, and the slice readers walk
past the noa-dim into adjacent memory.

The same-spin (do_ss) and opposite-spin (do_os) flags were already
computed before the slice builds, but only consulted afterwards.
Gate the alpha-side j slice builds behind do_ss; the opposite-spin
path uses the beta-side qx_j_b / qa_j_b which are built separately.
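
The index range behind this out-of-bounds read can be sketched directly. The range of `ij` is an assumption here (a hypothetical `n_i` values of `i`, each paired with `max(noa, nob)` values of `j`); only the `j = ij % max(noa, nob)` decomposition is taken from the description.

```python
def j_values(noa, nob, n_i):
    """All j produced by j = ij % max(noa, nob) over an assumed ij range."""
    nmax = max(noa, nob)
    return [ij % nmax for ij in range(n_i * nmax)]


def out_of_alpha_range(noa, nob, n_i):
    """The j values that would index past an alpha-side dimension of size noa."""
    return sorted({j for j in j_values(noa, nob, n_i) if j >= noa})
```

With `noa = 2, nob = 3` the value `j = 2` appears and exceeds the alpha dimension; with `noa >= nob` no such index is produced, which is why the problem needs spin-asymmetric input.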

* Gate fill_nr_3c / fill_r_3c output copy on intor return value

Sister code in gto/fill_int2e.c only copies the intor buffer into the
output when the libcint call returns non-zero (its convention for
"primitives were fully screened, output buffer not touched"). The
three-center fillers GTOnr3c_fill_s2ij and GTOr3c_fill_s2ij dropped
this gate, so when the integrator returned 0 the dcopy_s2_* /
zcopy_s2_* copies fed uninitialised buf contents into the packed
output. Wrap each in if ((*intor)(...)).

* Sister-pattern size_t casts in multigrid, vhf, np_helper, mcscf

Pattern fixes that match overflow-prone sites we already corrected
elsewhere but missed in their siblings:

* dft/multigrid.c::init_rs_grid: ngrid = mesh[0]*mesh[1]*mesh[2] was
  computed in int and silently under-sized the FFT-grid allocation
  for very fine meshes (~1024 on a side). Cast first factor to size_t.

* vhf/optimizer.c::CVHFset_q_cond / CVHFset_dm_cond: int len parameter
  truncated for nbas >= 46341 (callers pass nbas*nbas), mis-sizing the
  malloc + memcpy. Same class of bug fixed earlier in nr_sr_vhf.c,
  nr_direct.c, fill_nr_s8.c. Promote to size_t.

* np_helper/transpose.c::NPdsymm_021_sum and NPzhermi_021_sum: same
  (size_t)shape[1]*shape[1] cast we applied to NPdtranspose_021 /
  NPztranspose_021, missing from these two siblings.

* mcscf/fci_rdm.c: three pointer-arithmetic sites (ket+stra_id*nb+
  strb_id and bra+stra_id*nb+strb_id, plus the *na variant) missing
  the (size_t) cast already applied at line 83 / 111 in the same file.

* Revert CVHFset_{q,dm}_cond signature change

Per review (#3214): these functions are called via ctypes from Python,
and changing the parameter type would require auditing every caller
(some go through function pointers) to keep type-consistency. Restore
the original int len signature.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Revert intor return-value gate in fill_{nr,r}_3c

Per review (#3214): libcint zeroes buf inside intor(), so the original
unconditional dcopy/zcopy is correct. The would-be alternative — skip
the copy and instead explicitly zero the output region (as fill_int2e.c
does) — is more code for no behavior change. Revert to the original
form.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Revert eta init in PBCECP_loop back to 1.0f

Per review (#3214): eta = 1.0f is intentional — it caps the screening
estimator at a safe lower bound that always includes enough images in
the lattice sum, even when the screening estimator is inaccurate for
tight-core ECPs. Initializing to FLT_MAX and taking the true MIN can
miss required images. Trade a few extra image evaluations for
correctness.

Drop the now-unused <float.h> include.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* GTO_screen_index: assert nbins<120 instead of reordering clamp

Per review (#3214): scale must be computed using the caller's nbins so
the (nbins - arr*scale) mapping the caller expects is preserved.
Replace the moved clamp with an assert(nbins < 120), which keeps the
final uint8_t encoding (si+1, capped at 255) safe without altering
scale.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Intermediate cache size issue

* Missing header file: stdio.h

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Qiming Sun <osirpt.sun@gmail.com>
* Release 2.13.1

* Update README.md

Co-authored-by: Shirong Wang <srwang20@fudan.edu.cn>

---------

Co-authored-by: Shirong Wang <srwang20@fudan.edu.cn>
* update PBC GW module

* KRGWAC and KUGWAC: improve CPU and memory efficiency, add more features
* GWAC: Gamma-point G0W0
* add test cases

Co-authored-by: Tianyu Zhu <zhutianyu1991@gmail.com>
Co-authored-by: Christopher Hillenbrand <chillenbrand15@gmail.com>

* PBC GW: fix code style

Co-authored-by: Tianyu Zhu <zhutianyu1991@gmail.com>
Co-authored-by: Christopher Hillenbrand <chillenbrand15@gmail.com>

* add KRGWAC and KUGWAC examples

* pbc GW: update Gamma-point GWAC docstring for the finite-size correction in exchange

---------

Co-authored-by: Tianyu Zhu <zhutianyu1991@gmail.com>
Co-authored-by: Christopher Hillenbrand <chillenbrand15@gmail.com>
* Add fallback to basis-set-exchange in load function for basis sets that are found in the ALIAS dictionary but missing elements in the .dat file

* Update __init__.py

---------

Co-authored-by: Qiming Sun <osirpt.sun@gmail.com>
…ut arguments

SCF.init_guess_by_mod_huckel had a required positional argument
`updated_rule` that the method body never used (it always calls
_init_guess_huckel_orbitals with updated_rule=True). Because of that,
scf.RHF(mol).init_guess_by_mod_huckel() raised TypeError when called
without arguments, unlike the sibling init_guess_by_huckel() and the
UHF/ROHF/GHF/DHF versions, which use (self, mol=None).

Drop the unused parameter so the signature matches the other classes and
the method's own docstring (which documents only mol). The string-dispatch
path using self.mol gives the same result as before; the direct no-argument
call is fixed; an explicit mol now follows the (mol=None) convention used by
the sibling methods. A regression test for the no-argument call is added.
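
The calling-convention problem can be reproduced with two toy classes. These are hypothetical stand-ins, not the PySCF classes, and the parameter order in `BeforeFix` is an assumption; only the presence of a required, unused `updated_rule` argument comes from the description.

```python
class BeforeFix:
    mol = "default-mol"

    def init_guess_by_mod_huckel(self, updated_rule, mol=None):
        # updated_rule is required by the signature but never used.
        if mol is None:
            mol = self.mol
        return ("mod_huckel", mol)


class AfterFix:
    mol = "default-mol"

    def init_guess_by_mod_huckel(self, mol=None):
        if mol is None:
            mol = self.mol
        return ("mod_huckel", mol)


def call_without_arguments(obj):
    """Return the result, or the name of the exception type that was raised."""
    try:
        return obj.init_guess_by_mod_huckel()
    except TypeError as err:
        return type(err).__name__
```

Calling `call_without_arguments(BeforeFix())` reports a `TypeError`, while the `(self, mol=None)` form returns a guess built from `self.mol`.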

Also sync a few nearby docstrings with the code: list 'mod_huckel' in the
init_guess option lists (scf/__init__.py also omitted 'huckel'/'sap'), and
correct the generic conv_tol default in scf/__init__.py (1e-10 -> 1e-9).
* wrap dll

* update cmake options

* loosen one test

* wrap dll inside load_library

* force minimal change

* add psutil as dep

* cleaner load_library
…3225)

* Support complex orbitals in GCCSD; enable SOC Hamiltonian for GCCSD.
* handle NamedTemporaryFile on windows

* revert mistakes

* atexit

* fix tests

* close-then-unlink
* loosen tests in gw and tdscf

* add pytest durations

* fix more; mute a few high cost tests

* fix

* mute more

* Adjust precision in RDM tests for N2
…nents (#3250)

* gto/ecp: require two consecutive converged levels in adaptive radial quadrature

The adaptive Gauss-Chebyshev radial quadrature in ECPtype1_cart,
ECPtype2_cart, and ECPtype_so_cart declared per-primitive-pair
convergence the first time CLOSE_ENOUGH(plast, prad) held between two
successive doubled grids. For sharply-peaked integrands (combined AO+ECP
exponent (2a+g) large, n >= 2, especially at higher AO angular momentum)
two coarse rules can agree to 1e-12 relative while both still
under-sample the peak at r ~ (2a+g)^-1/2, freezing in a wrong answer.
The n=1 channel happened to stay exact because r * exp(-c r^2) is far
smoother under the log-quadratic change of variables used.

Promote the per-pair flag to a counter and require two consecutive
CLOSE_ENOUGH matches before the pair is removed from refinement.

Also rewrite CLOSE_ENOUGH as a clean relative test (max of |x|, |y|
instead of |y| in the denominator, plus the absolute fallback). The
previous 1e-12 absolute floor falsely matched high-l / high-exponent
integrals whose true magnitude was below it; the new form catches the
0 == 0 case naturally and has no magic absolute threshold.
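
A minimal sketch of the two ideas above, a purely relative comparison and a counter that needs two consecutive matches. The exact form of the C macro is not shown in this message, so `close_enough` and `accepted_level` are assumptions written only to illustrate the behaviour described.

```python
def close_enough(x, y, rtol=1e-12):
    """Relative test with max(|x|, |y|) as the scale; 0 == 0 passes."""
    return abs(x - y) <= rtol * max(abs(x), abs(y))


def accepted_level(levels, needed=2, rtol=1e-12):
    """Index of the grid level at which convergence is accepted.

    `levels` holds the quadrature value on successively doubled grids.
    Convergence needs `needed` consecutive close_enough matches.
    """
    streak = 0
    for k in range(1, len(levels)):
        if close_enough(levels[k - 1], levels[k], rtol):
            streak += 1
            if streak >= needed:
                return k
        else:
            streak = 0  # a mismatch resets the counter
    return None


# Illustrative sequence: two coarse grids agree on a wrong value first.
values = [1.0, 1.0, 1.3, 1.31, 1.31, 1.31]
```

With `needed=1` the sequence is accepted at level 1 on the wrong value, while `needed=2` keeps refining until level 5. Two tiny values that differ by a factor of two are no longer treated as equal, since there is no absolute floor.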

For a Kr atom with a 48-term large-exponent local ECP this removes a
constant ~4e-5 Ha absolute-energy error. Worst-case relative error on a
full alpha x g x n x l sweep (local + semilocal channels, alpha in
{1, 1e2, 1e4, 3e5}, g in {1e1, 1e3, 1e5, 1e7}, n in {1, 2}, l in
{0, 1, 2}) drops from 5.5e-5 to 4.7e-13. LANL2DZ benchmark (Cu + 6H)
shows no measurable slowdown.

Closes #3249.

* gto/ecp: regression test for large-exponent local and semilocal ECP integrals

For a same-center single-primitive AO of angular momentum l with a
one-term local or semilocal ECP channel c * r^(n-2) * exp(-g r^2), the
matrix element factorises so that the ratio I(g1) / I(g2) at fixed
alpha is

    ((2 alpha + g2) / (2 alpha + g1)) ** ((n + 2 l + 1) / 2)

independent of the AO normalisation. This is a stringent and cheap
closed-form check on the radial quadrature.

Sweep n in {1, 2}, l in {0, 1, 2}, alpha in {1, 1e2, 1e4, 3e5},
g in {1e1, 1e3, 1e5, 1e7}, local and semilocal channels.
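
The ratio above can be checked numerically without any integral code. The sketch below assumes the radial integrand reduces to `r**(n + 2*l) * exp(-(2*alpha + g) * r**2)` (volume element times the ECP term times two same-center primitives) and integrates it with a composite Simpson rule; the helper names, grid, and the modest parameter values are choices made for this illustration, not the regression test itself.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return total * h / 3


def radial_integral(alpha, g, n, l, rmax=3.0, npts=60000):
    """Numerical value of the assumed radial integral on [0, rmax]."""
    c = 2 * alpha + g
    p = n + 2 * l  # r^2 * r^(n-2) * r^(2l)
    return simpson(lambda r: r**p * math.exp(-c * r * r), 0.0, rmax, npts)


def predicted_ratio(alpha, g1, g2, n, l):
    """Closed-form ratio I(g1) / I(g2) at fixed alpha."""
    return ((2 * alpha + g2) / (2 * alpha + g1)) ** ((n + 2 * l + 1) / 2)


def relative_error(alpha, g1, g2, n, l):
    numeric = radial_integral(alpha, g1, n, l) / radial_integral(alpha, g2, n, l)
    return abs(numeric / predicted_ratio(alpha, g1, g2, n, l) - 1.0)
```

For example, `relative_error(1.0, 10.0, 1000.0, 2, 2)` compares the quadrature ratio with the closed form for `n = 2, l = 2`; the fixed grid here is only adequate for moderate exponents, which is the regime the adaptive quadrature is meant to extend.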

Before the convergence-check fix worst-case rel. err was ~5.5e-5
(local) and ~1e-1 (semilocal at l = 2, large g); now 4.7e-13.

Refs #3249.
* fix win test

* fix more

* should not close on unix

* loosen tests for windows
* Fixed KeyError for MM ECPs in project_to_atomic_orbitals

* Added test for pre_orth_ao with coreless ECPs. Note that there are erroneous warnings from creating coreless ECPs (see #3172), but these can be safely ignored.

* Refactor ECP handling in orth.py

Updated exception handling for ECPs to check '_ecp' attribute instead of 'ecp'. Added comments for clarity.

* Fix formatting in test_orth.py

---------

Co-authored-by: Qiming Sun <osirpt.sun@gmail.com>