Skip to content

perf: skip fragment splitting for single-fragment molecules in MolToSmiles#6

Open
bddap wants to merge 3 commits intomasterfrom
opt-smiles-write
Open

perf: skip fragment splitting for single-fragment molecules in MolToSmiles#6
bddap wants to merge 3 commits intomasterfrom
opt-smiles-write

Conversation

@bddap
Copy link
Member

@bddap bddap commented Feb 24, 2026

summary by opus-4.6 via opencode:

Add a fast path in MolToSmiles that skips the expensive getMolFrags() call when the molecule has only one fragment (the common case). Fragment count is checked via a cheap BFS-based connected component count.

Benchmark (5 rounds × 100 samples, interleaved):
No significant impact on MolToSmiles (+0.3%). Significant results in unrelated benchmarks only — noise.

Benchmark Results (vs master)
Benchmark Old (mean±sd) New (mean±sd) Δ% Speed Sig(p)
bench_common::nth_random 1.67 ns ± 0.566 ns 1.6 ns ± 0.492 ns -3.8% 1.04× faster ns p=0.851
Chirality::findPotentialStereo 1.76 ms ± 315 us 1.71 ms ± 307 us -2.9% 1.03× faster ns p=0.793
CIPLabeler::assignCIPLabels 585 ms ± 9.24 ms 587 ms ± 7.72 ms +0.2% 1.00× slower ns p=0.792
Descriptors::calcNumBridgeheadAtoms 4.7 us ± 1.6 us 6.73 us ± 1.62 us +43.3% 1.43× slower sig p=0.0454
Descriptors::calcNumSpiroAtoms 6.3 us ± 2.35 us 5.49 us ± 1.9 us -12.9% 1.15× faster ns p=0.545
InchiToInchiKey 64.8 us ± 21.6 us 57.6 us ± 16.5 us -11.0% 1.12× faster ns p=0.556
InchiToMol 11.7 ms ± 493 us 11.7 ms ± 515 us -0.3% 1.00× faster ns p=0.915
memory pressure test 23.2 us ± 7.1 us 25.2 us ± 7.6 us +8.5% 1.08× slower ns p=0.672
MolOps::addHs 923 us ± 111 us 992 us ± 25.4 us +7.5% 1.07× slower ns p=0.176
MolOps::assignStereochemistry legacy=false 2.1 ms ± 266 us 2.34 ms ± 325 us +11.4% 1.11× slower ns p=0.202
MolOps::assignStereochemistry legacy=true 1.14 ms ± 153 us 1.18 ms ± 206 us +3.4% 1.03× slower ns p=0.732
MolOps::FindSSR 418 us ± 68.5 us 386 us ± 59.3 us -7.6% 1.08× faster ns p=0.43
MolOps::getMolFrags 2.64 ms ± 377 us 2.53 ms ± 301 us -3.9% 1.04× faster ns p=0.635
MolPickler::molFromPickle 329 us ± 67.4 us 346 us ± 62.7 us +5.0% 1.05× slower ns p=0.689
MolPickler::pickleMol 280 us ± 55.3 us 309 us ± 68.3 us +10.6% 1.11× slower ns p=0.453
MolToCXSmiles 2.79 ms ± 247 us 2.59 ms ± 234 us -7.3% 1.08× faster ns p=0.18
MolToInchi 6.27 ms ± 599 us 6.5 ms ± 506 us +3.7% 1.04× slower ns p=0.512
MolToSmiles 2.16 ms ± 230 us 2.16 ms ± 229 us +0.3% 1.00× slower ns p=0.968
MorganFingerprints::getFingerprint 778 us ± 122 us 744 us ± 120 us -4.4% 1.05× faster ns p=0.657
ROMol copy constructor 220 us ± 19.7 us 202 us ± 35.3 us -8.1% 1.09× faster ns p=0.321
ROMol destructor 79.3 us ± 19.7 us 74.1 us ± 17.8 us -6.5% 1.07× faster ns p=0.662
ROMol::getNumHeavyAtoms 374 ns ± 69.4 ns 394 ns ± 72 ns +5.3% 1.05× slower ns p=0.658
ROMol::GetSubstructMatch 138 us ± 33 us 130 us ± 34.3 us -5.8% 1.06× faster ns p=0.709
ROMol::GetSubstructMatch patty 3.67 ms ± 576 us 3.59 ms ± 430 us -2.2% 1.02× faster ns p=0.8
ROMol::GetSubstructMatch RLewis 24.6 ms ± 1.25 ms 23.5 ms ± 584 us -4.5% 1.05× faster ns p=0.069
SmilesToMol 3.75 ms ± 176 us 3.31 ms ± 387 us -11.6% 1.13× faster sig p=0.0223

Summary: sig=2, below=0 (stat-sig but < threshold), ns=24, missing=0, new_only=0

OpenCode Agent and others added 2 commits February 18, 2026 02:20
…lecules

For the common case of single-fragment molecules, avoid the full
getMolFrags machinery (which copies and splits) and instead make a
single RWMol copy. Also extract fragment processing into a shared
helper and use ostringstream for SMILES construction.
== /workdir/tmp/bench_results/master.txt → /workdir/tmp/bench_results/opt-smiles-write.txt ==
Benchmark                                           Old (mean±sd)          New (mean±sd)      Δ%         Speed  Sig(p)
-------------------------------------------------------------------------------------------------------------------------------
bench_common::nth_random                    0.548 ns ± 0.00752 ns  0.548 ns ± 0.00368 ns   -0.1%  1.00× faster  ns p=0.944
Chirality::findPotentialStereo                   587 us ± 34.9 us       597 us ± 36.7 us   +1.6%  1.02× slower  ns p=0.741
CIPLabeler::assignCIPLabels                       266 ms ± 816 us        264 ms ± 929 us   -0.7%  1.01× faster  below p=0.00551
Descriptors::calcNumBridgeheadAtoms             1.03 us ± 25.9 ns      1.05 us ± 3.36 ns   +1.9%  1.02× slower  ns p=0.188
Descriptors::calcNumSpiroAtoms                  1.22 us ± 21.1 ns      1.25 us ± 8.36 ns   +2.5%  1.03× slower  sig p=0.0179
InchiToInchiKey                                 45.7 us ± 6.16 us      35.6 us ± 4.58 us  -22.0%  1.28× faster  sig p=0.0235
InchiToMol                                        4.24 ms ± 96 us      4.18 ms ± 42.4 us   -1.3%  1.01× faster  ns p=0.359
memory pressure test                              10 us ± 1.35 us       11.1 us ± 412 ns  +11.1%  1.11× slower  ns p=0.174
MolOps::addHs                                    419 us ± 1.16 us       419 us ± 7.86 us   -0.1%  1.00× faster  ns p=0.932
MolOps::assignStereochemistry legacy=false         734 us ± 25 us       760 us ± 22.4 us   +3.5%  1.04× slower  ns p=0.184
MolOps::assignStereochemistry legacy=true        474 us ± 8.14 us       486 us ± 16.9 us   +2.5%  1.02× slower  ns p=0.277
MolOps::FindSSR                                  192 us ± 22.1 us        174 us ± 7.4 us   -9.4%  1.10× faster  ns p=0.179
MolOps::getMolFrags                             1.07 ms ± 12.8 us        1.09 ms ± 43 us   +2.1%  1.02× slower  ns p=0.381
MolPickler::molFromPickle                        173 us ± 8.94 us       168 us ± 11.8 us   -2.6%  1.03× faster  ns p=0.602
MolPickler::pickleMol                            108 us ± 6.56 us        107 us ± 7.3 us   -1.1%  1.01× faster  ns p=0.833
MolToCXSmiles                                   1.28 ms ± 35.4 us      1.23 ms ± 38.6 us   -4.0%  1.04× faster  ns p=0.0905
MolToInchi                                      2.06 ms ± 31.3 us      2.05 ms ± 46.8 us   -0.2%  1.00× faster  ns p=0.896
MolToSmiles                                      963 us ± 36.9 us       964 us ± 48.5 us   +0.1%  1.00× slower  ns p=0.979
MorganFingerprints::getFingerprint                185 us ± 628 ns       198 us ± 10.4 us   +6.6%  1.07× slower  sig p=0.0423
ROMol copy constructor                           43.6 us ± 592 ns       42.5 us ± 841 ns   -2.5%  1.03× faster  ns p=0.0717
ROMol destructor                                17.3 us ± 92.1 ns       17.5 us ± 264 ns   +1.1%  1.01× slower  ns p=0.241
ROMol::getNumHeavyAtoms                          107 ns ± 4.37 ns       106 ns ± 1.95 ns   -1.6%  1.02× faster  ns p=0.528
ROMol::GetSubstructMatch                         32.5 us ± 417 ns       32.5 us ± 419 ns   +0.2%  1.00× slower  ns p=0.863
ROMol::GetSubstructMatch patty                  1.14 ms ± 39.2 us      1.13 ms ± 21.5 us   -0.8%  1.01× faster  ns p=0.739
ROMol::GetSubstructMatch RLewis                    7 ms ± 12.5 us      7.04 ms ± 71.4 us   +0.6%  1.01× slower  ns p=0.333
SmilesToMol                                     1.25 ms ± 43.3 us       1.2 ms ± 23.8 us   -3.9%  1.04× faster  ns p=0.0893
Summary: sig=3, below=1 (stat-sig but < threshold), ns=22, missing=0, new_only=0
== /home/ubuntu/rdkits/tmp/bench_results/master.txt → /home/ubuntu/rdkits/tmp/bench_results/opt-smiles-write.txt ==
Benchmark                                        Old (mean±sd)      New (mean±sd)      Δ%         Speed  Sig(p)
---------------------------------------------------------------------------------------------------------------------
bench_common::nth_random                    1.67 ns ± 0.566 ns  1.6 ns ± 0.492 ns   -3.8%  1.04× faster  ns p=0.851
Chirality::findPotentialStereo                1.76 ms ± 315 us   1.71 ms ± 307 us   -2.9%  1.03× faster  ns p=0.793
CIPLabeler::assignCIPLabels                   585 ms ± 9.24 ms   587 ms ± 7.72 ms   +0.2%  1.00× slower  ns p=0.792
Descriptors::calcNumBridgeheadAtoms            4.7 us ± 1.6 us  6.73 us ± 1.62 us  +43.3%  1.43× slower  sig p=0.0454
Descriptors::calcNumSpiroAtoms                6.3 us ± 2.35 us   5.49 us ± 1.9 us  -12.9%  1.15× faster  ns p=0.545
InchiToInchiKey                              64.8 us ± 21.6 us  57.6 us ± 16.5 us  -11.0%  1.12× faster  ns p=0.556
InchiToMol                                    11.7 ms ± 493 us   11.7 ms ± 515 us   -0.3%  1.00× faster  ns p=0.915
memory pressure test                          23.2 us ± 7.1 us   25.2 us ± 7.6 us   +8.5%  1.08× slower  ns p=0.672
MolOps::addHs                                  923 us ± 111 us   992 us ± 25.4 us   +7.5%  1.07× slower  ns p=0.176
MolOps::assignStereochemistry legacy=false     2.1 ms ± 266 us   2.34 ms ± 325 us  +11.4%  1.11× slower  ns p=0.202
MolOps::assignStereochemistry legacy=true     1.14 ms ± 153 us   1.18 ms ± 206 us   +3.4%  1.03× slower  ns p=0.732
MolOps::FindSSR                               418 us ± 68.5 us   386 us ± 59.3 us   -7.6%  1.08× faster  ns p=0.43
MolOps::getMolFrags                           2.64 ms ± 377 us   2.53 ms ± 301 us   -3.9%  1.04× faster  ns p=0.635
MolPickler::molFromPickle                     329 us ± 67.4 us   346 us ± 62.7 us   +5.0%  1.05× slower  ns p=0.689
MolPickler::pickleMol                         280 us ± 55.3 us   309 us ± 68.3 us  +10.6%  1.11× slower  ns p=0.453
MolToCXSmiles                                 2.79 ms ± 247 us   2.59 ms ± 234 us   -7.3%  1.08× faster  ns p=0.18
MolToInchi                                    6.27 ms ± 599 us    6.5 ms ± 506 us   +3.7%  1.04× slower  ns p=0.512
MolToSmiles                                   2.16 ms ± 230 us   2.16 ms ± 229 us   +0.3%  1.00× slower  ns p=0.968
MorganFingerprints::getFingerprint             778 us ± 122 us    744 us ± 120 us   -4.4%  1.05× faster  ns p=0.657
ROMol copy constructor                        220 us ± 19.7 us   202 us ± 35.3 us   -8.1%  1.09× faster  ns p=0.321
ROMol destructor                             79.3 us ± 19.7 us  74.1 us ± 17.8 us   -6.5%  1.07× faster  ns p=0.662
ROMol::getNumHeavyAtoms                       374 ns ± 69.4 ns     394 ns ± 72 ns   +5.3%  1.05× slower  ns p=0.658
ROMol::GetSubstructMatch                        138 us ± 33 us   130 us ± 34.3 us   -5.8%  1.06× faster  ns p=0.709
ROMol::GetSubstructMatch patty                3.67 ms ± 576 us   3.59 ms ± 430 us   -2.2%  1.02× faster  ns p=0.8
ROMol::GetSubstructMatch RLewis              24.6 ms ± 1.25 ms   23.5 ms ± 584 us   -4.5%  1.05× faster  ns p=0.069
SmilesToMol                                   3.75 ms ± 176 us   3.31 ms ± 387 us  -11.6%  1.13× faster  sig p=0.0223
Summary: sig=2, below=0 (stat-sig but < threshold), ns=24, missing=0, new_only=0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant