Skip to content

refactor(db): use Spanner INSERT OR UPDATE DML#2314

Open
chenba wants to merge 4 commits into
masterfrom
refactor/use-spanner-upsert-stor-218
Open

refactor(db): use Spanner INSERT OR UPDATE DML#2314
chenba wants to merge 4 commits into
masterfrom
refactor/use-spanner-upsert-stor-218

Conversation

@chenba
Copy link
Copy Markdown
Collaborator

@chenba chenba commented May 19, 2026

This commit:

  • replaces all inserts-and/or-updates with Spanner's upsert DML where appropriate
  • drops the database_spanner_use_mutations settings as mutations are no longer used

Issue(s)

Closes STOR-218

@chenba chenba force-pushed the refactor/use-spanner-upsert-stor-218 branch from c238629 to efa5104 Compare May 19, 2026 03:52
@chenba chenba marked this pull request as ready for review May 19, 2026 12:23
@chenba chenba requested review from pjenvey and taddes May 19, 2026 12:23
@chenba chenba force-pushed the refactor/use-spanner-upsert-stor-218 branch from efa5104 to 2dbb7dd Compare May 20, 2026 12:57

The entire move (batch commit) happens in one single transaction. Spanner
limits the number of writes (or "mutations") within a transaction to 20000
limits the number of writes (or "mutations") within a transaction to 80000
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chenba chenba force-pushed the refactor/use-spanner-upsert-stor-218 branch from 2dbb7dd to 38caa71 Compare May 21, 2026 18:47
This commit:
 - replaces all inserts-and/or-updates with Spanner's upsert DML where
   appropriate
 - drops the database_spanner_use_mutations settings as mutations are no
   longer used
@chenba chenba force-pushed the refactor/use-spanner-upsert-stor-218 branch from 38caa71 to abc80e1 Compare May 21, 2026 19:15
Copy link
Copy Markdown
Collaborator

@taddes taddes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! Just a few minor nits.

by avoiding writing to the batch table, so not included here)

- Write 1664 (MAX_TOTAL_RECORDS) to `bsos`: max 19968 (1664 * 12) mutations
- Write 6665 (MAX_TOTAL_RECORDS) to `bsos`: max 79980 (6665 * 12) mutations
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll actually be 6,656 since we're preserving the same legacy pattern, basically 1664 * 4, giving us a tiny bit more headroom, as in the webservices-infra PR. No biggie tho

Suggested change
- Write 6665 (MAX_TOTAL_RECORDS) to `bsos`: max 79980 (6665 * 12) mutations
- Write 6656 (MAX_TOTAL_RECORDS) to `bsos`: max 79872 (6656 * 12) mutations

Totals:
- quota: False
4 + 19968 + 1 = 19973
4 + 79980 + 1 = 79985
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
4 + 79980 + 1 = 79985
4 + 79872 + 1 = 79877


- quota: True
6 + 19968 + 1 + 6 = 19981
6 + 79980 + 1 + 6 = 79993
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
6 + 79980 + 1 + 6 = 79993
6 + 79872 + 1 + 6 = 79885

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking maybe we should delete this file and rely on the doc you are adding in #2327. No need to duplicate.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that works fine I think

Comment thread syncstorage-spanner/src/db/mod.rs Outdated
pjenvey
pjenvey previously approved these changes May 22, 2026
Copy link
Copy Markdown
Member

@pjenvey pjenvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad we revisited this, this is great --

It's obvious the benefit of having INSERT OR UPDATE available as DML vs solely in mutations -- the latter being so much more limiting not being intertwined with sql.

The only downside I can see to this change is we're now always reading/writing in the existing data vs not on selective updates (e.g. payload of None/NULL passed in). Is the Spanner query optimizer smart enough to e.g. not penalize an update that's directed to write the existing payload back into the row? (It'd be interesting to see the query execution plan here but I doubt it would answer that).

If not how much are we paying for that? At the very least it's costing us extra mutations, but that's totally worth the cost (we've been limiting ourselves to 20k and that's getting bumped soon anyway).

Approved but I added a couple nits

Comment thread syncstorage-spanner/src/db/mod.rs Outdated
bso: params::PostCollectionBso,
timestamp: SyncTimestamp,
) -> DbResult<()> {
use syncstorage_db_common::util::to_rfc3339;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can hoist this import out of here to the module level while you're here -- it was probably more convenient here in the past when this was a test only method

Comment thread syncstorage-spanner/src/db/mod.rs Outdated
.unwrap_or_else(null_value);

let mut row = ListValue::new();
row.set_values(RepeatedField::from_vec(vec![
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: RepeatedField has a From<Vec<T>> so you don't even need to use it here, you can just do into() on the vecs (it doesn't save much but protobuf is so verbose..)

@pjenvey
Copy link
Copy Markdown
Member

pjenvey commented May 22, 2026

FYI this also closes STOR-565

@chenba
Copy link
Copy Markdown
Collaborator Author

chenba commented May 23, 2026

The only downside I can see to this change is we're now always reading/writing in the existing data vs not on selective updates (e.g. payload of None/NULL passed in). Is the Spanner query optimizer smart enough to e.g. not penalize an update that's directed to write the existing payload back into the row?

Good point/question. I'm skeptical that'd be the case.

Let's keep an eye on the cost once this is deployed. 🤞

Co-authored-by: Taddes <tkorris@mozilla.com>
@chenba chenba requested review from pjenvey and taddes May 26, 2026 02:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants