json_web_signature_test.json: tcIds 357/367/370 are byte-identical with contradictory result labels #236

Description

@KEIJOT

Hi maintainers,

While scanning testvectors_v1/ for test vectors with identical inputs
but divergent result labels, I found one genuine contradiction in
json_web_signature_test.json that looks like a data-quality bug rather
than an intentional dual-oracle case. Filing for awareness; impact is
low.
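
The scan described above can be sketched roughly as follows. This is a minimal illustration, not the exact script used for this report: the function name find_contradictions is hypothetical, and it assumes each test group carries a tests list whose entries have tcId, jws, and result fields. Inputs are compared only within a group, since the same jws under a different key may legitimately have a different result.

```python
from collections import defaultdict


def find_contradictions(data):
    """Return (group_index, jws, entries) for inputs with divergent labels.

    `data` is the parsed JSON document. Tests are compared only within a
    group, because the key can differ between groups.
    """
    contradictions = []
    for group_index, group in enumerate(data["testGroups"]):
        by_input = defaultdict(list)
        for test in group["tests"]:
            by_input[test["jws"]].append((test["tcId"], test["result"]))
        for jws, entries in by_input.items():
            # More than one distinct label for the same bytes is flagged.
            if len({result for _, result in entries}) > 1:
                contradictions.append((group_index, jws, entries))
    return contradictions
```

A stricter variant might treat a valid/acceptable pair as compatible and flag only valid/invalid pairs; this sketch flags any divergence.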

The three tests

All three live in the same testGroups[21] (type=JsonWebSignature,
HS256 key = 32 zero bytes, group comment "base64"):

tcId  result   comment
357   valid    ValidMac
367   invalid  invalidBase64Padding
370   invalid  invalidBase64PaddingInPayload

All three carry the byte-identical jws field:

eyJraWQiOiJoczI1Ni1rZXkiLCJhbGciOiJIUzI1NiJ9.VGVzdA.c1LROH7eNQwUT8KMVEO52VC3WZ9e_AnDWbZ7aMmowV8

SHA-256 of this string: 2f3785868159e2bd3efd39767356c8bf0a93e4aa72cb212663fa18f8e2c84822.
Length: 95 bytes. Segment lengths: [44, 6, 43], no = in any segment.
The HMAC-SHA256 of "<hdr>.<payload>" with the group's private.k
matches the signature segment exactly — it is the canonical
RFC 7515 §2 Compact Serialization, exactly as tcId 357 describes.
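
The MAC check described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the helper names b64url_decode and hs256_matches are hypothetical, and the key is passed in as raw bytes rather than read from the group's JWK.

```python
import base64
import hashlib
import hmac


def b64url_decode(segment):
    """Decode unpadded base64url, restoring the padding the decoder expects."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def hs256_matches(jws, key):
    """Return True if the third segment equals HMAC-SHA256(key, "<hdr>.<payload>")."""
    header, payload, signature = jws.split(".")
    signing_input = f"{header}.{payload}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    return hmac.compare_digest(expected, b64url_decode(signature))
```

For the vector above, the call would be hs256_matches(jws, bytes(32)), assuming the group key is 32 zero bytes as stated.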

Live permalink (main branch, re-verified 2026-04-21):
testvectors_v1/json_web_signature_test.json — search group 21,
tcIds 357, 367, 370.

What the comments on 367/370 imply and why the bytes contradict them

Every other variant listed for group 21 mutates its target segment
(inserting spaces, invalid characters, etc.) and shows correspondingly
different segment lengths in the emitted jws. The two *Base64Padding*
variants are the only exceptions: they share the baseline [44, 6, 43]
structure rather than showing a mutation-inserted = somewhere in the
payload or signature segment.

Full length-vs-mutation table for group 21:

tcId  comment                               segment lengths  mutation visible in bytes?
357   ValidMac (baseline)                   [44, 6, 43]      n/a
360   rejectsSpacesInMac                    [44, 6, 47]      yes (sig +4)
361   rejectsInvalidCharacterInsertedInMac  [44, 6, 44]      yes (sig +1)
365   spacesInHeader                        [48, 6, 43]      yes (hdr +4)
366   invalidCharactersInHeader             [48, 6, 43]      yes (hdr +4)
367   invalidBase64Padding                  [44, 6, 43]      NO
368   spacesInPayload                       [44, 10, 43]     yes (pay +4)
369   invalidCharactersInPayload            [44, 10, 43]     yes (pay +4)
370   invalidBase64PaddingInPayload         [44, 6, 43]      NO
374   ModifiedUnusedBitsInPayload           [44, 2, 43]      yes

The baseline (tcId 357) can't have any = padding to "make invalid"
because RFC 7515 §2 forbids it in the first place — the canonical
form has no = in any segment. So the intended mutation was likely
"insert a spurious = in the payload / signature segment" rather
than "modify an existing =". For tcIds 367 and 370 the serialized
jws field does not carry that mutation.
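
The segment lengths and the absence of = can be recomputed directly from the jws string. A minimal sketch, with hypothetical helper names segment_lengths and has_padding:

```python
# The jws string shared by tcIds 357, 367, and 370, copied from this report.
BASELINE = (
    "eyJraWQiOiJoczI1Ni1rZXkiLCJhbGciOiJIUzI1NiJ9"
    ".VGVzdA"
    ".c1LROH7eNQwUT8KMVEO52VC3WZ9e_AnDWbZ7aMmowV8"
)


def segment_lengths(jws):
    """Return the length of each dot-separated segment."""
    return [len(segment) for segment in jws.split(".")]


def has_padding(jws):
    """Return True if any segment contains a '=' character."""
    return "=" in jws


print(segment_lengths(BASELINE))  # [44, 6, 43]
print(has_padding(BASELINE))      # False
```

Running the same two helpers over every jws in the group would reproduce the segment-length column of the table above.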

Plausible generator-side explanation

The strongest reading is a dropped-mutation bug in the vector
generator. Candidate shapes:

  1. A string.replace('=', ...) applied to the already-stripped
    base64url form — no-op on the baseline.
  2. A mutation step that targets a pre-encoding representation but a
    later canonicalization pass re-strips the padding before the jws
    field is serialized.
  3. A template that was reused for "invalid padding" after first
    being used for "invalid characters" but where the character-insert
    step wasn't re-parameterized for padding-insert.
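
Candidate shape 1 is easy to demonstrate in isolation. The snippet below is purely illustrative: the generator source is not public, and mutate_padding is a hypothetical stand-in for whatever mutation step it may contain.

```python
import base64

# "Test" is the decoded payload of the baseline vector.
padded = base64.urlsafe_b64encode(b"Test").decode("ascii")  # "VGVzdA=="
stripped = padded.rstrip("=")                               # "VGVzdA"


def mutate_padding(segment):
    """Hypothetical mutation: double every '=' to make the padding invalid."""
    return segment.replace("=", "==")


# On the padded form the mutation changes the bytes ...
print(mutate_padding(padded) != padded)      # True
# ... but on the already-stripped form it is a silent no-op.
print(mutate_padding(stripped) == stripped)  # True
```

If the generator worked this way, the emitted jws for the padding variants would be identical to the baseline, which matches what tcIds 367 and 370 show.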

I was unable to inspect the generator source because the public
C2SP/wycheproof tree is data-only today. The 2019-era legacy tree
(tag wycheproof-v0-vectors) contained Java JUnit consumer
harnesses under java/com/google/security/wycheproof/testcases/ but
no test-vector generator. The source.name = "google-wycheproof"
stamp on group 21 and the version = "0.3" in the group header
suggest the vectors were generated by Google-internal tooling that
was never open-sourced. PR #160 (merged 2025-09-01)
moved these JSON files from testvectors/ to testvectors_v1/
byte-for-byte; the bug was preserved rather than introduced.

Impact — low-priority; scope-honest note

In-scope impact (what is definitely true):

  • Any library that consumes json_web_signature_test.json and treats
    all three tcIds as conformance requirements will see a contradiction
    (identical input, contradictory oracles) and cannot be simultaneously
    conformant to all three.
  • The intended attack class — "library silently tolerates a spurious
    = padding character inside a base64url-encoded JWS segment" — is
    not exercised anywhere else in json_web_signature_test.json that I
    found, so the coverage gap is real for any future integrator who
    assumed this suite exercised it.

Out-of-scope (what I specifically am NOT claiming):

  • I am not claiming that any downstream JWT library is affected. I
    checked the main-branch source of jpadilla/pyjwt, mpdavis/python-jose,
    auth0/node-jsonwebtoken, panva/jose, golang-jwt/jwt, and
    jwt/ruby-jwt on 2026-04-20 and found zero references to
    "wycheproof" in any of them, so none of these six appears to consume
    these vectors today. The vector data bug did not cause
    any known library vulnerability; library-level test-coverage gaps
    exist independently of this JSON issue. This filing is for future
    integrators and for corpus integrity, not as a root-cause claim on
    any downstream project.

  • Wycheproof remains integrated and valuable at the primitive crypto
    layer (openssl, pyca/cryptography, liboqs, golang/go). Those suites
    are unaffected by this specific JWS data issue.

Proposed fix

Re-apply the intended =-padding-insertion mutation and regenerate
the jws field for tcIds 367 and 370 so the bytes reflect the
comment. One change per test case:

  • tcId 367 (invalidBase64Padding): insert = into the signature
    segment mid-string, e.g. at position N, so the serialized
    jws differs from the baseline and the result: invalid label is
    justified.
  • tcId 370 (invalidBase64PaddingInPayload): same mutation in the
    payload segment.

Together, the two changes resolve the contradiction with tcId 357 and
restore coverage of the intended attack class.
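
A sketch of what the mutation and a strict check could look like. The names insert_padding and is_strict_compact and the choice of position are hypothetical, and the real generator may well produce different bytes.

```python
import re

# Unpadded base64url alphabet; '=' is deliberately excluded.
BASE64URL = re.compile(r"[A-Za-z0-9_-]*")


def insert_padding(jws, segment_index, position):
    """Insert a single '=' into one segment of a compact JWS."""
    segments = jws.split(".")
    segment = segments[segment_index]
    segments[segment_index] = segment[:position] + "=" + segment[position:]
    return ".".join(segments)


def is_strict_compact(jws):
    """Return True if jws has three segments, each using only unpadded base64url."""
    segments = jws.split(".")
    return len(segments) == 3 and all(BASE64URL.fullmatch(s) for s in segments)
```

With segment_index=2 this yields a tcId 367 style vector, and with segment_index=1 a tcId 370 style vector. In both cases the mutated string differs from the baseline and fails is_strict_compact. For the payload case, whether the MAC should be recomputed over the mutated signing input is a generator decision this sketch does not make.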

Reproduction

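import hashlib
import json

# Load the vector file and pick the group that holds the three test cases.
with open("testvectors_v1/json_web_signature_test.json") as f:
    data = json.load(f)
g = data["testGroups"][21]

# Map each distinct jws string to the test cases that carry it.
jws_seen = {}
for t in g["tests"]:
    if t["tcId"] in (357, 367, 370):
        jws_seen.setdefault(t["jws"], []).append((t["tcId"], t["result"], t["comment"]))

# A single jws key with three entries means the inputs are byte-identical.
for jws, entries in jws_seen.items():
    print("SHA256:", hashlib.sha256(jws.encode()).hexdigest())
    print("Length:", len(jws))
    for e in entries:
        print(" ", e)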

Expected output: one SHA-256 / three entries with result values of
valid, invalid, invalid.

What I tested and what I did not

Tested: the testvectors_v1/json_web_signature_test.json file at
C2SP/wycheproof main (re-verified 2026-04-21), plus the same file
in the legacy testvectors/ tree at tag wycheproof-v0-vectors.
Did not test: the generator source (not public), Wycheproof's
primitive suites, other JOSE test files in the corpus.

Thanks for maintaining the test-vector project.
