Skip to content

[Android] Add missing JNI exception checks in crypto PAL#128777

Draft
simonrozsival wants to merge 7 commits into
mainfrom
dev/simonrozsival/android-pal-jni-exception-checks-pr1
Draft

[Android] Add missing JNI exception checks in crypto PAL#128777
simonrozsival wants to merge 7 commits into
mainfrom
dev/simonrozsival/android-pal-jni-exception-checks-pr1

Conversation

@simonrozsival
Copy link
Copy Markdown
Member

Note

This PR description was drafted by an AI assistant (GitHub Copilot CLI) based on the actual code changes in this branch. All code in this PR was authored interactively with Simon driving design decisions through multiple review rounds.

Summary

This PR adds missing JNI exception checks across several files in the Android cryptography PAL. The work is based on a static audit of all Call*Method/NewObject/GetArrayLength-style JNI calls and targets sites where a pending Java exception could silently leak back into managed code.

Beyond adding the exception checks, the changed functions are tightened to follow three patterns already established elsewhere in the PAL:

  1. ON_EXCEPTION_PRINT_AND_GOTO(label) immediately after each fallible JNI call. The macro both prints and clears the exception, so subsequent JNI calls along the cleanup/error path are safe.
  2. INIT_LOCALS(loc, …) + single cleanup:/RELEASE_LOCALS(loc, env) tail for local-ref hygiene, replacing ad-hoc jobject foo = NULL; declarations and scattered DeleteLocalRef calls.
  3. No partial output state on failure — results are computed into local variables and only promoted to caller-visible out parameters / global refs on the success path, so the caller never observes a half-populated result.

Files changed (10)

All under src/native/libs/System.Security.Cryptography.Native.Android/:

File Lines Functions
pal_cipher.c +24 / −17 ReinitializeCipher
pal_dsa.c +56 / −31 DsaSizeP, GetDsaParameters, GetQParameter, DsaKeyCreateByExplicitParameters
pal_ecc_import_export.c +98 / −68 GetECKeyParameters, GetECCurveParameters, CreateKeyPairFromCurveParameters, ConvertBigIntegerToPositiveInt32, EcKeyCreateByExplicitParameters
pal_ecdh.c +25 / −37 EcdhDeriveKey
pal_eckey.c +17 / −6 EcKeyGetSize
pal_hmac.c +33 / −29 HmacCreate
pal_ssl.c +4 / −0 SSLGetSupportedProtocols
pal_sslstream.c +25 / −7 Close, DoUnwrap, FreeSSLStream, SSLStreamInitialize, AddCertChainToStore, SSLStreamWrite
pal_x509.c +2 / −0 X509DecodeCollection, X509ExportPkcs7
pal_x509store.c +88 / −17 ContainsEntryForAlias, ContainsMatchingCertificateForAlias, X509StoreAddCertificate, X509StoreAddCertificateWithPrivateKey, X509StoreContainsCertificate, X509StoreRemoveCertificate, EnumerateCertificates, SystemAliasFilter

Behavior changes worth flagging

  • AndroidCryptoNative_GetDsaParameters / AndroidCryptoNative_GetECKeyParameters now zero out their out length parameters (*pLength, *qLength, *cbQx, etc.) at function entry. Previously these were set on success or left untouched on failure. No managed caller relies on the prior partial-write behavior.
  • ContainsEntryForAlias / ContainsMatchingCertificateForAlias (pal_x509store.c) — return type changed from bool to int32_t (SUCCESS/FAIL) with a separate *contains/*matches out-parameter, so callers can distinguish "lookup failed due to JNI exception" from "lookup succeeded; not contained". Callers are updated accordingly.
  • FreeSSLStream now NULL-checks sslStream->managedContextCleanup before invoking it. This guards the case where SSLStreamCreate succeeded but SSLStreamInitialize was never called (or failed before assigning the callback fields).
  • SSLStreamInitialize assigns managedContextHandle / streamReader / streamWriter / managedContextCleanup only on the success path. Combined with the FreeSSLStream NULL-check above, this means partial-init failure no longer dispatches the cleanup callback for a handle that was never registered.

Out of scope

A few pre-existing bugs were surfaced during review but intentionally left out of scope to keep this PR focused on the exception-check work:

  • pal_x509store.c:115,141,186EntryFlags_HasCertificate & EntryFlags_MatchesCertificate should be | (dead code since 2021).
  • pal_ecc_import_export.c:563AddGRef should be ToGRef (intermediate lref leak).
  • pal_ecc_import_export.c:566-570CheckJNIExceptions only runs when keyPair is NULL.
  • pal_x509.c X509DecodeCollection — per-iteration gref leak on failure.

These will be addressed as separate targeted fixes.

Related

Other open PRs in the same area (Android crypto PAL / JNI error handling):

Testing

Suite Result
System.Security.Cryptography.Tests (Android arm64 Debug emulator) 12373 passed / 0 failed
System.Net.Security.Tests (Android arm64 Debug emulator) 5124 passed / 4 failed (matches pre-PR baseline — unrelated DelayedCertificate/ALPN tests)
Native build (Android arm64 Debug) 0 warnings / 0 errors at HEAD

Functional tests were validated at commit b86598b0381. The final commit (395fdb3d023) is a 3-line macro substitution (RELEASE_LOCALS_ENV(loc, ReleaseLRef)RELEASE_LOCALS(loc, env)) and was rebuilt cleanly; the two macro forms are functionally equivalent. CI will provide cross-architecture validation (arm, x86, x64) and additional API levels not covered locally.

simonrozsival and others added 6 commits May 29, 2026 13:09
This is PR 1 of 3 addressing findings from a static audit of the Android
crypto PAL (libSystem.Security.Cryptography.Native.Android). It targets
the 26 HIGH-severity findings on files that are not touched by any other
in-flight PR. Two follow-up PRs will address the remaining HIGH findings
on pal_x509chain.c and pal_x509store.c (after merging dependencies)
and the MEDIUM/LOW findings.

Files affected (10):
  pal_cipher.c               (1 HIGH)
  pal_dsa.c                  (4 HIGH)
  pal_ecc_import_export.c    (8 HIGH)
  pal_ecdh.c                 (1 HIGH)
  pal_eckey.c                (1 HIGH)
  pal_hmac.c                 (1 HIGH + a gref leak on failure path)
  pal_ssl.c                  (3 HIGH)
  pal_sslstream.c            (5 HIGH + 2 tightly coupled bug fixes)
  pal_x509.c                 (2 HIGH)
  pal_x509store.c            (4 HIGH + 1 partial)

Patterns applied
----------------
- Pattern B (chained Call*Method without intermediate checks): inserted
  ON_EXCEPTION_PRINT_AND_GOTO between successive throwing JNI calls so
  later calls never run with a pending exception or on a NULL receiver
  (undefined behavior today, varying by CheckJNI vs. release).

- Pattern C (pending exception leaks back to .NET): added explicit
  CheckJNIExceptions or TryClearJNIExceptions at the relevant return
  paths so callers see a clean FAIL rather than a poisoned env.

- Internal helpers ContainsEntryForAlias and
  ContainsMatchingCertificateForAlias in pal_x509store.c were refactored
  to return int32_t with an out-param so JNI exceptions can no longer be
  swallowed by collapsing them into a bool result.

Tightly coupled bug fixes in pal_sslstream.c
--------------------------------------------
- FreeSSLStream: NULL-check managedContextCleanup before calling it.
  This crash is reachable today on main: when InitializeSslContext
  throws (e.g. EncryptionPolicy.NoEncryption), Dispose runs on a
  zero-initialized SSLStream and dereferences a NULL function pointer.
  Verified by running System.Net.Security.Tests against pristine main:
  same SIGSEGV at AndroidCryptoNative_SSLStreamRelease+48.

- SSLStreamInitialize: hoist managedContext / streamReader /
  streamWriter / managedContextCleanup assignments above the first
  goto so a failure during init leaves the struct in a consistent
  state for FreeSSLStream.

Out of scope
------------
- Explicit abort_unless / abort_if_invalid_pointer_argument on JNI
  pointers in pal_evp.c, pal_ecdsa.c, and pal_dsa.c:192 are left
  as-is per the directive to preserve existing predictable abort
  behavior.

Validation
----------
- ./build.sh libs.native -os android -arch arm64 -c Debug : OK
- ./build.sh clr+libs+host -rc release -lc release         : OK (baseline)
- System.Security.Cryptography.Tests on emulator-5554 :
    Tests run: 11737 Passed: 10738 Failed: 0 Ignored: 658 Skipped: 977
- System.Net.Security.Tests on R58Y30HZ65V (API 36) :
    Tests run: 5124 Passed: 4854 Failed: 4 Ignored: 234 Skipped: 32
  The four failures are pre-existing on main (verified by re-running
  the same tests against a pristine main native lib): two
  CertificateSelectionCallback_DelayedCertificate_OK cases that report
  SkipTestException as FAIL, and two AlpnListTotalSizeExceedsLimit_Throws
  cases where Android wraps ArgumentException in AuthenticationException.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In response to review feedback on PR #1, refactor functions touched by
the previous commit so that they:

- Use ON_EXCEPTION_PRINT_AND_GOTO(cleanup) with a single cleanup label
  instead of bare `if (CheckJNIExceptions(env))` blocks where a
  function has resources to release on either the success or error path.
- Replace raw DeleteLocalRef calls with the NULL-safe ReleaseLRef
  wrapper. This makes it safe to fall through to cleanup with partially
  initialized locals.
- Unify duplicated success/error cleanup blocks into a single cleanup
  section driven by a `ret` (or `success`) sentinel.
- Place the exception check immediately after the JNI call that may
  have thrown, not after subsequent unrelated work.

While refactoring, two pre-existing local-ref / global-ref leaks were
incidentally fixed:

- pal_ecc_import_export.c GetECKeyParameters: the original `error`
  block zeroed the output pointers but did not ReleaseGRef the global
  refs allocated via ToGRef. The new cleanup releases them on failure.
- pal_ecc_import_export.c EcKeyCreateByExplicitParameters: `ec` and
  `keyPairGenerator` were only released on error, leaking on success.
  They are now in the INIT_LOCALS array so they are released in both
  cases.

Validated on the same Android arm64 emulator + physical device used for
the previous commit; System.Security.Cryptography.Tests reports 10738
passed / 0 failed and System.Net.Security.Tests reports 4854 passed /
4 pre-existing failures (DelayedCertificate SkipTestException + ALPN
AuthenticationException wrapping).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In response to the second round of review feedback, refactor the three
functions that write to multiple jobject* out parameters so that they
no longer leave the caller with a partially-populated output on
failure:

* pal_ecc_import_export.c GetECKeyParameters: keep all BigInteger
  results as local refs inside the INIT_LOCALS array and only promote
  them to global refs (via AddGRef) when the entire function succeeds.
  The cleanup path no longer needs to release global refs because none
  were created on the failure path.

* pal_dsa.c GetDsaParameters: extend INIT_LOCALS with pBn, qBn, gBn,
  yBn, xBn; compute byte lengths into local int32_t variables; promote
  to *p/*q/*g/*y/*x only on success.

* pal_cipher.c ReinitializeCipher: move ivBytes to function scope and
  put the exception check immediately after each NewObject call (not
  after ReleaseLRef). Cleanup releases ivBytes alongside the other
  locals so failure paths no longer leak it.

Validated on the same Android arm64 emulator + physical device:
  System.Security.Cryptography.Tests   10738 passed / 0 failed
  System.Net.Security.Tests             4854 passed / 4 pre-existing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Generalize the previous round's principle (only write outputs on the
success path) to the remaining functions modified by this PR:

* pal_eckey.c EcKeyGetSize: use a local `size` variable for the
  CallIntMethod result and only assign `*keySize = size` once we
  reach the success path. The init-time `*keySize = 0` covers all
  failure paths so the cleanup-time re-zero is removed.

* pal_x509store.c ContainsEntryForAlias: only copy `*contains` and
  `*flags` from the locals when `ret == SUCCESS`. On failure the
  caller already ignores them, but matching the documented behaviour
  avoids surprising future readers.

* pal_ecdh.c EcdhDeriveKey: drop the cleanup-time
  `if (ret != SUCCESS) *usedBufferLength = 0;` since the function
  initializes `*usedBufferLength = 0` at entry and only writes the
  final value adjacent to `ret = SUCCESS`.

Validated on the same Android arm64 emulator + physical device:
  System.Security.Cryptography.Tests   10738 passed / 0 failed
  System.Net.Security.Tests             4854 passed / 4 pre-existing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… state in SSLStreamInitialize

Refactor ReinitializeCipher, EcdhDeriveKey and CryptoNative_HmacCreate
to manage their jobject locals via INIT_LOCALS + RELEASE_LOCALS_ENV
instead of ad-hoc stack locals released one-by-one in cleanup. This
matches the convention used elsewhere in the file (e.g.
SSLStreamSetTargetHost, GetQParameter, GetDsaParameters).

Restore AndroidCryptoNative_SSLStreamInitialize to its original
'assign callback/handle fields at the end on the success path'
pattern so that a failed Initialize does not leave the SSLStream in
a partial state. The previous up-front assignment was unnecessary:
FreeSSLStream now NULL-checks managedContextCleanup before invoking
it, which is the correct safeguard for the partial-init case.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ng code

RELEASE_LOCALS(loc, env) is the more common convention across the
crypto PAL (35 uses vs 15 for RELEASE_LOCALS_ENV) and makes the env
dependency visible at the call site. The two macros are functionally
equivalent for releasing local refs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@simonrozsival
Copy link
Copy Markdown
Member Author

/azp run runtime-android

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to 'arch-android': @vitek-karas, @simonrozsival, @steveisok, @akoeplinger
See info in area-owners.md if you want to be subscribed.

@github-actions

This comment has been minimized.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Defensive hardening of the Android System.Security.Cryptography.Native.* PAL to add missing JNI exception checks across ~10 files. Per-function patterns are aligned with existing PAL conventions: ON_EXCEPTION_PRINT_AND_GOTO after each fallible JNI call, INIT_LOCALS + single cleanup:/RELEASE_LOCALS tail, and writing caller-visible out/global-ref state only on the success path. A couple of related correctness fixes ride along (FreeSSLStream NULL-checks managedContextCleanup; SystemAliasFilter handles a NULL return from GetStringUTFChars).

Changes:

  • Add ON_EXCEPTION_PRINT_AND_GOTO(cleanup/error) after JNI calls in cipher/dsa/ec/ecdh/eckey/hmac/ssl/sslstream/x509/x509store, and convert ad-hoc local-ref cleanup to INIT_LOCALS/RELEASE_LOCALS tails.
  • Re-architect ContainsEntryForAlias / ContainsMatchingCertificateForAlias to return SUCCESS/FAIL with a separate out-parameter so callers can distinguish JNI failure from "not contained"; update all callers.
  • SSLStreamInitialize now assigns managedContextHandle/callbacks only on the success path, and FreeSSLStream skips invoking managedContextCleanup when it was never registered.
Show a summary per file
File Description
pal_cipher.c Convert ReinitializeCipher to INIT_LOCALS/cleanup pattern with exception checks.
pal_dsa.c Add exception checks across DSA size/parameter/key paths; only promote to globals on success.
pal_ecc_import_export.c Same hardening for EC import/export; zero out out parameters at entry.
pal_ecdh.c Convert EcdhDeriveKey to INIT_LOCALS + single cleanup.
pal_eckey.c Add exception checks in EcKeyGetSize with proper FAIL propagation.
pal_hmac.c Convert HmacCreate to INIT_LOCALS + success-flag pattern.
pal_ssl.c Add exception checks in SSLGetSupportedProtocols and a cleanup: label.
pal_sslstream.c Add exception checks in Close/DoUnwrap/Write/Init/AddCertChainToStore; NULL-guard managedContextCleanup in FreeSSLStream; defer callback-field assignment to success path.
pal_x509.c Add exception checks after Collection.size and ArrayList construction.
pal_x509store.c Change Contains helpers to SUCCESS/FAIL with out-parameter; update callers; handle NULL GetStringUTFChars in SystemAliasFilter.

Copilot's findings

  • Files reviewed: 10/10 changed files
  • Comments generated: 1

Comment thread src/native/libs/System.Security.Cryptography.Native.Android/pal_sslstream.c Outdated
ON_EXCEPTION_PRINT_AND_GOTO calls CheckJNIExceptions, which clears the
pending exception. After a goto into the cleanup block, the subsequent
CheckJNIExceptions(env) at the end of the block always returned false
(the exception was already cleared), so DoUnwrap silently continued
past a failed NewDirectByteBuffer with netInBuffer in an unchanged
state and position == 0. The unwrap then proceeded against stale
buffer state instead of returning SSLStreamStatus_Error.

Restructure DoUnwrap to follow the convention used elsewhere in this
file (e.g. SSLStreamRead): a single 'cleanup:' label at the end with
'return ret;', and all early exits going through 'goto cleanup' after
setting 'ret'. This also adds the missing exception check after
ByteBuffer.put().

Reported by Copilot review on PR #128777.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

Note

This review was generated by an AI assistant (GitHub Copilot).

🤖 Copilot Code Review — PR #128777

Holistic Assessment

Motivation: Justified and important. Missing JNI exception checks in native Android PAL code can cause CheckJNI aborts or allow pending exceptions to leak back into managed code, producing undefined behavior. The static audit approach is thorough.

Approach: Sound. The PR consistently applies three well-established patterns already used elsewhere in this PAL: ON_EXCEPTION_PRINT_AND_GOTO, INIT_LOCALS/RELEASE_LOCALS for local-ref hygiene, and no-partial-output-on-failure. The refactoring toward a single cleanup path per function is a clear correctness improvement.

Summary: ✅ LGTM. The changes are mechanically consistent, well-scoped, and follow existing conventions. The behavioral changes (return-type widening for ContainsEntryForAlias/ContainsMatchingCertificateForAlias, NULL-check in FreeSSLStream, deferred output promotion) are all sensible hardening. No blocking issues found.


Detailed Findings

✅ JNI Exception Check Coverage

The added ON_EXCEPTION_PRINT_AND_GOTO calls correctly cover every Call*Method, NewObject, CallStaticObjectMethod, GetArrayLength, and CallIntMethod site that was previously unchecked. The placement is immediately after each fallible JNI call, consistent with the macro's contract (it checks and clears the pending exception).

✅ Local Reference Management

Conversion from ad-hoc jobject foo = NULL + scattered DeleteLocalRef to INIT_LOCALS/RELEASE_LOCALS is correct. All locals that previously needed manual cleanup are now tracked in the loc[] array and released uniformly at the cleanup label.

✅ Output Parameter Safety

Functions like GetDsaParameters, GetECKeyParameters now zero out-parameters at entry and only promote local refs to global refs (via AddGRef) on the success path. This ensures callers never observe dangling or half-populated state on failure.

FreeSSLStream NULL-check

The added if (sslStream->managedContextCleanup != NULL) guard correctly handles the case where SSLStreamCreate succeeded but SSLStreamInitialize never ran (or failed before assigning callbacks). The comment explains the scenario clearly.

DoUnwrap Refactoring

The previous review comment about NewDirectByteBuffer failure silently continuing is addressed in the current version: ret defaults to SSLStreamStatus_Error, so any goto cleanup before the success assignment correctly returns an error. The free(tmpNative) and ReleaseLRef(env, tmp) in cleanup are safe because both are initialized to NULL at function entry.

ContainsEntryForAlias / ContainsMatchingCertificateForAlias Return Type Change

The return type change from bool to int32_t (SUCCESS/FAIL) with a separate *contains/*matches out-parameter is a well-motivated API improvement. All callers are updated to check the return value before inspecting the out-parameter.

💡 Pre-existing & vs | Bug (Out of Scope)

EntryFlags_HasCertificate & EntryFlags_MatchesCertificate evaluates to 0 (since the enum values are 2 and 4), making the check (flags & matchesFlags) == matchesFlags always true. This is acknowledged in the PR description as a pre-existing bug to be fixed separately. Confirmed this is not introduced by this PR.

💡 EcdhDeriveKey — Leaked Return Value from KeyAgreementDoPhase

Line 147 in the diff: ReleaseLRef(env, (*env)->CallObjectMethod(env, loc[keyAgreement], g_KeyAgreementDoPhase, ...)) — if CallObjectMethod throws, ON_EXCEPTION_PRINT_AND_GOTO(cleanup) on the next line fires, but the ReleaseLRef call on the exception path receives NULL (since the JNI call failed). ReleaseLRef handles NULL safely (per ARGS_NON_NULL(1) — only env is required non-null), so this is safe but slightly unusual. The return value of DoPhase is intentionally discarded via ReleaseLRef which matches the original code's intent. No action needed.

💡 ConvertBigIntegerToPositiveInt32 — Manual Cleanup

This function uses a manually managed int32MaxBigInt local with ReleaseLRef in cleanup rather than INIT_LOCALS. This is fine given there's only one local to track, and the function is small enough that the overhead of INIT_LOCALS isn't warranted.


Overall: Clean, well-structured defensive hardening. The PR is focused, consistent with existing patterns, and explicitly documents known pre-existing issues that are out of scope. Ready for maintainer review.

Generated by Code Review for issue #128777 · ● 6.1M ·

@simonrozsival
Copy link
Copy Markdown
Member Author

/azp run runtime-android

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants