security bugfix for Phi4VisionProcessor by prathikr · Pull Request #1034 · microsoft/onnxruntime-extensions

prathikr · 2026-03-05T22:21:01Z

A critical heap buffer overflow vulnerability exists in the ONNX Runtime Extensions library, specifically in the Phi4VisionProcessor::Compute method within image_transforms_phi_4.hpp. The issue arises from a parser confusion attack where a CMYK JPEG image bypasses RGB sanitization, leading to an out-of-bounds memory copy. The root cause includes a sanitization bypass in DecodeImage and a buffer miscalculation in Phi4VisionProcessor. Exploitation of this vulnerability can result in Remote Code Execution (RCE) due to heap corruption.

Copilot

Pull request overview

Addresses a reported heap buffer overflow in the Phi-4 vision preprocessing pipeline by preventing non-RGB (notably CMYK JPEG) inputs from propagating into transforms that assume 3-channel data.

Changes:

Add channel-count validation in Phi4VisionDynamicPreprocess to reject non-3-channel tensors.
Add JPEG decoder validation to reject unsupported JPEG color spaces based on output_components.
Add a Python regression test that constructs a CMYK JPEG and asserts the pipeline rejects it.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File	Description
`test/test_pp_api.py`	Adds a security regression test to ensure CMYK JPEG inputs are rejected.
`shared/api/image_transforms_phi_4.hpp`	Rejects non-3-channel tensors at the Phi-4 dynamic preprocess entry point.
`operators/vision/image_decoder.hpp`	Rejects JPEGs that decode to an unexpected channel count (e.g., CMYK=4).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-06T23:07:18Z

operators/vision/image_decoder.hpp

      // Start decompression
      jpeg_start_decompress(&cinfo);

+      // Security: reject non-RGB/grayscale color spaces (e.g. CMYK has 4 channels).
+      // Allowing other channel counts propagates attacker-controlled tensor shapes downstream,
+      // causing heap overflows in consumers that assume 3 channels (CVE-class: CWE-122).
+      if (cinfo.output_components != 3 && cinfo.output_components != 1) {
+        jpeg_destroy_decompress(&cinfo);
+        return {kOrtxErrorInvalidArgument,
+                "[ImageDecoder]: Unsupported JPEG color space. Only RGB and grayscale are supported."};


The JPEG path allows output_components == 1 (grayscale), but DecodePNG always returns 3 channels and downstream code (e.g. DecodeImage BGR swap logic and many vision transforms) assumes 3-channel data. Returning a 1-channel tensor here can lead to out-of-bounds access/corruption when later code indexes as if channels==3. Consider forcing JPEG decode output to RGB (3 channels) or rejecting grayscale JPEGs to keep the output shape consistent and safe.

Suggested change

// Start decompression

jpeg_start_decompress(&cinfo);

// Security: reject non-RGB/grayscale color spaces (e.g. CMYK has 4 channels).

// Allowing other channel counts propagates attacker-controlled tensor shapes downstream,

// causing heap overflows in consumers that assume 3 channels (CVE-class: CWE-122).

if (cinfo.output_components != 3 && cinfo.output_components != 1) {

jpeg_destroy_decompress(&cinfo);

return {kOrtxErrorInvalidArgument,

"[ImageDecoder]: Unsupported JPEG color space. Only RGB and grayscale are supported."};

// Force libjpeg to output 3-channel RGB regardless of the original color space

// (e.g., grayscale and YCbCr will be expanded/converted to RGB). This keeps the

// output tensor shape consistent with PNG decoding and downstream 3-channel

// assumptions (H x W x 3).

cinfo.out_color_space = JCS_RGB;

// Start decompression

jpeg_start_decompress(&cinfo);

// Security: reject non-3-channel output. Allowing other channel counts propagates

// attacker-controlled tensor shapes downstream, causing heap overflows in

// consumers that assume 3 channels (CVE-class: CWE-122).

if (cinfo.output_components != 3) {

jpeg_destroy_decompress(&cinfo);

return {kOrtxErrorInvalidArgument,

"[ImageDecoder]: Unsupported JPEG color space. Decoder must produce RGB (3 channels)."};

Copilot · 2026-03-06T23:07:18Z

operators/vision/image_decoder.hpp

+      // Security: reject non-RGB/grayscale color spaces (e.g. CMYK has 4 channels).
+      // Allowing other channel counts propagates attacker-controlled tensor shapes downstream,
+      // causing heap overflows in consumers that assume 3 channels (CVE-class: CWE-122).
+      if (cinfo.output_components != 3 && cinfo.output_components != 1) {
+        jpeg_destroy_decompress(&cinfo);
+        return {kOrtxErrorInvalidArgument,
+                "[ImageDecoder]: Unsupported JPEG color space. Only RGB and grayscale are supported."};
+      }
+


There is an early-return error path later in this function (if (srcManager.extError != kOrtxOK) return ...;) that exits without calling jpeg_finish_decompress/jpeg_destroy_decompress, which can leak libjpeg resources. Since this change adds a new early return with cleanup, it would be good to apply the same cleanup pattern to the existing extError return as well (e.g., ensure jpeg_destroy_decompress(&cinfo) runs on all return paths after jpeg_create_decompress).

Copilot · 2026-03-06T23:07:19Z

shared/api/image_transforms_phi_4.hpp

+    // Security: reject non-RGB inputs (e.g. CMYK JPEGs decoded as 4-channel tensors).
+    // Downstream allocation is hardcoded to 3 channels, so a 4-channel input would cause
+    // memcpy to write past the buffer end, enabling heap corruption / RCE (CWE-122).
+    int64_t c = ts_image.Shape()[2];
+    if (c != 3) {
+      return {kOrtxErrorInvalidArgument,
+              "[Phi4VisionProcessor]: Expected 3-channel RGB input, got " + std::to_string(c) + " channels"};
+    }


This check protects Phi4VisionDynamicPreprocess, but Phi4VisionProcessor::Compute in the same header still hardcodes 3 channels when allocating/copying tensors. For defense-in-depth (and to match the PR’s security goal), consider adding the same channels == 3 validation in Phi4VisionProcessor::Compute so it remains safe even if a non-RGB tensor reaches it through another path.

Copilot · 2026-03-06T23:07:19Z

test/test_pp_api.py

+            f.write(buf.getvalue())
+            tmp_path = f.name
+        try:
+            with self.assertRaises(Exception, msg="CMYK JPEG must be rejected, not silently overflow the heap"):


The test uses assertRaises(Exception), which is very broad and can mask unrelated failures (e.g., file/JSON issues) while still passing. It would be more reliable to assert a narrower exception type (e.g., the RuntimeError raised by the pybind layer) and/or validate the error message (e.g., via assertRaisesRegex) to ensure the failure is specifically due to CMYK rejection.

Suggested change

with self.assertRaises(Exception, msg="CMYK JPEG must be rejected, not silently overflow the heap"):

with self.assertRaises(RuntimeError, msg="CMYK JPEG must be rejected, not silently overflow the heap"):

Copilot · 2026-03-06T23:07:19Z

test/test_pp_api.py

+        proc = pp_api.create_processor(phi4_proc_path)
+        buf = io.BytesIO()
+        Image.new("CMYK", (64, 64), color=(128, 64, 32, 255)).save(buf, format="JPEG")
+        with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as f:
+            f.write(buf.getvalue())
+            tmp_path = f.name
+        try:
+            with self.assertRaises(Exception, msg="CMYK JPEG must be rejected, not silently overflow the heap"):
+                images = pp_api.load_images([tmp_path])
+                pp_api.image_pre_process(proc, images)
+        finally:
+            os.unlink(tmp_path)


This test creates a native processor via pp_api.create_processor(...) but never calls pp_api.delete_object(proc), so the underlying C++ object may leak for the duration of the test run. Consider using pp_api.ImageProcessor(phi4_proc_path) (which deletes in __del__) or explicitly deleting the processor handle in the finally block.

Suggested change

proc = pp_api.create_processor(phi4_proc_path)

buf = io.BytesIO()

Image.new("CMYK", (64, 64), color=(128, 64, 32, 255)).save(buf, format="JPEG")

with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as f:

f.write(buf.getvalue())

tmp_path = f.name

try:

with self.assertRaises(Exception, msg="CMYK JPEG must be rejected, not silently overflow the heap"):

images = pp_api.load_images([tmp_path])

pp_api.image_pre_process(proc, images)

finally:

os.unlink(tmp_path)

proc = None

buf = io.BytesIO()

Image.new("CMYK", (64, 64), color=(128, 64, 32, 255)).save(buf, format="JPEG")

with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as f:

f.write(buf.getvalue())

tmp_path = f.name

try:

proc = pp_api.create_processor(phi4_proc_path)

with self.assertRaises(Exception, msg="CMYK JPEG must be rejected, not silently overflow the heap"):

images = pp_api.load_images([tmp_path])

pp_api.image_pre_process(proc, images)

finally:

os.unlink(tmp_path)

if proc is not None:

pp_api.delete_object(proc)

kunal-vaishnavi · 2026-03-06T23:25:26Z

Thanks for the fix! Have we tested that E2E still works fine with ORT GenAI? If not, the simplest way is to create a dev branch, change the Extensions commit in GenAI's cmake/deps.txt file, and build GenAI from source before running any of the E2E examples.

fix

a511541

prathikr requested a review from a team as a code owner March 5, 2026 22:21

vraspar requested a review from Copilot March 6, 2026 23:03

Copilot started reviewing on behalf of vraspar March 6, 2026 23:04 View session

Copilot AI reviewed Mar 6, 2026

View reviewed changes

file loc

90e65fe

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

security bugfix for Phi4VisionProcessor#1034

security bugfix for Phi4VisionProcessor#1034
prathikr wants to merge 2 commits intomainfrom
prathikrao/cmyk-rcb-security-fix

prathikr commented Mar 5, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

kunal-vaishnavi commented Mar 6, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	with self.assertRaises(Exception, msg="CMYK JPEG must be rejected, not silently overflow the heap"):
	with self.assertRaises(RuntimeError, msg="CMYK JPEG must be rejected, not silently overflow the heap"):

Conversation

prathikr commented Mar 5, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

kunal-vaishnavi commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kunal-vaishnavi commented Mar 6, 2026 •

edited

Loading