XGraphics.FromPdfPage() corrupts pages with multi-stream /Contents arrays #345

@crwells

Description

Calling XGraphics.FromPdfPage() on an existing PDF page that has a /Contents array with multiple streams can produce a corrupted PDF. Adobe Acrobat reports "An error exists on this page. Acrobat may not display the page correctly." Visual content may also be lost.

Creating the XGraphics object and disposing it immediately, without drawing anything, is enough to trigger the corruption.

Versions affected

  • PDFsharp 6.1.1
  • PDFsharp 6.2.4

How to reproduce

The attached file example.pdf is a single-page PDF generated by "Microsoft Print to PDF". Its page has three content streams in the /Contents array (roughly 200 KB, 200 KB, and 65 KB uncompressed).

using PdfSharp.Drawing;
using PdfSharp.Pdf.IO;

string inputPdf = @"example.pdf";
string outputPdf = @"output_after_xgraphics.pdf";

using var doc = PdfReader.Open(inputPdf, PdfDocumentOpenMode.Modify);
var page = doc.Pages[0];

// Just open and immediately close XGraphics — no drawing
using (var gfx = XGraphics.FromPdfPage(page)) { }

doc.Save(outputPdf);

Open output_after_xgraphics.pdf in Adobe Acrobat. It will display the error dialog:

"An error exists on this page. Acrobat may not display the page correctly. Please contact the person who created the PDF document to correct the problem."

The original example.pdf opens without error. We have generated thousands of PDF files in this same manner, and have only recently started seeing this issue on a few PDF files.
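
To check whether another file has the same page structure, a minimal sketch like the one below may help. It is not part of PDFsharp; it only reuses the calls already shown in this report (PdfReader.Open, Pages, Contents.Elements.Count) and prints every page whose /Contents array holds more than one stream. The input file name is just the attachment from this issue.

```csharp
using System;
using PdfSharp.Pdf.IO;

// Diagnostic sketch (an assumption-laden helper, not a PDFsharp feature):
// report pages whose /Contents array contains more than one stream.
string inputPdf = @"example.pdf";

// Opened the same way as in the repro above; nothing is saved here.
using var doc = PdfReader.Open(inputPdf, PdfDocumentOpenMode.Modify);

for (int i = 0; i < doc.PageCount; i++)
{
    int streamCount = doc.Pages[i].Contents.Elements.Count;
    if (streamCount > 1)
        Console.WriteLine($"Page {i + 1}: {streamCount} content streams");
}
```

Pages listed by this loop are the ones that match the multi-stream condition described above; whether every such page triggers the corruption is not established here.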

Workaround

Flatten the multi-stream /Contents array into a single stream before calling XGraphics.FromPdfPage():

using System.IO;
using PdfSharp.Pdf;

private static void FlattenPageContentStreams(PdfDocument pdf)
{
    for (int i = 0; i < pdf.PageCount; i++)
    {
        var page = pdf.Pages[i];
        if (page.Contents.Elements.Count <= 1)
            continue;

        // Concatenate the decoded bytes of every content stream, in order
        using var combined = new MemoryStream();
        for (int j = 0; j < page.Contents.Elements.Count; j++)
        {
            var dict = page.Contents.Elements.GetDictionary(j);
            var data = dict?.Stream?.UnfilteredValue;
            if (data != null)
            {
                combined.Write(data, 0, data.Length);
                // Newline separator so operators at a stream boundary do not run together
                combined.WriteByte((byte)'\n');
            }
        }

        // Replace with a single uncompressed content stream:
        // drop every array entry except the first
        while (page.Contents.Elements.Count > 1)
            page.Contents.Elements.RemoveAt(page.Contents.Elements.Count - 1);

        var remaining = page.Contents.Elements.GetDictionary(0);
        if (remaining != null)
        {
            remaining.Stream.Value = combined.ToArray();
            // Keep /Length in sync with the new data
            remaining.Elements.SetInteger("/Length", (int)combined.Length);
            // The new data is unfiltered, so the old /Filter entry no longer applies
            remaining.Elements.Remove("/Filter");
        }
    }
}

Environment

  • .NET 8.0, Windows 11
  • Test PDF generated by "Microsoft Print to PDF" printer driver

example.pdf
