PdfHandlerETC

PdfHandlerETC is a lightweight command-line and Python toolkit for handling common PDF tasks including text extraction, encryption, decryption, permissions inspection, word counting, page resizing, and file merging.

This project is released under the CC0 1.0 Public Domain Dedication.

Features

Extract text from PDFs by page or range
Encrypt and decrypt PDFs with customizable permissions
Count words across entire documents or selected pages
Inspect encryption status and permissions
Resize page dimensions
Merge two PDFs with optional visual separators (blank page or black bar)
Detect duplicate PDFs based on text content
Includes both a Python API and command-line interface (CLI)
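
The text-based duplicate check can be pictured with a short sketch. This is not PdfHandlerETC's implementation: the helper names text_fingerprint and same_text are hypothetical, and the normalization shown (collapsing whitespace, lowercasing) is an assumption. The input strings stand in for text already extracted from each PDF.

```python
import hashlib
import re


def text_fingerprint(text: str) -> str:
    """Return a SHA-256 digest of the text after simple normalization."""
    # Collapse whitespace runs and lowercase so that layout and case
    # differences do not change the digest (an assumed normalization).
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def same_text(text_a: str, text_b: str) -> bool:
    """Treat two documents as duplicates when their fingerprints match."""
    return text_fingerprint(text_a) == text_fingerprint(text_b)


print(same_text("Hello   world\n", "hello world"))  # True
print(same_text("Hello world", "Goodbye world"))    # False
```

Comparing digests rather than full strings keeps the check cheap when many files are involved, at the cost of only detecting exact matches after normalization.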

Installation

Install from PyPI:

pip install pdfhandleretc

Command-Line Usage

After installation, run the CLI as a Python module with python -m pdfhandler:

python -m pdfhandler wordcount document.pdf --pages "1, 3"
python -m pdfhandler encrypt document.pdf --output secure.pdf
python -m pdfhandler decrypt secure.pdf --in-place
python -m pdfhandler permissions secure.pdf
python -m pdfhandler resize document.pdf 612 792 --output resized.pdf
python -m pdfhandler dupe-check file1.pdf file2.pdf
python -m pdfhandler merge intro.pdf appendix.pdf merged.pdf --add-separator black
python -m pdfhandler extract document.pdf --pages "1-3, 5" > document_text.txt

Use --help for details:

python -m pdfhandler --help
python -m pdfhandler extract --help
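
Page selections such as "1, 3" and "1-3, 5" combine single pages and inclusive ranges. The sketch below shows one way such a string could be expanded into page numbers. It is an illustration only: parse_page_spec is a hypothetical helper, not part of pdfhandler, and it assumes 1-based page numbers with inclusive ranges.

```python
def parse_page_spec(spec: str) -> list[int]:
    """Expand a spec like "1-3, 5" into [1, 2, 3, 5]."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.extend(range(start, end + 1))  # ranges are inclusive
        else:
            pages.append(int(part))
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(pages))


print(parse_page_spec("1-3, 5"))  # [1, 2, 3, 5]
print(parse_page_spec("1, 3"))    # [1, 3]
```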

Python Usage

from pdfhandler import PdfHandler

handler = PdfHandler("example.pdf")

# Extract text
text = handler.get_pdf_text("1-2, 4")
print(text)

# Word count
print("Words:", handler.word_count("1-3"))

# Encrypt the file
handler.encrypt(output="example-encrypted.pdf")

# Show permissions
handler.print_permissions()

# Resize pages
handler.resize(width=612, height=792, output_path="resized.pdf")

# Merge with a visual separator (black bar or blank page)
PdfHandler.merge_pdfs(
    "intro.pdf",
    "appendix.pdf",
    "merged.pdf",
    add_separator=True,
    separator_type="black"  # or "blank"
)
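
The 612 and 792 used in the resize examples match US Letter (8.5 x 11 inches) in PDF points, at 72 points per inch. The README does not state the unit, so treat points as an assumption. Under that assumption, a small converter like the one below can produce width and height values for other paper sizes; inches_to_points and mm_to_points are hypothetical helpers, not part of pdfhandler.

```python
POINTS_PER_INCH = 72  # default PDF unit: 1 point = 1/72 inch
MM_PER_INCH = 25.4


def inches_to_points(inches: float) -> float:
    """Convert a length in inches to PDF points."""
    return inches * POINTS_PER_INCH


def mm_to_points(mm: float) -> float:
    """Convert a length in millimetres to PDF points."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


# US Letter, 8.5 x 11 in
print(inches_to_points(8.5), inches_to_points(11.0))  # 612.0 792.0

# A4, 210 x 297 mm, rounded to whole points
print(round(mm_to_points(210)), round(mm_to_points(297)))  # 595 842
```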

License

This project is licensed under the CC0 1.0 Universal public domain dedication. You may use, modify, and distribute it freely without attribution or restriction.

Dependencies

pdfminer.six - for text extraction
pikepdf - for encryption and PDF manipulation
colorama - for cross-platform terminal colors
