feat: Add nexojornal.com.br custom parser by jocmp · Pull Request #176 · jocmp/mercury-parser

jocmp · 2026-05-02T02:28:18Z

Adds a custom Mercury parser for Nexo Jornal, a Brazilian Portuguese news outlet, to support jocmp/capyreader#1914.

Notes

Nexo Jornal is a Next.js single-page app: the article body is rendered client-side from the __NEXT_DATA__ JSON blob, which lives in a <script> tag, and Mercury strips <script> tags before extraction runs. The body is therefore not available to the parser, which relies entirely on the head meta tags that Mercury normalizes (property -> name, content -> value):

title: og:title (the title cleaner strips the " - Nexo Jornal" suffix)
dek: og:description
date_published: articlePublishTime
lead_image_url: og:image
author: null (the only available editor meta tag is the section editor, not the article author)
content: #__next selector (empty in the server-rendered HTML, so Mercury returns null for content when run with fallback: false)
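
The field mapping above could be expressed as a Mercury custom extractor roughly as follows. This is a sketch, not the code in the commit: the constant name `WwwNexojornalComBrExtractor` and the exact selector strings are assumptions based on Mercury's usual custom-extractor conventions, and the real module would export the object.

```javascript
// Hypothetical sketch of src/extractors/custom/www.nexojornal.com.br/index.js.
// Each selector pair is [cssSelector, attribute]. By the time selectors run,
// Mercury has rewritten <meta property="..." content="..."> into
// <meta name="..." value="...">, so the pairs target name/value.
const WwwNexojornalComBrExtractor = {
  domain: 'www.nexojornal.com.br',

  title: {
    selectors: [['meta[name="og:title"]', 'value']],
  },

  dek: {
    selectors: [['meta[name="og:description"]', 'value']],
  },

  date_published: {
    selectors: [['meta[name="articlePublishTime"]', 'value']],
  },

  lead_image_url: {
    selectors: [['meta[name="og:image"]', 'value']],
  },

  // No article-level author meta tag is available, so author is left unset.
  author: null,

  // #__next is empty in the server-rendered HTML, so content resolves to null.
  content: {
    selectors: ['#__next'],
  },
};
```

Leaving `author` as null, instead of pointing it at the section-editor meta tag, avoids attributing an article to the wrong person.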

This still gives Capyreader a clean title, description, image, and date for items in the RSS feed.
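
To make the meta-tag normalization mentioned in the notes concrete, here is a minimal illustration of the property -> name, content -> value rewrite. It operates on plain attribute objects, not on a parsed DOM, and the helper names `normalizeMetaAttrs` and `metaValue` as well as the sample tags are invented for this example. It is not Mercury's implementation.

```javascript
// Illustrative only: mimics the property -> name, content -> value rewrite
// on plain attribute objects.
function normalizeMetaAttrs(attrs) {
  const out = { ...attrs };
  if ('property' in out) {
    out.name = out.property;
    delete out.property;
  }
  if ('content' in out) {
    out.value = out.content;
    delete out.content;
  }
  return out;
}

// Return the value of the first normalized meta tag with the given name,
// or null when no such tag exists.
function metaValue(metaTags, name) {
  const tag = metaTags.map(normalizeMetaAttrs).find(t => t.name === name);
  return tag ? tag.value : null;
}

// Made-up sample data, not taken from a real Nexo Jornal page.
const sampleTags = [
  { property: 'og:title', content: 'Exemplo de manchete - Nexo Jornal' },
  { name: 'articlePublishTime', content: '2026-05-01T12:00:00Z' },
];

console.log(metaValue(sampleTags, 'og:title'));
console.log(metaValue(sampleTags, 'articlePublishTime'));
```

After this rewrite, Open Graph tags (which use `property`) and plain `name` tags can be matched with the same `meta[name="..."]` selector form.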

Test plan

npx jest src/extractors/custom/www.nexojornal.com.br/index.test.js passes

feat: Add nexojornal.com.br custom parser

75338d9

jocmp wants to merge 1 commit into main from jc/1914/nexojornal-parser
