feature: HTML to Markdown (component & routes)#235

Closed
jbmoelker wants to merge 9 commits into main from feat/page-to-markdown

@jbmoelker
Member

Changes

Making our HTML pages available as Markdown is of interest to some people and especially suitable for software such as LLMs. These changes make HTML content available as Markdown.

  • Adds a ToMarkdown component to render HTML as Markdown (works at both run time and build time).
  • Adds a [locale]/[...path]/index.md.astro route to render the matching route as Markdown (works at both run time and build time).
  • Adds a post-build script to rename Markdown routes to .md, because Astro always generates .md/index.html files.
  • Adds support for GitHub Flavored Markdown, including tables.

See decision log entry for background.

Associated issue

N/A

How to test

  1. Open preview link
  2. Navigate to a page, like /en/documentation/getting-started/
  3. Replace the trailing slash with .md, like /en/documentation/getting-started.md
  4. Verify the page is now rendered as Markdown.
  5. Navigate to a page with tables, like /en/demos/tables-demo.md
  6. Verify that tables are rendered to Markdown.
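
The URL change in step 3 can be expressed as a small helper, which may be handy when checking several pages. This is only a sketch: `toMarkdownPath` is a hypothetical name, and the root path is rejected because the new route only covers [locale]/[...path].

```typescript
// Hypothetical helper: replaces the trailing slash of a page path with `.md`.
function toMarkdownPath(pagePath: string): string {
  // Drop one or more trailing slashes before appending the extension.
  const trimmed = pagePath.replace(/\/+$/, '');
  if (trimmed === '') {
    // Assumption: the site root has no Markdown counterpart in this sketch.
    throw new Error('The root path has no Markdown route in this sketch');
  }
  return `${trimmed}.md`;
}

const markdownUrl = toMarkdownPath('/en/documentation/getting-started/');
// markdownUrl is '/en/documentation/getting-started.md'
```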

Checklist

  • I have performed a self-review of my own code
  • I have made sure that my PR is easy to review (not too big, includes comments)
  • I have updated relevant documentation files (in project README, docs/, etc)
  • I have added a decision log entry if the change affects the architecture or changes a significant technology
  • I have notified a reviewer


export const partial = true;

Astro.response.headers.set('Content-Type', 'text/markdown; charset=utf-8');
Member Author

HTML entities in the content are still encoded (< becomes &lt; etc). This is not caused by the ToMarkdown component, as it also happens with the HTML comment below. So I believe Astro forces this higher up. I haven't been able to stop the encoding or force decoding 🤷 .
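
One conceivable workaround, not part of this PR, would be to decode the entities in the generated output afterwards, for example in a post-build step. The sketch below assumes only the five basic entities are involved; `decodeBasicEntities` is a hypothetical helper, not an Astro API.

```typescript
// Assumption: only these five entities need decoding.
const ENTITIES: Record<string, string> = {
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#39;': "'",
  '&amp;': '&',
};

// Decodes in a single pass, so text that was escaped twice
// (like `&amp;lt;`) is only unescaped one level, to `&lt;`.
function decodeBasicEntities(text: string): string {
  return text.replace(
    /&(?:lt|gt|quot|#39|amp);/g,
    (entity) => ENTITIES[entity] ?? entity,
  );
}
```

Whether decoding is safe depends on the content: a literal `&lt;` that is meant to stay encoded in the Markdown would also be decoded.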


**Renders nested HTML content to Markdown.**

## Examples
Member Author

I wanted to write these and others as tests. But while this component works in a route, I'm not able to isolate the output properly. 🤷 .

import { datocmsCollection } from '@lib/datocms';
import { type PageRouteForPath, getPagePath } from '@lib/routing/page';

export async function getStaticPaths() {
Member Author

We could keep this in index.astro and try to import it in index.md.astro. I like this separate file as it makes it clearer that it can be used across routes. It also obscured the content of index.astro a bit when it lived there. So I'm happy with this.

@@ -0,0 +1,16 @@
---
import ToMarkdown from '@components/ToMarkdown/ToMarkdown.astro';
import Page from './index.astro';
Member Author

I like this pattern where the Page from index.astro can remain unmodified and this route does all the ToMarkdown magic. If a developer using Head Start doesn't need this behaviour, this index.md.astro route can simply be removed.

Contributor

@jurgenbelien jurgenbelien left a comment

It works (apart from the HTML entities) and there are some changes that are IMHO beneficial to the project regardless (moving getStaticPaths to a separate file).

I do wonder if this functionality makes sense out of the box, instead of adding it later on for a project that needs it. For starters this increases build time for every project, even if they don't use it. A developer might remove the markdown route but forget the link tag, serving erroneous alternates that might negatively impact SEO.

Comment thread docs/decision-log/2024-12-27-render-to-markdown.md Outdated
Comment thread docs/decision-log/2024-12-27-render-to-markdown.md Outdated
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Feb 7, 2025

Deploying head-start with Cloudflare Pages

Latest commit: 4b6e1ec
Status: 🚫 Build failed.

View logs

@jbmoelker jbmoelker requested a review from Copilot April 16, 2025 19:42

Copilot AI left a comment

Copilot reviewed 6 out of 11 changed files in this pull request and generated 3 comments.

Files not reviewed (5)
  • package.json: Language not supported
  • public/_headers: Language not supported
  • src/components/ToMarkdown/ToMarkdown.astro: Language not supported
  • src/pages/[locale]/[...path]/index.astro: Language not supported
  • src/pages/[locale]/[...path]/index.md.astro: Language not supported

Comment thread scripts/rename-md-files.ts Outdated
Comment thread docs/decision-log/2024-12-27-render-to-markdown.md Outdated
Comment thread docs/decision-log/2024-12-27-render-to-markdown.md Outdated
@jbmoelker
Member Author

Will be replaced by #345

@jbmoelker jbmoelker closed this Apr 13, 2026
3 participants