mhchem arrows produce PUA characters in AssistiveMML, causing broken speech text (MathJax 4+) #3486

@brichwin

Issue Summary

When using the mhchem extension to render chemical equations with arrows (e.g., \ce{SO4^2- + Ba^2+ -> BaSO4 v}), MathJax 4.x produces Private Use Area (PUA) characters in the AssistiveMML output. These PUA characters are passed to the Speech Rule Engine (SRE), resulting in tofu characters (□) in the generated speech text and braille.

The issue is present in both MathJax 4.0 and MathJax 4.1. MathJax 3.2.2 emits standard Unicode characters and is not affected. The issue occurs with several of the mhchem reaction arrow types tested, not only the -> arrow used in the examples here.

Steps to Reproduce:

LaTeX input:

\ce{SO4^2- + Ba^2+ -> BaSO4 v}

Expected Behavior (as seen in MathJax 3.2.2)

The <mo> element for the reaction arrow uses the standard Unicode character U+27F6 (LONG RIGHTWARDS ARROW):

<mo stretchy="false" data-semantic-type="relation" data-semantic-role="arrow" 
    data-semantic-id="16" data-semantic-parent="31" 
    data-semantic-operator="multirel">&#x27F6;</mo>

Speech text output:

upper S upper O Subscript 4 Baseline Superscript 2 minus Baseline plus 
upper B a Superscript 2 plus Baseline long right arrow 
upper B a upper S upper O Subscript 4 Baseline down arrow

The arrow is spoken meaningfully as "long right arrow".

Actual Behavior (MathJax 4.0 and 4.1)

The <mo> element for the reaction arrow contains the PUA character U+E429 instead of a standard Unicode arrow:

MathJax 4.1:

<mo data-mjx-variant="-mhchem" data-mjx-texclass="REL" stretchy="true" 
    data-latex="\mhchemlongrightarrow" data-semantic-type="operator" 
    data-semantic-role="unknown" data-semantic-annotation="depth:3" 
    data-semantic-="" data-semantic-parent="30" 
    data-semantic-attributes="latex:\mathrel{\mhchemlongrightarrow};texclass:REL" 
    data-semantic-operator="infixop," data-semantic-level-number="2" 
    data-speech-node="true">&#xE429;</mo>

MathJax 4.0:

<mo data-mjx-variant="-mhchem" data-mjx-texclass="REL" stretchy="true" 
    data-latex="\mhchemlongrightarrow" data-semantic-type="operator" 
    data-semantic-role="unknown" data-semantic-annotation="depth:3" 
    data-semantic-="" data-semantic-parent="30" 
    data-semantic-attributes="latex:\mathrel{\mhchemlongrightarrow};texclass:REL" 
    data-semantic-operator="infixop," aria-level="2" 
    data-speech-node="true">&#xE429;</mo>

Speech text output (both 4.0 and 4.1):

SO sub 4 raised to the 2 minus power plus 
Ba raised to the 2 plus power  BaSO sub 4 down arrow, math

The arrow is rendered as a tofu character (□) in the speech output because SRE does not map the PUA character to spoken text.
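
To check which characters are affected, the serialized AssistiveMML can be scanned for Private Use Area code points. This is only a minimal sketch: `findPuaCharacters` is a hypothetical helper name (not part of MathJax or SRE), and it only checks the Basic Multilingual Plane PUA range U+E000 to U+F8FF.

```javascript
// Hypothetical helper (not part of MathJax or SRE): list the BMP Private Use
// Area characters (U+E000 to U+F8FF) found in a string such as serialized MathML.
function findPuaCharacters(text) {
  const found = [];
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp >= 0xE000 && cp <= 0xF8FF) {
      // Report each hit in the usual U+XXXX notation.
      found.push('U+' + cp.toString(16).toUpperCase().padStart(4, '0'));
    }
  }
  return found;
}

// The MathJax 4.x arrow reported here versus the MathJax 3.2.2 arrow.
console.log(findPuaCharacters('<mo>\uE429</mo>')); // one entry: U+E429
console.log(findPuaCharacters('<mo>\u27F6</mo>')); // empty array
```

Running this over the AssistiveMML for each arrow type should show whether other mhchem arrows map to different PUA code points.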

Technical details:

Affected Versions

  • MathJax 4.0.0 - Bug present
  • MathJax 4.1.0 - Bug present
  • MathJax 3.2.2 - Works as expected

I am using the following MathJax configuration (MathJax 4.1):

    window.MathJax = {
      loader: { load: ['[tex]/mhchem'] },
      tex: { packages: { '[+]': ['mhchem'] } },
      svg: { fontCache: 'none' },
      options: {
        enableMenu: true,
        menuOptions: {
          settings: {
            enrich: true,
            speech: true,
            braille: false,
            collapsible: false,
            assistiveMml: true
          }
        }
      },
      startup: {
        pageReady: () => {
          return MathJax.startup.defaultPageReady().then(() => {
            setTimeout(extractAssistiveMML, 100);
          });
        }
      }
    };

and loading MathJax via

  <script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@4.1/tex-svg.js"></script>
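
The configuration above calls `extractAssistiveMML`, which is not included in this report. A rough sketch of what such a callback could look like follows. It assumes that the assistive MathML is placed in `<mjx-assistive-mml>` elements; the `root` parameter is an addition for illustration so the function can be pointed at a specific container.

```javascript
// Hypothetical sketch of the extractAssistiveMML callback referenced in the
// configuration; the implementation actually used is not shown in the report.
// Assumption: the assistive MathML lives in <mjx-assistive-mml> elements.
function extractAssistiveMML(root = document) {
  const nodes = Array.from(root.querySelectorAll('mjx-assistive-mml'));
  // Serialize each assistive MathML tree so its characters can be inspected.
  const serialized = nodes.map((node) => node.innerHTML);
  serialized.forEach((mml, i) => console.log(`AssistiveMML #${i + 1}:`, mml));
  return serialized;
}
```

When invoked through `setTimeout(extractAssistiveMML, 100)` no argument is passed, so `root` falls back to `document`.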

Labels: Accepted (issue has been reproduced by MathJax team), SREv4
