
Fix decision trees and bounds checking issues #55

Merged
lenzo-ka merged 10 commits into master from fix-dt on Dec 12, 2025

Conversation

@lenzo-ka
Contributor

No description provided.

Prevents strcpy with overlapping memory when i==j (undefined behavior).
Adds bounds check before array access to prevent buffer overflows.
Fixes applied to both continuous and semi-continuous code paths.
The bounds check added above exited the loop with break when triggered,
silently discarding the remaining phones and leaving the acoustic model
data incomplete. Since j increments only for filtered phones (same logic
as the n_model calculation), j can never exceed the destination size and
the check is unnecessary. Removing it prevents data loss while keeping
the strcpy overlap fix intact.
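
The two commits above can be illustrated with a minimal sketch of an in-place filter loop. This is not the code from the repository: `NAME_LEN`, `keep_phone`, and `compact_phones` are hypothetical names, and the filter rule is an assumption chosen only for the example. It shows why the `i != j` guard is needed (strcpy on identical source and destination is undefined behavior) and why no extra bounds check on `j` is required (`j` never exceeds `i`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 32 /* hypothetical fixed width of one phone name */

/* Hypothetical filter: keep names that do not start with '+'. */
static int keep_phone(const char *name)
{
    return name[0] != '+';
}

/* Compact names[] in place and return how many entries were kept. */
static size_t compact_phones(char names[][NAME_LEN], size_t n)
{
    size_t i, j = 0;

    assert(names != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!keep_phone(names[i]))
            continue;
        /* When i == j the source and destination are the same buffer,
         * and strcpy on overlapping memory is undefined, so skip it. */
        if (i != j)
            strcpy(names[j], names[i]);
        /* j advances only for kept entries, so j <= i < n always holds. */
        j++;
    }
    return j;
}
```

Because `j` is incremented under the same condition that would be used to count the kept entries, a destination sized from that count cannot overflow, which matches the reasoning given for removing the check.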
backward.c:
- Fix tacc allocation to use n_state instead of max_n_next for safe j-i indexing
- Add bounds checks before all tacc[i][j-i] accesses to prevent out-of-bounds
- Fix state_seq[j] to state_seq[0] in CI mixw accumulation for initial state

viterbi.c:
- Fix tacc allocation to use n_state instead of max_n_next for safe indexing
- Add bounds checks before tacc[prev][j-prev] accesses in both code paths
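
As a rough illustration of the tacc changes in backward.c and viterbi.c, the sketch below allocates a square n_state by n_state accumulator and guards every `tacc[i][j - i]` update. The helper names (`alloc_tacc`, `free_tacc`, `tacc_add`) and the plain `float **` layout are assumptions made for this example, not the repository's actual allocation routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Free the first n_rows rows of the accumulator, then the row table. */
static void free_tacc(float **tacc, size_t n_rows)
{
    size_t i;

    if (tacc == NULL)
        return;
    for (i = 0; i < n_rows; i++)
        free(tacc[i]);
    free(tacc);
}

/* Allocate a zeroed n_state x n_state accumulator, so that any offset
 * j - i with i <= j < n_state is a valid column index. */
static float **alloc_tacc(size_t n_state)
{
    size_t i;
    float **tacc = calloc(n_state, sizeof *tacc);

    if (tacc == NULL)
        return NULL;
    for (i = 0; i < n_state; i++) {
        tacc[i] = calloc(n_state, sizeof **tacc);
        if (tacc[i] == NULL) {
            free_tacc(tacc, i);
            return NULL;
        }
    }
    return tacc;
}

/* Add v to tacc[i][j - i]; return 1 on success, 0 if out of range. */
static int tacc_add(float **tacc, size_t n_state, size_t i, size_t j, float v)
{
    assert(tacc != NULL);
    if (i >= n_state || j >= n_state || j < i)
        return 0; /* reject instead of writing out of bounds */
    tacc[i][j - i] += v;
    return 1;
}
```

Sizing each row by n_state rather than by the maximum number of successor states costs some memory, but it makes the `j - i` offset safe for any forward transition within the model.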

init_mixw/main.c:
- Initialize uninitialized destination tmat slots using source tmat[0]
- Critical when duplicating from .semi. to .cont. model definitions
Prevents out-of-bounds access when src_tmat pointer is valid but the
array is empty (n_tmat_src == 0). The condition now checks both pointer
validity and array size before accessing src_tmat[0].
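
The combined condition described above might look roughly like the following. This is a simplified sketch: the real transition matrices are multi-dimensional, whereas here each matrix is flattened to a single `float` row, and `init_tmat_slot` is a hypothetical helper name. Only `src_tmat` and `n_tmat_src` are taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill one destination slot from src_tmat[0].
 * Returns 1 if the slot was initialized, 0 if no usable source exists. */
static int init_tmat_slot(float *dest, size_t n_elem,
                          float *const *src_tmat, size_t n_tmat_src)
{
    assert(dest != NULL);
    /* A non-NULL pointer alone is not enough: the array may be empty,
     * in which case src_tmat[0] would be an out-of-bounds read. */
    if (src_tmat != NULL && n_tmat_src > 0 && src_tmat[0] != NULL) {
        memcpy(dest, src_tmat[0], n_elem * sizeof *dest);
        return 1;
    }
    return 0;
}
```

The caller can then decide what to do with a slot that could not be initialized, instead of reading from an empty source array.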
Changed 9 occurrences of st_mode && permission_bit to st_mode & permission_bit.
Using logical AND (&&) with constant permission bits (S_IROTH, S_IRUSR, etc.) is
incorrect: the constants are nonzero, so the expression is true whenever st_mode
is nonzero, regardless of the actual permissions. Bitwise AND (&) correctly tests
the bits.
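
A small sketch of the difference, using hypothetical helper names. The logical form is written out as `(mode != 0) && (S_IROTH != 0)`, which is what `mode && S_IROTH` means, to make the bug visible without relying on any particular file on disk.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* What `mode && S_IROTH` actually computes: true for any nonzero mode,
 * because S_IROTH is a nonzero constant. */
static int other_readable_logical(mode_t mode)
{
    return (mode != 0) && (S_IROTH != 0);
}

/* The intended test: is the "others may read" bit set in mode? */
static int other_readable_bitwise(mode_t mode)
{
    return (mode & S_IROTH) != 0;
}
```

For a mode of `S_IRUSR | S_IWUSR` (owner read and write only), the logical version reports the file as readable by others, while the bitwise version correctly reports that it is not.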
- Suppress legacy code warnings (sign-compare, unused-parameter, pointer-sign, etc.)
- Fix uninitialized n_mllr variable by moving MLLR transform code inside conditional
- Reduces warnings from 267 to 3 (only truly unused variables remain)
- actions/checkout@v3 → v4
- actions/upload-artifact@v3 → v4
- actions/download-artifact@v3 → v4

Fixes deprecation notice: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
The libngram-dev package is only available in Ubuntu 22.04+, not in
ubuntu-latest (which was Ubuntu 20.04). This fixes the package install
error: 'Unable to locate package libngram-dev'.
- Removed libfst-dev and libngram-dev from default install steps
- Disabled BUILD_G2P by default (it already defaults to OFF in CMake)
- Only train-g2p-lda-vtln job uses ubuntu-22.04 and installs G2P deps
- G2P job rebuilds sphinxtrain with -DBUILD_G2P=ON
- All other jobs run on ubuntu-latest without optional dependencies

This makes G2P truly optional and allows CI to work on any Ubuntu version.
Changed CI from running on all pushes/PRs to running only on:
- push to master branch (verify master stays healthy)
- pull_request targeting master (test before merge)

This prevents duplicate runs when pushing to PR branches, saving CI
resources and time while maintaining full test coverage.
@lenzo-ka lenzo-ka merged commit 30fee4c into master Dec 12, 2025
8 checks passed
@lenzo-ka lenzo-ka deleted the fix-dt branch December 12, 2025 16:06
