Question: How does your language model work? (Inaccuracies in generated output)

Hello,

I hope this message finds you well. I have been using your project to generate text from audio files.
My process is as follows: I create an MP3 file using the website https://ttsmp3.com/, then convert the MP3 file to a WAV file.
I have read your [documentation on what to do when the recognized text is wrong](https://cmusphinx.github.io/doc/pocketsphinx/), and I create my samples with [ttsmp3](https://ttsmp3.com/) specifically to rule out background noise and similar problems.

![1](https://github.com/cmusphinx/pocketsphinx/assets/44742050/0d711dfa-bb1b-48a8-a3bc-4aac008f9461)

Here are my commands:


`sox 1.mp3 1.wav`

`pocketsphinx single 1.wav > 1.json`
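One thing that may be worth checking at this step is the format of the converted WAV file, since `sox 1.mp3 1.wav` keeps whatever sample rate and channel count the MP3 had. The sketch below uses only Python's standard `wave` module to report those properties. The target values (16 kHz, mono, 16-bit PCM) are an assumption about what the default acoustic model expects, not something confirmed in this issue, and the helper names `describe_wav` and `format_problems` are hypothetical.

```python
import wave

# Assumed target format (an assumption, not confirmed here):
# 16 kHz sample rate, one channel, 16-bit (2-byte) PCM samples.
EXPECTED_RATE_HZ = 16000
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH_BYTES = 2


def describe_wav(path):
    """Read the header of a PCM WAV file and return its basic properties."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }


def format_problems(info):
    """Return a list of human-readable mismatches against the assumed format."""
    problems = []
    if info["sample_rate_hz"] != EXPECTED_RATE_HZ:
        problems.append(
            f"sample rate is {info['sample_rate_hz']} Hz, expected {EXPECTED_RATE_HZ} Hz"
        )
    if info["channels"] != EXPECTED_CHANNELS:
        problems.append(
            f"file has {info['channels']} channels, expected {EXPECTED_CHANNELS}"
        )
    if info["sample_width_bytes"] != EXPECTED_SAMPLE_WIDTH_BYTES:
        problems.append(
            f"samples are {info['sample_width_bytes'] * 8}-bit, "
            f"expected {EXPECTED_SAMPLE_WIDTH_BYTES * 8}-bit"
        )
    return problems
```

Calling `format_problems(describe_wav("1.wav"))` returns an empty list when the file matches the assumed format. If it reports a mismatch, passing explicit output options to the conversion, for example `sox 1.mp3 -r 16000 -c 1 -b 16 1.wav`, might be worth trying before looking further into the language model.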

However, some of the recognized words are incorrect, as in these examples:

![2](https://github.com/cmusphinx/pocketsphinx/assets/44742050/36a35575-f6fa-436d-9bf0-8c41d5bbcca3)

I wanted to bring this issue to your attention and kindly ask for guidance on how your language model works. Are there any specific steps I should follow, or considerations I should keep in mind, to improve the accuracy of the recognized text?



Question: how does your language model work? (Inaccuracies in generated output ) #348
