Research on Algorithm of Video Analysis System Based on Text Error Correction

Bibliographic Details
Title: Research on Algorithm of Video Analysis System Based on Text Error Correction
Authors: Jinjin Wang, Yang Qin, Jiahao Shi, Jiachen Luo, Guo Huang, Jiaqi Lu
Source: Frontiers in Computing and Intelligent Systems. 2:123-126
Publication Information: Darcy & Roy Press Co. Ltd., 2023.
Publication Year: 2023
Description: When a video contains a spoken-language error, it has to be re-recorded; there is no more effective way to remove inappropriate or unnatural-sounding passages from the recording. To address this problem, this paper studies speech extraction, error correction, and synthesis for video, divided into three parts: (1) speech segmentation and speech-to-text conversion of the video; (2) error correction of the recognized text; (3) text-to-speech and synthesis of the speech back into the video. For the first part, a staged, efficient algorithm based on the Bayesian Information Criterion (BIC) and the Statistical Mean Euclidean Distance (MEdist) segments the video's speech; the segmented audio is then denoised by subtraction and converted to text using the iFLYTEK interface. For the second part, the Double Automatic Error Correction (DAEC) algorithm corrects the text. For the third part, Improved Chinese Realtime Voice Cloning (I-Zhrtvc) performs text-to-speech, and the resulting speech is merged into the video. Simulation results show that the staged BIC & MEdist algorithm segments accurately by sentence, can identify audio with dialect accents, and converts speech to text with high accuracy, up to an average of 95.8%. The DAEC algorithm has a high error-correction rate, and the prosody accuracy of the synthesized audio is high. The ZVTOW text-to-speech Mean Opinion Score (MOS) reaches 4.5.
ISSN: 2832-6024
DOI: 10.54097/fcis.v2i3.5510
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::d3510bed8dbd58a8d5352c83e1899a2e
https://doi.org/10.54097/fcis.v2i3.5510
Rights: OPEN
Accession Number: edsair.doi...........d3510bed8dbd58a8d5352c83e1899a2e
Database: OpenAIRE
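
Illustrative note on the segmentation step named in the description: this record does not give the details of the paper's staged BIC & MEdist algorithm, so the following is only a minimal sketch of the classic BIC change-point test that such methods commonly build on, not the authors' implementation. It assumes each audio frame is represented by a feature vector (for example, MFCC-like features) and that each segment is modeled by a single full-covariance Gaussian. The names `delta_bic`, `best_split`, `margin`, and `penalty_weight` are hypothetical and chosen for illustration.

```python
import numpy as np


def _logdet_cov(x):
    # Log-determinant of the maximum-likelihood covariance of the frames in x.
    # A small ridge keeps the matrix well conditioned (an assumption of this sketch).
    d = x.shape[1]
    cov = np.cov(x, rowvar=False, bias=True).reshape(d, d) + 1e-6 * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(features, split, penalty_weight=1.0):
    # Compare one Gaussian over the whole window against two Gaussians
    # split at index `split`. A positive value favors a change point.
    n, d = features.shape
    left, right = features[:split], features[split:]
    penalty = 0.5 * penalty_weight * (d + 0.5 * d * (d + 1)) * np.log(n)
    return (0.5 * n * _logdet_cov(features)
            - 0.5 * len(left) * _logdet_cov(left)
            - 0.5 * len(right) * _logdet_cov(right)
            - penalty)


def best_split(features, margin=10, penalty_weight=1.0):
    # Scan candidate split points, keeping `margin` frames on each side so
    # that both covariance estimates are based on enough data.
    candidates = range(margin, len(features) - margin)
    scores = [delta_bic(features, i, penalty_weight) for i in candidates]
    k = int(np.argmax(scores))
    if scores[k] > 0:
        return candidates[k], scores[k]
    return None, scores[k]  # no split is supported by the criterion
```

In use, `best_split` would be applied to a sliding window of frame features, and a returned index would mark a candidate segment boundary. How the paper stages this test and combines it with MEdist is not stated in this record.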