English translation
2.3. The SR process
The speech recognition process follows five steps [21]:
1. Audio input: The human voice is transmitted through a microphone connected to a PC with a standard sound card.
2. Acoustic processor: The acoustic processor filters out background noise and converts the captured audio into a series of phonemes.
3. Word matching: The software attempts to match the sounds to the most likely words in two ways. First, it uses acoustical analysis to build a list of possible matches that contain similar sounds. Then, it uses language modeling (the likelihood that a given word appears between those coming before and after it) to narrow the list to the best candidates. In addition, the word-matching process draws on the user-defined domain (the set of vocabularies, pronunciations, and word-usage models, as well as a model of the user's speech and words). The user can extend the domain by adding new words and can create multiple domains for different applications. Finally, continuous-speech SR examines contextual information to predict what words should come next in the current phrase. This also helps the system to distinguish among homonyms.
4. Decoder: The decoder selects the most likely word based on the rankings assigned during word matching and assembles the word along with those selected earlier into the most likely sentence combination.
5. Text output: Some SR programs include their own word processors, but many also allow text transcription directly into a separate word processing program or a text box in an application, such as a web browser or e-mail program.
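
As a rough illustration of steps 3 and 4 (not taken from [21]), the sketch below builds a candidate list per sound segment, then uses a bigram language model to pick the most likely sentence. The candidate words, the scores, the `FLOOR` value, and the `decode` function are all made up for the example, and the Viterbi-style search is only one common way to do this, not necessarily what the systems described in the passage use.

```python
import math

# Hypothetical acoustic match scores for each spoken segment.
# Homonyms ("to", "two", "too") sound alike, so they get the same score.
ACOUSTIC_CANDIDATES = [
    {"i": 0.9, "eye": 0.9},
    {"want": 0.8, "wand": 0.6},
    {"to": 0.7, "two": 0.7, "too": 0.7},
    {"apples": 0.9},
]

# Hypothetical bigram language model: P(word | previous word).
BIGRAM = {
    ("<s>", "i"): 0.5,
    ("<s>", "eye"): 0.01,
    ("i", "want"): 0.4,
    ("want", "to"): 0.5,
    ("want", "two"): 0.2,
    ("want", "too"): 0.01,
    ("to", "apples"): 0.001,
    ("two", "apples"): 0.3,
}
FLOOR = 1e-4  # assumed probability for word pairs missing from the model


def decode(candidates, bigram, floor=FLOOR):
    """Return the word sequence with the highest combined score."""
    # best[word] = (log score of the best path ending in word, that path)
    best = {"<s>": (0.0, [])}
    for segment in candidates:
        new_best = {}
        for word, acoustic in segment.items():
            options = []
            for prev, (score, path) in best.items():
                lm = bigram.get((prev, word), floor)
                total = score + math.log(acoustic) + math.log(lm)
                options.append((total, path + [word]))
            # Keep only the best way of reaching this word.
            new_best[word] = max(options, key=lambda o: o[0])
        best = new_best
    return max(best.values(), key=lambda o: o[0])[1]


print(decode(ACOUSTIC_CANDIDATES, BIGRAM))
# ['i', 'want', 'two', 'apples']
```

In this toy model "want to" is more likely than "want two" taken alone, but the following word "apples" tips the whole sentence toward "two". This is the sense in which context helps distinguish among homonyms.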
2.4. SR limitations
While SR requires less hardware (e.g., no keyboard is needed for input, which is especially advantageous for PDAs) and people can generate text faster by speaking than by typing, there are some significant limitations. Most importantly, the current state of the technology prevents transcriptions from achieving 100% accuracy because of slurred speech, mispronunciation, and background noise, which becomes worse in crowded offices [24]. Enrollment can improve accuracy somewhat, but it represents an added startup cost.
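
The passage does not say how accuracy is measured. One widely used metric is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the transcription into the reference text, divided by the number of reference words. The sketch below is a minimal version under that assumption; the function name and the sample sentences are made up for the example.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


print(word_error_rate("i want two apples", "i want to apples"))  # 0.25
```

A transcription with one wrong word out of four has a WER of 0.25, i.e. 75% word accuracy under this measure.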
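I haven't touched speech recognition in over a year, so the translation is rough; make do with it:
The speech recognition process:
Speech recognition can be divided into 5 steps: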
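1. Audio input: The human voice is fed through a microphone into a PC with a standard sound card.
2. Acoustic processing: The acoustic processor filters out background noise and converts the captured speech into a series of phonemes.
3. Word matching: The software matches the sounds to the most likely words in two steps. First, it uses acoustic analysis to build a list of possible matches containing similar sounds. Then it uses a language model (the probability that a given word appears in its surrounding context) to narrow the list down to the best candidates. In addition, word matching draws on a user-defined domain (a set of vocabularies, pronunciations, and word-usage models, together with a model of the user's speech and words). The user can extend the domain by adding new words and can create multiple domains for different applications. Finally, a continuous-speech recognition system uses contextual information to predict which word should come next in the current phrase, which also helps it tell homonyms apart.
4. Decoder: The decoder selects the most likely word based on the rankings assigned during word matching and combines it with the words selected earlier into the most probable sentence.
5. Text output: Some speech recognition programs include their own word processor, but many can also transcribe text directly into a separate word processing program or into a text box in an application such as a web browser or e-mail program.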
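2.4 Limitations of speech recognition
Although speech recognition needs less hardware (for example, no keyboard is required for input, which is especially useful on PDAs) and people can produce text faster by speaking than by typing, it still has some significant limitations. Most importantly, current technology cannot reach 100% recognition accuracy because of slurred speech, mispronunciation, and background noise, which is worse in crowded offices. Enrollment (training the system on the user's voice) can improve accuracy somewhat, but it adds to the startup cost.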