DOI:10.21437/interspeech.2019-1235 - Corpus ID: 195791553
@article{Cai2019PolyphoneDF, title={Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features}, author={Zexin Cai and Yaogen Yang and Chuxiong Zhang and Xiaoyi Qin and Ming Li}, journal={ArXiv}, year={2019}, volume={abs/1907.01749}, url={https://api.semanticscholar.org/CorpusID:195791553}}
- Zexin Cai, Yaogen Yang, Chuxiong Zhang, Xiaoyi Qin, Ming Li
- Published in Interspeech 3 July 2019
- Computer Science, Linguistics
The experimental results show that both the sentence-level and the word-level conditional embedding features are able to attain good performance for Mandarin Chinese polyphone disambiguation.
Figures and Tables from this paper
- figure 1
- table 1
- table 2
Topics
- Polyphone Disambiguation
- Polyphonic Character
- Prediction Network
- Bidirectional Recurrent Neural Networks
- Conditional Neural Network
- Sentence Encoders
23 Citations
- Junjie Li, Zhiyu Zhang, Minchuan Chen, Jun Ma, Shaojun Wang, Jing Xiao
- 2021
Computer Science
Interspeech
A novel system based on word-level features and window-based attention for polyphone disambiguation, which is a fundamental task for Grapheme-to-phoneme (G2P) conversion of Mandarin Chinese, is proposed.
- 1
- Jiawen Zhang, Yuanyuan Zhao, Jiaqi Zhu, Jinba Xiao
- 2020
Computer Science, Linguistics
INTERSPEECH
This paper proposes a framework that predicts the pronunciations of Chinese characters; the core model is trained in a distantly supervised way and improves predictive accuracy for unbalanced polyphonic characters.
- 7
- Yi Shi, Cong Wang, Yu Chen, Bin Wang
- 2021
Computer Science, Linguistics
Interspeech
A novel semi-supervised learning (SSL) framework for Mandarin Chinese polyphone disambiguation that can potentially leverage unlimited unlabeled text data and explores the effect of various proxy labeling strategies including entropy-thresholding and lexicon-based labeling.
- 2
- Yi Shi, Cong Wang, Yu Chen, Bin Wang
- 2021
Computer Science, Linguistics
ArXiv
This paper proposes a novel semi-supervised learning (SSL) framework for Mandarin Chinese polyphone disambiguation that can potentially leverage unlimited unlabeled text data and explores the effect of various proxy labeling strategies including entropy-thresholding and lexicon-based labeling.
- Haiteng Zhang
- 2021
Computer Science
Interspeech
This paper proposes a model that encodes both the input character sequence and dictionary matched words of the sentence, enabling the model to both avoid segment errors and leverage the well-designed pronunciation dictionary in the model.
- 5
- Chen Li
- 2023
Computer Science
ArXiv
A novel method for polyphone disambiguation in grapheme-to-phoneme (G2P) conversion for Mandarin Chinese text-to-speech systems; results show that it outperforms existing methods on a public dataset called CPP.
- Song Zhang, Kengtao Zheng, Xiaoxu Zhu, Baoxiang Li
- 2022
Computer Science
INTERSPEECH
A Chinese polyphone BERT model to predict the pronunciations of Chinese polyphonic characters is proposed, which turns the polyphone disambiguation task into a pre-training task of the Chinese polyphone BERT.
- Rem Hida, Masaki Hamada, Chie Kamada, E. Tsunoo, Toshiyuki Sekiya, Toshiyuki Kumakura
- 2022
Computer Science, Linguistics
ICASSP 2022 - 2022 IEEE International Conference…
The objective evaluation results showed that the proposed method improved the accuracy by 5.7 points in PD and 6.0 points in AP, and the perceptual listening test results confirmed that a TTS system employing the proposed model as a front-end achieved a mean opinion score close to that of synthesized speech with ground-truth pronunciation and accent in terms of naturalness.
- Chunyu Qiang, Peng Yang, Hao Che, Jinba Xiao, Xiaorui Wang, Zhongyuan Wang
- 2022
Computer Science, Linguistics
2022 Asia-Pacific Signal and Information…
A back-translation-style data augmentation method for Mandarin Chinese polyphone disambiguation that utilizes a large amount of unlabeled text data, together with a data balance strategy to improve the accuracy of some typical polyphonic characters whose training data is imbalanced or scarce.
- Yi-Chang Chen, Yu-Chuan Chang, Yenling Chang, Yi-Ren Yeh
- 2022
Computer Science
INTERSPEECH
This work proposes a novel approach, called g2pW, which adapts learnable softmax-weights to condition the outputs of BERT with the polyphonic character of interest and its POS tagging and shows that it outperforms existing methods on the public CPP dataset.
30 References
- Changhao Shan, Lei Xie, K. Yao
- 2016
Computer Science, Linguistics
2016 10th International Symposium on Chinese…
This paper proposes to use bidirectional long short-term memory (BLSTM) neural network to encode both the past and future observations on the character sequence as its inputs and predict the pronunciations of polyphonic characters.
- 26
- Highly Influential
- Xinnian Mao, Yuan Dong, Jinyu Han, Dezhi Huang, Haila Wang
- 2007
Computer Science, Linguistics
2007 IEEE International Conference on Acoustics…
This paper formulates polyphone disambiguation as a classification problem and proposes a language-independent classifier based on maximum entropy to address it. It introduces inequality smoothing to alleviate data sparseness and exploits language-independent character features as linguistic knowledge.
- 16
- Fangzhou Liu, You Zhou
- 2011
Computer Science, Linguistics
A maximum entropy model to disambiguate polyphones, and various keyword selection approaches in different domains are proposed, and a hierarchical clustering algorithm for automatic generation of feature templates is designed, which minimizes the need for human supervision during ME model training.
- 11
- Feng-Long Huang
- 2008
Computer Science, Linguistics
2008 International Conference on Machine Learning…
The paper addresses the ambiguity of Chinese polyphonic characters and approaches to disambiguating them, and proposes unified approaches that improve performance across various threshold values.
- 14
- Ziran Zhang, Min Chu, Eric Chang
- 2002
Computer Science, Linguistics
This paper points out that correct G2P conversion for 41 key polyphones and 22 key polyphonic multi-syllabic words will constrain the overall error rate to below 0.068%, and proposes a semi-automatic approach to do this, which saves almost half of the workload.
- 24
- Kanishka Rao, Fuchun Peng, Hasim Sak, F. Beaufays
- 2015
Computer Science
2015 IEEE International Conference on Acoustics…
This work proposes a G2P model based on a Long Short-Term Memory (LSTM) recurrent neural network (RNN) that has the flexibility to take the full context of graphemes into consideration and transforms the problem from a series of grapheme-to-phoneme conversions into a word-to-pronunciation conversion.
- 183
- William Chan, N. Jaitly, Quoc V. Le, O. Vinyals
- 2016
Computer Science
2016 IEEE International Conference on Acoustics…
We present Listen, Attend and Spell (LAS), a neural speech recognizer that transcribes speech utterances directly to characters without pronunciation models, HMMs or other components of traditional…
- 2,068
- Yan Song, Shuming Shi, Jing Li, Haisong Zhang
- 2018
Computer Science
NAACL
Directional Skip-gram (DSG), a simple but effective enhancement of the skip-gram model that explicitly distinguishes left and right context in word prediction and outperforms other models on different datasets in semantic and syntactic evaluations.
- 275
- Honghui Dong, J. Tao, Bo Xu
- 2004
Computer Science
2004 International Symposium on Chinese Spoken…
A study of the relation between Chinese characters and their pronunciation; as solutions to the disambiguation of polyphonic characters, a dictionary-based method and a rule-based method are proposed.
- 12
- Yao Qian, F. Soong, Yining Chen, Min Chu
- 2006
Computer Science, Linguistics
ISCSLP
The listening test results show that LSP and its dynamic counterpart, both in time and frequency, are preferred for the resultant higher synthesized speech quality.
- 53