Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features

DOI:10.21437/interspeech.2019-1235
Corpus ID: 195791553

@article{Cai2019PolyphoneDF, title={Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features}, author={Zexin Cai and Yaogen Yang and Chuxiong Zhang and Xiaoyi Qin and Ming Li}, journal={ArXiv}, year={2019}, volume={abs/1907.01749}, url={https://api.semanticscholar.org/CorpusID:195791553}}

Zexin Cai, Yaogen Yang, Chuxiong Zhang, Xiaoyi Qin, Ming Li
Published in Interspeech 3 July 2019
Computer Science, Linguistics

The experimental results show that both the sentence-level and the word-level conditional embedding features are able to attain good performance for Mandarin Chinese polyphone disambiguation.

23 Citations

Figures and Tables from this paper: Figure 1, Table 1, Table 2

Topics

Polyphone Disambiguation, Polyphonic Character, Prediction Network, Bidirectional Recurrent Neural Networks, Conditional Neural Network, Sentence Encoders

23 Citations

Improving Polyphone Disambiguation for Mandarin Chinese by Combining Mix-Pooling Strategy and Window-Based Attention

Junjie Li, Zhiyu Zhang, Minchuan Chen, Jun Ma, Shaojun Wang, Jing Xiao

Computer Science

Interspeech

2021

A novel system based on word-level features and window-based attention for polyphone disambiguation, which is a fundamental task for Grapheme-to-phoneme (G2P) conversion of Mandarin Chinese, is proposed.

Distant Supervision for Polyphone Disambiguation in Mandarin Chinese

Jiawen Zhang, Yuanyuan Zhao, Jiaqi Zhu, Jinba Xiao

Computer Science, Linguistics

INTERSPEECH

2020

This paper proposes a framework that predicts the pronunciations of Chinese characters; its core model is trained in a distantly supervised way and improves predictive accuracy for unbalanced polyphonic characters.

Polyphone Disambiguation in Mandarin Chinese with Semi-Supervised Learning

Yi Shi, Cong Wang, Yu Chen, Bin Wang

Computer Science, Linguistics

Interspeech

2021

A novel semi-supervised learning (SSL) framework for Mandarin Chinese polyphone disambiguation that can potentially leverage unlimited unlabeled text data and explores the effect of various proxy labeling strategies including entropy-thresholding and lexicon-based labeling.

Polyphone Disambiguition in Mandarin Chinese with Semi-Supervised Learning

Yi Shi, Cong Wang, Yu Chen, Bin Wang

Computer Science, Linguistics

ArXiv

2021

This paper proposes a novel semi-supervised learning (SSL) framework for Mandarin Chinese polyphone disambiguation that can potentially leverage unlimited unlabeled text data and explores the effect of various proxy labeling strategies including entropy-thresholding and lexicon-based labeling.

PDF: Polyphone Disambiguation in Chinese by Using FLAT

Haiteng Zhang

Computer Science

Interspeech

2021

This paper proposes a model that encodes both the input character sequence and dictionary matched words of the sentence, enabling the model to both avoid segment errors and leverage the well-designed pronunciation dictionary in the model.

External Knowledge Augmented Polyphone Disambiguation Using Large Language Model

Chen Li

Computer Science

ArXiv

2023

A novel method to solve the problem of polyphone disambiguation when doing grapheme-to-phoneme (G2P) conversion in Mandarin Chinese text-to-speech systems and results show that it outperforms the existing methods on a public dataset called CPP.

A Polyphone BERT for Polyphone Disambiguation in Mandarin Chinese

Song Zhang, Kengtao Zheng, Xiaoxu Zhu, Baoxiang Li

Computer Science

INTERSPEECH

2022

A Chinese polyphone BERT model to predict the pronunciations of Chinese polyphonic characters is proposed, which can turn the polyphone disambiguation task into a pre-training task of the Chinese polyphones BERT.

Polyphone Disambiguation and Accent Prediction Using Pre-Trained Language Models in Japanese TTS Front-End

Rem Hida, Masaki Hamada, Chie Kamada, E. Tsunoo, Toshiyuki Sekiya, Toshiyuki Kumakura

Computer Science, Linguistics

ICASSP 2022 - 2022 IEEE International Conference…

2022

The objective evaluation results showed that the proposed method improved the accuracy by 5.7 points in PD and 6.0 points in AP, and the perceptual listening test results confirmed that a TTS system employing the proposed model as a front-end achieved a mean opinion score close to that of synthesized speech with ground-truth pronunciation and accent in terms of naturalness.

Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation

Chunyu Qiang, Peng Yang, Hao Che, Jinba Xiao, Xiaorui Wang, Zhongyuan Wang

Computer Science, Linguistics

2022 Asia-Pacific Signal and Information…

2022

A back-translation-style data augmentation method for Mandarin Chinese polyphone disambiguation that utilizes a large amount of unlabeled text data, together with a data balance strategy to improve the accuracy of some typical polyphonic characters whose training data is imbalanced or scarce.

g2pW: A Conditional Weighted Softmax BERT for Polyphone Disambiguation in Mandarin

Yi-Chang Chen, Yu-Chuan Chang, Yenling Chang, Yi-Ren Yeh

Computer Science

INTERSPEECH

2022

This work proposes a novel approach, called g2pW, which adapts learnable softmax-weights to condition the outputs of BERT with the polyphonic character of interest and its POS tagging and shows that it outperforms existing methods on the public CPP dataset.

...

30 References

A bi-directional LSTM approach for polyphone disambiguation in Mandarin Chinese

Changhao Shan, Lei Xie, K. Yao

Computer Science, Linguistics

2016 10th International Symposium on Chinese…

2016

This paper proposes to use bidirectional long short-term memory (BLSTM) neural network to encode both the past and future observations on the character sequence as its inputs and predict the pronunciations of polyphonic characters.

Citations: 26, Highly Influential

Inequality Maximum Entropy Classifier with Character Features for Polyphone Disambiguation in Mandarin TTS Systems

Xinnian Mao, Yuan Dong, Jinyu Han, Dezhi Huang, Haila Wang

Computer Science, Linguistics

2007 IEEE International Conference on Acoustics…

2007

This paper formulates polyphone disambiguation as a classification problem and proposes a language-independent classifier based on maximum entropy to address it; it introduces inequality smoothing to alleviate data sparseness and exploits language-independent character features as linguistic knowledge.

Polyphone Disambiguation Based on Maximum Entropy Model in Mandarin Grapheme-to-Phoneme Conversion

Fangzhou Liu, You Zhou

Computer Science, Linguistics

2011

A maximum entropy model to disambiguate polyphones, and various keyword selection approaches in different domains are proposed, and a hierarchical clustering algorithm for automatic generation of feature templates is designed, which minimizes the need for human supervision during ME model training.

Disambiguating effectively Chinese polyphonic ambiguity based on unify approach

Feng-Long Huang

Computer Science, Linguistics

2008 International Conference on Machine Learning…

2008

The paper addresses the ambiguity of Chinese polyphonic characters and approaches to disambiguating them, and proposes unified approaches that improve performance across various threshold values.

An Efficient Way to Learn Rules for Grapheme-to-Phoneme Conversion in Chinese

Ziran Zhang, Min Chu, Eric Chang

Computer Science, Linguistics

2002

This paper points out that correct G2P conversion for 41 key polyphones and 22 key polyphonic multi-syllabic words will constrain the overall error rate to below 0.068%, and proposes a semi-automatic approach to do this, which saves almost half of the workload.

Grapheme-to-phoneme conversion using Long Short-Term Memory recurrent neural networks

Kanishka Rao, Fuchun Peng, Hasim Sak, F. Beaufays

Computer Science

2015 IEEE International Conference on Acoustics…

2015

This work proposes a G2P model based on a Long Short-Term Memory (LSTM) recurrent neural network (RNN) that has the flexibility to take the full context of graphemes into consideration and transforms the problem from a series of grapheme-to-phoneme conversions into a word-to-pronunciation conversion.

Listen, attend and spell: A neural network for large vocabulary conversational speech recognition

William Chan, N. Jaitly, Quoc V. Le, O. Vinyals

Computer Science

2016 IEEE International Conference on Acoustics…

2016

We present Listen, Attend and Spell (LAS), a neural speech recognizer that transcribes speech utterances directly to characters without pronunciation models, HMMs or other components of traditional…

Citations: 2,068

Directional Skip-Gram: Explicitly Distinguishing Left and Right Context for Word Embeddings

Yan Song, Shuming Shi, Jing Li, Haisong Zhang

Computer Science

NAACL

2018

Directional Skip-Gram (DSG), a simple but effective enhancement of the skip-gram model that explicitly distinguishes left and right context in word prediction and outperforms other models on different datasets in semantic and syntactic evaluations.

Grapheme-to-phoneme conversion in Chinese TTS system

Honghui Dong, J. Tao, Bo Xu

Computer Science

2004 International Symposium on Chinese Spoken…

2004

A study of the relation between Chinese characters and their pronunciation; as solutions to the disambiguation of polyphonic characters, a dictionary-based method and a rule-based method are proposed.

An HMM-Based Mandarin Chinese Text-To-Speech System

Yao Qian, F. Soong, Yining Chen, Min Chu

Computer Science, Linguistics

ISCSLP

2006

The listening test results show that LSP and its dynamic counterpart, both in time and frequency, are preferred for the resultant higher synthesized speech quality.

...
