Using automatic speech recognition to assess Thai speech language fluency in Montreal Cognitive Assessment (MoCA)

Pimarn Kantithammakorn

dc.contributor.advisor	Proadpran Punyabukkana
dc.contributor.advisor	Dittaya Wanvarie
dc.contributor.author	Pimarn Kantithammakorn
dc.contributor.other	Chulalongkorn University. Faculty of Engineering
dc.date.accessioned	2021-09-22T23:39:08Z
dc.date.available	2021-09-22T23:39:08Z
dc.date.issued	2020
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/77264
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2020
dc.description.abstract	The Montreal Cognitive Assessment (MoCA), a widely accepted screening tool for identifying patients with mild cognitive impairment (MCI), includes a language fluency test of verbal functioning in which the score is based on the number of unique correct words produced by the test-taker. Because languages form words differently, unique words may be counted differently from one language to another. This study focuses on Thai, a language that differs from English in how words are combined. We applied various automatic speech recognition (ASR) techniques to develop an assisted scoring system for the MoCA language fluency test with Thai language support. An additional challenge is that Thai is a low-resource language for which domain-specific data are not publicly available, especially speech data from patients with MCI. We propose a hybrid Time Delay Neural Network - Hidden Markov Model (TDNN-HMM) architecture for acoustic model training, to create an ASR system that is robust to environmental noise and to the variation in voice quality caused by MCI. The LOTUS Thai speech corpus is incorporated into the training set to improve the model’s generalization. A preprocessing algorithm reduces background noise and improves overall data quality before the audio is fed into the TDNN-HMM system for automatic word detection and language fluency score calculation. The results show that the TDNN-HMM model, combined with data augmentation and trained with the lattice-free maximum mutual information (LF-MMI) objective function, achieves a word error rate (WER) of 41.30%. To our knowledge, this is the first study to develop an ASR system with Thai language support to automate scoring of the MoCA’s language fluency assessment.
dc.description.abstractalternative	The Montreal Cognitive Assessment (MoCA) is a widely accepted instrument for screening patients with mild cognitive impairment. It includes an assessment of language and speech ability in which the patient says as many words meeting a given condition as possible within a set time. Scoring counts the words that satisfy the condition and do not repeat an earlier word, and this count may differ from language to language. This research studies the assessment in Thai, applying automatic speech recognition techniques to help score the language ability portion of the MoCA. Thai is a language for which publicly usable speech data are fairly limited, especially speech data from patients with mild cognitive impairment. We therefore propose building the acoustic model with a Time Delay Neural Network - Hidden Markov Model (TDNN-HMM) to develop an automatic speech recognition system that can be used where background noise may be present and where the patient's voice quality may be poorer than desired. The public Thai speech corpus LOTUS is used to help develop the model, together with a step that reduces noise in the audio files before they are processed for word counting and scoring of the language ability assessment. The experimental results show that the TDNN-HMM model, combined with augmentation of the speech data and lattice-free maximum mutual information (LF-MMI) training, helps reduce errors in the predicted words, with a word error rate of about 41.30%. No previous research has applied automatic speech recognition techniques to help score language ability for Thai.
dc.language.iso	en
dc.publisher	Chulalongkorn University
dc.relation.uri	http://doi.org/10.58837/CHULA.THE.2020.141
dc.rights	Chulalongkorn University
dc.subject.classification	Computer Science
dc.title	Using automatic speech recognition to assess Thai speech language fluency in Montreal Cognitive Assessment (MoCA)
dc.title.alternative	Using automatic speech recognition technology to help assess the language ability of Thai speech from the MoCA cognitive assessment
dc.type	Thesis
dc.degree.name	Master of Science
dc.degree.level	Master's Degree
dc.degree.discipline	Computer Science
dc.degree.grantor	Chulalongkorn University
dc.identifier.DOI	10.58837/CHULA.THE.2020.141
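
The abstract refers to two quantities that a short sketch can make concrete: the fluency score (the number of unique correct words) and the word error rate (WER). The Python below is an illustration only, not the thesis implementation. The function names, the `is_valid` predicate, and the toy English tokens are hypothetical, and the sketch assumes the ASR output is already split into words, which for Thai would require a separate word segmentation step. WER is computed in the standard way, as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words.

```python
from typing import Callable, Iterable, Sequence


def fluency_score(recognized: Iterable[str], is_valid: Callable[[str], bool]) -> int:
    """Count unique recognized words that satisfy the task condition.

    `is_valid` is a hypothetical predicate standing in for the test's rule
    (for example, a required initial sound); it is not taken from the thesis.
    """
    seen = set()
    for word in recognized:
        token = word.strip()
        if token and is_valid(token):
            seen.add(token)  # repeated words are counted once
    return len(seen)


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Return (substitutions + deletions + insertions) / len(reference)."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


if __name__ == "__main__":
    words = ["cat", "dog", "cat", "car"]
    print(fluency_score(words, lambda w: w.startswith("c")))
    print(word_error_rate(["a", "b", "c", "d"], ["a", "x", "c"]))
```

Because WER divides by the reference length, it can exceed 100% when the hypothesis contains many insertions. A fluency score derived from ASR output inherits the recognizer's errors, which is why the abstract describes the system as assisted scoring.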