Deep learning for coding international classification of diseases (ICD) from medical records using Thai and English corpus

ณัฐชา ซาซุม

dc.contributor.advisor	เกริก ภิรมย์โสภา
dc.contributor.advisor	กฤษณ์ เจริญลาภ
dc.contributor.author	ณัฐชา ซาซุม
dc.contributor.other	Chulalongkorn University. Faculty of Engineering
dc.date.accessioned	2024-02-05T10:12:31Z
dc.date.available	2024-02-05T10:12:31Z
dc.date.issued	2566
dc.identifier.uri	https://cuir.car.chula.ac.th/handle/123456789/84315
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2566
dc.description.abstract	This research proposes a model for multi-label ICD code classification to assist in ICD coding. Incomplete ICD coding prevents hospitals from receiving appropriate reimbursement. We therefore focus on helping hospitals assign ICD codes completely during the medical expense reimbursement process, the outcome of which is financial support for hospitals. Multi-label ICD coding typically involves medical record datasets with a long-tailed distribution, in which no data should be discarded. Our multi-label classification therefore combines an ensemble of three deep learning models (bidirectional long short-term memory, convolutional neural network, and Transformer encoder), which selects the maximum predicted value, with a statistical model, multinomial Naïve Bayes, used with the binary relevance method. The deep learning models are responsible for the common ICD code group (frequently occurring diseases), while the statistical model handles the rare ICD code group (diseases that patients seldom have). Predicting ICD-10 codes from treatment notes, the model achieves a Jaccard index of 0.792 for the common ICD code group and 0.205 for the rare group. Predicting ICD-9 codes from medication notes (treatment procedures), it achieves a Jaccard index of 0.963 for the common group and 0.201 for the rare group. The model is capable of suggesting ICD codes during the medical expense reimbursement process.
dc.description.abstractalternative	We propose an ensemble model for multi-label ICD classification to assist in ICD coding. Incomplete ICD coding prevents hospitals from receiving full compensation, so we aim to help hospitals complete their ICD codes during the reimbursement process and thereby support them financially. Multi-label ICD datasets built from medical records naturally follow a long-tailed distribution, and no data should be dropped. Our multi-label classifier therefore combines a maximizing ensemble of 3 deep learning models (bidirectional long short-term memory, convolutional neural network, and 4 multi-head attention Transformers) with a statistical model, binary relevance with multinomial Naïve Bayes. The deep learning models are responsible for frequent ICD codes, while the statistical model handles infrequent ones. When predicting ICD-10-TM codes from course notes, our model achieves a Jaccard index of 0.792 for the frequent ICD group and 0.205 for the infrequent group. When predicting ICD-9-CM codes from medication notes with procedures, it achieves a Jaccard index of 0.963 for the frequent ICD group and 0.201 for the infrequent group. The model is capable of suggesting ICD codes in the reimbursement process.
dc.language.iso	th
dc.publisher	Chulalongkorn University
dc.rights	Chulalongkorn University
dc.subject.classification	Medicine
dc.subject.classification	Computer Science
dc.subject.classification	Information and communication
dc.title	การเรียนรู้เชิงลึกสำหรับเข้ารหัสบัญชีการจำแนกโรคระหว่างประเทศ (ไอซีดี) จากบันทึกเวชระเบียนโดยใช้คลังข้อมูลภาษาไทยและภาษาอังกฤษ
dc.title.alternative	Deep learning for coding international classification of diseases (ICD) from medical records using Thai and English corpus
dc.type	Thesis
dc.degree.name	Master of Science
dc.degree.level	Master's Degree
dc.degree.discipline	Computer Science
dc.degree.grantor	Chulalongkorn University
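
The abstract mentions a maximizing ensemble and reports results as a Jaccard index. The following is a minimal sketch of both ideas, not the thesis implementation. The function names, the 0.5 decision threshold, the per-record form of the Jaccard index, and the probability values are all assumptions made for illustration; the ICD-10 codes are used only as sample labels.

```python
from typing import Dict, List, Set


def max_ensemble(model_probs: List[Dict[str, float]]) -> Dict[str, float]:
    """Combine per-code probabilities by keeping the maximum across models."""
    combined: Dict[str, float] = {}
    for probs in model_probs:
        for code, p in probs.items():
            combined[code] = max(combined.get(code, 0.0), p)
    return combined


def predict_codes(probs: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Keep every code whose combined probability reaches the threshold.

    The threshold of 0.5 is an assumption; the abstract does not state one.
    """
    return {code for code, p in probs.items() if p >= threshold}


def jaccard_index(predicted: Set[str], actual: Set[str]) -> float:
    """Size of the intersection divided by size of the union of two code sets."""
    if not predicted and not actual:
        return 1.0  # convention chosen here: two empty sets count as a perfect match
    return len(predicted & actual) / len(predicted | actual)


# Hypothetical per-code probabilities from three models for one record.
bilstm = {"E11.9": 0.40, "I10": 0.90, "N18.9": 0.10}
cnn = {"E11.9": 0.70, "I10": 0.60, "N18.9": 0.20}
transformer = {"E11.9": 0.30, "I10": 0.80, "N18.9": 0.45}

combined = max_ensemble([bilstm, cnn, transformer])
predicted = predict_codes(combined)

# Hypothetical ground-truth codes for the same record.
actual = {"E11.9", "I10", "N18.9"}
score = jaccard_index(predicted, actual)
print(sorted(predicted), round(score, 3))
```

In this sketch a code is suggested if any single model is confident about it, which favors completeness of the suggested code set; how the thesis averages the index over records and code groups is not stated in this record.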