Thai printed character recognition using long short-term memory

ทวีศักดิ์ เอี่ยมสวัสดิ์

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/52285

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	บุญเสริม กิจศิริกุล	en_US
dc.contributor.author	ทวีศักดิ์ เอี่ยมสวัสดิ์	en_US
dc.contributor.other	Chulalongkorn University. Faculty of Engineering	en_US
dc.date.accessioned	2017-03-03T03:04:35Z	-
dc.date.available	2017-03-03T03:04:35Z	-
dc.date.issued	2559	en_US
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/52285	-
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2016 (B.E. 2559)	en_US
dc.description.abstract	The segmentation-based approach to character recognition works by splitting a text line image into individual character images and then recognizing each character. This approach depends heavily on the performance of the segmentation step and suffers when characters touch one another or are partially missing. In contrast, the segmentation-free approach recognizes the text line image without segmenting it into individual character images, which suits languages such as Thai that contain many touching characters. The goal of this thesis is to apply long short-term memory (LSTM), a segmentation-free method, to Thai character recognition. The thesis also proposes a vertical component shifting method to address the large number of vertical character combinations that arise in the four-level character structure of Thai and that are difficult to handle with standard LSTM networks. The experimental results report the accuracy of the proposed method, built on standard LSTM networks, in comparison with commercial software for Thai character recognition.	en_US
dc.description.abstractalternative	The segmentation-based approach to Optical Character Recognition (OCR) first segments a text line image into individual character images and then recognizes each character. This approach relies heavily on the performance of the segmentation process and therefore suffers when characters are touching or broken. The unsegmented approach, by contrast, processes the text line image without segmenting it into individual characters, which makes it more suitable for languages such as Thai, which naturally contains many touching characters. This thesis proposes an application of Long Short-Term Memory (LSTM), an unsegmented method, to Thai OCR. The thesis also introduces a method called vertical component shifting to address the large number of vertically stacked character combinations that occur in the four-level writing system of Thai and pose difficulties for standard LSTM networks. The experimental results show that the proposed method achieves higher accuracy than standard LSTM networks and commercial software for Thai OCR.	en_US
dc.language.iso	th	en_US
dc.publisher	Chulalongkorn University	en_US
dc.relation.uri	http://doi.org/10.58837/CHULA.THE.2016.825	-
dc.rights	Chulalongkorn University	en_US
dc.subject	Character recognition (Computers)	-
dc.subject	Character recognition	-
dc.title	การรู้จำตัวอักษรพิมพ์ภาษาไทยโดยใช้หน่วยความจำระยะสั้นแบบยาว	en_US
dc.title.alternative	Thai printed character recognition using long short-term memory	en_US
dc.type	Thesis	en_US
dc.degree.name	Master of Science	en_US
dc.degree.level	Master's Degree	en_US
dc.degree.discipline	Computer Science	en_US
dc.degree.grantor	Chulalongkorn University	en_US
dc.email.advisor	Boonserm.K@chula.ac.th	en_US
dc.identifier.DOI	10.58837/CHULA.THE.2016.825	-
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
5770420421.pdf		4.73 MB	Adobe PDF
