An implementation of a speech compression algorithm by digital signal processing

มีลาภ เรืองรัตนวิชา

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/29623

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	ประภาส จงสถิตย์วัฒนา
dc.contributor.author	มีลาภ เรืองรัตนวิชา
dc.contributor.other	Chulalongkorn University. Graduate School
dc.date.accessioned	2013-03-11T07:35:17Z
dc.date.available	2013-03-11T07:35:17Z
dc.date.issued	2539 B.E. (1996 C.E.)
dc.identifier.isbn	9746359142
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/29623
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2539 B.E. (1996 C.E.)	en
dc.description.abstract	The main objective of this research is to develop speech compression programs whose speech quality is high enough for communication applications. The programs are based on linear predictive coding (LPC); the methods used are LPC10, CELP (Code Excited Linear Prediction), and RPE-LTP (Regular Pulse Excitation - Long Term Prediction). In the first phase of the research, the programs were developed to run on a PC and to take speech stored in wave files as input. The study examined the data rate of the code produced by compression, the complexity of each algorithm, and the quality of the resulting speech signal. These were the main criteria for choosing the method to develop into a real-time speech compression program running on the ADSP2101 digital signal processor. The data rate is 2.4 kbps for LPC10, 4.8 kbps for CELP, and 13 kbps for RPE-LTP. CELP was found to be the most complex method, followed by RPE-LTP, with LPC10 the least complex. Speech quality was compared using opinion scores from 12 listeners. The test material consisted of one male speech sample and one female speech sample. LPC10 received an average score of 5.3, CELP received 6.7, and RPE-LTP received the highest score, 8.1. RPE-LTP was therefore chosen for the real-time speech compression program because it gives good speech quality without being very complex. The program takes about 16.3 ms to compress and decompress one frame of speech, which is 20 ms long.
dc.description.abstractalternative	The main objective of this research is to develop speech compression programs whose speech quality is high enough for communication applications. The programs employ methods based on linear predictive coding (LPC), namely LPC10, CELP (Code Excited Linear Prediction), and RPE-LTP (Regular Pulse Excitation - Long Term Prediction). In the first phase, the programs were developed to run on a PC and accept speech in wave file format (.wav) as input. The characteristics of each compression method, such as the compression ratio or data rate after compression, the algorithm complexity, and the speech quality, were studied and used as the criteria for choosing one method to implement as a real-time version intended to run on the ADSP2101 digital signal processor. The data rate after compression is 2.4 kbps for LPC10, 4.8 kbps for CELP, and 13 kbps for RPE-LTP. CELP was found to be the most complex method, RPE-LTP the second most complex, and LPC10 the least complex of the three. The speech quality of each method was compared using opinion scores from 12 listeners. The experiment was performed with two sample files, one of male speech and one of female speech. LPC10 received an average score of 5.3, CELP received 6.7, and RPE-LTP received the highest score, 8.1. RPE-LTP was therefore selected for real-time implementation because of its good speech quality and moderate complexity. The estimated compression and decompression time of the real-time compression program is 16.3 ms per 20 ms speech frame.
dc.format.extent	4283930 bytes
dc.format.extent	2837969 bytes
dc.format.extent	8700114 bytes
dc.format.extent	8480553 bytes
dc.format.extent	5674629 bytes
dc.format.extent	5182951 bytes
dc.format.extent	6046716 bytes
dc.format.extent	1100494 bytes
dc.format.extent	3833826 bytes
dc.format.mimetype	application/pdf
dc.format.mimetype	application/pdf
dc.format.mimetype	application/pdf
dc.format.mimetype	application/pdf
dc.format.mimetype	application/pdf
dc.format.mimetype	application/pdf
dc.format.mimetype	application/pdf
dc.format.mimetype	application/pdf
dc.format.mimetype	application/pdf
dc.language.iso	th	es
dc.publisher	Chulalongkorn University	en
dc.rights	Chulalongkorn University	en
dc.title	การทำขั้นตอนวิธีบีบข้อมูลเสียงพูดโดยการประมวลผลสัญญาณดิจิตอล	en
dc.title.alternative	An implementation of a speech compression algorithm by digital signal processing	en
dc.type	Thesis	en
dc.degree.name	Master of Science	en
dc.degree.level	Master's degree	en
dc.degree.discipline	Computer Engineering	en
dc.degree.grantor	Chulalongkorn University	en
Appears in Collections:	Grad - Theses
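
The data-rate and timing figures quoted in the abstract can be checked with a little arithmetic. The following is a minimal Python sketch, not taken from the thesis: the function names are hypothetical, and the 8 kHz, 16-bit mono PCM reference rate is an assumption, since the record does not state the sampling format of the wave files. A per-frame bit count is computed only for RPE-LTP, because the 20 ms frame length is stated only for the real-time program.

```python
# Figures quoted in the abstract.
DATA_RATE_KBPS = {"LPC10": 2.4, "CELP": 4.8, "RPE-LTP": 13.0}
FRAME_MS = 20.0        # frame length of the real-time RPE-LTP program
CODEC_TIME_MS = 16.3   # estimated compress + decompress time per frame

# ASSUMPTION (not in the record): input is 8 kHz, 16-bit mono PCM.
ASSUMED_PCM_KBPS = 8000 * 16 / 1000  # 128 kbps


def bits_per_frame(rate_kbps, frame_ms=FRAME_MS):
    """Coded bits per frame: (kbit/s) * ms = bits."""
    return rate_kbps * frame_ms


def compression_ratio(rate_kbps, pcm_kbps=ASSUMED_PCM_KBPS):
    """Ratio of the assumed uncompressed rate to the coded rate."""
    return pcm_kbps / rate_kbps


def processor_load(codec_ms=CODEC_TIME_MS, frame_ms=FRAME_MS):
    """Fraction of each frame period spent coding and decoding."""
    return codec_ms / frame_ms


def headroom_ms(codec_ms=CODEC_TIME_MS, frame_ms=FRAME_MS):
    """Time left over in each frame period after coding and decoding."""
    return frame_ms - codec_ms


if __name__ == "__main__":
    for name, rate in DATA_RATE_KBPS.items():
        print(f"{name}: {rate} kbps, ratio vs assumed PCM "
              f"{compression_ratio(rate):.1f}:1")
    rpe = DATA_RATE_KBPS["RPE-LTP"]
    print(f"RPE-LTP bits per {FRAME_MS:.0f} ms frame: {bits_per_frame(rpe):.0f}")
    print(f"Load: {processor_load():.1%}, headroom: {headroom_ms():.1f} ms")
```

Under these assumptions, 13 kbps over a 20 ms frame works out to 260 bits per frame, and 16.3 ms of processing per 20 ms frame is a load of about 81.5%, leaving roughly 3.7 ms of headroom per frame. A processing time below the frame length is the condition for the codec to keep up with incoming speech.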

Files in This Item:

File	Size	Format
Meelarp_ru_front.pdf	4.18 MB	Adobe PDF
Meelarp_ru_ch1.pdf	2.77 MB	Adobe PDF
Meelarp_ru_ch2.pdf	8.5 MB	Adobe PDF
Meelarp_ru_ch3.pdf	8.28 MB	Adobe PDF
Meelarp_ru_ch4.pdf	5.54 MB	Adobe PDF
Meelarp_ru_ch5.pdf	5.06 MB	Adobe PDF
Meelarp_ru_ch6.pdf	5.9 MB	Adobe PDF
Meelarp_ru_ch7.pdf	1.07 MB	Adobe PDF
Meelarp_ru_back.pdf	3.74 MB	Adobe PDF
