A scalable shapelet discovery for time series classification

Nattakit Vichit

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/63652

Title:	A scalable shapelet discovery for time series classification
Other Titles:	Fast shapelet discovery for time series classification (การค้นพบเชพเลทอย่างรวดเร็วสำหรับการจำแนกอนุกรมเวลา)
Authors:	Nattakit Vichit
Advisors:	Chotirat Ratanamahatana
Other author:	Chulalongkorn University. Faculty of Engineering
Advisor's Email:	Chotirat.R@Chula.ac.th
Issue Date:	2018
Publisher:	Chulalongkorn University
Abstract:	As time series data become more complex and users expect more sophisticated information, numerous algorithms have been proposed to address these challenges. Among time series classification algorithms, the shapelet approach, which relies on a discriminative subsequence of the time series data, is considered practical because its classifications are both accurate and insightful. However, previously proposed shapelet algorithms still suffer from exceedingly high computational complexity, which limits their scalability to larger datasets. This work therefore proposes a novel algorithm that speeds up the shapelet discovery process. The algorithm, called "Dual Increment Shapelets (DIS)", combines a two-layered incremental neural network with a filtering process based on subsequence characteristics. Unlike previous algorithms, which mainly emphasize speeding up the search, DIS reduces the number of shapelet candidates based on subsequence characteristics. Empirical experiments on forty datasets demonstrate that DIS achieves a speedup of more than three orders of magnitude compared with the baseline algorithms while preserving the accuracy of the state-of-the-art algorithm.
Other Abstract:	Because time series data are becoming more complex and users expect more benefit from their analysis, many algorithms have been proposed to address the problem. The shapelet algorithm, which classifies by means of subsequences of a time series, is one time series classification algorithm that gives good results, both in accuracy and in providing insight that can be applied in practice. However, shapelet algorithms have so far suffered from long processing times, which limits their use on large datasets. To address this problem, this thesis proposes a new shapelet algorithm that focuses mainly on improving processing speed. The proposed algorithm, named Dual Increment Shapelets (DIS), combines a two-layer incremental neural network with a process that selects results based on subsequence characteristics. Empirical experiments on 40 datasets show that the algorithm is considerably faster while maintaining a high level of accuracy. Previously proposed algorithms mainly focus on improving the speed of the shapelet search process, whereas this thesis instead reduces the number of candidates based on subsequence characteristics. The experimental results show that the algorithm improves speed by more than 1,000 times compared with the baseline algorithms while maintaining accuracy.
Description:	Thesis (M.Sc.)--Chulalongkorn University, 2018
Degree Name:	Master of Science
Degree Level:	Master's Degree
Degree Discipline:	Computer Science
URI:	http://cuir.car.chula.ac.th/handle/123456789/63652
URI:	http://doi.org/10.58837/CHULA.THE.2018.159
metadata.dc.identifier.DOI:	10.58837/CHULA.THE.2018.159
Type:	Thesis
Appears in Collections:	Eng - Theses
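
Illustrative note on the abstract: the sketch below is not the DIS algorithm, which this record does not specify. It only illustrates the general shapelet idea mentioned in the abstract, under stated assumptions: the distance between a candidate subsequence and a series is taken as the minimum Euclidean distance over all same-length windows (normalization is omitted for brevity), and a brute-force search is assumed to score every subsequence of every training series. The function names shapelet_distance and count_candidates are hypothetical.

```python
import math
from typing import Sequence


def shapelet_distance(series: Sequence[float], candidate: Sequence[float]) -> float:
    """Minimum Euclidean distance between the candidate and any
    same-length window of the series (assumed definition, no normalization)."""
    m = len(candidate)
    if m == 0 or m > len(series):
        raise ValueError("candidate must be non-empty and no longer than the series")
    best = math.inf
    # Slide the candidate across every possible starting position.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, candidate)))
        best = min(best, dist)
    return best


def count_candidates(series_lengths: Sequence[int], min_len: int, max_len: int) -> int:
    """Number of subsequences a brute-force search would have to score,
    counting every start position for every allowed candidate length."""
    total = 0
    for n in series_lengths:
        for m in range(min_len, min(max_len, n) + 1):
            total += n - m + 1
    return total


if __name__ == "__main__":
    # The pattern [1, 2, 1] occurs exactly in the series, so the distance is 0.
    print(shapelet_distance([0, 0, 1, 2, 1, 0, 0], [1, 2, 1]))
    # Candidate count for 100 hypothetical series of length 200, lengths 10..100.
    print(count_candidates([200] * 100, 10, 100))
```

The candidate count grows with both the number of series and the range of candidate lengths, which suggests why reducing the number of candidates, as the abstract says DIS does, can matter more than speeding up each individual distance computation.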

Files in This Item:

File	Description	Size	Format
6070178221.pdf		1.78 MB	Adobe PDF
