Please use this identifier to cite or link to this item: http://cuir.car.chula.ac.th/handle/123456789/8823
Title: Thoughts on word and sentence segmentation in Thai
Authors: Wirote Aroonmanakun
Email: awirote@chula.ac.th
Other author: Chulalongkorn University. Faculty of Arts
Subjects: Thai language -- Sentences
Thai language -- Phonology
Word (Linguistics)
Issue Date: 2007
Publisher: Chulalongkorn University
Abstract: This paper discusses problems of word and sentence segmentation in Thai. Disagreements on word segmentation arise mostly from compound words. To establish a standard resource and tool for word segmentation, we suggest that only simple words and true compound words should be segmented during word segmentation. Other compounds can be grouped later by the same means used for multiword identification in other languages. Sentence segmentation is also difficult because sentence boundaries in Thai are fuzzy. We suggest that a discourse should be seen as a combination of clauses rather than sentences. Discourse clues can then be used to segment these discourse units. The output of a sentence segmentation module could be a sequence of segments composed of clauses, which can then be assembled into the discourse structure.
URI: http://cuir.car.chula.ac.th/handle/123456789/8823
ISBN: 9789746230629
Type: Technical Report
Appears in Collections: Arts - Research Reports

Files in This Item:
File: default.html
Size: 373 B
Format: HTML


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.