Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/8823
Full metadata record
DC Field | Value | Language
dc.contributor.author | Wirote Aroonmanakun | -
dc.contributor.other | Chulalongkorn University. Faculty of Arts | -
dc.date.accessioned | 2009-02-17T01:29:09Z | -
dc.date.available | 2009-02-17T01:29:09Z | -
dc.date.issued | 2007 | -
dc.identifier.isbn | 9789746230629 | -
dc.identifier.uri | http://cuir.car.chula.ac.th/handle/123456789/8823 | -
dc.description.abstract | This paper discusses problems of word and sentence segmentation in Thai. Disagreements on word segmentation stem mostly from compound words. To establish a standard resource and tool for word segmentation, we suggest that only simple words and true compound words be segmented during word segmentation. Other compounds can be grouped later by the same means used for multiword identification in other languages. Sentence segmentation is also difficult because sentence boundaries in Thai are fuzzy. We suggest that a discourse be seen as a combination of clauses rather than sentences. Some discourse clues can then be used to segment these discourse units. The result of the sentence segmentation module could be a sequence of segments composed of clauses, which can then be assembled into the discourse structure. | en
dc.format.extent | 373 bytes | -
dc.format.mimetype | text/html | -
dc.language.iso | en | es
dc.publisher | Chulalongkorn University | en
dc.rights | Chulalongkorn University | en
dc.subject | Thai language -- Sentences | -
dc.subject | Thai language -- Phonology | -
dc.subject | Word (Linguistics) | -
dc.title | Thoughts on word and sentence segmentation in Thai | en
dc.type | Technical Report | es
dc.email.author | awirote@chula.ac.th | -
dc.description.publication | Aroonmanakun, W. 2007. Thoughts on Word and Sentence Segmentation in Thai. In Proceedings of the Seventh Symposium on Natural Language Processing, Dec 13-15, 2007, Pattaya, Thailand. 85-90. | en
dc.subject.keyword | word segmentation | en
dc.subject.keyword | sentence segmentation | en
Appears in Collections: Arts - Research Reports

Files in This Item:
File | Description | Size | Format
default.html | | 373 B | HTML

