Transit signal priority control using reinforcement learning based on cell transmission model

Pitipong Chanloha

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/36735

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Chaodit Aswakul	-
dc.contributor.advisor	Jatuporn Chinrungrueng	-
dc.contributor.advisor	Wipawee Hattagam	-
dc.contributor.author	Pitipong Chanloha	-
dc.date.accessioned	2013-11-27T06:29:52Z	-
dc.date.available	2013-11-27T06:29:52Z	-
dc.date.issued	2012	-
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/36735	-
dc.description	Thesis (Ph.D.)--Chulalongkorn University, 2012	en_US
dc.description.abstract	This dissertation develops a new framework for controlling traffic signals in a road network with a recently introduced bus rapid transit (BRT) system. By applying the automated goal-directed learning and decision making known as reinforcement learning (RL), the best possible traffic signal actions can be sought in response to changes in the network state as modelled by the signalised cell transmission model (CTM). The dissertation makes three main original contributions. Firstly, a model that combines the CTM, which captures the system dynamics, with an RL approach called Q-learning is introduced for an isolated intersection. Despite this isolation constraint, a new external delay function is proposed at the system boundary to capture the effects on the neighbourhood of the isolated intersection. With red-light delay set as the RL reward function, the reported results show that the proposed framework, using RL and CTM at the macroscopic level, can efficiently find a control solution close to the best periodic signal solution (BPSS) found by brute-force search. Secondly, optimal traffic signal control derived from theoretical M/M/1 and D/D/1 queueing models is compared in performance with the RL approach. In particular, based on M/M/1 and D/D/1 queueing, the optimal split is derived to minimise the mean waiting time at an intersection with two conflicting flows. The results confirm the validity of adopting the RL approach to control the traffic signal. Finally, an extension to a network of cascading interactions with a BRT system is proposed, with simple uni-directional flows without turning movements. Motivated by the BRT system in Bangkok, the conventional signalised CTM is generalised to cope with the preplanned space-usage priority of BRT over non-priority vehicles by explicitly modelling the physical BRT lane separator as well as the locations of BRT stations. A delay function is introduced that covers passengers carried on BRT and on non-priority vehicles as well as passengers waiting at stations. Based on the investigated scenarios, deploying a BRT system that gives up one lane to the lane separator cannot reduce the total passenger delay in comparison with comparable road and traffic conditions before the BRT installation. However, with BRT, the passenger throughput can be increased by up to 9-15% in jammed conditions when at least 40% of all passengers choose the BRT for their journey. Moreover, the proposed method outperforms the conventional preemptive and differential priority control methods because of its improved awareness of the signal switching cost. Taken together, these findings support the development of sustainable transportation systems.	en_US
dc.description.abstractalternative	(Translated from the Thai.) This dissertation aims to develop a new analytical framework for traffic signal control in road networks with bus rapid transit (BRT), using the automated goal-directed learning and decision making known as reinforcement learning (RL). The best possible traffic signal actions can be found in response to changes in the network state, which is modelled by the signalised cell transmission model (CTM). The dissertation presents three main contributions. First, the CTM, which captures the system dynamics, is combined with an RL implementation called Q-learning to find the best possible solution for an isolated intersection. Although only a single intersection is considered, a delay function is proposed at the system boundary to capture the effects of neighbouring intersections. With red-light delay chosen as the reward function, the experimental results show that the proposed framework of RL and CTM at the macroscopic level can efficiently find a control solution close to the best periodic signal solution found by brute-force search. Second, the performance of optimal traffic signal control based on theoretically derived M/M/1 and D/D/1 queueing models is compared with that of the RL approach. The optimal split is derived to minimise the mean waiting time at an isolated intersection with two conflicting flows. The results confirm the validity of using RL to control traffic signals. Finally, an extension to a network of cascading interactions with a signal priority system is proposed, with simple uni-directional flows without turning movements. For the BRT system in Bangkok, the conventional signalised CTM is generalised to handle the preplanned space-usage priority of BRT over non-priority vehicles by explicitly modelling the physical BRT lane separation as well as the locations of BRT stations. A delay function is proposed that covers passengers carried on BRT and on non-priority vehicles as well as passengers waiting at stations. Based on the scenarios investigated, deploying a BRT system that gives up one lane to the lane separator cannot reduce passenger delay compared with the road under the traffic conditions before the BRT installation. However, the BRT system can increase throughput by up to 9-15% in congested conditions when at least 40% of all passengers choose the BRT. Moreover, the proposed method outperforms the conventional preemptive method and the differential priority control method because of its improved awareness of the signal switching cost. Taken together, these findings contribute to the development of sustainable mass transit systems.	en_US
dc.language.iso	en	en_US
dc.publisher	Chulalongkorn University	en_US
dc.relation.uri	http://doi.org/10.14457/CU.the.2012.917	-
dc.rights	Chulalongkorn University	en_US
dc.subject	Reinforcement learning	en_US
dc.subject	Traffic engineering	en_US
dc.subject	Communication and traffic	en_US
dc.subject	การเรียนรู้แบบเสริมแรง	en_US
dc.subject	วิศวกรรมจราจร	en_US
dc.subject	จราจร	en_US
dc.subject	ปริญญาดุษฎีบัณฑิต	en_US
dc.title	Transit signal priority control using reinforcement learning based on cell transmission model	en_US
dc.title.alternative	การควบคุมสัญญาณที่มีลำดับความสำคัญของการเดินทางผ่านโดยใช้การเรียนรู้แบบเสริมแรงบนพื้นฐานของแบบจำลองการส่งผ่านเซลล์	en_US
dc.type	Thesis	en_US
dc.degree.name	Doctor of Engineering	en_US
dc.degree.level	Doctoral Degree	en_US
dc.degree.discipline	Electrical Engineering	en_US
dc.degree.grantor	Chulalongkorn University	en_US
dc.email.advisor	Chaodit.A@chula.ac.th	-
dc.email.advisor	No information provided	-
dc.email.advisor	No information provided	-
dc.identifier.DOI	10.14457/CU.the.2012.917	-
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
pitipong_ch.pdf		2.51 MB	Adobe PDF
