Transit signal priority control using reinforcement learning based on cell transmission model

Pitipong Chanloha

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/36735

Title:	Transit signal priority control using reinforcement learning based on cell transmission model
Other Titles:	การควบคุมสัญญาณที่มีลำดับความสำคัญของการเดินทางผ่านโดยใช้การเรียนรู้แบบเสริมแรงบนพื้นฐานของแบบจำลองการส่งผ่านเซลล์
Authors:	Pitipong Chanloha
Advisors:	Chaodit Aswakul; Jatuporn Chinrungrueng; Wipawee Hattagam
Advisor's Email:	Chaodit.A@chula.ac.th; No information provided; No information provided
Subjects:	Reinforcement learning; Traffic engineering; Communication and traffic; การเรียนรู้แบบเสริมแรง; วิศวกรรมจราจร; จราจร; ปริญญาดุษฎีบัณฑิต
Issue Date:	2012
Publisher:	Chulalongkorn University
Abstract:	This dissertation develops a new framework for controlling traffic signals in a road network with a recently introduced bus rapid transit (BRT) system. By applying reinforcement learning (RL), an automated goal-directed approach to learning and decision making, the best possible traffic signal actions can be sought as the network state changes, with the network modelled by the signalised cell transmission model (CTM). The dissertation makes three main original contributions. Firstly, a model that combines CTM, to capture the system dynamics, with the RL method known as Q-learning is introduced for an isolated intersection. Despite this isolation constraint, a new external delay function is proposed at the system boundary to capture the effects on the neighbourhood of the isolated intersection. With red-light delay set as the RL reward function, the reported results show that the proposed framework, using RL and CTM at the macroscopic level, efficiently finds a control solution close to the best periodic signal solution (BPSS) obtained by brute-force search. Secondly, optimal traffic signal controls derived from theoretical M/M/1 and D/D/1 queueing models are compared in performance with the RL approach. In particular, for both M/M/1 and D/D/1 queueing, the optimal split is derived to minimise the mean waiting time at an intersection with two conflicting flows. The results confirm the validity of adopting the RL approach to control the traffic signal. Finally, an extension to a network of cascading interactions with a BRT system is proposed, with simple uni-directional flows and no turning movements. Motivated by the BRT system in Bangkok, the conventional signalised CTM is generalised to handle the preplanned space-usage priority of BRT over other, non-priority vehicles by explicitly modelling the physical BRT lane separator and the locations of BRT stations. A delay function is introduced that covers passengers carried on BRT and on non-priority vehicles as well as passengers waiting at stations. In the investigated scenarios, deploying the BRT system with one lane taken up by the lane separator does not reduce the total passenger delay compared with comparable road and traffic conditions before the BRT installation. However, with BRT, passenger throughput increases by up to 9-15% in jammed conditions when at least 40% of all passengers choose BRT for their journey. Moreover, the proposed method outperforms the conventional preemptive and differential priority control methods because of its improved awareness of signal switching cost. Taken together, these findings support the development of sustainable transportation systems.
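As a rough illustration of the first contribution described in the abstract (Q-learning driven by a signalised CTM at an isolated intersection), the following Python sketch may help. It is not the dissertation's implementation. The cell parameters `Q_MAX` and `N_MAX`, the three-cell links, the state binning in `observe`, the demand values, and the reward (negative count of vehicles held on the red approach) are all assumptions chosen for brevity. The external boundary delay function, signal switching cost, and BRT priority mentioned in the abstract are not modelled.

```python
import random
from collections import defaultdict

Q_MAX = 3.0    # assumed maximum flow between cells per time step (vehicles)
N_MAX = 12.0   # assumed holding capacity of one cell (vehicles)

def ctm_step(n, demand, green):
    """Advance one link by one time step using the basic CTM flow rule.

    n      -- vehicles in each cell, upstream to downstream
    demand -- vehicles wanting to enter the first cell this step
    green  -- True if the signal at the downstream end shows green
    Returns (new cell occupancies, vehicles discharged past the signal).
    """
    k = len(n)
    # y[i] is the inflow to cell i; y[k] is the outflow past the stop line.
    y = [min(demand, Q_MAX, N_MAX - n[0])]
    for i in range(1, k):
        y.append(min(n[i - 1], Q_MAX, N_MAX - n[i]))
    y.append(min(n[-1], Q_MAX) if green else 0.0)
    return [n[i] + y[i] - y[i + 1] for i in range(k)], y[k]

def observe(links, phase):
    """Coarse state (assumed): binned stop-line occupancy per approach, plus phase."""
    return tuple(int(link[-1] // 4) for link in links) + (phase,)

def train(episodes=200, horizon=120, demand=(1.5, 1.0),
          alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning; action 0 keeps the current phase, action 1 switches it."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0, 0.0])
    for _ in range(episodes):
        links = [[0.0] * 3, [0.0] * 3]   # two conflicting approaches
        phase = 0                        # index of the approach with green
        state = observe(links, phase)
        for _ in range(horizon):
            if rng.random() < eps:       # epsilon-greedy exploration
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] >= q[state][1] else 1
            if action == 1:
                phase = 1 - phase
            links = [ctm_step(links[a], demand[a], green=(a == phase))[0]
                     for a in range(2)]
            # Assumed reward: vehicles currently held on the red approach.
            reward = -sum(sum(links[a]) for a in range(2) if a != phase)
            nxt = observe(links, phase)
            q[state][action] += alpha * (reward + gamma * max(q[nxt]) - q[state][action])
            state = nxt
    return q
```

Calling `train()` returns the learned Q-table, from which a greedy policy can be read by taking the higher-valued action in each visited state. A comparison against a brute-force periodic baseline, as the abstract describes, would need a separate search over fixed cycle plans.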
Other Abstract:	วิทยานิพนธ์ฉบับนี้มีจุดมุ่งหมายเพื่อพัฒนากรอบการวิเคราะห์ใหม่ในการควบคุมสัญญาณไฟสำหรับโครงข่ายถนนที่มีรถโดยสารประจำทางด่วนพิเศษ โดยใช้การเรียนรู้อัตโนมัติแบบเป้าหมายกำกับและกระทำการตัดสินใจที่เรียกว่า การเรียนรู้แบบเสริมแรง (Reinforcement learning : RL) โดยสัญญาณไฟจราจรที่ดีที่สุดที่เป็นไปได้สามารถหาได้ขึ้นกับการเปลี่ยนแปลงของสถานะโครงข่ายที่ถูกจำลองด้วยแบบจำลองการส่งผ่านเซลล์ที่มีสัญญาณไฟ (Cell transmission model : CTM) โดยมีผลงานสามส่วนหลักที่ปรากฏอยู่ในวิทยานิพนธ์ฉบับนี้ ประการแรก การรวมเข้าของแบบจำลอง CTM เพื่อจับระบบกลศาสตร์ร่วมกับการนำ RL ไปใช้เรียกว่า การเรียนรู้แบบคิวเพื่อหาคำตอบที่ดีที่สุดที่เป็นไปได้สำหรับสี่แยกเดี่ยว แม้ว่าจะเป็นการพิจารณาเพียงแยกเดียวแต่ก็ได้มีการนำเสนอฟังก์ชันการประวิงเวลาของระบบที่เงื่อนไขขอบเขตเพื่อจับผลประสิทธิผลจากแยกข้างเคียง ด้วยการเลือกการประวิงเวลาสัญญาณไฟแดงเป็นฟังก์ชันผลรางวัล ผลการทดลองแสดงให้เห็นว่ากรอบการวิเคราะห์ที่นำเสนอ RL และ CTM ในระดับมหภาคสามารถหาคำตอบของการควบคุมได้อย่างมีประสิทธิภาพใกล้เคียงกับการเอาแต่แรงด้วยการค้นหาคำตอบแบบสัญญาณคาบที่ดีที่สุด ประการที่สอง การเปรียบเทียบสมรรถนะของการควบคุมสัญญาณไฟจราจรที่ดีที่สุดบนพื้นฐานขอแบบจำลองอนุพัทธ์เชิงทฤษฎีระบบแถวคอย M/M/1 และ D/D/1 และถูกประเมินบนพื้นฐานของการเข้าสู่ RL โดยการแบ่งที่ดีที่สุดสามารถอนุพัทธ์เพื่อทำให้ค่าเฉลี่ยของการรอที่แยกเดี่ยวที่มีสองความขัดแย้งของการไหลลดลง ผลที่ได้มายืนยันมีผลใช้ได้ใน RL เพื่อคุมสัญญาณไฟ ประการสุดท้าย การยืดออกไปเป็นโครงข่ายอันตรกิริยาแบบต่อเรียงกับระบบสัญญาณที่มีลำดับความสำคัญได้ถูกนำเสนอควบคู่ไปกับการไกลแบบทิศทางเดียวอย่างง่ายที่ไม่มีการเลี้ยวกลับ ด้วยระบบ BRT ในกรุงเทพมหานครแบบจำลองการส่งผ่านเซลล์ที่มีสัญญาณไฟแบบทั่วไปสามารถทำให้ใช้ได้กับการวางแผนพื้นที่การใช้งานล่วงหน้าของ BRT ซึ่งเหนือกว่ารถที่ไม่มีลำดับความสำคัญ ด้วยการจำลองอย่างชัดแจ้งของการมีอยู่ของ BRT ที่ถูกแยกช่องกายภาพ เช่นเดียวกับ ตำแหน่งของสถานี BRT โดยนำเสนอฟังก์ชันประวิงเวลาของทั้งผู้โดยสารที่ถูกขนส่งไปได้บน BRT และบนรถที่ไม่มีลำดับความสำคัญ และ ผู้โดยสารที่รอที่สถานี ด้วยพื้นฐานของแบบการที่จำลองสืบค้นอยู่ การวางระบบ BRT ที่ลดลง 
หนึ่งช่องทางด้วยตัวแบ่งช่องทาง ไม่สามารถลดการประวิงเวลาโดยสารได้เมื่อได้ลองเทียบกับถนนที่มีเงื่อนไขจราจรก่อนจะมีการวางระบบ BRT อย่างไรก็ตาม ระบบ BRT สามารถเพิ่มปริมาณงานได้ถึง 9-15% ในเงื่อนไขการติดเมื่ออย่างน้อยที่สุด 40 % ของผู้โดยสารทั้งหมด เลือกใช้ BRT ยิ่งไปกว่านั้น ระเบียบวิธีที่ที่นำเสนอให้ผลดีกว่าวิธีระเบียบวิธีการจองแบบสามัญและระเบียบวิธีควบคุมผลต่างเชิงอนุพันธ์ที่มีลำดับความสำคัญเพราะความเพิ่มขึ้นของการตระหนักของราคาการเปลี่ยนแปลงของระดับสัญญาณ จากทุกการค้นพบทั้งหมดผลงานนี้มีคุณค่าที่จะสร้างผลงานเพื่อการพัฒนาระบบขนส่งมวลชนที่ยั่งยืนต่อไป
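The optimal green split for two conflicting flows, mentioned in the abstract for the M/M/1 case, can be illustrated with a small Python sketch. The model here is an assumption and may differ from the dissertation's derivation: each approach is treated as an M/M/1 queue whose service rate is its green share times a common saturation rate `mu`, and the objective is the flow-weighted mean time in system. Under that assumption the minimiser has the classical square-root form, where spare capacity is shared in proportion to the square roots of the arrival rates. The function names and numeric values are hypothetical.

```python
import math

def mean_delay(g, lam1, lam2, mu):
    """Flow-weighted mean time in system when approach 1 gets green share g.

    Assumes each approach behaves as an M/M/1 queue with service rate
    (green share) * mu; returns infinity if either queue is unstable.
    """
    r1, r2 = g * mu, (1.0 - g) * mu
    if r1 <= lam1 or r2 <= lam2:
        return math.inf
    return (lam1 / (r1 - lam1) + lam2 / (r2 - lam2)) / (lam1 + lam2)

def optimal_split(lam1, lam2, mu):
    """Closed-form minimiser of mean_delay under the assumptions above.

    Setting the derivative of mean_delay to zero gives
    (g*mu - lam1) / sqrt(lam1) = ((1-g)*mu - lam2) / sqrt(lam2),
    so spare capacity is split in the ratio sqrt(lam1) : sqrt(lam2).
    """
    spare = mu - lam1 - lam2
    if spare <= 0:
        raise ValueError("total demand must be below the saturation rate")
    share = math.sqrt(lam1) / (math.sqrt(lam1) + math.sqrt(lam2))
    return (lam1 + spare * share) / mu

# Cross-check the closed form against a brute-force grid search.
g_star = optimal_split(0.3, 0.1, 1.0)
grid = [i / 1000 for i in range(1, 1000)]
g_grid = min(grid, key=lambda g: mean_delay(g, 0.3, 0.1, 1.0))
print(round(g_star, 3), g_grid)
```

The grid search only confirms the closed form under this simplified time-sharing model. It ignores cycle structure, lost time at phase changes, and the D/D/1 case that the abstract also covers.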
Description:	Thesis (Ph.D.)--Chulalongkorn University, 2012
Degree Name:	Doctor of Engineering
Degree Level:	Doctoral Degree
Degree Discipline:	Electrical Engineering
URI:	http://cuir.car.chula.ac.th/handle/123456789/36735
URI:	http://doi.org/10.14457/CU.the.2012.917
metadata.dc.identifier.DOI:	10.14457/CU.the.2012.917
Type:	Thesis
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
pitipong_ch.pdf		2.51 MB	Adobe PDF
