Analyzing NYPD stop, question, and frisk with machine learning techniques

Passiri Bodhidatta

dc.contributor.advisor	Daricha Sutivong
dc.contributor.author	Passiri Bodhidatta
dc.contributor.other	Chulalongkorn University. Faculty of Engineering
dc.date.accessioned	2022-11-02T09:44:55Z
dc.date.available	2022-11-02T09:44:55Z
dc.date.issued	2021
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/80853
dc.description	Thesis (M.Eng.)--Chulalongkorn University, 2021
dc.description.abstract	Although stops under the “Stop, Question, and Frisk” program have decreased dramatically since the New York Police Department (NYPD) reform in 2013, unnecessary stops and weapon use against innocent citizens remain critical problems. This study analyzes stops made from 2014 to 2019 using three tree-based machine learning approaches: Decision Tree, Random Forest, and XGBoost. Two models are developed, one predicting whether a stop resulted in a conviction (Guilty Prediction) and one predicting the level of force used by police (Level of Force Prediction), and the driving factors behind each are identified. XGBoost outperformed the other models in both predictions. For Guilty Prediction, it achieved a 65.9% F1 score and 84.0% accuracy. For Level of Force Prediction, it achieved F1 scores of 40.7% for “Level 1” and 35.0% for “Level 2”, with 80.4% overall accuracy. The findings indicate that the presence of a weapon points to a suspect’s conviction. Even so, numerous unnecessary stops are likely driven by inaccurate assumptions about a suspect’s weapon possession, which lead to police use of firearms against innocent citizens. Additionally, this study explores a hybrid technique called Super Learner, with experiments on various Super Learner structures. With respect to base models, a Super Learner improves on its own base models when they are untuned but not when they are tuned. Base-model performance also plays a significant role in Super Learner performance: high-performing base models improve the meta model’s performance, and low-performing ones reduce it. Among meta models, XGBoost and Logistic Regression outperform the others across both predictions.
dc.description.abstractalternative	Although stops under the stop, question, and frisk operation have decreased considerably since the reform of the New York Police Department in 2013, unnecessary stops and the use of weapons against innocent citizens remain important problems. This study analyzes stops made during 2014-2019 using three types of tree-based machine learning, namely Decision Tree, Random Forest, and XGBoost, to build models that predict whether a stop involved an offense and that predict the level of force used by police, and to identify the contributing factors. The results show that XGBoost performed better than the other models on both prediction problems. For guilt prediction, it obtained an F1 score of 65.9% and an accuracy of 84.0%. For prediction of the level of force used by police, it obtained F1 scores of 40.7% for Level 1 and 35.0% for Level 2, with an overall accuracy of 80.4%. The results indicate that the presence of a weapon signals that the suspect committed an offense. Nevertheless, police may make inaccurate assumptions about a suspect's weapon possession, which may lead to stops and to the use of firearms against innocent citizens. In addition, this study examines a hybrid technique called Super Learner, experimenting with a variety of structures. It finds that a Super Learner improves on its own base models when untuned base models are used, but does not improve much when the base models have already been tuned. The predictive ability of the base models is also a major factor in the predictive ability of the Super Learner: base models with good performance help improve the performance of the meta model, and vice versa. Finally, meta models using XGBoost and Logistic Regression performed better than the other models on both prediction problems.
dc.language.iso	en
dc.publisher	Chulalongkorn University
dc.relation.uri	http://doi.org/10.58837/CHULA.THE.2021.200
dc.rights	Chulalongkorn University
dc.subject.classification	Computer Science
dc.subject.classification	Social Sciences
dc.title	Analyzing NYPD stop, question, and frisk with machine learning techniques
dc.title.alternative	An analysis of the New York Police Department's stop, question, and frisk operations using machine learning techniques
dc.type	Thesis
dc.degree.name	Master of Engineering
dc.degree.level	Master's Degree
dc.degree.discipline	Industrial Engineering
dc.degree.grantor	Chulalongkorn University
dc.identifier.DOI	10.58837/CHULA.THE.2021.200