LLMLight: Large Language Models as Traffic Signal Control Agents

The Hong Kong University of Science and Technology (Guangzhou)
Under Review

*Corresponding Author

The demonstration video of LLMLight (powered by LightGPT).

Abstract

Traffic Signal Control (TSC) is a crucial component in urban traffic management, aiming to optimize road network efficiency and reduce congestion. Traditional methods in TSC, primarily based on transportation engineering and reinforcement learning (RL), often exhibit limitations in generalization across varied traffic scenarios and lack interpretability. This paper presents LLMLight, a novel framework employing Large Language Models (LLMs) as decision-making agents for TSC. Specifically, the framework begins by instructing the LLM with a knowledgeable prompt detailing real-time traffic conditions. Leveraging the advanced generalization capabilities of LLMs, LLMLight engages a reasoning and decision-making process akin to human intuition for effective traffic control. Moreover, we build LightGPT, a specialized backbone LLM tailored for TSC tasks. By learning nuanced traffic patterns and control strategies, LightGPT enhances the LLMLight framework cost-effectively. Extensive experiments on nine real-world and synthetic datasets showcase the remarkable effectiveness, generalization ability, and interpretability of LLMLight against nine transportation-based and RL-based baselines.

Motivation

Existing research on TSC falls primarily into two categories: transportation-based and reinforcement learning (RL)-based approaches. Transportation-based methods focus on crafting efficient heuristic algorithms that dynamically adapt traffic signal configurations to lane-level traffic conditions. However, these methods rely heavily on manual design and demand substantial human effort. The emergence of deep neural networks (DNNs) led to deep RL-based techniques that address this challenge, and these approaches have performed well across various traffic scenarios. Nevertheless, RL-based methods have their own drawbacks. First, their generalization ability can be limited, particularly when transferring to larger-scale road networks or to highly uncommon scenarios (e.g., extreme high-traffic situations), because their training data covers only a limited range of traffic situations. Second, RL-based methods lack interpretability due to the black-box nature of DNNs, which makes it hard to explain the rationale behind their control actions under specific traffic conditions.

Challenge 1

First, LLMs are typically pre-trained on large-scale natural language corpora and rarely incorporate non-textual traffic data, such as sensor readings and GPS trajectories. Despite their generalization capability across various tasks and domains, an inherent gap exists between real-time traffic data and linguistic understanding. The first challenge lies in enabling LLMs to comprehend real-time traffic dynamics and effectively interact with the traffic environment, which is critical for effective LLM-based traffic signal control.

Challenge 2

Second, selecting and developing an effective LLM for TSC poses another significant challenge. Generalist LLMs often lack specific domain knowledge and are prone to hallucination problems in professional fields. Although state-of-the-art LLMs such as GPT-4 demonstrate promising generalization abilities, their closed-source nature and substantial usage costs pose barriers to their optimization for real-time TSC tasks. Consequently, building a specialized LLM tailored for TSC tasks is crucial to deliver more effective and human-aligned control policies.

Workflow

To this end, we introduce LLMLight, a traffic signal control agent framework based on LLMs, as depicted in the figure. Specifically, we consider TSC as a partially observable Markov Game, where each agent, armed with an LLM, manages the traffic light at an intersection. At each signal-switching time step, the agent collects traffic conditions of the target intersection and transforms them into human-readable text as real-time observation. Additionally, we incorporate task descriptions enriched with commonsense knowledge about a control strategy to aid the LLM's understanding of traffic management tasks. The combined real-time observation, task description, and control action space form a knowledgeable prompt that guides the LLM's decision-making. Then, the agent leverages Chain-of-Thought (CoT) reasoning to determine the optimal traffic signal configuration for the subsequent time step.

Figure: The LLMLight workflow.
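The observation-to-prompt step and the action-parsing step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the phase codes, lane names, prompt wording, and the `<signal>` answer tag are all assumptions made for the example, and the LLM call itself is left out.

```python
import re
from dataclasses import dataclass
from typing import Dict

# Hypothetical signal phases and the lanes each one releases (illustrative only).
PHASES = {
    "ETWT": ("east_through", "west_through"),
    "NTST": ("north_through", "south_through"),
    "ELWL": ("east_left", "west_left"),
    "NLSL": ("north_left", "south_left"),
}


@dataclass
class LaneState:
    queued: int       # vehicles waiting at the stop line
    approaching: int  # vehicles still moving toward the intersection


def build_prompt(observation: Dict[str, LaneState]) -> str:
    """Turn a lane-level observation into a human-readable prompt."""
    lines = [
        "You control the traffic signal at one four-way intersection.",
        "A signal releases the lanes listed under it; long queues cause congestion.",
        "Current lane states:",
    ]
    for phase, lanes in PHASES.items():
        lines.append(f"Signal {phase} releases:")
        for lane in lanes:
            state = observation[lane]
            lines.append(
                f"- {lane}: {state.queued} queued, {state.approaching} approaching"
            )
    lines.append(
        "Reason step by step, then answer with one signal, "
        "for example <signal>ETWT</signal>."
    )
    return "\n".join(lines)


def parse_action(response: str, default: str = "ETWT") -> str:
    """Extract the chosen phase from the model's reply, falling back to a default."""
    match = re.search(r"<signal>\s*([A-Z]+)\s*</signal>", response)
    if match and match.group(1) in PHASES:
        return match.group(1)
    return default
```

In a control loop, the agent would call `build_prompt` at each signal-switching step, send the text to the backbone LLM, and pass the reply to `parse_action`. The fallback default is one possible way to handle replies that do not contain a valid signal.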

LightGPT Training

Furthermore, we construct a specialized LLM, LightGPT, to enhance the LLMLight framework. Training has two stages. First, we propose imitation fine-tuning to let the specialized LLM exploit high-quality control actions and underlying rationales derived from GPT-4. Second, we introduce a policy refinement process that uses a well-trained critic model to evaluate and improve the agent's actions. The optimized LightGPT produces more effective control policies and showcases remarkable generalization ability across diverse traffic scenarios in a more cost-effective manner than GPT-4.

Figure: The LightGPT training procedure.
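One way to picture how a critic could support both stages is sketched below. This is an assumption-laden toy, not the training procedure from the paper: the critic here is a hand-written stand-in that scores a signal by the queue it would release, and the phase codes, lane names, and function names are hypothetical. The sketch covers only data selection and ranking; the actual fine-tuning of the LLM is out of scope.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical phases and lanes, reduced to two phases for brevity.
PHASE_LANES = {
    "ETWT": ("east_through", "west_through"),
    "NTST": ("north_through", "south_through"),
}

State = Dict[str, int]            # lane name -> queued vehicles (toy state)
Sample = Tuple[State, str, str]   # (state, rationale text, chosen action)
Critic = Callable[[State, str], float]


def toy_critic(state: State, action: str) -> float:
    """Stand-in critic: score an action by the number of queued vehicles it releases."""
    return float(sum(state[lane] for lane in PHASE_LANES[action]))


def select_for_imitation(samples: List[Sample], critic: Critic) -> List[Sample]:
    """Keep teacher samples whose action is the one the critic rates highest."""
    kept = []
    for state, rationale, action in samples:
        best = max(PHASE_LANES, key=lambda a: critic(state, a))
        if action == best:
            kept.append((state, rationale, action))
    return kept


def rank_candidates(state: State, candidates: List[str], critic: Critic) -> List[str]:
    """Order the agent's candidate actions from best to worst by critic score."""
    return sorted(candidates, key=lambda a: critic(state, a), reverse=True)
```

The filtered samples would serve as imitation fine-tuning data, and the ranked candidates could feed a preference-style refinement objective. In practice the critic would be a trained model rather than the queue-length heuristic used here.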

Comparison Between LLMLight and Traditional Methods

Figure: Comparison between LLMLight and traditional methods.

Generalization Ability

Prompt & Decision-making Process

BibTeX

@misc{lai2024llmlight,
      title={LLMLight: Large Language Models as Traffic Signal Control Agents},
      author={Siqi Lai and Zhao Xu and Weijia Zhang and Hao Liu and Hui Xiong},
      year={2024},
      eprint={2312.16044},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}