Estimating Point of Interest (POI) Visit Demand using Location-Based Services (LBS) Data and Large Language Models (LLMs)

Term Start:

June 1, 2025

Term End:

May 31, 2026

Budget:

$90,000

Keywords:

Data Collection, Large Language Models, Location-Based Services Data, Machine Learning, Travel Demand

Thrust Area(s):

Data Modeling and Analytic Tools

University Lead:

University of Washington

Researcher(s):

Lyra Chen; Cynthia Chen

Estimating demand at points of interest (POIs) involves predicting the number of visitors to specific locations, such as restaurants, retail stores, parks, or cultural sites like museums. Unlike traditional travel demand models, which focus on large zones (e.g., Transportation Analysis Zones (TAZs) or census tracts) for long-term planning of, for example, transit networks, POI visit estimation targets individual sites at a much smaller spatial scale. This granularity is critical for short-term decisions: local Departments of Transportation (DOTs) need to assess how traffic and pedestrian patterns change after a street segment is reconfigured or a pedestrian plaza is built, while businesses rely on visitor forecasts to optimize staffing, inventory, and site selection. Technology has produced large volumes of Location-Based Services (LBS) data from GPS traces and mobile app check-ins, enabling fine-grained analysis of POI visits across time and space.

The proposed research will develop uncertainty-aware models to estimate POI visit demand, incorporating recent advances in Large Language Models (LLMs). It aims to build integrated modeling frameworks that combine LBS data, POI metadata (e.g., from OpenStreetMap or SafeGraph), and contextual information (e.g., land use, weather). The work with LLMs (such as GPT-3 or open-source models) has two main tasks: 1) Semantic disambiguation of LBS pings in dense or overlapping POI areas, using LLM reasoning over the available context to infer which POI a given ping most likely corresponds to. 2) Performance evaluation and error analysis of the LLM-based approaches, to identify the conditions under which they do and do not work. The models will be validated against ground-truth data (manual visit counts, ticket sales, or sensor data) collected in contrasting urban environments, to test how well they generalize across different spatial structures and POI distributions.
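To make the idea of uncertainty-aware ping attribution concrete, the following is a minimal sketch of a simple distance-based baseline, not the project's method and not an LLM-based approach. It assigns each ping a probability distribution over nearby candidate POIs instead of a single hard label, then sums those probabilities into expected visit counts. The function names, the `scale_m` parameter, and the coordinates are all hypothetical and chosen only for illustration.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def assign_probabilities(ping, pois, scale_m=25.0):
    """Return {poi_name: probability} for one ping.

    Assumption: closer POIs are more likely, with weights decaying as
    exp(-distance / scale_m). scale_m is a hypothetical tuning parameter.
    """
    scores = {
        name: -haversine_m(ping[0], ping[1], lat, lon) / scale_m
        for name, (lat, lon) in pois.items()
    }
    top = max(scores.values())  # subtract the max score for numerical stability
    weights = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def expected_visits(pings, pois, scale_m=25.0):
    """Sum per-ping probabilities into expected visit counts per POI."""
    totals = {name: 0.0 for name in pois}
    for ping in pings:
        for name, p in assign_probabilities(ping, pois, scale_m).items():
            totals[name] += p
    return totals


if __name__ == "__main__":
    # Hypothetical coordinates for two adjacent POIs and two pings.
    pois = {"cafe": (47.6553, -122.3035), "bookstore": (47.6556, -122.3030)}
    pings = [(47.6553, -122.3034), (47.6556, -122.3031)]
    print(assign_probabilities(pings[0], pois))
    print(expected_visits(pings, pois))
```

Because each ping contributes fractional probability mass rather than a single vote, the expected counts always sum to the number of pings, and ambiguous pings in dense areas are spread across candidates. An LLM-based disambiguation step of the kind proposed here could, under this framing, replace or reweight the distance-only scores using POI metadata and contextual information.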
