CE-CoLLM: Efficient and Adaptive Large Language Models Through Cloud-Edge Collaboration
Conference
Jin, H., & Wu, Y. (2025). CE-CoLLM: Efficient and Adaptive Large Language Models Through Cloud-Edge Collaboration. In 2025 IEEE International Conference on Web Services (ICWS), 316-323. DOI: 10.1109/ICWS67624.2025.00046
Large Language Models (LLMs) exhibit remarkable human-like predictive capabilities. However, it is challenging to deploy LLMs to provide efficient and adaptive inference services at the edge. This paper proposes a novel Cloud-Edge Collaboration framework for LLMs (CE-CoLLM) to tackle these challenges. First, we identify the transmission of LLM contextual data between the cloud and edge as a key performance bottleneck, which introduces substantial communication overhead that dominates overall inference latency and makes naïve cloud-edge collaboration for LLMs inefficient. Second, we introduce a suite of novel techniques, including a latency-aware early exit mechanism and efficient cloud context management, into CE-CoLLM, which collectively reduce communication overhead and preserve LLM inference accuracy. Third, we design two adaptive inference modes to accommodate diverse edge environments: (1) a low-latency standalone edge inference mode that enables reliable edge-side independent LLM inference even under unstable network conditions, and (2) a high-accuracy cloud-edge collaborative inference mode that adaptively leverages cloud resources to enhance prediction accuracy. Extensive experiments on multiple benchmark datasets demonstrate that CE-CoLLM reduces overall inference time by up to 13.81% and offloads over 84.53% of the computational workload from the cloud to the edge, compared to conventional cloud-based LLM deployment, without sacrificing prediction accuracy. The code is provided on GitHub at https://github.com/mlsysx/CE-CoLLM.
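To make the mode-selection idea concrete, below is a minimal, hypothetical Python sketch of how a latency-aware early-exit policy at the edge might choose between standalone edge inference and cloud-edge collaboration. All names, thresholds, and the decision logic here are illustrative assumptions for exposition only; they are not taken from the CE-CoLLM repository or paper implementation.

```python
import math
from dataclasses import dataclass

# Hypothetical sketch of a latency-aware early-exit decision at the edge.
# Names such as EdgeExitPolicy, confidence_threshold, and latency_budget_ms
# are illustrative assumptions, not identifiers from the CE-CoLLM codebase.

@dataclass
class EdgeExitPolicy:
    confidence_threshold: float = 0.9   # emit locally if the exit head is confident enough
    latency_budget_ms: float = 50.0     # avoid offloading if the cloud round trip is too slow

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def edge_generate_token(exit_logits, est_cloud_rtt_ms, policy, offload_to_cloud):
    """Decide whether to emit the edge prediction or offload to the cloud.

    exit_logits      -- logits from an intermediate exit head on the edge-side model
    est_cloud_rtt_ms -- current estimate of the cloud round-trip latency
    offload_to_cloud -- callable that sends only the new context to the cloud
                        (earlier context is assumed to be cached cloud-side)
                        and returns the cloud model's token prediction
    """
    probs = softmax(exit_logits)
    confidence = max(probs)

    # Case 1: the early-exit head is confident -> emit locally (low latency).
    if confidence >= policy.confidence_threshold:
        return probs.index(confidence), "edge-early-exit"

    # Case 2: the network is too slow or unavailable -> standalone edge mode.
    if est_cloud_rtt_ms > policy.latency_budget_ms:
        return probs.index(confidence), "edge-standalone"

    # Case 3: collaborate with the cloud for higher accuracy.
    return offload_to_cloud(), "cloud-edge-collaborative"

if __name__ == "__main__":
    policy = EdgeExitPolicy()
    fake_cloud = lambda: 42  # stand-in for a cloud-side prediction
    token, mode = edge_generate_token(
        exit_logits=[0.1, 2.5, 0.3],
        est_cloud_rtt_ms=30.0,
        policy=policy,
        offload_to_cloud=fake_cloud,
    )
    print(token, mode)
```

In this sketch, the edge device only pays the cloud round trip when its own exit head is uncertain and the network is fast enough, which mirrors the abstract's two adaptive modes: standalone edge inference under poor connectivity and cloud-edge collaboration when extra accuracy is worth the communication cost.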