vLLM is an inference and serving engine for large language models (LLMs). In affected versions before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). A single request with a penalty parameter (e.g., "repetition_penalty": 1.1) is sufficient to crash the server. This vulnerability is fixed in 0.20.0.
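To make the trigger condition concrete, the sketch below flags request bodies that set any of the three penalty parameters named in the description. It is an illustration only, not part of vLLM or of the advisory: the helper name `uses_penalty_parameters`, the example request fields, and the choice to treat the neutral values (1.0 for repetition_penalty, 0.0 for the other two) as "not using a penalty" are all assumptions.

```python
# Hypothetical helper, not a vLLM API: checks whether a JSON request body
# sets any of the penalty parameters named in the advisory.

# Assumption: a value equal to the neutral setting counts as "no penalty".
PENALTY_NEUTRAL_VALUES = {
    "repetition_penalty": 1.0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
}


def uses_penalty_parameters(request_body: dict) -> bool:
    """Return True if the body sets a penalty parameter to a non-neutral value."""
    for name, neutral in PENALTY_NEUTRAL_VALUES.items():
        value = request_body.get(name)
        if value is not None and value != neutral:
            return True
    return False


# Example body mirroring the advisory's "repetition_penalty": 1.1 case.
# The model name and prompt are placeholders.
example_request = {
    "model": "example-model",
    "prompt": "Hello",
    "repetition_penalty": 1.1,
}
```

A check like this could be used to audit logged traffic for requests matching the described trigger; it does not by itself confirm that a given deployment is affected.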
Advisories
| Source | ID | Title |
|---|---|---|
| Github GHSA | GHSA-83vm-p52w-f9pw | vLLM: extract_hidden_states speculative decoding crashes server on any request with penalty parameters |
Fixes
Solution
No solution given by the vendor.
Workaround
No workaround given by the vendor.
References
History
Tue, 12 May 2026 23:30:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| First Time appeared | | Vllm-project, Vllm-project vllm |
| Vendors & Products | | Vllm-project, Vllm-project vllm |
Tue, 12 May 2026 20:15:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| Description | | vLLM is an inference and serving engine for large language models (LLMs). In affected versions before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). A single request with a penalty parameter (e.g., "repetition_penalty": 1.1) is sufficient to crash the server. This vulnerability is fixed in 0.20.0. |
| Title | | vLLM: extract_hidden_states speculative decoding crashes server on any request with penalty parameters |
| Weaknesses | | CWE-131, CWE-704 |
| References | | |
| Metrics | | cvssV3_1 |
Status: PUBLISHED
Assigner: GitHub_M
Published:
Updated: 2026-05-12T19:58:40.862Z
Reserved: 2026-05-05T15:42:40.518Z
Link: CVE-2026-44223
No data.
Status : Awaiting Analysis
Published: 2026-05-12T20:16:43.293
Modified: 2026-05-13T18:16:08.537
Link: CVE-2026-44223
No data.
OpenCVE Enrichment
Updated: 2026-05-12T23:15:26Z