Inducing high energy-latency of large vision-language models with verbose images

Large vision-language models (VLMs) such as GPT-4 have achieved exceptional performance across various multi-modal tasks. However, the deployment of VLMs necessitates substantial energy consumption and computational resources. Once attackers maliciously induce high energy consumption and latency tim...

Bibliographic details
Main authors:	Gao, K, Bai, Y, Gu, J, Xia, ST, Torr, P, Li, Z, Liu, W
Format:	Conference item
Language:	English
Published:	OpenReview 2024
