Rethinking visual prompting for multimodal large language models with external knowledge

In recent years, multimodal large language models (MLLMs) have made significant strides by training on vast high-quality image-text datasets, enabling them to generally understand images well. However, the inherent difficulty in explicitly conveying fine-grained or spatially dense information in tex...

Bibliographic details
Main authors:	Lin, Y, Li, Y, Chen, D, Xu, W, Clark, R, Torr, P, Yuan, L
Format:	Internet publication
Language:	English
Published:	2024

Similar items

Prompting Large Language Models with Knowledge-Injection for Knowledge-Based Visual Question Answering
By: Zhongjian Hu, et al.
Published: (2024-09-01)

Knowledge graph construction for heart failure using large language models with prompt engineering
By: Tianhan Xu, et al.
Published: (2024-07-01)

Prompt Optimization in Large Language Models
By: Antonio Sabbatella, et al.
Published: (2024-03-01)

CAT: enhancing multimodal large language model to answer questions in dynamic audio-visual scenarios
By: Ye, Q, et al.
Published: (2024)

Review of large vision models and visual prompt engineering
By: Jiaqi Wang, et al.
Published: (2023-11-01)

A unified prompt-based framework for few-shot multimodal language analysis
By: Xiaohan Zhang, et al.
Published: (2025-06-01)

Learning visual prompts for guiding the attention of vision transformers
By: Rezaei, R, et al.
Published: (2024)

REKP: Refined External Knowledge into Prompt-Tuning for Few-Shot Text Classification
By: Yuzhuo Dang, et al.
Published: (2023-11-01)

Improving language model predictions via prompts enriched with knowledge graphs
By: Brate, R, et al.
Published: (2023)

Aligning, autoencoding and prompting large language models for novel disease reporting
By: Liu, F, et al.
Published: (2025)

uCAP: an unsupervised prompting method for vision-language models
By: Nguyen, AT, et al.
Published: (2024)

Predictive Prompts with Joint Training of Large Language Models for Explainable Recommendation
By: Ching-Sheng Lin, et al.
Published: (2023-10-01)

Extracting Fruit Disease Knowledge from Research Papers Based on Large Language Models and Prompt Engineering
By: Yunqiao Fei, et al.
Published: (2025-01-01)

Balancing Privacy and Robustness in Prompt Learning for Large Language Models
By: Chiyu Shi, et al.
Published: (2024-10-01)

Response Generated by Large Language Models Depends on the Structure of the Prompt
By: Pradosh Kumar Sarangi, et al.
Published: (2024-07-01)

Prompt Engineering: Guiding the Way to Effective Large Language Models
By: Mohammad Aljanabi, et al.
Published: (2023-11-01)

An image is worth 1000 lies: adversarial transferability across prompts on vision-language models
By: Luo, H, et al.
Published: (2024)

A Brief Overview of Few-Shot Prompting in the Large Language Models
By: Vladlen Kulikov, et al.
Published: (2023-05-01)

Diagnostic reasoning prompts reveal the potential for large language model interpretability in medicine
By: Thomas Savage, et al.
Published: (2024-01-01)

The application of multimodal large language models in medicine
By: Jianing Qiu, et al.
Published: (2024-04-01)

Clinical prompt learning with frozen language models
By: Taylor, N, et al.
Published: (2023)

LLMR: Real-time Prompting of Interactive Worlds using Large Language Models
By: De La Torre, Fernanda, et al.
Published: (2024)

Large language model enhanced with prompt-based vanilla distillation for sentence embeddings
By: Wang, Minghao
Published: (2024)

Large multimodal models for visual reasoning
By: Duong, Ngoc Yen
Published: (2024)

Intelligent extraction of reservoir dispatching information integrating large language model and structured prompts
By: Yangrui Yang, et al.
Published: (2024-06-01)

A Security Risk Taxonomy for Prompt-Based Interaction With Large Language Models
By: Erik Derner, et al.
Published: (2024-01-01)

DetToolChain: a new prompting paradigm to unleash detection ability of MLLM
By: Wu, Y, et al.
Published: (2024)

Research and application of defense mechanism for prompt injection attack of large language model in financial industry
By: MOU Daen, et al.
Published: (2024-10-01)

A medical multimodal large language model for future pandemics
By: Liu, F, et al.
Published: (2023)

On the legal implications of Large Language Model answers: A prompt engineering approach and a view beyond by exploiting Knowledge Graphs
By: George Hannah, et al.
Published: (2025-01-01)

Rethinking Language
By: Gastor Mapunda, et al.
Published: (2024-09-01)

Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation
By: Cyril Chhun, et al.
Published: (2024-09-01)

Harnessing multimodal large language models for traffic knowledge graph generation and decision-making
By: Senyun Kuang, et al.
Published: (2024-12-01)

PromptSMILES: prompting for scaffold decoration and fragment linking in chemical language models
By: Morgan Thomas, et al.
Published: (2024-07-01)

The influence of knowledge visualization on externalizing tacit knowledge
By: Ahmad, Khairul Bariah, et al.
Published: (2011)

Rethinking of Coase Theorem: Externalities and Uncertainty
By: Evgeny A. Kuzmin, et al.
Published: (2015-10-01)

Rethinking of Coase Theorem: Externalities and Uncertainty
By: Evgeny A. Kuzmin, et al.
Published: (2015-12-01)

TEACHING ENGLISH AS A FOREIGN LANGUAGE: RETHINKING THE MULTIMODALITY AND COMMUNICATION SKILLS IN THE 21st CENTURY
By: Liudmyla Byrkun
Published: (2023-12-01)