Rethinking visual prompting for multimodal large language models with external knowledge

In recent years, multimodal large language models (MLLMs) have made significant strides by training on vast high-quality image-text datasets, enabling them to generally understand images well. However, the inherent difficulty in explicitly conveying fine-grained or spatially dense information in tex...

Bibliographic Details
Main Authors:	Lin, Y; Li, Y; Chen, D; Xu, W; Clark, R; Torr, P; Yuan, L
Format:	Internet publication
Language:	English
Published:	2024

Similar Items

Prompting Large Language Models with Knowledge-Injection for Knowledge-Based Visual Question Answering
by: Zhongjian Hu, et al.
Published: (2024-09-01)

Knowledge graph construction for heart failure using large language models with prompt engineering
by: Tianhan Xu, et al.
Published: (2024-07-01)

Prompt Optimization in Large Language Models
by: Antonio Sabbatella, et al.
Published: (2024-03-01)

CAT: enhancing multimodal large language model to answer questions in dynamic audio-visual scenarios
by: Ye, Q, et al.
Published: (2024)

Review of large vision models and visual prompt engineering
by: Jiaqi Wang, et al.
Published: (2023-11-01)

A unified prompt-based framework for few-shot multimodal language analysis
by: Xiaohan Zhang, et al.
Published: (2025-06-01)

Learning visual prompts for guiding the attention of vision transformers
by: Rezaei, R, et al.
Published: (2024)

REKP: Refined External Knowledge into Prompt-Tuning for Few-Shot Text Classification
by: Yuzhuo Dang, et al.
Published: (2023-11-01)

Improving language model predictions via prompts enriched with knowledge graphs
by: Brate, R, et al.
Published: (2023)

Aligning, autoencoding and prompting large language models for novel disease reporting
by: Liu, F, et al.
Published: (2025)

uCAP: an unsupervised prompting method for vision-language models
by: Nguyen, AT, et al.
Published: (2024)

Predictive Prompts with Joint Training of Large Language Models for Explainable Recommendation
by: Ching-Sheng Lin, et al.
Published: (2023-10-01)

Extracting Fruit Disease Knowledge from Research Papers Based on Large Language Models and Prompt Engineering
by: Yunqiao Fei, et al.
Published: (2025-01-01)

Balancing Privacy and Robustness in Prompt Learning for Large Language Models
by: Chiyu Shi, et al.
Published: (2024-10-01)

Response Generated by Large Language Models Depends on the Structure of the Prompt
by: Pradosh Kumar Sarangi, et al.
Published: (2024-07-01)

Prompt Engineering: Guiding the Way to Effective Large Language Models
by: Mohammad Aljanabi, et al.
Published: (2023-11-01)

An image is worth 1000 lies: adversarial transferability across prompts on vision-language models
by: Luo, H, et al.
Published: (2024)

A Brief Overview of Few-Shot Prompting in the Large Language Models
by: Vladlen Kulikov, et al.
Published: (2023-05-01)

Diagnostic reasoning prompts reveal the potential for large language model interpretability in medicine
by: Thomas Savage, et al.
Published: (2024-01-01)

The application of multimodal large language models in medicine
by: Jianing Qiu, et al.
Published: (2024-04-01)

Clinical prompt learning with frozen language models
by: Taylor, N, et al.
Published: (2023)

LLMR: Real-time Prompting of Interactive Worlds using Large Language Models
by: De La Torre, Fernanda, et al.
Published: (2024)

Large language model enhanced with prompt-based vanilla distillation for sentence embeddings
by: Wang, Minghao
Published: (2024)

Large multimodal models for visual reasoning
by: Duong, Ngoc Yen
Published: (2024)

Intelligent extraction of reservoir dispatching information integrating large language model and structured prompts
by: Yangrui Yang, et al.
Published: (2024-06-01)

A Security Risk Taxonomy for Prompt-Based Interaction With Large Language Models
by: Erik Derner, et al.
Published: (2024-01-01)

DetToolChain: a new prompting paradigm to unleash detection ability of MLLM
by: Wu, Y, et al.
Published: (2024)

Research and application of defense mechanism for prompt injection attack of large language model in financial industry
by: MOU Daen, et al.
Published: (2024-10-01)

A medical multimodal large language model for future pandemics
by: Liu, F, et al.
Published: (2023)

On the legal implications of Large Language Model answers: A prompt engineering approach and a view beyond by exploiting Knowledge Graphs
by: George Hannah, et al.
Published: (2025-01-01)

Rethinking Language
by: Gastor Mapunda, et al.
Published: (2024-09-01)

Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation
by: Cyril Chhun, et al.
Published: (2024-09-01)

Harnessing multimodal large language models for traffic knowledge graph generation and decision-making
by: Senyun Kuang, et al.
Published: (2024-12-01)

PromptSMILES: prompting for scaffold decoration and fragment linking in chemical language models
by: Morgan Thomas, et al.
Published: (2024-07-01)

The influence of knowledge visualization on externalizing tacit knowledge
by: Ahmad, Khairul Bariah, et al.
Published: (2011)

Rethinking of Coase Theorem: Externalities and Uncertainty
by: Evgeny A. Kuzmin, et al.
Published: (2015-10-01)

Rethinking of Coase Theorem: Externalities and Uncertainty
by: Evgeny A. Kuzmin, et al.
Published: (2015-12-01)

Teaching English as a Foreign Language: Rethinking the Multimodality and Communication Skills in the 21st Century
by: Liudmyla Byrkun
Published: (2023-12-01)