Efficient inference offloading for mixture-of-experts large language models in internet of medical things
Despite recent significant advancements in large language models (LLMs) for medical services, the difficulty of deploying LLMs in e-healthcare hinders complex medical applications in the Internet of Medical Things (IoMT). People are increasingly concerned about e-healthcare risks and privacy prote...
Main Authors: Yuan, Xiaoming; Kong, Weixuan; Luo, Zhenyu; Xu, Minrui
Other Authors: School of Computer Science and Engineering
Format: Journal Article
Language: English
Published: 2024
Online Access: https://hdl.handle.net/10356/179743
Similar Items
- Computation offloading and content caching and delivery in Vehicular Edge Network: a survey
  by: Dziyauddin, Rudzidatul Akmam, et al.
  Published: (2022)
- Visible light based occupancy inference using ensemble learning
  by: Hao, Jie, et al.
  Published: (2018)
- Towards formal verification of Bayesian inference in probabilistic programming via guaranteed bounds
  by: Zaiser, F
  Published: (2024)
- Internet of Things: Principles and Paradigms
  by: Buyya, Rajkumar, editor, et al.
  Published: (2016)
- Arbitrarily strong utility-privacy tradeoff in multi-agent systems
  by: Wang, Chong Xiao, et al.
  Published: (2021)