[[Paper Reading] MoE, "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer"](https://blog.csdn.net/bylander/article/details/138139345) (a gating sketch follows these links)
[[Paper Quick Read] MoD, "Mixture-of-Depths: Dynamically Allocating Compute in Transformer-Based Language Models"](https://blog.csdn.net/bylander/article/details/139536003)
[Mixture of Depths paper walkthrough](https://zhuanlan.zhihu.com/p/691324301)
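The sparsely-gated MoE layer in the paper above routes each token to a small top-k subset of experts. Below is a minimal PyTorch-style sketch of that top-k gating, assuming 2-D token inputs; the paper's noise term and load-balancing loss are omitted, and all class and variable names are illustrative, not from the paper's implementation.

```python
# Minimal sketch of top-k sparse gating (MoE); illustrative names only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKGate(nn.Module):
    def __init__(self, d_model: int, n_experts: int, k: int = 2):
        super().__init__()
        self.w_gate = nn.Linear(d_model, n_experts, bias=False)
        self.k = k

    def forward(self, x: torch.Tensor):
        logits = self.w_gate(x)                      # (tokens, n_experts)
        topk_val, topk_idx = logits.topk(self.k, dim=-1)
        # Softmax over the k selected experts only; the rest get weight 0.
        weights = F.softmax(topk_val, dim=-1)
        gates = torch.zeros_like(logits).scatter_(-1, topk_idx, weights)
        return gates, topk_idx

class MoELayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int, k: int = 2):
        super().__init__()
        self.gate = TopKGate(d_model, n_experts, k)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gates, _ = self.gate(x)                      # (tokens, n_experts)
        # Dense loop for clarity; real systems dispatch tokens sparsely.
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            w = gates[:, i:i + 1]                    # (tokens, 1)
            mask = w.squeeze(-1) > 0                 # tokens routed here
            if mask.any():
                out[mask] += w[mask] * expert(x[mask])
        return out

x = torch.randn(8, 64)                               # 8 tokens, d_model=64
layer = MoELayer(d_model=64, d_ff=128, n_experts=4, k=2)
print(layer(x).shape)                                # torch.Size([8, 64])
```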
| 2024/01/16 | Multimodal | [A Survey of Resource-efficient LLM and Multimodal Foundation Models](https://arxiv.org/abs/2401.08092) | Mengwei Xu | [UbiquitousLearning](https://github.com/UbiquitousLearning/Efficient_Foundation_Model_Survey) | A survey of resource-efficient large language models and multimodal foundation models, covering innovations on both the algorithm and system sides, including efficient model architectures, training algorithms, inference algorithms, and model compression. |
| 2024/04/18 | Efficiency | [The Efficiency Spectrum of Large Language Models: An Algorithmic Survey](https://arxiv.org/abs/2312.00678) | Tianyu Ding | [tding1](https://github.com/tding1/Efficient-LLM-Survey) | A comprehensive survey of algorithms for improving LLM efficiency, covering scaling laws, data utilization, architectural innovations, training and tuning strategies, and inference techniques. |
| 2024/05/23 | LLMs | [Efficient Large Language Models: A Survey](https://arxiv.org/abs/2312.03863) | Zhongwei Wan | [AIoT-MLSys-Lab](https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey) | A systematic and comprehensive review of efficient-LLM research, organizing the literature into a taxonomy of three main categories (model-centric, data-centric, and framework-centric) that covers distinct but interconnected topics, and reviewing algorithm-level and system-level efficiency techniques from the model-centric and data-centric perspectives. Details the concrete techniques under each category, e.g. quantization, pruning, knowledge distillation, data selection, and prompt engineering.<br>1. [Zhihu -- 黄浴 -- Efficient Large Language Models: A Survey](https://zhuanlan.zhihu.com/p/671710012)<br>2. [Zhihu -- 磐石 -- Efficient LLM Inference I: A Summary of Inference Technique Frameworks](https://zhuanlan.zhihu.com/p/696850285)<br>3. [Zhihu -- 享享学AI -- A Roundup of LLM Fine-tuning Methods](https://zhuanlan.zhihu.com/p/673675939) |
| 2024/04/22 | Survey | [A Survey on Efficient Inference for Large Language Models](https://arxiv.org/abs/2404.14294) | Zixuan Zhou | NA | 1. [How to Accelerate LLM Inference? A Comprehensive Survey of Efficient Inference Techniques](https://www.sohu.com/a/790365299_121119001)<br>2. [Zhihu -- 罗清雨 -- A Survey of Efficient LLM Inference](https://zhuanlan.zhihu.com/p/707685591) |
| 2023/06/23 | Multimodal | [A Survey on Multimodal Large Language Models](https://arxiv.org/abs/2306.13549) | Shukang Yin | [BradyFU](https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models) | This survey mainly covers multimodal hallucination, Multimodal In-Context Learning (M-ICL), Multimodal Chain of Thought (M-CoT), and LLM-Aided Visual Reasoning (LAVR). |
| 2024/07/26 | Model compression | [Comprehensive Study on Performance Evaluation and Optimization of Model Compression: Bridging Traditional Deep Learning and Large Language Models](https://arxiv.org/abs/2407.15904) | Aayush Saxena | NA | Deep learning models have seen great success across most industries in recent years, but their growth has also driven up model size and energy requirements, making deployment on low-compute devices difficult. With ever more connected devices worldwide, compressed models can be deployed on local devices with low compute capacity and power access. Researchers have proposed a wide range of solutions to reduce the size and complexity of such models, notably weight quantization, parameter pruning, network pruning, low-rank representation, weight sharing, neural architecture search, and knowledge distillation. This study surveys the performance impact of quantization and pruning on a variety of trained deep learning models used in image classification, object detection, language modeling, and generative tasks, and also examines the performance of various large language models after quantization and low-rank adaptation. Standard evaluation metrics (model size, accuracy, inference time) are used throughout, and the paper closes with a discussion of challenges and future work. |
| 2024/06/04 | Speculative decoding | [Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding](https://arxiv.org/abs/2401.07851) | Heming Xia | [hemingkx/SpeculativeDecodingPapers](https://github.com/hemingkx/SpeculativeDecodingPapers) | [COLING 2025 Tutorial: Speculative Decoding for Efficient LLM Inference](https://speculative-decoding.github.io), [Zhihu -- A New Paradigm for LLM Inference Acceleration! The Latest Survey of Speculative Decoding](https://zhuanlan.zhihu.com/p/678404136) (a minimal decoding sketch follows this table) |
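The speculative decoding survey in the table above centers on a draft-then-verify loop: a cheap draft model proposes several tokens, and the target model verifies them in one pass, keeping the longest matching prefix. The sketch below shows that loop for greedy decoding only, with toy stand-in "models" (any callable mapping a prefix to the argmax next token); it omits the rejection-sampling rule used for stochastic decoding, and all function names are illustrative.

```python
# Minimal greedy speculative decoding sketch; toy models, illustrative names.
from typing import Callable, List

Model = Callable[[List[int]], int]  # prefix -> argmax next token

def speculative_decode(target: Model, draft: Model,
                       prompt: List[int], max_new: int,
                       gamma: int = 4) -> List[int]:
    seq = list(prompt)
    while len(seq) - len(prompt) < max_new:
        # 1) Draft gamma tokens autoregressively with the cheap model.
        proposal = []
        for _ in range(gamma):
            proposal.append(draft(seq + proposal))
        # 2) Verify: a real target model scores all gamma positions in one
        #    parallel forward pass; here we emulate that check per position.
        accepted = 0
        for i in range(gamma):
            if target(seq + proposal[:i]) == proposal[i]:
                accepted += 1
            else:
                break
        seq += proposal[:accepted]
        # 3) On mismatch (or full acceptance) emit one target token, so the
        #    output always equals plain greedy decoding with `target`.
        seq.append(target(seq))
    return seq[:len(prompt) + max_new]

# Toy models: target repeats last token + 1; draft agrees half the time.
target = lambda s: (s[-1] + 1) % 100
draft = lambda s: (s[-1] + 1) % 100 if len(s) % 2 == 0 else 0
print(speculative_decode(target, draft, [1, 2, 3], max_new=8))
# -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], identical to greedy decoding.
```

The speedup comes from step 2: every accepted draft token replaces one sequential target-model forward pass with a share of a single batched verification pass.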
[Mobile Edge Intelligence for Large Language Models: A Contemporary Survey](https://arxiv.org/abs/2407.18921)
[Edge Intelligence: Architectures, Challenges, and Applications](https://arxiv.org/abs/2003.12172)
[A Survey on Model Compression for Large Language Models](https://arxiv.org/abs/2308.07633)
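The compression surveys above all treat weight quantization as a first-class technique. As a toy illustration of the idea (a sketch for intuition, not any specific paper's method), the snippet below applies symmetric per-tensor int8 quantization to a random weight matrix and reports the size/error trade-off; all names are illustrative.

```python
# Toy symmetric per-tensor int8 weight quantization; illustrative only.
import numpy as np

def quantize_int8(w: np.ndarray):
    # Map [-max|w|, +max|w|] linearly onto the int8 range [-127, 127].
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)   # pretend weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("bytes fp32:", w.nbytes, "-> int8:", q.nbytes)  # 4x smaller
print("mean abs error:", np.abs(w - w_hat).mean())    # small vs mean |w| ~0.8
```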
| NA | NA | [Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems](https://arxiv.org/abs/2312.15234) | NA | NA | [Zhihu -- 路漫漫独求索 -- An Introduction to LLM Inference Acceleration Techniques](https://zhuanlan.zhihu.com/p/691360124) |
[A Survey of Techniques for Maximizing LLM Performance]()
[Large Language Models: A Survey](https://arxiv.org/abs/2402.06196)
[A Survey of Large Language Models](https://arxiv.org/abs/2303.18223)
| 2015/05/28 | Li Bin <huawei.libin@huawei.com> | [livepatch: add support on arm64](https://lore.kernel.org/patchwork/cover/947588) | NA | RFC ☐ 5.14-rc1 | [PatchWork RFC,0/5](https://lore.kernel.org/patchwork/cover/947588), [LKML](https://lkml.org/lkml/2015/5/28/54)<br>*-*-*-*-*-*-*-* <br>[libin2015/livepatch-for-arm64](https://github.com/libin2015/livepatch-for-arm64)<br>*-*-*-*-*-*-*-* <br>[gcc/arm64: support -mfentry feature for arm64](https://gcc.gnu.org/legacy-ml/gcc-patches/2016-03/msg00755.html) |
| 2022 | [In Memory of the Forgotten: Top 10 Linux Kernel Technical Innovations of 2022](https://blog.csdn.net/csdnnews/article/details/128731761) |
| 2023 | [Shining Bright: Top 10 Linux Kernel Technical Innovations of 2023](https://blog.csdn.net/csdnnews/article/details/135493424) |
| 2024 | [Year in Review: Top 10 Linux Kernel Technical Innovations of 2024](https://blog.csdn.net/csdnnews/article/details/145127830)<br>*-*-*-*-*-*-*-* <br>[phoronix, 2025/01/01, The Most Popular Linux & Open-Source News Of 2024](https://www.phoronix.com/news/Linux-Open-Source-News-2024) |
| 2024/11/19 | K Prateek Nayak <kprateek.nayak@amd.com> | [sched/fair: Idle load balancer fixes for fallouts from IPI optimization to TIF_POLLING CPUs](https://lore.kernel.org/all/20241119054432.6405-1-kprateek.nayak@amd.com) | TODO | v5 ☐☑✓ | [LORE v5,0/4](https://lore.kernel.org/all/20241119054432.6405-1-kprateek.nayak@amd.com) |
| 2024/11/29 | Rafael J. Wysocki <rjw@rjwysocki.net> | [cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT](https://lore.kernel.org/all/5861970.DvuYhMxLoT@rjwysocki.net) | TODO | v21 ☐☑✓ | [LORE v21,0/9](https://lore.kernel.org/all/5861970.DvuYhMxLoT@rjwysocki.net) |
| 2025/01/09 | Changwoo Min <changwoo@igalia.com> | [sched_ext: Support high-performance monotonically non-decreasing clock](https://lore.kernel.org/all/20250109131456.7055-1-changwoo@igalia.com) | TODO | v8 ☐☑✓ | [LORE v8,0/6](https://lore.kernel.org/all/20250109131456.7055-1-changwoo@igalia.com) |
| 2024/12/02 | Vincent Guittot <vincent.guittot@linaro.org> | [sched/fair: Fix statistics with delayed dequeue](https://lore.kernel.org/all/20241202174606.4074512-1-vincent.guittot@linaro.org) | TODO | v3 ☐☑✓ | [LORE v3,0/11](https://lore.kernel.org/all/20241202174606.4074512-1-vincent.guittot@linaro.org) |
| 2024/12/23 | K Prateek Nayak <kprateek.nayak@amd.com> | [x86, sched: Dynamic ITMT core ranking support and some yak shaving](https://lore.kernel.org/all/20241223043407.1611-1-kprateek.nayak@amd.com) | TODO | v2 ☐☑✓ | [LORE v2,0/8](https://lore.kernel.org/all/20241223043407.1611-1-kprateek.nayak@amd.com) |
| 2024/12/12 | Vineeth Pillai (Google) <vineeth@bitbyteword.org> | [sched/dlserver: flag to represent active status of dlserver](https://lore.kernel.org/all/20241213032244.877029-1-vineeth@bitbyteword.org) | TODO | v1 ☐☑✓ | [LORE v1,0/2](https://lore.kernel.org/all/20241213032244.877029-1-vineeth@bitbyteword.org) |
| 2024/12/20 | Swapnil Sapkal <swapnil.sapkal@amd.com> | [Fixes and improvements in /proc/schedstat](https://lore.kernel.org/all/20241220063224.17767-1-swapnil.sapkal@amd.com) | TODO | v2 ☐☑✓ | [LORE v2,0/6](https://lore.kernel.org/all/20241220063224.17767-1-swapnil.sapkal@amd.com) |
| 2025/01/13 | Chuyi Zhou <zhouchuyi@bytedance.com> | [Take the scheduling domain into account in numa balancing](https://lore.kernel.org/all/20250113073050.2811925-1-zhouchuyi@bytedance.com) | TODO | v3 ☐☑✓ | [LORE v3,0/3](https://lore.kernel.org/all/20250113073050.2811925-1-zhouchuyi@bytedance.com) |
| 2025/01/06 | wujing <realwujing@qq.com> | [sched/fair: Correct CPU selection from isolated domain](https://lore.kernel.org/all/tencent_160A5B6C838FD9A915A67E67914350EB1806@qq.com) | TODO | v1 ☐☑✓ | [LORE](https://lore.kernel.org/all/tencent_160A5B6C838FD9A915A67E67914350EB1806@qq.com)|
| 2025/01/04 | Andrea Righi <arighi@nvidia.com> | [sched_ext: idle: small CPU iteration refactoring](https://lore.kernel.org/all/20250104090009.331193-1-arighi@nvidia.com) | TODO | v1 ☐☑✓ | [LORE](https://lore.kernel.org/all/20250104090009.331193-1-arighi@nvidia.com) |
| 2025/01/08 | Honglei Wang <jameshongleiwang@126.com> | [sched_ext: switch class when preempted by higher priority scheduler](https://lore.kernel.org/all/20250108023328.37675-1-jameshongleiwang@126.com) | TODO | v2 ☐☑✓ | [LORE](https://lore.kernel.org/all/20250108023328.37675-1-jameshongleiwang@126.com) |
| 2024/12/16 | Michal Koutný <mkoutny@suse.com> | [Add kernel cmdline option for rt_group_sched](https://lore.kernel.org/all/20241216201305.19761-1-mkoutny@suse.com) | TODO | v1 ☐☑✓ | [LORE v1,0/9](https://lore.kernel.org/all/20241216201305.19761-1-mkoutny@suse.com)<br>*-*-*-*-*-*-*-* <br>[LORE v1,0/9](https://lore.kernel.org/all/20250210151239.50055-1-mkoutny@suse.com)<br>*-*-*-*-*-*-*-* <br>[LORE v2,00/10](https://lore.kernel.org/all/20250310170442.504716-1-mkoutny@suse.com/) |
| 2025/01/14 | Florian Schmaus <flo@geekplace.eu> | [sched: provide sched_set_batch()](https://lore.kernel.org/all/20250114130513.498482-3-flo@geekplace.eu) | TODO | v1 ☐☑✓ | [LORE v1,0/2](https://lore.kernel.org/all/20250114130513.498482-3-flo@geekplace.eu) |
| 2024/12/04 | Tobias Huschle <huschle@linux.ibm.com> | [sched/fair: introduce new scheduler group type group_parked](https://lore.kernel.org/all/20241204112149.25872-1-huschle@linux.ibm.com) | TODO | v1 ☐☑✓ | [LORE v1,0/2](https://lore.kernel.org/all/20241204112149.25872-1-huschle@linux.ibm.com)<br>*-*-*-*-*-*-*-* <br>[LORE v2,0/3](https://lore.kernel.org/all/20250217113252.21796-1-huschle@linux.ibm.com) |
| 2025/01/13 | I Hsin Cheng <richard120310@gmail.com> | [sched/fair: Refactor can_migrate_task() to eliminate looping](https://lore.kernel.org/all/20250113041249.6847-1-richard120310@gmail.com) | TODO | v2 ☐☑✓ | [LORE](https://lore.kernel.org/all/20250113041249.6847-1-richard120310@gmail.com) |
| 2025/01/16 | Phil Auld <pauld@redhat.com> | [sched: Mention autogroup disabled behavior](https://lore.kernel.org/all/20250116124654.2365691-1-pauld@redhat.com) | TODO | v1 ☐☑✓ | [LORE](https://lore.kernel.org/all/20250116124654.2365691-1-pauld@redhat.com) |
| 2024/11/13 | Juri Lelli <juri.lelli@redhat.com> | [Fix DEADLINE bandwidth accounting in root domain changes and hotplug](https://lore.kernel.org/all/20241113125724.450249-1-juri.lelli@redhat.com) | TODO | v1 ☐☑✓ | [LORE v1,0/2](https://lore.kernel.org/all/20241113125724.450249-1-juri.lelli@redhat.com) |
| 2025/01/25 | Andrea Righi <arighi@nvidia.com> | [sched_ext: Move built-in idle CPU selection policy to a separate file](https://lore.kernel.org/all/20250125213911.283318-1-arighi@nvidia.com) | TODO | v2 ☐☑✓ | [LORE](https://lore.kernel.org/all/20250125213911.283318-1-arighi@nvidia.com) |
| 2024/12/19 | Pierre Gondois <pierre.gondois@arm.com> | [sched/fair: Decrease util_est in presence of idle time](https://lore.kernel.org/all/20241219091207.2001051-1-pierre.gondois@arm.com) | TODO | v1 ☐☑✓ | [LORE](https://lore.kernel.org/all/20241219091207.2001051-1-pierre.gondois@arm.com) |
[Linux 6.14 Resource Control To Allow Total Memory Bandwidth Monitoring](https://www.phoronix.com/news/Linux-6.14-resctrl-Total-RAM-BW)
[phoronix, 2025/03/20, Google Developing "Live Update Orchestrator" As New Means Of Live Linux Kernel Updates](https://www.phoronix.com/news/Google-Live-Update-Orchestrator)