From 206773302395e26d3a6ea93443c00b89829b36e1 Mon Sep 17 00:00:00 2001
From: Vinta Chen
Date: Wed, 22 Apr 2026 00:27:20 +0800
Subject: [PATCH] Move speech models into new Speech subcategory

Split vibevoice and voxcpm out of Pre-trained Models and Inference
(which now skews to LLMs and diffusion) into a dedicated Speech
subcategory to make room for TTS/ASR growth.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c7bf7611..246c2dfa 100644
--- a/README.md
+++ b/README.md
@@ -152,8 +152,9 @@ _Libraries for building AI applications, LLM integrations, and autonomous agents
   - [sglang](https://github.com/sgl-project/sglang) - A high-performance serving framework for large language models and multimodal models.
   - [transformers](https://github.com/huggingface/transformers) - A framework that lets you easily use pre-trained transformer models for NLP, vision, and audio tasks.
   - [unsloth](https://github.com/unslothai/unsloth) - A library for faster LLM fine-tuning and training with reduced memory usage.
-  - [vibevoice](https://github.com/microsoft/VibeVoice) - A family of open-source voice AI models from Microsoft for text-to-speech and long-form speech recognition.
   - [vllm](https://github.com/vllm-project/vllm) - A high-throughput and memory-efficient inference and serving engine for LLMs.
+- Speech
+  - [vibevoice](https://github.com/microsoft/VibeVoice) - A family of open-source voice AI models from Microsoft for text-to-speech and long-form speech recognition.
   - [voxcpm](https://github.com/OpenBMB/VoxCPM) - A tokenizer-free text-to-speech foundation model for multilingual speech generation and voice cloning.
 
 ## Deep Learning