From c08b1235ac12863974123bba645a77e402abe02a Mon Sep 17 00:00:00 2001
From: Vinta Chen <vinta.chen@gmail.com>
Date: Wed, 22 Apr 2026 00:24:14 +0800
Subject: [PATCH] Move voxcpm to AI and Agents > Pre-trained Models and
 Inference

It is a pretrained neural TTS foundation model, not an audio manipulation library, so it fits better alongside transformers, diffusers, and vllm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3f56dd4b..13bb10f2 100644
--- a/README.md
+++ b/README.md
@@ -153,6 +153,7 @@ _Libraries for building AI applications, LLM integrations, and autonomous agents
   - [transformers](https://github.com/huggingface/transformers) - A framework that lets you easily use pre-trained transformer models for NLP, vision, and audio tasks.
   - [unsloth](https://github.com/unslothai/unsloth) - A library for faster LLM fine-tuning and training with reduced memory usage.
   - [vllm](https://github.com/vllm-project/vllm) - A high-throughput and memory-efficient inference and serving engine for LLMs.
+  - [voxcpm](https://github.com/OpenBMB/VoxCPM) - A tokenizer-free text-to-speech foundation model for multilingual speech generation and voice cloning.
 
 ## Deep Learning
 
@@ -935,7 +936,6 @@ _Libraries for manipulating audio, video, and their metadata._
   - [librosa](https://github.com/librosa/librosa) - Python library for audio and music analysis.
   - [matchering](https://github.com/sergree/matchering) - A library for automated reference audio mastering.
   - [pydub](https://github.com/jiaaro/pydub) - Manipulate audio with a simple and easy high level interface.
-  - [voxcpm](https://github.com/OpenBMB/VoxCPM) - A tokenizer-free text-to-speech system for multilingual speech generation and voice cloning.
 - Video
   - [moviepy](https://github.com/Zulko/moviepy) - A module for script-based movie editing with many formats, including animated GIFs.
   - [vidgear](https://github.com/abhiTronix/vidgear) - Most Powerful multi-threaded Video Processing framework.