2069 Commits

Author SHA1 Message Date
Vinta Chen
f52c85a151 docs: remove unmaintained or superseded entries across multiple sections
Drops entries that are abandoned, redundant, or no longer meet quality bar:
- NLP: pattern (unmaintained since 2018)
- Computer Vision: tesserocr (superseded by pytesseract)
- Recommender Systems: spotlight (archived, no longer maintained)
- API: flask-api (deprecated in favour of flask-restful), hug (archived)
- Caching: beaker (unmaintained, WSGI-era legacy)
- File Formats: textract (unmaintained, fragile system deps)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-21 15:04:54 +08:00
Vinta Chen
61e6f442dc refactor: merge and reorganize sections for better discoverability
- Web Scraping: split into Frameworks and Content Extraction subcategories
- DevOps: rename SSH-style Deployment to Deployment (absorbs Serverless),
  merge Process Management into Monitoring as Monitoring and Processes,
  collapse Backup/Chaos Engineering/Git Hooks into Other
- Fold standalone Processes section into DevOps > Monitoring and Processes
- Merge Audio Processing and Video Processing into Audio & Video Processing
- Remove Processes from ToC; update Audio/Video ToC entry

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-21 14:58:05 +08:00
Vinta Chen
0c4d411ae8 refactor: consolidate thin subcategories and split Data Analysis by domain
- Merge URL Manipulation (single entry) into HTTP Clients
- Move python-slugify into General Text Processing, removing the one-entry Slugify subcategory
- Consolidate YAML, TOML, and CSV subcategories into a single Data Formats group
- Split Data Analysis into General and Financial Data subcategories to improve discoverability

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-21 14:52:23 +08:00
Vinta Chen
1fa8847631 docs: remove unmaintained or low-quality entries across multiple categories
Removes 14 entries from Asset Management, Web Content Extracting, URL
Parsing, Search, Testing, Task Queues, Subprocesses, Network
Virtualization, Text / Slugify, HTML & XML, File Format Processing,
Audio, and Video sections.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-21 14:46:22 +08:00
Vinta Chen
1bc421867c docs: add missing well-known libraries across multiple sections
Adds entries for bottle, robyn, starlette, connexion, strawberry,
flask-socketio, trafilatura, cachetools, ibis, modin, pandera,
chalice, grpcio, anyio, dateparser, python-dotenv, and mitmproxy.
Also removes zappa (deprecated/unmaintained) and graphene (replaced
by strawberry).

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-21 14:37:25 +08:00
Vinta Chen
d7ffd63fa7 docs: fix stale cross-reference link in Web Frameworks section
Update the 'Also see' link from the old absolute GitHub URL pointing
to #restful-api to the current relative anchor #web-apis, matching
the section rename in a previous refactor.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 21:10:55 +08:00
Vinta Chen
bec981806a refactor: reorganize thematic groups and reorder sections within groups
Restructures the top-level ToC groups for better logical cohesion:
- Split 'Web & API' into 'Web' (frameworks, servers, CMS) and
  'HTTP & Scraping' (clients, scraping, URL, email)
- Move 'Database & Storage' earlier in the ToC, before 'Data & Science'
- Merge 'Web Content Extraction' and 'Web Crawling' into a single
  'Web Scraping' section
- Rename 'Content & Media' to 'Text & Documents' and 'Media' (split)
- Rename 'System & Runtime' to 'Python Language' and 'Python Toolchain'
- Rename 'Security & Auth' to 'Security'; move Authentication to Web group
- Rename 'Development Tools' to 'Developer Tools', 'DevOps & Infrastructure' to 'DevOps'
- Reorder sections within groups to reflect learning progression
  (e.g., Deep Learning before Machine Learning in AI & ML)
- Move Hardware and Microsoft Windows to Miscellaneous group
- Add s3cmd to DevOps and youtube-dl to Command-line Tools
- Update CONTRIBUTING.md example group names to match new labels

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 21:06:08 +08:00
Vinta Chen
4322026817 refactor: parse thematic groups from README bold markers instead of hardcoding them
The website builder previously relied on a hardcoded SECTION_GROUPS list in
build.py to organize categories into thematic groups. This was fragile: any
rename or addition to README.md required a matching code change.

Replace this with a parser-driven approach:
- readme_parser.py now detects bold-only paragraphs (**Group Name**) as
  group boundary markers and groups H2 categories beneath them into
  ParsedGroup structs.
- build.py drops SECTION_GROUPS entirely; group_categories() now just
  passes parsed groups through and appends the Resources group.
- sort.py is removed as it relied on the old flat section model.
- Tests updated throughout to reflect the new (groups, resources) return
  shape and to cover the new grouping logic.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 18:43:09 +08:00
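
The parser-driven grouping described in the commit above might look roughly like the sketch below. It is an illustration only, not the actual readme_parser.py: the regexes, the `parse_groups` name, the `ParsedGroup` field names, and the fallback "Other" group for headings that appear before any marker are all assumptions.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Hypothetical patterns; the real parser may work on a Markdown AST instead.
BOLD_ONLY = re.compile(r"^\*\*(?P<name>[^*]+)\*\*$")  # a line that is only **Group Name**
H2 = re.compile(r"^##\s+(?P<name>.+?)\s*$")           # an H2 category heading


@dataclass
class ParsedGroup:
    name: str
    categories: list[str] = field(default_factory=list)


def parse_groups(markdown: str) -> list[ParsedGroup]:
    """Attach each H2 heading to the most recent bold-only marker line."""
    groups: list[ParsedGroup] = []
    current: ParsedGroup | None = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if m := BOLD_ONLY.match(line):
            # A bold-only line starts a new group.
            current = ParsedGroup(m["name"].strip())
            groups.append(current)
        elif m := H2.match(line):
            if current is None:
                # Assumption: headings before any marker land in "Other".
                current = ParsedGroup("Other")
                groups.append(current)
            current.categories.append(m["name"])
    return groups
```

With this shape, renaming a group in README.md changes the parsed output without any matching code change, which is the fragility the commit set out to remove.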
Vinta Chen
fd9b2665ed docs: reorganize ToC into thematic groups and update contributing guide
Group the Table of Contents entries under bold section headers (AI & ML,
Web & API, Data & Science, etc.) so the README is easier to navigate at
a glance. Update CONTRIBUTING.md to reflect that new sections should be
placed under the appropriate thematic group instead of in a flat list.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 18:42:58 +08:00
Vinta Chen
efc08daa5e docs: reorganize AI, Data Visualization, GUI, and Scientific Computing into subcategories
Group flat lists into labeled subcategories to improve scannability and navigation.
Also remove entries that don't meet curation standards and fix the toga repo URL.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 18:00:51 +08:00
Vinta Chen
1272d7059e refactor: flatten subcategories and rename sections for clarity
- Remove subcategory groupings (General, Financial Data, Mail Servers, etc.) in favor of flat alphabetical lists
- Rename sections to plural forms: Downloader -> Downloaders, Job Scheduler -> Job Schedulers, Static Site Generator -> Static Site Generators, Template Engine -> Template Engines, Web Content Extracting -> Web Content Extraction
- Rename Specific Formats Processing -> File Format Processing
- Move financial data libraries (akshare, edgartools, openbb, yfinance) from Downloaders to Data Analysis
- Fix TOC ordering: Database/Database Drivers, Web APIs, Web Servers entries moved to alphabetical positions

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 17:50:33 +08:00
Vinta Chen
0183bf15ae refactor: move entries to more accurate categories
- flower -> Task Queues (Celery sub-entry, not Admin Panels)
- vllm, rasa, diffusers, transformers -> AI Agents (not Deep Learning / ML)
- ccb -> AI Agents (not CLI Productivity Tools)
- code-graph-rag -> AI Agents (not Code Analysis)
- kafka-python -> Distributed Computing / Stream Processing (not NoSQL Databases)
- scapy -> Networking (not Hardware)
- sentry-python -> DevOps / Monitoring (not Logging)
- joblib -> Distributed Computing (not Job Scheduler)
- doit -> Build Tools (not Job Scheduler)
- karateclub -> Machine Learning (not Science)
- numba -> Science (not Python Implementations)
- diagrams -> Documentation (not Data Visualization)
- mkdocs -> Documentation (not Static Site Generators)
- pyelftools -> Text Formats / General (not Debugging)
- weasyprint -> Text Formats / PDF (not HTML/XML)
- webargs -> RESTful API (not URL Parsing)
- kafka-python stream sub-entry added, flower celery sub-entry added

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 17:46:54 +08:00
Vinta Chen
dd3394963d refactor: consolidate and reorganize categories for clarity
- Merge ASGI Servers and WSGI Servers into a single Web Servers section
- Merge RESTful API and GraphQL into Web APIs section
- Move Permissions entries under Authentication as a subsection
- Move Refactoring entries under Code Analysis as a subsection
- Move Serverless Frameworks entries under DevOps as a subsection
- Move Shell (xonsh) under CLI Tools
- Move Internationalization (babel) under Text Processing
- Move Robotics (PythonRobotics) under Science
- Remove standalone Internationalization, Permissions, Refactoring, Serverless Frameworks, Shell, Robotics, and GraphQL sections
- Fix capitalization in TOC entries (Penetration Testing, Framework Agnostic)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 17:41:11 +08:00
Vinta Chen
c0fc30cec6 docs: add new entries and reorganize existing ones across multiple categories
Adds new library entries: autogen, crewai, dspy, smolagents (AI/Agents),
invoke (Build Tools), django-cms (CMS), bandit (Code Analysis), gradio
(Data Visualization), chromadb (Database), pre-commit (DevOps), aiohttp
(HTTP), dagster (Job Scheduler), catboost, lightgbm (Machine Learning),
tortoise-orm (ORM), msgpack (Serialization), nox, playwright (Testing),
and py-sdl2 (Game Dev).

Removes stale or low-quality entries: howdoi, try, cuisine, django-schedule,
plan, gym, metrics, pydal, fastFM, tensorrec, mamba (testing), toonify.

Fixes alphabetical ordering for uv/virtualenv, pip/pipx, py-sdl2/pygame.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 17:27:40 +08:00
Vinta Chen
18011f86f3 feat: merge duplicate entries across multiple categories
Entries appearing in more than one category were previously emitted as
separate rows. They are now deduplicated in build.py by URL, collecting
all category and group names into lists.

The template encodes those lists as '||'-delimited data attributes
(data-cats, data-groups) and renders a tag button per category.
The JS filter is updated to split on '||' and check for membership,
so clicking any category tag correctly shows the merged row.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 17:02:22 +08:00
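
The deduplication described in the commit above could be sketched as follows. This is a minimal illustration under stated assumptions, not the code in build.py: the entry dictionary keys (`name`, `url`, `category`, `group`) and the helper names are hypothetical, and the membership check is written in Python here even though the real filter runs in JavaScript.

```python
from __future__ import annotations

SEP = "||"  # delimiter named in the commit message


def merge_by_url(entries: list[dict]) -> list[dict]:
    """Collapse entries sharing a URL, collecting categories and groups."""
    merged: dict[str, dict] = {}
    for entry in entries:
        row = merged.get(entry["url"])
        if row is None:
            row = {"name": entry["name"], "url": entry["url"],
                   "categories": [], "groups": []}
            merged[entry["url"]] = row
        # Keep first-seen order and avoid repeating a name.
        if entry["category"] not in row["categories"]:
            row["categories"].append(entry["category"])
        if entry["group"] not in row["groups"]:
            row["groups"].append(entry["group"])
    return list(merged.values())


def encode_attr(values: list[str]) -> str:
    """Join names into one string suitable for a data attribute."""
    return SEP.join(values)


def matches(attr: str, wanted: str) -> bool:
    """Membership test on the split list, not a substring search."""
    return wanted in attr.split(SEP)
```

Splitting before testing membership matters: a plain substring check would let a filter for one category also match any longer category name that contains it.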
Vinta Chen
c5376618b8 docs: update Environment and Package Management entries
- Move pyenv-win as sub-entry under pyenv
- Add uv to Environment Management
- Remove pip-tools sub-entry from pip
- Add pipx and mamba entries
- Remove hatch entry
- Update uv description in Package Management

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 16:54:56 +08:00
Jinyang
bac9514660 Merge pull request #2979 from JinyangWang27/remove-bcbio-nextgen
remove bcbio-nextgen
2026-03-19 09:52:59 +04:00
Jinyang
7baa887589 remove bcbio-nextgen
2026-03-19 09:51:28 +04:00
Vinta Chen
716464e726 fix: improve CSS polish with active states, font smoothing, and text wrapping
Add active-state press feedback (scale transform) to buttons, filter clear,
and tags. Add -moz-osx-font-smoothing for consistent antialiasing on Firefox/Mac.
Apply text-wrap: balance to headings and text-wrap: pretty to body text and
expanded row descriptions. Add text-underline-offset to links and highlight
active table rows with bg-hover.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 02:41:56 +08:00
Vinta Chen
4ea3134ba3 fix: move group tag into category cell and hide on mobile
- Relocate group tag from expand row to category column so it appears inline beside the category tag
- Add margin between stacked tags with .col-cat .tag + .tag spacing rule
- Remove fixed width from .col-cat; narrow .col-name from 35% to 30% to give category column room
- Hide .tag-group on screens ≤900px and widen .col-name to 50% to reclaim space

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 02:36:40 +08:00
Vinta Chen
4077051813 docs: clarify VS Code Python extension entry name
Rename 'Python' to 'Python for VSCode' in the Editor Plugins section
to disambiguate the extension from the Python language itself.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 02:29:07 +08:00
Vinta Chen
e967722a5d fix: improve mobile table layout with auto sizing and tighter spacing
Switch table-layout back to auto on mobile to let columns size naturally,
add uniform cell padding overrides, shrink num/arrow columns further,
pin stars column width, reduce edge padding, and left-align the number
column to avoid awkward right-aligned single digits.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 02:24:23 +08:00
Vinta Chen
ca350ebaf9 fix: use table-layout fixed on mobile to prevent column width inflation
Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 02:11:23 +08:00
Vinta Chen
fb2a693dbb fix: reduce number and arrow column widths on mobile for tighter table layout
Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 02:05:34 +08:00
Vinta Chen
ac9b69a0b2 fix: reduce table padding on mobile for better centering and arrow visibility
Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 02:00:07 +08:00
Vinta Chen
926ba010b7 chore: split AI & Data into AI & ML and Data & Science
AI & ML: AI and Agents, Machine Learning, Deep Learning, Computer Vision,
Natural Language Processing, Recommender Systems, Robotics.

Data & Science: Data Analysis, Data Validation, Data Visualization,
Geolocation, Science, Quantum Computing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 01:55:18 +08:00
Vinta Chen
8b518664d8 chore: redistribute Specialized group across existing groups
Remove the Specialized catchall group. Redistribute its categories:
- Web & API: Admin Panels, CMS, Email, Static Site Generator, URL Manipulation
- AI & Data: Geolocation, Robotics
- Content & Media: Game Development, Internationalization
- System & Runtime: Date and Time, Hardware, Microsoft Windows
- Development Tools: Algorithms and Design Patterns

Only Miscellaneous remains ungrouped (falls into Other).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 01:54:46 +08:00
Vinta Chen
46caf8cec4 docs: add AI and Agents category with autoresearch
New category for LLM integrations, agent frameworks, and AI applications.
Move agno, instructor, langchain, llama_index, praisonai, pydantic-ai,
ragflow from Machine Learning. Add autoresearch (karpathy).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 01:45:24 +08:00
Vinta Chen
9761bac1e0 docs: lowercase 8 project names to match their import names
eyeD3→eyed3, Gooey→gooey, gTTS→gtts, MechanicalSoup→mechanicalsoup,
MonkeyType→monkeytype, PraisonAI→praisonai, PyMySQL→pymysql,
Zappa→zappa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 01:39:49 +08:00
Vinta Chen
e70b25d42d docs: fix tkinter entry to use stdlib format
Lowercase name, link to official docs, add stdlib label.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 01:33:44 +08:00
Vinta Chen
55db9c7f64 docs: rename Box to box, PathPicker to fpp
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 01:30:57 +08:00
Vinta Chen
2cbb2d7c60 docs: normalize entry names to lowercase
Standardize display names to lowercase across all categories (audioFlux,
EasyOCR, UltraPlot, PySpark, cx_Freeze, OpenBB, DearPyGui, WeasyPrint,
Pillow, Quads, TaskFlow, Metrics, spaCy, funNLP, PynamoDB, Surprise,
Bowler, zeroRPC, SimPy, XlsxWriter, HTTPretty) for consistent formatting.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 01:27:54 +08:00
Vinta Chen
6204196083 docs: add naming convention rule to CONTRIBUTING.md
Prefer PyPI package name as display name so developers can copy it
directly to pip install. Fall back to GitHub repo name if not on PyPI.
Also update examples to use lowercase PyPI names.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 01:24:26 +08:00
Vinta Chen
8a131b7874 docs: rename Spark ML to spark.ml in Machine Learning
Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 00:58:11 +08:00
Vinta Chen
7a0abca2e5 docs: remove dataclasses and DottedDict from Data Structures
Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 00:58:09 +08:00
Vinta Chen
5036fe8201 docs: normalize entry names to lowercase for django.db.models and reportlab
Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-19 00:49:52 +08:00
Vinta Chen
3d534f57d7 docs: lowercase H2O and PyMC display names
H2O→h2o, PyMC→pymc (drop version suffixes from repo names).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 00:44:23 +08:00
Vinta Chen
6ad2a77bb4 docs: rename Jupyter Notebook (IPython) to jupyter
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 00:43:01 +08:00
Vinta Chen
fb3112d8d2 docs: match 11 more display names to their GitHub repo names
Django MongoDB Backend→django-mongodb-backend, Karate Club→karateclub,
Open Babel→openbabel, Robot Framework→robotframework,
Feature-engine→feature_engine, memory-graph→memory_graph, Jinja2→jinja,
Cocos2d→cocos, LlamaIndex→llama_index, VCR.py→vcrpy,
Spiff→SpiffWorkflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 00:40:50 +08:00
Vinta Chen
4a0db0dee6 docs: match project display names to their GitHub repo names
Update 79 entries where the display name differed from the GitHub
repository name only in casing (e.g. NumPy→numpy, LangChain→langchain,
SQLAlchemy→sqlalchemy, DuckDB→duckdb).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 00:11:45 +08:00
Vinta Chen
65db5ab5e7 Merge pull request #2975 from vinta/chore/cleanup
Cleanup: replace deprecated entries, fix categories, add new entries
2026-03-18 23:52:00 +08:00
Vinta Chen
79c0be0a5c docs: move docling and textract to Text Processing
docling (document-to-structured-data conversion) and textract (text
extraction from Office/PDF files) are document parsing tools, not
data analysis or web scraping tools, so Text Processing > General
is a more accurate placement.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 23:50:25 +08:00
Vinta Chen
a7c5d84ce9 docs: split Downloader into General and Financial Data subcategories
The financial data tools (akshare, edgartools, OpenBB, yfinance) are a
distinct cluster from general-purpose downloaders (s3cmd, youtube-dl),
so grouping them into subcategories improves discoverability.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 23:50:13 +08:00
Vinta Chen
057081ff91 docs: move Beanie to ORM > NoSQL Databases from Database Drivers
Beanie is an ODM (Object-Document Mapper), not a raw database driver,
so it fits better under ORM > NoSQL Databases alongside mongoengine and ODMantic.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 23:49:59 +08:00
Vinta Chen
d48c1b8904 docs: move streamlit to Data Visualization from Admin Panels
streamlit is primarily a data visualization and dashboard framework,
so it better fits the Data Visualization category.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 23:49:47 +08:00
Vinta Chen
02dab03848 chore(text-processing): restructure Markdown subsection and add TOML
- Replace Jimmy, Mistune, Python-Markdown with markdown-it-py, markdown,
  markitdown, and mistune (lowercased names, added CommonMark parser)
- Add new TOML subsection with stdlib tomllib entry

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 23:41:54 +08:00
Vinta Chen
5f44045f64 chore(files): move markitdown to Text Processing > Markdown
markitdown converts documents to Markdown, so it belongs under the
Markdown subcategory of Text Processing rather than the generic Files
section.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 23:41:45 +08:00
Vinta Chen
0cd4ccaec2 chore(dates-times): replace pytz with zoneinfo
Remove the third-party pytz in favour of the stdlib zoneinfo module
(Python 3.9+), which reads the IANA tz database from the system or,
where that is missing, from the tzdata package.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 23:41:32 +08:00
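
For readers moving off pytz, a conversion with the stdlib module might look like the sketch below. The `to_zone` helper and the choice of the "Asia/Taipei" zone are illustrative assumptions, and the code needs IANA time zone data to be available, either from the operating system or from the tzdata package.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # stdlib since Python 3.9


def to_zone(moment: datetime, zone_name: str) -> datetime:
    """Convert an aware datetime to the named IANA zone."""
    return moment.astimezone(ZoneInfo(zone_name))


# An aware UTC timestamp; the zone name below is an illustrative choice.
utc_time = datetime(2026, 3, 18, 15, 41, 32, tzinfo=timezone.utc)
local_time = to_zone(utc_time, "Asia/Taipei")
print(local_time.isoformat())
```

Unlike pytz, a ZoneInfo object can be passed directly as `tzinfo` when constructing a datetime, with no separate localize step.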
Vinta Chen
7b8002426d feat(website): add co-maintainer credit and bump hero-sub font size
- Replace 'Curated by @vinta since 2014' with 'Maintained by @vinta
  and @JinyangWang27' to reflect the new co-maintainer
- Increase .hero-sub font size from --text-sm to --text-base for
  better readability

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 23:30:47 +08:00
Vinta Chen
56ccdfae8f Merge pull request #2973 from vinta/fix/replace-non-github-urls-with-github-repos
Replace non-GitHub URLs with GitHub repo URLs
2026-03-18 23:24:37 +08:00