Commit Graph

21 Commits

Author SHA1 Message Date
Vinta Chen
0c26d352f0 fix(website): scope subcategory filter values to parent category
Subcategories with the same name (e.g. 'Frameworks') across different
top-level categories were sharing a filter value, so clicking one
subcategory tag would match entries from unrelated categories.

Each subcategory now stores both a display name and a scoped value
('Category > Subcategory') used for data-cats matching. The template
renders the display name on tags and mobile-cat span, but uses the
scoped value for filtering. Subcategory tags are also moved before
category tags so the most-specific label appears first.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-23 01:11:35 +08:00
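A minimal sketch of the scoped-value idea in 0c26d352f0. The helper name, the dict keys, and the "Web Development" category are hypothetical; only the 'Category > Subcategory' format comes from the commit message.

```python
def scoped_subcategory(category: str, subcategory: str) -> dict:
    """Pair a display name with a filter value scoped to the parent category."""
    return {
        "name": subcategory,                      # shown on the tag
        "value": f"{category} > {subcategory}",   # used for data-cats matching
    }


# Two subcategories share a display name but get distinct filter values.
testing = scoped_subcategory("Testing", "Frameworks")
web = scoped_subcategory("Web Development", "Frameworks")  # hypothetical category
```

Because the filter compares the scoped value, clicking "Frameworks" under Testing no longer matches entries filed under another category's "Frameworks".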
Vinta Chen
f2b4a7bc83 feat(website): surface subcategory labels as filterable tags
Entries nested under a plain-text subcategory heading (e.g. "Frameworks"
inside Testing) now carry a subcategory field populated by the parser.
The build pipeline collects these into a subcategories list on each merged
entry, and the template renders them as tag-subcat buttons that plug into
the existing data-cats filter mechanism.

A dedicated .tag-subcat style distinguishes them visually from category
tags, and both are hidden on mobile alongside .tag-group.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-23 01:04:20 +08:00
Vinta Chen
df2191fc05 refactor(build): remove unused group_categories wrapper
group_categories only ever appended a Resources group when the
resources list was non-empty. All call sites passed an empty list,
making it a no-op indirection. Inline parsed_groups directly and
remove the dead code along with its tests.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 15:58:42 +08:00
Vinta Chen
5fc022d595 refactor(build): remove resources from build pipeline
Resources are no longer passed through parse_readme, group_categories,
or the index template — they are replaced with empty lists and the
unused variable is prefixed with an underscore.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 15:45:18 +08:00
Vinta Chen
d3070b735e feat: add build date to footer
Displays the UTC date the site was last built in the footer so visitors
can see how fresh the data is.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 15:30:04 +08:00
Vinta Chen
4f297a5301 fix(website): sort starred stdlib entries after starred non-stdlib entries
Within the same star count, built-in (stdlib) entries were interleaved
with third-party entries. Add a builtin tier to the sort key so stdlib
entries always rank below non-stdlib entries at equal star counts.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 02:15:22 +08:00
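One way the builtin tier from 4f297a5301 might look as a sort key. This is a sketch, not the code in build.py: the field names "stars" and "name" and the star counts are made up for illustration, and only the 'Built-in' source type label comes from the log.

```python
def sort_key(entry: dict) -> tuple:
    """Order by stars descending, then non-stdlib before stdlib, then name."""
    stars = entry.get("stars")
    if stars is None:
        # Entries without star data sort after all starred entries.
        return (1, 0, 0, entry["name"].lower())
    builtin_tier = 1 if entry.get("source_type") == "Built-in" else 0
    return (0, -stars, builtin_tier, entry["name"].lower())


entries = [
    {"name": "json", "stars": 50000, "source_type": "Built-in"},
    {"name": "requests", "stars": 50000},
    {"name": "mylib", "stars": None},
    {"name": "flask", "stars": 60000},
]
ordered = [e["name"] for e in sorted(entries, key=sort_key)]
```

The tier sits after the star count in the tuple, so it only breaks ties between entries with equal stars.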
Vinta Chen
37a9443bbb fix(website): map built-in entries to cpython for star data lookup
Entries with source_type 'Built-in' have no extractable GitHub repo key
from their URL, so they never received star/metadata enrichment. Fall
back to python/cpython for these entries so the star count and related
data are populated correctly.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 02:10:44 +08:00
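The fallback in 37a9443bbb could be sketched as follows. The function name, the regular expression, and the entry shape are assumptions; the log only states that Built-in entries fall back to python/cpython.

```python
import re
from typing import Optional

# Hypothetical pattern for pulling "owner/repo" out of a GitHub URL.
GITHUB_REPO_RE = re.compile(r"^https?://github\.com/([^/]+)/([^/#?]+)")


def repo_key(entry: dict) -> Optional[str]:
    """Return the key used to look up star data, or None if there is none."""
    match = GITHUB_REPO_RE.match(entry["url"])
    if match:
        return f"{match.group(1)}/{match.group(2)}"
    if entry.get("source_type") == "Built-in":
        # Standard-library docs URLs carry no repo, so borrow CPython's.
        return "python/cpython"
    return None
```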
Vinta Chen
8e7b881659 fix(website): key dedup by (url, name) to allow same-url different-name entries
Previously, two entries sharing the same URL but different names would be
collapsed into one. Using a composite (url, name) key preserves distinct
entries while still merging true duplicates.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 02:08:42 +08:00
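A sketch of deduplication keyed on (url, name), as 8e7b881659 describes. The function name and the "category"/"categories" fields are illustrative, not taken from build.py.

```python
def merge_entries(rows: list) -> list:
    """Merge rows that share both URL and name, collecting their categories."""
    merged = {}
    for row in rows:
        key = (row["url"], row["name"])  # composite key: same URL alone is not enough
        if key not in merged:
            merged[key] = {"name": row["name"], "url": row["url"], "categories": []}
        if row["category"] not in merged[key]["categories"]:
            merged[key]["categories"].append(row["category"])
    return list(merged.values())


rows = [
    {"name": "asyncio", "url": "https://example.org/lib", "category": "Asynchronous"},
    {"name": "asyncio", "url": "https://example.org/lib", "category": "Networking"},
    {"name": "threading", "url": "https://example.org/lib", "category": "Concurrency"},
]
result = merge_entries(rows)
```

The first two rows collapse into one entry with two categories, while the third stays separate despite sharing the URL.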
Vinta Chen
bfaa207ef3 refactor(website): rename stdlib source type label to Built-in
Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 02:04:45 +08:00
Vinta Chen
666f6e52d0 feat(website): add source type badges for non-GitHub entries
Detect the hosting source (stdlib, GitLab, Bitbucket, External) from
the entry URL and surface it as a small badge in the stars column where
a star count would otherwise show an em dash.

Stdlib entries also get their own sort tier — between starred entries
and other no-star entries — so the standard library is not buried at
the bottom of each category.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 02:03:00 +08:00
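Source detection along the lines of 666f6e52d0 might be sketched like this. The host rules, including treating docs.python.org as the stdlib marker, are assumptions; the log only names the four labels.

```python
from typing import Optional
from urllib.parse import urlparse


def _on_host(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def detect_source_type(url: str) -> Optional[str]:
    """Return a badge label for non-GitHub URLs, or None for GitHub."""
    host = (urlparse(url).hostname or "").lower()
    if _on_host(host, "github.com"):
        return None  # GitHub entries show a star count, so no badge
    if _on_host(host, "docs.python.org"):
        return "stdlib"  # assumed marker for standard-library entries
    if _on_host(host, "gitlab.com"):
        return "GitLab"
    if _on_host(host, "bitbucket.org"):
        return "Bitbucket"
    return "External"
```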
Vinta Chen
d36f1ed8d1 refactor(website): remove unused Entry TypedDict, write llms.txt from parsed text
Entry was dead code with no callers. Switching from shutil.copy to
write_text uses the already-loaded readme_text variable directly instead
of re-reading the file from disk.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 01:32:17 +08:00
Vinta Chen
4322026817 refactor: parse thematic groups from README bold markers instead of hardcoding them
The website builder previously relied on a hardcoded SECTION_GROUPS list in
build.py to organize categories into thematic groups. This was fragile: any
rename or addition to README.md required a matching code change.

Replace this with a parser-driven approach:
- readme_parser.py now detects bold-only paragraphs (**Group Name**) as
  group boundary markers and groups H2 categories beneath them into
  ParsedGroup structs.
- build.py drops SECTION_GROUPS entirely; group_categories() now just
  passes parsed groups through and appends the Resources group.
- sort.py is removed as it relied on the old flat section model.
- Tests updated throughout to reflect the new (groups, resources) return
  shape and to cover the new grouping logic.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 18:43:09 +08:00
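To illustrate the grouping rule in 4322026817, here is a line-based sketch using only the standard library. It is not the readme_parser.py implementation, which works on the markdown-it-py token stream; the function name, the dict shape, and the "Other" fallback group are assumptions.

```python
import re

BOLD_ONLY = re.compile(r"^\*\*([^*]+)\*\*$")  # a line that is nothing but **Name**
H2 = re.compile(r"^##\s+(.+?)\s*$")           # an H2 category heading


def parse_groups(readme_text: str) -> list:
    """Group H2 headings under the most recent bold-only marker line."""
    groups = []
    current = None
    for line in readme_text.splitlines():
        bold = BOLD_ONLY.match(line.strip())
        heading = H2.match(line)
        if bold:
            current = {"name": bold.group(1), "categories": []}
            groups.append(current)
        elif heading:
            if current is None:
                # Headings before any marker land in a fallback group.
                current = {"name": "Other", "categories": []}
                groups.append(current)
            current["categories"].append(heading.group(1))
    return groups


sample = """**AI & ML**

## Machine Learning

- some entry

## Deep Learning

**Data & Science**

## Data Analysis
"""
parsed = parse_groups(sample)
```

Renaming a group then only requires editing the bold marker in README.md, with no matching change in the builder.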
Vinta Chen
18011f86f3 feat: merge duplicate entries across multiple categories
Entries appearing in more than one category were previously emitted as
separate rows. They are now deduplicated in build.py by URL, collecting
all category and group names into lists.

The template encodes those lists as '||'-delimited data attributes
(data-cats, data-groups) and renders a tag button per category.
The JS filter is updated to split on '||' and check for membership,
so clicking any category tag correctly shows the merged row.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-20 17:02:22 +08:00
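The encode-then-split filter from 18011f86f3, mirrored in Python for illustration. The actual filter is JavaScript in the site template; the function names here are hypothetical, and only the '||' separator and the membership check come from the commit message.

```python
SEP = "||"


def encode_attr(values: list) -> str:
    """Join category names into a single data-attribute string."""
    return SEP.join(values)


def matches(data_cats: str, selected: str) -> bool:
    """True if the selected category is one of the encoded values."""
    return selected in data_cats.split(SEP)


attr = encode_attr(["Web Frameworks", "Testing"])
```

Splitting before comparing gives exact membership, so selecting "Web" does not match a row tagged only "Web Frameworks", as a plain substring test would.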
Vinta Chen
926ba010b7 chore: split AI & Data into AI & ML and Data & Science
AI & ML: AI and Agents, Machine Learning, Deep Learning, Computer Vision,
Natural Language Processing, Recommender Systems, Robotics.

Data & Science: Data Analysis, Data Validation, Data Visualization,
Geolocation, Science, Quantum Computing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 01:55:18 +08:00
Vinta Chen
8b518664d8 chore: redistribute Specialized group across existing groups
Remove the Specialized catchall group. Redistribute its categories:
- Web & API: Admin Panels, CMS, Email, Static Site Generator, URL Manipulation
- AI & Data: Geolocation, Robotics
- Content & Media: Game Development, Internationalization
- System & Runtime: Date and Time, Hardware, Microsoft Windows
- Development Tools: Algorithms and Design Patterns

Only Miscellaneous remains ungrouped (falls into Other).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 01:54:46 +08:00
Vinta Chen
ab18c7e54c refactor: reformat build.py to Black style and add llms.txt output
Reformats dict and list literals to trailing-comma multiline style
throughout. Also copies README.md to llms.txt in the site output so
LLM crawlers can discover the full content.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 21:33:51 +08:00
Vinta Chen
280f250ce0 feat: migrate README parser to markdown-it-py and refresh website
Switch readme_parser.py from regex-based parsing to markdown-it-py for
more robust and maintainable Markdown AST traversal. Update build pipeline,
templates, styles, and JS to support the new parser output. Refresh GitHub
stars data and update tests to match new parser behavior.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 20:33:36 +08:00
Vinta Chen
45391b84e9 build: simplify Makefile targets and add live-reload preview
Rename site_* targets to bare names (install, fetch_stats, build,
preview). Replace the static preview target with a watchmedo-driven
live-reload loop so file changes trigger automatic rebuilds. Make
the output directory creation idempotent (exist_ok=True) and static
copy incremental (dirs_exist_ok=True) so repeated builds don't wipe
output on each run.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 18:28:27 +08:00
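The two flags mentioned in 45391b84e9 can be sketched as below. The function name and directory layout are hypothetical; `exist_ok` and `dirs_exist_ok` are the standard-library parameters the commit names.

```python
import shutil
from pathlib import Path


def prepare_output(static_dir: Path, out_dir: Path) -> None:
    """Create the output directory and copy static assets without wiping it."""
    # exist_ok=True: no error if the directory is already there.
    out_dir.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok=True (Python 3.8+): copy into an existing tree, overwriting
    # files with the same name and leaving other files in place.
    shutil.copytree(static_dir, out_dir / "static", dirs_exist_ok=True)
```

Calling this on every rebuild is safe, which suits a live-reload loop that reruns the build on each file change.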
Vinta Chen
266a6b6b6c simplify: remove redundant _has_description, unused param, merge loops
- Remove `_has_description` which duplicated `_extract_description` logic;
  use truthiness of the description string instead
- Remove unused `resources` parameter from `extract_entries`
- Merge two sequential loops in `parse_readme` into a single pass over
  children to find hr, Resources, and Contributing indices

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 17:41:13 +08:00
Vinta Chen
0f374970dd refactor: extract parsing logic from build.py into readme_parser module
slugify, parse_readme, count_entries, extract_preview, render_content_html,
and related helpers are moved to a dedicated readme_parser module.
build.py now imports from readme_parser rather than defining these inline.
Tests for the removed functions are dropped from test_build.py since they
now live with the module they test.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 17:27:14 +08:00
Vinta Chen
177183d9bd add custom website build system
Replaces MkDocs with a bespoke Python site generator using Jinja2 templates
and Markdown. Adds uv for dependency management, GitHub Actions workflow for
deployment, and Makefile targets for local development (fetch_stars, build,
preview, deploy).

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-18 13:48:49 +08:00