ci(pr-comment-poster): add generic PR comment poster and migrate producers

Adds a stand-alone workflow that posts or updates sticky PR comments on behalf of any analysis workflow, including those triggered by fork PRs. The poster runs on `workflow_run` in the base repo context, which is the standard GitHub-sanctioned way to get a write token on events that originate from untrusted forks without ever checking out fork code.

All validation, GitHub API interaction, and upsert logic live in Tools/ci/pr-comment-poster.py (Python 3 stdlib only, two subcommands: `validate` and `post`). The workflow file itself is a thin orchestrator: sparse-checkout the script, download the pr-comment artifact via github-script, unzip, then invoke the script twice. No inline jq, no inline bash validation, no shell-interpolated marker strings. The sparse-checkout ensures only Tools/ci/pr-comment-poster.py lands in the workspace, never the rest of the repo.

Artifact contract: a producer uploads an artifact named exactly `pr-comment` containing `manifest.json` (with `pr_number`, `marker`, and optional `mode`) and `body.md`. The script validates the manifest (positive integer pr_number, printable-ASCII marker bounded 1..200 chars, UTF-8 body under 60000 bytes, mode in an allowlist), finds any existing comment containing the marker via the comments REST API, and either edits it in place or creates a new one.

The workflow file header documents six security invariants that any future change MUST preserve, most importantly: NEVER check out PR code, NEVER execute anything from the artifact, and treat all artifact contents as opaque data.

Why a generic poster and not `pull_request_target`: `pull_request_target` is the tool people reach for first and the one that most often turns into a supply-chain vulnerability, because it hands a write token to a workflow that is then tempted to check out the PR head. `workflow_run` gives the same write token without any check-out temptation, because the only input is a pre-produced artifact treated as opaque data.
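
As an illustration of the artifact contract described above, here is a minimal sketch of how a producer might assemble the `pr-comment` directory before uploading it. The helper name `write_pr_comment_artifact` and the example marker are hypothetical and not part of this commit; the producers in this commit write the same two files with shell heredocs instead. The limits simply mirror the ones the commit message states (positive integer `pr_number`, 1..200 printable-ASCII marker, body under 60000 bytes).

```python
import json
import tempfile
from pathlib import Path


def write_pr_comment_artifact(out_dir, pr_number, marker, body):
    """Hypothetical producer-side helper (not part of this commit).

    Writes manifest.json and body.md into out_dir, applying the same
    limits the commit message lists for the poster's validation.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(pr_number, bool) or not isinstance(pr_number, int) or pr_number <= 0:
        raise ValueError("pr_number must be a positive integer")
    if not 1 <= len(marker) <= 200:
        raise ValueError("marker must be 1..200 characters")
    if not all(0x20 <= ord(ch) <= 0x7E for ch in marker):
        raise ValueError("marker must be printable ASCII")

    # Put the marker on the first line so later runs can find the comment.
    if marker not in body:
        body = "{}\n{}".format(marker, body)
    encoded = body.encode("utf-8")
    if len(encoded) > 60000:
        raise ValueError("body exceeds 60000 bytes")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = {"pr_number": pr_number, "marker": marker, "mode": "upsert"}
    (out / "manifest.json").write_text(
        json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    (out / "body.md").write_bytes(encoded)
    return out


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = write_pr_comment_artifact(
            Path(tmp) / "pr-comment",
            12345,
            "<!-- pr-comment-poster:example -->",  # assumed marker, for illustration
            "## Example report\n",
        )
        print(sorted(p.name for p in out.iterdir()))
```

A directory produced this way should pass `python3 Tools/ci/pr-comment-poster.py validate pr-comment`, assuming the script behaves as the commit message describes.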

Producer migrations
===================

flash_analysis.yml:
- Drop the fork gate on the `post_pr_comment` job.
- Drop the obsolete TODO pointing at issue #24408 (the fork-comment workflow does not error anymore; it just no-ops).
- Keep the existing "comment only if threshold crossed or previous comment exists" behaviour verbatim. peter-evans/find-comment@v3 stays as a read-only probe (forks can read issue comments just fine); its body-includes is updated to search for the new marker `<!-- pr-comment-poster:flash-analysis -->` instead of the old "FLASH Analysis" heading substring.
- Replace the peter-evans/create-or-update-comment@v4 step with two new steps that write pr-comment/manifest.json and pr-comment/body.md and then upload them as artifact pr-comment. The body markdown is byte-for-byte identical to the previous heredoc, with the marker prepended as the first line so subsequent runs can find it.
- The threshold-or-existing-comment gate is preserved on both new steps. When the gate does not fire no artifact is uploaded and the poster no-ops.

docs-orchestrator.yml (link-check job):
- Drop the fork gate on the sticky-comment step.
- Replace marocchino/sticky-pull-request-comment@v2 with two new steps that copy logs/filtered-link-check-results.md into pr-comment/body.md, write a pr-comment/manifest.json with the marker `<!-- pr-comment-poster:docs-link-check -->`, and upload the directory as artifact pr-comment.
- The prepare step checks `test -s` on the results file and emits a `prepared` step output; the upload step is gated on that output. In practice the existing link-check step always writes a placeholder ("No broken links found in changed files.") into the file when empty, so the guard is defensive but not load-bearing today.
- Tighten the link-check job's permissions from `pull-requests: write` down to `contents: read`; writing PR comments now happens in the poster workflow.

The poster's `workflows` allowlist is seeded with the two active producers: "FLASH usage analysis" and "Docs - Orchestrator". clang-tidy (workflow name "Static Analysis") is not in the list because platisd/clang-tidy-pr-comments posts line-level review comments, a different REST API from issue comments that the poster script does not handle. Extending the poster to cover review comments is a follow-up.

Signed-off-by: Ramon Roche <mrpollo@gmail.com>
2026-05-27 10:17:45 +08:00 · 2026-04-08 22:22:56 -07:00
parent c9f1d2ab0f
commit 8c4b703103
4 changed files with 622 additions and 35 deletions
@@ -213,7 +213,6 @@ jobs:
    if: always() && (github.event_name == 'pull_request')
    permissions:
      contents: read
-      pull-requests: write
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
@@ -281,12 +280,32 @@ jobs:
            > ./logs/link-check-results.md || true
          cat ./logs/link-check-results.md
-      - name: Post PR comment with link check results
+      - name: Prepare pr-comment artifact
-        if: github.event.pull_request.head.repo.full_name == github.repository
+        id: prepare-pr-comment
-        uses: marocchino/sticky-pull-request-comment@v2
+        run: |
          if [ ! -s ./logs/filtered-link-check-results.md ]; then
            echo "No link-check results file; skipping pr-comment artifact."
            echo "prepared=false" >> "$GITHUB_OUTPUT"
            exit 0
          fi
          mkdir -p pr-comment
          cp ./logs/filtered-link-check-results.md pr-comment/body.md
          cat > pr-comment/manifest.json <<EOF
          {
            "pr_number": ${{ github.event.pull_request.number }},
            "marker": "<!-- pr-comment-poster:docs-link-check -->",
            "mode": "upsert"
          }
          EOF
          echo "prepared=true" >> "$GITHUB_OUTPUT"
      - name: Upload pr-comment artifact
        if: steps.prepare-pr-comment.outputs.prepared == 'true'
        uses: actions/upload-artifact@v4
        with:
-          header: flaws
+          name: pr-comment
-          path: ./logs/filtered-link-check-results.md
+          path: pr-comment/
          retention-days: 1
      - name: Upload link check results
        uses: actions/upload-artifact@v4
@@ -93,9 +93,6 @@ jobs:
          echo '${{ steps.bloaty-step.outputs.bloaty-summary-map }}' >> $GITHUB_OUTPUT
          echo "$EOF" >> $GITHUB_OUTPUT
-  # TODO:
-  # This part of the workflow is causing errors for forks. We should find a way to fix this and enable it again for forks.
-  # Track this issue https://github.com/PX4/PX4-Autopilot/issues/24408
  post_pr_comment:
    name: Publish Results
    runs-on: [runs-on,runner=1cpu-linux-x64,image=ubuntu24-full-x64,"run-id=${{ github.run_id }}"]
@@ -105,7 +102,7 @@ jobs:
      V5X-SUMMARY-MAP-PERC: ${{ fromJSON(fromJSON(needs.analyze_flash.outputs.px4_fmu-v5x-bloaty-summary-map).vm-percentage) }}
      V6X-SUMMARY-MAP-ABS: ${{ fromJSON(fromJSON(needs.analyze_flash.outputs.px4_fmu-v6x-bloaty-summary-map).vm-absolute) }}
      V6X-SUMMARY-MAP-PERC: ${{ fromJSON(fromJSON(needs.analyze_flash.outputs.px4_fmu-v6x-bloaty-summary-map).vm-percentage) }}
-    if: github.event.pull_request && github.event.pull_request.head.repo.full_name == github.repository
+    if: github.event.pull_request
    steps:
      - name: Find Comment
        uses: peter-evans/find-comment@v3
@@ -113,14 +110,14 @@ jobs:
        with:
          issue-number: ${{ github.event.pull_request.number }}
          comment-author: 'github-actions[bot]'
-          body-includes: FLASH Analysis
+          body-includes: '<!-- pr-comment-poster:flash-analysis -->'
      - name: Set Build Time
        id: bt
        run: |
          echo "timestamp=$(date +'%Y-%m-%dT%H:%M:%S')" >> $GITHUB_OUTPUT
-      - name: Create or update comment
+      - name: Write pr-comment artifact
        # This can't be moved to the job-level conditions, as GH actions don't allow a job-level if condition to access the env.
        if: |
          steps.fc.outputs.comment-id != '' ||
@@ -128,27 +125,46 @@ jobs:
          env.V5X-SUMMARY-MAP-ABS <= fromJSON(env.MIN_FLASH_NEG_DIFF_FOR_COMMENT) ||
          env.V6X-SUMMARY-MAP-ABS >= fromJSON(env.MIN_FLASH_POS_DIFF_FOR_COMMENT) ||
          env.V6X-SUMMARY-MAP-ABS <= fromJSON(env.MIN_FLASH_NEG_DIFF_FOR_COMMENT)
-        uses: peter-evans/create-or-update-comment@v4
+        run: |
          mkdir -p pr-comment
          cat > pr-comment/manifest.json <<EOF
          {
            "pr_number": ${{ github.event.pull_request.number }},
            "marker": "<!-- pr-comment-poster:flash-analysis -->",
            "mode": "upsert"
          }
          EOF
          cat > pr-comment/body.md <<'PR_COMMENT_BODY_EOF'
          <!-- pr-comment-poster:flash-analysis -->
          ## 🔎 FLASH Analysis
          <details>
            <summary>px4_fmu-v5x [Total VM Diff: ${{ env.V5X-SUMMARY-MAP-ABS }} byte (${{ env.V5X-SUMMARY-MAP-PERC}} %)]</summary>
            ```
            ${{ needs.analyze_flash.outputs.px4_fmu-v5x-bloaty-output }}
            ```
          </details>
          <details>
            <summary>px4_fmu-v6x [Total VM Diff: ${{ env.V6X-SUMMARY-MAP-ABS }} byte (${{ env.V6X-SUMMARY-MAP-PERC }} %)]</summary>
            ```
            ${{ needs.analyze_flash.outputs.px4_fmu-v6x-bloaty-output }}
            ```
          </details>
          **Updated: _${{ steps.bt.outputs.timestamp }}_**
          PR_COMMENT_BODY_EOF
      - name: Upload pr-comment artifact
        if: |
          steps.fc.outputs.comment-id != '' ||
          env.V5X-SUMMARY-MAP-ABS >= fromJSON(env.MIN_FLASH_POS_DIFF_FOR_COMMENT) ||
          env.V5X-SUMMARY-MAP-ABS <= fromJSON(env.MIN_FLASH_NEG_DIFF_FOR_COMMENT) ||
          env.V6X-SUMMARY-MAP-ABS >= fromJSON(env.MIN_FLASH_POS_DIFF_FOR_COMMENT) ||
          env.V6X-SUMMARY-MAP-ABS <= fromJSON(env.MIN_FLASH_NEG_DIFF_FOR_COMMENT)
        uses: actions/upload-artifact@v4
        with:
-          comment-id: ${{ steps.fc.outputs.comment-id }}
+          name: pr-comment
-          issue-number: ${{ github.event.pull_request.number }}
+          path: pr-comment/
-          body: |
+          retention-days: 1
-            ## 🔎 FLASH Analysis
-            <details>
-              <summary>px4_fmu-v5x [Total VM Diff: ${{ env.V5X-SUMMARY-MAP-ABS }} byte (${{ env.V5X-SUMMARY-MAP-PERC}} %)]</summary>
-              ```
-              ${{ needs.analyze_flash.outputs.px4_fmu-v5x-bloaty-output }}
-              ```
-            </details>
-            <details>
-              <summary>px4_fmu-v6x [Total VM Diff: ${{ env.V6X-SUMMARY-MAP-ABS }} byte (${{ env.V6X-SUMMARY-MAP-PERC }} %)]</summary>
-              ```
-              ${{ needs.analyze_flash.outputs.px4_fmu-v6x-bloaty-output }}
-              ```
-            </details>
-            **Updated: _${{ steps.bt.outputs.timestamp }}_**
-          edit-mode: replace
@@ -0,0 +1,147 @@
 name: PR Comment Poster
 # Generic PR comment poster. Any analysis workflow (clang-tidy, flash_analysis,
 # fuzz coverage, SITL perf, etc.) can produce a `pr-comment` artifact and this
 # workflow will post or update a sticky PR comment with its contents. Designed
 # so that analysis jobs running on untrusted fork PRs can still get their
 # results posted back to the PR.
 #
 # ==============================================================================
 # SECURITY INVARIANTS
 # ==============================================================================
 # This workflow runs on `workflow_run` which means it runs in the BASE REPO
 # context with a WRITE token, even when the triggering PR comes from a fork.
 # That is the entire reason it exists, and also the reason it is a loaded
 # footgun. Anyone modifying this file MUST preserve the following invariants:
 #
 #   1. NEVER check out PR code. No `actions/checkout` with a ref. No git clone
 #      of a fork branch. No execution of scripts from the downloaded artifact.
 #      The ONLY things read from the artifact are `manifest.json` and `body.md`,
 #      and both are treated as opaque data (JSON parsed by the poster script
 #      and markdown posted verbatim via the GitHub API).
 #
 #   2. `pr_number` is validated to be a positive integer before use.
 #      `marker` is validated to be printable ASCII only before use. Validation
 #      happens inside Tools/ci/pr-comment-poster.py which is checked out from
 #      the base branch, not from the artifact.
 #
 #   3. The comment body is passed to the GitHub API as a JSON field, never
 #      interpolated into a shell command string.
 #
 #   4. This workflow file lives on the default branch. `workflow_run` only
 #      loads workflow files from the default branch, so a fork cannot modify
 #      THIS file as part of a PR. The fork CAN cause this workflow to fire
 #      by triggering a producer workflow that uploads a `pr-comment` artifact.
 #      That is intended.
 #
 #   5. Apart from the `workflows` allowlist under `on.workflow_run`, the
 #      artifact-name filter (`pr-comment`) is the only gate on which
 #      workflow runs get processed. Any listed workflow that uploads
 #      an artifact named `pr-comment` is trusted to have written the
 #      manifest and body itself, NOT copied fork-controlled content into
 #      them. Producer workflows are responsible for that.
 #
 #   6. `actions/checkout@v4` below uses NO ref (so it pulls the base branch,
 #      the default-branch commit this workflow file was loaded from) AND uses
 #      sparse-checkout to materialize ONLY Tools/ci/pr-comment-poster.py. The
 #      rest of the repo never touches the workspace. This is safe: the only
 #      file the job executes is a base-repo Python script that was reviewed
 #      through normal code review, never anything from the PR.
 #
 # ==============================================================================
 # ARTIFACT CONTRACT
 # ==============================================================================
 # Producers upload an artifact named exactly `pr-comment` containing:
 #
 #   manifest.json:
 #     {
 #       "pr_number": 12345,                                      // required, int > 0
 #       "marker": "<!-- pr-comment-poster:flash-analysis -->",   // required, printable ASCII
 #       "mode": "upsert"                                         // optional, default "upsert"
 #     }
 #
 #   body.md: the markdown content of the comment. Posted verbatim.
 #
 # The `marker` string is used to find an existing comment to update. It MUST
 # be unique per producer (e.g. include the producer name). If no existing
 # comment contains the marker, a new one is created. If the marker is found
 # in an existing comment, that comment is edited in place.
 #
 # Producers MUST write `pr_number` from their own workflow context
 # (`github.event.pull_request.number`) and MUST NOT read it from any
 # fork-controlled source.
 on:
  workflow_run:
    # Producers that may upload a `pr-comment` artifact. When a new producer
    # is wired up, add its workflow name here. Runs of workflows not in this
    # list will never trigger the poster. Every run of a listed workflow will
    # trigger the poster, which will no-op if no `pr-comment` artifact exists.
    workflows:
      - "FLASH usage analysis"
      - "Docs - Orchestrator"
    types:
      - completed
 permissions:
  pull-requests: write
  actions: read
  contents: read
 jobs:
  post:
    name: Post PR Comment
    runs-on: ubuntu-latest
    if: github.event.workflow_run.conclusion != 'cancelled'
    steps:
      # Checkout runs first so the poster script is available AND so that
      # actions/checkout@v4's default clean step does not delete the artifact
      # zip that the next step writes into the workspace. Sparse-checkout
      # restricts the materialized tree to just the poster script.
      - name: Checkout poster script only
        uses: actions/checkout@v4
        with:
          sparse-checkout: |
            Tools/ci/pr-comment-poster.py
          sparse-checkout-cone-mode: false
      - name: Download pr-comment artifact
        id: download
        uses: actions/github-script@v7
        with:
          script: |
            const artifacts = await github.rest.actions.listWorkflowRunArtifacts({
              owner: context.repo.owner,
              repo: context.repo.repo,
              run_id: context.payload.workflow_run.id,
            });
            const match = artifacts.data.artifacts.find(a => a.name === 'pr-comment');
            if (!match) {
              core.info('No pr-comment artifact on this run; nothing to post.');
              core.setOutput('found', 'false');
              return;
            }
            const download = await github.rest.actions.downloadArtifact({
              owner: context.repo.owner,
              repo: context.repo.repo,
              artifact_id: match.id,
              archive_format: 'zip',
            });
            const fs = require('fs');
            fs.writeFileSync('pr-comment.zip', Buffer.from(download.data));
            core.setOutput('found', 'true');
      - name: Unpack artifact
        if: steps.download.outputs.found == 'true'
        run: |
          mkdir -p pr-comment
          unzip -q pr-comment.zip -d pr-comment
      - name: Validate artifact
        if: steps.download.outputs.found == 'true'
        run: python3 Tools/ci/pr-comment-poster.py validate pr-comment
      - name: Upsert sticky comment
        if: steps.download.outputs.found == 'true'
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: python3 Tools/ci/pr-comment-poster.py post pr-comment
@@ -0,0 +1,405 @@
 #!/usr/bin/env python3
 """
 PR comment poster for analysis workflows.
 This script is invoked from the `PR Comment Poster` workflow which runs on
 `workflow_run` in the base repository context. It consumes a `pr-comment`
 artifact produced by an upstream analysis job (clang-tidy, flash_analysis,
 etc.) and posts or updates a sticky PR comment via the GitHub REST API.
 Artifact contract (directory passed on the command line):
  manifest.json
    {
      "pr_number": 12345,                                     (required, int > 0)
      "marker":    "<!-- pr-comment-poster:flash-analysis -->", (required, printable ASCII)
      "mode":      "upsert"                                    (optional, default "upsert")
    }
  body.md
    Markdown comment body, posted verbatim. Must be non-empty and
    <= 60000 bytes (GitHub's hard limit is 65535, we cap under).
 Security: this script is run in a write-token context from a workflow that
 MUST NOT check out PR code. Both manifest.json and body.md are treated as
 opaque data. The marker is validated to printable ASCII only before use.
 Subcommands:
  validate <dir>   Validate that <dir> contains a conforming manifest + body.
  post <dir>       Validate, then upsert a sticky comment on the target PR.
                   Requires env GITHUB_TOKEN and GITHUB_REPOSITORY.
 Python stdlib only. No third-party dependencies.
 """
 import argparse
 import json
 import os
 import sys
 import typing
 import urllib.error
 import urllib.request
 # GitHub hard limit is 65535 bytes. Cap well under to leave headroom for
 # the appended marker line and any future wrapping.
 MAX_BODY_BYTES = 60000
 # Marker length bounds. 1..200 is plenty for an HTML comment tag such as
 # "<!-- pr-comment-poster:flash-analysis -->".
 MARKER_MIN_LEN = 1
 MARKER_MAX_LEN = 200
 ACCEPTED_MODES = ('upsert',)
 GITHUB_API = 'https://api.github.com'
 USER_AGENT = 'px4-pr-comment-poster'
 # ---------------------------------------------------------------------------
 # Validation
 # ---------------------------------------------------------------------------
 def _fail(msg: str) -> typing.NoReturn:
    print('error: {}'.format(msg), file=sys.stderr)
    sys.exit(1)
 def _is_printable_ascii(s):
    # Space (0x20) through tilde (0x7E) inclusive.
    return all(0x20 <= ord(ch) <= 0x7E for ch in s)
 def validate_marker(marker):
    """Validate the marker string.
    The marker is printable ASCII only and bounded in length. The original
    shell implementation also rejected quotes, backticks, and backslashes
    because the value flowed through jq and shell contexts. Now that Python
    owns the handling (the value is only ever used as a substring match in
    comment bodies and as a literal string in JSON request payloads that
    urllib serialises for us) those characters are safe. We keep the
    printable-ASCII and length rules as a belt-and-braces sanity check.
    """
    if not isinstance(marker, str):
        _fail('marker must be a string')
    n = len(marker)
    if n < MARKER_MIN_LEN or n > MARKER_MAX_LEN:
        _fail('marker length out of range ({}..{}): {}'.format(
            MARKER_MIN_LEN, MARKER_MAX_LEN, n))
    if not _is_printable_ascii(marker):
        _fail('marker contains non-printable or non-ASCII character')
 def validate_manifest(directory):
    """Validate <directory>/manifest.json and <directory>/body.md.
    Returns a dict with keys: pr_number (int), marker (str), mode (str),
    body (str, verbatim contents of body.md).
    """
    manifest_path = os.path.join(directory, 'manifest.json')
    body_path = os.path.join(directory, 'body.md')
    if not os.path.isfile(manifest_path):
        _fail('manifest.json missing at {}'.format(manifest_path))
    if not os.path.isfile(body_path):
        _fail('body.md missing at {}'.format(body_path))
    try:
        with open(manifest_path, 'r', encoding='utf-8') as f:
            manifest = json.load(f)
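    except (OSError, ValueError) as e:
        # ValueError covers json.JSONDecodeError as well as the
        # UnicodeDecodeError raised when the file is not valid UTF-8.
        _fail('manifest.json is not valid UTF-8 JSON: {}'.format(e))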
    if not isinstance(manifest, dict):
        _fail('manifest.json must be a JSON object')
    pr_number = manifest.get('pr_number')
    # bool is a subclass of int in Python, so isinstance(True, int) is True.
    # Reject bools explicitly so "true"/"false" in the manifest doesn't silently
    # validate as 1/0 and then either fail upstream or poke the wrong PR.
    if not isinstance(pr_number, int) or isinstance(pr_number, bool):
        _fail('pr_number must be an integer')
    if pr_number <= 0:
        _fail('pr_number must be > 0 (got {})'.format(pr_number))
    marker = manifest.get('marker')
    validate_marker(marker)
    mode = manifest.get('mode', 'upsert')
    if mode not in ACCEPTED_MODES:
        _fail('unsupported mode {!r} (accepted: {})'.format(
            mode, ', '.join(ACCEPTED_MODES)))
    # Read as bytes first so the size check is an honest byte count (matching
    # GitHub's own 65535-byte comment limit) before we pay the cost of decoding.
    try:
        with open(body_path, 'rb') as f:
            body_bytes = f.read()
    except OSError as e:
        _fail('could not read body.md: {}'.format(e))
    if len(body_bytes) == 0:
        _fail('body.md is empty')
    if len(body_bytes) > MAX_BODY_BYTES:
        _fail('body.md too large: {} bytes (max {})'.format(
            len(body_bytes), MAX_BODY_BYTES))
    # Require UTF-8 up front so a producer that wrote a garbage encoding fails
    # here rather than later inside json.dumps with a less obvious traceback.
    try:
        body = body_bytes.decode('utf-8')
    except UnicodeDecodeError as e:
        _fail('body.md is not valid UTF-8: {}'.format(e))
    return {
        'pr_number': pr_number,
        'marker': marker,
        'mode': mode,
        'body': body,
    }
 # ---------------------------------------------------------------------------
 # GitHub HTTP helpers
 # ---------------------------------------------------------------------------
 def _github_request(method, url, token, json_body=None):
    """Perform a single GitHub REST request.
    Returns a tuple (parsed_json_or_none, headers_dict). Raises RuntimeError
    with the server response body on HTTP errors so CI logs show what
    GitHub complained about.
    """
    data = None
    headers = {
        'Authorization': 'Bearer {}'.format(token),
        'Accept': 'application/vnd.github+json',
        # Pin the API version so GitHub deprecations don't silently change
        # the response shape under us.
        'X-GitHub-Api-Version': '2022-11-28',
        'User-Agent': USER_AGENT,
    }
    if json_body is not None:
        data = json.dumps(json_body).encode('utf-8')
        headers['Content-Type'] = 'application/json; charset=utf-8'
    req = urllib.request.Request(url, data=data, method=method, headers=headers)
    try:
        with urllib.request.urlopen(req) as resp:
            raw = resp.read()
            # HTTPMessage is case-insensitive on lookup but its items() preserves
            # the original case. GitHub sends "Link" with a capital L, which is
            # what _parse_next_link expects.
            resp_headers = dict(resp.headers.items())
            if not raw:
                return None, resp_headers
            return json.loads(raw.decode('utf-8')), resp_headers
    except urllib.error.HTTPError as e:
        # GitHub error bodies are JSON with a "message" field and often a
        # "documentation_url". Dump the raw body into the exception so the CI
        # log shows exactly what the API objected to. A bare "HTTP 422"
        # tells us nothing useful.
        try:
            err_body = e.read().decode('utf-8', errors='replace')
        except Exception:
            err_body = '(no body)'
        raise RuntimeError(
            'GitHub API {} {} failed: HTTP {} {}\n{}'.format(
                method, url, e.code, e.reason, err_body))
    except urllib.error.URLError as e:
        # Network layer failure (DNS, TLS, connection reset). No HTTP response
        # to parse; just surface the transport reason.
        raise RuntimeError(
            'GitHub API {} {} failed: {}'.format(method, url, e.reason))
 def _parse_next_link(link_header):
    """Return the URL for rel="next" from an RFC 5988 Link header, or None.
    The Link header is comma-separated entries of the form:
      <https://...?page=2>; rel="next", <https://...?page=5>; rel="last"
    We walk each entry and return the URL of the one whose rel attribute is
    "next". Accept single-quoted rel values for robustness even though GitHub
    always emits double quotes.
    """
    if not link_header:
        return None
    for part in link_header.split(','):
        segs = part.strip().split(';')
        if len(segs) < 2:
            continue
        url_seg = segs[0].strip()
        if not (url_seg.startswith('<') and url_seg.endswith('>')):
            continue
        url = url_seg[1:-1]
        for attr in segs[1:]:
            attr = attr.strip()
            if attr == 'rel="next"' or attr == "rel='next'":
                return url
    return None
 def github_api(method, path, token, json_body=None):
    """GET/POST/PATCH a single GitHub API path. Non-paginated."""
    url = '{}/{}'.format(GITHUB_API.rstrip('/'), path.lstrip('/'))
    body, _ = _github_request(method, url, token, json_body=json_body)
    return body
 def github_api_paginated(path, token):
    """GET a GitHub API path and follow rel="next" Link headers.
    Yields items from each page's JSON array.
    """
    url = '{}/{}'.format(GITHUB_API.rstrip('/'), path.lstrip('/'))
    # GitHub defaults to per_page=30. Bump to 100 (the max) so a PR with a
    # few hundred comments fetches in 3 or 4 round-trips instead of 10+.
    sep = '&' if '?' in url else '?'
    url = '{}{}per_page=100'.format(url, sep)
    while url is not None:
        body, headers = _github_request('GET', url, token)
        if body is None:
            return
        if not isinstance(body, list):
            raise RuntimeError(
                'expected JSON array from {}, got {}'.format(
                    url, type(body).__name__))
        for item in body:
            yield item
        url = _parse_next_link(headers.get('Link'))
 # ---------------------------------------------------------------------------
 # Comment upsert
 # ---------------------------------------------------------------------------
 def find_existing_comment_id(token, repo, pr_number, marker):
    """Return the id of the first PR comment whose body contains marker, or None.
    PR comments are issue comments in GitHub's data model, so we hit
    /issues/{n}/comments rather than /pulls/{n}/comments (the latter only
    returns review comments tied to specific code lines, which is not what
    we want). The match is a plain substring check against the comment body;
    the marker is expected to be an HTML comment that will not accidentally
    appear in user-written prose.
    """
    path = 'repos/{}/issues/{}/comments'.format(repo, pr_number)
    for comment in github_api_paginated(path, token):
        body = comment.get('body') or ''
        if marker in body:
            return comment.get('id')
    return None
 def build_final_body(body, marker):
    """Append the marker to body if not already present.
    If the caller already embedded the marker (e.g. inside a hidden HTML
    comment anywhere in their body) we leave the body alone; otherwise we
    rstrip trailing newlines and append the marker on its own line after a
    blank-line separator. Trailing-newline stripping keeps the output from
    accumulating extra blank lines every time an existing comment is
    re-rendered and re-posted.
    """
    if marker in body:
        return body
    return '{}\n\n{}\n'.format(body.rstrip('\n'), marker)
 def upsert_comment(token, repo, pr_number, marker, body):
    final_body = build_final_body(body, marker)
    existing_id = find_existing_comment_id(token, repo, pr_number, marker)
    if existing_id is not None:
        print('Updating comment {} on PR #{}'.format(existing_id, pr_number))
        github_api(
            'PATCH',
            'repos/{}/issues/comments/{}'.format(repo, existing_id),
            token,
            json_body={'body': final_body},
        )
    else:
        print('Creating new comment on PR #{}'.format(pr_number))
        github_api(
            'POST',
            'repos/{}/issues/{}/comments'.format(repo, pr_number),
            token,
            json_body={'body': final_body},
        )
 # ---------------------------------------------------------------------------
 # Entry points
 # ---------------------------------------------------------------------------
 def cmd_validate(args):
    result = validate_manifest(args.directory)
    print('ok: pr_number={} marker_len={} mode={} body_bytes={}'.format(
        result['pr_number'],
        len(result['marker']),
        result['mode'],
        len(result['body'].encode('utf-8')),
    ))
    return 0
 def cmd_post(args):
    result = validate_manifest(args.directory)
    # GITHUB_TOKEN is provided by the workflow via env; GITHUB_REPOSITORY is
    # auto-set on every Actions runner. Both are required here because a local
    # developer running the script directly won't have either unless they
    # export them, and we want a clear error in that case.
    token = os.environ.get('GITHUB_TOKEN')
    if not token:
        _fail('GITHUB_TOKEN is not set')
    repo = os.environ.get('GITHUB_REPOSITORY')
    if not repo:
        _fail('GITHUB_REPOSITORY is not set (expected "owner/name")')
    # Minimal shape check. If "owner/name" is malformed the subsequent API
    # calls would 404 with an unhelpful URL. Fail fast here instead.
    if '/' not in repo:
        _fail('GITHUB_REPOSITORY must be "owner/name", got {!r}'.format(repo))
    try:
        upsert_comment(
            token=token,
            repo=repo,
            pr_number=result['pr_number'],
            marker=result['marker'],
            body=result['body'],
        )
    except RuntimeError as e:
        _fail(str(e))
    return 0
 def main(argv=None):
    parser = argparse.ArgumentParser(
        description='Validate and post sticky PR comments from CI artifacts.',
    )
    sub = parser.add_subparsers(dest='command', required=True)
    p_validate = sub.add_parser(
        'validate',
        help='Validate manifest.json and body.md in the given directory.',
    )
    p_validate.add_argument('directory')
    p_validate.set_defaults(func=cmd_validate)
    p_post = sub.add_parser(
        'post',
        help='Validate, then upsert a sticky PR comment. Requires env '
             'GITHUB_TOKEN and GITHUB_REPOSITORY.',
    )
    p_post.add_argument('directory')
    p_post.set_defaults(func=cmd_post)
    args = parser.parse_args(argv)
    return args.func(args)
 if __name__ == '__main__':
    sys.exit(main())