# Copilot Agent

A TypeScript-based application for interacting with GitHub Copilot models, featuring a web-based portal.

## Prerequisites

- [GitHub Copilot CLI](https://docs.github.com/en/copilot/how-tos/set-up/install-copilot-cli) must be installed and authenticated
- Node.js 24+ and Yarn
- PowerShell 7

**IMPORTANT**: Upgrade both the Copilot CLI and the SDK to the latest version before use.

## Setup

From the `.github/Agent` directory, run:

```bash
yarn install
```

## Build

To build and test the application:

```bash
yarn build
```

This runs three steps in sequence:

1. `yarn compile`: compiles all TypeScript packages
2. `yarn testStart`: starts the portal server in the background
3. `yarn testExecute`: runs all API and Playwright tests, then stops the server

The server is always stopped via `api/stop` after tests, regardless of pass/fail.

To compile only (no tests):

```bash
yarn compile
```

## Usage

### Portal (Web UI)

```bash
yarn portal
```

Starts an HTTP server at `http://localhost:8888` serving a web UI and RESTful API.

- Open `http://localhost:8888` to launch the portal.
- Select a model and working directory, then click Start.
- Send requests via the text box (Ctrl+Enter or Send button).
- Session responses (reasoning, tool calls, messages) appear as collapsible message blocks.
- "Close Session" ends the session and closes the tab; "Stop Server" also shuts down the server.
- Use `http://localhost:8888/index.html?project=XXX` to default the working directory to the sibling folder `XXX` next to the repo root.
- Without the `project` parameter, the working directory defaults to the repo root.

#### CLI Options

- `--port`: Set the HTTP server port (default: `8888`)
- `--test`: Start in test mode (entry is not pre-installed; enables the `copilot/test/installJobsEntry` API for test-driven entry loading)

### Portal for Test

```bash
yarn portal-for-test
```

Starts the portal in test mode (`--test` flag).
In test mode, the jobs entry is not installed at startup, and the `copilot/test/installJobsEntry` API is available for tests to install entries dynamically.

## Specification Structure

Two folders store the specification:

- `prompts/snapshot`: The specification that the project currently implements; it reflects the current state.
- `prompts/spec`: The specification that the project needs to meet.

File organization in these two folders is identical:

- `CopilotPortal` folder: about `packages/CopilotPortal`
  - `JobsData.md`: Definitions of jobs data.
  - `API.md`: RESTful API, and how to start the project. More specific content is in these files:
    - `API_Session.md`
    - `API_Task.md`
    - `API_Job.md`
  - `Index.md`: index.html page.
  - `Jobs.md`: jobs.html and jobTracking.html pages.
  - `Test.md`: test.html page.
  - `Shared.md`: Shared components between multiple web pages.

## Project Structure

```
.github/Agent/
├── package.json                        # Workspace configuration (yarn workspaces)
├── tsconfig.json                       # Base TypeScript configuration
├── packages/
│   ├── CopilotPortal/                  # Web UI + RESTful API server
│   │   ├── src/
│   │   │   ├── copilotSession.ts       # Copilot SDK session wrapper
│   │   │   ├── copilotApi.ts           # Copilot session API routes, token endpoint, and helpers
│   │   │   ├── taskApi.ts              # Task execution engine and task API routes
│   │   │   ├── jobsApi.ts              # Job API routes and job execution
│   │   │   ├── jobsDef.ts              # Jobs/tasks data definitions and validation
│   │   │   ├── jobsChart.ts            # Flow chart graph generation from work trees
│   │   │   ├── jobsData.ts             # Preloaded jobs/tasks data
│   │   │   ├── sharedApi.ts            # Shared HTTP utilities and token-based live entity state management
│   │   │   └── index.ts                # HTTP server, API routing, static files, entry management
│   │   ├── assets/                     # Static website files
│   │   │   ├── index.html              # Main portal page
│   │   │   ├── index.js                # Portal JS (session interaction, live polling)
│   │   │   ├── index.css               # Portal styles
│   │   │   ├── jobs.html               # Jobs selection page
│   │   │   ├── jobs.js                 # Jobs page JS (matrix rendering, job selection)
│   │   │   ├── jobs.css                # Jobs page styles
│   │   │   ├── jobTracking.html        # Job tracking page
│   │   │   ├── jobTracking.js          # Job tracking JS (logic + renderer dispatch)
│   │   │   ├── jobTracking.css         # Job tracking styles
│   │   │   ├── flowChartMermaid.js     # Mermaid flow chart renderer
│   │   │   ├── messageBlock.js         # MessageBlock component
│   │   │   ├── messageBlock.css        # MessageBlock styles
│   │   │   ├── sessionResponse.js      # SessionResponseRenderer component
│   │   │   ├── sessionResponse.css     # SessionResponseRenderer styles
│   │   │   └── test.html               # Simple API test page
│   │   ├── test/                       # Test files
│   │   │   ├── startServer.mjs         # Starts server in test mode for testing
│   │   │   ├── runTests.mjs            # Test runner (always stops server)
│   │   │   ├── testEntry.json          # Test entry with simple tasks/jobs for API tests
│   │   │   ├── jobsData.test.mjs       # Jobs data validation tests
│   │   │   ├── api.test.mjs            # RESTful API tests (incl. task/job execution)
│   │   │   ├── work.test.mjs           # Work tree execution tests
│   │   │   ├── web.test.mjs            # Playwright UI tests (test.html)
│   │   │   ├── web.index.mjs           # Playwright tests for index.html
│   │   │   ├── web.jobs.mjs            # Playwright tests for jobs.html and jobTracking.html
│   │   │   └── windowsHide.cjs         # Windows process hiding helper
│   │   └── package.json
```

## Maintaining the Project

- **Build**: `yarn build` compiles all packages and runs tests.
- **Compile only**: `yarn compile` compiles all packages via TypeScript.
- **Run portal**: `yarn portal` starts the web server (default port 8888).
- **Run portal in test mode**: `yarn portal-for-test` starts in test mode.
- **Run tests only**: `yarn testStart && yarn testExecute` starts the server in test mode and runs tests.
- **Playwright**: Install with `npx playwright install chromium`. Used for testing the portal UI.
- **Spec-driven**: Portal features are defined in `prompts/spec/CopilotPortal/`.
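
The two portal commands above differ only in the flags they pass to the server. As a rough illustration of how `--port` and `--test` could be interpreted, here is a minimal sketch. The function `parsePortalArgs` and the `PortalOptions` shape are hypothetical and are not taken from `index.ts`; the sketch also assumes the port value follows `--port` as a separate argument.

```typescript
// Hypothetical option shape; the real server may store these differently.
interface PortalOptions {
  port: number;      // HTTP port, 8888 unless overridden
  testMode: boolean; // true when --test is present
}

// Parses an argument list such as ["--port", "9000", "--test"].
// Assumption: the port value is the argument immediately after --port.
// In a Node.js entry point this could be fed with process.argv.slice(2).
function parsePortalArgs(args: string[]): PortalOptions {
  const options: PortalOptions = { port: 8888, testMode: false };
  for (let i = 0; i < args.length; i++) {
    if (args[i] === "--test") {
      options.testMode = true;
    } else if (args[i] === "--port") {
      const value = Number(args[i + 1]);
      if (!Number.isInteger(value) || value < 1 || value > 65535) {
        throw new Error(`Invalid --port value: ${args[i + 1]}`);
      }
      options.port = value;
      i++; // skip the consumed port value
    }
  }
  return options;
}
```

Unknown arguments are ignored in this sketch; a stricter parser might reject them instead.
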
## Features

- **Web Portal**: Browser-based UI for Copilot sessions with real-time streaming
- **Message Blocks**: User, Reasoning, Tool, and Message blocks with expand/collapse behavior
- **Markdown Rendering**: Completed message blocks (except Tool) render markdown content as formatted HTML using marked.js
- **Awaiting Status**: "Awaits responses ..." indicator shown in the session part while the agent is working
- **Lazy CopilotClient**: Client starts on demand and closes when the server shuts down
- **Multiple Sessions**: Supports parallel sessions sharing a single CopilotClient
- **Live Polling**: Token-based sequential long-polling for real-time session/task/job callbacks. Clients acquire a token via `api/token`, then poll `live/{token}` endpoints. Responses are stored in a list with per-token reading positions, enabling multiple consumers to independently read the same response history
- **Task System**: Job/task execution engine with availability checks, criteria validation, and retry logic
- **Session Crash Retry**: `sendMonitoredPrompt` (private method on `CopilotTaskImpl`) automatically retries if a Copilot session crashes during prompt execution, creating new sessions when needed. Driving sessions use the `entry.drivingSessionRetries` budget with multi-model fallback; task sessions retry up to 5 times with the same model.
- **Detailed Error Reporting**: `errorToDetailedString` helper converts errors to detailed JSON with name, message, stack, and recursive cause chain for comprehensive crash diagnostics
- **Jobs API**: RESTful API for listing, starting, stopping, and monitoring tasks and jobs via live polling
- **Running Jobs API**: `copilot/job/running` lists all running or recently finished (within an hour) jobs with name, status, and start time; `copilot/job/{job-id}/status` returns detailed job status including per-task statuses
- **Running Jobs List**: The portal home page displays a list of running/recent jobs with name, status, and time, auto-loaded on page load. A "Refresh" button updates the list, and each item has a "View" button to inspect the job in the tracking page
- **Initial Job Status Loading**: Job tracking page loads initial status via `copilot/job/{job-id}/status` on page load, applying task status indicators to the flow chart before live polling begins
- **Live Polling Drain**: Live APIs (session/task/job) use a drain model: clients continue polling until receiving terminal `*Closed` or `*NotFound` errors, ensuring all buffered responses are consumed. Each entity has a configurable countdown (1 minute normal, 5 seconds in test mode) after closing, during which new tokens can still join and read history
- **Closed State Management**: Sessions, tasks, and jobs use `LiveEntityState` with token-based lifecycle management: entities transition through open/closed states with countdown periods before cleanup, preventing lost data
- **Test Mode API**: `copilot/test/installJobsEntry` endpoint (test mode only) for dynamically installing job entries during testing
- **Job Workflow Engine**: Composable work tree execution supporting sequential, parallel, loop, and conditional (alt) work patterns
- **Task Selection UI**: Combo box in the portal to select and run tasks within an active session
- **Tool Registration**: Custom job tools (e.g. `job_boolean_true`, `job_prepare_document`) are automatically registered with Copilot sessions
- **Flow Chart Renderers**: Job tracking page uses Mermaid for declarative flowchart rendering
- **Job Status Tracking**: Live polling of job execution status with visual status bar (RUNNING/SUCCEEDED/FAILED/CANCELED) and Stop Job button
- **Flow Chart Status Indicators**: Running tasks display a green triangle indicator; succeeded tasks display a green tick indicator; failed tasks display a red cross indicator on the flow chart
- **Task Inspection**: Clicking a TaskNode in the flow chart opens a tab control showing session responses for that task's sessions
- **Job Preview Mode**: Job tracking page supports a preview mode (no jobId) showing the flow chart without tracking, with no Stop Job button and "JOB: PREVIEW" status
- **Job-Created Tasks**: Jobs create tasks in managed session mode; `startTask` manages driving session creation internally. Task live API provides real-time session updates with `sessionId` and `isDriving` fields
- **Task Decision Reporting**: `taskDecision` callback reports all driving session decisions with categorized prefixes (`[SESSION STARTED]`, `[OPERATION]`, `[CRITERIA]`, `[AVAILABILITY]`, `[SESSION CRASHED]`, `[TASK SUCCEEDED]`, `[TASK FAILED]`, `[DECISION]`) as User message blocks in the driving session tab
- **Driving Session Consolidation**: All driving sessions for a task are consolidated into a single "Driving" tab; when a driving session is replaced (e.g., due to crash retry), the new session reuses the same tab and renderer
- **Borrowing Session Mode**: Tasks can run with an externally-provided session (borrowing mode); crashes in borrowing mode fail immediately without retry
- **Managed Session Mode**: Tasks in managed mode create their own sessions: single model mode reuses one session, multiple models mode creates ephemeral sessions per mission
- **Task Stopping via TaskStoppedError**: When a task is stopped, `guardedSendRequest` (a private helper wrapping `session.sendRequest`) throws `TaskStoppedError` if the task's `stopped` flag is set. Retry logic in `sendMonitoredPrompt` checks `this.stopped` after catching non-`TaskStoppedError` exceptions and converts them to `TaskStoppedError`, ensuring immediate task termination without unnecessary session replacement. Stopping a job also stops all its running tasks and emits a `jobCanceled` callback.
- **Separated Retry Budgets**: Driving session crash retries use `entry.drivingSessionRetries` with multi-model fallback per driving mission; task session crash retries are per-call (5 max in `sendMonitoredPrompt`); criteria retries are per failure action loop. A crash exhausting its per-call budget during a criteria retry loop is treated as a failed iteration rather than killing the task
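
The Live Polling feature in the list above stores responses once and tracks a separate reading position per token. The sketch below shows one way such a structure could look. It is an assumption-laden illustration, not the portal's code: the class `LiveResponseLog` and its method names are hypothetical, and `readNext` returns immediately instead of holding the request open as a real long-polling endpoint would.

```typescript
// Hypothetical in-memory store: one shared response history plus a read
// cursor per token. Names are illustrative, not the portal's own.
class LiveResponseLog<T> {
  private readonly responses: T[] = [];
  private readonly positions = new Map<string, number>();
  private nextTokenId = 1;

  // Issues a new token whose cursor starts at the beginning of the history.
  acquireToken(): string {
    const token = `token-${this.nextTokenId++}`;
    this.positions.set(token, 0);
    return token;
  }

  // Appends a response to the shared history.
  push(response: T): void {
    this.responses.push(response);
  }

  // Returns the next unread response for this token, or undefined when the
  // token has caught up. Throws for tokens that were never issued.
  readNext(token: string): T | undefined {
    const position = this.positions.get(token);
    if (position === undefined) {
      throw new Error(`Unknown token: ${token}`);
    }
    if (position >= this.responses.length) {
      return undefined;
    }
    this.positions.set(token, position + 1);
    return this.responses[position];
  }
}

// Usage: two consumers read the same history independently.
const log = new LiveResponseLog<string>();
const first = log.acquireToken();
log.push("reasoning");
log.push("message");
const late = log.acquireToken(); // joins late, still reads from the start
log.readNext(first); // "reasoning"
log.readNext(late);  // "reasoning"
```

Because each token only advances its own cursor, a consumer that joins during the post-close countdown can still replay the full history without affecting others.
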