Coding Agents in Existing Repositories: Onboarding a Delegate to the Codebase

Coding agents are most impressive when they edit real software instead of toy examples. They can inspect a repository, trace a bug, draft a patch, run tests, and prepare a review note faster than a person could move through the same mechanical steps. They are also most fragile in real repositories, because real repositories contain history, conventions, partial migrations, generated files, local scripts, ownership boundaries, flaky tests, and user changes that must not be overwritten.

Onboarding a coding agent is the practice of giving it enough project shape to work safely without flooding it with every file and every custom rule. The goal is not to teach the agent the entire codebase in one prompt. The goal is to help it build a correct working set, understand the local change discipline, and produce a patch that a reviewer can inspect without first cleaning up avoidable mess.

This topic is a concrete case of AI Agent Context Windows and Working Sets. A repository is too large to treat as one flat context. It also depends on AI Agent Dependency Hygiene because a coding agent cannot validate work if the project setup, package manager, test commands, and generated files are unclear.

The First Job Is Orientation

A coding agent should begin by learning the shape of the task and the shape of the repo. Those are different things. The task might be a bug fix, a feature, a refactor, a migration, a test addition, or a review response. The repository might be a monolith, a package workspace, a static site, a mobile app, a service, a library, or a pile of scripts. The same instruction means different work in each setting.

Orientation should be concrete. The agent needs to know where the relevant code probably lives, how the project names things, which files are generated, which directories are off limits, and how tests or checks are usually run. It should inspect before editing. A quick search for the feature name, error message, route, component, function, or test fixture often prevents a much larger mistake later.

This is not a call for long preliminary essays. The useful orientation is short and evidence based. The agent should be able to say which files look relevant and why. If it cannot find the path, that is information. If it finds multiple similar implementations, that is a sign to pause or narrow the assignment. Early orientation protects the codebase from a common agent failure: editing the first plausible file while the real convention lives nearby.
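As one illustration of short, evidence-based orientation, the sketch below searches a repository for a term and returns the files and lines where it appears. The names `find_candidates` and `SKIP_DIRS`, and the list of skipped directories, are assumptions made for this example and not part of any particular tool; a real repository would supply its own list.

```python
import os
from pathlib import Path

# Hypothetical directories to skip; adjust to the repository at hand.
SKIP_DIRS = {".git", "node_modules", "dist", "build", "vendor"}


def find_candidates(root, term, max_hits=20):
    """Return (relative path, line number, stripped line) for lines containing term."""
    root = Path(root)
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            for lineno, line in enumerate(text.splitlines(), start=1):
                if term in line:
                    rel = path.relative_to(root).as_posix()
                    hits.append((rel, lineno, line.strip()))
                    if len(hits) >= max_hits:
                        return hits
    return hits
```

An empty result is itself a finding, and several hits in similar-looking files are a cue to narrow the assignment before editing.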

Local Conventions Beat General Style

Models have strong general knowledge about software style. That is useful, but it is not a substitute for repository conventions. A project may prefer one state management pattern, one error handling style, one routing structure, one test helper, one file naming scheme, or one way to express feature flags. A patch that is elegant in isolation can be wrong for the repo if it ignores those conventions.

The agent should read neighboring code before inventing an abstraction. If a component has three siblings with the same pattern, the fourth should probably follow that pattern. If tests use a local fixture helper, the new test should use it. If API clients are wrapped in a project service, the agent should not import a raw dependency just because it knows how. Code review becomes easier when the patch feels native.

AI Agent Tool Contracts applies here in a practical way. Tools that expose project guides, ownership notes, focused test commands, and code search results help the agent find local truth. A coding agent with only a shell can still work, but it will spend more effort rediscovering conventions and may miss the scripts the team already built to guide changes.

The Dirty Worktree Is Part of Reality

Real development work often starts in a dirty worktree. A user may have local edits, generated files may have changed, or another worker may be editing nearby files. A coding agent must treat existing changes as part of the environment, not as clutter to reset away. If it overwrites user work, the patch may be technically correct and still unacceptable.

The onboarding habit is simple: inspect status before editing and keep the change scope visible. If a relevant file already has changes, the agent should understand them before adding more. If unrelated files are dirty, the agent should ignore them. If the task cannot be completed without touching a file someone else is changing, the agent should call that out rather than silently mixing work.
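The status check can be sketched as follows, assuming the repository uses Git and that `git status --porcelain` (version 1 format) is available. The helper names are hypothetical, and the parser deliberately ignores edge cases such as quoted paths with special characters.

```python
import subprocess


def parse_porcelain(output):
    """Parse `git status --porcelain` (v1) text into (status, path) pairs."""
    entries = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # not a status line
        status, path = line[:2], line[3:]
        # Renames and copies are reported as "old -> new"; keep the new path.
        if ("R" in status or "C" in status) and " -> " in path:
            path = path.split(" -> ", 1)[1]
        entries.append((status, path))
    return entries


def split_by_task(entries, task_paths):
    """Separate dirty files the task will touch from ones to leave alone."""
    overlapping = [p for _, p in entries if p in task_paths]
    unrelated = [p for _, p in entries if p not in task_paths]
    return overlapping, unrelated


def read_status(repo_dir="."):
    """Run git status in repo_dir; requires git on PATH."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

Files in the overlapping list are the ones to read and understand before adding edits; files in the unrelated list are left untouched.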

This connects to AI Agent Sandboxes. The safest place for a coding agent is a controlled branch or workspace where its edits can be isolated and reviewed. But even inside a sandbox, there may be prior work. The agent’s responsibility is not only to avoid production harm. It is to preserve the human’s local state.

Tests Are Evidence, Not Ritual

A coding agent should not run tests merely to decorate the final answer. Tests are evidence about the patch. The useful question is which checks fit the change. A small parser fix may need a focused unit test. A UI behavior change may need a component test or a manual verification note when automation is unavailable. A dependency update may need install, type checks, and a targeted runtime check. A documentation-only change may need no code test, but it may still need link or build validation.

The agent should also know when not to overclaim. If a command fails because the environment is missing a service, that is not the same as a failing test. If the focused tests pass but the full suite was not run, the handoff should say so. If tests are known to be flaky, the agent should avoid turning a single pass into a broad guarantee. AI Agent Output Verification is the wider discipline; coding agents apply it through commands, diffs, and honest reporting.

Good test evidence includes the command, the result, and the reason the command was chosen. “Tests passed” is weaker than a specific statement that the relevant unit suite passed after the patch. If no test could be run, the handoff should name the gap and describe any static or manual check that was performed instead.
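A minimal sketch of that record, assuming checks are run as subprocesses: it stores the command, the reason it was chosen, the exit code, and the tail of the output. `CheckEvidence` and `run_check` are illustrative names, not an existing API.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckEvidence:
    command: list      # the exact command that was run
    reason: str        # why this check fits the change
    exit_code: int     # raw exit status of the process
    output_tail: str   # last lines of combined stdout and stderr

    def summary(self):
        if self.exit_code == 0:
            outcome = "passed"
        else:
            outcome = f"failed (exit {self.exit_code})"
        return f"{' '.join(self.command)}: {outcome}. Chosen because {self.reason}"


def run_check(command, reason, tail_lines=5):
    """Run a command and capture enough detail to report it honestly."""
    result = subprocess.run(command, capture_output=True, text=True)
    lines = (result.stdout + result.stderr).splitlines()
    return CheckEvidence(command, reason, result.returncode, "\n".join(lines[-tail_lines:]))


if __name__ == "__main__":
    evidence = run_check([sys.executable, "-c", "print('ok')"], "it is a smoke check")
    print(evidence.summary())
```

A nonzero exit code does not say whether the test failed or the environment was missing something. The output tail is kept so that the agent or the reviewer can make that distinction.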

Scope Creep Looks Like Helpfulness

Coding agents often drift because they see nearby improvements. They may clean formatting, rename variables, upgrade dependencies, reorganize files, fix unrelated warnings, or simplify a neighboring function. Some of those changes may be good. They are still harmful when they hide the real patch, increase review burden, or create risk outside the assignment.

The onboarding instruction should make scope a first-class rule. The agent should prefer the smallest change that satisfies the task and local conventions. It should leave unrelated refactors for separate work. If it discovers an adjacent defect, it can mention it in the handoff. That is usually better than mixing it into the current patch.
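One way to make scope checkable is to compare the files a patch touches against the directories the task was expected to touch. The sketch below assumes POSIX-style relative paths, such as those Git reports, and the function name `out_of_scope` is hypothetical.

```python
from pathlib import PurePosixPath


def out_of_scope(changed_files, allowed_dirs):
    """Return changed files that fall outside every allowed directory."""
    allowed = [PurePosixPath(d) for d in allowed_dirs]
    flagged = []
    for name in changed_files:
        path = PurePosixPath(name)
        # Compare path components, not string prefixes, so that
        # "src/parser_old" is not mistaken for "src/parser".
        if not any(path == a or a in path.parents for a in allowed):
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    changed = ["src/parser/names.py", "src/ui/button.tsx", "package-lock.json"]
    print(out_of_scope(changed, ["src/parser"]))
```

A flagged file is not automatically wrong. It is a prompt to either justify the change in the handoff or move it to separate work.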

This is a direct application of AI Agent Task Decomposition. A bug reproduction task, a patch task, a cleanup task, and a test expansion task may be related, but they are not always the same task. Splitting them keeps review manageable and makes rollback easier if one part proves wrong.

Generated Files and Dependencies Need Explicit Rules

Many repositories contain files that should not be edited by hand. Generated clients, lockfiles, compiled assets, snapshots, schema outputs, and vendor directories each have their own rules. A coding agent that edits generated output directly may appear to fix a symptom while leaving the source unchanged. A coding agent that updates a dependency without understanding the package manager may create churn unrelated to the requested fix.

The agent should learn which files are source and which are products of a tool. It should prefer running the generator when the repo expects generated files to change. It should avoid lockfile edits unless the task or package manager action requires them. It should treat dependency changes as behavior changes, not as incidental cleanup.
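The source-versus-product distinction can be written down as explicit patterns. The pattern list below is an assumption chosen for illustration; each repository would need its own, ideally taken from its contributor guide or build configuration.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for files produced by tools rather than edited by hand.
GENERATED_PATTERNS = [
    "*.lock",
    "package-lock.json",
    "dist/*",
    "vendor/*",
    "*_pb2.py",
    "*.snap",
]


def classify(path, patterns=GENERATED_PATTERNS):
    """Label a repository-relative path as 'generated' or 'source'."""
    # fnmatchcase treats "*" as matching any characters, including "/",
    # so "dist/*" also covers nested files under dist/.
    if any(fnmatchcase(path, pattern) for pattern in patterns):
        return "generated"
    return "source"


def generated_in_patch(changed_files, patterns=GENERATED_PATTERNS):
    """List the changed files that look like tool output."""
    return [f for f in changed_files if classify(f, patterns) == "generated"]
```

If `generated_in_patch` returns anything, the question for the handoff is whether those files changed because a generator or package manager was run, or because they were edited directly.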

AI Agent Dependency Hygiene is especially important for coding work because software state is layered. A test result depends on installed packages, environment variables, system binaries, service mocks, and local caches. The agent’s handoff should make those assumptions visible enough that a reviewer can judge whether the validation means what it claims.

The Handoff Should Read Like a Review Entry

A coding agent’s final answer should help a reviewer inspect the patch. It should name the changed files, the behavioral change, the relevant checks, and any remaining risk. It should avoid a long narrative of every command unless that detail matters. It should not pretend that a change is merged, deployed, or safe in production merely because it exists in the workspace.

The handoff should also separate facts from inferences. “I changed the parser to reject empty names” is a fact if the diff shows it. “This should fix all onboarding failures” may be an inference unless the original failure was reproduced and covered. The reviewer needs that distinction. Human Review for AI Agents is easier when the agent gives the reviewer a narrow, truthful map of the work.
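The separation of facts from inferences can be built into the shape of the handoff itself. The following sketch uses a hypothetical `Handoff` record whose section titles are illustrative; the example strings echo the parser case above.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    changed_files: list
    facts: list = field(default_factory=list)       # claims the diff itself shows
    inferences: list = field(default_factory=list)  # claims not yet verified
    checks: list = field(default_factory=list)      # commands run and their results
    not_done: list = field(default_factory=list)    # gaps and things left alone

    def render(self):
        """Render non-empty sections as a short plain-text note."""
        sections = [
            ("Changed files", self.changed_files),
            ("Facts (visible in the diff)", self.facts),
            ("Inferences (not verified)", self.inferences),
            ("Checks run", self.checks),
            ("Not done", self.not_done),
        ]
        lines = []
        for title, items in sections:
            if not items:
                continue
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in items)
        return "\n".join(lines)


if __name__ == "__main__":
    note = Handoff(
        changed_files=["src/parser.py"],
        facts=["Parser rejects empty names"],
        inferences=["Should fix the reported onboarding failure"],
    )
    print(note.render())
```

Keeping inferences in their own list makes it harder to promote a hope into a claim by accident.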

For pull request work, the same principle applies to the PR description. A good description does not sell the patch. It orients the reviewer. It says what changed, why, how it was checked, and what was intentionally left alone. Coding agents should be trained toward that standard because review is where their work becomes trusted software.

A Good Coding Agent Learns the Room Before Moving Furniture

The practical measure of a coding agent is not how much code it can write. It is whether it can make the right small change in a real codebase without making the reviewer pay for its uncertainty. That requires orientation, local convention, respect for user changes, focused validation, dependency awareness, and a handoff that preserves evidence.

The repository should meet the agent halfway. Good project scripts, clear test commands, ownership notes, generated-file rules, and concise contributor guidance all make agents safer. They help people too. An agent-friendly repo is often just a well-maintained repo with fewer hidden rules.

When onboarding is handled well, the agent becomes a useful delegate instead of a fast stranger in the codebase. It reads before editing. It follows the surrounding style. It leaves unrelated work alone. It reports checks honestly. It gives the reviewer a patch that can be inspected on its merits. That is the standard worth aiming for, because the real value of coding agents is not volume. It is reliable progress inside code people already depend on.


JJ Ben-Joseph

