Core Concepts

Environments

An environment represents a complete development workspace packaged as a Docker image. It contains your source code repositories, build tools, runtime dependencies, and any application servers needed to run and test your software.

Administrators configure environments to match your team's development setup. When you create a task, you select which environment to use, and CoderFlow launches an isolated container from that image. This ensures every agent works in a consistent, reproducible environment with access to the correct codebase and tools.

Environments can be scheduled to rebuild automatically, keeping dependencies current and repositories synced with your latest code.

Skills

Skills are reusable, prompt-based actions that agents can invoke while working. Skills can include their own instructions and supporting files, and they are managed by administrators in the Skills area of the Web UI.

Skills are assigned at the environment level. When a task launches, CoderFlow injects the assigned skills into the task container so the agent can use them immediately.

The Parallel Agent Workflow

CoderFlow can run multiple AI agents in parallel on the same task. Because different agents tend to approach a problem differently, comparing their solutions gives you several candidates to choose from and can improve the quality of the code you finally accept.

1. Submit a Task to Multiple Agents

When you have a complex task—fixing a tricky bug, implementing a new feature, or refactoring existing code—you can submit it to multiple agents simultaneously. Each agent works independently in its own isolated container, approaching the problem from its unique perspective.

For example, you might run the same task with Claude, Codex, Gemini, Bob, and Grok. Each agent reads your instructions, explores the codebase, and produces its own solution. Because they work in parallel, you get multiple candidate solutions in roughly the same time it would take to get one.
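
The fan-out described above can be sketched in a few lines of Python. This is only an illustration of the idea, not CoderFlow's API: run_agent and submit_to_agents are hypothetical names, and run_agent stands in for launching one agent in its own isolated container.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(agent: str, instructions: str) -> dict:
    """Hypothetical stand-in for one agent working in its own container."""
    return {"agent": agent, "solution": f"{agent} patch for: {instructions}"}


def submit_to_agents(agents: list[str], instructions: str) -> list[dict]:
    """Send the same instructions to every agent and wait for all results."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = [pool.submit(run_agent, agent, instructions) for agent in agents]
        # Results come back in submission order, one candidate per agent.
        return [future.result() for future in futures]


variants = submit_to_agents(["Claude", "Codex", "Gemini"], "Fix the login bug")
```

Because the agents run concurrently, the total wait is governed by the slowest agent rather than by the sum of all runs.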

2. Automatic Evaluation by Judge Agents

Once the execution agents complete their work, CoderFlow can automatically launch judge agents to evaluate the results. Judges are AI agents configured specifically for code review and quality assessment.

Multiple judges can evaluate each solution independently, examining:

  • Whether the implementation correctly addresses the requirements
  • Code quality, readability, and adherence to project conventions
  • Test coverage and whether existing tests still pass
  • Potential edge cases or issues the solution might miss

The judges score each variant and can reach a consensus on which solution is best. Using several judges helps limit the effect of any single judge's bias, which tends to make the evaluation more reliable.
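
One simple way to picture a consensus is to average each variant's scores across judges and take the highest average. The sketch below assumes that rule purely for illustration; the document does not say how CoderFlow combines judge scores, and pick_consensus, the variant names, and the score scale are all hypothetical.

```python
from statistics import mean


def pick_consensus(scores: dict[str, dict[str, float]]) -> tuple[str, dict[str, float]]:
    """scores maps variant -> {judge: score}; returns the best variant and all averages."""
    averages = {
        variant: mean(list(by_judge.values()))
        for variant, by_judge in scores.items()
    }
    # Assumed rule: the variant with the highest mean score wins.
    best = max(averages, key=averages.get)
    return best, averages


scores = {
    "variant-a": {"judge-1": 8.0, "judge-2": 7.0},
    "variant-b": {"judge-1": 9.0, "judge-2": 8.0},
}
best, averages = pick_consensus(scores)
```

Here variant-b wins with an average of 8.5 against 7.5 for variant-a.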

3. Feedback Loops

Judges do more than score: they provide detailed feedback. When a judge identifies issues or possible improvements, that feedback can be passed back to the execution agents automatically. The agents then iterate on their solutions to address the judges' concerns.

This creates an automated refinement cycle:

  1. Execution agents produce initial solutions
  2. Judge agents evaluate and provide feedback
  3. Execution agents improve their work based on feedback
  4. Judge agents re-evaluate until quality standards are met

You can choose how many AI feedback rounds to follow, or manually trigger additional feedback as needed.
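
The refinement cycle, with a cap on the number of feedback rounds, might look like the following sketch. The refine function, the passing_score threshold, and the toy judge and reviser are all assumptions made for illustration; they are not part of CoderFlow.

```python
from typing import Callable


def refine(
    solution: str,
    judge: Callable[[str], tuple[float, str]],
    revise: Callable[[str, str], str],
    max_rounds: int,
    passing_score: float,
) -> tuple[str, float, int]:
    """Alternate judging and revising until the score passes or rounds run out."""
    score, feedback = judge(solution)
    rounds = 0
    while score < passing_score and rounds < max_rounds:
        solution = revise(solution, feedback)
        score, feedback = judge(solution)
        rounds += 1
    return solution, score, rounds


def toy_judge(solution: str) -> tuple[float, str]:
    # Hypothetical scoring: start at 5.0 and add one point per revision.
    return 5.0 + solution.count("[revised]"), "add tests"


def toy_revise(solution: str, feedback: str) -> str:
    # Hypothetical revision: just mark that another pass happened.
    return solution + " [revised]"


final, score, rounds = refine("draft", toy_judge, toy_revise, max_rounds=3, passing_score=7.0)
```

With these toy functions the loop stops after two rounds, when the score reaches the threshold, rather than using all three allowed rounds.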

4. Winner Selection and Review

After evaluation completes, you select a winning variant—either manually by reviewing the options, or automatically based on judge scores. With a winner selected, you can:

  • Review the code changes: Examine exactly what the agent modified, added, or removed. The diff view highlights all changes against the original codebase and allows you to make adjustments. You can also review changes in an IDE, either a browser-based VS Code environment or your local IDE for deeper work.

  • Run the application: Launch the modified application directly from the container to test functionality. For web applications, CoderFlow provides URLs to access the running server. You can interact with the application, verify the fix works, and check for regressions.

  • Provide additional feedback: If the solution is close but needs adjustments, send follow-up instructions to the agent. It resumes work in the same container with full context of what it already did, making targeted improvements without starting over.

5. Approve and Deploy

When you're satisfied with the solution:

  1. Approve the changes: This commits the agent's modifications to your repository. CoderFlow generates a commit message summarizing the work, which you can customize before finalizing.

  2. Choose your branch strategy: Commit directly to the working branch, create a new feature branch, or open a pull request for team review.

  3. Deploy (optional): If your environment is configured with deployment pipelines, you can trigger deployment after the approval workflow, pushing the changes to staging or production.

Every approval is logged with full traceability—who approved, when, what changed, and which agent produced the work.
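
To make the traceability fields concrete, here is a minimal sketch of what one approval log entry could hold. The ApprovalRecord class, its field names, and the sample values are hypothetical; the document only states that the who, when, what, and which agent are recorded.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a logged approval should not be edited afterwards
class ApprovalRecord:
    approved_by: str                 # who approved
    approved_at: datetime            # when
    changed_files: tuple[str, ...]   # what changed
    agent: str                       # which agent produced the work
    commit_message: str


record = ApprovalRecord(
    approved_by="alice",
    approved_at=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    changed_files=("src/login.py",),
    agent="Claude",
    commit_message="Fix login redirect loop",
)
entry = asdict(record)  # plain dict, ready to serialize into a log
```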

Objectives

Objectives are where you plan and draft your requirements before executing work. They provide a permanent home for your ideas as they evolve from initial concepts into well-defined specifications ready for AI agents to implement.

For simple, well-understood tasks, you can create and launch a task directly. But for complex work—where requirements need refinement, or you're still forming your approach—starting with an objective gives you space to think and iterate.

Hierarchical Organization

Objectives support multiple levels of nesting, letting you break down large initiatives into smaller, manageable pieces. The home page displays your objectives as a tree view, making it easy to see how work is organized and navigate between related items.

For example, a high-level objective like "Modernize the reporting module" can be broken down into sub-objectives: "Update data models," "Redesign report templates," "Add export functionality." Each sub-objective can be further decomposed as needed.
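
The nesting in this example maps naturally onto a tree. The sketch below models it with a hypothetical Objective class and a render helper that produces an indented outline similar in spirit to a tree view; the "CSV export" child is invented to show a third level.

```python
from dataclasses import dataclass, field


@dataclass
class Objective:
    title: str
    children: list["Objective"] = field(default_factory=list)

    def add(self, title: str) -> "Objective":
        """Create a sub-objective under this one and return it."""
        child = Objective(title)
        self.children.append(child)
        return child


def render(node: Objective, depth: int = 0) -> list[str]:
    """Return the tree as indented lines, parents before children."""
    lines = ["  " * depth + node.title]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


root = Objective("Modernize the reporting module")
root.add("Update data models")
root.add("Redesign report templates")
export = root.add("Add export functionality")
export.add("CSV export")  # hypothetical third-level item
tree = render(root)
```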

The Iterative Workflow

Objectives support an iterative approach to getting work done:

  1. Draft your objective: Write your initial requirements, attach relevant screenshots or documentation, and specify which environment and agents to use.

  2. Launch a task: When ready, launch a task from the objective. The task runs in its own container while the objective remains unchanged.

  3. Evaluate and refine: Review the agent's work. If the requirements weren't complete or specific enough, revise the objective based on what you learned.

  4. Relaunch: Submit another task with the refined requirements. Repeat until the work meets your expectations.

This workflow means objectives serve as living documents that improve over time. Each task launched from an objective is independent, so you maintain a history of attempts and can compare approaches.

Tasks

A task is a single unit of work executed by an AI agent. When you create a task—either directly or by launching from an objective—CoderFlow spins up an isolated container and runs the agent with your instructions.

Tasks progress through a simple lifecycle:

  • Pending: Task is created and waiting to be queued
  • Queued: Waiting for an available container slot
  • Running: Agent is actively working on the task
  • Completed: Agent finished successfully
  • Failed: Agent encountered an error
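
The lifecycle above can be sketched as a small state machine. The allowed transitions here are inferred from the order of the list and are an assumption; the document does not say, for example, whether a task can fail while queued. TaskState, ALLOWED, and advance are hypothetical names.

```python
from enum import Enum


class TaskState(Enum):
    PENDING = "Pending"
    QUEUED = "Queued"
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Assumed transitions: a linear path that ends in Completed or Failed.
ALLOWED = {
    TaskState.PENDING: {TaskState.QUEUED},
    TaskState.QUEUED: {TaskState.RUNNING},
    TaskState.RUNNING: {TaskState.COMPLETED, TaskState.FAILED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
}


def advance(current: TaskState, target: TaskState) -> TaskState:
    """Move to target if the transition is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


state = TaskState.PENDING
for next_state in (TaskState.QUEUED, TaskState.RUNNING, TaskState.COMPLETED):
    state = advance(state, next_state)
```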

You can monitor running tasks in real time, watching the agent's activity feed as it explores code, makes changes, and runs tests.

Staged Tasks

Staged tasks let you prepare a task while controlling exactly when the agent starts working. CoderFlow launches the container and sets up the environment, then pauses before the agent begins. This is useful for:

  • Manually adjusting the environment before execution
  • Coordinating task timing with external dependencies
  • Reviewing the setup before committing compute resources to agent work

When ready, you start the staged task and the agent begins immediately in the pre-warmed container.

Pinned Items

As you work with objectives and tasks, pinning helps you focus on what's currently active. Pin objectives you're actively planning and tasks you're evaluating. When work is complete, unpin items to move them out of your primary view.

Your default view shows only pinned items—the objectives and tasks that need your attention right now. Switch to the "All" view when you need to reference historical work or revisit completed items.

This approach keeps your workspace uncluttered while preserving full history. Nothing is deleted; completed work simply moves out of your active view until you need it again.