Configuration¶
In a hurry? You can skip this page
Most people configure Arbor exactly once: run arbor setup, pick a model, done.
Everything below is for when you want to change the model, set a time/cost
budget, add human oversight, or target a specific domain. Come back when you
need it.
This page is written for someone who has never used Arbor. It answers three questions, in order:
- What can I configure, and which settings actually matter?
- How do I set each one from the command line?
- When two settings disagree, which wins?
Pick your path¶
- Run arbor setup to choose a model, then arbor to start. You do not need a
  config file or any flags. Read The settings that matter most
  if you're curious, and ignore the rest.
- Run arbor setup, then skim What you can configure and
  Set a budget. A --max-cycles flag and the right model
  are usually all you need.
- Put durable settings in a project config file so every run is identical, or capture a whole domain in a Plugin. See Per-project settings.
What you can configure¶
Settings fall into four tiers, from "almost everyone touches this" to "advanced".
| Tier | Setting | What it controls | Why you'd change it |
|---|---|---|---|
| Essential | provider, model, api_key, base_url | Which LLM Arbor uses and how to reach it | You must pick a model and supply a key once. |
| Important | max_cycles | How many experiments before Arbor stops and writes the report | The main time/cost knob. Higher = longer, deeper search. |
| Important | reasoning_effort | How hard the model thinks per step (low/medium/high) | Trade speed/cost for depth. |
| Important | max_turns, timeout: | Hard safety caps on a single experiment | Stop runaway cost on long jobs. |
| Optional | interaction_mode | How much you steer the run (auto vs. approve ideas) | You want a human in the loop. See Interaction Modes. |
| Optional | webui_port / --no-webui | The read-only browser monitor | Watch progress live, or turn it off. |
| Advanced | plugin, plugin_profile | Retarget Arbor to a domain (eval rules, protected files, budget bundle) | You run the same kind of benchmark often. See Plugins. |
| Advanced | skills | Sharpen how the agent reasons at a step | You want better ideation/analysis. See Skills. |
The settings that matter most¶
If you only ever touch three things, make them these:
- model: quality and cost come mostly from here.
- max_cycles: how long and deep the study runs.
- interaction_mode: whether you watch (auto) or approve each idea (review).
Everything else has a sensible default.
How to set it — from the command line¶
There are five places a setting can come from, listed in the order you'll actually reach for them:
- arbor setup: a one-time wizard that saves your model globally. Most people only ever use this.
- arbor config: view or edit that global file later.
- A project config file: durable settings that travel with one project.
- CLI flags: one-off overrides for a single run.
- In-chat slash commands: pick a plugin or skill for this run, no files needed.
1. Your model: arbor setup¶
The fastest way to get configured. It asks four questions and writes
~/.arbor/config.yaml:
$ arbor setup
arbor setup — let's configure your model (one time).
API type (anthropic/openai/litellm): anthropic
Base URL (local proxy / vLLM, blank for the official API):
Model: claude-sonnet-4-5
API key (blank to read from the environment): ********
✓ credentials look resolvable
Done. Saved to ~/.arbor/config.yaml
- API type is the provider — see Providers below.
- Base URL stays blank for the official Anthropic/OpenAI APIs; set it only for a local proxy or gateway.
- API key can be left blank to read from an environment variable (recommended), e.g.
ANTHROPIC_API_KEY or OPENAI_API_KEY.
After this, just run arbor.
2. Inspect or edit the global config: arbor config¶
arbor config show # print the effective config (secrets masked)
arbor config path # where the file lives
arbor config init --provider openai --model gpt-5 --api-key dummy # write it non-interactively
arbor config init is the scriptable sibling of the wizard — handy for setting up a local
gateway in one line:
arbor config init --provider litellm --model qwen-72b \
--base-url http://localhost:4141 --api-key dummy
Providers¶
Pick one API type. The value you give to arbor setup / --provider is one of exactly
three:
| provider | Use for | Notes |
|---|---|---|
| anthropic | Claude models | Native Anthropic API. |
| openai | OpenAI models | Uses the Responses API for reasoning models. |
| litellm | DeepSeek, Gemini, Qwen, vLLM, Ollama, local gateways | Anything OpenAI-compatible. Set base_url. |
Keep keys out of files
Prefer an environment variable (${ANTHROPIC_API_KEY}) over pasting a secret into a
file. arbor setup stores your global key under ~/.arbor/ with the rest of the config.
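The ${ANTHROPIC_API_KEY} form is a placeholder that is replaced with the variable's value when the config is read. This page does not show how Arbor implements that, so the following is only a minimal Python sketch of the idea. The helper name `expand_env` and the choice to fail loudly on an unset variable are assumptions made for illustration.

```python
import os
import re

# Matches ${NAME} placeholders such as ${ANTHROPIC_API_KEY}.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace each ${NAME} in `value` with env[NAME].

    Hypothetical helper: raising on an unset variable is an assumption,
    chosen so that a missing key is noticed before a run starts.
    """
    env = os.environ if env is None else env

    def _substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(_substitute, value)


# A fake environment keeps the demo independent of the real shell.
fake_env = {"ANTHROPIC_API_KEY": "sk-example"}
resolved = expand_env("${ANTHROPIC_API_KEY}", fake_env)
untouched = expand_env("http://localhost:4141", fake_env)
print(resolved, untouched)
```

The file on disk only ever contains the placeholder, so the secret itself stays in the environment.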
3. Per-project: a config file¶
When a project needs its own durable settings, drop a YAML file in it. Arbor auto-detects
research_config.yaml, arbor.yaml, or autoresearch.yaml in the target directory (or
pass --config PATH). Settings here override your global setup but lose to CLI flags.
# ── Model ──────────────────────────────────────────────
llm:
  provider: anthropic            # anthropic | openai | litellm
  model: claude-sonnet-4-5
  api_key: ${ANTHROPIC_API_KEY}  # env vars are expanded
  base_url: null                 # set for litellm / OpenAI-compatible gateways
  reasoning_effort: medium       # low | medium | high (where supported)
  meta_model: null               # optional cheaper model for meta/report steps
# ── Orchestration ─────────────────────────────────────
max_cycles: 12                   # experiments before Arbor finalizes and reports
executor_max_turns: 60           # hard cap on one experiment's reasoning turns
# ── Timeouts (seconds) ────────────────────────────────
timeout:
  executor: 172800               # 48 h per experiment
  run_training_max: 604800       # 7 d ceiling for one training command
# ── Human-in-the-loop & monitoring ────────────────────
ui:
  interaction_mode: auto         # auto | direction | review | collaborative
  webui_port: 8765               # read-only browser monitor
Flat keys also work
The nested groups (llm:, timeout:, ui:) are the recommended style, but equivalent
flat keys are accepted. See examples/research_config.example.yaml in the repository
for an annotated reference.
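The auto-detection described at the start of this section amounts to checking the target directory for a known filename unless a path is given explicitly. A rough Python sketch follows. `find_project_config` is a hypothetical helper, and trying the three names in the order they are listed is an assumption, since this page does not say which file wins if more than one exists.

```python
import tempfile
from pathlib import Path

# Filenames Arbor is documented to auto-detect. The order is an assumption.
CANDIDATES = ("research_config.yaml", "arbor.yaml", "autoresearch.yaml")


def find_project_config(target_dir, explicit=None):
    """Return the config path to use, or None if the project has none.

    `explicit` plays the role of --config PATH and skips auto-detection.
    """
    if explicit is not None:
        return Path(explicit)
    for name in CANDIDATES:
        candidate = Path(target_dir) / name
        if candidate.is_file():
            return candidate
    return None


# Demo in a throwaway directory containing only arbor.yaml.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "arbor.yaml").write_text("max_cycles: 12\n")
    found_name = find_project_config(tmp).name

with tempfile.TemporaryDirectory() as empty:
    found_in_empty = find_project_config(empty)

print(found_name, found_in_empty)
```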
4. One-off: CLI flags¶
Flags override everything else, for a single run only. Common ones are
--max-cycles N, --max-turns N, --mode MODE, --webui-port N, and
--no-webui. See the CLI reference for the full list.
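To make "override for a single run only" concrete, here is a small Python sketch using the standard argparse module. It is not Arbor's actual CLI code. The flag names and the keys they map to come from this page, while `apply_flags` and the `webui_enabled` key are hypothetical names used only for illustration.

```python
import argparse


def build_parser():
    """Parser for the flags named on this page (a sketch, not Arbor's real CLI)."""
    parser = argparse.ArgumentParser(prog="arbor-sketch")
    parser.add_argument("--max-cycles", type=int)
    parser.add_argument("--max-turns", type=int)
    parser.add_argument("--mode", choices=["auto", "direction", "review", "collaborative"])
    parser.add_argument("--webui-port", type=int)
    parser.add_argument("--no-webui", action="store_true")
    return parser


def apply_flags(config, argv):
    """Return a copy of `config` with any flags present in `argv` applied on top."""
    args = build_parser().parse_args(argv)
    merged = dict(config)  # copy, so the stored config is never modified
    if args.max_cycles is not None:
        merged["max_cycles"] = args.max_cycles
    if args.max_turns is not None:
        merged["executor_max_turns"] = args.max_turns
    if args.mode is not None:
        merged["interaction_mode"] = args.mode
    if args.webui_port is not None:
        merged["webui_port"] = args.webui_port
    if args.no_webui:
        merged["webui_enabled"] = False  # hypothetical key, for illustration only
    return merged


project = {"max_cycles": 12, "interaction_mode": "auto", "webui_port": 8765}
this_run = apply_flags(project, ["--max-cycles", "3", "--mode", "review"])
print(this_run)
print(project)
```

Flags that are not passed leave the configured value alone, and the saved config is unchanged after the run.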
5. In the chat: pick a plugin or skill for this run¶
You don't have to edit files to change domain behavior. While the intake chat is open, type
/:
/plugin load mle_kaggle mle_bench_lite # use a domain plugin (+ profile) for this run
/plugin unload # ignore any configured plugin this run
/skill load idea_drafting # load an extra reasoning playbook
/skill unload first_principles_probe # drop a default skill this run
These choices apply to the single run you're about to launch and don't touch your config. See Plugins and Skills.
What each setting means¶
Orchestration¶
| Key | Meaning |
|---|---|
| max_cycles | Maximum number of completed / skipped / failed idea experiments before Arbor finalizes and writes the report. Override per-run with --max-cycles. |
| executor_max_turns | Hard cap on a single experiment's reasoning turns; a runaway/cost safety valve. Override with --max-turns. |
| reasoning_effort | How hard the model thinks per step (low/medium/high, where the provider supports it). |
| meta_model | Optional cheaper/faster model for meta-level steps (distilling insight, drafting the report) while model drives the main loop. |
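Note that max_cycles counts skipped and failed experiments as well as completed ones. The Python sketch below illustrates that accounting only. The function names and the "running" state are hypothetical, and Arbor's real bookkeeping may differ.

```python
# Outcomes that consume one unit of the max_cycles budget, per the table above.
COUNTED_OUTCOMES = {"completed", "skipped", "failed"}


def cycles_used(outcomes):
    """Count the experiments that have reached a counted outcome."""
    return sum(1 for outcome in outcomes if outcome in COUNTED_OUTCOMES)


def should_finalize(outcomes, max_cycles):
    """True once the budget is spent and the report should be written."""
    return cycles_used(outcomes) >= max_cycles


# "running" is a made-up in-progress state that does not count yet.
history = ["completed", "failed", "running", "skipped"]
print(cycles_used(history))          # 3
print(should_finalize(history, 3))   # True
print(should_finalize(history, 12))  # False
```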
Budgets and timeouts¶
The timeout: group bounds how long individual operations may run (in seconds):
| Key | Default | Meaning |
|---|---|---|
| executor | 172800 (48 h) | Wall-clock limit for one experiment. |
| run_training_max | 604800 (7 d) | Ceiling for one long-running training command. |
For benchmarks, the tidiest way to set a coherent budget is a plugin profile, which
bundles max_cycles, tree depth, executor timeout, and total time budget under one name
(e.g. mle_bench_lite). See Plugins.
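Because the timeout: values are plain seconds, it is easy to be off by a factor of 60 when writing them by hand. The following Python snippet is a convenience sketch, not part of Arbor. It derives the two documented defaults from human-readable durations, and `to_seconds` is a hypothetical helper.

```python
from datetime import timedelta


def to_seconds(**duration):
    """Convert keyword durations (hours=..., days=...) to whole seconds."""
    return int(timedelta(**duration).total_seconds())


executor_timeout = to_seconds(hours=48)  # one experiment
run_training_max = to_seconds(days=7)    # one training command
print(executor_timeout, run_training_max)  # 172800 604800
```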
Human-in-the-loop & monitoring¶
The ui: group controls oversight and the live monitor:
| Key | Meaning |
|---|---|
| interaction_mode | auto, direction, review, or collaborative. See Interaction Modes. Override with --mode. |
| webui_port | Port for the browser monitor (default 8765). See Web UI & Monitoring. Override with --webui-port; disable with --no-webui. |
Domain targeting¶
Two top-level keys retarget Arbor to a domain without touching code:
plugin: mle_kaggle # load a bundled domain plugin
plugin_profile: mle_bench_lite # pick a named budget/behaviour profile within it
See Plugins for the full plugin format and the built-in mle_kaggle plugin.
When settings disagree: precedence¶
Configuration comes from several places. When two set the same value, the higher one wins:
built-in defaults < plugin overrides < plugin profile < global setup (~/.arbor) < project config < CLI flags
Rule of thumb
A CLI flag beats everything. Your project config beats your global setup. Set durable choices in a file; use flags for one-off changes.
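The precedence chain can be read as a layered merge in which each later layer overwrites the keys it sets. The Python sketch below models that reading. The layer names follow the chain above, `resolve` and `source_of` are hypothetical helpers, and the values are illustrative rather than Arbor's real defaults.

```python
# Layers from lowest to highest precedence, in the order given on this page.
LAYER_ORDER = [
    "built-in defaults",
    "plugin overrides",
    "plugin profile",
    "global setup",
    "project config",
    "CLI flags",
]


def resolve(layers):
    """Merge the layers so that a higher-precedence layer wins per key."""
    effective = {}
    for name in LAYER_ORDER:
        effective.update(layers.get(name, {}))
    return effective


def source_of(layers, key):
    """Name the highest-precedence layer that sets `key`, or None."""
    for name in reversed(LAYER_ORDER):
        if key in layers.get(name, {}):
            return name
    return None


# Illustrative values only.
layers = {
    "built-in defaults": {"max_cycles": 5, "interaction_mode": "auto", "webui_port": 8765},
    "plugin profile": {"max_cycles": 20},
    "global setup": {"model": "claude-sonnet-4-5"},
    "project config": {"max_cycles": 12, "interaction_mode": "review"},
    "CLI flags": {"max_cycles": 3},
}

effective = resolve(layers)
print(effective["max_cycles"], source_of(layers, "max_cycles"))              # 3 CLI flags
print(effective["interaction_mode"], source_of(layers, "interaction_mode"))  # review project config
```

Keys that no higher layer touches, such as webui_port here, fall through from the lowest layer that sets them.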
Verify it works¶
arbor config show # confirm provider/model/key are what you expect (secrets masked)
arbor doctor # check PATH, Python, git, and that your API key resolves
arbor doctor is the fastest way to catch a missing key or unreachable gateway before a
run starts.