Train a Claude Code Agent on Client Historical Data

Your paid media person runs LinkedIn ads. Your outbound team sends sequences through Instantly. Your content writer publishes founder posts. None of them share a single source file, a single ICP definition, or a single metric benchmark. Every Monday, you spend the first two hours of your week re-aligning people who should already be aligned.

Claude Code agents, configured on shared client data, can fix that alignment problem. One persistent context file. One ICP definition. One set of brand voice rules. Every function reads from the same source before producing output.

Claude Code is an agent, not a chatbot

Standard Claude chat waits for your next prompt. Claude Code reads project files, edits documents, runs scripts, and works through multi-step tasks while you're doing something else. Anthropic describes the difference this way: instead of writing code yourself and asking Claude to review it, you describe what you want and Claude figures out how to build it.

For SaaS growth teams, a few capabilities matter most.

It reads your files directly. Campaign data exports, strategy docs, CRM summaries, and brand guidelines all qualify. You don't paste content into a prompt; Claude Code opens files in your project directory using its built-in Read, Glob, and Grep tools.

It remembers between sessions. A Markdown file called CLAUDE.md sits in your project root. Claude Code reads it at the start of every session. Your ICP definitions, brand voice rules, metric formulas, and business logic load automatically, every time.

It runs on a schedule. Weekly performance reports, campaign audits, and content reviews can run on Anthropic-managed infrastructure without requiring your computer to be on.

It coordinates parallel work. You can spawn multiple agents working on different parts of a task simultaneously, with a lead agent coordinating and merging results. Paid analysis, content drafts, and email sequence reviews can run at the same time.

CLAUDE.md: the file that makes your agent "know" a client

CLAUDE.md is a rules-and-pointers file, not a data dump. It holds the context Claude needs every session and points to heavier files only when they matter.

Point to deeper materials from the root CLAUDE.md by path, for example context/icp-profiles.md, context/historical-campaigns.md, and data/crm-performance-summary.md.

The file structure works in layers. When Claude works inside marketing/paid/, it loads the paid-channel rules first, then the marketing-level rules, then the root-level client rules, cascading from specific to general. A typical file tree looks like this:

CLAUDE.md ← Client overview, shared glossary, global rules

marketing/CLAUDE.md ← Brand voice, ICP, messaging rules

marketing/paid/CLAUDE.md ← Paid channel rules, budget approval thresholds

marketing/content/CLAUDE.md ← SEO rules, content calendar, editorial approval

marketing/email/CLAUDE.md ← Email rules, suppression logic, compliance
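
To make the layering concrete, here is a minimal sketch that lists which CLAUDE.md files apply to a working directory, ordered from most specific to most general. The function name applicable_rule_files and the in-memory set of paths are hypothetical. The sketch only models the lookup order described above; it is not how Claude Code itself is implemented.

```python
from pathlib import PurePosixPath

# Hypothetical project layout, mirroring the file tree above.
RULE_FILES = frozenset({
    "CLAUDE.md",
    "marketing/CLAUDE.md",
    "marketing/paid/CLAUDE.md",
    "marketing/content/CLAUDE.md",
    "marketing/email/CLAUDE.md",
})


def applicable_rule_files(working_dir: str, rule_files=RULE_FILES) -> list[str]:
    """Return the CLAUDE.md files that apply to working_dir, specific to general."""
    current = PurePosixPath(working_dir)
    found = []
    while True:
        candidate = (current / "CLAUDE.md").as_posix()
        if candidate in rule_files:
            found.append(candidate)
        if current == current.parent:  # reached the project root (".")
            break
        current = current.parent
    return found


print(applicable_rule_files("marketing/paid"))
```

A directory with no rule file of its own, such as a hypothetical sales/ folder, would fall back to the root CLAUDE.md alone.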

Two formatting habits to lock in early:

Include rationale with every rule

"Never change the MQL definition mid-quarter" is okay. "Never change the MQL definition mid-quarter, because retroactively breaking pipeline attribution invalidates board-level forecasts already shared" is better. The rationale helps Claude handle edge cases you didn't anticipate.

Version your metric definitions

SaaS growth teams change how they calculate MQLs, CAC, and activation thresholds over time. If your agent compares Q1 2024 to Q1 2025 using the wrong definition for one period, the output is useless. Tag each definition with a date range so the agent picks the right formula for each period.
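
One way to picture date-ranged definitions is a small lookup that returns exactly one formula for any given date. The sketch below is illustrative only: the MetricDefinition class, the definition_for helper, and both MQL formulas are made-up examples, not a client's real scoring rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    formula: str               # plain-language formula the agent should apply
    valid_from: date           # first day this definition applies
    valid_to: Optional[date]   # last day it applies; None means still current


# Hypothetical example definitions; replace with the client's real ones.
MQL_DEFINITIONS = [
    MetricDefinition("MQL", "lead score >= 50", date(2024, 1, 1), date(2024, 12, 31)),
    MetricDefinition("MQL", "lead score >= 65 and ICP fit", date(2025, 1, 1), None),
]


def definition_for(definitions, on: date) -> MetricDefinition:
    """Pick the single definition whose date range contains `on`."""
    matches = [
        d for d in definitions
        if d.valid_from <= on and (d.valid_to is None or on <= d.valid_to)
    ]
    if len(matches) != 1:
        # Gaps and overlaps are both errors: the agent should never guess.
        raise ValueError(f"expected exactly one definition for {on}, found {len(matches)}")
    return matches[0]


print(definition_for(MQL_DEFINITIONS, date(2024, 2, 15)).formula)
print(definition_for(MQL_DEFINITIONS, date(2025, 2, 15)).formula)
```

Raising an error on a gap or overlap is a deliberate choice here: a Q1 2024 versus Q1 2025 comparison should fail loudly rather than silently reuse one period's formula for both.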

What client data to feed the agent, and how to prepare it

Keep the always-loaded context lean. The core decision: what should load every session, and what should load only when needed?

Always-loaded in CLAUDE.md (under 1,000 tokens)

ICP summary, brand voice rules, team contacts and approval authority, metric definitions, tool preferences, and pointers to reference files.
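
If you want a quick check against that 1,000-token ceiling, a rough estimate is usually enough. The sketch below assumes about four characters per token, which is a common rule of thumb for English text and not an exact tokenizer count; the helper names are hypothetical.

```python
import math

CHARS_PER_TOKEN = 4     # assumption: rough average for English, not a real tokenizer
TOKEN_BUDGET = 1_000    # the always-loaded ceiling suggested above


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def within_budget(text: str, budget: int = TOKEN_BUDGET) -> bool:
    """True if the estimated token count fits the always-loaded budget."""
    return estimate_tokens(text) <= budget


sample = "ICP: B2B SaaS, 50-500 employees. Voice: direct, no jargon.\n" * 10
print(estimate_tokens(sample), within_budget(sample))
```

In practice you would pass in the contents of the client's root CLAUDE.md, for example with pathlib.Path("CLAUDE.md").read_text(), and treat the result as a warning threshold rather than a hard measurement.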

CRM and pipeline data

Export from HubSpot or Salesforce, then convert to Markdown tables or structured plain text. CSV files aren't natively supported as document blocks. Use aggregated summaries rather than raw rows: "Q1 2024: 847 MQLs, 23% MQL-to-SQL, avg deal velocity 47 days." Store the summary at data/crm-performance-summary.md.
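
The conversion from raw export to aggregated summary can be a short script. The sketch below assumes a hypothetical export with quarter, lead_id, and became_sql columns; real HubSpot or Salesforce exports will use different column names, so treat the schema and the sample rows as placeholders.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export: one row per lead, with the quarter and whether it became an SQL.
SAMPLE_CSV = """quarter,lead_id,became_sql
Q1 2024,1,yes
Q1 2024,2,no
Q1 2024,3,no
Q1 2024,4,no
Q2 2024,5,yes
Q2 2024,6,no
"""


def summarize(csv_text: str) -> str:
    """Aggregate raw lead rows into a per-quarter Markdown table."""
    mqls = defaultdict(int)
    sqls = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        mqls[row["quarter"]] += 1
        if row["became_sql"].strip().lower() == "yes":
            sqls[row["quarter"]] += 1

    lines = ["| Quarter | MQLs | MQL-to-SQL |", "| --- | --- | --- |"]
    # Quarters appear in the order they first occur in the export.
    for quarter in mqls:
        rate = sqls[quarter] / mqls[quarter]
        lines.append(f"| {quarter} | {mqls[quarter]} | {rate:.0%} |")
    return "\n".join(lines)


print(summarize(SAMPLE_CSV))
```

The output is a few lines per quarter instead of thousands of raw rows, which is the form worth saving as the reference file.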

Campaign performance history

Group by quarter, not by individual campaign, to reduce token load. Add context alongside metrics: "Q3 2024 LinkedIn spend increased 40% due to product launch, not representative of steady-state performance."

Customer intelligence

Use synthesized insights rather than raw interview transcripts. Structure by segment: pain points, buying triggers, success metrics. Keep raw transcripts in reference files, never auto-loaded.

Competitive intelligence

Date-stamp every file. Competitive data changes fast, so it belongs in on-demand reference files, never always-loaded. competitor-a-battlecard-2025-q2.md is clearer than a generic file with no time boundary.
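
A date stamp in the filename also makes staleness checkable by script. The sketch below assumes the -YYYY-qN.md suffix used in the example filename above; the helper names and the idea of measuring age in quarters are illustrative choices, not a convention Claude Code enforces.

```python
import re
from datetime import date
from typing import Optional, Tuple

# Matches a trailing "-2025-q2.md" style stamp (assumed naming convention).
STAMP = re.compile(r"-(\d{4})-q([1-4])\.md$")


def quarter_of(filename: str) -> Optional[Tuple[int, int]]:
    """Return (year, quarter) from a date-stamped filename, or None if unstamped."""
    match = STAMP.search(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def quarters_old(filename: str, today: date) -> int:
    """How many quarters have passed since the file's stamp."""
    stamp = quarter_of(filename)
    if stamp is None:
        raise ValueError(f"{filename} has no date stamp")
    year, quarter = stamp
    current_quarter = (today.month - 1) // 3 + 1
    return (today.year - year) * 4 + (current_quarter - quarter)


print(quarter_of("competitor-a-battlecard-2025-q2.md"))
print(quarters_old("competitor-a-battlecard-2025-q2.md", date(2025, 11, 3)))
```

A check like this could run before a battlecard is handed to the agent, flagging anything older than whatever age your team considers acceptable.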

Content and messaging assets

Brand voice rules go in CLAUDE.md. The full messaging framework, copy samples, and content calendars go in referenced files under docs/.

That structure keeps useful history available without wasting context on every task.

Package repeatable workflows as Claude Skills in .claude/skills/

Skills preserve the process so any team member can repeat it at any time. A weekly-report skill might specify which data files to pull, which benchmarks to compare against, the output format the team expects, and the Slack channel where the summary gets posted.
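
As a rough illustration, the sketch below renders the text of a skill file for that weekly report. It assumes a skill is a folder under .claude/skills/ containing a SKILL.md with name and description front matter; confirm the exact format against the current Claude Code documentation. The render_skill helper, the benchmark path, and the #growth-weekly channel are hypothetical.

```python
def render_skill(name: str, description: str, steps: list[str]) -> str:
    """Build SKILL.md text: front matter, a title, then numbered steps."""
    # Assumed front matter fields; keep the description free of colons
    # so it stays a valid plain YAML scalar.
    header = f"---\nname: {name}\ndescription: {description}\n---\n"
    body = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"{header}\n# {name}\n\n{body}\n"


skill_text = render_skill(
    name="weekly-report",
    description="Build the weekly performance summary from client reference files",
    steps=[
        "Read data/crm-performance-summary.md and context/historical-campaigns.md.",
        "Compare last week's numbers against context/benchmarks.md (hypothetical file).",
        "Write the brief as headline, three flagged anomalies, open questions.",
        "Prepare the summary for posting in #growth-weekly (hypothetical channel).",
    ],
)
print(skill_text)
```

Under the assumed layout, the result would be saved as .claude/skills/weekly-report/SKILL.md and kept in version control with the rest of the client's context files.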

Where this helps SaaS growth teams

The practical value shows up in a few repeatable use cases. Fully autonomous end-to-end campaign orchestration is still aspirational for most teams.

Monday morning performance analysis

A scheduled agent pulls the previous week's campaign data, compares performance against the benchmarks encoded in your reference files, flags anomalies, and outputs a structured brief for your review meeting. In practice, the agent might flag that LinkedIn CPL spiked 35% week-over-week, cross-reference the campaign history file, find that a new audience segment launched last Tuesday, and include that context directly in the brief.
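
The anomaly flag in that example is simple arithmetic: compare this week to last week and report anything that moved past a threshold. The sketch below is a minimal version with made-up numbers; the 25% threshold, the metric names, and the flag_anomalies helper are assumptions, not benchmarks from any real account.

```python
def pct_change(previous: float, current: float) -> float:
    """Relative change from previous to current, e.g. 0.35 for +35%."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous


def flag_anomalies(last_week: dict, this_week: dict, threshold: float = 0.25) -> list[str]:
    """Return a line for each metric that moved by at least `threshold`."""
    flags = []
    for metric, current in this_week.items():
        previous = last_week.get(metric)
        if not previous:  # skip metrics with no usable baseline
            continue
        change = pct_change(previous, current)
        if abs(change) >= threshold:
            flags.append(f"{metric}: {change:+.0%} week-over-week")
    return flags


# Hypothetical cost-per-lead figures in dollars.
last_week = {"linkedin_cpl": 80.0, "meta_cpl": 40.0}
this_week = {"linkedin_cpl": 108.0, "meta_cpl": 42.0}
print(flag_anomalies(last_week, this_week))
```

The script only finds the spike. Explaining it, by connecting the spike to the new audience segment in the campaign history file, is the part that depends on the agent having the right context loaded.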

Segment-specific content generation

For B2B SaaS with $20K+ ACVs, the investment in segment-level personalization pays off because deal sizes are large enough to support micro-targeted messaging. An agent loaded with your customer intelligence files and messaging framework can generate segment-specific variants for email sequences, LinkedIn ads, and outbound copy, maintaining brand consistency while varying the pain-point emphasis per segment.

Cross-channel coordination

A lead agent loaded with the full client CLAUDE.md spawns specialized subagents for each channel. The lead agent enforces messaging consistency using the shared brand voice and ICP rules. Each subagent operates with channel-specific rules from its subdirectory. Paid, email, and LinkedIn content all read from the same source of truth.

Voice-of-customer synthesis

A quarterly skill pulls from defined feedback sources, synthesizes insights by segment using the ICP categories in your CLAUDE.md, and outputs a structured brief for campaign ideation. The agent can cross-reference insights against historical win/loss patterns to prioritize themes with the strongest commercial signal.

Used this way, Claude Code is less about autonomy and more about consistent execution against shared context.

One non-negotiable: commercial accounts only

If you use Claude Code on client data, account type matters.

On Anthropic's consumer plans (Free, Pro, and Max), your session data can be used to train new models when the training setting is enabled. The consumer terms confirm this includes Claude Code sessions run from those accounts.

Commercial plans do not train on your data by default. Your organization controls the data.

If any team member accesses Claude Code through a personal consumer account rather than a company-provisioned commercial account, client data is potentially exposed to model training and multi-year retention. For agencies and growth teams handling client data, this is a contractual confidentiality requirement, not a preference.

Basic guardrails from day one:

Per-client file isolation with explicit deny rules for credential files (.env, .pem, .key)
Least-privilege permissions per session
PII detection and removal before data enters the context window
Centralized audit logging for all client data sessions
Naming Anthropic as a sub-processor in your client data processing agreements
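
Two of these guardrails, credential-file blocking and PII removal, can be sketched as a pre-processing step that runs before anything reaches the agent. Everything below is an assumption for illustration: the filename patterns are a local check, not Claude Code's own permission syntax, and the regular expressions are deliberately naive. A production setup would use a dedicated PII detection tool.

```python
import re
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical filename patterns for credential files (.env, .pem, .key).
DENY_PATTERNS = (".env", ".env.*", "*.pem", "*.key")

# Naive patterns: they will miss some PII and may over-match some numbers.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def is_denied(path: str) -> bool:
    """True if the file name matches a credential-file pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in DENY_PATTERNS)


def redact_pii(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


print(is_denied("clients/acme/.env"))
print(is_denied("clients/acme/data/crm-performance-summary.md"))
print(redact_pii("Contact jane.doe@example.com or +1 (415) 555-0100."))
```

Running a check like this on every file before it is copied into a client's project directory keeps the rule independent of any one person remembering it.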

Coordinate your SaaS growth with Understory

The coordination problem Claude Code solves inside a growth team is the same problem Understory solves at the agency level: fragmented specialists who don't share data, don't align messaging, and cost you strategic hours every week in vendor management overhead.

Understory runs LinkedIn and Meta ads, builds outbound infrastructure through Clay and Instantly, and coordinates these services into one system with shared data across every channel. If you're spending more time aligning your growth vendors than optimizing your pipeline, schedule a consultation with Understory to see what coordinated execution looks like.

Frequently asked questions

What should go in CLAUDE.md versus separate reference files?

Put always-needed context in CLAUDE.md: ICP summary, brand voice rules, approval authority, metric definitions, tool preferences, and pointers to deeper files. Put heavier material, such as campaign history, CRM summaries, messaging libraries, and competitive battlecards, in referenced documents that load on demand. The goal is a lean, always-loaded context that doesn't waste tokens on every task.

How much client data should be loaded all the time?

As little as possible. Keep the always-loaded context under 1,000 tokens and load historical data through referenced files when the task requires it. Loading everything by default inflates the context window and increases the chance the agent surfaces irrelevant history in its outputs.

What's the biggest compliance mistake growth teams make?

Using personal consumer Claude accounts for client work instead of company-controlled commercial accounts. Consumer plans can train on session data when the training setting is enabled, which means client data is potentially exposed. Commercial accounts don't train on your data by default. For agencies handling client data under contract, this isn't optional.

Can this replace human strategy work?

No. The strongest use cases are coordination, reporting, synthesis, and first-draft production. Strategy still depends on human judgment, and the agent's output quality is only as good as the context files you maintain. Think of it as consistent execution against shared context, not autonomous decision-making.