A company in the financial services space came to us six months into a Claude deployment that was not going anywhere. They had a working pilot. They had executive enthusiasm. They had a team that believed in the technology. What they did not have was a clear owner for the API cost, any documented understanding of which data classifications were involved in their use cases, or a deployment surface that matched their actual compliance requirements.
The pilot had been built on the direct Anthropic API with individual developer keys. The compliance team, once they were brought into the conversation, pointed out that certain data involved in the use cases could not be processed outside the company's own cloud account under their regulatory framework. The entire pilot had to be rebuilt on Bedrock. Six months of work, half of it thrown away.
That is the cost of skipping Phase 1. Not the cost of two weeks of discovery. The cost of not doing it.
This article is the first deep-dive in this series on the enterprise Claude deployment roadmap. The pillar piece laid out all six phases and the gate between each one. This one goes inside Phase 1: the nine tasks, the decisions that need to happen, the places teams cut corners and the specific ways those cuts surface later.
Phase 1 runs two weeks, eight to ten working days. It is not research for its own sake. Every task produces a specific output that one of the downstream phases depends on. Compress the phase, cut a task, or go through the motions without producing the actual deliverable, and the dependency shows up broken in Phase 2, Phase 3, or worse, in production.
The Nine Tasks
The nine tasks are not bureaucratic overhead. Each produces a named output, and each output feeds a downstream phase. Skip a task, or produce a placeholder instead of the real artifact, and the phase that depends on it breaks.
Phase 1 -- Nine Tasks at a Glance
Each task feeds a downstream dependency
- 1.1 Stakeholder Interview Schedule: named sponsor + 8-12 stakeholders scheduled
- 1.2 Use Case Inventory Workshop: scored inventory (value, feasibility, sensitivity, volume)
- 1.3 Data Classification Audit: classification tier + regulated data type per use case
- 1.4 Regulatory Requirements Mapping: applicable frameworks + control requirements
- 1.5 Infrastructure Assessment: current state (gateway, IdP, logging, secrets, containers)
- 1.6 Platform Decision Matrix: scored recommendation (direct API, Bedrock, or Vertex)
- 1.7 Use Case Prioritization and Pilot Selection: top 3 (high value, high feasibility, low sensitivity)
- 1.8 Cost Model Projection: monthly spend range with documented assumptions
- 1.9 Executive Briefing and Sign-Off: signed decision (platform, use cases, budget, security baseline)
Gate: Executive sponsor approves four items in writing: platform choice, prioritized pilot use cases, budget allocation, and security requirements baseline. Not a verbal nod.
1.1: Stakeholder Interview Schedule
Before any workshop, identify and schedule the right people. Discovery requires eight to twelve stakeholders across the target business units: process owners who understand the actual workflows, IT and security representatives who know the infrastructure and regulatory constraints, and a named executive sponsor who has the authority to approve the gate.
The interview structure matters. Forty-five minutes per stakeholder. The questions are consistent: what AI tooling are you using today, where do you feel constrained by manual work, what data does that work touch, and what does success look like in one year. The goal is not to gather opinions. It is to map the landscape before the use case workshop so you are not starting from a blank whiteboard.
One thing this task forces: you identify the executive sponsor in writing before Phase 1 is over. A deployment without a named executive sponsor does not have governance. It has enthusiasm, which runs out.
1.2: Use Case Inventory Workshop
Two hours per business unit, facilitated. The output is a scored inventory of every candidate Claude use case within that unit.
Each use case gets scored on four dimensions: business value (one to five), technical feasibility (one to five), data sensitivity (low, medium, or high), and estimated volume in requests per day. The scoring is not a vote. It is a structured conversation that forces specificity. High value means nothing until the team commits to what value means, how it would be measured, and what the baseline is today.
The reason to do this before touching the platform is simple: an LLM is not the right solution for every problem. Discovery catches the use cases that sound compelling in a meeting but turn out to be automation problems or search problems or process problems. Building a pilot on the wrong use case is not just wasted effort. It produces a failed demo that makes the technology look bad and the business case harder to reassemble.
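One way to keep the scoring honest is to make the inventory a typed record that refuses incomplete entries. The sketch below is illustrative only: the class and field names (`UseCase`, `value_metric`, `baseline`) are assumptions made for this example, not part of any Riptide tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class UseCase:
    """One row of the scored inventory (hypothetical schema)."""
    name: str
    business_unit: str
    value: int                # business value, 1 to 5
    feasibility: int          # technical feasibility, 1 to 5
    sensitivity: Sensitivity  # low, medium, or high
    requests_per_day: int     # estimated volume
    value_metric: str         # how the value would be measured
    baseline: str             # what that measurement is today

    def __post_init__(self) -> None:
        # Scores outside the 1-5 scale are rejected rather than clamped.
        for label, score in (("value", self.value), ("feasibility", self.feasibility)):
            if not 1 <= score <= 5:
                raise ValueError(f"{label} must be between 1 and 5, got {score}")
        if self.requests_per_day < 0:
            raise ValueError("requests_per_day cannot be negative")
        # A value score without a metric and a baseline is just an opinion.
        if not self.value_metric.strip() or not self.baseline.strip():
            raise ValueError("a value score needs a metric and a baseline")
```

Requiring the metric and baseline at construction time mirrors the point above: a value score is not accepted until the team has said what value means and how it is measured today.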
1.3: Data Classification Audit
This is the most consequential task in the phase. It is also the one most likely to be deferred.
Every use case from the inventory gets mapped to a data classification tier: public, internal, confidential, or restricted. Then the classification gets cross-referenced against specific data types: PII, PHI, financial data, trade secrets, anything with a regulatory flag. For any use case that touches restricted or regulated data, the audit documents what controls are required and what deployment options are available.
The output of this task is what makes the platform decision in Task 1.6 possible. If any use case involves data that cannot leave the company's cloud account, the direct Anthropic API is not an option for that use case. That is not a technical opinion. It is a constraint that the classification audit surfaces. The teams that skip this task make an undocumented assumption that the data is fine to send to api.anthropic.com, and that assumption either gets caught by compliance late in the process or never gets caught at all.
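The constraint described here can be written down as a rule rather than left as tribal knowledge. The following is a minimal sketch under stated assumptions: the platform labels, the `must_stay_in_cloud_account` flag, and the function names are hypothetical, and the only rule encoded is the one in the text (data that cannot leave the company's cloud account rules out the direct API for that use case).

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Hypothetical labels for the three deployment surfaces.
ALL_PLATFORMS = ("direct_api", "bedrock", "vertex")


@dataclass(frozen=True)
class ClassifiedUseCase:
    name: str
    tier: Tier
    regulated_types: frozenset        # e.g. frozenset({"PII", "PHI"})
    must_stay_in_cloud_account: bool  # set by compliance, not by engineering


def deployment_options(uc: ClassifiedUseCase) -> list:
    """Return the platforms still available for this use case."""
    options = list(ALL_PLATFORMS)
    if uc.must_stay_in_cloud_account:
        options.remove("direct_api")
    return options


def needs_documented_controls(uc: ClassifiedUseCase) -> bool:
    """Restricted or regulated data requires controls to be written down."""
    return uc.tier is Tier.RESTRICTED or bool(uc.regulated_types)
```

The value of a rule like this is less the code than the fact that the assumption is explicit: someone has to set the flag, and that someone is the compliance team.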
1.4: Regulatory Requirements Mapping
This task belongs to the client, not the delivery partner. The client's legal and compliance team maps the applicable regulatory frameworks: SOC 2, ISO 27001, GDPR, HIPAA, CCPA, and anything industry-specific. They document what those frameworks require of AI system deployments: audit trail obligations, data retention requirements, third-party vendor assessment obligations for Anthropic as a vendor.
The reason this task has to be client-owned is that the authority is not delegable. Riptide can build a compliant architecture. Riptide cannot tell a financial services firm what their regulators require. That knowledge lives inside the company.
What the delivery partner does is ensure this task happens and that the output feeds into the infrastructure and platform work. The failure mode is letting this task slide because it is uncomfortable or slow-moving, then discovering in Phase 4 that the compliance documentation cannot be completed because the architecture does not have what the framework requires.
1.5: Infrastructure Assessment
Before designing the platform layer, you need to know what exists. The infrastructure assessment audits the current state across five categories: API gateway (Kong, Apigee, AWS API Gateway, or nothing), identity provider (Okta, Azure AD, Google Workspace), logging and observability stack (Splunk, Datadog, ELK), secrets management (HashiCorp Vault, AWS Secrets Manager), and container runtime (ECS, EKS, GKE, or on-premises).
This is not a comprehensive infrastructure audit. It is scoped to the components that the Claude deployment platform layer will need to integrate with or replace. A company that already has a well-configured Apigee gateway does not need to deploy a new API proxy in Phase 2. A company with no secrets management infrastructure does need to address that before API keys start circulating.
The assessment runs in parallel with the use case and classification work. It does not depend on those outputs, so there is no reason to sequence it after them.
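The assessment output can be as simple as one record with a slot per category, where an empty slot is a gap to close before Phase 2. This is a sketch only; the class name and field names are assumptions chosen to match the five categories above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class InfrastructureAssessment:
    """Current state per category; None means nothing is in place."""
    api_gateway: Optional[str]
    identity_provider: Optional[str]
    logging_stack: Optional[str]
    secrets_management: Optional[str]
    container_runtime: Optional[str]

    def gaps(self) -> list:
        """Categories with no existing component, in declaration order."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


current = InfrastructureAssessment(
    api_gateway="Apigee",
    identity_provider="Okta",
    logging_stack="Datadog",
    secrets_management=None,   # must be addressed before API keys circulate
    container_runtime="EKS",
)
```

Calling `current.gaps()` on a record like this lists the categories Phase 2 has to provision rather than integrate with.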
1.6: Platform Decision Matrix
There are three deployment options for Claude: the direct Anthropic API (api.anthropic.com), Amazon Bedrock, and Google Cloud Vertex AI. The choice is not primarily a technical preference. It is driven by the output of the data classification audit and the infrastructure assessment.
The decision matrix scores each option across six factors:
- Data residency requirements. If any use case involves data that must stay within a specific cloud account or geography, that may eliminate the direct API or constrain region selection on Bedrock and Vertex. The direct API supports US-only inference via the inference_geo parameter at a 1.1x pricing premium, but that is a guarantee about the geography where inference runs, not a private-cloud guarantee.
- Existing cloud footprint. A company that runs its critical systems on AWS and authenticates everything through IAM will find Bedrock integration dramatically simpler than building a parallel auth model for the direct API. The integration cost difference is real.
- Feature availability. The direct API gets every Anthropic feature first: prompt caching, the Batch API, Managed Agents when available. Bedrock and Vertex lag on new feature availability, sometimes by weeks, sometimes longer. For a deployment that expects to use new capabilities quickly, that matters.
- Network topology. Bedrock supports VPC endpoints and PrivateLink, meaning traffic to Claude never crosses the public internet. For organizations with strict network egress policies, this may be decisive.
- Cost structure. Bedrock adds a small markup to Anthropic list pricing. For low-volume deployments, the integration benefits often outweigh the cost difference. At high volume, the arithmetic changes.
- IAM integration. Bedrock uses AWS IAM for authentication. Vertex uses Google IAM. The direct API uses API keys. Each has different operational complexity for key rotation, permission scoping, and audit logging.
Platform Decision Matrix -- Task 1.6
Data Residency
- Direct API: Conditional. US-only inference via inference_geo at 1.1x -- not a private-cloud guarantee
- Bedrock: Native VPC. Traffic stays within your AWS account
- Vertex: Native VPC. Traffic stays within your GCP account
Cloud Footprint
- Direct API: Neutral. Separate auth model required regardless of existing stack
- Bedrock: AWS native. IAM, VPC, and tooling integrate directly
- Vertex: GCP native. IAM, VPC, and tooling integrate directly
Feature Lead
- Direct API: Ships first. Prompt caching, Batch API, and new models arrive here first
- Bedrock: Lag. Weeks to months behind on new features
- Vertex: Lag. Weeks to months behind on new features
Network Topology
- Direct API: Public internet. Requests cross the public internet to api.anthropic.com
- Bedrock: PrivateLink. VPC endpoints keep traffic off the public internet
- Vertex: VPC. Private connectivity available
Cost Structure
- Direct API: List price. Direct Anthropic pricing, no intermediary markup
- Bedrock: + Markup. Small premium over Anthropic list pricing
- Vertex: + Markup. Small premium over Anthropic list pricing
IAM Integration
- Direct API: API keys. Manual rotation and permission scoping required
- Bedrock: AWS IAM. Policy-based access control, CloudTrail audit logging
- Vertex: GCP IAM. Policy-based access control, Cloud Audit Logs
The choice follows from the data classification audit and infrastructure assessment, not from a preference. Score it, document the reasoning, and get the executive sponsor to approve it before Phase 2 starts.
Riptide delivers this task as a scored matrix with a recommendation and the reasoning documented. The executive sponsor sees this matrix in Task 1.9. The decision gets made once, with names attached. Re-litigating the platform choice in Phase 3 because someone was not in the room for this conversation is a recognized failure mode.
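A scored matrix of this kind reduces to a weighted sum per platform, with hard constraints removing options before any scoring happens. The sketch below assumes a 1-to-5 score per factor and client-chosen weights; the factor keys, platform labels, and function name are illustrative, not a description of how Riptide's matrix is implemented.

```python
# The six factors from the matrix, as hypothetical dictionary keys.
FACTORS = (
    "data_residency",
    "cloud_footprint",
    "feature_availability",
    "network_topology",
    "cost_structure",
    "iam_integration",
)


def rank_platforms(scores: dict, weights: dict, eliminated: frozenset = frozenset()) -> list:
    """Return (platform, weighted_total) pairs, highest total first.

    scores:     platform -> {factor -> score from 1 to 5}
    weights:    factor -> relative importance for this client
    eliminated: platforms ruled out by a hard constraint (for example,
                the classification audit), excluded before scoring
    """
    for factor in FACTORS:
        if factor not in weights:
            raise ValueError(f"missing weight for {factor}")
    totals = {}
    for platform, by_factor in scores.items():
        if platform in eliminated:
            continue
        missing = [f for f in FACTORS if f not in by_factor]
        if missing:
            raise ValueError(f"{platform} is missing scores for {missing}")
        totals[platform] = sum(weights[f] * by_factor[f] for f in FACTORS)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Keeping elimination separate from scoring matters: a platform that violates a data residency constraint should not be able to win on points elsewhere. The weights and scores, with the reasoning behind each, are what get attached to the sign-off.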
1.7: Use Case Prioritization and Pilot Selection
Stack-rank the scored inventory from Task 1.2. Select the top three for the pilot.
The selection criteria are high business value, high technical feasibility, and low data sensitivity. The third criterion is often the one that gets negotiated away. A team with a genuinely high-value use case that also touches sensitive data will push to include it in the pilot. The correct answer: pilot on the clean use cases, prove the platform, then bring the sensitive use cases into a hardened environment in Phase 4. A pilot that involves sensitive data before the governance layer is in place is a liability, not a proof of concept.
The selection should also ensure at least one use case per pilot business unit. A pilot that only proves value for engineering, or only proves value for operations, leaves the other unit waiting with no evidence base until after Phase 5.
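The selection rules can be sketched as a two-pass pick: first the best clean use case for each pilot business unit, then the remaining slots by rank. The ranking key used here (value plus feasibility, ties broken by value, then name) is an assumption for illustration; the article does not prescribe a formula, and the `Candidate` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    unit: str          # business unit
    value: int         # 1 to 5
    feasibility: int   # 1 to 5
    sensitivity: str   # "low", "medium", or "high"


def _rank_key(c: Candidate):
    # Assumed ranking: combined score, then value, then name for stability.
    return (-(c.value + c.feasibility), -c.value, c.name)


def select_pilot(candidates: list, pilot_units: set, size: int = 3) -> list:
    """Pick up to `size` low-sensitivity use cases, covering each unit first."""
    eligible = [c for c in candidates
                if c.sensitivity == "low" and c.unit in pilot_units]
    ranked = sorted(eligible, key=_rank_key)

    picked = []
    covered = set()
    # Pass 1: the best-ranked use case from each pilot business unit.
    for c in ranked:
        if len(picked) < size and c.unit not in covered:
            picked.append(c)
            covered.add(c.unit)
    # Pass 2: fill any remaining slots strictly by rank.
    for c in ranked:
        if len(picked) >= size:
            break
        if c not in picked:
            picked.append(c)
    return sorted(picked, key=_rank_key)
```

Note that sensitive use cases are filtered out before ranking, not scored down. That matches the point above: a high-value sensitive use case is deferred to the hardened environment, not traded off against the others.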
1.8: Cost Model Projection
Before the executive briefing, build the numbers. The cost model projects monthly API spend for the three pilot use cases.
The inputs are the model tier appropriate for each use case (Opus for complex reasoning, Sonnet for document analysis and generation, Haiku for classification and routing), the estimated average token count per request (input plus output), and the projected daily request volume from the use case scoring. Apply two savings levers where applicable:
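- Prompt caching: where system prompts are stable and repeated, cache hit rates of 70 to 90 percent are common, and tokens read from the cache are billed at roughly 10 percent of the base input price, a 90 percent reduction on those tokens. Cache writes carry a premium over the base input price, so the saving depends on how often a cached prefix is reused.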
- Batch API: where any use case can tolerate non-real-time processing, a 50 percent cost reduction applies on batch workloads.
Cost Model Projection -- Task 1.8
Range, not a point estimate
Inputs
- Model Tier: Haiku for classification and routing, Sonnet for analysis and generation, Opus for complex reasoning
- Token Count: Average input plus output tokens per request, estimated from representative examples
- Daily Volume: Requests per day per use case, taken from the volume scoring in Task 1.2
Savings Levers
- Prompt Caching: Apply where system prompts are stable and repeated. Cache hit rates of 70 to 90 percent are common on well-structured prompts.
- Batch API: Apply where any use case can tolerate non-real-time processing. Batch workloads do not compete with real-time requests for capacity.
Output Format
- $X (low) to $Y (high) per month, plus documented assumptions
Executives who see a range with documented assumptions can approve a real budget. Point estimates cannot be interrogated.
Projections built on guesses fail the Phase 3 gate, which requires actual cost to land within 20 percent of projection. Build the model on real token counts and real volume estimates.
The resulting number should be presented as a range with documented assumptions. Executives who see a point estimate without assumptions cannot have a real conversation about it. Executives who see a range with documented assumptions can approve a budget with appropriate contingency.
Cost projection errors in Phase 1 surface as budget overruns in Phase 3. The Phase 3 gate requires cost to land within 20 percent of projection. A projection built on guesses fails that gate.
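The arithmetic behind the projection and the gate check is short enough to write out. What follows is a simplified sketch, not a billing calculator: prices are placeholders to be taken from the current price sheet, input and output tokens are split because they are priced differently, the cache-write premium is ignored, and the batch discount is applied to the whole per-request cost. All names are hypothetical.

```python
from dataclasses import dataclass

CACHE_READ_MULTIPLIER = 0.10  # cached input billed at 10% of base (the 90% reduction)
BATCH_DISCOUNT = 0.50         # 50% reduction on batch workloads


@dataclass(frozen=True)
class Scenario:
    requests_per_day: float
    input_tokens: float              # average input tokens per request
    output_tokens: float             # average output tokens per request
    input_price_per_mtok: float      # placeholder, per million tokens
    output_price_per_mtok: float     # placeholder, per million tokens
    cacheable_fraction: float = 0.0  # share of input in a stable, repeated prefix
    cache_hit_rate: float = 0.0      # share of cacheable tokens served from cache
    batch_fraction: float = 0.0      # share of requests that can run as batch


def monthly_cost(s: Scenario, days: int = 30) -> float:
    """Projected monthly spend for one use case under one set of assumptions."""
    cached = s.input_tokens * s.cacheable_fraction * s.cache_hit_rate
    uncached = s.input_tokens - cached
    input_cost = (uncached + cached * CACHE_READ_MULTIPLIER) * s.input_price_per_mtok / 1_000_000
    output_cost = s.output_tokens * s.output_price_per_mtok / 1_000_000
    batch_factor = 1.0 - s.batch_fraction * BATCH_DISCOUNT
    return (input_cost + output_cost) * batch_factor * s.requests_per_day * days


def projection_range(low: Scenario, high: Scenario) -> tuple:
    """The low and high ends of the range, each from its own documented scenario."""
    return monthly_cost(low), monthly_cost(high)


def within_gate(projected: float, actual: float, tolerance: float = 0.20) -> bool:
    """True if actual spend lands within the tolerance of the projected figure."""
    if projected <= 0:
        raise ValueError("projected spend must be positive")
    return abs(actual - projected) / projected <= tolerance
```

Each `Scenario` doubles as the documented assumptions: the low and high ends of the range differ only in fields someone can point at and challenge. Which projected figure the Phase 3 gate is measured against (the midpoint, the high end, or a per-use-case number) is a choice to record in the sign-off; this sketch leaves it to the caller.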
1.9: Executive Briefing and Sign-Off
The phase ends with a documented decision. The briefing presents the use case ranking, the platform recommendation with the decision matrix, the cost projection, the risk assessment, and the proposed timeline. The executive sponsor reviews and approves in writing: which platform, which three pilot use cases, what budget, and what the security requirements are.
Not a verbal nod. A documented decision with a name attached. It bears repeating because the failure mode is accepting a soft yes and treating it as a hard one. Six weeks later, when the Foundation phase needs a budget code to run against, the soft yes becomes a two-week delay while the paperwork catches up.
The sign-off document is also what earns the right to start Phase 2. Phase 2 provisions the enterprise workspace, configures SSO, and deploys infrastructure. All of that costs money and requires IT resources. Those resources are available when there is an approved, documented decision. They are not reliably available when there is only momentum.
The Gate
The Phase 1 gate is executive sign-off on four specific items: the platform choice, the prioritized pilot use cases, the budget allocation, and the security requirements baseline from the regulatory mapping.
A gate is not a formality. It is a checkpoint with consequences. If any of those four items is missing, Phase 2 does not start. The infrastructure cannot be designed to the right spec if the platform choice is undecided. The pilot work cannot be scoped if the use cases are not confirmed. The cost tracking cannot be configured if the budget is not approved.
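As a checklist, the gate is four booleans and a name. A minimal sketch, with hypothetical item keys and record shape:

```python
from dataclasses import dataclass, field

# The four items the executive sponsor must approve in writing.
REQUIRED_ITEMS = (
    "platform_choice",
    "pilot_use_cases",
    "budget_allocation",
    "security_baseline",
)


@dataclass(frozen=True)
class SignOff:
    sponsor: str                                  # the named executive sponsor
    approved: dict = field(default_factory=dict)  # item -> approved in writing?


def missing_gate_items(signoff: SignOff) -> list:
    """Items not yet approved in writing; a verbal yes counts as missing."""
    return [item for item in REQUIRED_ITEMS if not signoff.approved.get(item, False)]


def phase_two_may_start(signoff: SignOff) -> bool:
    """Phase 2 starts only with a named sponsor and all four items approved."""
    return bool(signoff.sponsor.strip()) and not missing_gate_items(signoff)
```

The function returns the list of missing items rather than a bare yes or no so the schedule can name exactly what is blocking Phase 2.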
The teams that treat the gate as a formality to clear as quickly as possible are the teams that reach Phase 3 having built on an unstable foundation. The teams that treat it as a genuine checkpoint arrive at Phase 2 with a shared, documented understanding of what they are building and why.
What Gets Skipped and Why
Three tasks see the most pressure to compress.
Data classification (1.3) gets skipped because it requires a conversation with the compliance team, and the compliance team is slow, and everyone wants to start building. The financial services example at the opening of this piece illustrates what that decision costs.
Regulatory requirements mapping (1.4) gets deferred for the same reason, with the additional problem that it is a client-owned task, which means the delivery partner cannot accelerate it unilaterally. The right response to a slow compliance team is not to skip the task. It is to run it in parallel with the infrastructure assessment and use case work, and to name the dependency explicitly in the Phase 1 schedule.
The platform decision matrix (1.6) gets replaced by a preference. Someone on the client's team already has a strong opinion about Bedrock or a strong aversion to giving Anthropic direct API access, and the recommendation goes in that direction without doing the actual scoring. A platform preference is not a platform decision. The matrix forces the examination. Skip it and the assumptions baked into the preference never surface.
Phase 2, Foundation and Access Layer, is next. It is the phase most teams find unglamorous and most expensive to retrofit once it has been skipped. The infrastructure that Phase 2 builds is the layer that Phase 1's platform decision was made in reference to. Next in the series: what that layer actually requires, the sequencing across its eleven tasks, and where foundation work deferred to Phase 3 tends to show up as a crisis.
Andrew Poole
Founder of Riptide Consulting, an Anthropic-first AI engineering firm based in Carlsbad, CA. Building the intelligence layer for enterprise and growth-stage companies on the Anthropic platform.