Phase 1: Discovery and Assessment - The Two Weeks That Determine Everything

Consider the most expensive way to learn this lesson. A team is six months into a Claude deployment that is not going anywhere. There is a working pilot, there is executive enthusiasm, there is a team that believes in the technology. What there is not: a clear owner for the API cost, any documented understanding of which data classifications the use cases involve, or a deployment surface that matches the actual compliance requirements.

The pilot was built on the direct Anthropic API with individual developer keys. Once compliance is finally brought into the conversation, they point out that certain data in the use cases cannot be processed outside the company's own cloud account under the applicable regulatory framework. The entire pilot has to be rebuilt on Bedrock. Six months of work, half of it thrown away.

That is the cost of skipping Phase 1. Not the cost of two weeks of discovery. The cost of not doing it.

This article is the first deep-dive in this series on the enterprise Claude deployment roadmap. The pillar piece laid out all six phases and the gate between each one. This one goes inside Phase 1: the nine tasks, the decisions that need to happen, the places teams cut corners and the specific ways those cuts surface later.

Phase 1 runs two weeks, eight to ten working days. It is not research for its own sake. Every task produces a specific output that one of the downstream phases depends on. Compress the phase, cut a task, or go through the motions without producing the actual deliverable, and the dependency shows up broken in Phase 2, Phase 3, or worse, in production.

The Nine Tasks

The nine tasks are not bureaucratic overhead. Each produces a named output, and each output feeds a downstream phase. A placeholder in place of the real artifact counts as a skipped task.

Phase 1 -- Nine Tasks at a Glance

Each task feeds a downstream dependency.

1.1 Stakeholder Interviews: named sponsor plus 8-12 stakeholders scheduled
1.2 Use Case Workshop: scored inventory covering value, feasibility, sensitivity, and volume
1.3 Data Classification Audit: classification tier plus regulated data type per use case
1.4 Regulatory Mapping: applicable frameworks plus control requirements
1.5 Infrastructure Assessment: current state of gateway, IdP, logging, secrets, and containers
1.6 Platform Decision Matrix: scored recommendation of direct API, Bedrock, or Vertex
1.7 Pilot Use Case Selection: top 3 with high value, high feasibility, and low sensitivity
1.8 Cost Model Projection: monthly spend range with documented assumptions
1.9 Executive Briefing and Sign-Off: signed decision on platform, use cases, budget, and security baseline

Gate: the executive sponsor approves four items in writing: platform choice, prioritized pilot use cases, budget allocation, and security requirements baseline. Not a verbal nod.

1.1: Stakeholder Interview Schedule

Before any workshop, identify and schedule the right people. Discovery requires eight to twelve stakeholders across the target business units: process owners who understand the actual workflows, IT and security representatives who know the infrastructure and regulatory constraints, and a named executive sponsor who has the authority to approve the gate.

The interview structure matters. Forty-five minutes per stakeholder. The questions are consistent: what AI tooling are you using today, where do you feel constrained by manual work, what data does that work touch, and what does success look like in one year. The goal is not to gather opinions. It is to map the landscape before the use case workshop so you are not starting from a blank whiteboard.

One thing this task forces: you identify the executive sponsor in writing before Phase 1 is over. A deployment without a named executive sponsor does not have governance. It has enthusiasm, which runs out.

1.2: Use Case Inventory Workshop

Two hours per business unit, facilitated. The output is a scored inventory of every candidate Claude use case within that unit.

Each use case gets scored on four dimensions: business value (one to five), technical feasibility (one to five), data sensitivity (low, medium, or high), and estimated volume in requests per day. The scoring is not a vote. It is a structured conversation that forces specificity. High value means nothing until the team commits to what value means, how it would be measured, and what the baseline is today.

The reason to do this before touching the platform is simple: an LLM is not the right solution for every problem. Discovery catches the use cases that sound compelling in a meeting but turn out to be automation problems or search problems or process problems. Building a pilot on the wrong use case is not just wasted effort. It produces a failed demo that makes the technology look bad and the business case harder to reassemble.
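One row of the scored inventory can be sketched as a small record that refuses vague entries. This is a minimal illustration, not a prescribed template: the class name, the field names, and the `value_metric` and `baseline` fields are assumptions introduced here to reflect the point that a value score means nothing without a measure and a starting point.

```python
from dataclasses import dataclass

SENSITIVITY_LEVELS = ("low", "medium", "high")


@dataclass(frozen=True)
class UseCase:
    """One row of the scored inventory (hypothetical structure)."""
    name: str
    business_unit: str
    value: int             # business value, 1 to 5
    feasibility: int       # technical feasibility, 1 to 5
    sensitivity: str       # "low", "medium", or "high"
    requests_per_day: int  # estimated volume
    value_metric: str      # how value would be measured
    baseline: str          # what that metric is today

    def __post_init__(self) -> None:
        # Reject scores outside the 1 to 5 scale used in the workshop.
        for label, score in (("value", self.value), ("feasibility", self.feasibility)):
            if not 1 <= score <= 5:
                raise ValueError(f"{label} must be 1 to 5, got {score}")
        if self.sensitivity not in SENSITIVITY_LEVELS:
            raise ValueError(f"sensitivity must be one of {SENSITIVITY_LEVELS}")
        if self.requests_per_day < 0:
            raise ValueError("requests_per_day cannot be negative")
        # A score without a metric and a baseline is an opinion, not a score.
        if not self.value_metric.strip() or not self.baseline.strip():
            raise ValueError("a score needs a named metric and a baseline")
```

Forcing the metric and baseline at entry time keeps the specificity conversation in the workshop, where the process owners are in the room.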

1.3: Data Classification Audit

This is the most consequential task in the phase. It is also the one most likely to be deferred.

Every use case from the inventory gets mapped to a data classification tier: public, internal, confidential, or restricted. Then the classification gets cross-referenced against specific data types: PII, PHI, financial data, trade secrets, anything with a regulatory flag. For any use case that touches restricted or regulated data, the audit documents what controls are required and what deployment options are available.

The output of this task is what makes the platform decision in Task 1.6 possible. If any use case involves data that cannot leave the company's cloud account, the direct Anthropic API is not an option for that use case. That is not a technical opinion. It is a constraint that the classification audit surfaces. The teams that skip this task make an undocumented assumption that the data is fine to send to api.anthropic.com, and that assumption either gets caught by compliance late in the process or never gets caught at all.
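The way the audit constrains the platform choice can be sketched as a simple filter. This is an illustration under stated assumptions: the record shape, the `must_stay_in_cloud_account` flag, and the platform labels are hypothetical, and the flag is something compliance sets, not something engineering infers.

```python
from __future__ import annotations

from dataclasses import dataclass

TIERS = ("public", "internal", "confidential", "restricted")
PLATFORMS = ("direct_api", "bedrock", "vertex")


@dataclass(frozen=True)
class ClassifiedUseCase:
    """Audit output for one use case (hypothetical structure)."""
    name: str
    tier: str                         # one of TIERS
    regulated_types: frozenset[str]   # e.g. {"PII", "PHI"}
    must_stay_in_cloud_account: bool  # decided by compliance


def eligible_platforms(uc: ClassifiedUseCase) -> tuple[str, ...]:
    """Return the deployment options the audit leaves open."""
    if uc.tier not in TIERS:
        raise ValueError(f"unknown classification tier: {uc.tier}")
    if uc.must_stay_in_cloud_account:
        # The direct API is ruled out by the constraint, not by preference.
        return tuple(p for p in PLATFORMS if p != "direct_api")
    return PLATFORMS


def needs_documented_controls(uc: ClassifiedUseCase) -> bool:
    """Restricted or regulated data requires controls to be written down."""
    return uc.tier == "restricted" or bool(uc.regulated_types)
```

The point of the sketch is that the elimination happens before any scoring: a platform removed here should never reappear in the matrix.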

1.4: Regulatory Requirements Mapping

This task belongs to the client, not the delivery partner. The client's legal and compliance team maps the applicable regulatory frameworks: SOC 2, ISO 27001, GDPR, HIPAA, CCPA, and anything industry-specific. They document what those frameworks require of AI system deployments: audit trail obligations, data retention requirements, and third-party assessment obligations for Anthropic as a vendor.

The reason this task has to be client-owned is that the authority is not delegable. Riptide can build a compliant architecture. Riptide cannot tell a financial services firm what their regulators require. That knowledge lives inside the company.

What the delivery partner does is ensure this task happens and that the output feeds into the infrastructure and platform work. The failure mode is letting this task slide because it is uncomfortable or slow-moving, then discovering in Phase 4 that the compliance documentation cannot be completed because the architecture does not have what the framework requires.

1.5: Infrastructure Assessment

Before designing the platform layer, you need to know what exists. The infrastructure assessment audits the current state across five categories: API gateway (Kong, Apigee, AWS API Gateway, or nothing), identity provider (Okta, Azure AD, Google Workspace), logging and observability stack (Splunk, Datadog, ELK), secrets management (HashiCorp Vault, AWS Secrets Manager), and container runtime (ECS, EKS, GKE, or on-premises).

This is not a comprehensive infrastructure audit. It is scoped to the components that the Claude deployment platform layer will need to integrate with or replace. A company that already has a well-configured Apigee gateway does not need to deploy a new API proxy in Phase 2. A company with no secrets management infrastructure does need to address that before API keys start circulating.

The assessment runs in parallel with the use case and classification work. It does not depend on those outputs, so there is no reason to sequence it after them.

1.6: Platform Decision Matrix

Three deployment options for the Anthropic API: direct API (api.anthropic.com), AWS Bedrock, and Google Vertex AI. The choice is not primarily a technical preference. It is driven by the output of the data classification audit and the infrastructure assessment.

The decision matrix scores each option across six factors:

Data residency requirements. If any use case involves data that must stay within a specific cloud account or geography, that may eliminate the direct API or constrain region selection on Bedrock and Vertex. The direct API supports US-only inference via the inference_geo parameter at a 1.1x pricing premium, but that is a geographic guarantee, not a private-cloud guarantee.
Existing cloud footprint. A company that runs its critical systems on AWS and authenticates everything through IAM will find Bedrock integration dramatically simpler than building a parallel auth model for the direct API. The integration cost difference is real.
Feature availability. The direct API gets every Anthropic feature first: prompt caching, the Batch API, Managed Agents when available. Bedrock and Vertex lag on new feature availability, sometimes by weeks, sometimes longer. For a deployment that expects to use new capabilities quickly, that matters.
Network topology. Bedrock supports VPC endpoints and PrivateLink, meaning traffic to Claude never crosses the public internet. For organizations with strict network egress policies, this may be decisive.
Cost structure. Bedrock adds a small markup to Anthropic list pricing. For low-volume deployments, the integration benefits often outweigh the cost difference. At high volume, the arithmetic changes.
IAM integration. Bedrock uses AWS IAM for authentication. Vertex uses Google IAM. The direct API uses API keys. Each has different operational complexity for key rotation, permission scoping, and audit logging.

Platform Decision Matrix -- Task 1.6

Data Residency
Direct API: Conditional. US-only inference via inference_geo at 1.1x, which is not a private-cloud guarantee.
Bedrock: Native VPC. Traffic stays within your AWS account.
Vertex: Native VPC. Traffic stays within your GCP account.

Cloud Footprint
Direct API: Neutral. Separate auth model required regardless of existing stack.
Bedrock: AWS native. IAM, VPC, and tooling integrate directly.
Vertex: GCP native. IAM, VPC, and tooling integrate directly.

Feature Lead
Direct API: Ships first. Prompt caching, Batch API, and new models arrive here first.
Bedrock: Lag. Weeks to months behind on new features.
Vertex: Lag. Weeks to months behind on new features.

Network Topology
Direct API: Public internet. Requests cross the public internet to api.anthropic.com.
Bedrock: PrivateLink. VPC endpoints keep traffic off the public internet.
Vertex: VPC. Private connectivity available.

Cost Structure
Direct API: List price. Direct Anthropic pricing, no intermediary markup.
Bedrock: Markup. Small premium over Anthropic list pricing.
Vertex: Markup. Small premium over Anthropic list pricing.

IAM Integration
Direct API: API keys. Manual rotation and permission scoping required.
Bedrock: AWS IAM. Policy-based access control, CloudTrail audit logging.
Vertex: GCP IAM. Policy-based access control, Cloud Audit Logs.

The choice follows from the data classification audit and infrastructure assessment, not from a preference. Score it, document the reasoning, and get the executive sponsor to approve it before Phase 2 starts.

Riptide delivers this task as a scored matrix with a recommendation and the reasoning documented. The executive sponsor sees this matrix in Task 1.9. The decision gets made once, with names attached. Re-litigating the platform choice in Phase 3 because someone was not in the room for this conversation is a recognized failure mode.
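The mechanics of a scored matrix can be sketched as a weighted sum over the six factors. This is one plausible scoring scheme, not a description of how any particular matrix is built: the function name, the factor keys, the weight scale, and the 1 to 5 score scale are all assumptions made for illustration.

```python
FACTORS = (
    "data_residency", "cloud_footprint", "feature_availability",
    "network_topology", "cost_structure", "iam_integration",
)


def rank_platforms(weights, scores, eligible=None):
    """Return (platform, weighted_total) pairs, highest total first.

    weights:  factor -> importance (illustrative scale, e.g. 1 to 3)
    scores:   platform -> {factor -> score on a 1 to 5 scale}
    eligible: platforms that survive hard constraints from the
              classification audit; None means nothing is excluded
    """
    totals = {}
    for platform, factor_scores in scores.items():
        if eligible is not None and platform not in eligible:
            continue  # a hard constraint cannot be offset by a good score
        missing = [f for f in FACTORS if f not in factor_scores]
        if missing:
            raise ValueError(f"{platform} is missing scores for {missing}")
        totals[platform] = sum(weights[f] * factor_scores[f] for f in FACTORS)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Keeping the weights and per-factor scores as explicit inputs is what makes the result reviewable: anyone who was not in the room can see which factor drove the recommendation and argue with a number rather than a preference.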

1.7: Use Case Prioritization and Pilot Selection

Stack-rank the scored inventory from Task 1.2. Select the top three for the pilot.

The selection criteria are high business value, high technical feasibility, and low data sensitivity. The third criterion is often the one that gets negotiated away. A team with a genuinely high-value use case that also touches sensitive data will push to include it in the pilot. The correct answer: pilot on the clean use cases, prove the platform, then bring the sensitive use cases into a hardened environment in Phase 4. A pilot that involves sensitive data before the governance layer is in place is a liability, not a proof of concept.

The selection should also ensure at least one use case per pilot business unit. A pilot that only proves value for engineering, or only proves value for operations, leaves the other unit waiting with no evidence base until after Phase 5.
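The selection rule above can be sketched in two passes: first guarantee each pilot business unit its best clean candidate, then fill the remaining slots by rank. The ranking key (value plus feasibility, with value as the tiebreaker) and the `Candidate` record are assumptions for illustration; the source specifies the criteria, not a formula.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    unit: str
    value: int        # 1 to 5, from the workshop scoring
    feasibility: int  # 1 to 5, from the workshop scoring
    sensitivity: str  # "low", "medium", or "high"


def select_pilots(candidates, pilot_units, slots=3):
    """Pick pilot use cases: low sensitivity only, one per unit where possible."""
    def rank_key(c):
        return (c.value + c.feasibility, c.value)

    # Sensitive use cases wait for the hardened environment, whatever their score.
    ranked = sorted((c for c in candidates if c.sensitivity == "low"),
                    key=rank_key, reverse=True)

    picked = []
    # Pass 1: the best clean candidate from each pilot business unit.
    for unit in pilot_units:
        best = next((c for c in ranked if c.unit == unit), None)
        if best is not None and len(picked) < slots:
            picked.append(best)
    # Pass 2: fill whatever slots remain in rank order.
    for c in ranked:
        if len(picked) >= slots:
            break
        if c not in picked:
            picked.append(c)
    return sorted(picked, key=rank_key, reverse=True)
```

Note the trade-off the first pass makes explicit: a lower-scoring use case can displace a higher-scoring one so that no pilot unit is left without evidence.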

1.8: Cost Model Projection

Before the executive briefing, build the numbers. The cost model projects monthly API spend for the three pilot use cases.

The inputs are the model tier appropriate for each use case (Opus for complex reasoning, Sonnet for document analysis and generation, Haiku for classification and routing), the estimated average token count per request (input plus output), and the projected daily request volume from the use case scoring. Apply two savings levers where applicable:

Prompt caching: where system prompts are stable and repeated, cached input tokens are billed at a 90 percent reduction. Cache hit rates of 70 to 90 percent are common on well-structured prompts.
Batch API: where any use case can tolerate non-real-time processing, a 50 percent cost reduction applies on batch workloads.

Cost Model Projection -- Task 1.8

Range, not a point estimate.

Inputs
Model Tier: Haiku for classification and routing, Sonnet for analysis and generation, Opus for complex reasoning.
Token Count: average input plus output tokens per request, estimated from representative examples.
Daily Volume: requests per day per use case, taken from the volume scoring in Task 1.2.

Savings Levers
Prompt Caching: -90% on cached input tokens. Apply where system prompts are stable and repeated. Cache hit rates of 70 to 90 percent are common on well-structured prompts.
Batch API: -50% on total batch cost. Apply where any use case can tolerate non-real-time processing. Batch workloads do not compete with real-time requests for capacity.

Output Format
A LOW to HIGH range per month, plus documented assumptions. Executives who see a range with documented assumptions can approve a real budget. Point estimates cannot be interrogated.

Pitfall
Projections built on guesses fail the Phase 3 gate, which requires actual cost to land within 20 percent of projection. Build the model on real token counts and real volume estimates.

The resulting number should be presented as a range with documented assumptions. Executives who see a point estimate without assumptions cannot have a real conversation about it. Executives who see a range with documented assumptions can approve a budget with appropriate contingency.

Cost projection errors in Phase 1 surface as budget overruns in Phase 3. The Phase 3 gate requires cost to land within 20 percent of projection. A projection built on guesses fails that gate.
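The arithmetic behind the projection can be sketched as below. This is a simplified model under stated assumptions: prices are passed in per million tokens rather than hard-coded, cache writes are ignored, a 30-day month is assumed, and the function and parameter names are hypothetical. Which figure the Phase 3 check compares against (midpoint or either bound of the range) is also an assumption to settle when the range is signed off.

```python
CACHE_DISCOUNT = 0.90  # reduction on input tokens served from cache
BATCH_DISCOUNT = 0.50  # reduction on requests sent through the Batch API


def monthly_cost(requests_per_day, input_tokens, output_tokens,
                 input_price_per_mtok, output_price_per_mtok,
                 cacheable_fraction=0.0, cache_hit_rate=0.0,
                 batch_fraction=0.0, days_per_month=30):
    """Projected monthly spend for one use case, in the price currency."""
    # Only cacheable tokens that actually hit the cache get the reduction.
    cache_saving = cacheable_fraction * cache_hit_rate * CACHE_DISCOUNT
    input_cost = input_tokens * input_price_per_mtok / 1_000_000 * (1 - cache_saving)
    output_cost = output_tokens * output_price_per_mtok / 1_000_000
    per_request = input_cost + output_cost
    # Only the share of traffic that tolerates delay gets the batch reduction.
    batch_saving = batch_fraction * BATCH_DISCOUNT
    return per_request * requests_per_day * days_per_month * (1 - batch_saving)


def projected_range(low_assumptions, high_assumptions):
    """Return (low, high) monthly spend from two documented assumption sets."""
    return monthly_cost(**low_assumptions), monthly_cost(**high_assumptions)


def within_gate(actual, projected, tolerance=0.20):
    """True if actual spend lands within the tolerance of the projection."""
    return abs(actual - projected) <= tolerance * projected
```

Passing the low and high cases as explicit assumption sets means the documented assumptions and the numbers cannot drift apart: the range is whatever those two dictionaries produce.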

1.9: Executive Briefing and Sign-Off

The phase ends with a documented decision. The briefing presents the use case ranking, the platform recommendation with the decision matrix, the cost projection, the risk assessment, and the proposed timeline. The executive sponsor reviews and approves in writing: which platform, which three pilot use cases, what budget, and what the security requirements are.

Not a verbal nod. A documented decision with a name attached. It bears repeating because the failure mode is accepting a soft yes and treating it as a hard one. Six weeks later, when the Foundation phase needs a budget code to run against, the soft yes becomes a two-week delay while the paperwork catches up.

The sign-off document is also what earns the right to start Phase 2. Phase 2 provisions the enterprise workspace, configures SSO, and deploys infrastructure. All of that costs money and requires IT resources. Those resources are available when there is an approved, documented decision. They are not reliably available when there is only momentum.

The Gate

The Phase 1 gate is executive sign-off on four specific items: the platform choice, the prioritized pilot use cases, the budget allocation, and the security requirements baseline from the regulatory mapping.

A gate is not a formality. It is a checkpoint with consequences. If any of those four items is missing, Phase 2 does not start. The infrastructure cannot be designed to the right spec if the platform choice is undecided. The pilot work cannot be scoped if the use cases are not confirmed. The cost tracking cannot be configured if the budget is not approved.

The teams that treat the gate as a formality to clear as quickly as possible are the teams that reach Phase 3 having built on an unstable foundation. The teams that treat it as a genuine checkpoint arrive at Phase 2 with a shared, documented understanding of what they are building and why.
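The gate check itself is simple enough to write down. The sketch below assumes a hypothetical sign-off record in which each of the four items carries a named approver and a reference to the written approval; the key names are illustrative.

```python
GATE_ITEMS = (
    "platform_choice",
    "pilot_use_cases",
    "budget_allocation",
    "security_baseline",
)


def missing_gate_items(signoff):
    """Return the gate items that lack a named approver or a written record."""
    missing = []
    for item in GATE_ITEMS:
        record = signoff.get(item) or {}
        # A verbal nod leaves no document reference, so it does not count.
        if not (record.get("approved_by") and record.get("document_ref")):
            missing.append(item)
    return missing


def phase_2_may_start(signoff):
    """Phase 2 starts only when nothing is missing."""
    return not missing_gate_items(signoff)
```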

What Gets Skipped and Why

Three tasks see the most pressure to compress.

Data classification (1.3) gets skipped because it requires a conversation with the compliance team, and the compliance team is slow, and everyone wants to start building. The example at the opening of this piece illustrates what that decision costs.

Regulatory requirements mapping (1.4) gets deferred for the same reason, with the additional problem that it is a client-owned task, which means the delivery partner cannot accelerate it unilaterally. The right response to a slow compliance team is not to skip the task. It is to run it in parallel with the infrastructure assessment and use case work, and to name the dependency explicitly in the Phase 1 schedule.

The platform decision matrix (1.6) gets replaced by a preference. Someone on the client's team already has a strong opinion about Bedrock or a strong aversion to giving Anthropic direct API access, and the recommendation goes in that direction without doing the actual scoring. A platform preference is not a platform decision. The matrix forces the examination. Skip it and the assumptions baked into the preference never surface.

Phase 2, Foundation and Access Layer, is next. It is the phase most teams find unglamorous and the one that is most expensive to retrofit when skipped. The infrastructure that Phase 2 builds is the layer that Phase 1's platform decision was made in reference to. Next in the series: what that layer actually requires, the sequencing across its eleven tasks, and where foundation work deferred to Phase 3 tends to show up as a crisis.


Andrew Poole

Founder of Riptide Consulting, an Anthropic-first AI engineering firm based in Carlsbad, CA. Building the intelligence layer for enterprise and growth-stage companies on the Anthropic platform.