The Data Readiness Ascent: Basecamp to the Summit

Written by Tom Cross
On June 25, 2025

Note: SDG hosted an event on June 18, 2025, to help guide participants on the journey of getting their data ready for AI and advanced analytics. The following is a summary of our recommendations and guidance.

Everyone in IT is talking about how to leverage Generative AI (GenAI), but for many leaders, the steps remain a mystery. This very challenge set the stage for “Data Readiness for AI and Advanced Analytics,” an immersive session we hosted with our partners at AWS last week.

If you’re a leader in technology or business, you’ve felt it. GenAI has exploded into the conversation, promising to reinvent everything from customer service to software development. The hype is enormous, and so is the pressure to act. But where do you even begin?

The path to a meaningful GenAI solution can feel like an attempt to scale the most intimidating of mountains. The underlying models are brain-meltingly complex. Connecting them to your proprietary data seems fraught with security risks. The entire field presents itself as a highly technical, exclusive club. If you don’t have a platoon of PhDs on staff, it’s easy to feel like you’re already hopelessly behind.

But we believe there is a tractable course for companies of all sizes. The data maturity journey that culminates in powerful GenAI tools isn’t a single, terrifying leap. It’s a virtuous cycle, where each step along the way provides real, tangible business value. **You don’t leap to the summit; you just have to begin the climb.**

The Four Phases of Data Readiness

Think of your data journey as establishing a series of basecamps on your way to the summit. Each one is a stable plateau that offers new capabilities and a fantastic view of how far you’ve come.

1

The Experimenter

This is about getting on the mountain. Your first goal is to establish a data lake: a flexible, low-cost home for your information, built on a service like Amazon S3. To avoid creating a messy “data swamp,” we introduce the _Medallion Architecture_, a simple but powerful pattern for organization:

Bronze: Raw, untouched data.

Silver: Data that’s been cleaned and validated.

Gold: Business-ready data, perfect for analysis.

This architecture allows a small, focused team to start exploring the terrain with tools like Amazon Athena and Amazon SageMaker Unified Studio, running initial analyses and proving what’s possible.
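To make the three layers concrete, here is a minimal sketch in plain Python. It is an illustration only, not how you would run this at scale: the order records, field names, and the `to_silver` and `to_gold` helpers are all hypothetical, and in practice each layer would typically live under its own location in your data lake rather than in memory.

```python
from collections import defaultdict

# Hypothetical raw order events as they might land in the Bronze layer.
bronze = [
    {"order_id": "1001", "region": "east", "amount": "120.50"},
    {"order_id": "1002", "region": "WEST ", "amount": "80"},
    {"order_id": "1003", "region": "east", "amount": "not-a-number"},
    {"order_id": "1002", "region": "WEST ", "amount": "80"},  # duplicate delivery
]


def to_silver(rows):
    """Clean and validate: parse numbers, normalize text, drop bad or duplicate rows."""
    seen, silver = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline would likely quarantine the row for review
        if row["order_id"] in seen:
            continue  # skip rows we have already accepted
        seen.add(row["order_id"])
        silver.append({
            "order_id": int(row["order_id"]),
            "region": row["region"].strip().lower(),
            "amount": amount,
        })
    return silver


def to_gold(rows):
    """Aggregate into a business-ready shape: total revenue per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'east': 120.5, 'west': 80.0}
```

The point of the pattern is that Bronze is never modified: if a cleaning rule turns out to be wrong, you can fix `to_silver` and rebuild Silver and Gold from the untouched raw data.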

2

The Adopter

With a successful camp established, other teams want to join the expedition. The Adopter phase is about extending infrastructure and capabilities. Here, you evolve your data lake into a lakehouse, which combines the flexibility of a data lake with the power of a traditional data warehouse.

This isn’t just about static notebooks anymore. You start building formal data pipelines with tools like AWS Glue and delivering standardized reports and dashboards with Amazon Redshift and Amazon QuickSight. You’re no longer just exploring; you’re creating repeatable, valuable insights for the whole business.

3

The Scaler

To get more of your company up the mountain, you need approachable capabilities, safety, and governance. AWS Lake Formation acts as the centralized air traffic control for your data, ensuring that as more teams plug in, they can only access the data they’re authorized to see.
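The idea behind centralized permissions can be sketched in a few lines. This is a toy model of the concept, not the Lake Formation API: the `GRANTS` table, the team names, and the `authorized_columns` helper are all hypothetical, and a real deployment would define grants in the governance service itself rather than in application code.

```python
# Hypothetical grants: principal -> table -> set of readable columns.
GRANTS = {
    "analytics-team": {"gold.sales_by_region": {"region", "revenue"}},
    "finance-team": {"gold.sales_by_region": {"region", "revenue", "margin"}},
}


def authorized_columns(principal, table, requested):
    """Return the requested columns if every one is granted, else raise PermissionError."""
    allowed = GRANTS.get(principal, {}).get(table, set())
    denied = sorted(set(requested) - allowed)
    if denied:
        raise PermissionError(f"{principal} may not read {denied} from {table}")
    return list(requested)


# Finance has been granted the margin column, so this request succeeds.
print(authorized_columns("finance-team", "gold.sales_by_region", ["region", "margin"]))

# Analytics has not, so the same request is refused.
try:
    authorized_columns("analytics-team", "gold.sales_by_region", ["region", "margin"])
except PermissionError as err:
    print("denied:", err)
```

Keeping every grant in one place is what lets new teams plug in safely: access is decided once, centrally, instead of being re-implemented in each tool.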

This level of maturity is also the perfect launchpad for GenAI. You can now safely experiment. The key is the foundation model: a massive, pre-trained model such as Anthropic’s Claude. Amazon Bedrock gives you easy access to these models, and Bedrock Knowledge Bases let you securely ground them in your business context using clean, trusted data from your lakehouse. Now you’re asking the AI questions it could never answer until it had your data.
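Grounding a model in your own data boils down to two steps: retrieve the most relevant trusted content, then hand it to the model alongside the question. The sketch below shows that shape in plain Python. It does not call Bedrock or any model; the documents, the word-overlap scoring, and the `retrieve` and `build_prompt` helpers are hypothetical stand-ins for what a managed knowledge base would handle for you.

```python
import re

# Hypothetical snippets standing in for curated, trusted lakehouse content.
DOCUMENTS = [
    "Returns are accepted within 30 days of delivery with a receipt.",
    "Standard shipping takes 3 to 5 business days within the US.",
    "Gift cards never expire and cannot be redeemed for cash.",
]


def tokens(text):
    """Lowercase the text and split it into a set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q = tokens(question)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, documents):
    """Assemble a prompt that asks the model to answer from retrieved context only."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


question = "How many days does standard shipping take?"
prompt = build_prompt(question, DOCUMENTS)
print(prompt)
```

The resulting `prompt` is what would be sent to a foundation model. Because the context comes from data you already cleaned and governed, the quality of the answer depends directly on the quality of your lakehouse.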

4

The Data-Driven Innovator

You’ve reached the summit. From here, you can see the entire landscape and take advantage of the full breadth of the cloud. This is where you build truly differentiated GenAI solutions. The focus shifts from just accessing models to creatively enhancing them:

  • Provide richer context using specialized data stores such as vector stores (Amazon OpenSearch Service) and graph databases (Amazon Neptune).
  • Trigger intelligent actions by having the main AI consult with smaller, specialized machine learning models.
  • Build sophisticated, modern applications that use this intelligence, powered by best-in-class tools like AWS Fargate, AWS Lambda, and AWS Step Functions.
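For a feel of what a vector store contributes to the first item above, here is a minimal similarity search in plain Python. Everything in it is an assumption for illustration: the three-dimensional vectors are made up (real embedding models produce far longer ones), and the `cosine` and `nearest` helpers stand in for the indexing and search a managed vector store performs.

```python
import math

# Hypothetical 3-dimensional embeddings keyed by the content they represent.
INDEX = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "gift cards": [0.0, 0.2, 0.9],
}


def cosine(a, b):
    """Cosine similarity: closer to 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def nearest(query_vector, index, top_k=1):
    """Return the names of the top_k entries most similar to the query vector."""
    ranked = sorted(index, key=lambda name: cosine(query_vector, index[name]), reverse=True)
    return ranked[:top_k]


query = [0.2, 0.8, 0.0]  # pretend embedding of "when will my order arrive?"
print(nearest(query, INDEX))  # ['shipping times']
```

Unlike keyword matching, this kind of search finds content that is close in meaning even when it shares no words with the question, which is why vector stores are a common way to supply richer context.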

Access to foundation models is no longer the differentiator; how you wrap them in specialized software is. That’s where your real, lasting advantage will emerge.

The Future of Work

It’s natural to worry about a future where AI replaces the work we do today. But a much more probable future is one where we direct our engineering talent toward building, governing, and enhancing these amazing new capabilities.

This isn’t just a theoretical future; the conversation is happening right now.

In our session last week, one theme emerged: the journey up the mountain is a challenge everyone is facing, and no one wants to go it alone. The focus wasn’t on replacing people, but on empowering them by building the solid data foundation that unlocks the true potential of AI.

This session reinforced our core belief: the time to invest in data readiness is now. It’s the backbone of essential work in the years ahead. We will be holding similar sessions in the future, and if you want to be part of the conversation or are ready to partner with SDG on your own data journey, we’d love to talk.

Is your data ready for the future?