Backblaze's Jeronimo De Leon on Building Storage That Speeds Up AI

AI speed comes from storage, not just models: centralize, index, and cut latency. With Backblaze B2, teams ship faster and improve continuously.

Categorized in: AI News, Product Development
Published on: Oct 04, 2025

Scaling AI Without Storage Friction: An Interview with Jeronimo De Leon, Senior Product Manager, AI at Backblaze

Jeronimo De Leon has led AI-driven products across IBM Watson, Intelas, Welcome.AI, Bloomberg, and now Backblaze. His core message to product leaders is simple: storage architecture is the lever that determines how fast your AI products ship and improve.

Models and compute are abundant. Data accessibility isn't. If you want predictable velocity in training, inference, and iteration, solve storage early.

Why Storage Decides Your Speed to Value

Innovation cycles are faster, but the fundamentals haven't changed: where is the data, how is it stored, and how fast can you access it? Most teams still lose time consolidating fragmented sources before they can train anything useful.

De Leon's pattern across companies: teams that centralize, catalog, and secure their data move quicker and reduce risk. Storage decisions dictate time-to-train, experiment cadence, and the quality of inference feedback loops.

Where Storage Matters Across the AI Lifecycle

  • Ingest and processing: Centralize, normalize, and tag everything. Clean, well-processed data beats "more data."
  • Training: Scalable, high-throughput storage feeds massive datasets to multiple GPU clusters without bottlenecks.
  • Inference: Capture outputs and user feedback to fuel continuous improvement.
  • Monitoring: Keep traceable records to debug drift, bias, and performance regressions.

As De Leon puts it: "It's not hoarding if it's data." Collect broadly and make it searchable. The winners treat storage as a design choice, not an afterthought.

Startups vs. Enterprises: Different Constraints, Same Goal

Startups: The first hurdle is acquiring enough data; the next is cost and architecture. Poor choices here stall growth when usage spikes.

Enterprises: Data exists but lives across silos, legacy systems, and compliance gates. Consolidation and governed access are the main blockers.

In both cases, teams that plan for cost, performance, and accessibility together build compounding advantage.

The Most Pressing Barrier: Latency

Among cost, latency, security, and compliance, latency is the one that hits users directly. Slow training delays learning cycles. Slow inference kills adoption.

Prioritize storage that minimizes latency across training and serving. Then handle cost efficiency and compliance in parallel as scale grows.

What "Flexible Access" Really Means

De Leon stresses "smart archiving": centralizing information into a structured, searchable system. Unify formats, normalize, tag, and index for future queries.

This turns storage into a product capability. Teams move faster, experiment more, and ship improvements with less friction.
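The "smart archiving" idea above, unify, normalize, tag, and index, can be sketched as a tiny tag-indexed catalog. This is a minimal in-memory sketch under assumed field names; a real system would store this metadata alongside objects in the bucket or in a catalog service.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One stored object with normalized metadata; fields are illustrative."""
    key: str                 # object path in the bucket
    source: str              # originating system, lowercased during normalization
    tags: set[str] = field(default_factory=set)


class SmartArchive:
    """A small tag index over archived objects: normalize on ingest, query later."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}
        self._tag_index: dict[str, set[str]] = {}  # tag -> set of object keys

    def add(self, key: str, source: str, tags: list[str]) -> None:
        norm_tags = {t.strip().lower() for t in tags}  # normalize tags on ingest
        self._entries[key] = CatalogEntry(key, source.lower(), norm_tags)
        for t in norm_tags:
            self._tag_index.setdefault(t, set()).add(key)

    def search(self, tag: str) -> list[str]:
        """Return object keys carrying the given tag, enabling future queries."""
        return sorted(self._tag_index.get(tag.strip().lower(), set()))
```

Normalizing at ingest time is what makes later queries cheap: the index is built once, so "find all labeled images" is a lookup rather than a scan.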

Real-World Proof: Decart AI and Wynd Labs

Decart AI: Focused on training at scale. With Backblaze B2 they scaled to 16 PB in 90 days, trained across multiple GPU clusters with zero egress cost, and achieved 10x efficiency over competitors. Less wrangling, more iteration.

Wynd Labs: Focused on data access for customers. They ingest petabytes daily and serve tens of petabytes monthly. High performance and free egress let them meet enterprise demand and reinvest in product.

In both cases, storage moved from constraint to enabler.

Balancing Performance and Cost as Models Grow

Plan for your product's long-term data usage: collecting, processing, moving, training, and inference are core product functions now. If you don't price and architect for those loops early, costs and delays compound.

Design storage with the product roadmap in mind, not just today's projects. That's how you preserve speed as scale increases.

Security, Compliance, and Trust

  • Encryption by default
  • Fine-grained permissions
  • Audit trails and data residency options
  • End-to-end data lineage to track sources, processing, and model usage

Governance shouldn't slow teams down. Strong defaults paired with good usability let product, data, and security work in sync.

Implementing Backblaze B2 for AI Workloads

B2 is S3 compatible, so it plugs into existing MLOps and compute stacks without re-architecture. De Leon recommends starting with a proof of concept to validate migration, performance, and integration.
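Because B2 speaks the S3 API, a standard client such as boto3 can point at it by overriding the endpoint, which is what lets existing pipelines plug in without re-architecture. This is a configuration sketch: the endpoint URL, bucket name, object keys, and environment variable names are illustrative assumptions; your B2 account shows the actual S3 endpoint for your buckets.

```python
import os

import boto3  # any S3-compatible SDK works; boto3 shown as an example

# Endpoint and credentials below are illustrative placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-west-004.backblazeb2.com",  # your B2 S3 endpoint
    aws_access_key_id=os.environ["B2_KEY_ID"],              # B2 application key ID
    aws_secret_access_key=os.environ["B2_APP_KEY"],         # B2 application key
)

# Existing S3-based tooling keeps working: upload a training shard...
s3.upload_file("shard-0001.tar", "my-training-bucket", "datasets/shard-0001.tar")

# ...and list what is already there.
resp = s3.list_objects_v2(Bucket="my-training-bucket", Prefix="datasets/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```

A proof of concept along these lines, swap the endpoint, run your real pipeline, benchmark throughput, is exactly the validation step De Leon recommends before a large migration.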

From there, tune throughput, data movement, and orchestration so you can train across clusters, serve inference reliably, and iterate without waiting on infrastructure.

Explore Backblaze B2

What's Next: LLMs, Exabyte Data, and Multi-Cloud

Storage is shifting from passive archive to an active part of data orchestration. With LLMs and exabyte-scale datasets, fast access and high throughput are baseline requirements.

De Leon also points to AI agents that depend on data movement and context to automate workflows. The storage layer will need to support these patterns across hybrid and multi-cloud setups.

Practical Playbook for Product Teams

  • Inventory all data sources; retire or merge duplicates.
  • Define a common schema, tags, and retention policy for "smart archiving."
  • Select S3-compatible storage with predictable costs and favorable egress.
  • Co-locate storage and compute to reduce latency for training and inference.
  • Instrument end-to-end latency; set SLOs for training throughput and P95/P99 inference.
  • Enforce encryption, least-privilege access, audit logs, and data residency from day one.
  • Implement data lineage so model outputs can be traced and audited.
  • Run a proof of concept before large migrations; benchmark throughput and cost.
  • Plan tiering (hot/warm/cold) based on access patterns and product needs.
  • Close the loop: store inference feedback and user signals for continuous improvement.
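One playbook item above, instrumenting latency against P95/P99 SLOs, can be sketched with a nearest-rank percentile check. The function names and budget values are illustrative assumptions, not a prescribed tool.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of latency samples (pct in (0, 100])."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # nearest-rank method
    return ordered[rank - 1]


def check_inference_slo(latencies_ms: list[float],
                        p95_budget_ms: float,
                        p99_budget_ms: float) -> dict:
    """Compare measured P95/P99 inference latency against SLO budgets."""
    p95 = percentile(latencies_ms, 95)
    p99 = percentile(latencies_ms, 99)
    return {
        "p95_ms": p95,
        "p99_ms": p99,
        "p95_ok": p95 <= p95_budget_ms,
        "p99_ok": p99 <= p99_budget_ms,
    }
```

Running a check like this continuously is what turns "minimize latency" from a slogan into a measurable gate: a P99 breach surfaces storage or serving regressions before users feel them.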

Further Learning

If your team is upskilling for AI product work and MLOps, explore role-based learning paths here: Complete AI Training - Courses by Job

To see how storage choices can accelerate your roadmap, visit Backblaze B2.