Confidential
Chapter 06 · Main Report

Cost-Benefit Methodology


Why We Take This Approach

Most consulting engagements estimate value from the top down: "steel companies typically save X% with AI." We do the opposite. Every number in this document traces to a specific operational metric at a specific Cleveland-Cliffs site, grounded in conversations with the people who live with that metric every day.

We spent four weeks across four facilities — Cleveland Works, Middletown Works, Tilden Mine, and Burns Harbor — conducting over 120 practitioner conversations. The value estimates that follow are built from what your operators, maintenance leaders, and process engineers told us about what costs money, how much, and how often. Where we use industry benchmarks at all, it is only to sanity-check what your own people reported.

This chapter describes the methodology. Chapter 7 applies it to the full portfolio.


How We Estimate Value

Six Value Categories

Each recommended workstream generates value through one or more of these mechanisms:

| Category | Definition | How We Measure | Example |
|---|---|---|---|
| Throughput gain | More production from existing assets | Tons/year × margin/ton | Reducing unplanned downtime frees additional heats per day at the constrained asset |
| Cost avoidance | Preventing failures and reducing waste | Event cost × frequency reduction | Cobble cost per event × cobble frequency × prevention rate |
| Direct cost reduction | Spending less on inputs | Current spend × % reduction | $50M annual reagent spend × 5% optimization = $2.5M |
| Efficiency gain | Same output with less effort | Hours saved × loaded labor cost | 2 hours/day of manual logistics planning × 365 days × labor rate |
| Risk mitigation | Reducing probability or impact of costly events | Expected loss × probability reduction | Environmental compliance exposure, safety incident cost, knowledge-loss impact |
| Inventory optimization | Reducing carrying costs | Inventory value × carrying cost % × reduction % | $104M parts inventory × 25% carrying cost × 10% reduction |

Most workstreams touch two or three categories. We account for each separately and sum them — we do not double-count.
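
To make the arithmetic concrete, here is a minimal Python sketch of two of the category formulas, using the worked figures from the table above. The function names are ours for illustration; the inputs are the table's examples, not audited site data.

```python
# Per-category value formulas from the table above, applied to the table's
# worked figures. A minimal sketch; inputs are illustrative, not audited data.

def direct_cost_reduction(annual_spend: float, pct_reduction: float) -> float:
    """Direct cost reduction: current spend x % reduction."""
    return annual_spend * pct_reduction

def inventory_optimization(inventory_value: float, carrying_cost_pct: float,
                           reduction_pct: float) -> float:
    """Inventory optimization: inventory value x carrying cost % x reduction %."""
    return inventory_value * carrying_cost_pct * reduction_pct

reagent = direct_cost_reduction(50_000_000, 0.05)        # $2,500,000
parts = inventory_optimization(104_000_000, 0.25, 0.10)  # $2,600,000

# A workstream touching both categories sums them, each counted once:
print(f"${reagent + parts:,.0f}")  # $5,100,000
```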

The Five-Step Estimation Process

For each workstream at each site, we follow the same disciplined process:

Step 1 — Identify the operational metric. What did your people tell us costs money? We trace every estimate back to specific conversations documented in the site initiative registries, with direct quotes and context.

Step 2 — Quantify the current state. What is the metric today? For example: 70% reactive maintenance mix, $50M in annual reagent spend, or 40 coil-movement trips per day. Each data point is tagged by source — whether it was confirmed by a stakeholder, estimated from available data, or benchmarked from comparable operations.

Step 3 — Define a realistic target state. Not a theoretical optimum — an achievable improvement given your systems, your people, and your operating constraints. We benchmark targets against CLF's own best-performing sites where possible, and against industry comparables where necessary. We always present a conservative target alongside a stretch target.

Step 4 — Calculate the delta. The gap between current state and target state, multiplied by unit economics, gives us addressable value. We then apply a capture rate — because no initiative captures 100% of addressable value. Our capture rates are:

  • 30-50% for early-phase quick wins (Horizon 1)
  • 50-70% for established initiatives with proven data (Horizon 2)
  • 70-90% for mature, fully integrated programs (Horizon 3)

Step 5 — Sensitivity check. What if the base metric is off by 20%? What if the improvement is half of what we estimate? We run these scenarios and present every estimate as a range: conservative, expected, and optimistic.
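
The sketch below walks steps 2 through 5 for one hypothetical workstream. The maintenance-mix target and the per-point unit economics are placeholders we invented for illustration; only the Horizon 1 capture rates and the 20% sensitivity swing come from the process described above.

```python
# Sketch of steps 2-5 for one hypothetical workstream. The maintenance-mix
# figures and per-point unit economics below are placeholders, not CLF data.

def addressable_value(current: float, target: float, unit_value: float) -> float:
    """Step 4: (current state - target state) x unit economics."""
    return (current - target) * unit_value

def estimate_range(addressable: float, capture_low: float, capture_high: float):
    """Apply capture rates, plus step 5's check that the base may be 20% off."""
    conservative = addressable * 0.80 * capture_low  # base 20% lower, low capture
    expected = addressable * (capture_low + capture_high) / 2
    optimistic = addressable * capture_high
    return conservative, expected, optimistic

# Hypothetical: reactive maintenance mix from 70% to 55%, at an assumed
# $100K of avoidable cost per percentage point -> $1.5M addressable.
addr = addressable_value(current=70, target=55, unit_value=100_000)
print(estimate_range(addr, capture_low=0.30, capture_high=0.50))  # Horizon 1
# -> (360000.0, 600000.0, 750000.0)
```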

What We Do Not Do

  • No industry hand-waving. "Industry reports say steel companies save $X with AI" is not evidence. Our numbers come from CLF's own operations.
  • No T-shirt sizing. "Small, medium, large" is indefensible in a business case. Every estimate has math behind it.
  • No single-point estimates. Always ranges. Always stated assumptions. Always decomposed into components you can verify independently.

How We Estimate Cost

Phase-Type Template Methodology

Cost estimates follow the same transparency principle as value estimates. Rather than building each estimate from scratch (which introduces subjective drift across 30 project-phases), we classify each phase into a structural archetype and derive team composition from that classification. The archetypes are anchored to charter actuals: projects where the scope, team, and cost have been worked through in detail.

The anchors (two firm, one indicative):

  • PRJ-10 Tilden Concentrator Phase 1: $312,000 (3.25 FTE, 8 weeks, firm charter)
  • PRJ-03 Cleveland PdM Phase 1: $236,000 (2.4 FTE, 8 weeks, firm charter)
  • PRJ-10 Tilden Concentrator Phase 2: $1.3-1.56M (9.5-11 FTE, 12 weeks, indicative charter)

Five Phase Types

Every project-phase is classified into one of five structural archetypes based on observable properties — number of sites, data sources, AI complexity, deployment depth, and organizational change required.

| Type | What It Is | Team Shape | Duration | Cost Range |
|---|---|---|---|---|
| A: Proof of Value | Bounded experiment, static data, prove the thesis | 2.5-4 FTE, lean | 7-10 weeks | $200-350K |
| B: Operational Pilot | Working systems, may span 2 sites, more complex ML | 4-7 FTE | 10-14 weeks | $400-750K |
| C: Full System Build | Live integration, plant-floor, operator-facing | 7-11 FTE + IE domain | 12-16 weeks | $1.0-1.6M |
| D: Site Extension | Proven system adapted to a new site | 3-5 FTE | 8-12 weeks | $250-500K |
| E: Cross-Site Platform | Enterprise analytics, corporate dashboards | 4-7 FTE | 10-14 weeks | $400-800K |

Classification is based on structural properties, not subjective judgment. For example, a phase with physics-informed ML, plant-floor deployment, and operator training is a Type C regardless of which project it belongs to. Complexity modifiers (dual-site parallel, CV components, legacy system integration) adjust within the type range.
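
This decision logic can be expressed as a simple rule cascade. The sketch below is an illustration only: the property names, ordering, and thresholds are our assumptions about how the structural tests might be encoded, not the actual classification rules used in the analysis.

```python
# Illustrative rule cascade for phase classification. Property names and
# rule ordering are assumptions, not the report's actual classification logic.

from dataclasses import dataclass

@dataclass
class PhaseProperties:
    sites: int
    live_integration: bool      # plant-floor / operator-facing deployment
    complex_ml: bool            # e.g., physics-informed ML, CV components
    reuses_proven_system: bool  # adapting an existing build to a new site
    cross_site_platform: bool   # enterprise analytics / corporate dashboards

def classify(p: PhaseProperties) -> str:
    if p.cross_site_platform:
        return "E: Cross-Site Platform"
    if p.reuses_proven_system:
        return "D: Site Extension"
    if p.live_integration:
        return "C: Full System Build"
    if p.complex_ml or p.sites >= 2:
        return "B: Operational Pilot"
    return "A: Proof of Value"

# A phase with physics-informed ML deployed to the plant floor is Type C,
# regardless of which project it belongs to:
print(classify(PhaseProperties(1, True, True, False, False)))
```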

Cost Components Within Each Phase

| Component | What It Includes | How It Maps |
|---|---|---|
| AI/ML development | Model development, training, validation, prompt engineering | "ai" topic features in the feature audit trail |
| Data engineering | Pipelines, ETL, data quality, integration | "dev python" and "bi" topic features |
| Application & UX | Dashboards, operator interfaces, workflow integration | "dev" topic features |
| Infrastructure | Cloud compute, pipeline orchestration, model serving | "devops" topic features |
| IE domain roles | Metallurgists, process engineers, change management | Non-feature roles added in team sizing |
| Project management | Coordination, stakeholder alignment, gate management | 0.25-0.50 FTE per phase |
| Contingency | 20% buffer on hours | Applied uniformly |

Blended rate: $250/hr across all resources (Vooban + IE). Charter actuals used differentiated rates ($175-$300); the blended rate is calibrated to reproduce charter totals within acceptable variance (±15%).
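
One way to check the calibration: assuming a 40-hour FTE week (our assumption; the charters do not state it) and the uniform 20% contingency on hours, a simple formula lands on the firm anchors, as the sketch below shows.

```python
# Check that a blended-rate formula reproduces the firm charter anchors.
# Assumes a 40-hour FTE week, which is our assumption for illustration.

RATE = 250          # blended $/hr across Vooban + IE resources
CONTINGENCY = 1.20  # 20% buffer on hours, applied uniformly

def phase_cost(fte: float, weeks: int) -> float:
    return fte * weeks * 40 * CONTINGENCY * RATE

print(phase_cost(3.25, 8))  # 312,000 -> matches PRJ-10 Phase 1 exactly
print(phase_cost(2.40, 8))  # 230,400 -> within 2.4% of PRJ-03's $236,000
```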

Phased Cost Structure

Every project is costed per phase, not as a lump sum. This matches the milestone-gate structure described in Chapter 4:

  1. Phase 1 — Prove it at a single site. Smallest investment, highest learning. Typically Type A or B.
  2. Phase 2 — Build and scale. Full system build at entry site, often with parallel scope at a second site. Typically Type C.
  3. Phase 3 — Platform and expand. Cross-site deployment, corporate dashboards. Typically Type D or E.
  4. Annual run rate — 15-20% of build cost per year for monitoring, model maintenance, and support.

This structure means Cleveland-Cliffs never commits to a large program upfront. Each phase has a defined budget, a defined deliverable, and a go/no-go gate before the next phase begins. The total program ($22.6M over 24 months) flows through gates — the maximum at-risk amount at any point is the current phase investment.
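
A short sketch of how the gate structure bounds exposure. The phase budgets below are illustrative mid-points of the type cost ranges above, not figures from the actual portfolio.

```python
# How the gate structure bounds exposure. Phase budgets are illustrative
# mid-points of the type cost ranges above, not portfolio figures.

phases = [
    ("Phase 1 (Type A)", 275_000),
    ("Phase 2 (Type C)", 1_300_000),
    ("Phase 3 (Type D)", 375_000),
]

build_cost = 0
for name, budget in phases:
    # At each gate the at-risk amount is only the current phase budget;
    # later phases accrue only on a 'go' decision.
    print(f"{name}: at-risk ${budget:,}")
    build_cost += budget

low, high = 0.15 * build_cost, 0.20 * build_cost  # annual run rate
print(f"Build total ${build_cost:,}; run rate ${low:,.0f}-${high:,.0f}/yr")
```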

Auditability

Every cost figure traces through a four-step chain:

  1. Project description (ch5)
  2. Feature decomposition (features.tsv)
  3. Phase classification
  4. Team + cost (team.md)

Features are the audit trail — they document what gets delivered in each phase and justify the classification. The classification drives the team template. The team template produces the cost. If any number is challenged, the chain shows exactly where each assumption enters.


The Self-Funding Logic

A critical design principle of this roadmap is that early phases generate measurable returns that offset investment in later phases. The program is structured so that corporate funds the initial prove-it phase, and demonstrated value builds the case for continued investment.

| Period | Activity | Milestone Gate |
|---|---|---|
| Months 0-6 | Phase 1 investment: corporate funds the prove-it phase across selected workstreams. | -- |
| Months 3-6 | Quick-win value materializes: inventory reduction frees working capital; procurement automation reduces cycle cost; knowledge capture reduces training time and risk. | -- |
| Months 6-12 | Phase 1 returns offset Phase 2 investment. Demonstrated ROI justifies continued funding. Site leadership sees tangible value and buys in for expansion. | Corporate decides to continue, pause, or redirect. |
| Months 12-24 | Self-sustaining program. Cumulative returns exceed cumulative investment. Data foundation built by early projects enables advanced analytics at lower marginal cost. | Corporate decides on full-scale investment. |

This is not theoretical. The workstream sequencing in Chapter 5 is deliberately ordered so that high-confidence, shorter-payback initiatives come first. They prove the approach, generate savings, and create the data infrastructure that makes subsequent phases faster and cheaper to deliver.


Confidence and Sensitivity

How We Express Confidence

Every value estimate in Chapter 7 carries a confidence indicator:

| Confidence Level | What It Means | Typical Basis |
|---|---|---|
| High | Base metric confirmed by multiple stakeholders; improvement rate benchmarked against CLF's own performance | Stakeholder-confirmed numbers, internal CLF data |
| Medium | Base metric confirmed by at least one stakeholder; improvement rate based on comparable operations | Single-source confirmation, analogous site data |
| Low | Base metric estimated from partial data; improvement rate based on industry benchmarks | Limited site data, external references |

We default to the conservative end of every range. Where confidence is low, we flag it explicitly and explain what additional data would sharpen the estimate.

Sensitivity Analysis

For the portfolio as a whole and for each workstream individually, we test three scenarios:

  1. What if our value estimates are 30% too high? Does the program still pay back? At what timeline?
  2. What if costs run 25% over estimate? Where does that push the break-even point?
  3. What if we capture only half the improvement we project? Which workstreams still stand on their own?

These scenarios are presented in Chapter 7. The purpose is straightforward: you should be able to see that even under pessimistic assumptions, the recommended program represents a sound investment — and if it does not, you should see exactly where the economics become marginal.
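
As a sketch of the mechanics, the snippet below runs the three stress scenarios against placeholder portfolio numbers. The $22.6M program cost comes from this chapter; the annual value figure is purely illustrative, standing in for the portfolio totals presented in Chapter 7.

```python
# The three stress scenarios against placeholder numbers. Program cost
# ($22.6M) is from this chapter; annual value is purely illustrative.

def payback_months(annual_value: float, cost: float) -> float:
    return 12 * cost / annual_value

BASE_VALUE, BASE_COST = 30_000_000, 22_600_000  # annual value is a placeholder

scenarios = {
    "expected": (BASE_VALUE, BASE_COST),
    # Estimates 30% too high -> realized value = estimate / 1.3
    "value estimates 30% too high": (BASE_VALUE / 1.30, BASE_COST),
    "costs 25% over estimate": (BASE_VALUE, BASE_COST * 1.25),
    "half the projected improvement": (BASE_VALUE * 0.50, BASE_COST),
}
for name, (value, cost) in scenarios.items():
    print(f"{name}: payback ~{payback_months(value, cost):.1f} months")
```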

Assumptions We Are Making

For full transparency, here are the standing assumptions across all estimates:

  • Labor rates use loaded costs (salary + benefits + overhead), not base salary
  • Throughput gains assume the market can absorb additional production (i.e., demand is not the constraint)
  • Capture rates assume adequate change management and champion engagement at each site
  • Infrastructure costs assume CLF proceeds with Databricks as the enterprise data platform; estimates would increase if each workstream must build its own data infrastructure independently
  • Timeline estimates assume dedicated CLF resources are assigned to each active workstream as outlined in the resource plan

Any of these assumptions can be adjusted. The decomposed structure of our estimates means you can change a single input and see the effect ripple through to the bottom line.