Knowledge Pack · Pre-Solution

Intermodal Operations Hub — Predictive Analytics

The complete picture of what we have and what we are trying to solve — before we talk about how to solve it.

845,277 equipment-activity events · 119 fields per record · 32 terminals · ~3.75 GB single JSON source export

1 · Problem Statement

An Intermodal Operations Hub (IO Hub) moves containers and trailers across the country using rail for the long haul and trucks for first/last mile. Every day, thousands of containers, chassis, and railcars flow through dozens of terminals — gating in, sitting in the yard, mounting onto trains, traveling, and gating out again.

The operation is reactive. Teams find out a yard is full after it is full, learn a gate is congested while trucks are already queued, and discover a chassis shortage when a driver is already waiting. There is rich event data, but it is not yet turned into foresight.

In plain words

We have a detailed minute-by-minute diary of every container's movements, but no one has used that diary to see a few hours into the future — so the yard, the gate, and the equipment are all managed by reacting instead of planning.

The core problem: we cannot reliably answer simple forward-looking questions such as —

  • “How full will Terminal X's yard be in 6 hours? Will it breach capacity?”
  • “How many trucks will hit the gate in the next 1–3 hours?”
  • “Which containers are sitting too long (dwell), and which equipment is idle or bad-ordered?”
  • “Will we have enough empties / chassis / railcars where and when they're needed?”

2 · What We're Trying to Achieve

Turn the existing equipmentActivityReported event stream into predictive, actionable insight — moving the operation from reactive to proactive.

  • See ahead: Forecast yard fill, gate flow & dwell, hours in advance
  • Prevent: Flag congestion & shortages before they bite
  • Optimize: Plan space, staffing, lanes & equipment reuse
  • Answer: Let users ask plain-English questions of the data

Scope of THIS document

This is the problem & source brief. It establishes the business, the operations, the use cases, and exactly what the source data contains. The solution (data pipeline, models, forecasts) is discussed separately.

3 · The Business — Intermodal 101

Intermodal transportation moves freight using more than one mode — primarily rail (long-haul) and truck (first/last mile) — without handling the freight itself when switching modes, because it stays inside the same container.

Core assets

Containers (CSXU / UMAX …) · Chassis (wheeled platforms) · Railcars & Trains

The primary objective (verbatim from the business doc)

“Ensure the right equipment (containers + chassis) is available at the right terminal, at the right time, to meet customer demand while optimizing cost and utilization.”

End-to-end flow of a container

  1. Container request: Reservation / booking is placed.
  2. Empty positioning: Empty container staged at the terminal.
  3. Gate-In: Container enters the terminal by truck → added to yard inventory.
  4. Loading to train: Container moved from yard, mounted/lifted onto a railcar.
  5. Train movement: Loaded train travels enroute between terminals.
  6. Train arrival & unload: Containers removed from railcars into the yard/chassis.
  7. Gate-Out: Customer/driver picks up → inventory reduced.
  8. Reuse / reposition: Container is reused or repositioned for the next cycle.

Container states

State | Meaning
Empty | Available for booking
Loaded | Carrying a customer shipment
Mounted Empty | Empty container on a chassis (ready)
Mounted Load | Loaded container on a chassis
Grounded | Not on a chassis (stacked/stored)

Business tension: too many mounted empties = wasted cost; too few = service delay. Grounded = storage; Mounted = readiness.

4 · Terminal Operations

A terminal's day revolves around three things: the gate (entry/exit), the yard (storage), and the train (load/unload). These are exactly the flows our data captures.

Gate operations — the inventory valve

  • Gate-In: Truck brings a unit in → validated → sent to yard or direct-load. Adds to inventory.
  • Gate-Out: Driver picks up → mounted or grounded lift → leaves. Reduces inventory.

The single most important equation

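Net inventory change = (Gate-In − Gate-Out) + (train arrivals − train departures). Units enter the yard through a gate-in or a train unload and leave through a gate-out or a train load, so this balance is what drives yard capacity, demand planning, and repositioning decisions.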
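
The equation can be read as a running total. The sketch below is a minimal illustration, assuming hourly counts of the four flows are already available; the function name `running_inventory`, the tuple layout, the opening level, and the sample numbers are all illustrative assumptions, not taken from the source data.

```python
from itertools import accumulate


def running_inventory(opening_level, hourly_flows):
    """Return the yard level at the end of each hour.

    hourly_flows: iterable of (gate_in, gate_out, train_arrivals, train_departures),
    where train_arrivals are units unloaded into the yard and
    train_departures are units loaded out of it (assumed meaning).
    """
    net_changes = [
        gate_in - gate_out + train_arrivals - train_departures
        for gate_in, gate_out, train_arrivals, train_departures in hourly_flows
    ]
    # accumulate() yields the opening level first; drop it to keep one value per hour
    return list(accumulate(net_changes, initial=opening_level))[1:]


# Hypothetical flows for three consecutive hours
flows = [
    (12, 5, 0, 0),    # quiet hour, gate only
    (8, 10, 40, 0),   # a train unloads 40 units
    (3, 9, 0, 35),    # a train departs with 35 units
]
levels = running_inventory(100, flows)
print(levels)  # [107, 145, 104]
```

The opening level matters: an event stream only gives changes, so the starting count per terminal has to be assumed or reconstructed separately.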

Yard, chassis & train

  • Yard: arrival → storage → assignment to a booking → mount/lift → load or gate-out.
  • Chassis: bare → mounted → used → returned → reused. Availability directly limits pickups.
  • Load to train: reservations set demand; units staged & lifted; track Units Loaded vs Left-to-Load.
  • Train unload: arrivals removed into yard/chassis → feeds the next demand cycle.
  • Enroute: loads & empties already moving between terminals.

Operational challenges called out in the source: over-/under-usage of chassis, bad-order equipment reducing usable capacity, and yard overflow.

5 · The 8 Predictive Use-Case Groups

The IO Hub vision defines 8 groups of predictive & analytics use cases. They are the menu of value we want to unlock from the data.

Group 1 · Inventory Prediction (Yard Capacity Forecast) [POC focus]

Objective: Predict yard congestion and optimize asset utilization.

  • Forecast containers/trailers in the yard over time (next 6–24 h)
  • Identify yard blocks/zones likely to reach capacity
  • Predict when congestion will occur
  • Empty-container availability & dwell-time prediction
  • Inbound surge detection; chassis availability / shortage / turnaround / idle / bad-order

Group 2 · Equipment Usage & Availability

Objective: Understand how equipment is utilized and when it will be short.

  • Predict equipment utilization trends
  • Forecast downtime from maintenance & repairs
  • Identify potential equipment shortages

Group 3 · Intermodal Crew [future scope]

Objective: Crew planning and assignment.

  • Future scope — not in this dataset's reach yet

Group 4 · Railcar Demand & Availability Forecast

Objective: Balance railcar demand and supply.

  • Demand: railcars required (reservations / projected volumes)
  • Supply: railcars in yard, en route, being unloaded
  • Gap analysis: shortage / surplus + recommendations

Group 5 · Gate Operations Analytics [POC focus]

Objective: Optimize gate flow and reduce truck wait times.

  • Gate congestion: truck arrivals in next 1–3 h
  • Inbound vs outbound flow (drop vs pickup)
  • Throughput forecast (transactions / hour / day)
  • Peak-hour patterns, lane-level optimization, gate dwell, driver scoring

Group 6 · Intermodal Train Analytics

Objective: Improve train performance & visibility.

  • Avg train footage by destination
  • Container distribution by destination/shipper
  • On-time performance, load/unload times

Group 7 · Safety Prediction [future scope]

Objective: Proactively identify and mitigate safety risks.

  • Predict high-risk zones, incident patterns, unsafe-condition alerts (future scope)

Group 8 · Conversational Analytics (Chat Module) [future scope]

Objective: Let business users ask questions in plain English.

  • “Which yard block will be full in the next 4 hours?”
  • “What is the expected gate congestion at 5 PM?”
  • “How many railcars are currently idle?”

Reading the badges

POC focus = directly supported by the data we already have · future scope = needs more sources or later phases.

6 · The Source Data — equipmentActivityReported

Everything above is powered by one real dataset: an export of equipment-activity events from the container-history system (IPRO_CONTAINER_HISTORY).

845,277 event records · 119 fields per event · 32 distinct terminals · Mar 16 → May 29, 2026 time span (~2.5 months)

What one record looks like

It is MongoDB extended JSON — one big array of event objects. Each event is a snapshot of a single piece of equipment at a moment in time. A trimmed real example:

{
  "equipmentId": "UNKN324740",
  "equipmentTypeCode": "H",                // H = chassis
  "loadEmptyStatus": "E",                  // Empty
  "currentStatus": "On Ramp",
  "trainMoveCategory": "Storage",
  "eventDescription": "Created",
  "eventDateTime": "2026-05-29T20:54:21Z", // the heartbeat
  "terminal": { "code": "WOR", "name": "WORCESTER", "region": "North" },
  "shipperName": "JB HUNT TRANSPORT SERVICES INC",
  "flatcarId": "BNSF211633",
  "holdFlag": 0, "damageFlag": -1,
  "header": { "uuid": "36395582-…", "messageTypeId": "imod_equipment_activity_reported" }
}

The full proto definition lists every field; the Container Inventory — Important Fields doc highlights the ones that matter for reporting and prediction, catalogued next.

7 · Field Catalog — what each column tells us

The 119 raw fields group into seven meaningful families. Below are the key fields for tracking, reporting, and (later) prediction.

Identity & Ownership — Uniquely identify the equipment and who owns it.

Field | Type | What it is | Why it matters
equipmentId | string | Unique container/trailer ID (prefix + number), e.g. UNKN324740. | Primary key: tracks one asset across every event.
unitPrefix | string | Equipment owner/type prefix (CSXU, UMXU, UNKN…). | Ownership & type classification.
chassisId | string | ID of the chassis paired with the container. | Container ↔ chassis pairing.
chassisPrefix | string | Chassis owner prefix (e.g. TSFZ). | Chassis ownership/type tracking.
railcarPrefix / railcarClass / railcarType | string | Railcar owner & class (e.g. DTTX / S635 / 3W). | Railcar-level reporting.
flatcarId | string | Flatcar the unit rides on (e.g. BNSF211633). | Links the unit to a physical railcar.
equipmentTypeCode | char | C = Container, T = Trailer, H = Chassis. | Segments the dataset by asset type.
isoSizeType / isoTypeCode / chassisSize | string/int | ISO size-type (U_53), container/chassis size (53 ft). | Capacity & equipment-mix analysis.

Shipment & Business Context — Who the shipment is for and where it goes.

Field | Type | What it is | Why it matters
shipperName | string | Shipper from the waybill (e.g. JB HUNT TRANSPORT SERVICES INC). | Customer-level reporting.
consigneeName | string | Party receiving the shipment. | Delivery / customer visibility.
originCity / destinationCity | string | Origin and waybill destination city. | Lane & flow analysis.
finalDestination / finalDestinationCity | string | End destination (incl. outside the network). | End-to-end shipment visibility.
unitAgreementName | string | Intermodal contract/agreement name. | Billing & contract analytics.
billingPatronName / billingPatronCode | string | Billing party. | Revenue / customer attribution.
bookingNumber / billOfLadingNumber | string | Booking & bill-of-lading references. | Ties an event to a reservation.

Movement & Train Classification — How the unit is moving through the rail network.

Field | Type | What it is | Why it matters
trainMoveCategory | string | Inbound, Outbound, Transrail, Through, Storage. | The core operational category; drives gate & inventory KPIs.
equipmentMoveType | string | Move between locations (Y-R yard→rail, R-Y rail→yard…). | Tracks yard ↔ rail transitions.
intermodalActivityType | string | Activity classification (e.g. UNSPECIFIED). | Event typing.
arrivalTrain6 / departureTrain6 | string | Inbound & outbound train IDs. | Links the unit to train operations.
fromRailcar / toRailcar | string | Railcar a unit came off / goes onto. | Load/unload planning & execution.
railcarSequence / trainProcessingTrack | int/string | Position in train, processing track. | Load-plan & yard-track detail.

Status & Condition — The current state and health of the unit.

Field | Type | What it is | Why it matters
loadEmptyStatus | char | L = Loaded, E = Empty. | Core capacity & utilization metric.
currentStatus | string | Location/state: Yard, On Ramp, Offsite… | Real-time inventory tracking.
holdFlag / holdReason | int/string | On hold? + reason (CUSTOMS, STOPORDER…). | Exception & delay analysis.
damageFlag / chassisDamageFlag | int | 1 = damaged (-1 = unknown). | Bad-order & maintenance tracking.
priorityShipmentFlag | string | High-priority shipment indicator. | SLA / expedite handling.
coneStatus / equipmentLocationOnRailcar | string | Securement & stack position (BOTTOM…). | Loading detail & safety.

Timing & Lifecycle — When the key events happened.

Field | Type | What it is | Why it matters
eventDateTime | datetime (UTC) | Timestamp of the event; the heartbeat of all analytics. | Every time-series & forecast is built on this.
yardArrivalTime | datetime | Physical arrival at the terminal. | Start of dwell-time calculation.
yardDepartureTime | datetime | Departure from the terminal. | End of dwell & throughput.
creationTime / lastUpdateTime | datetime | Record created / last updated. | Data freshness & lineage.
groundedDate / notifyDate / holdReleaseDate | datetime | Grounded, customer-notify, hold-release times. | Lifecycle & gate-out readiness.
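
Dwell time follows directly from the two yard timestamps. The sketch below is illustrative only: it assumes both fields arrive as ISO-8601 UTC strings in the same style as the `eventDateTime` in the sample record (the raw export may encode them differently), and the helper names `parse_utc` and `dwell_hours` are hypothetical.

```python
from datetime import datetime, timezone


def parse_utc(ts):
    """Parse an ISO-8601 UTC string such as '2026-05-29T20:54:21Z'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def dwell_hours(yard_arrival, yard_departure):
    """Hours between yard arrival and departure.

    Returns None when either timestamp is missing, for example when
    the unit is still sitting in the yard.
    """
    if not yard_arrival or not yard_departure:
        return None
    delta = parse_utc(yard_departure) - parse_utc(yard_arrival)
    return delta.total_seconds() / 3600


# Hypothetical unit: arrived on the 27th at 08:00, left on the 29th at 20:30
print(dwell_hours("2026-05-27T08:00:00Z", "2026-05-29T20:30:00Z"))  # 60.5
print(dwell_hours("2026-05-27T08:00:00Z", None))                    # None
```

Units without a departure are kept as None rather than dropped, since "still here" is exactly the slow-mover signal the dwell use case cares about.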

Location & Source — Where, and from which system, the event occurred.

Field | Type | What it is | Why it matters
terminal.{code,name,region} | object | Terminal code (WOR), name (WORCESTER), region (North). | Location-based KPIs; the unit of analysis.
terminal.{fsac,milepost,city,state} | object | Terminal geography & rail milepost. | Geospatial & network context.
eventCity / eventState | string | City/state where the event fired. | Location cross-check.
parkingLocation / trainProcessingTrack | string | Physical spot in the yard / track. | Yard-zone & congestion detail.
eventSource / eventDescription | string | Source system (IPRO_CONTAINER_HISTORY) & event label (Created, Arrived, Notified…). | Event classification & filtering.
header.{uuid,time,messageTypeId} | object | Message envelope: unique id, emit time, type. | Idempotency & streaming lineage.

Physical Attributes — Size and weight of the unit and its cargo.

Field | Type | What it is | Why it matters
containerHeight / containerWidth / equipmentLength | int (mm) | Physical dimensions. | Capacity & clearance.
grossWeight / tareWeight / cargoWeight | int | Total, empty, and cargo weight. | Weight planning & load limits.
equipmentCastingType | string | Casting type (ISO…). | Equipment classification.

8 · Data Reality & Caveats

Being honest about the source is part of the complete picture.

Known data realities

  • Null sentinels: date -2208988800000 (1900-01-01) and N/A / -1 values mean unset, not real data — must be treated as missing.
  • Events ≠ true inventory: these are activity events; yard inventory must be derived from arrival/departure deltas (assumptions to be documented in the solution).
  • Single export, finite window: ~2.5 months of history — any forecast horizon must be scaled to what the history can support.
  • Sparse fields: many of the 119 columns are mostly N/A for a given event type (e.g. railcar fields on a pure yard event).
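
The null-sentinel rule can be sketched as a small cleaning pass. This is a hedged illustration, not the project's actual cleaner: the names `clean_value` and `clean_record` are hypothetical, the sample values for `chassisId` and `yardDepartureTime` are invented for the demo, and treating -1 as missing for every field is a simplifying assumption (a real cleaner would likely apply it field by field).

```python
SENTINEL_EPOCH_MS = -2208988800000  # 1900-01-01T00:00:00Z in epoch milliseconds
SENTINEL_VALUES = {"N/A", -1, SENTINEL_EPOCH_MS}


def clean_value(value):
    """Map placeholder values to None; pass everything else through."""
    if isinstance(value, (str, int)) and value in SENTINEL_VALUES:
        return None
    return value


def clean_record(record):
    """Return a cleaned copy of an event dict, recursing into nested objects."""
    return {
        key: clean_record(val) if isinstance(val, dict) else clean_value(val)
        for key, val in record.items()
    }


# Trimmed event; chassisId and yardDepartureTime values are hypothetical
event = {
    "equipmentId": "UNKN324740",
    "holdFlag": 0,
    "damageFlag": -1,
    "chassisId": "N/A",
    "yardDepartureTime": -2208988800000,
    "terminal": {"code": "WOR", "region": "North"},
}
cleaned = clean_record(event)
print(cleaned["damageFlag"], cleaned["holdFlag"])  # None 0
```

Note that `holdFlag: 0` survives: zero is a real answer ("not on hold"), while -1 only looks like one.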

In plain words

The data is rich and real, but it's a diary of events, not a tidy ledger of "how many are in the yard right now." We'll have to reconstruct the counts we care about — and we must ignore the placeholder values that only look like data.

9 · POC Scope & the Next Step

Of the 8 use-case groups, three are directly supported by the data we already have — no extra source systems required. These are the natural POC focus:

Priority | Use case | What it answers | Source group
1 | Yard Inventory / Capacity | Hourly yard count + forecast + congestion flag | Group 1 (Inventory)
2 | Gate / Throughput | Inbound vs outbound flow + near-term forecast | Group 5 (Gate Ops)
3 | Dwell & Equipment Quality | Dwell-time, idle equipment, hold/damage rates | Groups 1 and 2

Where this goes next

With the problem and source now fully framed, the solution discussion covers: how we ingest & curate the data, how we derive the metrics, the forecasting approach, accuracy, and the path to production on Microsoft Fabric. That is a separate conversation.

Document synthesized from: Container Inventory — Important Fields, Intermodal Business & Terminal Operations, IO Hub Predictive & Analytics Use Cases, and the live equipmentActivityReported source. Generated May 31, 2026.


PART TWO · LAYMAN → SEMI-TECHNICAL · READ TOP TO BOTTOM

What do all these numbers mean — and how did we get them?

You saw cards full of numbers and a “leaderboard” of models. This part explains every piece, in plain English, then a little deeper.

Start here — the one big idea

We have a giant diary of what every container did over the last ~2.5 months. We use that history to teach a computer to guess the near future — how full a yard will get, how many trucks will arrive, how long things will sit.

The whole thing in one sentence

Learn from the past → guess the future → check how close the guess was → keep the model that guessed best.

That's it. Everything below is just the detail behind those four steps.

10 · Every number on the card, explained

Here's a real Yard card and what each line is telling you:

Yard inventory (+24h)
  winning model:         Ridge  · 1739 train pts
  Last actual            52,348
  Forecast (next)        52,422.5
  MAE (error)            64.56
  Accuracy (MAPE)        0.1%      # = 99.9% accurate
  Congestion threshold   45,566
  Forecast vs threshold  OVER      # congestion predicted

What you see | What it means | In plain words
Last actual | The most recent real, measured value. | “Right now there are 52,348 containers in the yard.”
Forecast (next) | What the model predicts for the next period. | “In 24 hours I expect ~52,422.”
MAE (Mean Absolute Error) | Average miss, in the same units (containers, trucks). Lower is better. | “On a fair test, I was off by about 65 containers on average.”
MAPE (Mean Absolute % Error) | Average miss as a percentage. Accuracy = 100 − MAPE. | “0.1% error → 99.9% accurate.”
train pts | How many historical points the model learned from. More = better. | Yard had 1,739 hourly points; Gate had only 67 daily points.
Congestion threshold | A chosen ‘full’ capacity line for the yard. | 45,566 = the line we don't want to cross.
Forecast vs threshold | The action signal: is the forecast above the line? | OVER → congestion predicted → prepare space/staff.

The one trick to remember

MAE = how far off in real units (containers). MAPE = how far off in percent. Accuracy = 100 − MAPE. When MAE is small but MAPE is big, it just means the numbers themselves are small (more on that in §15).
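
For readers who want the formulas, these are the standard definitions, written with notation introduced here for illustration: y_t is the actual value at test point t, ŷ_t is the forecast, and n is the number of test points.

```latex
\[
\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} \lvert y_t - \hat{y}_t \rvert,
\qquad
\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n} \left\lvert \frac{y_t - \hat{y}_t}{y_t} \right\rvert,
\qquad
\text{Accuracy} = 100\% - \mathrm{MAPE}
\]
```

As a rough sanity check on the Yard card: an average miss of 64.56 against a level of about 52,348 is 64.56 / 52,348 ≈ 0.12%, which matches the displayed 0.1%. The match is only approximate, because MAPE averages the percentage miss point by point rather than dividing one average by one level.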

11 · The “model race” (the leaderboard)

For each thing we predict, we don't pick one method and hope. We enter six different algorithms into a race, give them all the same data, and let the most accurate one win.

How the race is judged

We hide a recent slice of real history. Each algorithm predicts it. We measure who came closest (lowest error). Lowest number wins. The winner is saved and used for the live forecast.
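
The judging rule can be sketched in a few lines. This is an illustration under stated assumptions, not the project's training code: the three racers below (`last_value`, `moving_average`, `linear_trend`) are simple stand-ins written for this example, not the six real algorithms, and the series is synthetic.

```python
def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def last_value(history, horizon):
    """Repeat the most recent observation."""
    return [history[-1]] * horizon


def moving_average(history, horizon, window=3):
    """Repeat the mean of the last `window` observations."""
    avg = sum(history[-window:]) / window
    return [avg] * horizon


def linear_trend(history, horizon):
    """Fit a least-squares straight line to the history and extend it."""
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history)) / sum(
        (x - x_mean) ** 2 for x in range(n)
    )
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + step) for step in range(horizon)]


def run_race(series, holdout, racers):
    """Hide the last `holdout` points, score every racer on them, pick the lowest MAE."""
    train, hidden = series[:-holdout], series[-holdout:]
    scores = {name: mae(hidden, forecast(train, holdout)) for name, forecast in racers.items()}
    winner = min(scores, key=scores.get)
    return winner, scores


# Synthetic, steadily drifting series (a smooth line, like yard inventory)
series = [100, 102, 104, 106, 108, 110, 112, 114, 116, 118]
racers = {
    "last_value": last_value,
    "moving_average": moving_average,
    "linear_trend": linear_trend,
}
winner, scores = run_race(series, holdout=3, racers=racers)
print(winner, scores)
```

On this deliberately smooth series the straight-line racer wins, which mirrors the point made about Ridge on yard inventory; a spiky series would rank the racers differently.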

Your Yard leaderboard, read as a race result (lower = better):

Ridge 🏆 233.63   — smooth line, perfect for yard
ARIMA 117.79 / Prophet 119.12   — close, time-series methods
RandomForest 5,362 · XGBoost 5,804 · HistGB 6,162   — tree models lost badly here

Why did the powerful tree models lose on yard? Because yard inventory is a smooth, slowly-drifting line — a straight-line method (Ridge) fits it better than complex trees, which overcomplicate it. Different problems, different winners.

One honest footnote

The leaderboard number (Ridge 233.63) comes from cross-validation during training; the card's MAE (64.56) comes from the final hold-out test. Both measure average error, but at two different stages and on different slices of the data, which is why the two numbers differ.

12 · What each algorithm actually does

The six racers, in plain words — and what kind of problem each is good at.

Ridge · Linear regression [🏆 Yard, Inbound]

What it does: Draws the best straight-line / smooth relationship through the data.

In plain words

Like drawing a steady trend line through dots that mostly drift in one direction.

Best for: Smooth, slowly-changing numbers — e.g. yard inventory.

RandomForest · Tree ensemble [🏆 Outbound]

What it does: Asks hundreds of yes/no question-trees and averages their votes.

In plain words

Like polling 300 experts who each look at the data differently, then taking the average.

Best for: Bumpy, non-linear patterns and sudden jumps.

HistGradientBoosting · Boosted trees [strong all-rounder]

What it does: Builds trees one after another, each fixing the last one's mistakes.

In plain words

Like a student who reviews every wrong answer and studies exactly that next.

Best for: Complex patterns; fast on large data.

XGBoost · Boosted trees [strong all-rounder]

What it does: Same idea as HistGradientBoosting above, in a very popular, highly tuned implementation.

In plain words

The competition-winning cousin of gradient boosting.

Best for: Complex tabular patterns.

Prophet · Time-series (by Meta) [🏆 Gate total]

What it does: Splits a series into trend + weekly/yearly seasonality + holidays.

In plain words

Like saying ‘busier on Mondays, slower in summer’ and projecting that forward.

Best for: Seasonal, calendar-driven series — e.g. gate traffic.

ARIMA · Classic statistics [baseline]

What it does: Predicts the next value from recent values and recent errors.

In plain words

Like guessing tomorrow mostly from the last few days.

Best for: Short-term, self-correlated series.

Why have six at all?

No single method wins everywhere. Smooth series love Ridge; seasonal series love Prophet; spiky series love trees. Racing them means each terminal automatically gets the model that fits it best — no manual guessing.

13 · What data (tables) we have

It all starts from one source: 845,277 equipment-activity events (119 fields each, 32 terminals, ~2.5 months). From that raw stream we build four clean tables the models read:

Table | What it holds | Used for | Built from
yard_hourly | Hourly count of equipment sitting in each terminal's yard. | Yard inventory & congestion forecast | Derived from gate-in/out + train arrivals/departures
gate_flow_hourly | Hourly trucks gating in and out per terminal. | Gate throughput, inbound vs outbound flow | From gate-in / gate-out events
dwell | How long each unit stayed in the yard (arrival → departure). | Dwell-time, slow-mover & idle detection | yardArrivalTime → yardDepartureTime
equipment_quality | Hold and damage flags per terminal. | Bad-order / exception analytics | holdFlag, damageFlag, holdReason

Under the hood

Raw JSON → streamed into shards → summarized with DuckDB into these parquet tables (the “gold” layer). The models never touch the 3.75 GB raw file — they read these tidy hourly summaries.
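
To make the "summarize into hourly tables" step concrete, here is a plain-Python stand-in for the kind of group-by the DuckDB step performs. It is a sketch under assumptions: the real pipeline uses DuckDB and parquet, the `direction` key is a hypothetical, pre-derived field (the source does not say which raw field marks a gate-in versus a gate-out), and the sample events are invented.

```python
from collections import Counter
from datetime import datetime


def hour_bucket(ts):
    """Floor an ISO-8601 UTC timestamp to the start of its hour."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").strftime("%Y-%m-%dT%H:00Z")


def gate_flow_hourly(events):
    """Count events per (terminal code, hour, direction)."""
    counts = Counter()
    for event in events:
        key = (
            event["terminal"]["code"],
            hour_bucket(event["eventDateTime"]),
            event["direction"],  # hypothetical field: "in" or "out"
        )
        counts[key] += 1
    return counts


# Invented events shaped like the trimmed sample record
events = [
    {"terminal": {"code": "WOR"}, "eventDateTime": "2026-05-29T20:54:21Z", "direction": "in"},
    {"terminal": {"code": "WOR"}, "eventDateTime": "2026-05-29T20:05:00Z", "direction": "in"},
    {"terminal": {"code": "WOR"}, "eventDateTime": "2026-05-29T20:30:10Z", "direction": "out"},
    {"terminal": {"code": "WOR"}, "eventDateTime": "2026-05-29T21:01:00Z", "direction": "out"},
]
flow = gate_flow_hourly(events)
print(flow[("WOR", "2026-05-29T20:00Z", "in")])  # 2
```

Because the function consumes events one at a time, it works the same whether the input is a small list or a stream of shards, which is the reason the raw 3.75 GB file never has to be held in memory at once.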

14 · How we train and pick a winner

  1. Shape the data: turn events into an hourly/daily time-series per terminal.
  2. Add clues (features): hour-of-day, day-of-week, and recent past values (“lags”).
  3. Split fairly: train on older data, test on a recent slice the model never saw.
  4. Race the six: each predicts the hidden slice; we score the error.
  5. Crown the winner: lowest error wins and is saved as a .joblib file.
  6. Forecast: the winner predicts the next 24h (yard) or next period (gate).
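
Steps 2 and 3 can be sketched as follows. This is a minimal illustration, not the project's feature code: the function names `build_rows` and `chronological_split`, the choice of lags (1 hour and 24 hours), and the synthetic series are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def build_rows(start, values, lags=(1, 24)):
    """Turn an hourly series into (features, target) rows.

    Rows that do not yet have a full lag history are skipped.
    """
    rows = []
    for i in range(max(lags), len(values)):
        stamp = start + timedelta(hours=i)
        features = {
            "hour_of_day": stamp.hour,
            "day_of_week": stamp.weekday(),  # Monday = 0
        }
        for lag in lags:
            features[f"lag_{lag}"] = values[i - lag]
        rows.append((features, values[i]))
    return rows


def chronological_split(rows, test_size):
    """Older rows train, the most recent rows test. No shuffling."""
    return rows[:-test_size], rows[-test_size:]


start = datetime(2026, 3, 16, 0)      # first hour of the series
values = list(range(1000, 1072))      # 72 synthetic hourly yard counts
rows = build_rows(start, values)
train, test = chronological_split(rows, test_size=12)
print(len(rows), len(train), len(test))  # 48 36 12
```

The split is by position rather than random because shuffling would let the model see the future during training, which would make the test score look better than it really is.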

In plain words

It's like studying past exams (training), then sitting a mock exam on questions you've never seen (the test). Whoever scores best on the mock is trusted with the real one.

We did this for every terminal × every metric — 127 models in ~15 minutes on a laptop CPU. The “Load & run the real .joblib” button proves each one is a genuine saved model, not a hardcoded number.

15 · Why Gate accuracy looks “low” (and what's really going on)

1,739 Yard training points (hourly, ~72 days) · 67 Gate training points (daily, ~2.5 months) · +24h Yard horizon (easy) · +14 days Gate horizon (hard)

Two honest reasons Gate is weaker:

  • Far less data: 67 points vs 1,739. Hard to learn a pattern from so few.
  • Much longer horizon: guessing 14 days ahead is far harder than 24 hours ahead.

The “outbound 41.8%” trap

Outbound numbers are small and spiky (159, 51…). MAPE divides the error by the actual value — so a modest miss on a small number looks like a huge percentage. Low-volume series always look bad in MAPE even when the model is fine. Judge them by MAE (real units) instead.
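
A small worked example makes the trap visible. The outbound actuals (159, 51) come from the text above; the yard actuals and all forecast values are hypothetical, chosen so that both series are missed by exactly 20 units each time.

```python
def mae(actual, predicted):
    """Mean absolute error, in the series' own units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actuals must be non-zero."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Same absolute miss (20 units) on a large series and a small one
yard_actual, yard_forecast = [52000, 52100], [52020, 52080]
outbound_actual, outbound_forecast = [159, 51], [139, 71]

print(mae(yard_actual, yard_forecast), mae(outbound_actual, outbound_forecast))  # 20.0 20.0
print(round(mape(yard_actual, yard_forecast), 2))          # well under 1%
print(round(mape(outbound_actual, outbound_forecast), 2))  # roughly a quarter of the value
```

Identical MAE, wildly different MAPE: the only thing that changed is the size of the numbers being divided by, which is why low-volume flows are better judged in real units.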

In plain words

Yard is like predicting the level of a big lake — slow and smooth, easy. Gate is like predicting how many people walk through a door on a random day two weeks from now — noisy and hard, especially with only a couple months of history.

16 · What MORE data would help (and how much)

More data is the single biggest lever — especially for gate & flows.

Data we could add | Today | Why it helps | Impact
More history (1+ year) | Only ~2.5 months today. | Lets models learn weekly & seasonal cycles. Biggest single win, especially for gate. | high
Train schedules / ETAs | Not in current feed. | Arrivals drive yard surges; huge for congestion timing. | high
Booking / reservation volumes | Partial today. | Forward demand signal for railcars & empties. | high
Calendar & holidays | Derivable. | Explains predictable busy/quiet days for gate. | medium
Weather | External. | Storms shift truck arrivals & delays. | medium
Chassis pool / availability feed | Partial. | Enables chassis shortage & turnaround prediction. | medium
Labor / shift rosters | External. | Connects forecasts to staffing decisions. | low

If you only do one thing

Get 1+ year of history and add train schedules / ETAs. Those two unlock seasonal patterns and explain the surges that drive both yard and gate.

17 · Other algorithms we could add to the race

The current six cover the main families. With more data, these are worth entering:

Algorithm | What it adds | Effort
LightGBM / CatBoost | Faster, often more accurate boosted trees (CatBoost handles categories natively). | easy add
SARIMA / SARIMAX | ARIMA with seasonality + external drivers (e.g. train ETAs). | easy add
Exponential Smoothing (ETS) | Lightweight, strong baseline for trend + seasonality. | easy add
LSTM / GRU (neural nets) | Deep nets for long sequences; needs lots of data & GPU. | needs 1yr+
Temporal Fusion Transformer | State-of-the-art deep forecaster with feature attention. | needs 1yr+ / GPU
Ensembling / stacking | Blend several winners; often beats any single model. | next step

Under the hood

Easy adds (LightGBM, CatBoost, SARIMA, ETS) plug straight into the existing race today. Deep-learning (LSTM, Transformers) only pays off with a year+ of data and ideally a GPU — otherwise it overfits and underperforms the simple models you already have. Stacking (blending winners) is usually the smartest next gain.

18 · If I add 1 year of data — is it production-ready?

Metric | Today | With 1 year
Yard inventory | Production-grade (99.9%) | Rock solid
Gate total | Demo (≈80%) | Much better; learns weekly/seasonal cycles
Inbound / Outbound | Noisy (low volume) | Better; report in MAE

1 year is necessary, but not the whole story

To be truly production-ready you also need: (1) the right horizon — forecast gate 1–3 hours ahead (the real business question), not 14 days; (2) a retraining schedule as new data arrives; (3) honest metrics (MAE for low-volume flows); (4) monitoring that alerts when error grows.
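
Point (4) could look something like the sketch below: a rolling MAE compared against the error the model showed at training time. This is an illustration only. The class name `ErrorMonitor`, the window size, and the 1.5× tolerance are assumptions; the baseline of 64.56 is the Yard card's MAE, and the actual/forecast pairs are invented.

```python
from collections import deque


class ErrorMonitor:
    """Track a rolling MAE and flag when it drifts above a baseline."""

    def __init__(self, baseline_mae, window=24, tolerance=1.5):
        self.limit = baseline_mae * tolerance
        self.errors = deque(maxlen=window)  # keeps only the most recent misses

    def record(self, actual, forecast):
        """Add one observation; return True when the rolling MAE exceeds the limit."""
        self.errors.append(abs(actual - forecast))
        rolling_mae = sum(self.errors) / len(self.errors)
        return rolling_mae > self.limit


monitor = ErrorMonitor(baseline_mae=64.56, window=3)
observations = [(52348, 52300), (52400, 52310), (52500, 52200), (52600, 52250)]
alerts = [monitor.record(actual, forecast) for actual, forecast in observations]
print(alerts)  # [False, False, True, True]
```

An alert like this is also a natural trigger for point (2): when the rolling error stays above the limit, that is the signal to retrain on the newer data.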

Bottom line

Yard is ready now. Gate becomes production-grade with 1 year of data + a realistic short horizon + a retrain loop — and the biggest quick win is simply asking gate the right question (next few hours, not next two weeks).

19 · How to read the screen, quickly

  • Find the 🏆 on the leaderboard — that's the chosen model. Lower number = better.
  • Compare MAE and MAPE. Both small → trust it. Small MAE but big MAPE → it's a low-volume series, not a broken model.
  • Read “Forecast vs threshold.” OVER = act now (prepare yard space / staffing).
  • Click “Load & run the real .joblib.” Confirms it's a genuine saved model, live.

Merged from the explained.html companion. Generated May 31, 2026.