Advanced | 18 min read | Module 22 | Data & Methodology

Correlation vs. Causation in Shipping-Price Analysis

Every time shipping costs spike, headlines announce that consumer prices will follow. Sometimes they do. Sometimes they don't. Sometimes something else is driving both. This module explains how to tell the difference — and why certainty is usually the wrong goal.

Key Takeaways

  • 01 Pearson correlation measures the strength of a linear relationship between two variables. A high r-value tells you they move together — it says nothing about why.
  • 02 A p-value below 0.05 means a correlation this strong would be unlikely to arise by chance if no true relationship existed. It does not mean the relationship is economically meaningful, practically important, or causal.
  • 03 Time-lagged correlations test whether changes in variable A at time T predict changes in variable B at time T+N. For shipping-to-price analysis, lags of 3–9 months matter most.
  • 04 Granger causality is a statistical test for predictive priority — not true causation. If freight rates Granger-cause import prices, it means freight rates contain information about future import prices beyond what's already in import prices themselves.
  • 05 Confounding variables — monetary policy, weather, exchange rates, labor markets — regularly drive both shipping costs and consumer prices simultaneously, producing correlations that have no causal content.
  • 06 Risk and Route expresses findings as probability ranges and confidence levels, not predictions. Where the evidence is ambiguous, we say so explicitly.

The Seductive Simplicity of Correlation

In 2021, the Freightos Baltic Index — a benchmark for global container shipping rates — rose from roughly $1,500 per forty-foot container in January to over $10,000 by September. Over the same period, U.S. consumer prices for goods accelerated sharply, with core goods inflation reaching levels not seen since the early 1990s. The two lines on the chart moved together. The narrative wrote itself: shipping costs exploded, and prices followed.

The story is broadly true. There is genuine research supporting a transmission mechanism from ocean freight rates to consumer prices, operating through a lag of several months. But the simple chart tells you only that two things happened at the same time. It does not tell you how much of the goods inflation was caused by freight rates, how much would have happened anyway due to pandemic-era demand surges, how much reflected domestic trucking constraints, and how much was driven by the Federal Reserve's own policy decisions feeding through into import costs via exchange rates.

Answering those questions requires moving past correlation. This module explains the tools economists use to get there — and why intellectual honesty about what those tools can and cannot establish is the foundation of any serious analysis.

Pearson Correlation: What It Measures and What It Doesn't

The Pearson correlation coefficient (r) measures the linear relationship between two variables on a scale from -1 to +1. An r of +1 means the variables move in perfect lockstep. An r of 0 means they share no linear relationship. An r of -1 means they move in perfect opposition.

In shipping-price research, a correlation of r = 0.70 between container rates and import prices would be considered a moderately strong positive relationship. It means that when container rates are above their long-run average, import prices tend to be above their long-run average too — and that roughly 49% of the variation in import prices (r² = 0.49) is statistically associated with variation in container rates.

What it does not mean: that container rates cause import prices to move. It does not mean the relationship will persist in the future. It does not mean the other 51% of variation is explained by factors you've controlled for. And it says nothing about economic magnitude — whether a 10% increase in container rates translates to a 0.1% or a 1% increase in consumer prices.
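
To make the definition concrete, here is a minimal sketch of the calculation in plain Python. The two short series are hypothetical illustrative values, not real index data, and the function name is an assumption made for this example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Co-movement around the means, scaled by each series' own spread.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for illustration only (not real index data).
container_rates = [1500, 1800, 2400, 3900, 5200, 7100, 9800, 10400]
import_price_index = [100.0, 100.4, 100.9, 101.1, 102.6, 103.0, 104.9, 104.6]

r = pearson_r(container_rates, import_price_index)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

Note that nothing in the calculation knows which series is the cause. Swapping the two arguments returns exactly the same number, which is one way to see why r cannot say anything about direction.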

The further problem is that economic time series are rarely independent. Shipping rates and consumer prices both move with the business cycle, respond to the same global demand shocks, and are affected by many of the same underlying forces. This creates the conditions for spurious correlation: two variables that appear related not because of any direct connection between them, but because both are responding to a common third cause.

  • r = 0.73: Shanghai–LA Container Rates vs. Core Import Prices. 12-month lagged correlation, 2015–2024. Statistically significant (p < 0.01). Does not establish causation.
  • r = 0.61: Baltic Dry Index vs. Global Industrial Output. Concurrent correlation. BDI moves with demand, not ahead of it; the 'leading indicator' claim requires careful qualification.
  • 4–6 months: Estimated lag, container rates to CPI goods. IMF (2023) and BIS (2022) working papers. Range varies by product category and supply chain configuration.
  • ~0.15%: CPI impact per 10% increase in container rates. Borio et al. (2022) estimate. Effect is non-linear and diminishes as freight costs become a smaller share of total product cost.

What P-Values Actually Tell You

When researchers report a correlation alongside a p-value of 0.01 or 0.05, they are communicating something specific: if there were truly no relationship between these two variables in the underlying population, what is the probability of observing a correlation at least this large by chance in a sample of this size?

A p-value of 0.05 means there's a 5% probability of seeing data at least this extreme if the null hypothesis (no relationship) is true. A p-value of 0.01 means 1%. Smaller p-values mean the observed relationship is harder to attribute to chance alone.

Three things that p-values do not mean, despite widespread confusion about this:

  • A p-value is not the probability that the null hypothesis is true. That is a common misreading, but it is wrong.
  • A p-value does not measure the size or importance of an effect. A tiny, economically trivial correlation can achieve p < 0.001 with a large enough dataset.
  • A p-value does not protect against confounding. A perfectly confounded regression — one where the real driver is an omitted variable — can produce highly statistically significant results that nonetheless tell you nothing about the true causal relationship.

In economic datasets with hundreds of monthly observations, statistical significance is almost automatic. The harder question is always: is this relationship real, stable, and economically meaningful? Those questions require judgment that goes well beyond a p-value.
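
The gap between significance and importance is easy to see numerically. The sketch below holds a deliberately tiny correlation fixed at r = 0.05 and varies only the sample size. It uses a normal approximation to the usual t-statistic for a correlation, which is an assumption that is reasonable only for large samples; the function name is hypothetical.

```python
import math

def approx_p_value(r, n):
    """Two-sided p-value for H0: no correlation, using a normal
    approximation to the t-statistic (reasonable for large n only)."""
    t_stat = r * math.sqrt((n - 2) / (1 - r * r))
    return math.erfc(abs(t_stat) / math.sqrt(2))

r = 0.05  # explains only r^2 = 0.25% of the variation
for n in (100, 1_000, 10_000):
    p = approx_p_value(r, n)
    print(f"n = {n:>6}: r = {r}, r^2 = {r * r:.4f}, p ~ {p:.2g}")
```

The correlation never changes, and it never explains more than a quarter of one percent of the variation, yet it moves from clearly insignificant at n = 100 to far below 0.001 at n = 10,000. The p-value here is reporting on the sample size as much as on the relationship.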

Time-Lagged Correlations: Why They Matter for Shipping Analysis

Shipping costs do not affect consumer prices the moment a freight rate changes. A container of athletic shoes booked at $8,000 per FEU in October needs to cross the Pacific (two to three weeks), clear customs, move to a distribution center, ship to individual retailers, and then sell through whatever inventory was already on shelves before higher-cost inventory arrives. That process takes months.

Time-lagged correlation adjusts for this by shifting one time series relative to the other before computing the correlation. Instead of asking "do shipping rates at time T correlate with prices at time T?", a lagged correlation asks "do shipping rates at time T correlate with prices at time T + 4 months?" or whatever lag makes the most structural sense for the product category.

This matters enormously in practice. The concurrent (zero-lag) correlation between container rates and retail clothing prices is often weak or even negative — retailers with long-term freight contracts are insulated from spot rate spikes in the short term. But the 5 to 9 month lagged correlation is much stronger, reflecting the time required for higher spot rates to work through new purchase orders, production runs, and inventory turnover.
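
A lagged correlation is mechanically simple: drop the last N observations of the leading series, drop the first N of the following series, and correlate what remains. The sketch below uses a synthetic rate series and a price series built to follow it by exactly four months, so the data and the helper names are assumptions for illustration, not real index values.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag] over the overlapping span."""
    if lag < 0 or lag > len(leader) - 2:
        raise ValueError("lag must leave at least two overlapping points")
    return pearson_r(leader[:len(leader) - lag], follower[lag:])

# Synthetic monthly data (not real): prices echo rates with a 4-month delay.
TRUE_LAG = 4
months = 60
rates = [3000 + 40 * t + 1500 * math.sin(t / 5) + 200 * math.sin(1.7 * t)
         for t in range(months)]
prices = [100 + 0.002 * rates[max(t - TRUE_LAG, 0)] for t in range(months)]

by_lag = {lag: lagged_correlation(rates, prices, lag) for lag in range(10)}
best_lag = max(by_lag, key=by_lag.get)
for lag, r in by_lag.items():
    print(f"lag {lag}: r = {r:.3f}")
print("strongest correlation at lag", best_lag)
```

Scanning lags like this is fine for exploration, but picking whichever lag maximizes r and then reporting its significance overstates the evidence. The lag is more defensible when it is chosen from the supply chain's structure before looking at the data.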

Different product categories have dramatically different optimal lags. Gasoline prices respond to tanker freight rates and crude oil prices within days. Electronics take 3 to 6 months for new inventory to reach shelves. Clothing and footwear with long production cycles can take 6 to 12 months. A regression that treats all goods as having the same lag will understate the relationship for some categories while misattributing price changes in others.

Approximate Transmission Lags by Category

  • Gasoline / Fuel (0–14 days): Spot crude price and tanker rates feed nearly immediately into wholesale fuel pricing.
  • Fresh food (2–6 weeks): Highly dependent on reefer (refrigerated) container rates and domestic logistics.
  • Processed food & beverages (2–4 months): Longer supply chain with domestic manufacturing stage absorbs some of the shock.
  • Electronics (3–6 months): Complex multi-origin supply chains; retailer hedging moderates pass-through.
  • Clothing & footwear (6–12 months): Seasonal production cycles mean new bookings take one or two seasons to reach shelves.
  • Furniture & home goods (4–8 months): High volume-to-value ratio makes freight costs a larger share of landed cost.

Granger Causality: What It Tests and What It Doesn't

Granger causality is a statistical concept developed by economist Clive Granger, who shared the Nobel Memorial Prize in Economic Sciences in 2003 for his work on economic time-series analysis. Despite the name, it does not test whether one variable truly causes another in a philosophical or structural sense. It tests something more modest and more precise: whether past values of variable X contain information that helps predict future values of variable Y beyond what Y's own past values already predict.

If container freight rates Granger-cause import prices, it means that a regression model predicting import prices improves meaningfully when you include lagged freight rate data — not just lagged import price data. If freight rates do not Granger-cause import prices, it means shipping rates carry no additional predictive information about where prices are going that the price series itself doesn't already contain.
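
The comparison described above can be sketched as two regressions and an F-statistic: one model predicts prices from their own lags, the other adds lagged freight rates, and the test asks whether the second fits meaningfully better. The code below is a bare-bones illustration in plain Python on synthetic data in which price changes are constructed to respond to freight changes two months later. The series, coefficients, and function names are all assumptions for this example, and a real analysis would normally rely on an established statistics package rather than hand-rolled least squares.

```python
import random

def solve_linear(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for c in range(col, n + 1):
                m[i][c] -= factor * m[col][c]
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (m[i][n] - known) / m[i][i]
    return coeffs

def ols_rss(rows, targets):
    """Residual sum of squares of an ordinary least squares fit."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, y in zip(rows, targets))

def granger_f_stat(x, y, lags):
    """F-statistic for H0: lagged x adds nothing beyond y's own lags."""
    times = range(lags, len(y))
    targets = [y[t] for t in times]
    restricted = [[1.0] + [y[t - k] for k in range(1, lags + 1)] for t in times]
    unrestricted = [row + [x[t - k] for k in range(1, lags + 1)]
                    for row, t in zip(restricted, times)]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lags + 1)
    return ((rss_r - rss_u) / lags) / (rss_u / dof)

# Synthetic monthly changes (not real data): prices respond to freight 2 months later.
random.seed(42)
n = 300
freight_chg = [0.0] * n
price_chg = [0.0] * n
for t in range(1, n):
    freight_chg[t] = 0.5 * freight_chg[t - 1] + random.gauss(0, 1)
for t in range(2, n):
    price_chg[t] = (0.3 * price_chg[t - 1] + 0.8 * freight_chg[t - 2]
                    + random.gauss(0, 1))

f_freight_to_price = granger_f_stat(freight_chg, price_chg, lags=3)
f_price_to_freight = granger_f_stat(price_chg, freight_chg, lags=3)
print(f"freight -> prices: F = {f_freight_to_price:.1f}")
print(f"prices -> freight: F = {f_price_to_freight:.1f}")
```

In practice the F-statistic is compared against an F distribution to obtain a p-value, and the test is run in both directions, as here. The series should also be stationary, which is why the example works with changes rather than levels.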

In the context of shipping-price analysis, Granger causality tests ask the question most analysts actually care about: does monitoring freight rates give you advance warning of where consumer prices are heading? The academic evidence here is reasonably favorable. Multiple studies, including work published through the Bank for International Settlements and the International Monetary Fund, find that container shipping indices Granger-cause measures of goods price inflation, particularly at lags of three to nine months.

The important caveat is that Granger causality is still a statistical relationship in historical data. It can break down. During periods of unusual monetary policy, major structural shocks to domestic supply chains, or significant shifts in retailer behavior around hedging and inventory management, the historical relationship between freight rates and consumer prices may weaken substantially.

It also says nothing about how much of the price change is attributable to freight versus everything else. For that, you need a structural model with explicit controls for competing explanations — and those models always involve assumptions that can be challenged.

Confounding Variables: The Real Analytical Challenge

A confounding variable is one that influences both the presumed cause and the presumed effect, creating an apparent relationship that doesn't reflect direct causation. In the shipping-price context, confounders are everywhere. Six of the most consequential are below.

📉 Monetary Policy
High risk

Federal Reserve rate decisions affect import demand, dollar strength, and commodity prices simultaneously. A rate hike that slows the economy will tend to reduce both shipping volumes and consumer prices, so freight rates and inflation fall together. The result is an apparent positive correlation between freight rates and inflation that has nothing to do with the shipping-price transmission mechanism.

🌊 Weather & Climate Events
High risk

Hurricanes disrupt Gulf ports, droughts reduce grain exports and cut bulk shipping demand, and La Niña patterns shift agricultural commodity prices. A severe Atlantic hurricane season can move both freight rates and food prices in the same direction without one causing the other.

Labor Markets
Moderate risk

Dock strikes, trucker shortages, and warehouse labor constraints affect goods prices through domestic logistics costs entirely separate from ocean freight. A correlation between container rates and retail prices during a dock strike may actually be capturing the strike's direct effect, not the ocean leg at all.

💱 Exchange Rates
High risk

Dollar appreciation makes imports cheaper in USD terms even when shipping costs rise. The two effects can partially cancel out, mask each other, or create spurious correlations depending on the time window selected. Import price indices are particularly vulnerable to this confound.

🛢️ Commodity Price Cycles
Moderate risk

Iron ore, coal, and grain prices drive bulk shipping demand. When Chinese steel demand surges, both the Baltic Dry Index and commodity input prices rise together — but neither caused the other. The common driver is industrial demand in China.

📦 Inventory Cycles
Moderate risk

Retail restocking cycles create clustered demand for shipping independent of current shipping costs. The post-COVID restocking boom of 2021 pushed some transpacific spot container rates toward $20,000 while also creating retail price inflation, but much of that co-movement reflected a shared driver, pent-up demand, rather than freight-to-price transmission alone.

Controlling for confounders requires either randomization or explicit modeling: building regression specifications that include the confounding variables so their effects can be estimated separately. Randomization is impossible in observational economic data, which leaves modeling, and regression controls are only as good as your ability to measure the confounders and correctly specify their relationship to the outcome.
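
A small simulation shows both the problem and the remedy. In the sketch below, a single demand shock drives freight rates and prices, and freight has no effect on prices at all by construction. The raw correlation is still large. Regressing each series on the demand variable and correlating the residuals (a partial correlation) removes it. All series, coefficients, and function names are assumptions invented for this illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def residuals(y, z):
    """What is left of y after a simple OLS regression on z (with intercept)."""
    mean_y, mean_z = sum(y) / len(y), sum(z) / len(z)
    slope = (sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
             / sum((zi - mean_z) ** 2 for zi in z))
    return [yi - mean_y - slope * (zi - mean_z) for zi, yi in zip(z, y)]

# Synthetic data (not real): demand drives both series; freight never enters prices.
random.seed(7)
n = 2000
demand = [random.gauss(0, 1) for _ in range(n)]
freight = [1.0 * d + random.gauss(0, 0.6) for d in demand]
prices = [0.8 * d + random.gauss(0, 0.6) for d in demand]

raw_r = pearson_r(freight, prices)
partial_r = pearson_r(residuals(freight, demand), residuals(prices, demand))
print(f"raw correlation:          {raw_r:.2f}")
print(f"controlling for demand:   {partial_r:.2f}")
```

The catch is visible in the setup: the control works only because the simulation hands us a clean measurement of the confounder. With real data, demand is measured with error and other confounders are missing entirely, so some of the spurious correlation survives any realistic set of controls.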

This is why economists working on shipping-price transmission generally report results with confidence intervals, robustness checks across different model specifications, and explicit acknowledgment of what the analysis cannot rule out. The headline correlation is only the beginning of the analytical work.
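
As one example of what reporting a range looks like, the sketch below puts an approximate 95% confidence interval around a correlation using the Fisher z-transformation. The inputs are assumptions: r = 0.73 is borrowed from the Shanghai–LA figure quoted earlier, and n = 120 simply assumes ten years of monthly observations, since the sample size is not stated.

```python
import math

def correlation_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z.
    Assumes independent observations, which time series usually violate."""
    z = math.atanh(r)               # transform r to an approximately normal scale
    se = 1 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

low, high = correlation_ci(r=0.73, n=120)  # n = 120 is an assumed sample size
print(f"r = 0.73, approximate 95% interval: {low:.2f} to {high:.2f}")
```

Because monthly economic series are autocorrelated, the effective number of independent observations is smaller than the raw count, so the honest interval is wider than this calculation suggests.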

Why "Rates Went Up, Then Prices Followed" Proves Nothing

The most common analytical error in shipping commentary is the post hoc fallacy: event A preceded event B, therefore A caused B. The 2021 container rate surge preceded the goods inflation surge. The 2024 Red Sea rerouting preceded price increases in European imports. The temporal sequence is real. The causal inference is not automatic.

Consider the structural objection. In 2021, the same force that pushed some transpacific spot container rates toward $20,000, a massive surge in consumer goods demand following pandemic-era stimulus, also created direct price pressure on retail goods through demand-pull inflation. The chart showing freight rates and consumer prices rising together might mostly be showing two effects of the same demand shock, not a transmission from one to the other.

The magnitude objection is equally important. Container freight costs typically represent 1% to 3% of the final retail price of most manufactured goods. Even a tripling of freight rates, a historically extreme event, adds only about 2 to 6 percentage points to the final price if the increase is fully passed through. If consumer prices rose by 15% during the same period, freight cost increases can explain only a fraction of the total. The rest requires other explanations.
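
The arithmetic behind the magnitude objection is short enough to write out. This sketch assumes full pass-through of the freight increase and no change in any other cost, both simplifications; the function name is hypothetical.

```python
def freight_price_impact(freight_share, rate_multiple):
    """Percentage-point rise in final price when freight costs scale by
    rate_multiple, assuming full pass-through and all else unchanged."""
    return freight_share * (rate_multiple - 1) * 100

observed_price_rise_pts = 15.0  # the 15% example from the text
for share in (0.01, 0.03):
    impact = freight_price_impact(share, rate_multiple=3)
    explained = impact / observed_price_rise_pts
    print(f"freight share {share:.0%}: +{impact:.1f} pts, "
          f"{explained:.0%} of a 15-point price rise")
```

Under these assumptions a tripling of freight rates accounts for somewhere between roughly an eighth and two fifths of a 15% price rise, depending on how freight-intensive the product is. That is meaningful, but it leaves most of the increase to be explained by something else.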

None of this means shipping costs are irrelevant to consumer prices. The academic consensus, across the BIS, IMF, Federal Reserve, and academic economics literature, supports a real and statistically identifiable transmission effect. The point is that the size of the effect, its stability across different economic conditions, and its contribution relative to other drivers require careful analysis, not a visual inspection of two lines on a chart.

Evidence vs. Certainty: The Right Epistemic Standard

Economic analysis operates under uncertainty by definition. The economy is a complex adaptive system with billions of actors, incomplete information, and feedback loops that regularly produce surprises. Any claim to certainty about economic outcomes should be treated as a warning sign rather than a selling point.

The appropriate standard is not certainty but calibrated confidence: how likely is this outcome given the available evidence, and what would cause us to update our assessment? That requires being explicit about the quality and limitations of the evidence, the assumptions embedded in the analysis, and the conditions under which the historical relationship might break down.

For shipping-price analysis specifically, calibrated confidence means distinguishing between what the evidence actually supports and what would require stronger claims. A reasonable, evidence-based position might be: "Historical data and structural economic modeling support a transmission effect from container freight rates to core goods inflation at a 4 to 6 month lag, with an estimated pass-through elasticity of 0.010 to 0.020 (meaning a 10% increase in freight rates is associated with a 0.10% to 0.20% increase in goods prices). This relationship is statistically significant and supported by Granger causality tests, but accounts for a minority of goods price variation and may be weaker during periods of high domestic demand pressure or unusual monetary policy."

That is a more complicated statement than "rates went up and prices followed." It is also more useful, because it tells you the expected magnitude, the timing, the uncertainty range, and the conditions that might invalidate the forecast.

The Levels of Evidence

  • Weakest: Concurrent correlation in raw time series data. Two variables moved together. Nothing more.
  • Moderate: Statistically significant time-lagged correlation, with lag chosen on structural grounds before looking at the data.
  • Moderate: Granger causality, meaning predictive priority of X over Y in historical data, controlling for Y's own lags.
  • Stronger: Regression analysis with explicit controls for major confounders (monetary policy, exchange rates, demand cycles).
  • Stronger: Structural economic model with identified causal mechanisms, verified against multiple independent datasets.
  • Strongest: Quasi-experimental designs, such as event studies of discrete shocks (canal closures, port strikes) that are plausibly exogenous to the variables of interest.

How Risk and Route Handles Uncertainty

The analytical approach at Risk and Route starts from a simple premise: the people reading this analysis are making real decisions — about supply chains, about sourcing, about pricing, about when to lock in contracts — and they are better served by honest uncertainty than by false precision.

In practice, this means several things. First, we always report the methodology behind a correlation or forecast, so you can evaluate the assumptions yourself. A correlation coefficient without context for how it was calculated and what sample period it covers is marketing, not analysis.

Second, we express forecasts as probability distributions or confidence ranges rather than point estimates. "Container rates are likely to remain elevated through Q3" is a weaker but more honest claim than "rates will be $6,000 by September." The first is actionable. The second is pretending to know things we cannot know.

Third, we flag the major sources of uncertainty for each analysis — the confounders that could invalidate the relationship, the model assumptions that might not hold in current conditions, and the indicators we are watching that would cause us to update our view.

Fourth, we distinguish between the core empirical finding and the interpretation layered on top of it. If the data shows that container rates and import prices have historically been correlated with a six-month lag, we say that. We do not automatically extend that finding into a deterministic forecast without acknowledging that historical patterns break down under novel conditions.

This approach produces analysis that is sometimes less satisfying than confident predictions — humans are wired to prefer certainty. But it is analysis you can actually use without being misled by the inevitable cases where the simple story was wrong.

What to Do With This

When you encounter a claim that shipping rates caused some consumer price outcome, ask these five questions before accepting or rejecting it.

One: What is the lag structure? Were the price effects observed at the same time as the shipping cost changes, or at a structurally appropriate delay? Concurrent correlations in economic data are almost always confounded.

Two: What confounders were controlled for? If the analysis does not account for monetary policy, exchange rates, or demand conditions, the causal attribution is much weaker than it appears.

Three: What is the magnitude? Even a real and statistically significant transmission effect from freight to consumer prices is typically small compared to other drivers. If the claimed effect is large, it requires correspondingly strong evidence.

Four: Was the relationship tested before the fact or identified after? Post-hoc explanations using historical data are always prone to over-fitting. A pattern that was predicted before the data came in is much stronger evidence than one identified afterward.

Five: Does the claim hold under alternative specifications? Robust findings survive changes in the time window, sample, control variables, and estimation method. Fragile findings do not.
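
For the time-window part of question five, one simple check is to recompute the correlation over successive sub-periods and see whether the sign and size hold up. The sketch below uses a synthetic pair of series in which the relationship deliberately reverses halfway through, so the data and function names are assumptions for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def window_correlations(x, y, window, step):
    """Pearson r over successive windows of the paired series."""
    return [pearson_r(x[s:s + window], y[s:s + window])
            for s in range(0, len(x) - window + 1, step)]

# Synthetic series (not real data): the relationship reverses halfway through.
n = 96
x = [math.sin(t / 3) + 0.3 * math.sin(1.3 * t) for t in range(n)]
y = [x[t] if t < n // 2 else -x[t] for t in range(n)]

full_sample = pearson_r(x, y)
by_window = window_correlations(x, y, window=24, step=24)
sign_flips = min(by_window) < 0 < max(by_window)

print(f"full-sample r: {full_sample:.2f}")
print("by 24-month window:", [round(r, 2) for r in by_window])
print("sign changes across windows:", sign_flips)
```

A single full-sample number hides the reversal entirely. Real data will rarely flip this cleanly, but a correlation that swings widely from one window to the next is a fragile finding in the sense described above.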

These are not pedantic methodological objections. They are the difference between actionable intelligence and a compelling chart that leads you astray. Shipping markets are genuinely important for consumer prices. The connection is real. But quantifying it honestly, acknowledging what we don't know, and updating when the evidence changes — that is what makes the analysis worth having.