ESG Scores Vary Wildly Between Providers and Here Is Why
If you pull ESG ratings for Tesla from three different providers, you will get three meaningfully different answers. MSCI gives Tesla a rating that reflects strong environmental product positioning. Sustainalytics flags significant governance concerns. FTSE Russell lands somewhere else entirely. This is not a bug in the system. It is the inevitable result of measuring something genuinely complex using different frameworks, and understanding why it happens is more useful than complaining about it.
The Measurement Problem
ESG covers an enormous range of company characteristics. Environmental factors include carbon emissions, water usage, waste management, and product lifecycle impacts. Social factors span labor practices, supply chain conditions, community relations, product safety, and data privacy. Governance encompasses board composition, executive compensation, shareholder rights, audit quality, and anti-corruption practices.
No two rating agencies agree on how to weight these categories against each other. Should a company with excellent environmental practices but poor labor conditions score higher or lower than a company with average performance across all categories? The answer depends entirely on the weighting scheme, and different providers make fundamentally different choices here.
MSCI tends to weight financial materiality heavily, focusing on ESG factors most likely to affect a company's bottom line within its specific industry. Sustainalytics emphasizes exposure to ESG risks and how well companies manage those risks. CDP focuses almost exclusively on environmental disclosure and climate strategy. Each approach is defensible. Each produces different rankings.
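To see how weighting alone can reorder companies, here is a minimal sketch. The company names, pillar scores, and both weighting schemes are invented for illustration and do not reflect any real provider's methodology.

```python
# Hypothetical pillar scores on a 0-100 scale (not real company data).
pillar_scores = {
    "GreenCo": {"E": 90, "S": 30, "G": 60},   # strong environment, weak labor
    "SteadyCo": {"E": 60, "S": 60, "G": 60},  # average across the board
}

# Two illustrative weighting schemes; neither belongs to a real provider.
weights_env_heavy = {"E": 0.6, "S": 0.2, "G": 0.2}
weights_social_heavy = {"E": 0.2, "S": 0.6, "G": 0.2}


def composite(scores, weights):
    """Weighted sum of pillar scores."""
    return sum(scores[pillar] * weight for pillar, weight in weights.items())


def rank(all_scores, weights):
    """Company names ordered from highest to lowest composite score."""
    return sorted(
        all_scores,
        key=lambda name: composite(all_scores[name], weights),
        reverse=True,
    )


print(rank(pillar_scores, weights_env_heavy))     # GreenCo ranks first
print(rank(pillar_scores, weights_social_heavy))  # SteadyCo ranks first
```

The underlying pillar scores are identical in both runs. Only the weights change, and that is enough to flip the ranking.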
Data Sources and Collection Methods
The underlying data problem is substantial. ESG rating agencies rely on a mix of company self-disclosure (sustainability reports, proxy filings, CDP questionnaires), third-party data (government databases, NGO reports, news sources), and proprietary estimation models for companies that do not disclose.
Company self-disclosure is inconsistent. Reporting frameworks like GRI, SASB, and TCFD have improved standardization, but participation is voluntary for most companies, and the depth of reporting varies enormously. A large European company subject to the EU's Corporate Sustainability Reporting Directive will disclose far more than a mid-cap Asian manufacturer with no regulatory disclosure requirements.
When companies do not disclose, rating agencies estimate, and they do not estimate the same way. One provider might assume that a non-disclosing company in a high-risk industry has poor practices (a pessimistic default). Another might substitute the industry average. A third might leave the data point blank, which shifts more weight onto the remaining indicators in the overall score calculation.
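The three treatments of undisclosed data described above can be sketched in a few lines. The indicator names, values, and the simple unweighted average are all assumptions made for illustration; real methodologies are considerably more involved.

```python
# Hypothetical indicator scores (0-100) for one company; None = not disclosed.
indicators = {"emissions": None, "water": 60, "waste": 80}

# Hypothetical industry averages for the same indicators.
industry_avg = {"emissions": 50, "water": 55, "waste": 65}


def score(indicators, strategy, industry_avg=None, floor=0):
    """Unweighted mean of indicator scores under a missing-data strategy.

    strategy:
      "pessimistic"      - undisclosed indicators get the `floor` value
      "industry_average" - undisclosed indicators get the industry average
      "blank"            - undisclosed indicators are dropped from the mean
    """
    values = []
    for name, value in indicators.items():
        if value is None:
            if strategy == "pessimistic":
                value = floor
            elif strategy == "industry_average":
                value = industry_avg[name]
            elif strategy == "blank":
                continue
            else:
                raise ValueError(f"unknown strategy: {strategy}")
        values.append(value)
    return sum(values) / len(values)


for strategy in ("pessimistic", "industry_average", "blank"):
    print(strategy, round(score(indicators, strategy, industry_avg), 1))
```

In this toy case the same company, with the same disclosures, scores lowest under the pessimistic default and highest when the gap is simply ignored.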
Industry Classification and Peer Comparison
Most ESG methodologies score companies relative to industry peers rather than on an absolute scale. This means a fossil fuel company can receive a high ESG score if it manages environmental risks better than other fossil fuel companies, even though its absolute environmental impact is substantial. Conversely, a technology company might receive a mediocre score despite minimal direct environmental impact if its data privacy practices lag behind tech sector peers.
The peer group definition itself introduces variation. Where you draw industry boundaries affects who gets compared to whom. Is Amazon a technology company, a retailer, or a logistics company? The answer changes its peer set and therefore its relative score on issues like warehouse labor conditions, carbon emissions from delivery operations, and data governance.
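A small sketch makes the peer-group effect concrete. It assumes a simple percentile rank as the relative score, which is a stand-in chosen for illustration rather than any provider's actual formula, and every number in it is invented.

```python
def percentile_in_peers(raw, peer_raws):
    """Percentage of peers whose raw score is strictly below `raw`."""
    below = sum(1 for peer in peer_raws if peer < raw)
    return 100 * below / len(peer_raws)


# Hypothetical raw labor-practice scores (0-100) for two candidate peer sets.
tech_peers = [80, 85, 90, 75]
logistics_peers = [40, 50, 55, 45]

# One hypothetical company with a fixed raw score.
company_raw = 60

print(percentile_in_peers(company_raw, tech_peers))       # bottom of this peer set
print(percentile_in_peers(company_raw, logistics_peers))  # top of this peer set
```

Nothing about the company changes between the two calls. Its relative standing depends entirely on which peers it is measured against.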
Subjectivity in Scoring
Beyond methodological choices, there is genuine subjectivity in how analysts interpret qualitative information. Two analysts reading the same sustainability report might reach different conclusions about the quality of a company's climate transition plan or the adequacy of its human rights due diligence process.
Some rating agencies rely more heavily on analyst judgment. Others try to minimize it through rules-based scoring frameworks. Neither approach eliminates subjectivity entirely. The rules-based approach just embeds the subjective judgments into the methodology design rather than the individual scoring decisions.
Controversy assessment is particularly subjective. When a company faces an environmental fine or a labor dispute, different agencies assess the severity differently. Is a $50 million EPA settlement a minor operational issue for a company with $200 billion in revenue (the settlement equals 0.025% of annual revenue), or is it a signal of systemic environmental management failure? The answer often depends on the analyst's interpretation of context.
What This Means for Analysts
The low correlation between ESG rating providers (academic studies typically report average pairwise correlations of roughly 0.4 to 0.6, compared with 0.9 or higher for credit ratings) tells you something important. ESG scores are not measuring a single, well-defined quantity. They are measuring a complex, multidimensional construct using different definitions and different data, and arriving at different answers because the underlying question is genuinely ambiguous.
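If you hold scores from several providers for the same set of companies, you can check the agreement yourself. The sketch below computes pairwise Pearson correlations in plain Python. The provider names and scores are made up, and Pearson is just one reasonable choice; a rank correlation would be an equally defensible alternative.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores from three providers for the same five companies,
# listed in the same company order for each provider.
ratings = {
    "provider_a": [80, 65, 50, 40, 30],
    "provider_b": [60, 70, 35, 55, 45],
    "provider_c": [75, 60, 55, 35, 25],
}

# Correlation for every unordered pair of providers.
pairwise = {
    (a, b): pearson(ratings[a], ratings[b])
    for a, b in combinations(ratings, 2)
}

for pair, r in pairwise.items():
    print(pair, round(r, 2))
```

Providers usually publish scores on different scales (letters, 0-100, risk points), so real data would need to be mapped to comparable numeric values before this step.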
For analysts, the practical takeaway is that no single ESG score should be treated as authoritative. If you are using ESG data in investment analysis, you should understand which provider's methodology best aligns with your analytical objectives, look at multiple providers to identify where they agree and disagree, and investigate the underlying data rather than relying on headline scores.
A company that scores well across all major providers probably does manage ESG risks competently. A company with wildly divergent scores is worth investigating more closely, because the disagreement itself is informative. It usually means the company has strong performance on some dimensions and weak performance on others, and the agencies disagree about which dimensions matter more.
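One simple way to surface the companies worth a closer look is to measure how far apart the providers are for each one. The sketch below uses the gap between the highest and lowest score as the disagreement measure. The company names, scores, and the 25-point threshold are arbitrary assumptions, and the scores are presumed to be already rescaled to a common 0-100 range.

```python
# Hypothetical scores, assumed already rescaled to 0-100 for each provider.
scores = {
    "AlphaCorp": {"provider_a": 78, "provider_b": 74, "provider_c": 80},
    "BetaCorp": {"provider_a": 85, "provider_b": 35, "provider_c": 60},
}


def spread(provider_scores):
    """Gap between the highest and lowest provider score for one company."""
    values = list(provider_scores.values())
    return max(values) - min(values)


def flag_divergent(all_scores, threshold=25):
    """Names of companies whose provider scores differ by more than `threshold`."""
    return [
        name for name, by_provider in all_scores.items()
        if spread(by_provider) > threshold
    ]


print(flag_divergent(scores))  # only BetaCorp exceeds the threshold
```

A flag here is a prompt to read the underlying pillar and indicator data, not a conclusion in itself.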
The Path Forward
Regulatory efforts like the EU's CSRD and the ISSB standards are improving data consistency, which should reduce some of the variation driven by disclosure gaps. But methodological differences will persist because they reflect genuinely different views about what ESG analysis should prioritize.
This is not necessarily a problem to solve. In equity research, different analysts reach different conclusions about the same company all the time. The existence of multiple ESG perspectives, when understood properly, gives investors richer information than a single consensus score ever could. The key is treating ESG ratings as one input among many rather than as a definitive verdict on corporate sustainability.