Goodhart’s Law states: “When a measure becomes a target, it ceases to be a good measure.”
An organization measures customer satisfaction using sentiment analysis. The sentiment metric becomes a KPI. Leadership tracks it. Teams are evaluated on it. Compensation is tied to it.
Everyone optimizes for the sentiment metric. They do not optimize for actual customer satisfaction. The metric diverges from reality. The organization makes decisions based on a distorted signal.
This is the core problem with using sentiment analysis as a metric. Sentiment is a weak proxy for what organizations actually care about (satisfaction, engagement, retention). When the proxy becomes the target, the real thing gets ignored.
How Sentiment Becomes a Metric
An organization wants to measure something important: customer satisfaction, employee engagement, market sentiment.
Satisfaction is hard to measure directly. It requires ongoing surveying, complex analysis, subjective judgment. It is slow and expensive.
Sentiment analysis offers an alternative. Measure sentiment of available text (reviews, feedback, communication). Sentiment is the proxy for satisfaction.
The metric is created: “average customer sentiment is 0.68.” The metric is tracked: trending from 0.65 to 0.68. The metric is reported: “sentiment improved 5%.”
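A minimal sketch of how that number typically gets produced, assuming per-item sentiment scores in the 0-to-1 range already exist (the scores and quarters below are made up for illustration):

```python
from statistics import mean

# Hypothetical per-item sentiment scores in [0, 1], e.g. produced by an
# off-the-shelf sentiment model run over support tickets and reviews.
last_quarter = [0.61, 0.70, 0.58, 0.72, 0.64]   # averages to 0.65
this_quarter = [0.66, 0.74, 0.63, 0.71, 0.66]   # averages to 0.68

prev_avg = mean(last_quarter)
curr_avg = mean(this_quarter)
change_pct = (curr_avg - prev_avg) / prev_avg * 100

print(f"Average customer sentiment: {curr_avg:.2f}")
print(f"Sentiment improved {change_pct:.0f}% quarter over quarter")
# Nothing in this calculation touches satisfaction, retention, or revenue.
# The number is only as meaningful as the proxy behind it.
```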
Leadership is happy. There is now a quantified measure of satisfaction. It is tracked. It is trended.
The metric becomes the goal. Teams are told: “Increase customer sentiment by 10%.” They are measured on this goal. Their compensation depends on meeting it.
Now everyone is optimizing for sentiment, not for satisfaction.
Perverse Incentives
Once sentiment becomes a metric with consequences, people optimize for it.
For customer-facing teams. They learn that negative sentiment is a problem. They learn to handle negative feedback strategically.
Instead of fixing the problem (which might take time), they focus on managing how the customer expresses the problem.
A customer complains about a feature. Instead of fixing the feature, support staff focus on calming the customer down and getting them to express less negative sentiment.
“I understand your frustration. We are working on a fix. In the meantime, here is something else that might help.” The customer is soothed. Their sentiment improves.
The underlying problem (the broken feature) is unchanged. But the metric improved.
Or they selectively respond. High-sentiment feedback gets recognition. Low-sentiment feedback gets deprioritized. The organization ends up fixing problems for happy customers and ignoring problems from unhappy ones.
The incentive structure is backwards. Problems from unhappy customers are exactly the ones that need attention. But sentiment metrics create incentive to ignore them.
For employee teams. They learn that negative sentiment (expressed to management or in monitored communication) creates risk.
They learn to perform positivity. They express only positive thoughts in communication. They hide concerns. They stay silent about problems.
The organization measures sentiment and sees improvement. But people are actually less willing to surface problems. The metric has made the organization less adaptive.
The incentive for honesty is removed. The incentive for performance is installed.
For product teams. They learn that sentiment is what is measured, not outcomes.
They focus on features that generate positive sentiment rather than features that solve problems.
A feature that prevents a customer problem is valuable but might not generate positive sentiment (the customer might not notice or express it).
A feature that delights customers might generate high sentiment but might not be valuable (it might be a feature 1% of customers care about).
Product prioritization shifts away from impact and toward sentiment. The organization builds less valuable products.
The Metric Diverges From Reality
As people optimize for sentiment, the metric diverges from the underlying construct.
An organization measures customer sentiment: 0.72 and improving.
But actual customer churn is rising. Actual revenue is flat. Actual retention is declining.
The metric improved. The underlying reality worsened.
How? The organization has been optimizing for sentiment. They:
- Focused on features that customers express positive sentiment about, not features customers actually need
- Focused on handling complaints to reduce negative sentiment, not on fixing underlying problems
- Attracted customers who express sentiment easily, lost customers who suffer silently
- Optimized for the metric instead of the underlying construct
Customer sentiment improved. Customer satisfaction declined. Customer retention declined.
The metric diverged from reality.
Segment Divergence
Different customer segments respond to sentiment optimization differently.
Vocal customers. Some customers complain and express sentiment easily. When sentiment optimization happens, they benefit. They get fast service. Their complaints get attention (to manage sentiment).
Silent customers. Some customers experience problems but do not complain. They are suffering silently. When sentiment optimization happens, they do not benefit. They are not expressing sentiment.
If sentiment optimization is aggressive, the organization over-serves vocal customers and under-serves silent ones. The silent customers are often the most valuable (they are not noise, they have real problems).
The metric improves (vocal customers are happier and expressing positive sentiment). The organization deteriorates (silent customers are increasingly dissatisfied and leaving).
Selective Attrition
When an organization optimizes for sentiment, the people most likely to leave are those who do not express sentiment easily.
This is particularly true with employee sentiment.
An organization measures employee sentiment as a KPI. They create a culture where positive sentiment is valued. They respond to negative sentiment.
Employees who express concerns directly are handled with positive reinforcement. “Thank you for speaking up. We are working on this.”
Employees who do not express concerns are invisible. Employees who are quietly dissatisfied are not flagged by sentiment monitoring.
Over time, what happens?
Employees who are comfortable expressing themselves and performing are retained. Employees who are quiet, serious, contemplative, or simply independent are more likely to leave.
The organization has optimized for sentiment-expressers and lost the people who just work.
The metric looks good (sentiment is high, lots of people expressing positive things). The organization has become worse (lost the people who did the real work).
Gaming at Scale
As sentiment metrics become important, people learn to game them.
Employees learn which language is flagged as positive and which as negative by sentiment systems. They adjust their language strategically.
A team has a serious problem. Instead of saying “This approach is not working because X,” they learn to say “This approach is interesting. I wonder if we could also explore alternative approaches.”
Same information. Different language. Sentiment analysis is happy. The problem is still hidden.
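To see how cheap that rephrasing is, here is a small sketch using NLTK's VADER analyzer (one common off-the-shelf scorer; the exact sentences are hypothetical). Both phrasings report the same problem; only the blunt one scores as negative:

```python
# pip install nltk
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
sia = SentimentIntensityAnalyzer()

blunt = "This approach is not working; it is broken and causing problems."
hedged = ("This approach is interesting. I wonder if we could also "
          "explore alternative approaches.")

# The compound score runs from -1 (most negative) to +1 (most positive).
print(sia.polarity_scores(blunt)["compound"])   # negative: the problem is visible
print(sia.polarity_scores(hedged)["compound"])  # positive: same problem, hidden
```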
Social media communities learn that certain language triggers more engagement and more positive sentiment. They learn to use that language strategically.
A company asks “How was your experience?” on a survey. Customers learn that expressing extremely positive sentiment gets more perks. They express extreme sentiment whether they feel it or not.
Sentiment metrics become measurements of how well people game the metric, not measurements of the underlying construct.
Temporal Gaming
People express sentiment at specific times strategically.
A customer is unhappy. If they express it immediately, the company will try to fix it. If they wait and express it at a high-visibility moment (annual survey, public review), it will have more impact.
So they stay silent, then express strongly. The sentiment metric dips when they express.
Or a customer will express positive sentiment right after getting good service. But if problems emerge later, they do not re-express sentiment. The temporal pattern of sentiment does not match the temporal pattern of satisfaction.
An employee will express positive sentiment during a performance review. They will express negative sentiment once they have found another job. The sentiment metric lags the actual decision to leave.
Sentiment metrics are temporally gamed.
The Distortion Effect
When sentiment becomes a metric, decision-making distorts.
Feature prioritization. Teams focus on features that generate positive sentiment, not features that are important.
Customer focus. Teams focus on vocal customers who express sentiment, neglecting silent customers who might be suffering more.
Problem handling. Teams focus on managing sentiment about problems instead of fixing them.
Hiring and retention. Teams optimize for people who express sentiment easily, losing people who work quietly.
Strategy. Company strategy shifts from what the company should do toward what will generate positive sentiment.
All of these distortions flow from optimizing for the metric instead of the underlying construct.
The Cost
The cost of sentiment metric distortion is usually not immediately visible.
In the short term, the metric improves. The organization looks good.
Over time, the underlying constructs decay:
- Customer churn rises despite positive sentiment
- Employee turnover rises despite positive sentiment scores
- Product quality declines despite positive feedback
- Revenue stagnates despite positive metrics
The organization then discovers (usually through outcome failure) that the metric and reality had diverged.
By then, the distortion has been ongoing for months or years.
Examples
Customer Example
A software company measures customer satisfaction using sentiment analysis on support interactions.
The company trains support teams: “Our goal is to improve customer sentiment.”
Support teams learn that negative sentiment is bad. They focus on calming upset customers quickly.
Instead of saying “You are right, that feature is broken. Here is the timeline for fixing it,” they say “I understand your frustration. You are not alone. We are committed to making this better. In the meantime, here are workarounds.”
The customer’s sentiment improves. They feel heard. The metric improves.
The broken feature is not fixed. The customer is still frustrated, just less vocal about it.
The company tracks customer sentiment. It trends upward. Leadership is happy. They increase the bonus for improving sentiment.
Support teams optimize further. They become more skilled at calming upset customers. Sentiment improves more.
Meanwhile, actual customer satisfaction declines. Products are still broken. Problems are not fixed. Customers are still dissatisfied, just quieter.
A year later, customer churn spikes. The company is shocked. “Sentiment was positive.”
The metric diverged from reality because the metric became the goal.
Employee Example
A company measures employee engagement using sentiment analysis on internal communication.
They create a culture: “High sentiment is engagement.”
Employees learn that expressing concerns is risky (it triggers flags in sentiment monitoring). They learn to perform positivity.
They also learn that the loudest voices get heard. People who complain loudly (in productive ways) get attention. People who are quiet do not.
So the engineers who are actually working hard and getting things done stay quiet. The people who are talkative about problems get energy and attention from management.
Over time, the culture shifts. Talking is valued. Doing is less valued. Complaints are listened to. Accomplishments are not celebrated (they do not generate sentiment).
The company measures sentiment. It is positive. Employees are expressing positive things.
In reality, the company is slowly becoming dysfunctional. The serious people who do the work are leaving. The talkative people are rising. The culture has shifted away from work and toward performance.
A year later, critical deadlines are missed. Projects are incomplete. The company realizes people are not getting things done.
The metric suggested engagement was high. Reality suggested it was low.
Market Example
A financial company measures market sentiment using sentiment analysis on news, social media, and analyst reports.
They use this to inform portfolio decisions. High sentiment toward tech means increasing tech exposure; low sentiment means reducing it.
The sentiment metric becomes the signal. Traders optimize for it.
They learn that certain kinds of statements generate positive sentiment. They learn that certain kinds of bad news trigger very negative sentiment.
The metric becomes increasingly disconnected from actual valuation. Sentiment is about the psychology of traders. Valuation is about fundamentals.
The company makes portfolio bets based on sentiment trends. They increase exposure when sentiment is high. They reduce when it is low.
This is momentum trading. It can work sometimes. But when sentiment and fundamentals diverge sharply, it fails catastrophically.
A company might have negative sentiment but positive fundamentals (the market has mispriced it). A company might have positive sentiment but negative fundamentals (a bubble).
The company loses money because they optimized for sentiment instead of understanding fundamentals.
Breaking the Optimization
Once a metric becomes the goal, it is hard to break the cycle.
Teams are incentivized to optimize for the metric. If you remove the optimization pressure, teams feel like they are being asked to perform worse.
If you remove the metric, people feel like their hard work (optimizing for it) was wasted.
The organization becomes locked into optimizing for a metric that no longer reflects what matters.
Prevention
The way to avoid this is to not make sentiment a metric in the first place.
Or, if you do measure sentiment, measure it separately from what it is a proxy for.
“Sentiment is 0.72. Customer satisfaction, measured independently through surveys and outcomes (retention, repeat purchase), is 0.61. These diverge. Why?”
The divergence reveals that sentiment is not measuring satisfaction.
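A minimal sketch of that side-by-side check, assuming you already track a periodic sentiment average and an independently measured satisfaction figure for the same periods (all numbers hypothetical):

```python
# Hypothetical quarterly figures: sentiment from text analysis,
# satisfaction measured independently (surveys, retention, repeat purchase).
quarters = ["Q1", "Q2", "Q3", "Q4"]
sentiment = [0.65, 0.68, 0.70, 0.72]     # trending up
satisfaction = [0.66, 0.65, 0.63, 0.61]  # trending down

DIVERGENCE_THRESHOLD = 0.05  # assumption: gap that warrants investigation

for q, sent, sat in zip(quarters, sentiment, satisfaction):
    gap = sent - sat
    status = "INVESTIGATE" if abs(gap) > DIVERGENCE_THRESHOLD else "ok"
    print(f"{q}: sentiment={sent:.2f} satisfaction={sat:.2f} gap={gap:+.2f} {status}")

# By Q4 the proxy reads 0.72 while the construct reads 0.61.
# The divergence, not the score, is the finding.
```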
Or measure sentiment but never tie compensation or goals to it. Use it only as a signal for further investigation.
“Sentiment is declining. This is a signal. Let’s investigate why. It might indicate satisfaction problems. Or it might indicate people are becoming more honest.”
Treating sentiment as a signal for investigation is different from making it a goal.
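Concretely, the signal-not-goal stance can be as simple as a threshold that opens a question rather than a corrective action on the score (the window and threshold below are assumptions):

```python
# Hypothetical rolling weekly sentiment averages, newest last.
recent = [0.71, 0.69, 0.66, 0.62]
DROP_TRIGGER = 0.05  # assumption: decline large enough to warrant a look

drop = recent[0] - recent[-1]
if drop > DROP_TRIGGER:
    # Trigger an investigation, not a target to push back up.
    print(f"Sentiment fell {drop:.2f} over {len(recent)} weeks. Investigate:")
    print("- Did satisfaction actually decline? (check retention, surveys)")
    print("- Or are people simply becoming more honest?")
```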
When Sentiment Metrics Are Most Dangerous
Sentiment metrics are most dangerous when:
They are tied to compensation. When people are paid for sentiment improvement, they optimize for it. The metric becomes the goal.
They are tracked as KPIs. When leadership watches the metric, teams optimize for it.
They are public. When the metric is published and compared (team A has 0.72, team B has 0.68), teams compete to optimize for it.
They are not paired with outcome measurement. When sentiment is measured but actual outcomes (retention, productivity, quality) are not, the divergence is invisible.
They are the primary measure. When sentiment is the main way the organization measures success, all optimization flows toward it.
The Alternative
Do not use sentiment analysis as a metric. Use it as a signal.
Measure what actually matters: retention, revenue, quality, productivity, actual satisfaction (through direct surveys and outcomes).
Use sentiment as a supplementary indicator. When sentiment diverges from outcomes, investigate why.
Do not optimize for sentiment. Optimize for what actually matters. If you do this right, sentiment will be positive as a byproduct (because you are actually creating satisfaction).
The organizations that perform well are not the ones with the highest sentiment metrics. They are the ones focused on actually delivering value.
Sentiment metrics distort this focus. They make organizations optimize for the wrong thing. The cost is visible only after divergence becomes severe.