Sentiment Analysis

The Marvin Labs sentiment feature converts company communication into a single score. Each company receives one sentiment score between 0 and 100. A score of 0 is the most negative, while 100 is the most positive. Scores are updated daily overnight.

Methodology

Creating a reliable sentiment analysis model requires more than just measuring positive or negative language. The methodology behind Marvin Labs’ sentiment feature is designed to reflect how companies communicate with investors—grounded in primary, regulated disclosures rather than third-party noise.

Data Sources: Focus on Primary, High-Quality Sources

The model includes only sources where company management is under regulatory obligation to provide fair and balanced communication. These include:

  • Annual and quarterly filings (10-K, 10-Q)
  • Earnings calls (prepared remarks and Q&A)
  • Regulatory press releases and filings
  • Select investor conferences (e.g., JPM Healthcare, Morgan Stanley Tech, Goldman Sachs Internet)
  • Investor days and major public events (e.g., WWDC, Build, GTC)

Excluded sources:

  • Marketing blogs and promotional press releases unless filed with regulators
  • Financial media (Bloomberg, CNBC, Financial Times, etc.)
  • Social media (Twitter, Reddit, StockTwits)

We are aware that other sentiment models sometimes include these sources. One of the most prominent examples, the VADER sentiment model, is deliberately designed to work with social media content.

However, we believe that this dilutes the quality of the sentiment signal from two separate directions.

First, our sentiment model is meant to capture the sentiment of management and the company itself, rather than the sentiment of third parties about a company.

Second, the distribution of third-party coverage across companies is very top-heavy. Outside a particular subset of companies, typically consumer-facing brands and meme stocks, most companies simply do not generate enough content on social media or in traditional media to draw any conclusions. Have you ever looked at the Twitter / X account of ADP (currently roughly the 100th-largest company by market capitalization globally)? How often is it mentioned in the financial press? This lack of coverage is even more pronounced for smaller companies, which may not have a significant social media presence or media coverage.

Focus: Company-Level, Not Document-Level

We calculate a single score at the company level. Each document is weighted by importance and relevance. Annual reports carry more weight than minor filings. This approach reflects the broader narrative that companies build across multiple communications.

Document-level sentiment misses context. Company-level sentiment creates a more complete view of management outlook and recognizes that disclosures form part of an ongoing narrative.
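One way to picture the company-level aggregation is a weighted average of document scores. The sketch below is illustrative only: the document types, the numeric weights in DOC_WEIGHTS, and the function name company_score are assumptions, since the actual weighting scheme is not described here.

```python
from dataclasses import dataclass

# Hypothetical importance weights per document type (assumed values,
# chosen only so that annual reports outweigh minor filings).
DOC_WEIGHTS = {
    "10-K": 3.0,
    "10-Q": 2.0,
    "earnings_call": 2.0,
    "press_release": 1.0,
}


@dataclass
class Document:
    doc_type: str
    score: float  # document-level sentiment on a 0 to 100 scale


def company_score(docs):
    """Weighted average of document scores; unknown types get weight 1.0."""
    if not docs:
        raise ValueError("at least one document is required")
    weights = [DOC_WEIGHTS.get(d.doc_type, 1.0) for d in docs]
    weighted_sum = sum(w * d.score for w, d in zip(weights, docs))
    return weighted_sum / sum(weights)


docs = [Document("10-K", 80.0), Document("press_release", 40.0)]
print(company_score(docs))  # 70.0: the annual report dominates the result
```

With these assumed weights, a positive annual report pulls the company score well above the simple mean of 60.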

Normalization

Sentiment is normalized to account for industry differences. A bank and a toy company naturally use different language. Scores are calibrated against peers and normalized across industries and market phases to a scale of 0 to 100. This allows fair comparison.
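As a rough illustration of peer calibration, raw scores can be ranked within each industry and the ranks mapped onto the 0 to 100 scale. This is a minimal sketch under stated assumptions: percentile ranking is one plausible technique and not necessarily the one used by the model, the function name peer_normalize and the sample companies are hypothetical, and the adjustment for market phases is not covered.

```python
def peer_normalize(raw_scores, industries):
    """Map raw scores to 0..100 by rank within each industry.

    raw_scores: dict of company -> raw sentiment (any scale)
    industries: dict of company -> industry label
    Ties are broken by sort order; a real system would likely average them.
    """
    by_industry = {}
    for company, industry in industries.items():
        by_industry.setdefault(industry, []).append(company)

    normalized = {}
    for members in by_industry.values():
        ranked = sorted(members, key=lambda c: raw_scores[c])
        n = len(ranked)
        for rank, company in enumerate(ranked):
            # A company with no peers gets the neutral midpoint.
            normalized[company] = 50.0 if n == 1 else 100.0 * rank / (n - 1)
    return normalized


raw = {"BankA": 0.1, "BankB": 0.3, "BankC": 0.2, "ToyA": 0.8, "ToyB": 0.6}
sector = {"BankA": "banks", "BankB": "banks", "BankC": "banks",
          "ToyA": "toys", "ToyB": "toys"}
print(peer_normalize(raw, sector))
```

In this example BankB scores 100 even though its raw value is lower than both toy companies, because each company is compared only with its own peers.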

Practical Use

  • Screen companies for further research
  • Spot risks where sentiment diverges from numbers
  • Use as a complementary signal in a broader investment workflow

Performance Analysis: Sentiment as an Investment Signal Methodology

Setup

To evaluate sentiment analysis as an investment signal, we constructed a weekly rebalancing portfolio using sentiment scores derived from our LLM-based system. The methodology included:

  1. Analyze a broad universe of stocks using the sentiment model
  2. Go long the ten companies with the highest scores
  3. Short the ten companies with the lowest scores
  4. Equal-weight all positions
  5. Rebalance weekly on updated scores
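The five steps above can be sketched as a single weight-construction function that would be called once per weekly rebalance. This is a simplified illustration, not the backtest code itself: the function name long_short_weights, the ticker labels, and the choice to scale each side to a total of 1.0 are assumptions.

```python
def long_short_weights(scores, n=10):
    """Equal-weight long-short portfolio from sentiment scores.

    scores: dict of ticker -> sentiment score (0 to 100)
    Returns a dict of ticker -> weight: +1/n for the n highest scores,
    -1/n for the n lowest. Tickers outside both groups are omitted.
    """
    if len(scores) < 2 * n:
        raise ValueError("universe too small for non-overlapping legs")
    ranked = sorted(scores, key=scores.get, reverse=True)
    weights = {ticker: 1.0 / n for ticker in ranked[:n]}
    weights.update({ticker: -1.0 / n for ticker in ranked[-n:]})
    return weights


# Hypothetical universe of 30 tickers with scores 0..29.
universe = {f"T{i}": float(i) for i in range(30)}
weights = long_short_weights(universe)
print(weights["T29"], weights["T0"])  # 0.1 -0.1
```

Re-running the function each week on the updated scores gives the new target weights; the long and short legs offset each other, so net exposure is zero.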

The test covered a five-year period from 2018 to present, including the market turbulence of 2020 and subsequent recovery.

Results

As shown in the chart below, a portfolio with a long-notional value of $100 would have yielded approximately $60 over the five-year period without experiencing significant drawdowns.

Return on a Sentiment Long-Short Portfolio. Source: Marvin Labs, Bloomberg

These figures compare favorably to both broad market indices and other alternative data strategies during the same period [1].

Metric                   Value
Sharpe Ratio             1.3
Maximum Drawdown         8.2%
Annualized Volatility    9.6%
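For readers who want to reproduce these three metrics on their own return series, the standard definitions can be sketched as follows. The assumptions are ours: weekly returns (52 periods per year), a risk-free rate of zero by default, and sample standard deviation. The exact conventions behind the figures above are not stated.

```python
import math
import statistics


def annualized_volatility(returns, periods_per_year=52):
    """Sample standard deviation of periodic returns, scaled to one year."""
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


def sharpe_ratio(returns, periods_per_year=52, risk_free_per_period=0.0):
    """Annualized mean excess return divided by annualized volatility."""
    excess = [r - risk_free_per_period for r in returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        raise ValueError("returns have zero variance")
    return statistics.mean(excess) / spread * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


weekly = [0.10, -0.50, 0.20]  # made-up returns, for illustration only
print(max_drawdown(weekly))   # roughly 0.5, i.e. a 50% decline from the peak
```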

The strategy showed consistency, including during the March 2020 market crash. Research from Goldman Sachs Global Investment Research shows sentiment signals have low correlation with traditional factors such as value, momentum, and quality. This can add diversification benefits to multi-factor strategies.

While promising, sentiment works best as a complementary tool. Combining sentiment with fundamental factors improves risk-adjusted returns compared to either approach in isolation. Sentiment is most effective as a screening or risk-flagging tool.

Historical Perspective and Evolution

Financial analysis has historically centered on quantitative metrics. Analysts would parse through income statements, balance sheets, and cash flow statements, building intricate models to project future performance. According to a 2021 CFA Institute survey, over 90% of investment professionals consider financial statement analysis their primary analytical tool.

However, numbers tell only part of the story. Research by Loughran and McDonald demonstrates that the language in corporate disclosures provides valuable information beyond the financial figures. This recognition led to the first generation of sentiment analysis in finance.

By the 2010s, more sophisticated methods emerged. Natural Language Processing (NLP) techniques began analyzing syntactic patterns, and machine learning algorithms started identifying subtle relationships between linguistic features and market reactions. A Stanford University study found that these improved models could predict market reactions to earnings announcements with greater accuracy than traditional methods.

The LLM Revolution

The introduction of transformer-based language models in 2017 and their widespread adoption and quality improvements since 2022 marked a turning point. These models understand language in context, recognizing subtle cues that earlier systems missed.

Today's LLMs don't just count words or apply rigid rules. They understand semantic relationships, detect subtle shifts in tone, and recognize the implications of statements within broader economic and business contexts.

How LLM-Powered Sentiment Analysis Works

Beyond Word Counting

Traditional sentiment analysis essentially counted words from predefined lists. If a CEO used five positive words and three negative words, the statement received a positive score. This approach fails spectacularly with statements like: "We are not seeing the positive results we anticipated, and our outlook lacks the optimism we projected last quarter."
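A minimal sketch of such a word-counting scorer makes the failure concrete. The tiny POSITIVE and NEGATIVE word lists below are made up for illustration and are far smaller than any real financial lexicon.

```python
import re

# Toy word lists (assumed for this example only).
POSITIVE = {"positive", "optimism", "improvement", "strong"}
NEGATIVE = {"negative", "decline", "weak", "headwinds"}


def word_count_sentiment(text):
    """Count of positive words minus count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


statement = (
    "We are not seeing the positive results we anticipated, "
    "and our outlook lacks the optimism we projected last quarter."
)
print(word_count_sentiment(statement))  # 2, a positive score for a negative statement
```

The scorer sees "positive" and "optimism" but has no notion of "not" or "lacks", so it reports the opposite of what the speaker means.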

Modern LLMs understand these contextual relationships. They process text through multiple attention layers that weigh the relationships between words and phrases. According to research from MIT Technology Review, this enables them to accurately interpret statements like: "While facing headwinds in our core markets, we've managed to stabilize cash flow and expect gradual improvement throughout next quarter."
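The "weighing" done by an attention layer can be illustrated with scaled dot-product attention, the core operation of transformer models. The sketch below is a bare single-head version in plain Python with toy two-dimensional vectors. Production models add learned projections, many heads, and many stacked layers, none of which are shown.

```python
import math


def softmax(values):
    """Turn raw scores into weights that are positive and sum to 1."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs = []
    for query in queries:
        # Similarity of this token's query to every token's key.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted blend of all value vectors.
        outputs.append([
            sum(w * value[j] for w, value in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Two identical keys receive equal weight, so the output is the mean value.
print(attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[0.0, 2.0], [2.0, 0.0]]))
```

Because every token's output blends information from every other token, a word such as "not" can change how "positive" is represented later in the same sentence.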

Capturing Semantic Meaning

LLMs excel at understanding implicit meaning in financial communications. When a CFO states, "We remain committed to our previous guidance," an LLM can assess whether surrounding context suggests confidence or hedging. Research published in the Journal of Financial Economics demonstrates that these models can identify linguistic patterns associated with future earnings surprises, corporate fraud, or management turnover.

The models also recognize industry-specific language and terminology. For example, "challenging environment" carries different implications in banking versus retail, distinctions that LLMs learn through their training on vast financial text corpora.