Undervalued Stock Scanner: From Raw Data to Smart Insights

Undervalued Stock Scanner: From Raw Data to Smart Insights

A collaborative project that combines financial logic with automated Excel and Power BI tools to transform raw data into user-friendly and insight-driven dashboards.

Like many aspiring investors, my husband and I were drawn to the challenge of identifying undervalued stocks because they are like hidden gems with solid fundamentals that the market hasn’t caught up with yet. What started as a weekend project in Excel became a fully automated pipeline combining Power Query, VBA, and Power BI dashboards.

GitHub Repo

Disclaimer: This project is for educational purposes only. The analysis, models, and methods used here are not financial advice. Investing carries risk, and any decisions based on this analysis are your responsibility. Always do your own research or consult with a qualified financial advisor before making investment decisions.

Project at a Glance

The idea started with my husband’s interest in value investing and my curiosity for building smarter, faster workflows. He defined the investment logic and I built the engine. Our goal? Identifying potentially undervalued stocks by applying statistical methods to assess key financial metrics across various sectors and three market cap categories (Large Cap, Mid Cap, and Small Cap).

Datasets

We used 11 raw CSV files exported from Thinkorswim, each representing a different sector. These contain financial fundamentals segmented by market cap:

  • Large-Cap: $10B+
  • Mid-Cap: $2B–$10B
  • Small-Cap: $250M–$2B

Each sector includes multiple industries. When comparing stocks, it’s important to consider market capitalization and industry because a “good” metric in one industry or market cap can be a red flag in another. We structured the pipeline to retain this nuance throughout the analysis.

What Makes a Stock Undervalued?

Key Fundamentals

Here’s a quick overview of the financial metrics we used:

Price-to-Free Cash Flow ratio (P/FCF)

Price-to-Free Cash Flow (P/FCF): A high P/FCF ratio indicates that the specific firm is trading at a high price but is not generating enough free cash flows to justify the price. Smaller price ratios are generally preferred, as they may reveal a firm generating ample cash flows that may not yet be reflected in the price.

Price-to-Book ratio (P/B)

Book Value Per Share (BVPS) shows a company’s net assets per share. If BVPS is higher than the stock price, the stock may be undervalued. The P/B ratio compares market price to book value. Thus, P/B of 1 means it is traded at exactly where it should be. Higher P/B means it might be overvalued.

Return on Equity (ROE)

This ratio is a gauge of a corporation’s profitability and how efficiently it generates those profits. The higher the ROE, the better a company is at converting its equity financing into profits.

Return on Assets (ROA)

Return on assets measures how effective a company’s management is in generating profit from the total assets on its balance sheet. A ROA that rises over time indicates that the company is doing well at increasing its profits with each investment dollar it spends.

Asset-to-equity ratio (A/E)

The asset-to-equity ratio measures a company’s financial leverage by comparing its total assets to its shareholders’ equity. A higher ratio means more debt and higher financial risk. A lower ratio signals a more conservative, equity-heavy structure.

Price-to-Earnings ratio (P/E)

This metric shows how much investors are paying for each dollar of earnings. A high P/E can signal high growth expectations or overvaluation. A low P/E might suggest undervaluation or strong recent performance.

How We Evaluate Stocks Using These Metrics

Our evaluation is based on a series of carefully selected financial criteria but here’s the key: we never compare a stock to the wrong peer group. Each stock is only evaluated relative to the industry average within its own market cap category. To make industry comparisons fair and statistically sound, we calculate the mean and standard deviation for each metric within the same industry and market cap group, excluding outliers. Outliers are identified using the interquartile range (IQR) method, and removed before calculating averages and z-scores.

Preliminary criteria:

To be considered “undervalued,” a stock must meet all of these:

  • P/FCF > 0 and below the industry average
  • P/B > 0 and below the industry average
  • A/E > 1 and below the industry average
  • ROE > 10%
  • ROA > 5%

Additional criteria:

We further rank stocks based on these bonus signals:

  • P/B < 70% of industry average
  • ROE > industry average
  • ROA > industry average
  • P/E < industry average
  • P/E between 1 and 25

For each criterion met, a stock gets 1 point. The highest score a stock can get is 5 points.

Automating the Mess: Turning Chaos into Clean Data

Initially, my husband manually cleaned and analyzed the raw stock data in Excel. But with 11 sector-specific datasets and dozens of formulas, the process quickly became time-consuming, repetitive, and prone to error. Especially with frequent data updates, each refresh could take hours.

I completely re-engineered the process to be fast, accurate, and user-friendly by implementing a set of tools:

  • Power Query (20+ custom functions): For automatic cleaning and transformation.
  • Advanced Excel formulas: For metric calculation across all datasets.
  • RTD function for real-time stock prices: the RTD (Real-Time Data) function integrates with the Thinkorswim platform to pull real-time stock prices directly into the workbooks.
  • Pivot tables: To calculate industry averages while filtering outliers.
  • VBA Macros: To refresh all data and recalculate results with a single click.
  • Conditional formatting: To surface strong candidates visually, making stock analysis easier for users.
  • Dynamic file paths: So the project works across any machine without broken links.

After processing the data in Excel, I brought the results into Power BI, where I created interactive dashboard that enables non-technical users to explore insights without needing to touch the raw data.

The resulting Excel workbooks form the backbone of this project, acting as the central platform that integrates Power Query, advanced formulas, and VBA macros. These files can function independently as stock scanners for individual sectors, even without the dashboard. However, the dashboard plays a crucial role in bringing everything together, presenting the data in a way that’s more accessible and engaging for non-technical users.

The dashboard showcases top-scoring undervalued stocks by sector and market cap, alongside industry averages for each fundamental metric, to help users evaluate individual stocks and benchmark industries.

👉 Open the dashboard in new tab.

While the dashboard highlights stocks that meet certain criteria and includes summary cards showing the top 3 stocks for each individual metric, it avoids naming a single “best overall” stock. What’s considered “best” can vary widely depending on which metrics an investor prioritizes. This design choice reflects an intentional effort to keep the analysis unbiased and flexible. The goal is to present clean, objective data, not advice.

Final Thoughts

This project started with a shared curiosity and turned into a complete, repeatable workflow that saves hours and delivers clear insights. It was as much about collaboration as it was about technology. Working closely with my husband, a non-technical stakeholder, I used an informal agile approach by delivering in small chunks, gathering feedback, and adapting quickly. Clear communication and step-by-step guides I created helped my husband transition smoothly to the new process. I’m proud to have taken initiative, self-taught new skills ahead of coursework, and built a fully automated system that saves time and drives smarter investment decisions.

The real win? Building something that works for both analysts and non-technical users alike. It reminded me that the best data projects don’t just analyze, they simplify. They turn messy data into clear insights and help people make better, faster decisions. That’s what I aim to build, every time.

Undervalued Stock Trade Tracking

I also built an Undervalued Stock Trade Tracking system that automates the entire process of tracking and analyzing trade history of the Undervalued Stock strategy. Its Python-based ETL pipeline automatically accesses the file system to locate and retrieve the latest Thinkorswim statements, processes new trade history, and updates a clean dataset used for real-time Excel reporting.

Feel free to check it out for more details on how I handle trade history, ETL, and real-time Excel reporting.

👉 See Undervalued Stock Trade Tracking

Undervalued Stock Scanner 2.0

A new version of Undervalued Stock Scanner has been released. Unlike the original, this version moves all data processing and automation from Excel, Power Query, and VBA into modular Python scripts that enable fully automated ETL pipelines, batch processing of sector files, and clean outputs ready for Power BI dashboards.

👉 See Undervalued Stock Scanner 2.0