
The Functional Art: Exploratory Stock Screening

This disclaimer informs readers that the views, thoughts, and opinions expressed in the text belong solely to the author, and not necessarily to the author's employer, organization, committee or other group or individual. You should not treat any opinion expressed in this article as a specific inducement to make a particular investment or follow a particular strategy, but only as an expression of an opinion.

The aim of this blog post is to demonstrate the use of data visualization for screening stocks. The analysis is limited to U.S. stocks with a market cap greater than $3 billion.

A. Introduction

Exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.

A stock screener is a tool that investors and traders can use to filter stocks based on user-defined metrics.

A stock screener needs a set of variables, as well as values for those variables, to filter/screen stocks.

The variables could be broken down into three types:

  • Fundamental variables such as Return on Equity, Return on Invested Capital, Earnings per Share growth, Sales growth, etc.

  • Market variables such as beta, price, credit spreads, etc.

  • Variables that combine fundamental and market data, such as trailing Price to Earnings per Share (PE), forward PE, forward Price to Book Value per Share, etc.

This blog post uses data visualization to explore the selection of values/criteria for screening stocks.

The selection of variables and their values is very important, as it is what differentiates investors and traders in the marketplace. Every investor has their own philosophy. A simplified classification would be: momentum investors, growth investors, value investors, thematic investors, etc.

For my analysis, I use the following categories and metrics:

1. Categories by Quality of the Firm

To determine the quality of a firm, I used the 3-year average return on invested capital (ROIC). Firms are split into four quartiles: the first quartile holds firms with the lowest 3-Yr Average ROIC (low quality) and the fourth quartile holds firms with the highest 3-Yr Average ROIC (high quality).
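The quartile split described above can be sketched in plain Python. This is only an illustration under stated assumptions: the tickers and ROIC figures are invented, the `quartile_labels` helper and the `Q1`..`Q4` labels are hypothetical names, and the "inclusive" quantile method is an assumed convention rather than the one used for the chart.

```python
from bisect import bisect_left
from statistics import quantiles


def quartile_labels(values):
    """Label each value Q1..Q4 using the sample's own quartile cut points."""
    cuts = quantiles(values, n=4, method="inclusive")  # three cut points
    # bisect_left places a value equal to a cut point in the lower bucket
    return [f"Q{bisect_left(cuts, v) + 1}" for v in values]


# Hypothetical 3-Yr Average ROIC figures (in %), for illustration only
roic = {"AAA": 2.0, "BBB": 6.5, "CCC": 9.0, "DDD": 12.0,
        "EEE": 15.5, "FFF": 21.0, "GGG": 28.0, "HHH": 35.0}

quality = dict(zip(roic, quartile_labels(list(roic.values()))))
for ticker, label in quality.items():
    print(ticker, label)
```

With a data-frame library, the same bucketing is usually a single call (for example `pandas.qcut`), but the logic is the one shown here: compute three cut points, then assign each firm to a bucket.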

3 Yr Average ROIC - Categories and Sectors


Source: Datastream

2. Categories by Value-Growth score

The Value-Growth score is determined by Morningstar. The score is rescaled to a [-100, 400] scale, where -100 represents 'Value' characteristics/factors and 400 represents 'Growth' characteristics/factors.

There are five value factors and five growth factors.

Value factors:

  • Price/Projected Earnings (forward looking) 50.0%

  • Price/Book (historical-based) 12.5%

  • Price/Sales (historical-based) 12.5%

  • Price/Cash Flow (historical-based) 12.5%

  • Dividend Yield (historical-based) 12.5%

Growth factors:

  • Long-term Projected Earnings Growth (forward looking) 50.0%

  • Book Value Growth (historical-based) 12.5%

  • Sales Growth (historical-based) 12.5%

  • Cash Flow Growth (historical-based) 12.5%

  • Historical Earnings Growth (historical-based) 12.5%
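To make the weights concrete, here is a small sketch of how five factor scores combine into one weighted score. It covers only the weighting step: it assumes each factor has already been converted to a 0-100 score, and it does not attempt to reproduce how Morningstar scores individual factors or rescales the result to the [-100, 400] range. The dictionary keys, the `weighted_score` helper and the sample scores are all hypothetical.

```python
# Factor weights as listed above (key names are hypothetical)
VALUE_WEIGHTS = {
    "price_to_projected_earnings": 0.50,
    "price_to_book": 0.125,
    "price_to_sales": 0.125,
    "price_to_cash_flow": 0.125,
    "dividend_yield": 0.125,
}
GROWTH_WEIGHTS = {
    "lt_projected_earnings_growth": 0.50,
    "book_value_growth": 0.125,
    "sales_growth": 0.125,
    "cash_flow_growth": 0.125,
    "historical_earnings_growth": 0.125,
}


def weighted_score(factor_scores, weights):
    """Weighted average of factor scores; assumes the weights sum to 1."""
    return sum(weights[name] * factor_scores[name] for name in weights)


# Made-up factor scores (0-100) for a single stock
value_factors = {
    "price_to_projected_earnings": 80.0,
    "price_to_book": 40.0,
    "price_to_sales": 60.0,
    "price_to_cash_flow": 20.0,
    "dividend_yield": 40.0,
}

value_score = weighted_score(value_factors, VALUE_WEIGHTS)
print(value_score)
```

The forward-looking factor carries as much weight as the four historical factors combined, so a stock's projected earnings dominate either score.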

Value-Growth Score - Categories and Sectors


Source: Morningstar

3. Categories by Foreign Sales % of Total Sales

To measure how global a firm is, I used Foreign Sales % of Total Sales as a proxy. 'Below Median' consists of firms that rely little or not at all on foreign sales, while 'Quartile 3' and 'Quartile 4' consist of firms that depend on foreign sales.

Foreign Sales % of Total Sales - Categories and Sectors


Source: Datastream

4. PE Next Twelve Months (NTM) and PE NTM 10 Year Average are used as proxies for current and historical valuation, respectively.

PE_NTM - Categories and Sectors

B. Impact of Value-Growth Score on the Performance of Stocks & Funds

For screening, we need variables as well as values/criteria. For example, we could run a command that selects stocks if '3 Yr Average ROIC' > 10 and 'Value-Growth Score' > 200.

But the choice of the numbers '10' and '200' is important, and we need to perform exploratory data analysis to gauge which values to use. It could make a huge difference. Determining relevant criteria will help us screen stocks better.
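How much the cut-offs matter can be checked by counting how many stocks survive under a few alternative thresholds. The sketch below does this on a handful of invented stocks; the field names `roic_3y` and `vg_score`, the `count_passing` helper and the candidate thresholds are assumptions made for illustration.

```python
def count_passing(stocks, roic_min, score_min):
    """Number of stocks with ROIC and Value-Growth Score above both cut-offs."""
    return sum(
        1 for s in stocks
        if s["roic_3y"] > roic_min and s["vg_score"] > score_min
    )


# Hypothetical universe, for illustration only
stocks = [
    {"ticker": "AAA", "roic_3y": 8.0, "vg_score": 250},
    {"ticker": "BBB", "roic_3y": 12.0, "vg_score": 210},
    {"ticker": "CCC", "roic_3y": 16.0, "vg_score": 180},
    {"ticker": "DDD", "roic_3y": 22.0, "vg_score": 320},
    {"ticker": "EEE", "roic_3y": 11.0, "vg_score": 90},
]

# Sweep a small grid of candidate thresholds
for roic_min in (10, 15):
    for score_min in (150, 200):
        n = count_passing(stocks, roic_min, score_min)
        print(f"ROIC > {roic_min}, score > {score_min}: {n} stocks")
```

Even on five stocks, moving the ROIC cut-off from 10 to 15 halves the result at a score threshold of 200, which is the kind of sensitivity the charts are meant to reveal before any threshold is fixed.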

Let us take the universe of 1200 stocks and visualize the 'Total Return YTD' vs Sector, Value-Growth Score, Quality and Foreign Sales Category.

Total Return YTD - Sector, Value-Growth Score, Quality and Foreign Sales Category


The chart by itself doesn't say much, except for giving a visual impression that the value-growth score had a positive impact on performance.

It could be observed that many growth stocks, represented in Q3 and Q4 of the 'Value-Growth Score' category, outperformed those in Q1 and Q2. This is not a statistical analysis; further study would be required.

We could perform regression analysis with Y = 'Total Return YTD' and X = ['3 Yr Average ROIC', 'Value-Growth Score']. Here are the results:

We could see that 'Total Return YTD' was positively correlated with 'Value-Growth Score', and that the relationship is statistically significant.
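For readers who want to see what such a regression involves, here is a dependency-free sketch of ordinary least squares with the same two regressors. It is not the analysis behind the results above: the data are synthetic and constructed so that the true coefficients are known, and `ols_coefficients` is a hypothetical helper. It only recovers coefficients; judging statistical significance also needs standard errors and t-statistics, which a statistics package would normally report.

```python
def ols_coefficients(X, y):
    """Least-squares fit via the normal equations (X'X) b = X'y.

    Returns [intercept, b1, b2, ...]. Assumes the regressors are not collinear.
    """
    rows = [[1.0, *x] for x in X]  # prepend a constant for the intercept
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Synthetic data: (3 Yr Average ROIC, Value-Growth Score) per stock
X = [(5, 50), (10, 300), (15, 100), (20, 250), (25, 150), (30, 350)]
# Returns constructed as 1 + 0.5 * ROIC + 0.05 * score, so the fit is exact
y = [1 + 0.5 * roic + 0.05 * score for roic, score in X]

intercept, b_roic, b_score = ols_coefficients(X, y)
print(round(intercept, 6), round(b_roic, 6), round(b_score, 6))
```

A positive coefficient on the Value-Growth Score is what "positively correlated" means here, holding ROIC fixed.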

We could extend the analysis by collecting all the US Large Cap & Mid Cap separate accounts (funds). We could calculate the 'Value-Growth Score' of each fund based on its historical holdings, and then categorize the funds into quartiles (called quadrants below), where Quadrant 4 has the most 'growth-centric' funds and Quadrant 1 has the most 'value-centric' funds.

Similarly, Quadrant 4 of the Risk-Adjusted Return Category consists of the top-performing funds, while Quadrant 1 consists of the worst-performing funds.

We could visualize the role of Value-Growth Score in determining 'Risk Adjusted Return' of those funds.

Value-Growth Score and Risk Adjusted Return - US Large/Mid Cap Funds

Source: Morningstar

Quadrant 4 of Risk-Adjusted Return Category has:

  1. 77% of its funds from Quadrant 4 of Value-Growth Score Category.

  2. 17% of its funds from Quadrant 3 of Value-Growth Score Category.

  3. 5% of its funds from Quadrant 2 of Value-Growth Score Category.

  4. 1% of its funds from Quadrant 1 of Value-Growth Score Category.
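A breakdown like the one above is a simple cross-tabulation: take the funds in one performance bucket and compute the share that falls in each Value-Growth bucket. The sketch below shows the calculation on invented funds, so its shares will not match the percentages listed; the `bucket_shares` helper and the tuple layout are hypothetical.

```python
from collections import Counter


def bucket_shares(funds, perf_bucket):
    """Share of each Value-Growth bucket among funds in one performance bucket."""
    picked = [vg for vg, perf in funds if perf == perf_bucket]
    if not picked:
        return {}
    counts = Counter(picked)
    return {bucket: counts[bucket] / len(picked) for bucket in sorted(counts)}


# Hypothetical funds as (Value-Growth bucket, Risk-Adjusted Return bucket)
funds = [
    ("Quadrant 4", "Quadrant 4"),
    ("Quadrant 4", "Quadrant 4"),
    ("Quadrant 4", "Quadrant 4"),
    ("Quadrant 3", "Quadrant 4"),
    ("Quadrant 1", "Quadrant 1"),
    ("Quadrant 2", "Quadrant 1"),
    ("Quadrant 3", "Quadrant 2"),
    ("Quadrant 2", "Quadrant 3"),
]

top_shares = bucket_shares(funds, "Quadrant 4")
for bucket, share in top_shares.items():
    print(f"{bucket}: {share:.0%}")
```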

Even without a regression or other statistical analysis, we could say that there is some sort of positive relationship between the performance of the funds and their Value-Growth Score.

We could also demonstrate this relationship using a scatter plot.

Value-Growth Score vs Risk Adjusted Return - US Large/Mid Cap Funds

Source: Morningstar

It's fair to wonder what the utility of these visuals is. I can think of the following actions:

1. Invest in growth funds/stocks over value funds/stocks. This has worked historically, but in the future interest rates could rise (depending on your macro outlook) and disturb the correlation between value-growth score and performance.

2. Invest in value funds/stocks that outperformed their peers. This could work if the historical relation remains consistent and interest rates climb up.

3. Use modern screens such as:

  • a. Growth and/or Quality at a reasonable price.

  • b. Great price for modest growth and/or quality.

I will use the third method, part 'a' (Growth and/or Quality at a reasonable price), to screen stocks from the universe of ~1200 stocks. For example, we could use the following criteria:

  • Quality Category == 'Quadrant 4' ~ high quality &

  • Value-Growth Score Category == 'Quadrant 4' ~ high growth &

  • PE NTM < PE_NTM_10_Year_Average &

  • PE NTM < PE_NTM_Industry_Median.
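The four criteria above translate directly into a filter. The following is a minimal sketch on made-up stocks, not the command that produced the output: the `Stock` record, its field names and the `passes_screen` helper are hypothetical, while the category labels and comparisons follow the list above.

```python
from typing import NamedTuple


class Stock(NamedTuple):
    ticker: str
    quality_cat: str
    vg_cat: str
    pe_ntm: float
    pe_ntm_10y_avg: float
    pe_ntm_industry_median: float


def passes_screen(s: Stock) -> bool:
    """High quality and high growth, priced below own history and industry."""
    return (
        s.quality_cat == "Quadrant 4"
        and s.vg_cat == "Quadrant 4"
        and s.pe_ntm < s.pe_ntm_10y_avg
        and s.pe_ntm < s.pe_ntm_industry_median
    )


# Hypothetical universe, for illustration only
universe = [
    Stock("AAA", "Quadrant 4", "Quadrant 4", 18.0, 22.0, 20.0),  # meets all four
    Stock("BBB", "Quadrant 4", "Quadrant 4", 25.0, 22.0, 30.0),  # above own 10-year average
    Stock("CCC", "Quadrant 3", "Quadrant 4", 12.0, 15.0, 16.0),  # quality too low
    Stock("DDD", "Quadrant 4", "Quadrant 4", 19.0, 21.0, 18.0),  # above industry median
]

selected = [s.ticker for s in universe if passes_screen(s)]
print(selected)
```

Because the conditions are joined with "and", each one can only shrink the list, so it is worth checking how many stocks each condition removes on its own before combining them.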

Here is the output of the command:

Source: Datastream

The next step could be to perform fundamental analysis on the selected firms and value them using different methods. You could learn how by taking Professor Damodaran's Valuation class, which is available online.

C. Sectorial Exploratory Data Visualization

Total Return YTD - Sector, Value-Growth Score, Quality and Foreign Sales Category of Individual Sectors

Information Technology




Consumer Discretionary

Consumer Staples




You could use many different variables based on your needs and philosophy, and arrive at many different actions. I would like to conclude with my belief that data analysis and data visualization can give you a visual understanding of the dataset you are dealing with, which would save you a lot of time and energy.
