Validating Screener Criteria with Multiple Regression

Stock Rover Communications Manager Alex Reisman devised a sequence of stock screeners that paired mid-cap stocks one-on-one in a simulated tournament modeled on the NCAA basketball tournament, playfully called March Midness. Her “tournament” consisted of 6 rounds, with one competitor knocked out in each match-up. Each round evaluated stocks on a particular scale or parameter that, intuitively, should be strongly indicative of a stock’s potential to appreciate in the future, based on the stock’s past history.

For several of the rounds, Alex used the ranked screener feature (formerly called quants) of Stock Rover to separately weight criteria and arrive at a score between 0 and 100 to determine the match-up winners. The first round rated the company’s growth record, based on consistent growth in sales, income and EPS. The second round evaluated the financial soundness of the enterprise based on cash flow and various debt ratios. The third round checked corporate efficiency by looking at operating margins and returns on assets, capital and equity, as well as a low level of receivables. In the fourth round, the company’s competitive advantage was gauged by a combination of subjective criteria, which, although important, will not be part of this blog post, and objective ratings based on a variety of performance metrics compared to each company’s own industry. The fifth round looked for favorable valuation derived from the ratios of price to book, earnings and sales, and EV/EBITDA. The sixth and final screen centered on volatility, considered as a proxy for risk, and comprised low beta and price standard deviation over 1- and 3-year periods.

Developing a Global Predictor

All of these metric scales have what has sometimes been called “construct validity,” which, put simply, is the degree to which a test measures what it purports to measure. Growth, financial soundness, corporate efficiency, competitive advantage, valuation and risk all sound as if they should be important to the overall success of a company. However, as investors, what concerns most of us is whether this analysis can help us predict the stock price in six months, one year, or two, five or ten years. Moreover, if each of these scales is equally important then we should be able to add them up and obtain an even better global predictor for each company.

What I did was just that. I took a sample of 300 stocks covering large and small cap stocks and stocks in all of the sectors of the S&P 500 for which full data were available from Stock Rover. I then took the numerical quant score for each category (growth, etc.) and summed to get a total score which I correlated (using Pearson product moment correlation) with the percent gain or loss in the stock price over 6 months, 1 year, and 2, 5 and 10 years. The 300 stocks I used and their quant scores are listed in the ‘Dataset’ worksheet of the .xlsx file available for download at the bottom of the page. Details about all the metrics used to determine the quant scores can be found in the ‘Screener Criteria’ worksheet of the same document.
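As a rough illustration of the summing-and-correlating step, here is a minimal Python sketch. The scores and returns below are invented for the example and are not taken from the dataset, and the helper name pearson_r is hypothetical rather than anything provided by Stock Rover.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Hypothetical data: one row per stock, six category scores (0-100) in the
# order growth, soundness, efficiency, advantage, valuation, volatility.
category_scores = [
    (72, 65, 58, 60, 40, 55),
    (45, 80, 62, 50, 70, 35),
    (88, 40, 75, 66, 30, 60),
    (30, 55, 48, 42, 85, 45),
    (64, 70, 52, 58, 55, 50),
    (51, 35, 66, 47, 62, 72),
]
# Hypothetical percent gain or loss over one holding period for each stock.
percent_returns = [12.5, -3.0, 21.0, -7.5, 6.0, 1.5]

# Equal weighting: the total score is simply the sum of the six categories.
total_scores = [sum(row) for row in category_scores]

r = pearson_r(total_scores, percent_returns)
print(f"r between total score and return: {r:.2f}")
```

The same function can be applied to one category at a time (for example, the growth column alone) to see how each scale correlates with returns on its own.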

The overall results were disappointing. First, the correlations were of a very low order: the best was r=0.28, between total score and 5-year share price growth. The correlations between the individual parameters were not much better, and the best was between growth and total score (r=0.37). These correlations can be found in the table at the bottom of the ‘Dataset’ worksheet in the .xlsx document that is available for download below.

Changing the Approach

This disappointment led to a rethinking about how to best combine the six parameters to optimize the predictive powers of each. Although Alex and I agreed that ultimately the weighting of the scoring system depends entirely on personal preference, there is a statistical procedure called multiple regression that can find the optimal weighting for us.

The purpose of multiple regression is to learn more about the relationship between several independent or predictor variables and a dependent or criterion variable. For example, a real estate agent might record for each listing the size of the house (in square feet), the number of bedrooms, the average income in the respective neighborhood according to census data, and a subjective rating of appeal of the house. Once this information has been compiled for various houses it would be interesting to see whether and how these measures relate to the price for which a house is sold. For example, you might learn that the number of bedrooms is a better predictor of the price for which a house sells in a particular neighborhood than the house’s square footage. You may also detect “outliers,” that is, houses that should really sell for more, given their location and characteristics.

Similarly, I decided to combine the parameters of growth, financial soundness, corporate efficiency, competitive advantage, valuation and risk to obtain a best correlation with stock price. If you have an interest in statistics and want to see the details of how I did this, refer to the time-period worksheet tabs of the attached .xlsx document (6 months, 1 yr, etc.).
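For readers who prefer code to spreadsheets, the idea of letting regression choose the weights can be sketched in a few lines of Python. This is not the procedure used for the worksheets; it is a minimal ordinary least squares fit via the normal equations, using only three of the six parameters and invented numbers to keep it short. All function names and data are hypothetical.

```python
from math import sqrt


def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Return [intercept, w1, ..., wk] minimizing the squared error."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 = intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    return coefs[0] + sum(w * v for w, v in zip(coefs[1:], row))


def multiple_r(coefs, rows, y):
    """Multiple correlation R: square root of the variance explained."""
    mean_y = sum(y) / len(y)
    ss_total = sum((t - mean_y) ** 2 for t in y)
    ss_resid = sum((t - predict(coefs, r)) ** 2 for r, t in zip(rows, y))
    return sqrt(max(0.0, 1.0 - ss_resid / ss_total))


# Hypothetical scores per stock: (growth, valuation, volatility).
scores = [(80, 40, 30), (65, 55, 45), (90, 30, 60), (50, 70, 50),
          (70, 75, 35), (40, 60, 65), (85, 20, 55), (55, 65, 40)]
# Hypothetical percent returns for the same stocks.
returns = [14.0, 6.5, 9.0, -2.0, 3.5, -8.0, 12.0, 1.0]

coefs = fit_multiple_regression(scores, returns)
for name, weight in zip(("intercept", "growth", "valuation", "volatility"), coefs):
    print(f"{name:>10}: {weight:+.3f}")
print(f"multiple R: {multiple_r(coefs, scores, returns):.2f}")
```

A positive fitted weight means a higher score on that scale goes with a higher predicted return, and a negative weight means the opposite. With real data, a statistics package would also report a p-value for each weight.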


Here is a summary of the total score, with all parameters (growth, etc.) combined, for how well they correlate to stock price growth over the different periods:
[Table: multiple regression summary]

As you can see above, for the 300 stocks in our dataset, the 6-month and 10-year periods produced very low order correlations (r=0.28 and r=0.30 respectively). The multiple correlations for 1 year, 2 years and 5 years were somewhat better, at 0.41, 0.44 and 0.49 respectively. Although the correlations are still of fairly low order, the low p-values indicate that they are very unlikely to be due to chance alone.
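To put these figures in perspective, squaring a multiple correlation gives the share of variance in returns that the model accounts for, so R=0.49 corresponds to roughly 24% and R=0.28 to roughly 8%. The short Python sketch below does this arithmetic for each period and also computes the overall F statistic that a significance test would be based on. It assumes that all six parameters were entered as predictors in every model (k=6) with n=300 stocks; that is my reading of the setup, not something stated in the tables.

```python
# Multiple correlations reported in the text for each holding period.
reported_r = {
    "6 months": 0.28,
    "1 year": 0.41,
    "2 years": 0.44,
    "5 years": 0.49,
    "10 years": 0.30,
}

N_STOCKS = 300      # sample size given in the text
N_PREDICTORS = 6    # assumption: all six parameters in each model


def r_squared(r):
    """Proportion of variance explained by the regression."""
    return r * r


def f_statistic(r, n, k):
    """Overall F statistic for a regression with k predictors and n cases."""
    r2 = r * r
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


for period, r in reported_r.items():
    r2 = r_squared(r)
    f = f_statistic(r, N_STOCKS, N_PREDICTORS)
    print(f"{period:>9}: R={r:.2f}  R^2={r2:.3f}  F({N_PREDICTORS}, "
          f"{N_STOCKS - N_PREDICTORS - 1})={f:.1f}")
```

The F value is compared against an F distribution with k and n-k-1 degrees of freedom to obtain the p-value. With a sample of 300, even a modest R can be statistically reliable while still leaving most of the variation in returns unexplained.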

The procedure furthermore revealed that the contribution of each of the six parameters was far from equal. Financial health and corporate efficiency were never significant predictors of stock performance at the 1-year, 2-year or 5-year intervals. Growth, by contrast, was the most consistent predictor across all the time intervals.

Prior to calculation, one might have presumed a positive correlation between stock price and both valuation and volatility. Paradoxically, valuation and volatility were notable for their negative weights in the multiple regression. In other words, the higher the valuation and volatility scores, the lower the predicted stock price after 1 year, 2 years, and 5 years. Tables showing the information described in this section can be found in the attached PDF document at the bottom of this page. Only statistically significant correlations were included in the tables.

Concluding Thoughts

The dataset in this exercise included only 300 (about 4%) of the nearly 8,000 stocks available to Stock Rover. Consequently, the generalizability of the results is subject to sampling error. The choice of stocks included in the dataset was relatively arbitrary, although an effort was made to include representative stocks from all sectors of the S&P 500 and both large and small capitalization stocks. Companies with fewer than 10 years of growth history were also excluded.

Despite the risk of sampling bias, our tentative conclusion is that growth was clearly the best predictor of future stock price, and that valuation and volatility can improve the accuracy of the prediction, whereas financial health and corporate efficiency were surprisingly weak predictors. In the future, we would like to test these conclusions by running this exercise again on different stock populations.


Download a summary of the Multiple Regression Tables (PDF)

Allan Smith is a retired neuroscientist and now an enthusiastic, but unofficial, member of Stock Rover’s beagle pack of stock-hunters. Please send comments to