Is there a best day of the week to invest?

## Best Day of the Week to Invest in the Stock Market

Today I veer firmly into the "random stuff" territory of this blog, and not just figuratively. We are going to look into randomness: stock market randomness. Say you have some money you want to invest in the stock market; maybe you received a bonus or some other lump-sum payout. Statistically speaking, is there a "best" day of the week, Monday through Friday, to invest that money? It turns out this is not a new question: a simple Google search reveals many answers, often contradicting each other! However, the conventional wisdom seems to be that Mondays are the best time to invest. Apparently, the market tends to end down more often on Mondays than on other days of the week. Call it the "Monday blues" effect :)
Yours truly has a skeptical mind though, and to paraphrase Winston Churchill, I do not trust any statistic I did not fake myself! So I decided to dust off my data-science and statistics skills and try to reach my own conclusions. The results may surprise you...

## The data

I'll be using S&P 500 index return data as a proxy for the whole stock market in this analysis. These days, this data is readily available from many sources, but I used Yahoo Finance (link here: https://finance.yahoo.com/quote/%5EGSPC/history/). The data set runs from 1950 to the present day (August 2019 as of this writing). I won't bore you with all the programming details; suffice it to say I used the R statistical language to process the data and generate the analysis. Here's a summary of the complete 1950-2019 data set by day:

### SUMMARY PER DAY:

| | day | mret | mdret | sdret |
|---|---|---|---|---|
| 1 | Mon | -0.128 | -0.0743 | 1.03 |
| 2 | Tue | 0.0282 | 0.0107 | 0.811 |
| 3 | Wed | 0.104 | 0.103 | 0.815 |
| 4 | Thu | 0.0529 | 0.0530 | 0.761 |
| 5 | Fri | 0.108 | 0.116 | 0.752 |

Here mret is the mean return [%], mdret is the median return [%], and sdret is the standard deviation of the return [%].
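For readers who want to reproduce a table like this, here is a minimal sketch in Python (the analysis in this post was done in R, and this is not that script). It assumes the return is the close-to-close percent change and that the input is a date-sorted list of (date, close) pairs; the function names and the toy prices are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, median, stdev

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def daily_returns(rows):
    """Convert date-sorted (date, close) pairs into (date, percent return) pairs."""
    returns = []
    for (_, prev_close), (day, close) in zip(rows, rows[1:]):
        returns.append((day, 100.0 * (close - prev_close) / prev_close))
    return returns

def summary_per_day(returns):
    """Group percent returns by weekday; report mean, median and sample std dev."""
    groups = defaultdict(list)
    for day, ret in returns:
        groups[WEEKDAYS[day.weekday()]].append(ret)
    return {
        name: {"mret": mean(vals), "mdret": median(vals), "sdret": stdev(vals)}
        for name, vals in groups.items()
        if len(vals) >= 2  # the sample standard deviation needs two or more points
    }

# Made-up closing prices, only to exercise the functions (not real index data).
toy_rows = [
    (date(2019, 8, 5), 100.0),   # Monday
    (date(2019, 8, 6), 110.0),   # Tuesday
    (date(2019, 8, 12), 99.0),   # Monday
    (date(2019, 8, 13), 108.9),  # Tuesday
    (date(2019, 8, 19), 98.01),  # Monday
]
for name, stats in summary_per_day(daily_returns(toy_rows)).items():
    print(name, stats)
```

The first row of the price history has no previous close, so it produces no return; every other trading day lands in exactly one weekday group.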

I'll be damned! The Monday mean (and median) returns are indeed lower than those of the other days. The Googles were right. But wait a second... this doesn't prove it, right? One thing that jumps out at you is that the standard deviation is pretty large compared with the means for all days, even more so on Mondays. This suggests there is quite a bit of volatility in this data. To be more rigorous here, we need to conduct a statistical test.
One way to determine whether there is a statistically significant difference between the mean returns for each day is to perform an ANOVA test (aka ANalysis Of VAriance). Here are the results for the complete 1950-2019 data set:

### SUMMARY ANOVA - Pr(>F) < 0.05 MEANS SIGNIFICANT GROUP DIFFERENCES:

| | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| day | 4 | 73 | 18.256 | 25.98 | <2e-16 `***` |
| Residuals | 10049 | 7060 | 0.703 | | |

Significance codes: `***` p < 0.001, `**` p < 0.01, `*` p < 0.05, `.` p < 0.1
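To make the columns of this table less mysterious, here is a minimal Python sketch of how a one-way ANOVA F value is computed from grouped data. It is an illustration under stated assumptions, not the R code behind the table: the function name `one_way_anova` and the toy groups are hypothetical, and the sketch stops at the F statistic rather than computing the p-value.

```python
def one_way_anova(groups):
    """Return (df_between, df_within, F) for a list of lists of observations."""
    k = len(groups)                        # number of groups (5 weekdays here)
    n = sum(len(g) for g in groups)        # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

    # Within-group sum of squares: scatter of each observation around its group mean.
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = k - 1
    df_within = n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return df_between, df_within, f_value

# Made-up groups, only to show the mechanics.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(one_way_anova(toy_groups))
```

The F value is the ratio of the two Mean Sq entries. For the table above, 18.256 / 0.703 is about 25.97, which agrees with the reported 25.98 up to rounding of the displayed figures. In Python, `scipy.stats.f_oneway` performs the same test and also returns the p-value.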

Holy moly! The p-value is minuscule (below 2e-16), so this proves the mean difference between groups is significant, right? Now I'm getting excited.
However, after thinking a bit more, old Winston Churchill's voice returns, whispering in the back of my mind... are these real or fake statistics?

## Not so fast

Let's have a closer look. I'm a visual person and get a kick out of visualizing data sets. So, after playing with the impressive plotting capabilities of the ggplot2 package, I came up with the following plot:

Figure 1 - Returns [%] vs Day - 1950-2019

And just for kicks, here are the corresponding histograms:

Figure 1b - Return Histograms [%] vs Day - 1950-2019

For the fellow data geeks out there, the first plot is a box plot overlaid with a jitter plot. In English: it shows the days of the week on the x-axis, and each point corresponds to the return observed on a given day. The line in the middle of each box is the median for that day, and the box edges mark the first and third quartiles, so the box contains the middle half of that day's returns. What is interesting here is how similar the medians (and the boxes) look, even though Monday's mean is slightly lower. But what is even more interesting is that Monday has quite a number of outliers, including a loss of more than 20% in a single day. Ouch! 1987 market crash, anyone?

In fact, looking in the complete 70-year data-set for the very worst days by return reveals quite a few Mondays:

### WORST PERFORMING DAYS 1950-2019

| | date | close | diff | ret [%] | day |
|---|---|---|---|---|---|
| 1 | 1987-10-19 | 225 | -57.9 | -20.5 | Mon |
| 2 | 2008-10-15 | 908 | -90.2 | -9.03 | Wed |
| 3 | 2008-12-01 | 816 | -80.0 | -8.93 | Mon |
| 4 | 2008-09-29 | 1106 | -107 | -8.81 | Mon |
| 5 | 1987-10-26 | 228 | -20.6 | -8.28 | Mon |
| 6 | 2008-10-09 | 910 | -75.0 | -7.62 | Thu |
| 7 | 1997-10-27 | 877 | -64.7 | -6.87 | Mon |
| 8 | 1998-08-31 | 957 | -69.9 | -6.80 | Mon |
| 9 | 1988-01-08 | 243 | -17.7 | -6.77 | Fri |
| 10 | 2008-11-20 | 752 | -54.1 | -6.71 | Thu |
| 11 | 1962-05-28 | 55.5 | -3.97 | -6.68 | Mon |
| 12 | 2011-08-08 | 1119 | -79.9 | -6.66 | Mon |
| 13 | 1955-09-26 | 42.6 | -3.02 | -6.62 | Mon |
| 14 | 1989-10-13 | 334 | -21.7 | -6.12 | Fri |
| 15 | 2008-11-19 | 807 | -52.5 | -6.12 | Wed |
| 16 | 2008-10-22 | 897 | -58.3 | -6.10 | Wed |
| 17 | 2000-04-14 | 1357 | -83.9 | -5.83 | Fri |
| 18 | 2008-10-07 | 996 | -60.7 | -5.74 | Tue |
| 19 | 1950-06-26 | 18.1 | -1.03 | -5.38 | Mon |
| 20 | 2009-01-20 | 805 | -44.9 | -5.28 | Tue |
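A ranking like this boils down to sorting the returns and counting weekdays among the top entries. Here is a small Python sketch of both steps; the helper name `worst_days` and the toy returns are hypothetical, while the weekday labels are copied from the day column of the table above.

```python
from collections import Counter

def worst_days(returns, n=20):
    """Return the n most negative (label, percent return) pairs, worst first."""
    return sorted(returns, key=lambda pair: pair[1])[:n]

# The "day" column of the table above, in rank order (1 to 20).
table_days = [
    "Mon", "Wed", "Mon", "Mon", "Mon", "Thu", "Mon", "Mon", "Fri", "Thu",
    "Mon", "Mon", "Mon", "Fri", "Wed", "Wed", "Fri", "Tue", "Mon", "Tue",
]
counts = Counter(table_days)
print(counts.most_common())

# Made-up returns, only to show the sorting step.
toy = [("a", 0.4), ("b", -2.1), ("c", -0.3), ("d", 1.2)]
print(worst_days(toy, n=2))
```

Counting the day column this way gives 10 Mondays among the 20 worst days, compared with 2 or 3 for each of the other weekdays.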

So this got me wondering if one would get different results running the analysis on a subset of the data that excluded 1987. I decided to make the cutoff at 1990. This is admittedly a somewhat arbitrary decision, but that never stopped me before. I could, however, make the case that 1990 seems like the start of the modern information era: the beginnings of the internet and the wide availability of computers and cell phones. It also conveniently excludes the 1987 crash :) So how does the analysis look when we include only 1990-2019? Have a look:

### SUMMARY PER DAY:

| | day | mret | mdret | sdret |
|---|---|---|---|---|
| 1 | Mon | 0.0325 | 0.0628 | 1.21 |
| 2 | Tue | 0.0634 | 0.0381 | 1.13 |
| 3 | Wed | 0.0408 | 0.0568 | 1.05 |
| 4 | Thu | 0.0198 | 0.0387 | 1.10 |
| 5 | Fri | 0.0127 | 0.0725 | 1.02 |

### SUMMARY ANOVA - Pr(>F) < 0.05 MEANS SIGNIFICANT GROUP DIFFERENCES:

| | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| day | 4 | 2 | 0.5944 | 0.489 | 0.744 |
| Residuals | 7459 | 9059 | 1.2146 | | |

A p-value of 0.744 (vs less than 2e-16 for the complete data set). Any statistician who claims a statistical difference between groups with such a large p-value will likely be ostracized by the community and covered in tar and chicken feathers. There is no statistically significant difference between the days of the week for the last 30 years. Bummer!
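The subsetting step itself is simple. As a rough Python sketch (again, not the R code used for this post), assume the returns are stored as (date, percent return) pairs; the helper name `since_year` is hypothetical, and the sample rows are taken from the worst-days table earlier in the post.

```python
from datetime import date

def since_year(returns, first_year=1990):
    """Keep only the (date, percent return) pairs dated first_year or later."""
    return [(day, ret) for day, ret in returns if day.year >= first_year]

# A few rows from the worst-days table, used here only to show the filter.
sample = [
    (date(1987, 10, 19), -20.5),
    (date(1988, 1, 8), -6.77),
    (date(1989, 10, 13), -6.12),
    (date(1997, 10, 27), -6.87),
    (date(2008, 9, 29), -8.81),
]
recent = since_year(sample)
print(recent)

# Cross-check of the 1990-2019 ANOVA table: F = Mean Sq (day) / Mean Sq (Residuals).
f_value = 0.5944 / 1.2146
print(round(f_value, 3))  # 0.489, matching the F value column
```

With the cutoff at 1990, both 1987 rows and the other pre-1990 rows drop out of the sample, which is exactly the effect the cutoff was chosen for.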

Figure 2 - Returns [%] vs Day - 1990-2019

## Conclusion

My conclusion after all this analysis is that I don't expect any meaningful difference between investing on Mondays versus any other day of the week. Seems to me that betting on Mondays being down is betting on catching an outlier event like the 1987 crash, or like a few bad Mondays in the 2008 crash.

That is, of course, unless you believe the next three decades will look more like the 1950-1990 decades than like 1990-2019... Hey, if that means we get Elvis and Sinatra back, sign me up!

Comments, questions, suggestions? You can reach me at: contact (at sign) paulorenato (dot) com