Who wants Warren Buffett’s $1 Billion?

Last week Warren Buffett offered up a $1 Billion prize for anyone picking a perfect NCAA tournament bracket this March. Just exactly how difficult could it be to pick a perfect bracket? I set out to answer that mathematically. After a couple of days of analysis and Monte Carlo simulations I have come up with an answer:

1 in 500,000,000,000  (1 in 500 billion)

For comparison, the odds of winning Powerball are about 1 in 175,000,000. So for every person out there with a perfect NCAA bracket there would be about 2,860 lottery winners. Daunting, but do-able.

How could I come up with such a number? Read on.

To answer this problem you first have to know how the NCAA brackets work. Not counting the first 4 teams in (which usually don’t factor into the bracket anyway), 64 teams are slotted into a 6 round, single elimination tournament. In order to get from 64 teams down to 1, precisely 63 teams must lose 1 game. So, if you assume that each game is a 50/50 crapshoot, then the math is simple: the odds are 1 in 2^63, or about 1 in 9.2 quintillion (9.2 * 10^18). Now that indeed would be an impossible task. To put that number in perspective, for each correct bracket with those odds there would be 52 billion Powerball winners. Another way to wrap your mind around that number is to visualize that there are 8*10^17 square inches on the surface of Earth (including all oceans, etc), so if you placed 11 brackets on every square inch of Earth and they were all unique (no repeated brackets), you would have roughly one winner. Odds are that one square inch of real estate is not in your yard.
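The coin-flip arithmetic above is easy to check. This is just a quick sketch using the figures quoted in this post (the Powerball odds and the square-inch estimate are the post's numbers, and the variable names are mine):

```python
# Figures taken from the post; names are illustrative only.
GAMES = 63                          # games in a 64-team single elimination bracket
POWERBALL_ODDS = 175_000_000        # "about 1 in 175,000,000"
EARTH_SQUARE_INCHES = 8e17          # rough surface area of Earth in square inches

coin_flip_brackets = 2 ** GAMES     # every game treated as a 50/50 pick

powerball_winners_per_bracket = coin_flip_brackets / POWERBALL_ODDS
brackets_per_square_inch = coin_flip_brackets / EARTH_SQUARE_INCHES

print(f"possible brackets:         {coin_flip_brackets:,}")
print(f"Powerball winners each:    {powerball_winners_per_bracket:,.0f}")
print(f"brackets per square inch:  {brackets_per_square_inch:.1f}")
```

The first line printed is the exact value of 2^63; the other two are the comparisons used in the paragraph above.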

But each game is not a 50/50 crapshoot, and there lies the tantalizing possibility of bringing those odds way, way down in your favor. For instance a #16 seed has never beaten a #1 seed. They have been playing this bracket format since I was a kid and it has not happened yet. Similarly, a #2 seed rarely loses to a #15 seed. And if a #15 seed wins a game, it rarely wins a 2nd game. So we can use the fact that some teams are better than others to weight the probabilities across the whole bracket.

To model this I used a Monte Carlo simulation. Label each team A1-A16 (first region) through D1-D16 (last region). Assign “ping-pong balls” to each team based on rank. A #1 seed gets 99 ping-pong balls (or their virtual equivalent). #2 gets 95, #3 gets 90, all the way down to #16 getting 1 ball. When 2 teams play, draw a random number between 0 and 1 for each team and then multiply that number by the team's ping-pong ball rating. The team with the higher final number wins.

Using this method, a #1 seed will beat a #16 seed roughly 99 times out of 100. Similarly, a #1 seed will beat a #3 seed roughly 99 times out of 99+90=189 (just over 50%). We can actually adjust these weights based on historical brackets. Here is the actual Perl script I used to do this simulation.
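Here is a minimal Python sketch of a single game under the ping-pong ball rule, not the Perl script itself. The function names are mine, and only the ball counts named above (99, 90 and 1) are used. It also works out the exact win chance for the "uniform random number times balls" rule: when team A has at least as many balls as team B, A wins with probability 1 - B/(2A). That gives about 99.5% for #1 vs #16 and about 54.5% for #1 vs #3, close to the ball-ratio figures above but not identical.

```python
import random

def win_probability(balls_a, balls_b):
    """Exact chance that A beats B when each side draws uniform(0, 1) times its balls."""
    if balls_a >= balls_b:
        return 1 - balls_b / (2 * balls_a)
    return balls_a / (2 * balls_b)

def simulate_win_rate(balls_a, balls_b, trials=200_000, seed=1):
    """Play the same matchup many times and return the share of games A wins."""
    rng = random.Random(seed)
    wins = sum(
        rng.random() * balls_a > rng.random() * balls_b
        for _ in range(trials)
    )
    return wins / trials

for label, a, b in [("#1 vs #16", 99, 1), ("#1 vs #3", 99, 90)]:
    print(label, "exact:", round(win_probability(a, b), 4),
          "simulated:", round(simulate_win_rate(a, b), 4))
```

The simulated rate should sit very close to the exact one, which is a handy sanity check before tuning the weights against historical brackets.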

Now, getting a perfect bracket is hard enough. If this year a bunch of #16-#13 seeds make runs deep into the tournament that scenario would be impossible to predict with the weights I have assigned to the teams. My weights make it much more likely that the top teams will make the deepest tournament runs, which models reality.  To test the model I used the 2008 bracket as the model bracket I want the computer to come up with. Why 2008? That was a less-madness year, where all 4 #1 seeds made the final four, and there were few upsets throughout the brackets.

My script simulates a whole tournament, multiple times. I wanted to see how many times the computer would have to simulate the tournament to get all 63 games right (compared to the 2008 perfect bracket).
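A rough Python sketch of that loop follows; it is not the original script. Several things here are assumptions: only the ball counts for seeds #1, #2, #3 and #16 come from this post and the rest are placeholders, the pairing order within a region is the usual 1v16, 8v9, 5v12, and so on, and the "reference" bracket is just one simulated tournament standing in for the real 2008 results, which are not reproduced here.

```python
import random

# Only seeds 1, 2, 3 and 16 use ball counts from the post; the rest are placeholders.
BALLS = {1: 99, 2: 95, 3: 90, 4: 85, 5: 80, 6: 75, 7: 70, 8: 65,
         9: 60, 10: 55, 11: 50, 12: 40, 13: 30, 14: 20, 15: 10, 16: 1}

# First-round pairing order inside one region (adjacent seeds play each other).
REGION_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]

def play(team_a, team_b, rng):
    """A team is (region, seed); the higher random-times-balls score wins."""
    score_a = rng.random() * BALLS[team_a[1]]
    score_b = rng.random() * BALLS[team_b[1]]
    return team_a if score_a > score_b else team_b

def simulate_tournament(rng):
    """Return the 63 game winners, round by round, in bracket order."""
    field = [(region, seed) for region in "ABCD" for seed in REGION_ORDER]
    winners = []
    while len(field) > 1:
        field = [play(field[i], field[i + 1], rng) for i in range(0, len(field), 2)]
        winners.extend(field)
    return winners

def count_matches(picks, reference):
    """Number of games where the simulated winner equals the reference winner."""
    return sum(p == r for p, r in zip(picks, reference))

def best_after(iterations, reference, rng):
    """Best score seen over a number of simulated brackets."""
    best = 0
    for _ in range(iterations):
        best = max(best, count_matches(simulate_tournament(rng), reference))
    return best

rng = random.Random(2008)
reference = simulate_tournament(rng)   # stand-in for the real 2008 bracket
best = best_after(10_000, reference, rng)
print("best score after 10,000 brackets:", best, "out of", len(reference))
```

Swapping in the real results for `reference` and raising the iteration count is all it would take to repeat the experiment, though the run time grows quickly.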

The first time through, the computer got 34 games right (out of a possible 63). The second and third tries yielded fewer than 34 correct games, but the 4th try improved on the result and correctly predicted 45 games. From there it takes a lot of iterations to get more games accurately predicted. In fact, here are the numbers I got.

Number of correct picks (compared to 2008 perfect bracket)    Iterations the computer needed to achieve this result
34    1
45    4
47    236
48    6,144
49    9,688
50    19,770
51    101,360
52    212,544
53    351,630
54    2,162,574
55    33,794,131
56    43,477,602
63    500,000,000,000 (est)

As you can see, the number of iterations climbs steeply past 50 correctly predicted games. My computer can crunch about 10,000 brackets a second, but at that rate getting to 500,000,000,000 would still take about 50 million seconds, or roughly a year and a half of nonstop computing. So then, how do I arrive at 500,000,000,000? Graph the known data points on a logarithmic scale, and extrapolate where 63 lands. A decently straight line to me shows about 500,000,000,000.
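The eyeball extrapolation can also be done with a least-squares line through log10(iterations). This is only a sketch of one way to do it. I am assuming it is fair to leave out the first two rows (34 and 45 correct), since they come from the first handful of tries and sit well off the trend; including them would change the answer.

```python
import math

# (correct picks, iterations needed), copied from the table; first two rows left out.
RESULTS = [
    (47, 236), (48, 6_144), (49, 9_688), (50, 19_770), (51, 101_360),
    (52, 212_544), (53, 351_630), (54, 2_162_574), (55, 33_794_131),
    (56, 43_477_602),
]

def fit_log_line(points):
    """Least-squares fit of log10(iterations) = slope * picks + intercept."""
    xs = [picks for picks, _ in points]
    ys = [math.log10(iterations) for _, iterations in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_log_line(RESULTS)
estimate = 10 ** (slope * 63 + intercept)

print(f"each extra correct game multiplies the work by about {10 ** slope:.1f}x")
print(f"estimated iterations for 63 correct: {estimate:.2e}")
```

With these ten points the fitted line puts 63 correct picks on the order of 10^11 iterations, the same ballpark as the hand-drawn estimate. With so few points and such a long extrapolation, a factor of a few either way is well within the noise.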

So, tell me.  What are you going to do with your billion?

Thoughts from Pasadena

The Tuesday morning after the BCS title game this showed up on my Twitter account:

 

I have not clicked the link. I have not read the article. I do not intend to do so either.

In the aftermath of the Rose Bowl BCS title game everything was simply too painful for me. There was no way I could listen to the post-game radio call. There was no way I could look up the game stats on ESPN ScoreCenter. There was no way I could read the post-game write-up in USA Today. Mind you, this was not a technology limitation: I did possess a smartphone, and my hotel room gave me a perfectly good copy of USA Today on Tuesday morning. However, I still could not, cannot, and likely never will be able to go back there.

What follows are my thoughts from the game.

On the game itself

I don’t need a stat sheet to tell me that Auburn likely outgained FSU by 50-100 yards. I also remember Jameis Winston getting sacked and hit multiple times (I’d guess 4 sacks and 10 total hits). Auburn was clearly the more dominant team, but they had those 21 points in ~20 minutes of football, and then suddenly no more for a long time. That’s where the game was lost. As well as the defense was playing, you can’t go scoreless for ~30 minutes and expect your defense to continue to play at a high level.

I will admit, the final Auburn touchdown had me pretty excited, but the long pass play by FSU on their final drive had me sitting down in my chair. I could not bring myself to watch. Fortunately there are others who know the pain; this lady in front of me did the same thing.

On asking the other team to take your picture

Check out this pair. The FSU fan on the left asked the Auburn fan to his right to take his pic at the stadium and gave the Auburn fan his phone.  The Auburn fan obliged, snapped a pic and then, um, dropped the phone while returning it to him. The result was a ridiculously cracked screen you can see in the post.

The wall of champions outside the Rose Bowl

At the front entrance of the Rose Bowl is a series of plaques depicting the Rose Bowl champions back to 1910 or so. All of them have gold lettering except one: the 2002 Rose Bowl, which OU won. That game was also the first non-sellout for the Rose Bowl ever and the first non-title game here that did not involve a Big Ten-Pac-10 matchup. Coincidence? I doubt it.

The Goodyear blimp

During the whole game the Goodyear blimp circled overhead. Strangely, the only message it displayed was “check your tire pressure monthly”. Really? At the national championship game? Do people really want to think about tire pressure at that time?

Scoreboard cover photo

I so very desperately wanted to take a scoreboard pic and use it as my FB cover photo for the next 365 days. Alas, this is as close as I came :(

Sunrise in Pasadena

It was a beautiful day weather-wise here. While Norman was 7° this morning with a wind chill of 0, we were warm and toasty in the mid-70s. The next morning we had our feet in the sand at the beach. It is very hard to be in Norman, Oklahoma when you know Pasadena is out there.

Time for a new Blog

Happy 2014!

It is time for agafamily.com to have a new blog home.

Why?

I first started blogging on a laptop in my house in 2006, a bit before Facebook became the go-to app of everyone’s spouse. I got to learn some basics on HTML and how to control the message (my direct HTML editing) and the location of the message (a laptop in my house).

At that time I used Blogger by Google to do some of the text editing. Well, somewhere around 2009 Blogger changed something and I lost the ability to quickly and easily edit my blog. So there have been very few blog posts over the past 4 years.

I figure it is time to now move to WordPress and move to the cloud.

So, hello world! Here I am again!

Backtesting the “January effect” theory

If you trade stocks you have no doubt heard of the January effect. It goes something like this:

“The stock market for the year does the same that it does in January. If the month of January is up, then stocks will be up for the full year. If the month of January is down, then stocks will be down for the full year. Furthermore, the 1st trading day in January is a predictor for the month of January and thereby a predictor for the full year”

Well, I thought it would be a good exercise to start the year by testing this theory. As good as last year’s 30% gains were, I sure would not want to go through 30% losses this year :(

Specifically I wanted to test the 1st day theory and test it with the NASDAQ index. Most of my holdings today are in QQQ or TQQQ, both of which are NASDAQ 100 ETFs.

I wanted to go back through at least my lifetime (from 1980 on), since I remember the NASDAQ even back then was talked about as the “tech index”. Here are the numbers. There are two rows for each year: the NASDAQ close on the first trading day of January, then the close on the last trading day of December along with the gains and the verdict:

Year    Date of NASDAQ close    Adj Close    Percentage Gain for year    Percentage Gain for 1st trading day of year    Verdict on Swami…
31-Dec-80        202.34
2-Jan-81        203.55
1981    31-Dec-81        195.84    -4%    0.60%    Incorrect Positive Prediction
4-Jan-82        195.53
1982    31-Dec-82        232.41    19%    -0.16%    Incorrect Negative Prediction
3-Jan-83        230.59
1983    30-Dec-83        278.6    21%    -0.78%    Incorrect Negative Prediction
3-Jan-84        277.63
1984    31-Dec-84        247.1    -11%    -0.35%    Correct Negative Prediction
2-Jan-85        245.9
1985    31-Dec-85        324.9    32%    -0.49%    Incorrect Negative Prediction
2-Jan-86        325
1986    31-Dec-86        348.8    7%    0.03%    Correct Positive Prediction
2-Jan-87        353.2
1987    31-Dec-87        330.5    -6%    1.26%    Incorrect Positive Prediction
4-Jan-88        338.5
1988    30-Dec-88        381.4    13%    2.42%    Correct Positive Prediction
3-Jan-89        378.6
1989    29-Dec-89        454.8    20%    -0.73%    Incorrect Negative Prediction
2-Jan-90        459.3
1990    31-Dec-90        373.8    -19%    0.99%    Incorrect Positive Prediction
2-Jan-91        372.2
1991    31-Dec-91        586.34    58%    -0.43%    Incorrect Negative Prediction
2-Jan-92        586.45
1992    31-Dec-92        676.95    15%    0.02%    Correct Positive Prediction
4-Jan-93        671.8
1993    31-Dec-93        776.8    16%    -0.76%    Incorrect Negative Prediction
3-Jan-94        770.76
1994    30-Dec-94        751.96    -2%    -0.78%    Correct Negative Prediction
3-Jan-95        743.58
1995    29-Dec-95        1,052.13    41%    -1.11%    Incorrect Negative Prediction
2-Jan-96        1,058.65
1996    31-Dec-96        1,291.03    22%    0.62%    Correct Positive Prediction
2-Jan-97        1,280.70
1997    31-Dec-97        1,570.35    23%    -0.80%    Incorrect Negative Prediction
2-Jan-98        1,581.53
1998    31-Dec-98        2,192.69    39%    0.71%    Correct Positive Prediction
4-Jan-99        2,208.05
1999    31-Dec-99        4,069.31    84%    0.70%    Correct Positive Prediction
3-Jan-00        4,131.15
2000    29-Dec-00        2,470.52    -40%    1.52%    Incorrect Positive Prediction
2-Jan-01        2,291.86
2001    31-Dec-01        1,950.40    -15%    -7.23%    Correct Negative Prediction
2-Jan-02        1,979.25
2002    31-Dec-02        1,335.51    -33%    1.48%    Incorrect Positive Prediction
2-Jan-03        1,384.85
2003    31-Dec-03        2,003.37    45%    3.69%    Correct Positive Prediction
2-Jan-04        2,006.68
2004    31-Dec-04        2,175.44    8%    0.17%    Correct Positive Prediction
3-Jan-05        2,152.15
2005    30-Dec-05        2,205.32    2%    -1.07%    Incorrect Negative Prediction
3-Jan-06        2,243.74
2006    29-Dec-06        2,415.29    8%    1.74%    Correct Positive Prediction
3-Jan-07        2,423.16
2007    31-Dec-07        2,652.28    9%    0.33%    Correct Positive Prediction
2-Jan-08        2,609.63
2008    31-Dec-08        1,577.03    -40%    -1.61%    Correct Negative Prediction
2-Jan-09        1,632.21
2009    31-Dec-09        2,269.15    39%    3.50%    Correct Positive Prediction
4-Jan-10        2,308.42
2010    31-Dec-10        2,652.87    15%    1.73%    Correct Positive Prediction
3-Jan-11        2,691.52
2011    30-Dec-11        2,605.15    -3%    1.46%    Incorrect Positive Prediction
3-Jan-12        2,648.72
2012    31-Dec-12        3,019.51    14%    1.67%    Correct Positive Prediction
2-Jan-13        3,112.26
2013    31-Dec-13        4,176.59    34%    3.07%    Correct Positive Prediction
2-Jan-14        4,143.07
2014                ???    -0.80%

All told, that’s 33 years.

23 up years and 10 down years (up 70% of the years)

14 correct positive predictions
4 correct negative predictions
6 Incorrect Positive Predictions
9 Incorrect Negative Predictions
In total, that’s 18 correct predictions and 15 incorrect predictions (55% correct).

Some interesting things fall out of this table. For instance, do you remember that on Jan 2, 2001 the NASDAQ dropped 7.23% on the 1st trading day of January? Whew!

At first glance you may say that something that predicts with 50+% accuracy is a good predictor. That is not correct. You have to remember the stock market is not a random coin flip (where 55% prediction accuracy would be great!) but something that tends to rise over time. So a predictor that says “Any year beginning with the digit 1 or 2 will be an up year” would have been right 70% of the time over this period. In that light, a January effect predictor would need to be much more accurate than 70% for me to be interested.
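The tallies and the “always say up” baseline can be rechecked straight from the table. In this sketch the yearly figures are copied from the two percentage columns above, and the function and variable names are my own:

```python
# (year, full-year gain %, 1st-trading-day gain %), copied from the table above.
YEARS = [
    (1981, -4, 0.60), (1982, 19, -0.16), (1983, 21, -0.78), (1984, -11, -0.35),
    (1985, 32, -0.49), (1986, 7, 0.03), (1987, -6, 1.26), (1988, 13, 2.42),
    (1989, 20, -0.73), (1990, -19, 0.99), (1991, 58, -0.43), (1992, 15, 0.02),
    (1993, 16, -0.76), (1994, -2, -0.78), (1995, 41, -1.11), (1996, 22, 0.62),
    (1997, 23, -0.80), (1998, 39, 0.71), (1999, 84, 0.70), (2000, -40, 1.52),
    (2001, -15, -7.23), (2002, -33, 1.48), (2003, 45, 3.69), (2004, 8, 0.17),
    (2005, 2, -1.07), (2006, 8, 1.74), (2007, 9, 0.33), (2008, -40, -1.61),
    (2009, 39, 3.50), (2010, 15, 1.73), (2011, -3, 1.46), (2012, 14, 1.67),
    (2013, 34, 3.07),
]

def verdict(year_gain, first_day_gain):
    """Label a year the same way the table does."""
    call = "Positive" if first_day_gain > 0 else "Negative"
    right = (year_gain > 0) == (first_day_gain > 0)
    return ("Correct " if right else "Incorrect ") + call

counts = {}
for _year, year_gain, first_day_gain in YEARS:
    label = verdict(year_gain, first_day_gain)
    counts[label] = counts.get(label, 0) + 1

correct = counts["Correct Positive"] + counts["Correct Negative"]
up_years = sum(1 for _year, year_gain, _first in YEARS if year_gain > 0)

print(counts)
print(f"1st-day rule right:   {correct}/{len(YEARS)} = {correct / len(YEARS):.0%}")
print(f"'always up' right:    {up_years}/{len(YEARS)} = {up_years / len(YEARS):.0%}")
```

Seen side by side, the 1st-day rule loses to the do-nothing guess of “up every year”, which is the whole point of the paragraph above.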

More interesting still is what happens if you try to use the data to your trading advantage.

If you bought the NASDAQ on Jan 2, 1981 and held it to Dec 31, 2013, your gain would be 1952%, or (4176-203)/203. If instead you watched the 1st trading day of January and went 100% into the market in years where that 1st trading day was up, and stayed out of the market for the whole year when the 1st day was down, you would have a 33 year total return of 1945%, virtually identical to buy and hold.

Therefore, in proper MythBusters fashion, the 1st trading day of January theory is *BUSTED*