All posts by Neville Aga

AI hallucinations and MNIST digits

Easy question: What digit (0-9) is this?


It was a “2”. Got that? Well, this should be another easy question: What digit is this?

If you said, “it’s not a digit, it is the letter Q” you are wrong. Listen to the rules again: What digit (0-9) is this? The answer is of course 8, as seen here in the output of a 3-layer MLP neural network:

Not convinced? Well you should be. With enough training samples and epochs, the neural net reads handwritten digits with great accuracy.

OK, what? Back to the beginning – what are we even talking about?

A foundational piece of machine learning, really one of the early tasks that can’t simply be programmed away, was recognizing handwritten digits. This work goes back to the 1990s and the first (artificial) convolutional neural networks. It is basically impossible to program in if/then statements to identify a number 2. You could write code that says if pixels 620 through 630 all have intensity greater than 0.8, then you probably have a line on the bottom, hence a feature of a number 2. But obviously that does not scale or work with all the variability of how people write.
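To make that concrete, here is a tiny sketch of what such a brittle hand-written rule might look like. This is my own illustration, not code from the post's GitHub script; the pixel range 620-630 and the 0.8 threshold are just the example numbers from the paragraph above, and the intensities are assumed to be scaled to 0-1.

import numpy as np

def looks_like_a_two(pixels: np.ndarray) -> bool:
    """Toy hand-written rule: 'a 2 has a dark line along the bottom'.
    pixels is a flattened 784-vector of intensities scaled to 0-1."""
    bottom_segment = pixels[620:630]                 # a short run of pixels near the bottom rows
    has_bottom_line = bool(np.all(bottom_segment > 0.8))
    return has_bottom_line                           # breaks for any 2 drawn slightly differently

One glance at a few real handwriting samples shows why this kind of rule collapses immediately, which is exactly the point.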

Take this handwritten “3”. How do we know it is a 3? Well, come to think of it, actually I do not. This particular piece of data was not labeled by the author. Another problem for another time.

MNIST focused on taking handwritten digits and converting them to 28×28 grayscale so machines could process them. So first, convert this to a 28×28 grayscale:

Now visualize that as a 28×28 matrix, each element between 0-255 (8-bit):

That is perfect for working with machines. Notice this can just be a 784 element column-vector with values 0-255 in each place. We build our neural network as follows:

784 neurons on the left, one for each pixel of the input. 10 neurons on the right, one for each output possibility (0-9). Sticking with our “3”, that means input X151=63, X152=99 (those are the first nonzero pixel inputs you see in the matrix above) … straight to the end, where pixel 784, or X784, equals 0. The outputs should be [0 0 0 1 0 0 0 0 0 0], meaning we have 100% confidence the input is a “3” and not something else. Don’t worry about the AI black box magic right now, we’ll address that in a minute. Here is the actual output we get:

===============================================
SINGLE TEST SAMPLE SOFTMAX OUTPUT PROBABILITIES
===============================================
True label: n/a → Model predicted: 3
Confidence: 0.892576 (89.26%)
----------------------------------------------
Class Name Softmax Prob Bar
-----------------------------------------------
0 0 0.000000 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
1 1 0.000001 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
2 2 0.000001 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
3 3 0.892576 ████████████████████████████████████████ 89.26%
4 4 0.000000 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
5 5 0.091811 █████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 9.18%
6 6 0.000000 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
7 7 0.000000 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
8 8 0.014620 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 1.46%
9 9 0.000992 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.10%
-----------------------------------------------
PREDICTED CLASS: 3 (3) with 89.26% confidence
===============================================
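(A quick aside on that “Softmax Prob” column: softmax is the standard function that turns the network’s 10 raw output scores, call them z_0 through z_9, into probabilities that sum to 1. In LaTeX form:

p_i = \frac{e^{z_i}}{\sum_{j=0}^{9} e^{z_j}}, \qquad i = 0, \dots, 9

The “Confidence” the script reports is simply the largest of these probabilities.)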

A well-communicated written “3”. There is still a 10% chance it is a 5 or an 8, but that’s just 2 times through (also known as 2 epochs) the training set of 60,000 MNIST digits. As we go to 10 epochs and beyond, the output neurons do go to 100% (or [0 0 0 1 0 0 0 0 0 0] more specifically). Here is how the model (a Python script you can pull from GitHub, link at the bottom) is invoked:

./mnist_digits_10.py --train_classes d --test_classes d --test_samples 1 --epochs 10 --show_softmax_output_probabilities --test_seed 24 --train_samples 60000


Training on classes: ['d']
Testing on classes: ['d']
Using device: mps
Proportional dataset created:
  digit: 60000 samples (100.0%)
Loaded MNIST test dataset: 10000 samples
Using train_seed=42 for training data selection
Using test_seed=24 for test data selection
Using 60000 training samples and 1 test samples
Model created with 10 output classes
Starting training for 10 epochs
Epoch 1/10, Samples: 3200/60000, Loss: 1.2241
Epoch 1/10, Samples: 6400/60000, Loss: 0.473
  ...(skipping to the end of training output)...
Epoch 10/10, Samples: 51200/60000, Loss: .01
Epoch 10/10, Samples: 54400/60000, Loss: .01
Epoch 10/10, Samples: 57600/60000, Loss: .01
Training completed in 20.21 seconds
Average time per sample: 0.03 ms

==================================================
SINGLE TEST SAMPLE SOFTMAX OUTPUT PROBABILITIES
==================================================
True label: n/a → Model predicted: 3
Confidence: 1.000000 (100.00%)
--------------------------------------------------
Class    Name     Softmax Prob    Bar
--------------------------------------------------
0        0        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
1        1        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
2        2        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
3        3        1.000000    ████████████████████████████████████████ 100.00%
4        4        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
5        5        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
6        6        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
7        7        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
8        8        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
9        9        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
--------------------------------------------------
PREDICTED CLASS: 3 (3) with 100.00% confidence
==================================================


What if we train less? Like, instead of 60,000 training images and 10 epochs, how about 100 training images and 2 epochs (so the model is only exposed to 20 “3”s: 10 distinct “3”s, 2 times each)?

./mnist_digits_10.py --train_classes d --test_classes d --test_samples 1 --epochs 2 --show_softmax_output_probabilities --test_seed 24 --train_samples 100
Training on classes: ['d']
Testing on classes: ['d']
Using train_seed=42 for training data selection
Using test_seed=24 for test data selection
Using 100 training samples and 1 test samples
Model created with 10 output classes
Starting training for 2 epochs
Training completed in 0.19 seconds
Average time per epoch: 0.09 seconds
Average time per sample: 0.94 ms

=================================================
SINGLE TEST SAMPLE SOFTMAX OUTPUT PROBABILITIES
==================================================
True label: 3 → Model predicted: 0
Confidence: 0.137974 (13.80%)
--------------------------------------------------
Class    Name     Softmax Prob    Bar
--------------------------------------------------
0        0        0.137974    ███████████████████████████████████████  13.80%
1        1        0.074135    ███████████████░░░░░░░░░░░░░░░░░░░░░░░░   7.41%
2        2        0.088325    █████████████████████░░░░░░░░░░░░░░░░░░   8.83%
3        3        0.113393    ██████████████████████████████░░░░░░░░░  11.34%
4        4        0.102612    ██████████████████████████░░░░░░░░░░░░░  10.26%
5        5        0.104179    ██████████████████████████░░░░░░░░░░░░░  10.42%
6        6        0.103599    ██████████████████████████░░░░░░░░░░░░░  10.36%
7        7        0.080653    ██████████████████░░░░░░░░░░░░░░░░░░░░░   8.07%
8        8        0.106269    ███████████████████████████░░░░░░░░░░░░  10.63%
9        9        0.088862    █████████████████████░░░░░░░░░░░░░░░░░░   8.89%
-------------------------------------------------
PREDICTED CLASS: 0 (0) with 13.80% confidence
=================================================

A lot worse. The model has not learned. And we have our first hallucination — we fed it a “3” and the AI said it was a “0”. That’s bad.

But what about our Q that the model said was an 8? From a hallucination standpoint, our fundamental limitation is that the network only had 10 output neurons. No matter what it was fed, it had to output something between 0-9. Therefore, the “Q” became an “8”. Look at this: what digit is this?

You can see from the label on the top — truly this is a whitespace (ws). There is nothing there. Yet the MLP neural net predicted this was in fact a “5”. How close was it?

./mnist_digits_12.py --train_classes d --test_classes w --test_samples 1 --epochs 10 --show_softmax_output_probabilities --test_seed 327 --train_samples 10000 --visualize 5

True label: 11 → Model predicted: 5
Confidence: 0.228395 (22.84%)
--------------------------------------------------
Class    Name     Softmax Prob    Bar
--------------------------------------------------
0        0        0.062223    ██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   6.22%
1        1        0.137581    █████████████████████████░░░░░░░░░░░░░░░  13.76%
2        2        0.086253    ██████████████░░░░░░░░░░░░░░░░░░░░░░░░░░   8.63%
3        3        0.066895    ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   6.69%
4        4        0.085063    █████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░   8.51%
5        5        0.228395    ████████████████████████████████████████  22.84%
6        6        0.110055    ████████████████████░░░░░░░░░░░░░░░░░░░░  11.01%
7        7        0.104153    █████████████████░░░░░░░░░░░░░░░░░░░░░░░  10.42%
8        8        0.062765    ███████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   6.28%
9        9        0.056616    ███████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   5.66%
-------------------------------------------------
PREDICTED CLASS: 5 (5) with 22.84% confidence
==================================================

As you would expect, there is no real conviction in this prediction, even after going through the data for 10 epochs. Of course this was not a fair fight — I trained the MLP only on digits and asked it to find me a digit in a perfectly blank space.

How do we avoid this? We add new classes to the training: whitespace and Not a Number (NaN). When we train that way, we avoid hallucinations caused by test data that is outside the scope of the training data. We invoke the script now with classes d,w (digits plus whitespace) for both training and testing:

./mnist_digits_10.py --train_classes d,w --train_samples 50000 --test_classes d,w --test_samples 1000 --epochs 10

Training on classes: ['d', 'w']
Testing on classes: ['d', 'w']
Proportional dataset created:
  digit: 45454 samples (90.9%)
  whitespace: 4545 samples (9.1%)
Loaded MNIST test dataset: 10000 samples
Loaded whitespace test dataset: 24000 samples
Proportional dataset created:
  digit: 909 samples (91.0%)
  whitespace: 90 samples (9.0%)
Using train_seed=42 for training data selection
Using test_seed=42 for test data selection
Using 49999 training samples and 999 test samples
Model created with 12 output classes
Starting training for 10 epochs
Epoch 1/10, Samples: 3200/49999, Loss: 1.4390
Epoch 1/10, Samples: 6400/49999, Loss: 0.5672
Epoch 1/10, Samples: 9600/49999, Loss: 0.3649
  ...
Epoch 10/10, Samples: 44800/49999, Loss: 0.0100
Epoch 10/10, Samples: 48000/49999, Loss: 0.0185
Training completed in 17.04 seconds
Average time per sample: 0.03 ms
Overall accuracy on 999 test images: 97.90%
Total Type I errors: 0 / 999 (0.00%)
Total Type II errors: 21 / 999 (2.10%)

Detailed breakdown:
  Class 0: 97.9% (94/96) correct, incorrect: 2
  Class 1: 100.0% (102/102) correct, incorrect: none
  Class 2: 98.9% (89/90) correct, incorrect: 1
  Class 3: 97.1% (101/104) correct, incorrect: 3
  Class 4: 98.1% (104/106) correct, incorrect: 2
  Class 5: 98.6% (71/72) correct, incorrect: 1
  Class 6: 98.6% (71/72) correct, incorrect: 1
  Class 7: 98.7% (75/76) correct, incorrect: 1
  Class 8: 93.3% (84/90) correct, incorrect: 6
  Class 9: 96.0% (97/101) correct, incorrect: 4
  Class ws: 100.0% (90/90) correct, incorrect: none

And beautifully, the model was fed 90 blank images and saw every one of them as blank. Perfect.

But look at the “8”s and “9”s – only 93.3% and 96% accurate there.

But this whole exercise got me thinking:

Is it possible to avoid hallucinations by training to avoid hallucinations?

Stated another way, “Is it possible to get better accuracy identifying 0-9 digits (using the same amount of computational power) if you train on digits *and* whitespace *and* non-numbers?”

Our end goal is to avoid a digit-to-digit hallucination. We don’t want to be presented with a 4 and say it is a 9. That is a Type II error, and that is what we want to avoid at all costs. Let’s look at standard training with just digits (using backslashes for readability):

./mnist_digits_10.py \
  --train_classes d \
  --train_samples 50000 \
  --test_classes d \
  --test_samples 1000 \
  --epochs 10 \
  --visualize 5 \
  --test_seed 19

Total Type II errors: 19 / 1000 (1.90%)

Let’s look at one of the 19 failed OCR attempts, a 4 that was misread as a 9.

  Class 4: 98.0% (96/98) correct, incorrect: 1 (9), 1 (7)

Note this particular digit error is a very bad 4 (half of the MNIST digits were written by high schoolers). However, it is labeled data, so we know without a doubt it is truly a 4.

Note our total Type II errors are at 1.9%. Now, if we give the exact same testing data (including this bad “4”) but instead train on 45,000 digits plus 5,000 not-a-number samples, do we get better results for digit-to-digit hallucinations? What do we predict for this “4”?

./mnist_digits_10.py \
  --train_classes d,nan \
  --train_samples 50000 \
  --test_classes d \
  --test_samples 1000 \
  --epochs 10 \
  --visualize 5 \
  --test_seed 19

Total Type I errors: 1 / 1000 (0.10%)
Total Type II errors: 19 / 1000 (1.90%)
  Class 4: 100.0% (98/98) correct, incorrect: none

So no better, no worse. 1.9% to 1.9%. Although the percentage remains the same, the individual errors are different. For example, our lousy “4” is now predicted properly (2nd of these 5 below):

But other errors come up, including a single Type I error where we were given a digit and predicted it was not a number.

Let’s try this with 40,000 labeled digits, 5,000 not-a-number, and 5,000 blanks. Here is the fuller script output:

./mnist_digits_10.py \
  --train_classes d,nan,w \
  --train_samples 50000 \
  --test_classes d \
  --test_samples 1000 \
  --epochs 10 \
  --visualize 5 \
  --test_seed 19

Training on classes: ['d', 'nan', 'w']
Testing on classes: ['d']
Reported testing statistics will exclude impossible inferences
Visualization options: ['5']
Using device: mps
EMNIST letters: 80000 total, 52301 after excluding C,L,N,O,Q,R,S,W,Z,c,l,n,o,q,r,s,w,z
Loaded EMNIST letters dataset: 52301 samples
Proportional dataset created:
  digit: 41666 samples (83.3%)
  letter: 4166 samples (8.3%)
  whitespace: 4166 samples (8.3%)
Loaded MNIST test dataset: 10000 samples
Using train_seed=42 for training data selection
Using test_seed=19 for test data selection
Using 49998 training samples and 1000 test samples
Model created with 12 output classes
Starting training for 10 epochs
Epoch 1/10, Samples: 3200/49998, Loss: 1.4763
Epoch 1/10, Samples: 6400/49998, Loss: 0.6118
Epoch 1/10, Samples: 9600/49998, Loss: 0.4315
   ... 
Overall accuracy on 1000 test images: 98.00%
Total Type I errors: 2 / 1000 (0.20%)
Total Type II errors: 20 / 1000 (2.00%)

So, no. Went from 1.9% error rate to 2.0%.

After much testing: in general, unfortunately, no, you cannot get better results against hallucinations by training for hallucinations. It seems counterintuitive, but all that extra training compute is essentially wasted. The test digits are always 0-9, and although the model has learned blank and NaN, it is never presented with those during testing. This holds at large epoch counts and smaller batch sizes. On average, you do 0.3% worse on accuracy when you train with whitespaces, 0.15% worse when you train with not-a-number, and 0.2% worse when you train with both whitespaces and not-a-number. Boo. 😞

If you do want to avoid Type II errors at all costs, the better way is simply to reject all inferences where the confidence is less than some high percentage, say 99.95%. That gets to 100% accuracy across the entire 10,000-image MNIST test set, even with five epochs. It is surprising to me that a 90% confidence cutoff at just 2 epochs is not enough, but there are some really badly written digits in there, for example:

Rejecting low-confidence inferences is much more effective at avoiding hallucinations than training for the unexpected.
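As a rough sketch of that rejection rule (my own illustration in PyTorch, assuming a trained model that outputs 10 raw scores; this is not the code from the GitHub script):

import torch

def predict_or_reject(model, x, threshold=0.9995):
    """Return the predicted digit, or None if the top softmax probability is below the cutoff."""
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)   # x is a (1, 784) tensor of intensities scaled to 0-1
        confidence, digit = probs.max(dim=1)
    if confidence.item() < threshold:
        return None                              # reject the inference rather than risk a hallucination
    return digit.item()

Anything that comes back as None gets kicked to a human (or a second model) instead of being reported as a digit.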

Thanks for reading! And please, play with the code yourself — I published the code on GitHub.

==================================================

Appendix: Oh — we never dove back into the hidden layers. Here is the full 3 layer neural network.

The point of learning is to change the weights in the 3 layers (fc1, fc2, fc3) to minimize the error. PyTorch randomly selects initial weights and biases (example: in the range of (-0.09, 0.09) for fc1) and then adjusts the weights with each batch of training images. Here is what our feedforward multi-layer perceptron (MLP) neural network looks like:

The input layer is 784 long. Hidden layer 1 is 256 neurons. Hidden layer 2 is 128 neurons. Layer 3 is the output layer with 10 output neurons. That is a total of about 400 neurons. Fully connected, with about 235,000 synapses.
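As a sketch, a feedforward MLP with those layer sizes looks roughly like this in PyTorch (my own reconstruction from the description above, not necessarily the exact code in the GitHub script):

import torch.nn as nn

class MnistMLP(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)          # input pixels -> hidden layer 1
        self.fc2 = nn.Linear(256, 128)          # hidden layer 1 -> hidden layer 2
        self.fc3 = nn.Linear(128, num_classes)  # hidden layer 2 -> output layer
        self.relu = nn.ReLU()

    def forward(self, x):                       # x: (batch, 784), values scaled to 0-1
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        return self.fc3(x)                      # raw scores; softmax is applied when reporting probabilities

# 784*256+256  +  256*128+128  +  128*10+10  =  235,146 weights and biases (the ~235,000 "synapses")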

If you were to count each input pixel as a neuron, then there are about 1,200 neurons. However, the input layer is just that: inputs, not neurons. It is analogous to your eyes, where rods and cones connect to neurons in the brain, but the rods and cones themselves are not neurons. Also, the input layer is not fed in as 0-255 (the grayscale intensity); it is fed in between 0-1 (that is, the raw intensity of that pixel divided by 255).
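A minimal sketch of that input preprocessing (again my own illustration): flatten the 28×28 grid into the 784-element vector and scale it down to 0-1.

import numpy as np

def to_input_vector(img: np.ndarray) -> np.ndarray:
    """img is a 28x28 array of 8-bit grayscale values (0-255), like the matrix shown earlier."""
    x = img.astype(np.float32).reshape(784)   # flatten 28x28 -> 784-element vector
    return x / 255.0                           # scale intensities from 0-255 down to 0-1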

Really, neurons are just matrix math. That first hidden neuron in blue (h11) is simply each input times its weight, summed, plus a bias.
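In symbols (a sketch, writing the 784 pixel inputs as x_i, the incoming weights as w_i, and the bias as b):

h_{1,1} = \mathrm{ReLU}\!\left( \sum_{i=1}^{784} w_i\, x_i + b \right)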

The ReLU just means that if the sum is positive, it stays; if the sum is negative, then h11=0. It stands for Rectified Linear Unit. Doing this matrix math over and over again is what GPUs (and, we now think, maybe the human brain) do. Once training is done, the learned features (digits) are the AI black box magic inside the weights and biases.

Results of an 18 year experiment on college savings

My parents (God rest their souls) were always helpful with our 4 kids and money for college. When Addison and Emerson were born, they started a 529 savings account for each child and contributed $100 every month. When the girls were two years old I matched that by contributing an additional $101 every month. We continued that until my parents moved in 2022. So, we averaged about $201 in every month, or about $2,500 each year for ~15 years. I took out ~$20,000 from each account from 2018-2024 for my graduate school and Evan’s schooling. So how much has that ~$13,000 in principal grown to, now that they are both 18 and heading out for college next fall?

Well, when Addison and Emerson were still preschoolers I decided I would run an 18-year experiment. I would invest all of Addison’s account in an SP500 fund, and Emerson’s I would invest 50% in an SP500 fund and 50% in an age-based fund, the kind that is more aggressive when the kid is younger and more conservative as the kid nears college.

Of course this was not to favor or handicap one child — as the account owner I can always shift money, and the expectation is that mom and dad will pay for college. I was never intending to give Addison the ability to go only to OCCC and Emerson to Vanderbilt if the market did badly, or the reverse if the market did well.

So, looking behind the covers: the fund that the Oklahoma college savings plan (the only one I get a state tax deduction for) now calls the “U.S. Equity Index Option” is, per their page https://www.oklahoma529.com/investment/risk-based/us-equity-index-option/, really 100% invested in symbol TIEIX, the Nuveen Equity Index Fund. The benchmark for that fund is the Russell 3000 Index. I know the Russell 5000, but Russell 3000? Never heard of it. So how has that fund done compared to SPY? Well, pretty similar. Both are market-cap-weighted indexes; the SP500 is just limited to the top 500 stocks, while the Russell 3000 holds 3,000. The top stock weighting of the SP500 is NVDA at 7.96%, and the top of the R3000 is also NVDA, at 6.78%.

To get performance, however, you must cite your source of truth. Even for something as straightforward as SPY, look at these different reported results for 5-year SPY performance. Let’s look at State Street (the fund’s sponsor), Google Finance, Yahoo Finance, and Seeking Alpha for $10k invested 5 years ago:

State Street: $19,685
Google Finance: $19,513
Yahoo Finance (chart): $20,336
Seeking Alpha: $20,341

So why the difference? Dividends? Let’s use the Yahoo Finance daily historicals:

An adjusted close of 307.83 to today’s price of 682.80 is a 121.96% gain, so $10k invested is now $22,196. So no, dividends were not even included. The non-adjusted close of 330.20 to today’s 682.80 is a 106.9% gain, or $10k is now $20,692. Here is the updated chart:

Source of Truth: Current value of $10k invested in SPY 5 years ago
State Street: $19,685
Google Finance: $19,513
Yahoo Finance (chart): $20,336
Seeking Alpha: $20,341
Yahoo Finance (historical, adjusted close): $22,196
Yahoo Finance (historical, actual close): $20,692
Current value of $10k invested in SPY 5 years ago, per each source
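A quick sketch of that arithmetic, using the Yahoo Finance closes quoted above (small differences from the table come from rounding and exact quote dates):

start_adjusted = 307.83   # adjusted close 5 years ago (dividends reinvested)
start_actual   = 330.20   # actual close 5 years ago (price only)
today          = 682.80   # today's price

value_with_dividends = 10_000 * today / start_adjusted   # roughly the $22,196 row
value_price_only     = 10_000 * today / start_actual     # roughly the $20,692 row
print(round(value_with_dividends), round(value_price_only))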

This is a kinda big deal. The difference from $22,196 to $19,513 is 14%. Heck, without dividends, the difference from $20,692 to $19,513 is still 6%. 6% is two years on a cash account.

OK, set aside for the moment that management fees suck and the industry feeds off you like mosquitos feed off water buffalo. I guess we just accept it.

Back to the initial question — how did the 2 portfolios do? Well, here are the values today (Nov 3, 2025).

So, $20k more, just by choosing full SP500 over the 50/50 SP500/age fund. Heck, if I had just chosen the age fund for both, it would only be $40k each, not even enough to fund 1.5 years of school.

So now let’s look closer. I chose the Oklahoma fund and the investment choices they provide, but what if I had owned my own account? Let’s look at several scenarios:

  1. US Equity Fund, from OK4Savings.org
  2. 2024/2025 Enrollment Option, also from OK4Savings.org
  3. TIEIX
  4. SPY
  5. QQQ
  6. TQQQ
1) As a baseline, here are the contributions to Addison’s account and total value now:

$13k of net principal becomes $80k now. Good, not great.

2) OK, some will say I was “too aggressive” with 100% US Equity Index, and I should have done the 2024/2025 Enrollment option. Look here:

Would take $13k to $21k. Awful. Note, there were a lot of band rolls as she got older and automatic switching of funds prior to 2020. That’s why the year end prices look so rigid, but this is directionally correct.

Lesson 1 — Screw the people who tell you to invest according to the kid’s age. It is just plain bad advice.

I know the market is at all-time highs now, so it is easy to draw up lesson #1, but I strongly feel it is the correct lesson. If the market tumbles anyway, then school tuition and fees would go down commensurately. Don’t spend your life (and investments) worrying. It will kill you in the end, just like the unworthy servant who hid his talent in the ground. After all, that servant did safeguard and return the master’s talent in perfect condition. What did it get him? He got called “wicked and slothful” by Jesus himself. Do you really want God to look at you and call you wicked or slothful? If not, heed lesson 1.

3) OK, now what if instead of TIAA/CREF, I just opened a Schwab account and invested in TIEIX directly?

Wow, that is a hell of a difference. $110k instead of $80k. Sure, you don’t get the state tax deduction, but that was not worth $30k. Again, financial management fees suck and the industry feeds off you like mosquitos feed off water buffalo.

4) OK, what if I invested in SPY instead of TIEIX in that Schwab account:

Basically the same as TIEIX, actually $5k worse. But basically the same.

5) Let’s get riskier: how about QQQ? The girls were born, and this fund was set up, around the same time the iPhone 1 came out. Betting on tech (of course hindsight is 20/20) would have been a good thing:

Wow – $13k becomes $200k. Now you are talking. College is fully funded and mom and dad don’t have to dip into savings.

Lesson 2 — Invest risky, and keep the investment on no matter what (COVID, etc)

If you have belief in the future of the US, Lesson 2 is the only lesson you need to remember. As long as we are 4% of the world’s population but 33% of the world’s economy, the reserve currency of the world, the strongest military in the world, etc — bet on the US for the very long term. We have a US exceptionalism and optimism that does not exist in other countries. Hell, in western Europe they are damn apologists. No one wants to buy into that thinking.

6) OK, final risk-up trade, right up there with putting it in bitcoin. How about if I did TQQQ (a leveraged ETF)? The fund did not exist until 2010, so just take the principal inflows from 2007-2009 and dump them in the trash can. What do we get?

That’s right $1.2 million! And not going from $13k to $1.2 million, more like $8k to $1.2 million, because this starts in 2010. Now we are talking!

In truth, I don’t think there is any way I would have been able to keep this level of risk tolerance on. There is no way I would not have sold during COVID or on any number of other days (like even today, just a 1% down day). But if I had – wow — I wish I had is all I can say.

Lesson 3 — all the savings, all the austerity – delay that purchase, get the cheaper model, it may all be worthless noise. Just click a few correct buttons a few times in your life >> all else.

Lesson 3 is kind of depressing, actually, but real and true. Sure, I wish I had twenty bitcoin lying around — I set up my first bitcoin full node when each coin was $100. Coulda / Woulda / Shoulda. My dad used to tell me he should have bought Boeing in the 70s and he would be passing $10 million to me. He didn’t and I didn’t. Oh well, I loved my dad very much, and that was worth a lot more than $10 million, and I am being very honest about it.

Still though, my hope is people reading this will have both — ample funds for all your dreams, and long and healthy lives with family. Thanks for reading!

Could you have made money in the stock market today?

I am curious for myself. The market opened about 1.75% down, quickly went to down 2.5% before 9am, and rallied all the way to closing just down 0.15% on the day, so a rally of ~1.5% from open to close. Could I have made significant money with $NDXP options today? My gut tells me no, but I want to see.

Specifically the most likely trade I would have done is buy $NDXP call options at the open for $NDX to close flat by the end of the day. Those obviously would have expired worthless as NDX did not end the day positive. But how about some others? The VIX is currently at 21, and my gut feel is it is too high to make money buying out of the money calls, but let’s see.

First, what if I had bought at the open with a strike 0.5% down from yesterday. NDX closed Friday at 19,280. 0.5% down is 19,180. NDX opened today at 18,990. The first trade for .NDXP-25-03-31-C-19,180 was at $19. So, you would have made $100 / $19 = about 5x your money. Damn. That is a lot.
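(For reference, these are same-day index options, so at the close each call is worth its intrinsic value. With prices quoted in index points, as above, the contract multiplier cancels and the math is just:

\text{return multiple} \approx \frac{\max(S_{\text{close}} - K,\, 0)}{\text{premium paid}}

where S_close is where the index finishes and K is the strike.)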

OK, so how about if you bought .NDXP-25-03-31-C-19,200 at 8:48am when the market was down the full 2.5%. You could have picked it up for $9. It would have ended at $80. Almost 10X. Damn.

What about an ATM call? You could have bought .NDXP-25-03-31-C-19,000 at 8:30am for $100, and at 8:45am for $40. That would have closed at $280. Meaning 3X or 8X your money.

Damn. Well, I guess you could have made money today.

Just how good are the 2025 OKC Thunder?

OK, I’ll start this one off by admitting this post is total procrastination. It is Friday morning and I should be doing something productive, but instead I want to look at the metrics for the 2025 OKC Thunder. The stock market is down another 2% today so don’t look there. My curiosity was piqued when on local sports radio I heard the announcer say that the game last week between OKC and the LA Clippers was the first one-possession game OKC has played all year. It is late March. So, let’s dig through the numbers:

As I write this the Thunder are 61-12 with 9 regular season games left to go. They have already wrapped up the #1 seed in the Western Conference; no other Western Conference team has even locked up a playoff spot yet (!!) In 1-score games this year (defined as a final margin of 3 points or fewer, plus OT games) the Thunder are 1-4. So the radio talking head was wrong: it was not the Thunder’s first 1-possession game, it was just the 1st 1-possession game the Thunder have won. The Thunder are 6-2 in 2-possession games, and 54-6 in 3+ possession games. Here is the full record:

2025 Thunder record to date

The record is sorted by point differential. OT games are boxed in, 1-possession games are in puke-yellow, and 2-possession games are in sea-foam-green. 3-possession+ games are in white.

The Thunder play a one-possession game about once a month. That is nuts. Compare the current champs, the Boston Celtics:

The Celtics have played twice as many 1-possession games (10) and won 7 of those.

So if you convert the Thunder’s 1-4 record in 1-possession games to 4-1 (64-9 overall), or even better 5-0 (65-8), then they could theoretically win out to go as high as 74-8. Only slightly crazy talk, because look at the 2016 Golden State Warriors:

Notice their good fortune in close games — they went 10-0 in games that were decided by 2 possessions. They even went 7-2 in 1-possession games, for a total of 17-2. Their 3+ possession record of 56-7 is going to end up being worse than the Thunder’s, who could finish as high as 63-6.

It has been widely reported this year that OKC’s average margin of victory (currently +13.1) is on track to be the largest in NBA history (all time). It handily beats the 2nd and 3rd best teams this year (Cleveland +10.4 and Boston +9.1). To put in perspective that those two are great numbers on their own, the 4th and 5th place teams this year are in the +4 range. It even handily beats Jordan’s 1995-96 Bulls (+12.3) and the 2015-16 GSW team (+10.8) that went 73-9. The current best all-time is the 1971-72 Lakers at +13.9. But that needs to be adjusted for pace — back in the 1970s there were many more possessions per 48 minutes. No 3-point line, no offensive sets, just fast breaks and dunks (Showtime!). The metric that does this is Net Rating. Net rating adjusts for pace, making it possible to compare teams that play at different speeds. For example, a fast-paced team might have a good point differential simply because it plays more possessions per game, while net rating would reveal whether it is actually more efficient on a per-possession basis. Here are the net-rating comparisons:
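(Net rating here is the standard definition: point differential per 100 possessions rather than per game,

\text{Net Rating} = 100 \times \frac{\text{points scored} - \text{points allowed}}{\text{possessions played}}

which is why the 1970s run-and-gun teams lose some of their edge once you adjust for pace.)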

So, on a Net Rating basis, OKC is almost up +3 on 2016 GSW and even better than Jordan and the Bulls.

Neville’s Take:

So just how good are these Thunder? Let me make a prediction. The Thunder will be the first team to go 16-0 in the playoffs. It is very likely they sweep anyone in the West, and then the ECF will have Cleveland and Boston slug it out, the winner there being tired and no match for a rested, healthy Thunder. If you are in Vegas put some money on that prop bet and send a check my way for Father’s day. We all need it after the stock market today 🙂

Book review – The Frackers, by Gregory Zuckerman

Just finished a great read recommended to me by Michael Palmer — The Frackers largely centers on the events of the American shale oil boom from ~2000 till about ~2014, when the book was written and published. The cast of characters is nothing less than American heroes: George Mitchell, Harold Hamm, Aubrey McClendon/Tom Ward, Charif Souki. Several of these heroes went bankrupt or even died, and the impact they have had on our way of life is not appreciated by as many people as it should be.

To put it in perspective, the USA (including Alaska) has around 3-4% of the world’s proven oil reserves (around 50,000 million barrels, out of a world supply of 1,500,000 million barrels). However, we are the #1 producer in the world, pumping 15% of the world’s supply (13 million barrels each day, where the world pumps 83 million barrels). In 2005 we pumped only about 5 million barrels per day, with many assuming domestic oil would run out, but the work of these people has pushed us from 5 million to 13 million. As a consequence, our gas and electricity bills are less than half of western Europe’s — natural gas in Europe costs $10 per million BTUs, gas in Asia is about $12 per million, and in the USA it is just $2-3 per million. Natural gas (frequently produced alongside oil) is also much cleaner burning than coal. If not for these men we would be burning 2-3x as much coal, polluting the environment, and paying 3x for the ability to do so.

George Mitchell, who developed the Woodlands area north of Houston, started commercial development of horizontal drilling and fracking in the 1980s and 1990s. Oil drilling before Mitchell was basically: drill a vertical hole in the earth like a big straw and pump it out. Most fields (like Saudi Arabia’s easy oil) are just sitting there in a giant pool. This domestic revolution was shale oil — liquid oil that is there, but trapped inside rock. It takes guts to drill down 2 miles into rock, turn horizontal, drill another 3 miles, and send explosive charges down with water to blow those rocks apart and recover the oil. You can see how it is much easier to just drill it in the Middle East and pay the importers.

Aubrey McClendon and Tom Ward (via Chesapeake Energy) really supersized the process and embraced debt to expand operations. Aubrey in particular is someone who should be taught in OKC metro public schools, as he brought forth Classen Curve, transformed the city with the Olympic rowing river south of downtown, and helped bring the Thunder to OKC. He really changed the fate of OKC for the better. Sadly though, Obama could not have given a flip about any of this, and Obama’s DOJ witch-hunted him because he lived on the edge with debt and largess. They indicted him with jail time in mind, and it was too much for Aubrey: distracted, he was killed in a car crash 24 hours later. This is after Aubrey made many, many landowners very rich by paying billions of dollars for mineral rights. He employed more landmen than others had employees. Shame on our government at that time.

Charif Souki is super fascinating. He actually managed the restaurant in LA where the OJ Simpson / Nicole Brown / Ron Goldman tragedy happened. He decided to leave LA after that, move to Louisiana, and get involved in oil. Specifically, he saw all the media reports that the USA was running out of oil and decided to build multibillion-dollar import terminals for liquefied natural gas drilled abroad and then imported into America. At that time both the USA and the rest of the world were at about $2-3 per million BTU, and he foresaw a time when the rest of the world would stay at $3 and the USA would go to $10. Well, as it turned out, because of this domestic shale oil boom and Russia/Ukraine, the USA is at $2-3 and Europe is at $10. He reconfigured his company midstream to go from importing natural gas to exporting it, and now LNG trades at $230 per share, up 20x since 2010.

Anyways, a great read, and the author is on X at @GZuckerman. I love a good nonfiction story, and all Oklahomans should know this one.

AI has a photographic memory

I had a lightbulb moment today. I am taking a class on neural networks taught by the excellent Dr. G. at Stanford continuing education. Last lecture we talked about a simple neural network identifying an image, say a boat/plane/car/train. The neural net starts blank, and you feed it labeled images of boats/planes/etc. That input changes the weights of the perceptrons (neuron-mimicking structures in a machine). These weights are simple numbers, think 4, 7, 12.5, whatever. The point is simple numbers (weights) only. These perceptrons connect to each other and have an activation function, so a 12.5 from one perceptron is fed to perceptron #2, and the 2nd perceptron may (or may not) fire a number downstream after being fed a 12.5. That’s it. After being trained on numerous boats/planes/cars/trains, if you feed the network a new boat it has not seen before, it is likely to spit out “boat” because this new image fed a 12.6 to the downstream perceptrons, not exactly 12.5, but much closer than plane or car.

The key point to understand in the paragraph above is that AI models (specifically large language models) do not “store” source materials. There is no hard drive with images of boats that can be pulled up. The network has seen many boats, and that has caused these weights to be what they are. The only memory is these numbers, the weights, not source material — words or images. That bears repeating: if I have a model like gemma-2-27b that is 50GB large, those 50GB are all model weights — absolutely no supplemental material.

Think about your physics test back in college: your teacher allowed you to write anything you wanted, formulas, pictures, on a 3×5 note card, and as long as you could fit it on that note card you could bring it in during test time. So your brain had the ideas and methods, but you had a note card to remember the exact derivation of final speed based on acceleration and time. What I am trying to say is that the AI language model has no note card. It does not have 50GB of weights and also the text of the Declaration of Independence; it just has 50GB of weights. Sure, it has read (been trained on) the Declaration of Independence, but when I ask Grok/Claude/ChatGPT what the 3rd line of the Declaration of Independence is, it *does not* pull up the internet, read the text, then tell me the answer — it simply pulls the answer out of those 50GB of weights. (Now, this is not exactly true anymore; Grok and the other LLMs can search the internet and take in results, but a traditional old-school LLM like gemma-2-27b does not need, and cannot use, any internet access whatsoever.)

So from these 50GB of weights (not really that big, about the size of 10 movies) it can think (or predict) words out of the Declaration of Independence. Or the Emancipation Proclamation.

So I asked Ara (the xAI voice assistant) to read me, word for word, the Emancipation Proclamation. It said that from its 50GB of weights it could tell me the document is 270 words long and 5 paragraphs, and it could give me the gist of each section, but it probably could not recite it word for word. I pulled up Lincoln’s handwritten version from the National Archives and read along as I asked Grok to give it to me word for word, or try its best. It nailed EVERY SINGLE WORD! All from the 50GB of weights. I even asked it to tell me about which exceptions Lincoln wrote in inside the margins, where the line spacing is off. This is a very obscure reference. If you do a Google search for ’emancipation proclamation “Norfolk” and “Portsmouth” “line spacing”‘ you will not get any results. This is just something you have to read and look at. But Grok, after successfully reading me the whole thing (again from “memory”, aka the 50GB of model weights), correctly told me the exceptions for Norfolk and Portsmouth were written in between the normal line spacing.

So the lightbulb for me? An LLM is not just smart — it has a photographic memory. It does not have to recall source material on demand, it can pull EXACT COPIES of things just from its weights. Maybe today only 270 words like the Emancipation Proclamation, but tomorrow, everything.

AI is freaking amazing at coding

So I know anyone who codes is underwhelmed by that post title. Of course it is, and we have all known that for some time. But how do I convey that to people who are non-programmers? I found myself a couple weeks ago talking to a person at Cisco and saying that AI tools like ChatGPT are incredible at understanding my intent in code and helping me out, but I came up short on a concrete example that connects with people who don’t live in arrays, lists, and hashes.

Well, today I have an easy low-hanging fruit example to share. I was updating some code on my playoffpredictor.com site where conferences were hard coded in:

$conference = array(
    "Air Force" => "G5",
    "Akron" => "G5",
    "Alabama" => "SEC",
    "Appalachian State" => "G5",
    "Arizona" => "B12",
    // ... and so on, one line per FBS team
);
I asked Claude.ai to help me pull conference data from a file I have (fbsTeams.json) which has the data in more traditional .json format:

[
 {
"team_name": "Boston College",
"conference": "ACC",
"cbs_sports_icon": "BC",
"cfdb_name": "Boston College",
"pp_id": "1"
 },
 {
"team_name": "Clemson",
"conference": "ACC",
"cbs_sports_icon": "CLEM",
"cfdb_name": "Clemson",
"pp_id": "2"
 },
 {
"team_name": "Duke",
"conference": "ACC",
"cbs_sports_icon": "DUKE",
"cfdb_name": "Duke",
"pp_id": "3"
 },

AI of course wrote the necessary code for me. Here was the prompt I used:

I want to replace the current hard-coded way of defining $conference and instead move it to reading the .json file and populating it from the file. please write the necessary code for me. The file path is /home/neville/cfbPlayoffPredictor/data/2024/fbsTeams.json.

For teams that pull as conference CUSA, Sun Belt, Mountain West, AAC, FBS Independent, I want those all to map to "G5" in the $conference array. I want the logic to be that if the team does not match SEC,B10,B12, or ACC then replace whatever the .json file has with "G5"

And here is the AI code output:

Here is what I mean by intent — nowhere in the prompt did I say anything about power 5 conferences. But it understands that and wrote the code that way. Stupid tools would just call it ‘filtered_conferences’ or some basic thing like that, but AI is smart enough to understand I want to separate out power 5 conferences and G5 conferences, even though I never used the words power 5.

Of course, in 2024, with the destruction of the Pac-12, there are now only 4 power conferences, so I edited the code to call it $power4 in my script. But the hard work is done by AI and I just do some fine-tuning edits. This is a perfect example of how AI should be helping us with day-to-day mundane coding tasks.

2024 Week 2 playoff predictor ATS results

Week 2 is in the books and for Auburn it is better luck next year. Ouch. Oh well, let’s see what we can learn from the computer model on playoffpredictor.com

The model went .615 (29-18-1), which was tied for 2nd out of 37 computer models for the week. Outstanding! And remember, this is with no preseason data of any kind, just the results of the games played in week 1. Some of the better calls were Oklahoma beating Houston by 4 and Penn State beating Bowling Green by 7, when the spreads were -27.5 and -34 respectively. The computer model said -18 and -10.5, which were significant improvements in Mean Squared Error. Speaking of Mean Squared Error, the model came in at +142 on MSE and 3.3 on absolute error, which were dead last and next-to-last respectively out of the 37 computers. This is to be expected, as other computers use player, team, and preseason data. The model predicts no blowouts this early in the season, although we know there will be blowouts in weeks 1-4.
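(For reference, these are the standard error metrics, computed over the week's games:

\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left(\hat{m}_i - m_i\right)^2, \qquad \text{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left|\hat{m}_i - m_i\right|

with m_i the actual margin of game i and \hat{m}_i the margin a model predicted.)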

I don’t like this 12 team playoff. After spending last week updating the logic for 12 teams instead of 4 the computer sees these probabilities for teams making the playoff after 2 weeks of data:

Note the top likelihood is Syracuse, due to an easy schedule. It won’t last. I’d be surprised to see them still on top of the ACC by week 4.

Right now it says the SEC gets 3.5 teams, the Big10 gets 3 teams, the Big 12 gets 1.5 teams, the ACC gets 2 teams, and the G5 gets 2 teams. I’d expect by season’s end it will be SEC 4 teams, Big10 3 teams, Big 12 2 teams, ACC 2 teams, and G5 1 team. I think the talk at season’s end will be who is the 12th team, an 8-4 Missouri or a 10-2 Utah. Ugh. Who cares. What a horrible debate to have.

The best trade of my life (so far)

This year I was fortunate enough to make one really good trade and hold a portion of it through fruition. Here I will document the trade and the way it has evolved over the course of this year.

Towards the end of 2023 I shaped up what my 2024 portfolio would include. I like liquidating everything at the end of a calendar year and re-buying what I like and believe in. For 2024 I decided I would put 6% of my speculation portfolio into 3 options, 2% each, which worked out to about $6,000 per option. The 3 equities I decided to buy options on were NVDA, RIVN, and QQQ. Here is the setup in NVDA coming out of Jan-Dec 2023:

It had a great 2023, tripling from 150 to 500. Sometimes last year’s winners are also this year’s winners — look at DELL and EMC in the late 90s.

And above is the entry trade into NVDA.

My reasoning for this trade was that I liked this setup. I have been blown away by AI and ChatGPT and GitHub Copilot. During the fall I spent a decent amount of time coding for my website and app for playoffPredictor.com. To say that AI made my coding so much better is a gross understatement. You simply can’t do this development on Intel processors – it has to be GPUs and NVIDIA. So there is my conviction. Also, look at the 2023 performance of NVIDIA. There is the huge gap up from 300 to 400 in May when they announced earnings, and then really ‘UNCH’ for the rest of the year. Well, I guess going from 400 to 480 is 20%, which is far from 0%, but for a speculation stock this volatile that is growing earnings by 10X over last year, yeah, 20% is pretty much unchanged. I like stories like that, which are due for a total repricing, not just a change in price based on what you could have bought it for yesterday.

Here notice the date and the amount. Ideally I would have made this trade on 1/2, but instead it is on 1/9. Sometimes I get nervous on these trades and just want the price to fall a bit further. If I had made this trade on 1/2 with the full $6,000 – the price was 50 cents, so I would have bought 120 contracts instead of 30. Hmm…. @#%#$^#%&. Oh well, can’t live in the past. And there is no way I would have thrown $6k at such a nonsense idea. I mean, 50% in 10 weeks on a $1 trillion company is nuts.

Speaking of nuts: planning my trades in December, I never wanted NVDA at a 750 strike and 3/15 expiration. I was actually planning on a 600 strike and 4/19 expiration. But the problem was that it was just too expensive. On Jan 2 the 600 call for 4/19 was trading for $10. I waited the first week of January to pull the trigger and the price got down to $8, but that’s still too expensive for me. I like to buy options that are about $0.20 to $0.50 as a rule when I’m swinging for the fences. I was watching the price the first week of January and hoping to get a better price. But the price kept going up, and by Monday 1/9 the 600 calls were trading at $20. So I said what the hell and threw down $4,500 on 30 contracts at a much higher strike of 750, for about $1.50 each. I’d rather have 30 contracts instead of 5 contracts any day. My thinking is that you can get rich at 100+ contracts and just have a good return at <10 contracts. And it is a good thing I did not end up buying the 600 strikes. They would be worth ~$300 per contract now, which is 300/10 = 3,000%, a 30 bagger. Very good — but as it turns out, not as good as the 750s.

So here is what NVDA and the call have done 2024 year to date:

In a perfect world, yes, this was a $0.38-to-$223 opportunity: 223/0.38 = 58,600%, or a 586 bagger! Imagine turning $2,000 into $1,000,000…

Well, all January NVDA went up and the call went up, straight from $1.50 to $5 by the end of January, and then a quick run-up from $5 to $25 in the first 3 trading days of February. Very nice: I turned $5k into $15k, and then suddenly $15k became $75k, and I was quite happy (ecstatic). I decided to celebrate by purchasing an Apple Vision Pro that came out on Feb 2. So I sold my first 2 calls on Feb 5th for $25 a contract ($5,000 credit, which covered my initial investment. Everything else was now house money).

After February 5th it got much more interesting. Earnings were to be released Feb 20, and Wall Street was all abuzz about this earnings report. I felt it could go up into earnings, but I was playing with real money now. I’d never spend that kind of money on an out-of-the-money option into earnings, but I didn’t have to spend that money — it just grew into that valuation.

My next move was on 2/12. By that time NVDA was trading at 720, getting really close to the 750 strike. Generally I sell an option when it goes from out of the money to at the money. This time I did not really want to sell, I wanted to see where this ride would go, so I decided to sell an upside call against my call position. Normally people sell upside calls against share ownership positions; it is called a synthetic dividend. But this was a little different. Essentially I turned my 750C into a 750/930 call spread. I sold 28 contracts at the 930 strike for the same expiration of 3/15. My reasoning was that I was now guaranteed $20k of profit, even if NVDA tanked. Nice!

Of course I capped my upside at $180 per contract, but $180 across 28 contracts works out to roughly $500,000 – so I settled for a max profit of $500,000 with $20,000 guaranteed.
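(The rough math behind that cap, assuming the standard 100-share multiplier on NVDA options: the spread width times the contract count is

(930 - 750) \times 100 \times 28 \approx \$504{,}000

before netting out what the 750 calls originally cost and the premium collected on the 930s.)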

On 2/18 and 2/19 NVDA fell from 730 to 670. Earnings were on 2/20, and people wanted to lock in gains before earnings. That was somewhat painful. On paper I lost $30k or so each day. Basically all those gains to $75k were gone, back to a $20k total return. During earnings itself I did not watch at all. I texted my son an hour after the close asking, “Are we eating at the Ranch or at Taco Casa tomorrow?” His response: “somewhere in-between”. My pulse quickened. After earnings it was back to $750, and I took the whole family to the Ranch for dinner the following evening. It was the feast of unrealized profits! Over the past 24 hours I had made $100k on paper. Others may be used to that, but I was not.

Now, here is where I would normally sell the whole position. Earnings are done. The strike price has been met. It is too risky to keep everything on, and theta decay will kill you now. But I decided this time to just sell 8 contracts, at effectively $31 each. And let’s see what happens with the other 20 contracts.

Now here is a trade I regret. From 2/23 to 2/28 NVDA drifted down from 820 to 775. Getting close back to the 750 strike I had a fit of paper hands and sold 5 more contracts on 2/29. I guess I just wanted to guarantee $60k in profits, so I sold.

The next trade was to buy back 2 of the 930 calls. From 3/1 – 3/4 (2 trading days; Friday and Monday) the price jumped from 800 to 880, and I suddenly realized taking out the 930 calls would be possible and then I’d be forced to sell. I just did not want to sell until expiration. I could have bought back all remaining 15 930 calls on 3/1 for $2k, but I didn’t, and the price shot right back to the premium I collected in the first place.

Next I decided I wanted protection. I bought put options (expiring that Friday 3/8, 1 week before my call option expiration) so that I locked in $50 per contract no matter what. The only thing that could hurt me was if it closed Friday above 800 and then opened Monday below 800. I felt comfortable with that risk (it was trading 880 at that time). Sure enough these expired worthless, but that $760 spent helped me sleep at night.

Over 3/6 and 3/7 the price kept shooting up and was bumping right against 930. Well, if I were still short the 930s I would have to sell; at that point it is only risk with no reward, as the 930 and 750 will both have a delta of 1. The only possibility is the price goes back below 930 and I lose more money on the 750s (still deep in the money) than I make back on the 930s (back to at the money). So I panicked and started buying back the 930 calls. Remember that $20k premium collected? Well, here was $12k just to buy them back. I was now long 15 contracts at 750 and short 6 contracts at 930.

Well on 3/8 open it broke through 950 and I sold 5 more contracts. I like to sell at market open, but I was on the Guardians of the Galaxy ride at EPCOT, so I had to finish that first. I sold the 5 contracts just outside the gift shop at 9:35am. I also bought back the last 930C contract. I kept thinking to myself the Guardians ride was tame compared to the NVDA ride. Why there is not a theme park with the whole theme of stock and options prices I have no idea — that is a wild ride! Total spread credit was $160 per contract.

On 3/12 I bought protection for 5 of the remaining 10 contracts for $4, financing it by selling upside calls at $1,000. I mean if I get $250 per contract I’m thrilled, and this guarantees a full $100 per contract. Of course both these expired worthless, except the insurance was nice this week as it drifted down on Thursday.

Thursday morning I went ahead and sold another 5 contracts at the open. Now I am locked into a minimum return for the whole trade. This trade is pretty similar to the 2/29 trade, done out of fear of loss or paper hands, same concept.

Friday morning is expiration day. Time to harvest what is left. But first I had to buy back the $1000 strike calls, else my account could be subject to infinite risk. Divorced from the big picture it was a good sale on its own – sold on Tuesday for $1,621 and bought back on Friday for $53. I guess if you think of it that way, it’s a 30x return on your money in 2 days. Of course the catch is you have to tie up tens of thousands of dollars in shares or long options in order to make a trade where you sell premium in the first place. In a lot of ways it is not fair. $1,600 is a lot of money to most of the country, and this is only accessible to people who already have means. I am sure someone reading this will be young and understand the concept, but be unable to execute it for themselves because they don’t have the money already, and that is a shame. I think Robinhood at least offers people money for loaning out their stock, even just a few dollars. That seems like a good start to me.

Finally, this morning I sold 3 contracts at open and in the afternoon at 2:45pm I sold the final 2 contracts. That open price ended up being the low of the day. Oh well. And I was going to buy that ivory backscratcher.

All totaled, it comes to $291,000 on a $4,500 investment, or a 6,500% gain, a 65 bagger. You only need a few of these in life to catch the options bug permanently. Do I wish I had held on the whole time? I mean, it could have been more like $600,000. While the answer is of course, I don’t think I would have had the mental energy to ride this all the way to $600k. For that matter, you could have bought the 950C expiring 3/15 for $0.05 on Jan 5th, and it would have been worth $50 on March 8 – a full 100,000% return – 1000x on your money. Turn $5k into $5M. You could have also done that with MSTR in February. I wish I had, because houses on the beach are not cheap and I could use an extra 1 or 2 million if they are giving it away, but oh well. That kind of thinking is a trap. It is impossible to buy the low and sell the high.

Here is one of the profit and loss bubble charts while it was trading at its high. Normally you are happy when that percentage reads 20% or 30%. This thing says 15,000%. Just unreal.

Could I have done better? Yes, but. A common strategy people use is to roll options up (higher strikes) and potentially out (a further calendar date) as they go from out-of-the-money to at-the-money. If I had done that and, say, harvested 50% each time while rolling up in $50 chunks, I’ll bet I could have made a lot more money (I’ll have to compute that when I get back from Spring Break to see exactly how it would have worked out). But there is no way I could have done that, mentally. That would have meant selling $80,000 of options on 2/22 and then buying $40,000 that same day, like the 3/15 800C. That would have been hard. I mean, even though money is fungible, and holding is the same as selling today and buying it back tomorrow, our human caveman brains can play with $80k of house money MUCH more easily than with a fresh $40k. I mean, I just would not throw that kind of money at an option expiring in 3 weeks. That’s irresponsible. People who can master that part of their psyche and not fall into that trap are, I am sure, fantastic options traders, but I have not conquered that hurdle yet.

Did I make mistakes? Yes. Here you have to separate true mistakes from just price anticipation, which is unknowable. Looking back at this I made 2 mistakes:

  1. On 2/29 when I sold the 5 contracts I should have also bought net new options at ~800 expiring 3/22. I should have used 10k of the 20k for that. That guarantees $50k on the trade, but you have to get longer on weakness if you believe in what you are doing, not just flatter.
  2. On 3/8 I should have sold everything. In my speculation account I have a rule – everything must be harvested 1 week before expiry. In my view options with less than a week is gambling and belongs in my timing account, not my speculation account. Now, of course that was the high that Friday morning, but that is irrelevant. I have to have the discipline that everything in the speculation portfolio must have an expiration >1 week out. This would mean that the protection I bought should also have been through 3/8, not 3/15. Unfortunately this was a $100,000 mistake (ouch)

In the end I think you need some conviction to stick with a trade like this. There is no way I would stick with it if I thought NVDA would not become larger than AAPL and MSFT, which I think it will this year. Now after that, it certainly can be cut in half. But if the thesis is that every single CPU in every computer and phone is dead in the AI world, yeah that is a big market. As for now, well it is spring break and I am headed to Mexico. Cashing out this week. No crazy options positions on vacation!