Chip Koziara – Exploring how to build, scale, and operate businesses from the perspective of individuals and organizations

Context, understanding, and memory

Sunday, May 20, 2018
A couple years ago, I read On the Shortness of Life, a letter by the Stoic philosopher Seneca the Younger¹.

While the gist of Seneca’s writing was clear, many references he made were lost on me and I found the whole thing to be quite dry. What I didn’t realize at the time is that if I had the context of a Roman in 49 A.D., Seneca’s writing would be more dynamic and his arguments would be more compelling.

I recently read through On the Shortness of Life again. Coincidentally, I happened to be listening to the excellent History of Rome podcast in the weeks leading up to my re-read. Seneca seemed more captivating somehow, and I realized why after reading a few sentences toward the end of his letter. Seneca wrote:

“So, when you see a man repeatedly wearing the robe of office, or one whose name is often spoken in the Forum, do not envy him: these things are won at the cost of life. In order that one year may be dated from their names, they will waste all their own years.”

A lightbulb went off in my head! From the History of Rome podcast, I knew that the Roman Republic used to reference years by the names of the two consuls² who served in office each year. In that last sentence, Seneca cleverly connected his arguments to a custom familiar to all Romans. This was lost on me at first, but having more context about Roman life gave me the information I needed to use these references to anchor Seneca’s points.

If I were to rewrite this today for a modern audience, it could be:

“In order that buildings may feature their names, they will waste all their own years.”

The argument is the same, but the context shapes how we relate the argument to the world around us.

Context helps us relate information to things we already know, strengthening our understanding and memory. While I stumbled into the History of Rome podcast and Seneca, I plan to intentionally read multiple contemporary works, listen to podcasts, and watch documentaries that provide a broader context on whatever I’m learning.
1. Seneca was born in 5 B.C. and died in 65 A.D. He’s a fascinating individual who amassed a huge fortune and tutored the Roman Emperor Nero, who, in a strange twist, ultimately forced Seneca to take his own life. ↩︎
2. A consul was the highest Roman public office during the Roman Republic, and two consuls served jointly for one-year terms. ↩︎
Throwing darts with a Monte Carlo simulation

Sunday, November 5, 2017
Nassim Nicholas Taleb recently tweeted out an interesting problem. I didn’t know how to solve it mathematically, so I wanted to build a simulation in R.

Here’s the problem:

PROBABILITY QUIZ: How randomness should almost never look random.
(spurious cancer cluster in Fooled by Randomness)@CutTheKnotMath pic.twitter.com/1Svrn0IB69
— Nassim Nicholas Taleb (@nntaleb) November 2, 2017

I like to visualize things, so I decided to simulate throwing 8 darts on a board first, then simulate the dart throwing model many times using the Monte Carlo method.

Simulating dart throws

This simulation uses the raster and tidyverse packages¹. The raster package simulates throwing darts, while the tidyverse gives us flexible ways to filter the data we’re generating.

We need to do four main things:
1. Create a 4×4 grid that will function as our dart board
2. Generate 8 random points on the grid that function as our dart throws
3. Count the number of darts in each section of our dart board
4. Visualize the results
Constructing a dart board

To build the dart board, we can use the raster package:
```
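library(raster)     # rasters and extents; also attaches sp, which provides spsample
library(tidyverse)  # tibbles and filter
# create a raster with a 4x4 grid
board <- raster(xmn=0, ymn=0, xmx=1, ymx=1, res=0.25)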
```
Generating dart throws

We need to generate an extent from our dart board raster. This forms a bounding box around our dart board raster, so we can calculate coordinates within the extent that map to the raster. To generate a spatial sample, we pass in our dart board extent², the number of times we want to throw darts, and our method for throwing the darts, which is random.
```
# throw our darts on the dart board 
throws <- spsample(as(extent(board), 'SpatialPolygons'), 8, 'random')
```
Counting darts

The raster package has a function called rasterize that will tabulate how many times each grid was hit by a dart.³
```
# calculate which grid each of our darts hit 
board <- rasterize(throws, board, fun='count', background = 0)
```
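As a rough illustration of what this counting step amounts to, here is a base-R sketch that does not use the raster package. It assumes each dart is a uniform random (x, y) point on the unit square and that grids are 0.25 wide; the names x, y, cell_col, cell_row, and counts are made up for this example.

```r
# Base-R sketch of the counting step (a stand-in under stated assumptions,
# not what rasterize does internally)
set.seed(42)                      # arbitrary seed so the sketch is repeatable
x <- runif(8)                     # 8 dart throws: x coordinates in (0, 1)
y <- runif(8)                     # matching y coordinates in (0, 1)

# map each coordinate to a column/row index from 1 to 4 (grids are 0.25 wide)
cell_col <- floor(x / 0.25) + 1
cell_row <- floor(y / 0.25) + 1

# 4x4 table of dart counts; the factor levels keep empty grids as zeros
counts <- table(factor(cell_row, levels = 1:4),
                factor(cell_col, levels = 1:4))
print(counts)
```

The table always sums to 8, and any entry of 3 or more would count as a hit for Taleb’s problem. Row 1 in this sketch is the bottom of the board, whereas a raster numbers its rows from the top.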
Visualizing the dart board

Let’s take a look at our dart board now:
```
# plot the board 
plot(board)
# plot the points 
points(throws, pch=20)
```
A visualization of our dart board, with points marking where our darts landed and counts of how many darts hit each grid.

In this example, none of our grids were hit by 3 or more darts.

Simulate our dart board model many times

Now that we can visualize hitting one dart board with 8 darts, we want to determine the likelihood of one of our grids getting hit by 3 or more darts, per Taleb’s challenge. One simulation, shown above, gave us a little information about this probability. Using the Monte Carlo simulation method, we can run many simulations rapidly, then compare the results to get a much better idea of this probability. While we’ll never prove the correct answer with this method, it will give us a practical answer very quickly.⁴

Turn our earlier work into a function

To create a Monte Carlo simulation, we’re going to turn our earlier R snippets into a function that returns TRUE or FALSE, depending on whether the dart board contains any grids with 3 or more darts:
```
throwDarts <- function(raster, n_throws) {
  # throw our darts on the dart board passed in as `raster`
  throws <- spsample(as(extent(raster), 'SpatialPolygons'), n_throws, 'random')
  # calculate which grid each of our darts hit
  board <- rasterize(throws, raster, fun='count', background = 0)
  # save board to a table
  board_table <- as_tibble(data.frame(coordinates(board), count=board[]))
  # return TRUE if there are 3 or more darts in any grid, else return FALSE
  nrow(board_table %>% filter(count >= 3)) > 0
}
```
The primary changes in this function are:
1. We’re saving the results of the board after counting the darts in each grid to a table, instead of plotting them in R.
2. We’re evaluating whether or not there are 3 or more darts in any grid within the function and returning TRUE or FALSE depending on the results.
3. The parameter raster needs to be passed in (we’ll use our board raster from earlier), as well as the number of throws, n_throws.
We can run this function with throwDarts(board, 8).

Building the Monte Carlo simulator

Now that we have the throwDarts function, all we need to do is run it many times, count the cases where it returns TRUE, and divide by the number of simulations. This gives an estimated probability that 3 or more darts hit the same grid in a single simulation.

Since we already did the heavy lifting, the Monte Carlo simulator is easy to build:
```
dartSims <- function(raster, n_throws, n_sims) {
  # save an empty vector to store sim results
  boardEval <- c()
  # for loop to run throwDarts sims
  for (i in seq_len(n_sims)) {
    boardEval <- c(boardEval, throwDarts(raster, n_throws))
  }
  # calculates the frequency of boards with 3 or more darts in a single grid
  hitRate <- sum(boardEval)/length(boardEval)
  # returns this value when the function is evaluated
  return(hitRate)
}
```
The bulk of the work is done in the for loop, which runs through our throwDarts function as many times as we tell it to, storing the results in the boardEval vector. Since TRUE values equal 1 and FALSE values equal 0 in R, we can write a quick one-liner to add up all the boards that have TRUE values, then divide that by the number of boards we simulated. This gives us a frequency for how often boards have a grid with 3 or more darts!
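Because TRUE counts as 1, the same tally can also be written with replicate() and mean(). The sketch below is a base-R variant under stated assumptions: throwDartsBase is a hypothetical stand-in for throwDarts that picks one of 16 equally likely grids per dart instead of sampling points on a raster, so it runs without any packages.

```r
# Hypothetical base-R stand-in for throwDarts: one grid index per dart
throwDartsBase <- function(n_cells, n_throws) {
  hits <- sample.int(n_cells, n_throws, replace = TRUE)
  # tabulate counts darts per grid; TRUE if any grid has 3 or more
  any(tabulate(hits, nbins = n_cells) >= 3)
}

set.seed(2001)  # arbitrary seed for repeatability
# one TRUE/FALSE per simulated board
results <- replicate(10000, throwDartsBase(16, 8))
# the mean of a logical vector is the fraction of TRUE values
hitRate <- mean(results)
print(hitRate)
```

Here mean(results) is the same calculation as sum(results)/length(results), and replicate() avoids growing a vector inside a loop.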

Final calculation

Running dartSims(board, 8, 100) will simulate throwing 8 darts at our dart board 100 times, returning the fraction of boards that have a grid with 3 or more darts. However… this value is unlikely to be the best answer! Running more simulations increases the likelihood of attaining a closer approximation of the true probability.

This is where judgement needs to come into play, as increasing the number of simulations increases the time and computing power required. There are diminishing returns past a certain point, too, so starting smaller and increasing until things start to stabilize is a good approach.
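The diminishing returns can be made concrete with the standard error of an estimated proportion, sqrt(p(1 - p) / n). The sketch below assumes a true probability of about 0.169 and independent simulations; n_sims and std_error are names made up for the illustration.

```r
p <- 0.169                               # assumed probability of a grid with 3+ darts
n_sims <- c(100, 1000, 10000, 100000)    # candidate simulation counts
std_error <- sqrt(p * (1 - p) / n_sims)  # standard error of the estimated proportion
print(data.frame(n_sims, std_error))
```

Each tenfold increase in simulations shrinks the error by a factor of only about 3.2 (the square root of 10), from roughly 0.037 at 100 simulations to roughly 0.0012 at 100,000.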

dartSims(board, 8, 100000) gives a good approximation given these tradeoffs, with a value of 0.16912⁵. Taleb tweeted out that the solution is 0.169057, so this is a fantastic estimate for our purposes.
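
For comparison, a figure like Taleb’s can be reached by direct counting. This is a sketch under the assumption that each of the 8 darts lands independently in one of 16 equally likely grids; it is not necessarily how Taleb derived his answer. The idea is to count the arrangements in which no grid holds 3 or more darts (k grids hold exactly two darts and the remaining darts each sit alone), then subtract that share from 1.

```r
# Exact check, assuming 8 independent darts and 16 equally likely grids
n_cells <- 16
n_darts <- 8

# k = number of grids holding exactly two darts (0 through 4)
k <- 0:(n_darts %/% 2)

# choose the k doubled grids, choose the grids holding one dart each,
# then assign the distinguishable darts to those slots
ways <- choose(n_cells, k) *
  choose(n_cells - k, n_darts - 2 * k) *
  factorial(n_darts) / 2^k

p_no_triple <- sum(ways) / n_cells^n_darts  # no grid has 3 or more darts
p_triple <- 1 - p_no_triple                 # at least one grid has 3 or more
print(round(p_triple, 6))
```

This prints 0.169057, which agrees with the value Taleb gave.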

Recap

We covered a lot of ground, first building a visualization of throwing darts at a virtual dart board before abstracting the dart throwing into a function that we could use in a Monte Carlo simulation. While we didn’t answer Taleb’s question definitively, we did calculate a practical answer quickly.

The Monte Carlo method is portable to similar problems, too. We sneakily implemented a technique that will work for geospatial problems, so keep an eye out for future write-ups along those lines (no pun intended). You can get the complete R script here.
1. One of the best things about R is its ecosystem. There’s a package for almost any statistical or visualization task you can think of. ↩︎
2. You may notice the 'SpatialPolygons' argument, too. It converts the dart board’s extent into a polygon object, which is a format the sampling function can draw random points from. ↩︎
3. We’re using fun='count' because we want the counts of darts hitting each grid. background=0 tells rasterize to set grid values to 0 for the lonely grids that don’t get hit by a dart. ↩︎
4. As a quick philosophical aside, a practical, mostly correct answer now is good enough for me in most situations. ↩︎
5. Running set.seed(2001) just before executing dartSims(board, 8, 100000) allows you to reproduce this result. R’s random functions aren’t truly random; they use a pseudorandom number generator behind the scenes. set.seed lets us reproduce results by resetting that generator to a known starting point. ↩︎
Equality in uncertainty

Wednesday, March 29, 2017
Nassim Nicholas Taleb is writing a new book, Skin in the Game. Taleb discusses the concept of equality in uncertainty in one of the book’s excerpts, and I have some thoughts on how the internet is making equality in uncertainty possible in a way that was not previously possible.

Information asymmetry in a sale isn’t fair. Taleb shares a few punchy anecdotes that drive this point home. But to what extent must we level the informational playing field?

Taleb shares a debate between two ancient Stoic philosophers, Diogenes and Antipater, about what degree of information an ethical salesperson must share:

Assume a man brought a large shipment of corn from Alexandria to Rhodes, at a time when corn was expensive in Rhodes because of shortage and famine. Suppose that he also knew that many boats had set sail from Alexandria on their way to Rhodes with similar merchandise. Does he have to inform the Rhodians? How can one act honorably or dishonorably in these circumstances?

Each philosopher promoted a valid viewpoint:

Diogenes held that the seller ought to disclose as much as civil law would allow. Antipater believed that everything ought to be disclosed – beyond the law – so that there was nothing that the seller knew that the buyer didn’t know.

In theory, Antipater’s view resonates with me. In practice, I can see how this view could quickly become prohibitive to the seller.

I believe buyers need access to information, coupled with competency to filter and interpret that information, to satisfy Antipater’s threshold of ethical selling.

Very rarely does a situation present itself that is as clear cut as the man with the shipment of corn. The corn seller would be taking advantage of the Rhodians because they did not have access to critical information that would affect their decision to purchase his corn.

I think this last part – access to information – is key, because the internet has fundamentally removed barriers to acquiring most information. While the internet has not removed information asymmetry entirely, it has begun to level the playing field.

Imagine that the Rhodians in Taleb’s story are living in 2017. If the other corn sellers are en route to Rhodes from Alexandria and are sending tweets and posting to Instagram that they’re on their way, the first corn seller could sell without disclosing that more corn is on the way¹ because there is no information asymmetry.

When information is not readily accessible, or when the buyer is unlikely to be able to interpret it², the seller must step in and provide the buyer with filtered information so that each party shares equality in uncertainty.

I could see how providing each buyer with this filtered information could have been prohibitive for a seller in the past, but software can now automate most of this process. This automation makes it more feasible to be an ethical salesperson in 2017 than in the days of Diogenes and Antipater, because there’s no excuse to withhold relevant information from buyers.

I’m looking forward to reading Skin in the Game. In the meantime, I’ll keep chewing on the concept of equality in uncertainty.
1. This assumes the Rhodians had access to the internet and were acting rationally. ↩︎
2. Filtering and interpreting information is getting harder every day as more and more information is thrown at us from the internet. Information is essentially a commodity and our ability to process it is now the constraint. ↩︎

Hi, I’m Chip

I work at Uber, where I currently lead strategy and planning for sales operations within our Delivery business. Prior to Uber, I helped launch a new product as a Venture for America Fellow and later co-founded a workplace productivity startup.

I enjoy reading, programming, thinking through business models, and being an arm-chair football coach (Go Blue!). I am always chasing the perfect cup of coffee.

Here’s more about what I’m up to now.

Elsewhere on the Web

@chipkoziara on Twitter, LinkedIn, and GitHub
