The Unexpected Lesson Within A Jelly Bean Jar (2024)

How Jelly Beans helped me understand a key Artificial Intelligence principle

At a livestock fair in late-Victorian Plymouth, England, a statistician named Francis Galton asked around 800 attendees to guess the weight of an ox on display. He then calculated the median of all estimates, which came out to 1,207 lbs¹. To his surprise, the measured weight of the ox was 1,198 lbs, putting the median estimate within ~0.8% of the real weight. As Galton himself noted¹:

…the middlemost estimate expresses the vox populi, every other estimate being condemned as too low or too high by a majority of the voters

This effectively means that as a group, or as a collection of independent thinkers, we are very, very good estimators.

As I love data and science, I wanted to replicate this experiment myself, so not long ago I did so at my office, in my own way: I ran the Jelly Bean Jar Game, which you might have heard of before.

I bought a jar and filled it with exactly 490 jelly beans (yes, I counted them all). Then, like Sir Francis Galton, I asked 30 of my co-workers to estimate the number of jelly beans in the jar. To my surprise, the distribution of estimates looked like this:

[Figure: distribution of the 30 estimates]

With a mean estimate of 487, only three jelly beans off from the ground truth! With this simple experiment, I became more and more convinced that the vox populi, or Wisdom of the Crowds¹ ², is a real thing.

As a group, we are very good estimators; individually, not so much.

NOTE: Patient individuals outperformed those who made wild guesses. In my experiment, some people measured the volume of the jar, estimated the volume of a single jelly bean, and extrapolated to the number of jelly beans in the jar. Others simply went, “Hmm, I don’t know… 1,000” (see the figure). Nonetheless, all estimates were centered around one value: the ground truth. Keep this in mind.
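The averaging effect is easy to simulate. The sketch below uses illustrative numbers, not the actual estimates from my experiment: it mixes "methodical" guessers (noisy but unbiased) with a few wild ones, and shows that the crowd mean still lands near the ground truth.

```python
import numpy as np

rng = np.random.default_rng(42)
ground_truth = 490  # jelly beans actually in the jar

# "Methodical" co-workers: unbiased guesses with individual noise.
careful = rng.normal(ground_truth, 60, size=25)
# Wild guessers: "Hmm, I don't know... 1,000" types.
wild = rng.uniform(100, 1000, size=5)
estimates = np.concatenate([careful, wild])

crowd_mean = estimates.mean()
individual_errors = np.abs(estimates - ground_truth)
print(f"crowd mean: {crowd_mean:.0f}")
print(f"crowd error: {abs(crowd_mean - ground_truth):.0f}")
print(f"median individual error: {np.median(individual_errors):.0f}")
```

The individual errors are large, but because they scatter around the truth, the mean of the crowd is typically far closer than most single guesses.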

In the rest of this essay I will compare this vox populi principle with one that has held my interest for a long time. It might sound crazy, but I think Artificial Neural Networks³ share common ground with it, especially because in both cases a collection of parts is given one single task and works together to solve it. I hope that you too feel this way by the end of the text.

A good way to start this comparison is by defining what neurons do in Artificial Neural Networks. I found this description rather compelling and simple to understand⁴:

Each neuron receives one or more input signals x₁, x₂, …, xₘ and outputs a value y to neurons of the next layer, and so forth. The output y is a nonlinear weighted sum of the input signals.

From this point of view, neurons in an ANN are the individuals of a collective mind. In fact, the de facto architecture of ANNs is a collection of connected individual regressors³. The output of a neuron with n input neurons is defined by⁵:

h_{W,b}(x) = f(Wᵀx + b) = f( Σᵢ₌₁ⁿ Wᵢxᵢ + b )

Each output h is then a function, with parameters W and b, of the sum of individual linear regressions over all inputs x, which in turn becomes the input (after an activation function, usually non-linear³ ⁶) of the next layer. The neurons collectively, and only collectively, solve tasks. Try building an ANN classifier for a complex task with one neuron; you are most probably going to fail. This would be like Galton asking one single person to estimate the ox’s weight: the estimate is probably going to be wrong. It is here that ANNs really work collectively. This concept can be visualized in the next example:
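That single-neuron computation, a nonlinear function of a weighted sum plus a bias, fits in a few lines. The weights and inputs below are illustrative values, and tanh stands in for whatever activation the network uses:

```python
import numpy as np

def neuron(x, W, b, f=np.tanh):
    """Output of one artificial neuron: a nonlinear
    function of the weighted sum of its inputs plus a bias."""
    return f(np.dot(W, x) + b)

x = np.array([0.5, -1.0, 2.0])   # input signals x1, x2, x3
W = np.array([0.2, 0.4, -0.1])   # weights (illustrative, not learned)
b = 0.1                          # bias term

y = neuron(x, W, b)
print(y)  # tanh(0.2*0.5 + 0.4*(-1.0) + (-0.1)*2.0 + 0.1) = tanh(-0.4)
```

A whole layer is just many of these evaluated in parallel, with each neuron's output feeding the next layer's inputs.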

[Figure: a trained neural network classifying a handwritten “2” from 784 input features]

In the image above, the trained NN takes as input 784 features from the image of a “2” and classifies it accordingly. The complexity of the system increases drastically with each added neuron, but so does the number of possible feature combinations, which effectively pushes up the performance of the classifier. Add too many, though, and you will be a victim of overfitting⁷. I recommend visiting the Google TensorFlow Playground to understand these and other concepts better; there you can see the effect each added (or removed) neuron has on a simple classifier. Try training the model with only the first two features (X₁ and X₂) and see the results. Now do it with more. Can you find the minimum number of neurons needed to get good results? Do you need many neurons/layers for simple tasks? The answer is no. I will get back to this in a moment.

Going back to oxen and jelly beans, this would be like finding the minimum number of individuals required for a very good estimate. Surely asking 10,000 people about the weight of the ox would reduce the error, but with 800 we are already within ~1% of the ground truth. Increasing the complexity of an algorithm is useful only while the desired output has not been reached. From there on, computationally speaking, it is best to reduce the number of estimators to the minimum required for the desired performance. The vox populi reduces the cost of the computation once this balance is found. To understand this, we can look at the next figure, which I quickly put together in Python:

[Figure: error of the sample mean versus number of samples, from 10 to 1,000]

We can create a set of random normal distributions with μ = 1 and σ = 0.1 while increasing the number of samples from 10 to 1,000. Because we know that the ground-truth mean is by design equal to 1, we can compute the average of each set and see how close it gets to μ. As you might have guessed, the more data we have the better: our estimate gets closer and closer to the ground truth. With infinite samples we would reach μ exactly, but this is impractical for obvious reasons. It might even be that 1,000 samples is too costly for whatever reason, and we decide to use the set with 500 points for our analysis, which yields an error that satisfies our needs. It is our sweet spot: general enough to maximize performance, but specific enough to minimize error. Artificial Neural Networks follow a similar (albeit not identical, mind you) principle.
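A minimal sketch of that experiment (the exact sample sizes and random seed here are my assumptions, not the ones behind the original figure):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 0.1

# For increasing sample sizes, check how close the sample mean gets to mu.
errors = {}
for n in (10, 50, 100, 500, 1000):
    samples = rng.normal(mu, sigma, size=n)
    errors[n] = abs(samples.mean() - mu)

for n, err in errors.items():
    print(f"n={n:4d}  |mean - mu| = {err:.4f}")
```

The error of the sample mean shrinks roughly like σ/√n, so going from 500 to 1,000 samples buys much less accuracy than going from 10 to 500, which is exactly the diminishing return discussed above.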

Although there are some general rules for how many neurons and layers you should use⁸, choosing these limits is a common problem that I frequently encounter while building Deep Neural Networks. Too many neurons and/or layers for a rather simple problem will probably cause severe overfitting (asking 10,000 individuals about the ox’s weight, or using 1,000 points or more in our previous example). Too few, and you will not be able to generalize your model to blind testing. In a way, then (and very generally speaking), ANNs feel comfortable with a balance of simplicity and complexity.

Going back to Google’s TensorFlow Playground, we can see that a simple ANN needs little time to reach low loss values on a very simple classification task:

[Figure: TensorFlow Playground — a simple network reaching low loss on a simple classification task]

Although trivial, this exemplifies perfectly the point I am trying to convey. Test and training loss reach ~0.05 in about 350 epochs (see the values just above the scatter plot). Now let’s see what happens with an overly complex ANN classifying the same data with the same parameters:

[Figure: TensorFlow Playground — an overly complex network on the same task, loss still high]

Almost 200 epochs in, and the loss values are still not at the levels of the previous example. If we wait long enough, though, the network does the job. Following our normal-distribution example from the previous paragraphs: you could use thousands of points to get a “better” estimate of μ, but the reduction in error would not compensate for the high cost. The same is happening here. Even setting aside other overfitting problems⁹, the latter architecture is too costly for such a task. The first architecture does the job perfectly at very low cost, so choosing it over the other is the wiser decision.

I like to think of AI models, and specifically Deep Neural Networks, as complex systems that should be built as simply as possible. Believe it or not, my Jelly Bean Jar experiment helped me understand this principle. Both cases require partitioning a certain task (just enough) to solve it collectively. This seems to be the best solution. As Albert Einstein himself noted in a lecture in 1933¹⁰:

It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.

I can’t argue with that. Can you?

Thank you for reading!

References:

[1] Galton, F. Vox populi (1907), Nature, 75(7), 450–451.

[2] Text on the story of Wisdom of the Crowds: https://towardsdatascience.com/on-the-wisdom-of-crowds-collective-predictive-analytics-302b7ca1c513

[3] ANN resources: https://towardsdatascience.com/nns-aynk-c34efe37f15a

[4] Koutsoukas, A., Monaghan, K. J., Li, X., & Huan, J. Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data (2017), Journal of cheminformatics, 9(1), 42.

[5] Great text on the foundations of Multilayer Neural Networks: http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/.

[6] Activation functions: https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6.

[7] A word on overfitting: https://www.jeremyjordan.me/deep-neural-networks-preventing-overfitting/

[8] https://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw

[9] https://towardsdatascience.com/preventing-deep-neural-network-from-overfitting-953458db800a

[10] Robinson, A. Did Einstein really say that? (2018) Nature, 557(7703), 30–31.
