
Week 3 | A look at data

The end of this week snuck up on me. Everything felt like it was moving very quickly and very slowly at the same time, likely because something in my personal life was weighing on me. Now that the event has passed and everything seems to be OK, I'm back to a fairly light and clear head, minus the side effects from getting my 3rd vaccine dose yesterday. Fever, body aches and fatigue, oh my!

The last 2 weeks have been filled with a joyous discovery: I'm incredibly interested in data analysis and engineering. Considering the basis on which I originally founded Firstbloom, this really shouldn't come as a surprise, but it is to me! I've also never felt smart enough to participate in the field. That is, until this last week!

At work we've been modeling a spaced-repetition system and trying to fit it within the "game" of our product. To get a better understanding of what the "game" should feel like given our constraints, I built a probability-based scenario and ran a few simulations so we could see what regression to the mean looks like after a few runs. Doing this lit me up, and gave me a huge amount of confidence that I can dig into data analysis and use it to become a better product thinker, programmer and logical communicator.
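For anyone curious what a simulation like this can look like, here's a minimal sketch. It is not the actual model from work: the 70% recall chance, the 20 cards per session, and the function names are all made-up assumptions for illustration. It simulates a bunch of sessions, tracks the running average, and checks what happens in the session right after an unusually good one.

```python
import random
from statistics import mean

# Hypothetical parameters, chosen only for illustration.
P_CORRECT = 0.7      # assumed chance of recalling any single card
CARDS_PER_SESSION = 20
N_SESSIONS = 1000


def simulate_session(n_cards: int, p_correct: float, rng: random.Random) -> float:
    """Return the fraction of cards answered correctly in one simulated session."""
    correct = sum(rng.random() < p_correct for _ in range(n_cards))
    return correct / n_cards


def running_means(scores: list[float]) -> list[float]:
    """Return the cumulative average score after each session."""
    total = 0.0
    averages = []
    for i, score in enumerate(scores, start=1):
        total += score
        averages.append(total / i)
    return averages


rng = random.Random(42)  # fixed seed so reruns give the same numbers
scores = [simulate_session(CARDS_PER_SESSION, P_CORRECT, rng) for _ in range(N_SESSIONS)]
averages = running_means(scores)

# Regression to the mean: look at the session that follows an unusually good one.
lucky = [i for i in range(len(scores) - 1) if scores[i] > 0.8]
lucky_scores = [scores[i] for i in lucky]
followup_scores = [scores[i + 1] for i in lucky]

print(f"Average after all sessions: {averages[-1]:.3f}")
print(f"Average of the 'lucky' sessions: {mean(lucky_scores):.3f}")
print(f"Average of the sessions right after them: {mean(followup_scores):.3f}")
```

Because every session here is independent, a great session says nothing about the next one, so the follow-up sessions drift back toward the underlying 70%. A real spaced-repetition model would presumably have recall chances that change over time, which this sketch ignores.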

At the same time, my brother-in-law is looking to get into the field of data analysis. With this serendipitous timing, and wanting to come up with some small projects just for practice, I thought: "wouldn't it be cool to analyze Rotten Tomatoes data to see if there are any trends in the top 100 comedies of all time?" With that thought, I brought him in to get some practice on the analysis side. I wrote a little web scraper in Python and started doing some incredibly basic analysis to get my feet wet. I'm sure we'll find nothing new here. It's mostly a hobby project for us to get some practice.
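To give a rough idea of the scraping side, here's a minimal sketch using only Python's standard library. It is not the scraper I actually wrote, and the HTML below is entirely made up: the `cast-member` class, the sample names, and the `CastParser` class are assumptions for illustration and don't reflect how Rotten Tomatoes structures its pages.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for this example only.
SAMPLE_HTML = """
<ul>
  <li class="cast-member">Example Actor One</li>
  <li class="cast-member">Example Actor Two</li>
  <li class="crew-member">Example Director</li>
</ul>
"""


class CastParser(HTMLParser):
    """Collect the text of every <li class="cast-member"> element."""

    def __init__(self):
        super().__init__()
        self._in_cast = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples.
        if tag == "li" and ("class", "cast-member") in attrs:
            self._in_cast = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_cast = False

    def handle_data(self, data):
        # Only keep non-empty text found inside a cast-member item.
        if self._in_cast and data.strip():
            self.names.append(data.strip())


parser = CastParser()
parser.feed(SAMPLE_HTML)
cast_names = parser.names
print(cast_names)
```

In a real scraper the HTML would come from a downloaded page instead of a hard-coded string, and the tag and class checks would need to match whatever the page actually uses.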

To start, I thought it would be fun to see if there were any commonalities when it comes to cast & crew. I'd love to add on to this by running some sentiment analysis on the critic reviews, and by checking whether there's a seasonal correlation between when movies were released and how well they were received. For now, we're keeping it super light. Here are some super early findings, mostly so I have something to post this week. 😬

From some really early data mining, I threw together 2 charts.

The first chart shows the top 10 most frequent actors in the top 100 comedies by number of acting credits:

[Figure: A bar graph showing the most frequent actors in the top 100 comedies]

Here's the data in table form for those with screen readers!

Actor | Number of movies
John Ratzenberger | 8
John Lasseter | 7
Andrew Stanton | 6
Wallace Shawn | 5
Bill Murray | 4
Bonnie Hunt | 4
Tom Hanks | 4
Henry Bergman | 4
Joan Cusack | 4
Annie Potts | 4
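Counts like these are easy to produce once the cast lists are scraped. Here's a minimal sketch of the idea using `collections.Counter`. The movie titles and names below are placeholders, not the real scraped data, and the `movies` structure is just an assumption about how the data could be stored.

```python
from collections import Counter

# Placeholder data standing in for the scraped list (not real results).
movies = [
    {"title": "Movie A", "cast": ["Actor X", "Actor Y"]},
    {"title": "Movie B", "cast": ["Actor X", "Actor Z"]},
    {"title": "Movie C", "cast": ["Actor X", "Actor Y", "Actor W"]},
]

# Count each person at most once per movie, even if they are listed twice.
credits = Counter(name for movie in movies for name in set(movie["cast"]))

# most_common(10) returns (name, count) pairs, highest count first.
top_10 = credits.most_common(10)

for name, count in top_10:
    print(f"{name}: {count}")
```

The same approach works for directing credits by swapping the cast list for a list of directors.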

The second chart shows the top 10 most frequent directors in the top 100 comedies by number of directing credits:

[Figure: A bar graph showing the most frequent directors in the top 100 comedies]

Once again in table form for screen readers.

Director | Number of movies
Charlie Chaplin | 4
Lee Unkrich | 3
Wes Anderson | 3
Howard Hawks | 3
Taika Waititi | 2
Brad Bird | 2
James Gunn | 2
Paul King | 2
John Lasseter | 2
George Cukor | 2

I was honestly really surprised to see John Ratzenberger at the top of the acting list. I was expecting someone like Bill Murray to be in the top 3. Though I'm sure that if this data set were broader and included multiple sources, the results would be drastically different! Something to consider if we decide to expand on this in the future...

All of the data taken for this mini project was scraped from this Rotten Tomatoes list.