Dream Data: Text Mining a Historic Speech

Figures: bar plot of the most frequent words; word cloud of the MLK speech.

This project was my attempt to turn one of the most powerful speeches in American history into… data! Using text mining techniques, I analyzed Dr. Martin Luther King Jr.’s iconic “I Have a Dream” speech to uncover which words stood out the most and how they were used together to deliver such a timeless message.

The process involved all the classic NLP moves: cleaning the text (sorry, punctuation), removing stopwords (you’re great, “the,” but not helpful here), and stemming words down to their roots. Once prepped, I built a term-document matrix and created visualizations like word clouds and bar plots to spotlight words like “freedom,” “dream,” and “will,” which, unsurprisingly, were front and center.
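To make those steps concrete, here is a minimal sketch of the clean, filter, stem, and count pipeline. It is not the project's actual code. To stay self-contained it uses only the Python standard library, so the tiny `STOPWORDS` set and the `crude_stem` helper are hypothetical stand-ins for a full stopword list and a real stemmer (such as a Porter stemmer), and the `sample` text is just a few lines from the speech rather than the whole thing. With a single document, the term-document matrix reduces to one column of term counts, which is what the `Counter` holds.

```python
import re
from collections import Counter

# Tiny illustrative stopword set (an assumption for this sketch);
# a real run would use a fuller list from an NLP library.
STOPWORDS = {"a", "the", "of", "from", "i", "have", "let",
             "and", "to", "in", "that", "this"}


def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Leave words like "glorious" or "pass" alone.
        if suffix == "s" and word.endswith(("ss", "us")):
            continue
        # Only strip when at least three letters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, drop punctuation and digits, remove stopwords, stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


def term_frequencies(text):
    """Count stemmed terms; for one document this is the single
    column of a term-document matrix."""
    return Counter(preprocess(text))


# A few lines from the speech, used here only as sample input.
sample = (
    "Let freedom ring from the prodigious hilltops of New Hampshire. "
    "Let freedom ring from the mighty mountains of New York. "
    "I have a dream today."
)

freqs = term_frequencies(sample)
for term, count in freqs.most_common(3):
    print(term, count)
```

The resulting counts are exactly what a bar plot or word cloud needs as input: terms on one axis (or as the words drawn) and frequencies as the bar heights (or font sizes).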

I also explored how certain words connected with each other. For instance, “freedom” often appeared near “ring” and “dream,” showing how Dr. King’s vision was tightly interwoven with hope and action. The word cloud gave a powerful snapshot, while the bar chart backed it up with good ol’ numbers.
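One simple way to look at which words sit near each other is to count neighbors inside a small sliding window around a target word. The write-up does not say which association measure was used, so treat this as one possible approach rather than the method behind the results above. The `neighbors` function, its `window` parameter, and the short `sample` excerpt are all assumptions made for illustration.

```python
import re
from collections import Counter


def neighbors(text, target, window=2):
    """Count words that appear within `window` tokens of `target`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the target word itself
                counts[tokens[j]] += 1
    return counts


# Two lines from the speech, used here only as sample input.
sample = (
    "Let freedom ring from the prodigious hilltops of New Hampshire. "
    "Let freedom ring from the mighty mountains of New York."
)

near_freedom = neighbors(sample, "freedom", window=1)
print(near_freedom.most_common())
```

A wider `window` picks up looser associations at the cost of more noise. Running the same count after stopword removal would keep filler words like "let" from crowding out the meaningful neighbors.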

This project reminded me how impactful simple words can be when used with purpose — and how even legendary speeches can be broken down and better appreciated with the help of a little data science magic.

Tools used: R, NLTK, Matplotlib, Seaborn