It has been a while since I wrote my last blog post. This is because I rediscovered one of my great passions in life: coding. I have thus been spending all my free time lately programming and learning about data science: tons of math, statistics, machine learning algorithms, et cetera.
You can see what I’ve coded so far on github.com/domreichl and kaggle.com/domreichl. Mostly, however, I took a lot of online courses in the past few months (edX and coursera are outstanding platforms to learn about computer science and artificial intelligence).
Today, I finished a new little machine learning project. I started it because I was interested in how a natural language processing (NLP) model would cluster all the blog posts I have written to date, and also to learn about unsupervised learning in general. Here’s my Python code including explanations, plots, and results:
To see the interactive plot created in step #6, please visit this link.
I’m still in the process of learning to understand how all these models work in minute detail, so if you notice anything stupid in my notebook, I encourage you to tell me about it in the comments below.