Animé Text Analytics

Final project for the UIUC online course CS 410: Text Information Systems, taught by ChengXiang Zhai.
I worked with Karan Bokil (a.k.a. Q-ran) on this project.
We scraped the MyAnimeList website for short descriptions of animé shows, then pre-processed the text to remove stop words and other irrelevant words. Using the processed text, we ran LDA (Latent Dirichlet Allocation) to group the shows into topics, which serve as genres, based on the keywords in their descriptions. The website includes a nice visualization of the genres based on their keywords.
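
The pre-processing and topic-modeling steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It uses only the Python standard library and a small collapsed Gibbs sampler for LDA, whereas the project may well have used a library implementation. The stop-word list, the function names `preprocess` and `lda_gibbs`, the hyperparameters, and the sample descriptions are all assumptions made up for the example.

```python
import random
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the project's list is not shown).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "his", "her",
              "with", "who", "on", "for", "their", "during"}


def preprocess(text):
    """Lowercase, keep runs of ASCII letters as tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    Returns (doc_topic, topic_word_counts): per-document topic proportions
    and per-topic word counts. Hyperparameter defaults are illustrative.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic_counts = [[0] * num_topics for _ in docs]
    topic_word_counts = [Counter() for _ in range(num_topics)]
    topic_totals = [0] * num_topics
    assignments = []

    # Start with a random topic for every token.
    for d, doc in enumerate(docs):
        doc_assign = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assign.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][word] += 1
            topic_totals[k] += 1
        assignments.append(doc_assign)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                k = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][word] -= 1
                topic_totals[k] -= 1
                # Unnormalized probability of each topic given all other tokens.
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][word] + beta)
                    / (topic_totals[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][word] += 1
                topic_totals[k] += 1

    # Smoothed topic proportions for each document.
    doc_topic = [
        [(c + alpha) / (len(doc) + alpha * num_topics) for c in counts]
        for doc, counts in zip(docs, doc_topic_counts)
    ]
    return doc_topic, topic_word_counts


# Made-up descriptions standing in for the scraped data.
descriptions = [
    "A young pilot fights in a giant robot to defend the space colony.",
    "Robot pilots battle alien invaders in space.",
    "A high school girl falls in love with her classmate.",
    "Two classmates discover love during their last year of school.",
]
docs = [preprocess(d) for d in descriptions]
doc_topic, topic_words = lda_gibbs(docs, num_topics=2)
for k, counts in enumerate(topic_words):
    # Top keywords per topic, which is the kind of data a visualization could use.
    print(k, [w for w, c in counts.most_common(5) if c > 0])
```

Each row of `doc_topic` gives one show's mix of topics, and the most frequent words in each entry of `topic_words` are the keywords that characterize that topic. With a corpus this small the topics are not meaningful. The sketch only shows the shape of the computation.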