SemanticScuttle - klotz.me

Measuring similarity between unstructured text data, such as movie descriptions, is challenging.
Simple NLP methods may not yield meaningful results, so a controlled vocabulary is proposed.
An LLM generates a genre list for movie titles, which helps improve the similarity model. A function then finds the movies most similar to a given title, ranked by cosine similarity score. A network visualization highlights clusters of genres linked through shared movies, showing potential improvements for recommender systems.
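
A minimal sketch of the similarity lookup described above, assuming each movie already has genre labels from a controlled vocabulary. The titles, genres, and the names `MOVIE_GENRES`, `genre_vector`, `cosine_similarity`, and `most_similar` are hypothetical and are not taken from the linked article, which may encode genres differently.

```python
import math

# Hypothetical toy data: titles mapped to genre labels from a controlled vocabulary.
# In the summarized workflow these labels would come from an LLM.
MOVIE_GENRES = {
    "Movie A": {"action", "sci-fi"},
    "Movie B": {"action", "sci-fi", "thriller"},
    "Movie C": {"romance", "comedy"},
    "Movie D": {"comedy", "action"},
}


def genre_vector(genres, vocabulary):
    """Encode a set of genres as a multi-hot vector over the vocabulary."""
    return [1.0 if genre in genres else 0.0 for genre in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(title, movie_genres, top_n=3):
    """Return up to top_n (title, score) pairs, most similar first."""
    vocabulary = sorted(set().union(*movie_genres.values()))
    target = genre_vector(movie_genres[title], vocabulary)
    scores = [
        (other, cosine_similarity(target, genre_vector(genres, vocabulary)))
        for other, genres in movie_genres.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


result = most_similar("Movie A", MOVIE_GENRES, top_n=2)
for other, score in result:
    print(f"{other}: {score:.3f}")
```

Pairs with a nonzero score could also be written out as a weighted edge list and loaded into a graph tool such as Gephi to explore clusters.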

Tags: gephi*