Notebook Three | Repository

Tags vectorisation

Andrea Leone
University of Trento
January 2022


Analyse the tag distribution in the database:
extract all tags of each talk and store them in a set
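
A minimal sketch of this step, assuming each talk is a record whose tags are stored as one comma-separated string. The `talks` list and the `parse_tags` helper are hypothetical stand-ins for the database query, which is not shown here.

```python
# Hypothetical records: the real notebook would read these from the database.
talks = [
    {"title": "Talk A", "tags": "science, technology, AI"},
    {"title": "Talk B", "tags": "society, culture"},
    {"title": "Talk C", "tags": "technology, society, design"},
]

def parse_tags(raw):
    """Split a comma-separated tag string into clean, lower-cased tags."""
    return [t.strip().lower() for t in raw.split(",") if t.strip()]

# Collect the union of all tags across talks; a set drops duplicates.
all_tags = set()
for talk in talks:
    all_tags.update(parse_tags(talk["tags"]))

print(sorted(all_tags))
```

Lower-casing before insertion keeps "AI" and "ai" from being counted as two distinct tags.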


Get the frequency of each tag in the set
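
One way to count the tags is `collections.Counter` from the standard library. The `talk_tags` list below is made-up sample data, and counting each tag at most once per talk is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical sample: one list of tags per talk.
talk_tags = [
    ["science", "technology", "ai"],
    ["society", "culture"],
    ["technology", "society", "design"],
]

# Count in how many talks each tag appears (set() ignores repeats within a talk).
tag_counts = Counter(tag for tags in talk_tags for tag in set(tags))

for tag, n in tag_counts.most_common():
    print(f"{tag:<12} {n}")
```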


Plot the tag frequency distribution
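
The plotting library used in the notebook is not shown, so the sketch below avoids assuming one and draws a plain-text bar chart instead. The counts are invented and `text_histogram` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical frequencies, as produced by the counting step.
tag_counts = Counter({"technology": 8, "science": 6, "society": 4, "design": 1})

def text_histogram(counts, width=40):
    """Return one bar per tag, most frequent first, scaled to `width` characters."""
    peak = max(counts.values())
    lines = []
    for tag, n in counts.most_common():
        bar = "#" * max(1, round(width * n / peak))
        lines.append(f"{tag:<12} {bar} {n}")
    return lines

print("\n".join(text_histogram(tag_counts)))
```

Sorting by frequency makes the long tail of rarely used tags visible at a glance.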


The tags are numerous and varied: select a handful and check how often each occurs.


Create a dictionary with three main categories, each collecting the tags that describe or concern it.
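
A sketch of such a dictionary. The category names and tag lists are placeholders, since the notebook's actual choices are not shown here. The inverted index `tag_to_category` is an addition of this sketch for quick look-ups.

```python
# Placeholder categories and tags (assumptions, not the notebook's real lists).
categories = {
    "science":    {"science", "biology", "physics"},
    "technology": {"technology", "ai", "design"},
    "society":    {"society", "culture", "politics"},
}

# Inverted index: tag -> category name.
tag_to_category = {
    tag: name for name, tags in categories.items() for tag in tags
}

# Sanity check: a tag listed under two categories would be silently overwritten above.
total = sum(len(tags) for tags in categories.values())
assert len(tag_to_category) == total, "a tag appears in more than one category"

print(tag_to_category["ai"])
```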


Assign each record to one of the three categories
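
The assignment rule is not spelled out, so the sketch below assumes a simple one: pick the category that shares the most tags with the record. Ties go to the category listed first, and a record with no matching tag gets `None`. The category contents and `assign_category` are hypothetical.

```python
# Placeholder categories (assumed for illustration).
categories = {
    "science":    {"science", "biology", "physics"},
    "technology": {"technology", "ai", "design"},
    "society":    {"society", "culture", "politics"},
}

def assign_category(tags, categories):
    """Return the category sharing the most tags with `tags`, or None if none match."""
    tags = set(tags)
    best, best_overlap = None, 0
    for name, vocab in categories.items():
        overlap = len(tags & vocab)
        if overlap > best_overlap:  # strict '>' keeps the first category on ties
            best, best_overlap = name, overlap
    return best

print(assign_category(["ai", "design", "culture"], categories))
```

Records that come back as `None` can be inspected separately to decide whether a category's tag list needs extending.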


Explore the data distribution according to the three categories.
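
A quick way to look at the split is to count the labels and compute each category's share. The `labels` list is invented sample data, and the largest-to-smallest ratio is just one possible balance indicator.

```python
from collections import Counter

# Hypothetical labels, one per record, as produced by the assignment step.
labels = ["science", "technology", "society", "science",
          "technology", "society", "science"]

counts = Counter(labels)
total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Ratio between the largest and smallest class: 1.0 means perfectly balanced.
imbalance = max(counts.values()) / min(counts.values())

for name, n in counts.most_common():
    print(f"{name:<12} {n:>3}  {shares[name]:.1%}")
print(f"imbalance ratio: {imbalance:.2f}")
```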


The data seem reasonably balanced across the three categories. Now load the vectors and compress them to fewer dimensions.
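
The compression method is not named, so the sketch below swaps in principal component analysis (PCA), computed by power iteration in pure Python, as one plausible choice. The function `pca_project` and the random input vectors are assumptions. In practice a library implementation would be the more usual route.

```python
import math
import random

def pca_project(vectors, k=3, iters=200, seed=0):
    """Project `vectors` onto their top-k principal components (power iteration)."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    X = [[v[j] - mean[j] for j in range(d)] for v in vectors]  # centred data
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        w = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            # Remove the directions already found, so we converge to the next one.
            for c in components:
                dot = sum(a * b for a, b in zip(w, c))
                w = [a - dot * b for a, b in zip(w, c)]
            # Multiply by X^T X without building the d x d matrix.
            scores = [sum(a * b for a, b in zip(row, w)) for row in X]
            w = [sum(scores[i] * X[i][j] for i in range(n)) for j in range(d)]
            norm = math.sqrt(sum(a * a for a in w))
            if norm == 0:  # no variance left in the remaining directions
                break
            w = [a / norm for a in w]
        components.append(w)
    return [[sum(a * b for a, b in zip(row, c)) for c in components] for row in X]

# Hypothetical input: 30 six-dimensional vectors with most variance on the first axes.
rng = random.Random(1)
scales = [5.0, 3.0, 1.0, 0.1, 0.1, 0.1]
vectors = [[rng.gauss(0, s) for s in scales] for _ in range(30)]

reduced = pca_project(vectors, k=3)
print(len(reduced), len(reduced[0]))
```

Three output dimensions are chosen here because the result is meant to be shown in a 3D scatterplot.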


Arrange the data for a 3D-scatterplot and see the result
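
A sketch of the arranging step: group the compressed points by category so that each group can be drawn as its own coloured series. The points, labels and `group_for_scatter` helper are hypothetical. The commented lines show how the result might be handed to matplotlib, assuming that library is used.

```python
from collections import defaultdict

# Hypothetical compressed vectors (x, y, z) and their category labels.
points = [(0.1, 1.2, -0.3), (2.0, -0.5, 0.7), (-1.1, 0.4, 0.9), (0.3, 0.8, -0.2)]
labels = ["science", "technology", "society", "science"]

def group_for_scatter(points, labels):
    """Group 3D points by label into {label: {"x": [...], "y": [...], "z": [...]}}."""
    series = defaultdict(lambda: {"x": [], "y": [], "z": []})
    for (x, y, z), label in zip(points, labels):
        series[label]["x"].append(x)
        series[label]["y"].append(y)
        series[label]["z"].append(z)
    return dict(series)

series = group_for_scatter(points, labels)
print({label: len(s["x"]) for label, s in series.items()})

# With matplotlib installed, each group could be drawn as one series:
#   import matplotlib.pyplot as plt
#   ax = plt.figure().add_subplot(projection="3d")
#   for label, s in series.items():
#       ax.scatter(s["x"], s["y"], s["z"], label=label)
#   ax.legend(); plt.show()
```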