Thursday, January 11, 2024

LIS4317 - Visual Analytics - Module 1 Post

After searching the internet for the most interesting data visualizations, I decided to focus on "The 50 Most Visited Websites in the World" by Dorothy Neufeld and Joyce Ma of Visual Capitalist.

Here is the link to the visualization and their findings:

https://www.visualcapitalist.com/the-50-most-visited-websites-in-the-world/

I chose this particular visualization because its findings were easy to make sense of, and the idea of using bubbles to represent the number of monthly website visits made it very simple to interpret.

Relating this to Keim et al.’s (2008) definition of visual analytics, data has no direct value at first; it gains value only when an individual makes a point of getting information from it (p. 154). Here, the data in the visualization is monthly website visits, and applying Keim et al.’s definition, the monthly website visits do not mean much unless some meaning is given to them. In this case, the greater the number of website visits, the more popular the website is. We can see this idea represented by the size of the bubbles: for example, the YouTube bubble is smaller than the Google bubble.
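To make the bubble-size idea concrete, here is a minimal Python sketch of how a value could be mapped to a bubble. It assumes the common convention that a bubble's area (not its radius) is proportional to the value, which I have not confirmed is what Visual Capitalist used. The function name `bubble_radius`, the site names, and the visit counts are all hypothetical and are not figures from the visualization.

```python
import math


def bubble_radius(value, max_value, max_radius=100.0):
    """Return a radius such that the bubble's area is proportional to value.

    Assumption: area-proportional scaling, so the radius grows with the
    square root of the value. The largest value gets max_radius.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)


# Hypothetical monthly visit counts (NOT taken from the visualization).
visits = {"site_a": 100.0, "site_b": 25.0}

largest = max(visits.values())
radii = {name: bubble_radius(count, largest) for name, count in visits.items()}

for name, radius in radii.items():
    print(f"{name}: radius {radius:.1f}")
```

With this scaling, a site with a quarter of the visits gets a bubble with half the radius, so the areas (rather than the widths) are what a reader compares.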

Moving on to the definition of visualization, Keim et al. (2008) suggest that the act of visualization takes data, information, and their associated processes and makes them communicable so that individuals can obtain a greater understanding of, and clarity about, a particular dataset (pp. 155-157). When an individual quickly glances at the visualization, they can immediately determine that Google is the website with the greatest number of monthly visits just by looking for the largest bubble. By using bubbles, the visualization captures various websites and how they are related by company, and it makes the data more expressive as a result.

As for knowledge, Keim et al. (2008) continuously stress the importance of transforming “data into reliable and provable knowledge” through the use of methods and models (p. 155). Knowledge can be quite difficult to express, and without careful consideration of the visualization's design, the intended audience will end up confused and no knowledge will be extracted. Thus, one can infer that visualization is very important for allowing individuals to make sense of the data and derive information and knowledge from it. Visualization choices, such as how one represents the data, therefore go hand in hand with knowledge extraction. That is, how will certain design choices impact the readability and/or clarity of the data presented? Just looking at the visualization, it is clear the creators prioritized a design that is as easy to read as possible.

According to Keim et al. (2008), models play a significant role in data analysis (p. 161). From preparing a training dataset that teaches models to classify and/or predict data samples, to using association rules to identify the co-occurrence of data items, there is a wide variety of models serving many purposes. Sadly, not much is known regarding the techniques used to gather data for this particular visualization. While the data was retrieved by SimilarWeb, there is no documentation of how the data was gathered and recorded, or of whether any models were applied.
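Since association rules came up above, here is a small Python sketch of the two basic measures behind them, support and confidence. This is only an illustration of the general technique and has nothing to do with how SimilarWeb or Visual Capitalist handled their data. The helper names `support` and `confidence` and the browsing sessions are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate how often `consequent` appears when `antecedent` does."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical browsing sessions (made up for illustration only).
sessions = [
    {"search", "video"},
    {"search", "video", "shopping"},
    {"search", "shopping"},
    {"video"},
]

print(support(sessions, {"search", "video"}))
print(confidence(sessions, {"search"}, {"video"}))
```

Support measures how often the items co-occur across all sessions, while confidence measures how often the second item shows up in sessions that already contain the first.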


~ Katie

LIS 4370 R Programming - sentimentTextAnalyzer2 Final Project

For this class's major final project, I set out to make the process of analyzing textual files and URL links for sentiment insights much...