For this week of Visual Analytics, we were asked to create a multivariate visualization with a dataset of our choice.
I decided to work with a dataset called nyc_squirrels.csv, which contains very detailed observations from squirrel watching in New York City's Central Park. Everything from what the squirrels were doing and what sounds they made to the exact geo coordinates of each sighting is noted down in the data.
Link to where I found the data: NYC Squirrels Data
With this data, I wanted to better understand the spatial distribution of squirrel sightings and see whether sightings in the AM differ from sightings in the PM.
To begin my analysis, I cleaned the data by removing entries with NA values and dropping variables that were not relevant to the analysis.
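The cleaning step could look roughly like the sketch below. It is not necessarily the exact code used for this post: the helper name clean_squirrels is hypothetical, and the column names long, lat, and shift are assumptions about how the coordinates and the AM/PM field are labeled in the file.

```r
# Hypothetical helper: keep only the columns needed for the plot,
# then drop any row that has an NA in one of them.
# The default column names (long, lat, shift) are assumptions about the CSV.
clean_squirrels <- function(df, keep = c("long", "lat", "shift")) {
  df <- df[, keep, drop = FALSE]          # drop variables not used in the analysis
  df[complete.cases(df), , drop = FALSE]  # remove entries with NA values
}

# Possible usage, assuming the file sits in the working directory:
# squirrels <- clean_squirrels(read.csv("nyc_squirrels.csv"))
```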
With the data ready, I used the ggplot2 package in R to graph the points:
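A plot along these lines can be sketched as follows. This is a minimal example rather than the exact code behind the figure: the function name plot_sightings, the column names long, lat, and shift, and the specific colors are all assumptions chosen for illustration.

```r
library(ggplot2)

# Hypothetical helper: scatter the sighting coordinates and color by shift.
# Assumes df has numeric columns long and lat and a shift column of "AM"/"PM".
plot_sightings <- function(df) {
  ggplot(df, aes(x = long, y = lat, color = shift)) +
    geom_point(alpha = 0.6, size = 1) +
    # Warm color for AM, cool color for PM (illustrative choices)
    scale_color_manual(values = c(AM = "orange", PM = "steelblue")) +
    # Approximate map projection so the park outline is not stretched
    coord_quickmap() +
    labs(
      title = "Squirrel Sightings in Central Park",
      x = "Longitude",
      y = "Latitude",
      color = "Shift"
    ) +
    theme_minimal()
}

# Possible usage, assuming a cleaned data frame called squirrels:
# print(plot_sightings(squirrels))
```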
Here is the visual:
As you can see, when the geo coordinate points are plotted, they make a rough outline of Central Park. The big empty gap you are seeing is the Jacqueline Kennedy Onassis Reservoir, so it makes sense that no squirrels were spotted there. For the most part, I do not see any particular difference in where squirrels were sighted in the AM versus the PM, but there do appear to be more sightings in the PM than in the AM.
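The impression that one shift has more sightings than the other can be checked with a quick count. The sketch below assumes the AM/PM field is a column named shift; the helper name count_by_shift is hypothetical.

```r
# Hypothetical helper: tally the number of sightings in each shift.
# Assumes df has a shift column holding "AM" or "PM".
count_by_shift <- function(df) {
  table(df$shift)
}

# Possible usage, assuming a cleaned data frame called squirrels:
# count_by_shift(squirrels)
```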
For fun, let's see what this plot looks like in Tableau with a map underneath the points:
See the map up close here: NYC Squirrel Sightings
Wrapping up, visualizing multiple variables together can be very helpful for understanding the subtle relationships between them. Comparing AM sightings to PM sightings, along with where they occurred in Central Park, gives a much better understanding of the dataset.
As for applying the 5 principles of design, alignment is used for the axis labels, legend, and title for better readability. For repetition, shape style, color, font size, and font type are kept consistent. To highlight the difference between day and night, I opted to use cool colors, most often associated with the night, for PM and warm colors for AM, which checks off the contrast requirement. Moving on to proximity, related visual elements like the legend entries are placed close together to show that they are connected. Lastly, with balance, I must admit that the Tableau visual is not as balanced as the ggplot visual. It has a very small legend, which gives it uneven visual weight. To fix this, I should think about adding more data elements to make the visual more balanced.
~ Katie