IS445: Homework #10

This was completed by Oliver Quarm and Dilan Patel

Our notebook is here. Alternatively, you can access it using the The Analysis button in the bottom right.

Below is an embedded version of the interactive chart we have created. This was done by saving the chart in the Jupter Notebook as a JSON file in the assets folder.

Because the second visualization is quite wide, you may need to scroll to the right to fully see it depending on the size of your screen/monitor.

Write-up

In our first plot, we are visualizing a bar graph of the amount of licenses in each state. We did this to see if there was any clear bias towards a certain state in this dataset, and we found that there is a massive bias towards Illinois. We made each state a different color on the chart to help the viewer differentiate the states easier and to enhance the visual appeal of the graph. One example of data transformation that we used for this visualization was filtering out all the licenses in this dataset that do not have their state listed – some rows had this State field empty, or as NaN, and were thus filtered out.

The second plot is linked to our first plot and will show the number of licenses in each county of the states that the viewer chooses (this selection is done via the interactivity tool, discussed below). With this plot, the viewer can go deeper into each state and see what counties within each state have the licenses in this dataset. For this visualization we also made each county have different color for the purpose of allowing the user to easily differentiate each county (and again enhance the visual appeal of the graph). We also used data transformations for this graph by filtering out the rows that did not have a county listed. Some rows had their County field as NaN and were filtered out accordingly.

To implement the interactivity feature, we used the brush tool, implementing it via the add_selection() function. It’s linked to the second visualization using the transform_filter() function. As discussed above, this allows the user to select a state and see the license distributions across all the counties (that have license entries in the dataset) in that state. By default (at the start when no selection is made), all counties are displayed with their corresponding license count.