Visualizing Hierarchy - Fire Department Calls for Service

Chien-Yu Sung

# Introduction

The dataset used in this project is Fire Department Calls for Service, sourced from data.sfgov.org. The project visualizes the hierarchy of regions where incidents occurred, using hierarchical data visualization techniques to better understand the story behind the data. I used Python 3 to parse the original data into a JSON format that d3.stratify() can consume, and D3.js version 5 to implement the visualizations. The fonts on this website are served by Google Fonts. This is an open source project; all the source code can be found here.


# Dataset - Fire Department Calls for Service

Source: data.sfgov.org
Licence: ODC Public Domain Dedication and Licence
Type: CSV
Date: 04/18/2019

| Attribute | Value |
| --- | --- |
| Size (GB) | 1.7 |
| Rows | 49,366,176 |
| Columns | 34 |
| Data Types | String, Number, Datetime, Boolean, Location (Latitude and Longitude) |

Data Processing:
I used Python to drop unused columns and process each row into a hierarchy of parents and children that d3.stratify() can consume. I tried to make the script reusable and as comprehensive as possible: you can specify the input file, the output filename, the columns to use, the order of the hierarchy, whether to add a root, and a lookup option that merges equivalent values such as San Francisco and SF. See source code here.
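The processing described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function name `build_records`, the path-style ids joined with `/`, the `California` root, and the sample rows are all assumptions made for the example. The only D3-specific fact relied on is that d3.stratify() reads `id` and `parentId` by default and accepts an empty `parentId` for the root.

```python
import json
from collections import Counter

def build_records(rows, columns, root=None, lookup=None, sep="/"):
    """Flatten rows into id/parentId records, one per distinct hierarchy path.

    columns: column names ordered from the top level down to the leaf level.
    root:    optional name of an artificial root; without one, several
             top-level values would each become a root, which stratify rejects.
    lookup:  optional mapping that merges aliases, e.g. {"SF": "San Francisco"}.
    """
    lookup = lookup or {}
    nodes = {}          # path id -> record, in first-seen order
    counts = Counter()  # leaf path id -> number of source rows
    if root is not None:
        nodes[root] = {"id": root, "parentId": "", "name": root}
    for row in rows:
        parent = root if root is not None else ""
        for column in columns:
            value = row[column].strip()
            value = lookup.get(value, value)
            # The id is the full path, so equal names under different parents stay distinct.
            node_id = parent + sep + value if parent else value
            nodes.setdefault(node_id, {"id": node_id, "parentId": parent, "name": value})
            parent = node_id
        counts[parent] += 1  # after the loop, parent is the leaf for this row
    return [dict(node, count=counts[node["id"]]) for node in nodes.values()]

# Hypothetical sample rows; the real input is the CSV from data.sfgov.org.
columns = ["City", "Neighborhooods - Analysis Boundaries", "Zipcode of Incident"]
rows = [
    {columns[0]: "San Francisco", columns[1]: "Mission", columns[2]: "94110"},
    {columns[0]: "SF", columns[1]: "Mission", columns[2]: "94110"},
    {columns[0]: "Daly City", columns[1]: "Outer Mission", columns[2]: "94014"},
]
records = build_records(rows, columns, root="California", lookup={"SF": "San Francisco"})
print(json.dumps(records, indent=2))
```

Using the full path as the id is one possible design choice; it keeps every record's `parentId` unambiguous even when the same name occurs in more than one branch.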

Used Columns:

| Column | Interpretation |
| --- | --- |
| City | City of incident. |
| Neighborhooods - Analysis Boundaries | Neighborhood District associated with this address. |
| Zipcode of Incident | Zipcode of incident. |

# D3 Visualizations

Interpretation

These charts show the hierarchy of the same dataset with different encodings. The root is at the top of the Dendrogram, is the outermost square of the Treemap, and is at the center of the Circular Tree and Sunburst. Leaves are the purple nodes. The data is encoded as State -> City -> Neighborhood -> Zipcode. Colors are based on the depth of the node: the root is depth 0 and leaves are depth 3. The tooltip shows the name of the node and the total number of records across all of its children. The size of each leaf indicates the number of records for that Zipcode in its neighborhood.
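The depth used for coloring and the rolled-up record count shown in the tooltip can be illustrated with a small Python sketch. It assumes a flat list of records with `id`, `parentId`, and a leaf-level `count`; the function name `annotate`, the ids, and the counts are made up for the example. In the browser, D3's own hierarchy helpers (`node.depth` and `node.sum()`) provide the same information.

```python
def annotate(records):
    """Return (depth, total): depth per node and record count rolled up from descendants."""
    by_id = {r["id"]: r for r in records}
    depth = {}
    total = {r["id"]: r["count"] for r in records}

    def depth_of(node_id):
        # The root has an empty parentId and depth 0; every child is one deeper.
        if node_id not in depth:
            parent = by_id[node_id]["parentId"]
            depth[node_id] = 0 if parent == "" else depth_of(parent) + 1
        return depth[node_id]

    # Visit the deepest nodes first so a child's total is final before it is added to its parent.
    for node_id in sorted(by_id, key=depth_of, reverse=True):
        parent = by_id[node_id]["parentId"]
        if parent != "":
            total[parent] += total[node_id]
    return depth, total

# Illustrative records only; the counts are not taken from the dataset.
records = [
    {"id": "CA", "parentId": "", "count": 0},
    {"id": "CA/San Francisco", "parentId": "CA", "count": 0},
    {"id": "CA/San Francisco/Mission", "parentId": "CA/San Francisco", "count": 0},
    {"id": "CA/San Francisco/Mission/94110", "parentId": "CA/San Francisco/Mission", "count": 5},
    {"id": "CA/San Francisco/Mission/94103", "parentId": "CA/San Francisco/Mission", "count": 3},
]
depth, total = annotate(records)
print(depth["CA"], depth["CA/San Francisco/Mission/94110"], total["CA"])
```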

Discussion

The first thing I learned from this dataset is that one Zipcode can belong to several neighborhoods. Moreover, one neighborhood can belong to more than one city; for example, Outer Mission is in both San Francisco and Daly City. This makes the data wrangling hard, because a child could easily be placed under the wrong parent. I used a top-down approach to parse and generate the dataset, which avoids this problem. I believe the hierarchy of neighborhoods and cities was restructured at some point over the years: there is a city with an empty name on the second level, and some cities from the second level became children of that empty city. Overall, most of the calls for service happened in the city of San Francisco.
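One way to spot the overlaps described above is to group each child value by the set of parents it appears under. The sketch below is only an illustration: the helper name `find_multi_parent` and the sample rows are assumptions, and the only pairing taken from the write-up is Outer Mission appearing under both San Francisco and Daly City.

```python
from collections import defaultdict

def find_multi_parent(rows, child_column, parent_column):
    """Map each child value to its parents, keeping only children with more than one parent."""
    parents = defaultdict(set)
    for row in rows:
        parents[row[child_column]].add(row[parent_column])
    return {child: sorted(found) for child, found in parents.items() if len(found) > 1}

# Hypothetical rows; only the Outer Mission overlap comes from the discussion above.
neighborhood = "Neighborhooods - Analysis Boundaries"
rows = [
    {"City": "San Francisco", neighborhood: "Outer Mission"},
    {"City": "Daly City", neighborhood: "Outer Mission"},
    {"City": "San Francisco", neighborhood: "Mission"},
]
ambiguous = find_multi_parent(rows, child_column=neighborhood, parent_column="City")
print(ambiguous)
```

Any child reported here cannot be identified by its name alone, so its node id would need to include its parent to stay unique in the tree.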

Interactivity

0. Use the dropdown to zoom into different children; the legend updates to match.
1. Hovering over a node highlights it and all of its ancestors, along with the path to the root.
2. Hovering over a node shows its name and how many records it contains.

Credits

0. Java 11 API Hierarchy - Data Wrangling by Sophie Engle
1. Java 11 API Hierarchy - Visualization by Sophie Engle
2. Sunburst Tutorial by David Richards


Expected Grade

Done? Letter Justification
A+ Completed four types of hierarchy visualizations, two of which use polar coordinates. Added node-to-root highlighting and zooming interactivity. Included write-ups on interpretation and what I learned. Included context such as titles and legends.

# About Me

Chien-Yu Sung
An enthusiastic and responsible man with ambition and creativity. Capable of working as a dedicated team member as well as an independent initiative taker. Proud Taiwanese! Currently a second-year master's student in Computer Science at the University of San Francisco, graduating on May 18, 2019.

[ Website ]

[ Interests ]
Distributed System
Site Reliability
Cybersecurity
Data Visualization
Board Games