
Getting Started with nycOpenData: NYPD Shooting Incident Data

Joyce Escatel Flores

knitr::opts_chunk$set(warning = FALSE, message = FALSE)
library(nycOpenData)
library(ggplot2)

Introduction

If you live or have lived in NYC, you know how densely populated it is. It is a beautiful city with many things to do. Unfortunately, in a city this large we also hear about crimes, and we may be curious about which types of crime occur and where. NYC publishes data on every shooting incident that has occurred in the city, including the date of the incident, the time it occurred, the borough where it occurred, and much more. To learn more, see the dataset's page on the NYC Open Data Portal. To explore the data in R, you can use the nycOpenData package to pull it directly.

By using the nyc_shooting_incidents() function, we can gather the most recent shooting incidents in NYC!

Pulling a Small Sample

To start, let’s pull a small sample to see what the data looks like. By default, the function pulls the 10,000 most recent incidents. Let’s change that to see only the latest 3 by setting limit = 3.

small_sample <- nyc_shooting_incidents(limit = 3)
small_sample
#> # A tibble: 3 × 13
#>   incident_key occur_date            occur_time boro  loc_of_occur_desc precinct
#>   <chr>        <chr>                 <chr>      <chr> <chr>             <chr>   
#> 1 318207675    2025-12-31T00:00:00.… 23:44:00   MANH… OUTSIDE           23      
#> 2 318203589    2025-12-31T00:00:00.… 18:40:00   MANH… INSIDE            32      
#> 3 318139227    2025-12-30T00:00:00.… 03:30:00   BROO… INSIDE            77      
#> # ℹ 7 more variables: jurisdiction_code <chr>, loc_classfctn_desc <chr>,
#> #   location_desc <chr>, x_coord_cd <chr>, y_coord_cd <chr>, latitude <chr>,
#> #   longitude <chr>

# Seeing what columns are in the data set
colnames(small_sample)
#>  [1] "incident_key"       "occur_date"         "occur_time"        
#>  [4] "boro"               "loc_of_occur_desc"  "precinct"          
#>  [7] "jurisdiction_code"  "loc_classfctn_desc" "location_desc"     
#> [10] "x_coord_cd"         "y_coord_cd"         "latitude"          
#> [13] "longitude"

We have successfully pulled NYPD Shooting Incident Data from the NYC Open Data Portal.
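
Note that the printout above shows every column stored as character (<chr>), so dates, times, and precinct numbers usually need converting before analysis. The following is a minimal sketch of one way to do that in base R. It assumes a small hand-built data frame (sample_rows, a hypothetical name) copied from the visible parts of the three rows above, not a live call to the API. The real occur_date values carry a longer timestamp suffix that is truncated in the printout, so the sketch keeps only the first 10 characters.

```r
# Toy rows copied from the visible parts of the sample output above.
# Assumption: real occur_date values have a longer suffix (truncated in the
# printout); only the leading "YYYY-MM-DD" part is used here.
sample_rows <- data.frame(
  incident_key = c("318207675", "318203589", "318139227"),
  occur_date   = c("2025-12-31T00:00:00", "2025-12-31T00:00:00", "2025-12-30T00:00:00"),
  occur_time   = c("23:44:00", "18:40:00", "03:30:00"),
  precinct     = c("23", "32", "77"),
  stringsAsFactors = FALSE
)

# Convert the character columns to types that are easier to analyze
typed_rows <- transform(
  sample_rows,
  occur_date = as.Date(substr(occur_date, 1, 10), format = "%Y-%m-%d"),
  occur_hour = as.integer(substr(occur_time, 1, 2)),  # hour of day, 0-23
  precinct   = as.integer(precinct)
)

str(typed_rows)
```

The same transform() call should work on the tibble returned by nyc_shooting_incidents(), provided the columns have the format shown in the printout.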

Mini analysis

Now that we have pulled the data, let’s do a quick analysis of where shooting incidents occur (column loc_of_occur_desc, either OUTSIDE or INSIDE) in each borough (column boro).

To do this, we will create a clustered bar graph.

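# Pull the 1,000 most recent incidents
shooting_data <- nyc_shooting_incidents(limit = 1000)

# Clustered bars: one bar per location type within each borough
ggplot(shooting_data, aes(x = boro, fill = loc_of_occur_desc)) +
  geom_bar(position = "dodge") +
  geom_text(
    stat = "count",
    aes(label = after_stat(count)),
    # Match geom_bar's default dodge width (0.9) so labels sit above their bars
    position = position_dodge(width = 0.9),
    vjust = -0.2,
    size = 3
  ) +
  labs(
    title = "Counts of Shooting Incidents",
    x = "Borough",
    y = "Number of shooting incidents"
  ) +
  theme_minimal()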

Clustered bar graph showing shooting incidents per borough, by location of the shooting.

This graph shows the number of shooting incidents in each borough, split by whether the incident took place inside or outside.
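
The counts behind a graph like this can also be read as a table. The sketch below uses base R's table() and prop.table() on a small made-up data frame (toy_shootings, with hypothetical values chosen only for illustration), so it runs without a network connection. On real data, shooting_data could be used in its place, assuming it has the boro and loc_of_occur_desc columns shown earlier.

```r
# Made-up rows for illustration only; these are NOT real incident counts
toy_shootings <- data.frame(
  boro = c("BRONX", "BRONX", "BROOKLYN", "BROOKLYN", "BROOKLYN", "QUEENS"),
  loc_of_occur_desc = c("OUTSIDE", "INSIDE", "OUTSIDE", "OUTSIDE", "INSIDE", "OUTSIDE"),
  stringsAsFactors = FALSE
)

# Cross-tabulate borough (rows) against location (columns)
location_counts <- table(toy_shootings$boro, toy_shootings$loc_of_occur_desc)
location_counts

# Share of each borough's incidents that occurred outside (row-wise proportions)
outside_share <- prop.table(location_counts, margin = 1)[, "OUTSIDE"]
round(outside_share, 2)
```

A table like this makes it easier to compare boroughs with very different totals, since the row-wise proportions put each borough on the same scale.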
