In today’s tutorial I give you a quick introduction to data.table and show you how to filter rows, select columns, and compute basic statistics by group.
You can find the data set here: https://www.kaggle.com/russellyates88/suicide-rates-overview-1985-to-2016
Since I used a data set about suicide, it is important to point out that you should talk about suicidal thoughts if you have them. You can find a list of suicide hotlines for many countries here: http://ibpf.org/resource/list-international-suicide-hotlines
If your country isn’t listed there, please search online for "suicide hotline" and the name of your country.
The code:
setwd("PATH/TO/YOUR/WORKING/DIRECTORY")
library(data.table)
data <- data.table::fread("suicide-rates-overview-1985-to-2016/master.csv", check.names = TRUE)
str(data)
#filter rows
data[country == "Austria"]
#select columns
data[country == "Austria", .(Number_of_Suicides = suicides_no, Pop = population)]
#aggregate columns
data[country == "Austria", .(sum_of_suicides = sum(suicides_no), Pop = sum(population))]
#group by
data[country == "Austria", .(sum_of_suicides = sum(suicides_no)), by = c("year", "sex")]
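If you want to try these steps without downloading the data set first, here is a minimal sketch on a small made-up table. The table `toy` and all of its values are my own assumptions for illustration and are not taken from the real data. The last line uses `keyby`, which groups like `by` but also sorts the result by the grouping columns.

```r
library(data.table)

# Toy data standing in for master.csv; the values are invented for illustration.
toy <- data.table(
  country     = c("Austria", "Austria", "Austria", "Austria", "Belgium", "Belgium"),
  year        = c(2000L, 2000L, 2001L, 2001L, 2000L, 2000L),
  sex         = c("female", "male", "female", "male", "female", "male"),
  suicides_no = c(10L, 30L, 12L, 28L, 20L, 40L),
  population  = c(1000L, 1000L, 1100L, 1100L, 2000L, 2000L)
)

# i: filter rows
austria <- toy[country == "Austria"]

# j: select and rename columns
selected <- toy[country == "Austria",
                .(Number_of_Suicides = suicides_no, Pop = population)]

# j: aggregate the filtered rows into a single row
totals <- toy[country == "Austria",
              .(sum_of_suicides = sum(suicides_no), Pop = sum(population))]

# by: one row per group, groups appear in order of first occurrence
by_year <- toy[country == "Austria",
               .(sum_of_suicides = sum(suicides_no)), by = year]

# keyby: same grouping, but the result is sorted by sex, then year
by_sex_year <- toy[country == "Austria",
                   .(sum_of_suicides = sum(suicides_no)), keyby = .(sex, year)]

print(by_sex_year)
```

The general pattern is `DT[i, j, by]`: `i` picks the rows, `j` says what to compute, and `by` says for which groups to compute it.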