Introduction to Probability and Data(2) - R Coding

2019. 7. 26. 00:04

> sfo_feb_flights <- nycflights %>%
+ filter(dest == 'SFO', month == 2)
> dim(sfo_feb_flights)
[1] 68 16

> ggplot(data = sfo_feb_flights, aes(x = arr_delay))+
+ geom_histogram(bins = 50)

> nycflights %>%
+   group_by(month) %>%
+   summarise(mean_dd = mean(dep_delay)) %>%
+   arrange(desc(mean_dd))
# A tibble: 12 x 2
   month mean_dd
 1     7   20.8
 2     6   20.4
 3    12   17.4
 4     4   14.6
 5     3   13.5
 6     5   13.3
 7     8   12.6
 8     2   10.7
 9     1   10.2
10     9    6.87
11    11    6.10
12    10    5.88

> nycflights %>%
+ group_by(month)%>%
+ summarise(median = median(dep_delay))%>%
+ arrange(desc(median))
# A tibble: 12 x 2
   month median
 1    12      1
 2     6      0
 3     7      0
 4     3     -1
 5     5     -1
 6     8     -1
 7     1     -2
 8     2     -2
 9     4     -2
10    11     -2
11     9     -3
12    10     -3

> nycflights <- nycflights %>%
+   mutate(avg_spd = 60 * distance / air_time)

> nycflights %>%
+   select(tailnum, avg_spd) %>%
+   arrange(desc(avg_spd))
  tailnum  avg_spd
1  N666DN 703.3846
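
The avg_spd calculation multiplies by 60 to convert to miles per hour, assuming distance is in miles and air_time is in minutes. A minimal sketch of that conversion in base R; the `toy` data frame and its values are made up for illustration and are not taken from nycflights:

```r
# Hypothetical toy data, NOT from nycflights
toy <- data.frame(
  distance = c(2586, 762),  # miles (assumed unit)
  air_time = c(360, 120)    # minutes (assumed unit)
)

# miles / minutes gives miles per minute; multiplying by 60 gives miles per hour
toy$avg_spd <- 60 * toy$distance / toy$air_time

print(toy)
```

If air_time were already in hours, the factor of 60 would have to be dropped.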
