Learn how to import data into R with the readr package.
Download the following data files, using your browser's "Save link as" feature.
Create a directory named "data" and copy the downloaded data files into it.
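The directory step can also be done from R. This is a minimal sketch, assuming the working directory is the project folder. The commented `download.file()` line uses a placeholder URL, because the actual download links are not reproduced here.

```r
# Create the "data" directory next to the script if it does not exist yet.
data_dir <- "data"
if (!dir.exists(data_dir)) {
  dir.create(data_dir)
}

# Hypothetical download step: replace the placeholder with the real link.
# download.file(url = "<link to the data file>",
#               destfile = file.path(data_dir, "FoodFacts.tsv"),
#               mode = "wb")

# List whatever has been copied into the directory so far.
list.files(data_dir)
```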
Base R's read.csv() function reads CSV files.
apt <- read.csv(file = "data/아파트매매_실거래가_200001.csv",
                header = TRUE,
                fileEncoding = "cp949",
                stringsAsFactors = FALSE)
str(apt)
'data.frame': 58805 obs. of 13 variables:
$ 시군구 : chr "강원도 강릉시 견소동" "강원도 강릉시 견소동" "강원도 강릉시 견소동" "강원도 강릉시 견소동" ...
$ 번지 : chr "202" "202" "202" "202" ...
$ 본번 : chr "0202" "0202" "0202" "0202" ...
$ 부번 : chr "0000" "0000" "0000" "0000" ...
$ 단지명 : chr "송정한신" "송정한신" "송정한신" "송정한신" ...
$ 전용면적... : num 43.4 59.8 59.8 84.9 116.2 ...
$ 계약년월 : int 202001 202001 202001 202001 202001 202001 202001 202001 202001 202001 ...
$ 계약일 : int 3 15 18 18 21 23 6 20 27 4 ...
$ 거래금액.만원.: chr "12,000" "10,000" "10,500" "13,500" ...
$ 층 : int 12 3 12 11 5 8 10 15 8 14 ...
$ 건축년도 : int 1997 1997 1997 1997 1997 1997 2005 2009 2009 1999 ...
$ 도로명 : chr "경강로2539번길 8" "경강로2539번길 8" "경강로2539번길 8" "경강로2539번길 8" ...
$ 해제사유발생일: logi NA NA NA NA NA NA ...
dim(apt)
[1] 58805 13
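In the str() output above, read.csv() keeps the price column as character because the values contain thousands separators. One way to get numbers is to strip the commas and convert. The sketch below uses a few sample values copied from that output rather than the full file.

```r
# Sample values from the price column as read.csv() returned them.
price_chr <- c("12,000", "10,000", "10,500", "13,500")

# Remove the thousands separator, then convert to numeric.
price_num <- as.numeric(gsub(",", "", price_chr, fixed = TRUE))
price_num
```

On the real data frame the same expression would be applied to the price column of `apt`.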
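The read_csv() function is readr's function for reading CSV files.
However, read_csv() assumes UTF-8 by default, so it can fail with an error when it reads a file of multibyte (non-English) characters stored in another encoding. In that case, the guess_encoding() function can infer the file's encoding. Note that the encoding names it reports may differ from the ones used with read.csv(): the file read as "cp949" is guessed to be "EUC-KR" (CP949 is an extension of EUC-KR). Passing "EUC-KR" as the encoding name to read_csv() gives the same result.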
library(readr)
library(dplyr)
apt2 <- read_csv(file = "data/아파트매매_실거래가_200001.csv",
                 locale = locale(encoding = "cp949"))
str(apt2)
spec_tbl_df [58,805 × 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ 시군구 : chr [1:58805] "강원도 강릉시 견소동" "강원도 강릉시 견소동" "강원도 강릉시 견소동" "강원도 강릉시 견소동" ...
$ 번지 : chr [1:58805] "202" "202" "202" "202" ...
$ 본번 : chr [1:58805] "0202" "0202" "0202" "0202" ...
$ 부번 : chr [1:58805] "0000" "0000" "0000" "0000" ...
$ 단지명 : chr [1:58805] "송정한신" "송정한신" "송정한신" "송정한신" ...
$ 전용면적(㎡) : num [1:58805] 43.4 59.8 59.8 84.9 116.2 ...
$ 계약년월 : num [1:58805] 202001 202001 202001 202001 202001 ...
$ 계약일 : num [1:58805] 3 15 18 18 21 23 6 20 27 4 ...
$ 거래금액(만원): num [1:58805] 12000 10000 10500 13500 19000 ...
$ 층 : num [1:58805] 12 3 12 11 5 8 10 15 8 14 ...
$ 건축년도 : num [1:58805] 1997 1997 1997 1997 1997 ...
$ 도로명 : chr [1:58805] "경강로2539번길 8" "경강로2539번길 8" "경강로2539번길 8" "경강로2539번길 8" ...
$ 해제사유발생일: logi [1:58805] NA NA NA NA NA NA ...
- attr(*, "spec")=
.. cols(
.. 시군구 = col_character(),
.. 번지 = col_character(),
.. 본번 = col_character(),
.. 부번 = col_character(),
.. 단지명 = col_character(),
.. `전용면적(㎡)` = col_double(),
.. 계약년월 = col_double(),
.. 계약일 = col_double(),
.. `거래금액(만원)` = col_number(),
.. 층 = col_double(),
.. 건축년도 = col_double(),
.. 도로명 = col_character(),
.. 해제사유발생일 = col_logical()
.. )
- attr(*, "problems")=<externalptr>
dim(apt2)
[1] 58805 13
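The spec above shows the price column parsed with col_number(), which is why readr returns it as numeric while read.csv() left it as character. A small sketch of the same parsing on sample values follows. The two-row CSV text is made up for illustration and is passed as literal data with I(), which assumes readr 2.0 or later.

```r
library(readr)

# parse_number() drops the grouping mark and returns numeric values.
price_chr <- c("12,000", "10,000", "10,500", "13,500")
price_parsed <- parse_number(price_chr)

# Hypothetical two-row CSV; the column types are stated explicitly
# instead of being guessed.
csv_text <- I("단지명,거래금액(만원)\n송정한신,\"12,000\"\n송정한신,\"10,500\"\n")
apt_typed <- read_csv(
  csv_text,
  col_types = cols(
    단지명 = col_character(),
    `거래금액(만원)` = col_number()
  )
)
apt_typed
```

Stating col_types explicitly also documents the expected schema, which can be useful when the same file is re-read later.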
# Read the first 10 lines of the file and guess the encoding from them.
read_lines("data/아파트매매_실거래가_200001.csv", n_max = 10) %>%
  guess_encoding()
# A tibble: 3 x 2
encoding confidence
<chr> <dbl>
1 EUC-KR 1
2 GB18030 0.58
3 Big5 0.34
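To try the encoding argument without the real file, the sketch below writes a tiny EUC-KR encoded CSV to a temporary file and reads it back with the guessed encoding name. It assumes the R session runs in a UTF-8 locale and that the platform's iconv supports "EUC-KR". The two data rows are sample values, not the real data set.

```r
library(readr)

# Two sample rows, typed as UTF-8 text in the script.
lines_utf8 <- c("시군구,단지명",
                "강원도 강릉시 견소동,송정한신",
                "강원도 강릉시 견소동,송정한신")

# Convert to EUC-KR and write the raw bytes to a temporary file.
euckr_path <- tempfile(fileext = ".csv")
writeLines(iconv(lines_utf8, from = "UTF-8", to = "EUC-KR"),
           con = euckr_path, useBytes = TRUE)

# Read the file back, telling readr which encoding the bytes are in.
apt_euckr <- read_csv(euckr_path,
                      locale = locale(encoding = "EUC-KR"),
                      show_col_types = FALSE)
apt_euckr
```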
The read.csv() function returns a data.frame object, whereas the readr functions return a tibble object.
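The two classes convert into each other easily. The following sketch, on a made-up three-row table, shows the conversion in both directions and one behavioral difference: selecting a single column with `[` returns a vector from a data.frame but a tibble from a tibble.

```r
library(tibble)

# A small hypothetical table.
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

# data.frame -> tibble, and back again.
tb <- as_tibble(df)
back <- as.data.frame(tb)

class(tb)          # a tibble is still a data.frame underneath
class(df[, "x"])   # single-column subsetting drops to a vector
class(tb[, "x"])   # the tibble keeps its tabular shape
```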
At about 92 MB, the FoodFacts.tsv file is hardly large data, but for the purposes of this exercise we treat it as a large file.
file.size("data/FoodFacts.tsv") / 1024^2
[1] 91.65345
The read.csv() function can also read TSV files when the separator is set to a tab with sep = "\t".
elapse_old <- system.time(
  foodfact <- read.csv(file = "data/FoodFacts.tsv",
                       sep = "\t",
                       header = TRUE,
                       stringsAsFactors = FALSE)
)
elapse_old
user system elapsed
2.568 0.086 2.663
dim(foodfact)
[1] 23179 163
str(foodfact)
'data.frame': 23179 obs. of 163 variables:
$ code : num 3087 4530 4559 16087 16094 ...
$ url : chr "http://world-en.openfoodfacts.org/product/0000000003087/farine-de-ble-noir-ferme-t-y-r-nao" "http://world-en.openfoodfacts.org/product/0000000004530/banana-chips-sweetened-whole" "http://world-en.openfoodfacts.org/product/0000000004559/peanuts-torn-glasser" "http://world-en.openfoodfacts.org/product/0000000016087/organic-salted-nut-mix-grizzlies" ...
$ creator : chr "openfoodfacts-contributors" "usda-ndb-import" "usda-ndb-import" "usda-ndb-import" ...
$ created_t : int 1474103866 1489069957 1489069957 1489055731 1489055653 1489055651 1489055730 1489055711 1489055651 1489055654 ...
$ created_datetime : chr "2016-09-17T09:17:46Z" "2017-03-09T14:32:37Z" "2017-03-09T14:32:37Z" "2017-03-09T10:35:31Z" ...
$ last_modified_t : int 1474103893 1489069957 1489069957 1489055731 1489055653 1489055651 1489055730 1489055712 1489055651 1489055654 ...
$ last_modified_datetime : chr "2016-09-17T09:18:13Z" "2017-03-09T14:32:37Z" "2017-03-09T14:32:37Z" "2017-03-09T10:35:31Z" ...
$ product_name : chr "Farine de blé noir" "Banana Chips Sweetened (Whole)" "Peanuts" "Organic Salted Nut Mix" ...
$ generic_name : chr "" "" "" "" ...
$ quantity : chr "1kg" "" "" "" ...
$ packaging : chr "" "" "" "" ...
$ packaging_tags : chr "" "" "" "" ...
$ brands : chr "Ferme t'y R'nao" "" "Torn & Glasser" "Grizzlies" ...
$ brands_tags : chr "ferme-t-y-r-nao" "" "torn-glasser" "grizzlies" ...
$ categories : chr "" "" "" "" ...
$ categories_tags : chr "" "" "" "" ...
$ categories_en : chr "" "" "" "" ...
$ origins : chr "" "" "" "" ...
$ origins_tags : chr "" "" "" "" ...
$ manufacturing_places : chr "" "" "" "" ...
$ manufacturing_places_tags : chr "" "" "" "" ...
$ labels : chr "" "" "" "" ...
$ labels_tags : chr "" "" "" "" ...
$ labels_en : chr "" "" "" "" ...
$ emb_codes : chr "" "" "" "" ...
$ emb_codes_tags : chr "" "" "" "" ...
$ first_packaging_code_geo : chr "" "" "" "" ...
$ cities : logi NA NA NA NA NA NA ...
$ cities_tags : chr "" "" "" "" ...
$ purchase_places : chr "" "" "" "" ...
$ stores : chr "" "" "" "" ...
$ countries : chr "en:FR" "US" "US" "US" ...
$ countries_tags : chr "en:france" "en:united-states" "en:united-states" "en:united-states" ...
$ countries_en : chr "France" "United States" "United States" "United States" ...
$ ingredients_text : chr "" "Bananas, vegetable oil (coconut oil, corn oil and/or palm oil) sugar, natural banana flavor." "Peanuts, wheat flour, sugar, rice flour, tapioca starch, salt, leavening (ammonium bicarbonate, baking soda), s"| __truncated__ "Organic hazelnuts, organic cashews, organic walnuts almonds, organic sunflower oil, sea salt." ...
$ allergens : chr "" "" "" "" ...
$ allergens_en : logi NA NA NA NA NA NA ...
$ traces : chr "" "" "" "" ...
$ traces_tags : chr "" "" "" "" ...
$ traces_en : chr "" "" "" "" ...
$ serving_size : chr "" "28 g (1 ONZ)" "28 g (0.25 cup)" "28 g (0.25 cup)" ...
$ no_nutriments : logi NA NA NA NA NA NA ...
$ additives_n : int NA 0 0 0 0 0 0 1 0 0 ...
$ additives : chr "" " [ bananas -> en:bananas ] [ vegetable-oil -> en:vegetable-oil ] [ oil -> en:oil ] [ coconut-oil -> en:co"| __truncated__ " [ peanuts -> en:peanuts ] [ wheat-flour -> en:wheat-flour ] [ flour -> en:flour ] [ sugar -> en:sugar ]"| __truncated__ " [ organic-hazelnuts -> en:organic-hazelnuts ] [ hazelnuts -> en:hazelnuts ] [ organic-cashews -> en:organi"| __truncated__ ...
$ additives_tags : chr "" "" "" "" ...
$ additives_en : chr "" "" "" "" ...
$ ingredients_from_palm_oil_n : chr "" "0" "0" "0" ...
$ ingredients_from_palm_oil : logi NA NA NA NA NA NA ...
$ ingredients_from_palm_oil_tags : chr "" "" "" "" ...
$ ingredients_that_may_be_from_palm_oil_n : chr "" "0" "0" "0" ...
$ ingredients_that_may_be_from_palm_oil : chr "" "" "" "" ...
$ ingredients_that_may_be_from_palm_oil_tags: chr "" "" "" "" ...
$ nutrition_grade_uk : int NA NA NA NA NA NA NA NA NA NA ...
$ nutrition_grade_fr : chr "" "d" "b" "d" ...
$ pnns_groups_1 : chr "" "" "" "" ...
$ pnns_groups_2 : chr "" "" "" "" ...
$ states : chr "en:to-be-completed, en:nutrition-facts-to-be-completed, en:ingredients-to-be-completed, en:expiration-date-to-b"| __truncated__ "en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed,"| __truncated__ "en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed,"| __truncated__ "en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed,"| __truncated__ ...
$ states_tags : chr "en:to-be-completed,en:nutrition-facts-to-be-completed,en:ingredients-to-be-completed,en:expiration-date-to-be-c"| __truncated__ "en:to-be-completed,en:nutrition-facts-completed,en:ingredients-completed,en:expiration-date-to-be-completed,en:"| __truncated__ "en:to-be-completed,en:nutrition-facts-completed,en:ingredients-completed,en:expiration-date-to-be-completed,en:"| __truncated__ "en:to-be-completed,en:nutrition-facts-completed,en:ingredients-completed,en:expiration-date-to-be-completed,en:"| __truncated__ ...
$ states_en : chr "To be completed,Nutrition facts to be completed,Ingredients to be completed,Expiration date to be completed,Cha"| __truncated__ "To be completed,Nutrition facts completed,Ingredients completed,Expiration date to be completed,Packaging-code-"| __truncated__ "To be completed,Nutrition facts completed,Ingredients completed,Expiration date to be completed,Packaging-code-"| __truncated__ "To be completed,Nutrition facts completed,Ingredients completed,Expiration date to be completed,Packaging-code-"| __truncated__ ...
$ main_category : chr "" "" "" "" ...
$ main_category_en : chr "" "" "" "" ...
$ image_url : chr "" "" "" "" ...
$ image_small_url : chr "" "" "" "" ...
$ energy_100g : chr "" "2243" "1941" "2540" ...
$ energy.from.fat_100g : chr "" "" "" "" ...
$ fat_100g : chr "" "28.57" "17.86" "57.14" ...
$ saturated.fat_100g : chr "" "28.57" "0" "5.36" ...
$ X.butyric.acid_100g : chr "" "" "" "" ...
$ X.caproic.acid_100g : num NA NA NA NA NA NA NA NA NA NA ...
$ X.caprylic.acid_100g : num NA NA NA NA NA NA NA NA NA NA ...
$ X.capric.acid_100g : logi NA NA NA NA NA NA ...
$ X.lauric.acid_100g : num NA NA NA NA NA NA NA NA NA NA ...
$ X.myristic.acid_100g : num NA NA NA NA NA NA NA NA NA NA ...
$ X.palmitic.acid_100g : chr "" "" "" "" ...
$ X.stearic.acid_100g : logi NA NA NA NA NA NA ...
$ X.arachidic.acid_100g : int NA NA NA NA NA NA NA NA NA NA ...
$ X.behenic.acid_100g : chr "" "" "" "" ...
$ X.lignoceric.acid_100g : chr "" "" "" "" ...
$ X.cerotic.acid_100g : chr "" "" "" "" ...
$ X.montanic.acid_100g : num NA NA NA NA NA NA NA NA NA NA ...
$ X.melissic.acid_100g : logi NA NA NA NA NA NA ...
$ monounsaturated.fat_100g : num NA NA NA NA NA NA NA NA NA NA ...
$ polyunsaturated.fat_100g : num NA NA NA NA NA NA NA NA NA NA ...
$ omega.3.fat_100g : num NA NA NA NA NA NA NA NA NA NA ...
$ X.alpha.linolenic.acid_100g : num NA NA NA NA NA NA NA NA NA NA ...
$ X.eicosapentaenoic.acid_100g : logi NA NA NA NA NA NA ...
$ X.docosahexaenoic.acid_100g : chr "" "" "" "" ...
$ omega.6.fat_100g : logi NA NA NA NA NA NA ...
$ X.linoleic.acid_100g : logi NA NA NA NA NA NA ...
$ X.arachidonic.acid_100g : chr "" "" "" "" ...
$ X.gamma.linolenic.acid_100g : chr "" "" "" "" ...
$ X.dihomo.gamma.linolenic.acid_100g : chr "" "" "" "" ...
$ omega.9.fat_100g : logi NA NA NA NA NA NA ...
$ X.oleic.acid_100g : logi NA NA NA NA NA NA ...
$ X.elaidic.acid_100g : logi NA NA NA NA NA NA ...
$ X.gondoic.acid_100g : logi NA NA NA NA NA NA ...
$ X.mead.acid_100g : int NA NA NA NA NA NA NA NA NA NA ...
$ X.erucic.acid_100g : logi NA NA NA NA NA NA ...
$ X.nervonic.acid_100g : int NA NA NA NA NA NA NA NA NA NA ...
[list output truncated]
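Two details in this output are worth noting: `code` was read as a number, which drops leading zeros, and column names such as `-butyric-acid_100g` were rewritten to `X.butyric.acid_100g`. Base R has arguments for both. The sketch below uses a made-up two-row extract rather than the real file.

```r
# Hypothetical two-row extract with a zero-padded code and a hyphenated name.
tsv_text <- "code\tenergy-from-fat_100g\n0000000003087\t1.5\n0000000004530\t2.0\n"

# Default behavior: code becomes numeric and the name is made syntactic.
default_read <- read.csv(text = tsv_text, sep = "\t",
                         stringsAsFactors = FALSE)

# Keep the code as text and leave the column names untouched.
kept <- read.csv(text = tsv_text, sep = "\t",
                 stringsAsFactors = FALSE,
                 colClasses = c(code = "character"),
                 check.names = FALSE)

names(default_read)
names(kept)
kept$code
```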
The read_tsv() function is readr's function for reading TSV files.
elapse_readr <- system.time(
  foodfact2 <- read_tsv(file = "data/FoodFacts.tsv")
)
elapse_readr
user system elapsed
4.846 0.267 4.186
dim(foodfact2)
[1] 35000 163
str(foodfact2)
spec_tbl_df [35,000 × 163] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ code : chr [1:35000] "0000000003087" "0000000004530" "0000000004559" "0000000016087" ...
$ url : chr [1:35000] "http://world-en.openfoodfacts.org/product/0000000003087/farine-de-ble-noir-ferme-t-y-r-nao" "http://world-en.openfoodfacts.org/product/0000000004530/banana-chips-sweetened-whole" "http://world-en.openfoodfacts.org/product/0000000004559/peanuts-torn-glasser" "http://world-en.openfoodfacts.org/product/0000000016087/organic-salted-nut-mix-grizzlies" ...
$ creator : chr [1:35000] "openfoodfacts-contributors" "usda-ndb-import" "usda-ndb-import" "usda-ndb-import" ...
$ created_t : num [1:35000] 1.47e+09 1.49e+09 1.49e+09 1.49e+09 1.49e+09 ...
$ created_datetime : POSIXct[1:35000], format: "2016-09-17 09:17:46" ...
$ last_modified_t : num [1:35000] 1.47e+09 1.49e+09 1.49e+09 1.49e+09 1.49e+09 ...
$ last_modified_datetime : POSIXct[1:35000], format: "2016-09-17 09:18:13" ...
$ product_name : chr [1:35000] "Farine de blé noir" "Banana Chips Sweetened (Whole)" "Peanuts" "Organic Salted Nut Mix" ...
$ generic_name : chr [1:35000] NA NA NA NA ...
$ quantity : chr [1:35000] "1kg" NA NA NA ...
$ packaging : chr [1:35000] NA NA NA NA ...
$ packaging_tags : chr [1:35000] NA NA NA NA ...
$ brands : chr [1:35000] "Ferme t'y R'nao" NA "Torn & Glasser" "Grizzlies" ...
$ brands_tags : chr [1:35000] "ferme-t-y-r-nao" NA "torn-glasser" "grizzlies" ...
$ categories : chr [1:35000] NA NA NA NA ...
$ categories_tags : chr [1:35000] NA NA NA NA ...
$ categories_en : chr [1:35000] NA NA NA NA ...
$ origins : chr [1:35000] NA NA NA NA ...
$ origins_tags : chr [1:35000] NA NA NA NA ...
$ manufacturing_places : chr [1:35000] NA NA NA NA ...
$ manufacturing_places_tags : chr [1:35000] NA NA NA NA ...
$ labels : chr [1:35000] NA NA NA NA ...
$ labels_tags : chr [1:35000] NA NA NA NA ...
$ labels_en : chr [1:35000] NA NA NA NA ...
$ emb_codes : chr [1:35000] NA NA NA NA ...
$ emb_codes_tags : chr [1:35000] NA NA NA NA ...
$ first_packaging_code_geo : logi [1:35000] NA NA NA NA NA NA ...
$ cities : logi [1:35000] NA NA NA NA NA NA ...
$ cities_tags : logi [1:35000] NA NA NA NA NA NA ...
$ purchase_places : chr [1:35000] NA NA NA NA ...
$ stores : chr [1:35000] NA NA NA NA ...
$ countries : chr [1:35000] "en:FR" "US" "US" "US" ...
$ countries_tags : chr [1:35000] "en:france" "en:united-states" "en:united-states" "en:united-states" ...
$ countries_en : chr [1:35000] "France" "United States" "United States" "United States" ...
$ ingredients_text : chr [1:35000] NA "Bananas, vegetable oil (coconut oil, corn oil and/or palm oil) sugar, natural banana flavor." "Peanuts, wheat flour, sugar, rice flour, tapioca starch, salt, leavening (ammonium bicarbonate, baking soda), s"| __truncated__ "Organic hazelnuts, organic cashews, organic walnuts almonds, organic sunflower oil, sea salt." ...
$ allergens : chr [1:35000] NA NA NA NA ...
$ allergens_en : logi [1:35000] NA NA NA NA NA NA ...
$ traces : chr [1:35000] NA NA NA NA ...
$ traces_tags : chr [1:35000] NA NA NA NA ...
$ traces_en : chr [1:35000] NA NA NA NA ...
$ serving_size : chr [1:35000] NA "28 g (1 ONZ)" "28 g (0.25 cup)" "28 g (0.25 cup)" ...
$ no_nutriments : logi [1:35000] NA NA NA NA NA NA ...
$ additives_n : num [1:35000] NA 0 0 0 0 0 0 1 0 0 ...
$ additives : chr [1:35000] NA "[ bananas -> en:bananas ] [ vegetable-oil -> en:vegetable-oil ] [ oil -> en:oil ] [ coconut-oil -> en:coc"| __truncated__ "[ peanuts -> en:peanuts ] [ wheat-flour -> en:wheat-flour ] [ flour -> en:flour ] [ sugar -> en:sugar ] "| __truncated__ "[ organic-hazelnuts -> en:organic-hazelnuts ] [ hazelnuts -> en:hazelnuts ] [ organic-cashews -> en:organic"| __truncated__ ...
$ additives_tags : chr [1:35000] NA NA NA NA ...
$ additives_en : chr [1:35000] NA NA NA NA ...
$ ingredients_from_palm_oil_n : num [1:35000] NA 0 0 0 0 0 0 0 0 0 ...
$ ingredients_from_palm_oil : logi [1:35000] NA NA NA NA NA NA ...
$ ingredients_from_palm_oil_tags : chr [1:35000] NA NA NA NA ...
$ ingredients_that_may_be_from_palm_oil_n : num [1:35000] NA 0 0 0 0 0 0 0 0 0 ...
$ ingredients_that_may_be_from_palm_oil : logi [1:35000] NA NA NA NA NA NA ...
$ ingredients_that_may_be_from_palm_oil_tags: chr [1:35000] NA NA NA NA ...
$ nutrition_grade_uk : logi [1:35000] NA NA NA NA NA NA ...
$ nutrition_grade_fr : chr [1:35000] NA "d" "b" "d" ...
$ pnns_groups_1 : chr [1:35000] NA NA NA NA ...
$ pnns_groups_2 : chr [1:35000] NA NA NA NA ...
$ states : chr [1:35000] "en:to-be-completed, en:nutrition-facts-to-be-completed, en:ingredients-to-be-completed, en:expiration-date-to-b"| __truncated__ "en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed,"| __truncated__ "en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed,"| __truncated__ "en:to-be-completed, en:nutrition-facts-completed, en:ingredients-completed, en:expiration-date-to-be-completed,"| __truncated__ ...
$ states_tags : chr [1:35000] "en:to-be-completed,en:nutrition-facts-to-be-completed,en:ingredients-to-be-completed,en:expiration-date-to-be-c"| __truncated__ "en:to-be-completed,en:nutrition-facts-completed,en:ingredients-completed,en:expiration-date-to-be-completed,en:"| __truncated__ "en:to-be-completed,en:nutrition-facts-completed,en:ingredients-completed,en:expiration-date-to-be-completed,en:"| __truncated__ "en:to-be-completed,en:nutrition-facts-completed,en:ingredients-completed,en:expiration-date-to-be-completed,en:"| __truncated__ ...
$ states_en : chr [1:35000] "To be completed,Nutrition facts to be completed,Ingredients to be completed,Expiration date to be completed,Cha"| __truncated__ "To be completed,Nutrition facts completed,Ingredients completed,Expiration date to be completed,Packaging-code-"| __truncated__ "To be completed,Nutrition facts completed,Ingredients completed,Expiration date to be completed,Packaging-code-"| __truncated__ "To be completed,Nutrition facts completed,Ingredients completed,Expiration date to be completed,Packaging-code-"| __truncated__ ...
$ main_category : chr [1:35000] NA NA NA NA ...
$ main_category_en : chr [1:35000] NA NA NA NA ...
$ image_url : chr [1:35000] NA NA NA NA ...
$ image_small_url : chr [1:35000] NA NA NA NA ...
$ energy_100g : num [1:35000] NA 2243 1941 2540 1552 ...
$ energy-from-fat_100g : num [1:35000] NA NA NA NA NA NA NA NA NA NA ...
$ fat_100g : num [1:35000] NA 28.57 17.86 57.14 1.43 ...
$ saturated-fat_100g : num [1:35000] NA 28.57 0 5.36 NA ...
$ -butyric-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -caproic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -caprylic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -capric-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -lauric-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -myristic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -palmitic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -stearic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -arachidic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -behenic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -lignoceric-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -cerotic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -montanic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -melissic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ monounsaturated-fat_100g : num [1:35000] NA NA NA NA NA NA NA NA NA NA ...
$ polyunsaturated-fat_100g : num [1:35000] NA NA NA NA NA NA NA NA NA NA ...
$ omega-3-fat_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -alpha-linolenic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -eicosapentaenoic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -docosahexaenoic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ omega-6-fat_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -linoleic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -arachidonic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -gamma-linolenic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -dihomo-gamma-linolenic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ omega-9-fat_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -oleic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -elaidic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -gondoic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -mead-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -erucic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
$ -nervonic-acid_100g : logi [1:35000] NA NA NA NA NA NA ...
[list output truncated]
- attr(*, "spec")=
.. cols(
.. code = col_character(),
.. url = col_character(),
.. creator = col_character(),
.. created_t = col_double(),
.. created_datetime = col_datetime(format = ""),
.. last_modified_t = col_double(),
.. last_modified_datetime = col_datetime(format = ""),
.. product_name = col_character(),
.. generic_name = col_character(),
.. quantity = col_character(),
.. packaging = col_character(),
.. packaging_tags = col_character(),
.. brands = col_character(),
.. brands_tags = col_character(),
.. categories = col_character(),
.. categories_tags = col_character(),
.. categories_en = col_character(),
.. origins = col_character(),
.. origins_tags = col_character(),
.. manufacturing_places = col_character(),
.. manufacturing_places_tags = col_character(),
.. labels = col_character(),
.. labels_tags = col_character(),
.. labels_en = col_character(),
.. emb_codes = col_character(),
.. emb_codes_tags = col_character(),
.. first_packaging_code_geo = col_logical(),
.. cities = col_logical(),
.. cities_tags = col_logical(),
.. purchase_places = col_character(),
.. stores = col_character(),
.. countries = col_character(),
.. countries_tags = col_character(),
.. countries_en = col_character(),
.. ingredients_text = col_character(),
.. allergens = col_character(),
.. allergens_en = col_logical(),
.. traces = col_character(),
.. traces_tags = col_character(),
.. traces_en = col_character(),
.. serving_size = col_character(),
.. no_nutriments = col_logical(),
.. additives_n = col_double(),
.. additives = col_character(),
.. additives_tags = col_character(),
.. additives_en = col_character(),
.. ingredients_from_palm_oil_n = col_double(),
.. ingredients_from_palm_oil = col_logical(),
.. ingredients_from_palm_oil_tags = col_character(),
.. ingredients_that_may_be_from_palm_oil_n = col_double(),
.. ingredients_that_may_be_from_palm_oil = col_logical(),
.. ingredients_that_may_be_from_palm_oil_tags = col_character(),
.. nutrition_grade_uk = col_logical(),
.. nutrition_grade_fr = col_character(),
.. pnns_groups_1 = col_character(),
.. pnns_groups_2 = col_character(),
.. states = col_character(),
.. states_tags = col_character(),
.. states_en = col_character(),
.. main_category = col_character(),
.. main_category_en = col_character(),
.. image_url = col_character(),
.. image_small_url = col_character(),
.. energy_100g = col_double(),
.. `energy-from-fat_100g` = col_double(),
.. fat_100g = col_double(),
.. `saturated-fat_100g` = col_double(),
.. `-butyric-acid_100g` = col_logical(),
.. `-caproic-acid_100g` = col_logical(),
.. `-caprylic-acid_100g` = col_logical(),
.. `-capric-acid_100g` = col_logical(),
.. `-lauric-acid_100g` = col_logical(),
.. `-myristic-acid_100g` = col_logical(),
.. `-palmitic-acid_100g` = col_logical(),
.. `-stearic-acid_100g` = col_logical(),
.. `-arachidic-acid_100g` = col_logical(),
.. `-behenic-acid_100g` = col_logical(),
.. `-lignoceric-acid_100g` = col_logical(),
.. `-cerotic-acid_100g` = col_logical(),
.. `-montanic-acid_100g` = col_logical(),
.. `-melissic-acid_100g` = col_logical(),
.. `monounsaturated-fat_100g` = col_double(),
.. `polyunsaturated-fat_100g` = col_double(),
.. `omega-3-fat_100g` = col_logical(),
.. `-alpha-linolenic-acid_100g` = col_logical(),
.. `-eicosapentaenoic-acid_100g` = col_logical(),
.. `-docosahexaenoic-acid_100g` = col_logical(),
.. `omega-6-fat_100g` = col_logical(),
.. `-linoleic-acid_100g` = col_logical(),
.. `-arachidonic-acid_100g` = col_logical(),
.. `-gamma-linolenic-acid_100g` = col_logical(),
.. `-dihomo-gamma-linolenic-acid_100g` = col_logical(),
.. `omega-9-fat_100g` = col_logical(),
.. `-oleic-acid_100g` = col_logical(),
.. `-elaidic-acid_100g` = col_logical(),
.. `-gondoic-acid_100g` = col_logical(),
.. `-mead-acid_100g` = col_logical(),
.. `-erucic-acid_100g` = col_logical(),
.. `-nervonic-acid_100g` = col_logical(),
.. `trans-fat_100g` = col_double(),
.. cholesterol_100g = col_double(),
.. carbohydrates_100g = col_double(),
.. sugars_100g = col_double(),
.. `-sucrose_100g` = col_logical(),
.. `-glucose_100g` = col_logical(),
.. `-fructose_100g` = col_logical(),
.. `-lactose_100g` = col_double(),
.. `-maltose_100g` = col_logical(),
.. `-maltodextrins_100g` = col_logical(),
.. starch_100g = col_double(),
.. polyols_100g = col_logical(),
.. fiber_100g = col_double(),
.. proteins_100g = col_double(),
.. casein_100g = col_logical(),
.. `serum-proteins_100g` = col_logical(),
.. nucleotides_100g = col_logical(),
.. salt_100g = col_double(),
.. sodium_100g = col_double(),
.. alcohol_100g = col_logical(),
.. `vitamin-a_100g` = col_double(),
.. `beta-carotene_100g` = col_logical(),
.. `vitamin-d_100g` = col_double(),
.. `vitamin-e_100g` = col_double(),
.. `vitamin-k_100g` = col_double(),
.. `vitamin-c_100g` = col_double(),
.. `vitamin-b1_100g` = col_double(),
.. `vitamin-b2_100g` = col_double(),
.. `vitamin-pp_100g` = col_double(),
.. `vitamin-b6_100g` = col_double(),
.. `vitamin-b9_100g` = col_double(),
.. folates_100g = col_double(),
.. `vitamin-b12_100g` = col_double(),
.. biotin_100g = col_logical(),
.. `pantothenic-acid_100g` = col_double(),
.. silica_100g = col_logical(),
.. bicarbonate_100g = col_logical(),
.. potassium_100g = col_double(),
.. chloride_100g = col_logical(),
.. calcium_100g = col_double(),
.. phosphorus_100g = col_double(),
.. iron_100g = col_double(),
.. magnesium_100g = col_double(),
.. zinc_100g = col_double(),
.. copper_100g = col_double(),
.. manganese_100g = col_double(),
.. fluoride_100g = col_logical(),
.. selenium_100g = col_double(),
.. chromium_100g = col_logical(),
.. molybdenum_100g = col_logical(),
.. iodine_100g = col_logical(),
.. caffeine_100g = col_logical(),
.. taurine_100g = col_logical(),
.. ph_100g = col_logical(),
.. `fruits-vegetables-nuts_100g` = col_logical(),
.. `fruits-vegetables-nuts-estimate_100g` = col_double(),
.. `collagen-meat-protein-ratio_100g` = col_logical(),
.. cocoa_100g = col_logical(),
.. chlorophyl_100g = col_logical(),
.. `carbon-footprint_100g` = col_logical(),
.. `nutrition-score-fr_100g` = col_double(),
.. `nutrition-score-uk_100g` = col_double(),
.. `glycemic-index_100g` = col_logical(),
.. `water-hardness_100g` = col_logical()
.. )
- attr(*, "problems")=<externalptr>
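Many columns in the spec above came out as col_logical(). readr typically reports that type when every value it looked at while guessing was missing. If a sparse column is known to hold numbers, its type can be stated up front. The sketch below builds a made-up file in which the first 1,500 rows have no value; the column name is borrowed from the spec, the data is not.

```r
library(readr)

# Hypothetical sparse file: 1,500 rows with an empty second field,
# followed by three rows that carry a number.
n_empty <- 1500
sparse_lines <- c("product\tcaffeine_100g",
                  paste0("item", seq_len(n_empty), "\t"),
                  "cola\t0.01", "coffee\t0.04", "tea\t0.02")
sparse_path <- tempfile(fileext = ".tsv")
writeLines(sparse_lines, sparse_path)

# State the column types explicitly so nothing depends on guessing.
sparse <- read_tsv(sparse_path,
                   col_types = cols(product = col_character(),
                                    caffeine_100g = col_double()))
tail(sparse, 3)
```

Raising the `guess_max` argument is another option when the column types are not known in advance.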
The base R approach took 2.663 seconds and the readr package took 4.186 seconds. Because this file is not very large, base R can come out somewhat faster here, but readr is generally the faster option for large data.
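Timings like these depend on the machine, the file, and the package versions, so it is worth measuring on your own data. The following sketch repeats the comparison on a synthetic file; the row count and columns are arbitrary choices for illustration.

```r
library(readr)

# Build a synthetic tab-separated file with arbitrary contents.
set.seed(1)
n <- 50000
synthetic <- data.frame(
  id = seq_len(n),
  value = rnorm(n),
  label = sample(c("alpha", "beta", "gamma"), n, replace = TRUE)
)
bench_path <- tempfile(fileext = ".tsv")
write.table(synthetic, bench_path, sep = "\t",
            row.names = FALSE, quote = FALSE)

# Time both readers on the same file.
time_base <- system.time(
  df_base <- read.csv(bench_path, sep = "\t", stringsAsFactors = FALSE)
)
time_readr <- system.time(
  df_readr <- read_tsv(bench_path, show_col_types = FALSE)
)

# Elapsed seconds side by side.
c(base = time_base[["elapsed"]], readr = time_readr[["elapsed"]])
```

Checking that both results have the same dimensions before comparing the times keeps the comparison fair.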
Also, the base R approach read only 23,179 of the 35,000 rows, so the two timings do not cover the same amount of data.
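A common reason for read.csv() returning fewer rows than a file has lines is a quote character inside a field, which makes the reader treat the following lines as part of one quoted string. Whether that is what happened with FoodFacts.tsv has not been verified here. The sketch below shows the effect and the usual remedy, `quote = ""`, on a made-up file.

```r
# Hypothetical file in which two fields contain a stray double quote.
quote_lines <- c("id\tname",
                 "1\tBanana chips",
                 "2\t5\" pizza",
                 "3\tPeanuts",
                 "4\t7\" pie",
                 "5\tNut mix")
quote_path <- tempfile(fileext = ".tsv")
writeLines(quote_lines, quote_path)

# Number of data lines physically present in the file.
n_physical <- length(readLines(quote_path)) - 1L

# Default quoting; rows may be merged, so the count can come out lower.
n_default <- tryCatch(
  nrow(suppressWarnings(
    read.csv(quote_path, sep = "\t", stringsAsFactors = FALSE)
  )),
  error = function(e) NA_integer_
)

# Quoting disabled: every physical line becomes one row.
no_quote <- read.csv(quote_path, sep = "\t", quote = "",
                     stringsAsFactors = FALSE)

c(physical = n_physical, default = n_default, quote_off = nrow(no_quote))
```

Comparing the row count against the line count of the file is a cheap sanity check after any import.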
For attribution, please cite this work as
유충현 (2022, Feb. 23). Dataholic: Import data with readr. Retrieved from https://choonghyunryu.github.io/2022-02-23-readr
BibTeX citation
@misc{유충현2022import, author = {유충현, }, title = {Dataholic: Import data with readr}, url = {https://choonghyunryu.github.io/2022-02-23-readr}, year = {2022} }