Database profiling functions that return the confirmed, unique, missing, conflicting, or majority values for all (non-ID) variables across the datasets in a 'many' package database.
Usage
db_plot(database, key = "manyID", variable = "all", category = "all")
db_comp(database, key = "manyID", variable = "all", category = "all")
Arguments
- database
A many database.
- key
A variable key to join datasets by, "manyID" by default.
- variable
The variable, or variables, to focus on. By default "all". For multiple variables, please declare variable names as a vector, for example c("Beg", "End").
- category
The code category, or categories, to focus on. By default "all" are returned. Other options include "confirmed", "unique", "missing", "conflicting", or "majority". For multiple categories, please declare categories as a vector.
Value
A plot (db_plot()) or a tibble (db_comp()) with the profile of the variables across all datasets in a "many" database. For multiple categories across multiple variables, the functions return all rows in which at least one of the selected variables is coded as one of the selected categories.
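The row selection for multiple variables and categories can be sketched in base R. This is only an illustration and not the package's implementation: the `profile` data frame, its simplified column names, and the helper `keep_rows()` are hypothetical, and the category labels are copied from the example output further down.

```r
# Hypothetical profile table: one row per observation,
# one category column per profiled variable.
profile <- data.frame(
  ID  = c("Augustus", "Tiberius", "Claudius"),
  Beg = c("conflict", "conflict", "majority"),
  End = c("conflict", "majority", "majority"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep a row if ANY selected variable
# is coded as ANY of the selected categories.
keep_rows <- function(profile, variables, categories) {
  per_variable <- lapply(profile[variables], function(col) col %in% categories)
  hit <- Reduce(`|`, per_variable)  # element-wise OR across variables
  profile[hit, , drop = FALSE]
}

selected <- keep_rows(profile, c("Beg", "End"), "conflict")
```

In this sketch Tiberius is kept even though only `Beg` is coded as a conflict, as in the db_comp() example with category = "conflict" below.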
Details
Confirmed values are the same in all datasets in the database. Unique values appear in only one dataset in the database. Missing values are missing in all datasets in the database. Conflicting values differ across datasets, with each competing value appearing in the same number of datasets. Majority values are shared by multiple, but not all, datasets in the database.
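One way to read these definitions is as a rule applied to the values that a single observation takes across datasets. The sketch below is a simplified base R interpretation, not the package's own code: `classify_values()` is a hypothetical helper, it compares values as plain strings, and it uses the short labels seen in the example output ("conflict" rather than "conflicting").

```r
# Hypothetical helper: x holds one observation's value in each dataset
# (NA where a dataset has no value for it).
classify_values <- function(x) {
  observed <- x[!is.na(x)]
  if (length(observed) == 0) return("missing")  # missing everywhere
  if (length(observed) == 1) return("unique")   # present in one dataset only
  counts <- table(observed)
  # One distinct value, present in every dataset.
  if (length(counts) == 1 && length(observed) == length(x)) return("confirmed")
  # One value shared by more datasets than any other value.
  if (max(counts) > 1 && sum(counts == max(counts)) == 1) return("majority")
  # Otherwise the competing values are tied.
  "conflict"
}

classify_values(c("0041-01-25", "0041", "0041"))   # Claudius, Beg
classify_values(c("-0026-01-16", "-0027", "-0031")) # Augustus, Beg
classify_values(c("0037-03-18", NA, "0037"))        # Caligula, Beg
```

Under these assumptions the three calls give "majority", "conflict", and "conflict", which agrees with the `Beg (3)` column in the db_comp() example output.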
db_plot()
plots the database profile.
db_comp()
creates a tibble comparing the variables in a database.
Examples
# \donttest{
db_plot(database = emperors, key = "ID")
#> There were 116 matched observations by ID variable across datasets in database.
db_plot(database = emperors, key = "ID", variable = c("Beg", "End"))
#> There were 116 matched observations by ID variable across datasets in database.
db_plot(database = emperors, key = "ID", variable = c("Beg", "End"),
        category = c("conflict", "unique"))
#> There were 116 matched observations by ID variable across datasets in database.
# }
# \donttest{
db_comp(database = emperors, key = "ID")
#> There were 116 matched observations by ID variable across datasets in database.
#> # A tibble: 139 × 37
#> ID wikip…¹ UNRV$…² brita…³ Beg (…⁴ wikip…⁵ UNRV$…⁶ brita…⁷ End (…⁸ wikip…⁹
#> <chr> <mdate> <mdate> <mdate> <chr> <mdate> <mdate> <mdate> <chr> <chr>
#> 1 Augu… -0026-… -0027 -0031 confli… 0014-0… -0014 0014 confli… IMPERA…
#> 2 Tibe… 0014-0… -0014 0014 confli… 0037-0… 0037 0037 majori… TIBERI…
#> 3 Cali… 0037-0… NA 0037 confli… 0041-0… NA 0041 confli… GAIVS …
#> 4 Clau… 0041-0… 0041 0041 majori… 0054-1… 0054 0054 majori… TIBERI…
#> 5 Nero 0054-1… 0054 0054 majori… 0068-0… 0068 0068 majori… NERO C…
#> 6 Galba 0068-0… 0068 0068 majori… 0069-0… 0069 0069 majori… SERVIV…
#> 7 Otho 0069-0… 0069 0069-01 confli… 0069-0… 0069 0069-04 confli… MARCVS…
#> 8 Vite… 0069-0… 0069 NA confli… 0069-1… 0069 NA confli… AVLVS …
#> 9 Vesp… 0069-1… 0069 0069 majori… 0079-0… 0079 0079 majori… TITVS …
#> 10 Titus 0079-0… 0079 0079 majori… 0081-0… 0081 0081 majori… TITVS …
#> # … with 129 more rows, 27 more variables: `UNRV$FullName` <chr>,
#> # `FullName (2)` <chr>, `wikipedia$Birth` <chr>, `UNRV$Birth` <chr>,
#> # `Birth (2)` <chr>, `wikipedia$Death` <chr>, `UNRV$Death` <chr>,
#> # `Death (2)` <chr>, `wikipedia$CityBirth` <chr>, `CityBirth (1)` <chr>,
#> # `wikipedia$ProvinceBirth` <chr>, `ProvinceBirth (1)` <chr>,
#> # `wikipedia$Rise` <chr>, `Rise (1)` <chr>, `wikipedia$Cause` <chr>,
#> # `Cause (1)` <chr>, `wikipedia$Killer` <chr>, `Killer (1)` <chr>, …
db_comp(database = emperors, key = "ID", variable = "Beg")
#> There were 116 matched observations by ID variable across datasets in database.
#> # A tibble: 139 × 5
#> ID `wikipedia$Beg` `UNRV$Beg` `britannica$Beg` `Beg (3)`
#> <chr> <mdate> <mdate> <mdate> <chr>
#> 1 Augustus -0026-01-16 -0027 -0031 conflict
#> 2 Tiberius 0014-09-18 -0014 0014 conflict
#> 3 Caligula 0037-03-18 NA 0037 conflict
#> 4 Claudius 0041-01-25 0041 0041 majority
#> 5 Nero 0054-10-13 0054 0054 majority
#> 6 Galba 0068-06-08 0068 0068 majority
#> 7 Otho 0069-01-15 0069 0069-01 conflict
#> 8 Vitellius 0069-04-17 0069 NA conflict
#> 9 Vespasian 0069-12-21 0069 0069 majority
#> 10 Titus 0079-06-24 0079 0079 majority
#> # … with 129 more rows
db_comp(database = emperors, key = "ID", variable = c("Beg", "End"),
        category = "conflict")
#> There were 116 matched observations by ID variable across datasets in database.
#> # A tibble: 26 × 9
#> ID wikip…¹ UNRV$…² brita…³ Beg (…⁴ wikip…⁵ UNRV$…⁶ brita…⁷ End (…⁸
#> <chr> <mdate> <mdate> <mdate> <chr> <mdate> <mdate> <mdate> <chr>
#> 1 Augustus -0026-… -0027 -0031 confli… 0014-0… -0014 0014 confli…
#> 2 Tiberius 0014-0… -0014 0014 confli… 0037-0… 0037 0037 majori…
#> 3 Caligula 0037-0… NA 0037 confli… 0041-0… NA 0041 confli…
#> 4 Otho 0069-0… 0069 0069-01 confli… 0069-0… 0069 0069-04 confli…
#> 5 Vitellius 0069-0… 0069 NA confli… 0069-1… 0069 NA confli…
#> 6 Commodus 0177 … 0180 0177 confli… 0192-1… 0192 0192 majori…
#> 7 Pertinax 0193-0… 0193 NA confli… 0193-0… 0193 NA confli…
#> 8 Didius Julia… 0193-0… 0193 NA confli… 0193-0… 0193 NA confli…
#> 9 Caracalla 0198 … 0211 0198 confli… 0217-0… 0217 0217 majori…
#> 10 Geta 0209 … 0211 NA confli… 0211-1… 0211 NA confli…
#> # … with 16 more rows, and abbreviated variable names ¹`wikipedia$Beg`,
#> # ²`UNRV$Beg`, ³`britannica$Beg`, ⁴`Beg (3)`, ⁵`wikipedia$End`, ⁶`UNRV$End`,
#> # ⁷`britannica$End`, ⁸`End (3)`
db_comp(database = emperors, key = "ID", variable = c("Beg", "End"),
        category = c("conflict", "unique"))
#> There were 116 matched observations by ID variable across datasets in database.
#> # A tibble: 91 × 9
#> ID wikip…¹ UNRV$…² brita…³ Beg (…⁴ wikip…⁵ UNRV$…⁶ brita…⁷ End (…⁸
#> <chr> <mdate> <mdate> <mdate> <chr> <mdate> <mdate> <mdate> <chr>
#> 1 Augustus -0026-… -0027 -0031 confli… 0014-0… -0014 0014 confli…
#> 2 Tiberius 0014-0… -0014 0014 confli… 0037-0… 0037 0037 majori…
#> 3 Caligula 0037-0… NA 0037 confli… 0041-0… NA 0041 confli…
#> 4 Otho 0069-0… 0069 0069-01 confli… 0069-0… 0069 0069-04 confli…
#> 5 Vitellius 0069-0… 0069 NA confli… 0069-1… 0069 NA confli…
#> 6 Antonius Pius 0138-0… NA NA unique 0161-0… NA NA unique
#> 7 Commodus 0177 … 0180 0177 confli… 0192-1… 0192 0192 majori…
#> 8 Pertinax 0193-0… 0193 NA confli… 0193-0… 0193 NA confli…
#> 9 Didius Julia… 0193-0… 0193 NA confli… 0193-0… 0193 NA confli…
#> 10 Septimus Sev… 0193-0… NA NA unique 0211-0… NA NA unique
#> # … with 81 more rows, and abbreviated variable names ¹`wikipedia$Beg`,
#> # ²`UNRV$Beg`, ³`britannica$Beg`, ⁴`Beg (3)`, ⁵`wikipedia$End`, ⁶`UNRV$End`,
#> # ⁷`britannica$End`, ⁸`End (3)`
# }