The report family of functions lets users quickly retrieve and compare information about a package in the many packages universe and about its databases and datasets.
Usage
data_source(pkg, database = NULL, dataset = NULL)
data_contrast(pkg, database = NULL, dataset = NULL)
data_evolution(pkg, database, dataset, preparation_script = FALSE)
open_codebook(pkg, database, dataset)
Arguments
- pkg
Character string naming the many package to report on. Required.
- database
Vector of character strings naming one or more databases in the many package to report on. If NULL, the function returns a summary of all databases in the many package. NULL by default for data_source() and data_contrast().
- dataset
Character string naming a specific dataset within a specific database of the many package. If NULL and database is specified, the function returns database-level metadata. NULL by default for data_source() and data_contrast().
- preparation_script
Logical. Whether to open the preparation script for the dataset. FALSE by default.
Value
data_source(): a data frame with the data sources.
data_contrast(): a list with the metadata needed to compare the datasets in a many package.
data_evolution(): either the comparison between the raw data and the available data, or the preparation script detailing all the steps taken to prepare the raw data before making it available in one of the 'many' packages.
open_codebook(): opens a PDF version of the original codebook of the specified dataset, if available.
Details
data_source() displays the names of the databases and datasets in a many package, along with the source material for their data.
data_contrast() displays information about databases and the datasets they contain, namely the number of unique IDs, the percentage of missing data, the number of observations, the number of variables, the earliest beginning date and the latest ending date, and the most direct URL to the original dataset.
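As a rough illustration of what these summary figures mean, the sketch below computes analogous values for a small made-up data frame using base R only. It is not how manydata computes them: the `toy` data, the `summarise_dataset()` helper, and the column names `ID`, `Beg`, and `End` are all assumptions introduced for this example.

```r
# Hypothetical toy dataset; not taken from any many package.
toy <- data.frame(
  ID  = c("A", "B", "B", "C"),
  Beg = as.Date(c("2001-01-01", "1999-06-15", NA, "2005-03-20")),
  End = as.Date(c("2010-12-31", NA, "2003-01-01", "2020-07-04")),
  stringsAsFactors = FALSE
)

# Hypothetical helper that mirrors the kinds of figures data_contrast() reports.
summarise_dataset <- function(df, id = "ID", beg = "Beg", end = "End") {
  list(
    unique_id   = length(unique(df[[id]])),       # number of distinct IDs
    missing_pct = round(100 * mean(is.na(df)), 2), # share of NA cells, in percent
    rows        = nrow(df),                        # number of observations
    columns     = ncol(df),                        # number of variables
    beg         = min(df[[beg]], na.rm = TRUE),    # earliest beginning date
    end         = max(df[[end]], na.rm = TRUE)     # latest ending date
  )
}

toy_summary <- summarise_dataset(toy)
str(toy_summary)
```

Comparing such summaries side by side for several datasets is the kind of overview that data_contrast() provides for the datasets in a database.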
data_evolution() lets users inspect the differences between the raw data and the data made available to them in one of the 'many' packages.
open_codebook() opens the original codebook of the specified dataset so that users can look up the original coding rules. Note that some datasets may have no original codebook. In that case, refer to the source URL provided with each dataset, which you can retrieve by running manydata::data_contrast(), as further information on coding rules may be available online.
Examples
# \donttest{
data_source(pkg = "manydata")
#> Component 1 :
#> Reference
#> wikipedia "(????). “List_of_Roman_emperors.”<https://en.wikipedia.org/wiki/List_of_Roman_emperors>. Accessed" [truncated]
#> UNRV "(????). “Roman Emperor list.”<https://www.unrv.com/government/emperor.php>. Accessed: 2021-07-2" [truncated]
#> britannica "(????). “List of Roman emperors.”<https://www.britannica.com/topic/list-of-Roman-emperors-20432" [truncated]
#>
# }
# \donttest{
data_contrast(pkg = "manydata")
#> emperors :
#> Unique ID Missing Data Rows Columns Beg End
#> wikipedia 68 9.9 % 68 15 -0026-01-16 0014-08-19
#> UNRV 98 6.06 % 99 7 -0027-01-01 -0014-12-31
#> britannica 87 0 % 87 3 -0031-01-01 0014-12-31
#> URL
#> wikipedia https://github.com/zonination/emperors
#> UNRV https://www.unrv.com/government/emperor.php
#> britannica https://www.britannica.com/topic/list-of-Roman-emperors-2043294
#>
# }
# \donttest{
data_evolution(pkg = "manydata", database = "emperors",
               dataset = "wikipedia")
#> Raw data could not be open or is not available for this dataset,
#> opening preparation script instead.
#> [1] 0
#data_evolution(pkg = "manytrade", database = "agreements",
#dataset = "GPTAD")
# }