
The report family of functions lets users quickly retrieve and compare information about a package in the many packages universe, including its databases and datasets.


data_source(pkg, database = NULL, dataset = NULL)

data_contrast(pkg, database = NULL, dataset = NULL)

data_evolution(pkg, database, dataset, preparation_script = FALSE)

open_codebook(pkg, database, dataset)



pkg: Character string naming the many package to report data on. Required input.


database: Vector of character strings naming the specific database(s) in the many package to report data on. If NULL, the function returns a summary of all databases in the many package. NULL by default for data_source() and data_contrast().


dataset: Character string naming a specific dataset in a specific database of the many package. If NULL and database is specified, the function returns database-level metadata. NULL by default for data_source() and data_contrast().


preparation_script: Logical. Should the preparation script for the dataset be opened? FALSE by default.


data_source(): A data frame with the data sources.

data_contrast(): A list with the metadata needed to compare the datasets in a many package.

data_evolution(): Either a comparison between the raw data and the data made available, or the preparation script detailing all the steps taken to prepare the raw data before making it available in one of the 'many' packages.

open_codebook(): Opens a PDF version of the original codebook of the specified dataset, if available.


data_source() displays the names of the databases and datasets in a many package, along with the source material for their data.

data_contrast() displays information about databases and the datasets they contain: the number of unique IDs, the percentage of missing data, the number of observations, the number of variables, the earliest beginning date, the latest ending date, and the most direct URL to the original dataset.
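To make these summary measures concrete, the following base R sketch computes them for a single toy data frame. It is an illustration under stated assumptions, not the code that data_contrast() runs: the helper contrast_one() is hypothetical, and the column names ID, Beg and End are assumed here for the example.

```r
# Illustrative sketch only: this is NOT the manydata implementation.
# Toy dataset; the column names ID, Beg and End are assumptions.
toy <- data.frame(
  ID  = c("A", "B", "B", "C"),
  Beg = as.Date(c("2001-01-01", "1999-06-15", "2005-03-10", NA)),
  End = as.Date(c("2010-12-31", NA, "2015-07-01", "2020-01-01")),
  stringsAsFactors = FALSE
)

# Hypothetical helper that summarises one dataset in a single row.
contrast_one <- function(df) {
  data.frame(
    # Number of distinct identifiers
    Unique_ID    = length(unique(df$ID)),
    # Share of NA cells across the whole data frame, as a percentage
    Missing_Data = paste0(round(100 * mean(is.na(df)), 2), " %"),
    # Observations and variables
    Rows         = nrow(df),
    Columns      = ncol(df),
    # Earliest beginning date and latest ending date, ignoring NAs
    Beg          = min(df$Beg, na.rm = TRUE),
    End          = max(df$End, na.rm = TRUE)
  )
}

summary_row <- contrast_one(toy)
print(summary_row)
```

In this sketch the missing-data percentage is taken over all cells of the data frame; whether the package counts missingness the same way is not stated here.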

data_evolution() enables users to access the differences between raw data and the data made available to them in one of the 'many' packages.

open_codebook() opens the original codebook of the specified dataset so that users can look up the original coding rules. Note that some datasets have no original codebook. In that case, please refer to the source URL provided with each dataset by running manydata::data_contrast(), as further information on coding rules may be available online.


# \donttest{
data_source(pkg = "manydata")
#> Component 1 :
#>            Reference                                                                                                    
#> wikipedia  "(????). “List_of_Roman_emperors.”<tps://>. Accessed" [truncated]
#> UNRV       "(????). “Roman Emperor list.”<>. Accessed: 2021-07-2" [truncated]
#> britannica "(????). “List of Roman emperors.”<" [truncated]
# }
# \donttest{
data_contrast(pkg = "manydata")
#> emperors :
#>            Unique ID Missing Data Rows Columns         Beg         End
#> wikipedia         68        9.9 %   68      15 -0026-01-16  0014-08-19
#> UNRV              98       6.06 %   99       7 -0027-01-01 -0014-12-31
#> britannica        87          0 %   87       3 -0031-01-01  0014-12-31
#>                                                                        URL
#> wikipedia                 
#> UNRV                 
#> britannica
# }
# \donttest{
data_evolution(pkg = "manydata", database = "emperors",
               dataset = "wikipedia")
#> Raw data could not be open or is not available for this dataset,
#>               opening preparation script instead.
#> [1] 0
#data_evolution(pkg = "manytrade", database = "agreements",
#dataset = "GPTAD")
# }