vignettes/GGO_agreements_codebook.Rmd
This document provides a brief overview of the coding rationale for key variables in the list of international agreements provided in `manytreaties::agreements$GGO`.
Note that this dataset was constructed as a complement to datasets such as the International Environmental Agreements Database (`manytreaties::agreements$IEADB`) and the Design of Trade Agreements Database (`manytreaties::agreements$DESTA`). As such, it is neither complete in observations nor variables, yet offers more specificity and some additional entries compared to such other datasets.
Work on this dataset was supported by the Swiss National Science Foundation (SNSF) Grant Number 188976: “Power and Networks and the Rate of Change in Institutional Complexes” (PANARCHIC).
Please direct all comments and suggestions to:
James Hollway
International Relations/Political Science Department
Graduate Institute of International and Development Studies
Geneva, Switzerland
james.hollway@graduateinstitute.ch
The variables are as follows: treatyID, Title, Begin, End, Signature, Force, Term, Grounds, TypeDomain, TypeAgree, TypeAmbit, TypeGeo, TypeSubject, Location, Depository, Published, Secretariat, Website, TitleAlt, Language, Languages, Text, TextURL, Coder, Comments, Source.
The `Title` variable is the title of the treaty or agreement. We interpret an agreement broadly to include not only formal instruments, such as treaties, conventions, or declarations, but also other types of international agreements such as memoranda of understanding, resolutions, protocols, and amendments. We generally use the title of the agreement as listed on the treaty text itself, or as listed in the source from which the treaty was identified.
In some situations an agreement might be known by several different titles. These situations include:
In such situations, the shorter or more common title is generally preferred for the `Title` variable, so long as it is unambiguous. If only the longer version of the title mentions the parties, subject, or other distinguishing information, then this title will be used as the `Title` variable. Any alternative titles are included in the `TitleAlt` variable, using commas to separate multiple alternative titles if necessary.
While we strive to include a comprehensive list of international agreements, including agreements that are not formal treaties (users can filter these out if they wish, see `TypeAgree`), there are some agreements that we do not include. For the most part, this is because the information they contain is not substantive, but rather procedural or administrative.
Note that exchanges of letters or similar documents that form part of a treaty are included.
In some cases treaties with exactly the same title are in fact different treaties. This might be because they were signed on different dates, or because they were signed by different parties. In other cases, treaties with different titles are actually the same treaty – signed on the same date by the same parties. We differentiated such entries as follows:
In other cases, treaties with very similar titles are actually the same treaties. This is more challenging to determine. We approach this deduplication task as follows:
With the list of remaining potential duplicates, we then check the texts of the treaties themselves, if available, or the sources from which they were identified. We also use Google to search for the treaty title and date of signature, to see if there are any sources that clarify whether these are the same or different treaties. We use the following heuristics to assist with this process:
Note that agreements that are amendments or protocols to existing treaties are retained, as these are often important in their own right.
A treatyID is a meaningful shorthand ID created from a combination of elements extracted from the agreement title and date. The treatyID allows users to identify date, type and linkage.
A bilateral treaty that is an agreement will have the following treatyID: “FRA-TON[DEZ]_1980A”. This is a combination of the parties to the agreement (FRA-TON), the activity described in the treaty, which here is “Delimitation of Economic Zone (DEZ)”, and the year of signature (1980), followed by the type (A = Agreement).
Type | Example | Pasting |
---|---|---|
Bilateral | FRA-TON[DEZ]_1980A | parties[subject]_year(type) |
Bilateral + Protocol | RUS-USA[FKC]_1967E:RUS-USA[FKC]_1965A | parties[subject]_year(type):linkage |
A bilateral treaty that is any other type than an agreement (e.g. a protocol or amendment) will have a treatyID in this format: “RUS-USA[FKC]_1967E:FKC_1965A”. It is composed of the parties (RUS-USA), the subject abbreviation from the treaty title, “Fishing for King Crab (FKC)”, the year of signature of the amendment (1967), and the type (E), which refers to Amendment. The linkage portion links the agreement to the treatyID of its “mother” treaty (RUS-USA[FKC]_1965A).
A multilateral treaty that is an agreement will have a treatyID format similar to: “HSPDF_2005A”. The treatyID indicates the acronym (HSPDF), the signature year of the agreement (2005), and the type (A).
Type | Example | Pasting |
---|---|---|
Multilateral | HSPDF_2005A | acronym_year(type) |
Multilateral + Protocol | SFDP_2007E2:H08F_1992A | acronym_year(type and number):linkage |
A multilateral treaty that is not an agreement will have this treatyID format: “SFDP_2007E2:H08F_1992A”. This represents the acronym (SFDP), the signature year of the amendment (2007), the type (E = Amendment), the amendment number (2), and the treatyID (H08F_1992A) of the “mother” treaty.
Type | Example | Pasting |
---|---|---|
Known treaties | UNCLOS1982A | abbreviation, uID, type |
Amendment of known treaties | J09H_1990E2:MARPOL1973A | acronym_uID, type(number):linkage |
Famous multilateral treaties have a simplified treatyID with a known abbreviation. For example, the United Nations Convention On The Law Of The Sea will have the following treatyID: “UNCLOS1982A”. This is the known abbreviation (UNCLOS), the signature year (1982) and the treaty type (A).
Protocols or amendments of the known treaties will have this treatyID format: “J09H_1990E2:MARPOL1973A”. It indicates the acronym (J09H), the signature year of this specific amendment (1990), the type of treaty (E), its number (2), and the treatyID of the “mother” treaty (MARPOL1973A).
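The formats described above can be taken apart programmatically. The following is a minimal sketch only: `parse_treaty_id()` is a hypothetical helper written for this illustration, not a function exported by the package, and it assumes that every treatyID ends in a four-digit year, a single capital letter for the type, and an optional number, with any linkage following a colon.

```r
# Hypothetical helper (not part of manytreaties): split a treatyID into parts.
parse_treaty_id <- function(id) {
  # Anything after the colon is the linkage to the "mother" treaty.
  parts <- strsplit(id, ":", fixed = TRUE)[[1]]
  own <- parts[1]
  linkage <- if (length(parts) > 1) parts[2] else NA_character_
  # Assumed layout: stem, optional underscore, year, type letter, optional number.
  pattern <- "^(.*?)_?([0-9]{4})([A-Z])([0-9]*)$"
  m <- regmatches(own, regexec(pattern, own, perl = TRUE))[[1]]
  list(
    stem    = m[2],                 # parties[subject], acronym, or abbreviation
    year    = as.integer(m[3]),     # year of signature
    type    = m[4],                 # e.g. A = Agreement, E = Amendment
    number  = if (nzchar(m[5])) as.integer(m[5]) else NA_integer_,
    linkage = linkage
  )
}

str(parse_treaty_id("SFDP_2007E2:H08F_1992A"))
str(parse_treaty_id("UNCLOS1982A"))
```

Splitting on the colon first keeps the sketch indifferent to how the linkage itself is written, so the linked treatyID can be passed back through the same helper.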
The `Begin` and `End` variables are the dates when an agreement is deemed to have begun or ended. Where more precise information is available, we also code the date of signature, the date when the agreement entered into force, and the date when the agreement terminated.
Dates are coded using the messydates system, which implements the Extended Date/Time Format (EDTF) of ISO 8601-2:2019. As such, some dates are only entered as a year or are annotated with a question mark if the source is uncertain. For more details see messydates.
Agreements that are known to still be in force at the time that the data was collected or checked have an end date of `9999-12-31`. However, an `NA` in this field simply means that we could not verify the status of the agreement at the time of coding.
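As a rough illustration of how these two conventions might be handled in analysis, the sketch below classifies agreements by their end date. It uses toy values rather than the dataset itself, and assumes that the `End` column has first been converted to plain character strings (for example with `as.character()`).

```r
# Toy values standing in for the End variable, as character strings.
end <- c("9999-12-31", "1995-06-30", NA, "2001")

# Check for NA first: a missing end date means "not verified", not "ended".
status <- ifelse(is.na(end), "unverified",
                 ifelse(end == "9999-12-31", "in force", "ended"))

data.frame(end, status)
```

Treating `NA` as its own category avoids silently counting unverified agreements as either terminated or still in force.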
Where the agreement has been terminated, we code the grounds for termination in the `Grounds` variable. The grounds for a treaty’s termination are coded as one of the following categories:
As this coding is incomplete, it is currently only available for a subset of the agreements. For more details, please reach out to the maintainer.
These variables are useful for filtering the dataset, and better understanding the lineage of certain agreements.
The domain is a simple character/string variable that indicates the main domain of the agreement. At the moment this is a fairly coarse coding, containing the following categories: . We have many more agreements for environment, trade, and health than the others.
The type of agreement is coded as one of the following categories:
For the most part our observations are either agreements, protocols, or amendments. The others are currently rather residual categories.
The ambit of the agreement is coded as one of the following categories:
We code geographical region more specifically than in some other datasets. We code the region descriptively and as a character string, which affords the opportunity to search by regular expression such as “America” to get “Northern America”, “Southern America”, “Central America”, and “Caribbean America”. Note that we use the adjectival form, e.g. “Southern Africa”, to distinguish the region from the country “South Africa”. We use “Central” to describe areas in the middle of the continent, if applicable. The data includes the following regions:
However, we code not only regions but also specific geographical features, such as rivers, seas, and oceans. Codes that appear more than 8 times include:
For the geographical region to be “Global”, the agreement must explicitly refer to the whole world or globe.
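A short sketch of the kind of regular-expression search described above follows. The region strings are toy values taken from the examples in the text; in the dataset they would come from the region column, assumed here to be `TypeGeo` as the variable list suggests.

```r
# Toy region strings; in the dataset these would be read from the region
# column (assumed to be TypeGeo).
regions <- c("Northern America", "Southern Africa", "Central America",
             "Caribbean America", "Global", "Southern America")

# A plain substring pattern picks up every region containing "America".
americas <- regions[grepl("America", regions)]

# The adjectival form names the region; anchoring the pattern with ^ and $
# matches it exactly and would not match the country name "South Africa".
southern_africa <- regions[grepl("^Southern Africa$", regions)]

americas
southern_africa
```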
In addition to the region, we also code the subject of the agreement. We code this as a one- or two-keyword description of the subject matter of the agreement, for example “Water Energy”. This is a character/string variable, so users can search or filter as they wish. Filtering by regular expression is therefore possible, e.g. “Water|Energy” to get all agreements that mention either water or energy. So this could include agreements on water, energy, water and energy, water and pollution, or nuclear and energy, for instance.
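The difference between matching either keyword and matching both can be sketched as follows. The data frame is a toy stand-in with invented titles, and the subject column is assumed to be `TypeSubject`, following the variable list.

```r
# Toy stand-in for the dataset; titles are placeholders and the subject
# column name (TypeSubject) is an assumption.
agreements <- data.frame(
  Title = c("Agreement 1", "Agreement 2", "Agreement 3",
            "Agreement 4", "Agreement 5"),
  TypeSubject = c("Water", "Water Energy", "Nuclear Energy",
                  "Fisheries", "Water Pollution")
)

# Alternation: agreements that mention either water or energy.
either <- agreements[grepl("Water|Energy", agreements$TypeSubject), ]

# Requiring both keywords takes two conditions rather than one alternation.
both <- agreements[grepl("Water", agreements$TypeSubject) &
                     grepl("Energy", agreements$TypeSubject), ]

either$TypeSubject
both$TypeSubject
```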
Codes that appear more than 8 times include:
We have also attempted to record some additional information about each agreement, such as its sources, where available.
The `Location` variable is the name of the city where the agreement was concluded. In some cases it has been drawn from the title of the agreement, and in others from the treaty text itself or from metadata available from the source from which the treaty was identified. For the most part this is fairly straightforward; however, in some cases there is a second city. Since second locations are relatively rare, they appear only in the `Comments` variable for now.
Where applicable, we code the name of the secretariat or international organisation that administers the agreement. This is drawn from the treaty text itself, or metadata available from the source from which the treaty was identified. Where available, we also provide a link to the website of the agreement or its secretariat.
We have collected the text of many agreements where available. Texts for the agreements are formatted and stored in .txt files in the ‘data_raw/agreements/GGO/TreatyText’ folder within the package. The `Text` variable denotes if the corresponding text has been stored in the folder. Where possible, we provide a link to the source from which the text was obtained.
The `Language` variable indicates the language in which the treaty text is available as a two-letter code, e.g. “en”. We retain the English language version wherever available to facilitate analysis. However, some agreements are available in other languages; these are listed in `Languages` as a pipe-separated vector.
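A pipe-separated field can be split back into its elements before filtering. The sketch below uses toy values, and assumes that `Languages` holds the same kind of two-letter codes as `Language`.

```r
# Toy values standing in for the Languages variable.
languages <- c("en|fr", "en", NA, "es|fr|ru")

# fixed = TRUE matters here: "|" is the alternation operator in a regular
# expression, so it has to be treated as a literal separator.
lang_list <- strsplit(languages, "|", fixed = TRUE)

# Flag the entries that list French among the available languages.
has_french <- vapply(lang_list, function(x) "fr" %in% x, logical(1))

has_french
```

The same approach, with a comma as the separator, would apply to other fields that the codebook describes as comma-separated.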
The `Coder` variable is a comma-separated vector of the surnames of those who have added or verified data for each entry/observation. Where special conditions arise, the `Comments` variable offers a free text area for explanations or recording how the coding has changed from version to version. The `Source` variable should contain only links or bibliographic information for the sources used to add or verify information.