HXL tagging conventions (version 1.1)
Release 1.1, 2018-04-30. Part of the HXL 1.1 standard.
1. Introduction
This document is part of the Humanitarian Exchange Language (HXL) version 1.1, a standard for increasing the efficiency and effectiveness of data exchange during humanitarian crises. This new version is fully backwards-compatible with data produced using HXL 1.0 (released 18 March 2016), and adds several new features, including JSON-based encodings and a standard way to refer to taxonomies/controlled vocabularies. There are also several new hashtags and attributes in the hashtag dictionary.
The intended audience for this specification is information-management professionals and software developers who require a formal definition of the HXL syntax. Most users who simply want to add hashtags to their data may prefer the HXL postcards and the tutorial information at hxlstandard.org, as well as interactive HXL tool support under development at the Humanitarian Data Exchange (HDX).
The HXL standard consists of two normative parts:
- HXL tagging conventions (this document): instructions for adding HXL hashtags to spreadsheets.
- HXL hashtag dictionary: a list of hashtags for identifying humanitarian data fields.
1.1. Design philosophy
HXL is a lightweight standard by design. Most data standards dictate to users how they should collect and format their data; HXL, on the other hand, encourages organisations to add hashtags to their existing datasets, without requiring new skills or software tools, and interferes as little as possible in their current ways of working. The primary focus of HXL is tabular-style data such as spreadsheets or API output from database tables, which represent the vast majority of the operational data collected in the humanitarian sphere; however, HXL hashtags can potentially have other applications, including labelling attributes for map layers or identifying data types in SMS messages.
1.2. Target audience
The standard’s primary audiences are information-management specialists who are familiar with spreadsheets or relational databases, and computer programmers and database specialists looking to consume data produced by those information-management specialists.
1.3. Terms of use
HXL is available as an open standard: the working groups have designed it for use with humanitarian data, but people and organisations are welcome to use it for any purpose they choose. Note, however, that users may not claim support or endorsement from any members of the HXL working group or the organisations for which they work. The authors offer no warranty of any kind, so implementors use the standard at their own risk. The text of the standard itself is released into the public domain.
2. Adding HXL hashtags to data
2.1. Spreadsheet data (e.g. CSV, XLS, XLSX)
Consider the following simple spreadsheet:

LOCATION NAME | LOCATION CODE | NUMBER AFFECTED |
---|---|---|
Camp A | 01000001 | 2000 |
Camp B | 01000002 | 750 |
Camp C | 01000003 | 1920 |
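A person can read these headers easily, but software cannot rely on them, because the same column may be labelled in many ways. The third column, for example, might carry any of these headers:
- Number affected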
- Affected
- People affected
- # de personnes concernées
- Afectadas/os
- عدد الأشخاص المتضررين
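With a row of HXL hashtags between the header row and the data, each column can be identified whatever the header text says:

LOCATION NAME | LOCATION CODE | NUMBER AFFECTED |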
---|---|---|
#loc +name | #loc +code | #affected |
Camp A | 01000001 | 2000 |
Camp B | 01000002 | 750 |
Camp C | 01000003 | 1920 |
The hashtag row can also follow more than one row of headers:

CAMP INFORMATION | NEEDS | |
---|---|---|
LOCATION NAME | LOCATION CODE | NUMBER AFFECTED |
#loc +name | #loc +code | #affected |
Camp A | 01000001 | 2000 |
Camp B | 01000002 | 750 |
Camp C | 01000003 | 1920 |
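
As a rough illustration of how software might consume such a spreadsheet, the sketch below reads CSV text and looks for the hashtag row beneath any header rows. The function name `find_hashtag_row` and the detection rule (every non-empty cell starts with "#" followed by a letter) are assumptions made for this example, not a rule taken from this section.

```python
import csv
import io
import re


def find_hashtag_row(rows):
    """Return the index of the first row that looks like a hashtag row, or None."""
    for index, row in enumerate(rows):
        cells = [cell.strip() for cell in row if cell.strip()]
        # Assumed heuristic: at least one non-empty cell, and every non-empty
        # cell starts with "#" immediately followed by a letter. This skips
        # ordinary headers such as "# de personnes concernées".
        if cells and all(re.match(r"#[A-Za-z]", cell) for cell in cells):
            return index
    return None


# CSV text mirroring the table above (two header rows, then the hashtag row).
SAMPLE = """CAMP INFORMATION,NEEDS,
LOCATION NAME,LOCATION CODE,NUMBER AFFECTED
#loc +name,#loc +code,#affected
Camp A,01000001,2000
Camp B,01000002,750
Camp C,01000003,1920
"""

rows = list(csv.reader(io.StringIO(SAMPLE)))
tag_index = find_hashtag_row(rows)
hashtags = rows[tag_index]       # the hashtag row itself
data = rows[tag_index + 1:]      # everything after it is data
```

Because `csv.reader` returns every cell as a string, location codes such as 01000001 keep their leading zeros.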
2.2. JSON data
It is becoming increasingly common for organisations to share data through APIs. HXL can make that data interoperable through its support for JSON, the format most commonly used by APIs. HXL is purposely restricted to a simplified subset of the full JSON specification: the data must be laid out in a non-hierarchical, tabular form. Two such forms are currently supported.
2.2.1. Array of objects JSON style
This is a very common way to present data. Each row is an object that maps hashtag keys to values:

    [
      { "#hashtag": value, "#hashtag": value },
      { "#hashtag": value, "#hashtag": value }
    ]

An example of this is shown below:

    [
      {
        "#event+id": 1,
        "#affected+killed": 1,
        "#region": "Mediterranean",
        "#meta+source+reliability": "Verified",
        "#date+reported": "05/11/2015",
        "#geo+lat": 36.891500,
        "#geo+lon": 27.287700
      },
      {
        "#event+id": 3,
        "#affected+killed": 1,
        "#region": "Central America incl. Mexico",
        "#meta+source+reliability": "Partially Verified",
        "#date+reported": "03/11/2015",
        "#geo+lat": 15.956400,
        "#geo+lon": -93.663100
      }
    ]

Where tabular data would repeat the same hashtag, e.g. several #sector columns to express multiple sectors, the equivalent in this format is a comma-separated list of values (see 4.1.1. The +list attribute), e.g.

    "#sector": "WASH,health"

Note that the array of objects form does not allow for human-readable headers. If there is demand for these, and the array of arrays form outlined below does not suffice, then they will appear in a future version of the standard.
Note: HXL allows hashtag attributes to appear in any order, case-insensitive, with or without whitespace separating them, so these are all considered equivalent: “#affected+f+children”, “#affected +children +f”, and “#affected+Children+F”. In JSON objects, it is essential that the property names be consistent, so you should take the following steps when converting a HXL hashtag specification (hashtag and attributes) for use as a JSON object property:
- Convert to lowercase.
- Remove all whitespace.
- Present the attributes in US-ASCII alphabetical order.
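
The three steps above can be sketched in a few lines of Python. The function name `normalise_hashtag` is hypothetical, and the sketch assumes the input is a well-formed hashtag specification (a hashtag followed by zero or more "+"-prefixed attributes).

```python
def normalise_hashtag(spec):
    """Normalise an HXL hashtag specification for use as a JSON property name."""
    # Steps 1 and 2: convert to lowercase, then remove all whitespace.
    compact = "".join(spec.lower().split())
    # The first "+"-separated part is the hashtag; the rest are attributes.
    tag, *attributes = compact.split("+")
    # Step 3: sorted() compares strings by code point, which is the same as
    # US-ASCII alphabetical order for ASCII-only attribute names.
    return "+".join([tag] + sorted(attributes))


variants = ["#affected+f+children", "#affected +children +f", "#affected+Children+F"]
normalised = [normalise_hashtag(v) for v in variants]
```

All three equivalent spellings from the note above produce the same property name, `#affected+children+f`.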
2.2.2. Array of arrays JSON style
Although not widely used, this form is well suited to visualisations: it is significantly more compact than the array of objects form, because the hashtags are defined only once, in the first element of the outer array:

    [
      ["#hashtag", "#hashtag"],
      [value, value],
      [value, value]
    ]

Below is an example:

    [
      ["#event+id", "#affected+killed", "#region", "#meta+source+reliability", "#date+reported", "#geo+lat", "#geo+lon"],
      [1, 1, "Mediterranean", "Verified", "2015-11-05", 36.891500, 27.287700],
      [3, 1, "Central America incl. Mexico", "Partially Verified", "2015-11-03", 15.956400, -93.663099]
    ]

If headers are needed, they can be added as an extra array before the hashtags.
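
To show how the two JSON styles relate, here is a minimal sketch that converts already-parsed array-of-arrays data into the array of objects style. The function name `arrays_to_objects` is hypothetical, and the way it locates the hashtag row (the first row in which every cell is a string starting with "#") is an assumption made for illustration. Repeated hashtags are merged into a comma-separated string, following the #sector example in 2.2.1.

```python
def arrays_to_objects(table):
    """Convert array-of-arrays HXL data into a list of hashtag-keyed dicts."""
    # Skip any optional header arrays. Assumption: the hashtag row is the
    # first row whose cells are all strings starting with "#".
    for index, row in enumerate(table):
        if row and all(isinstance(c, str) and c.startswith("#") for c in row):
            break
    else:
        raise ValueError("no hashtag row found")

    hashtags = table[index]
    objects = []
    for row in table[index + 1:]:
        obj = {}
        for tag, value in zip(hashtags, row):
            if tag in obj:
                # Repeated hashtag: merge the values into a comma-separated list.
                obj[tag] = f"{obj[tag]},{value}"
            else:
                obj[tag] = value
        objects.append(obj)
    return objects


table = [
    ["#event+id", "#affected+killed", "#region"],
    [1, 1, "Mediterranean"],
    [3, 1, "Central America incl. Mexico"],
]
objects = arrays_to_objects(table)

# A header array before the hashtags, plus a repeated hashtag.
sectors = arrays_to_objects([
    ["Sector 1", "Sector 2"],
    ["#sector", "#sector"],
    ["WASH", "health"],
])
```

The sketch copies hashtags as they appear in the input; it does not rewrite them into a consistent property-name form.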