The challenges and how to overcome them (2/4)

Challenge n°2: Access to raw datasets for the data landscape exercise

Data is becoming more public and transparent. Web-based platforms, such as DHS STATcompiler, UN data, DEVINFO, DHIS2, and the NADA repository, are now more commonly used. Summary statistics are also frequently published in statistical books, survey reports and web platforms. However, obtaining access to the raw datasets can be problematic and time-consuming because:

  • For ethical reasons, data must be anonymised before it can be shared.
  • Some institutions are understandably reluctant to share sensitive data in the absence of a legal framework for data sharing.

Although it is not necessary to have access to the raw datasets to complete the data landscape exercise, the exercise will need to assess the practical accessibility of the datasets by:

  • Inquiring about the formal procedural steps to obtain official permission and access the data.
  • Interviewing external users on their experience of accessing the data.
  • Making an actual attempt to access the database by:
    • Downloading and opening databases that are formally available on the web to establish whether an access code is required, whether all indicators are available and whether the data is anonymised.
    • Requesting access from the data providers. Even if it is not possible to access a particular dataset, it is helpful to know where datasets are stored, and the process and information required for access. It is also important to establish whether reports are available that describe the sampling method, data quality controls and results.

Efforts to establish the accessibility of datasets can be time-consuming. In an ideal situation, a legal framework for data sharing exists. In the absence of such a framework, the NIPN country team should advocate for one, but will at the same time need to find a pragmatic solution for accessing data. Based on past experience, the following factors may help facilitate access to datasets from multiple sectors:

  • Develop relationships with data providers: Access to datasets can depend on individual relationships based on trust. The data landscape exercise is a good way of identifying key data providers, building relationships and raising their awareness and understanding of the National Information Platform for Nutrition. This includes explaining what NIPN intends to do with the data, addressing the concerns of data providers, and understanding the information and authorisation needed to access data.
  • Engage data providers in NIPN: Experience from the Nutrition Evaluation Platforms project showed that having key data providers as members of the technical committee was an efficient means of accessing datasets.
  • Coordinate with data providers: The NIPN data experts should not work in silos but should instead involve data providers in the interpretation and communication of the data. The outputs of the data analyses should be beneficial to data providers as well. For example, in Côte d’Ivoire the Prime Minister provided the NIPN team with an official letter addressed to data providers to grant the NIPN systematic access to the datasets.
  • To update a central repository (based on the NADA software solution), the National Statistical Office of Burkina Faso organised a one-week workshop in 2012 with focal points from the different ministries. Each focal point brought their datasets to be uploaded. Discussion on the harmonisation of indicators formed the basis of a plan of action. The workshop created a dynamic forum that facilitated the sharing of data, removing the need to request datasets from each stakeholder individually.