Case Study

Health Systems Trust

Streamlining and automating health data for faster, more reliable insights.
Location
South Africa
Industry
Health Research
Automating the DHB workflow has transformed a 3-week task into a 10-minute run, ensuring up-to-date data and freeing staff for analysis.

Client Profile

The Health Systems Trust (HST) is a non-profit organisation dedicated to strengthening health systems across South Africa and the broader southern African region. Since 1992, HST has partnered with government, donors, and communities to drive innovation in primary healthcare, research, quality improvement, and universal health coverage implementation.

The Challenge

The District Health Barometer (DHB) is HST’s flagship publication and dashboard, tracking the performance of all 52 health districts across South Africa. Producing the DHB, however, is a complex and time-consuming process. Data must be manually downloaded from multiple sources such as DHIS, BAS, and PERSAL, covering over 40 indicators. Spreadsheets are used to merge, transform, and code data to the correct geography, facilities, and programmes, with fuzzy matching in Excel alone taking up to three weeks. The challenge was to streamline this manual, fragmented process into an automated, accurate, and scalable system that reduces delays and ensures high-quality reporting.

Our Approach

We developed a cloud-based data pipeline to automate the ingestion, transformation, and integration of health data for the DHB. For DHIS data, a Python script packaged in a Docker container and deployed on Google Cloud Platform runs daily to ingest the latest indicators via API. Finance and HR data, which are downloaded once a year from the BAS and PERSAL systems respectively, are automatically merged and transformed using Python, which is also used for fuzzy matching of facility and programme codes. All data are written to BigQuery, where they are combined into a single unified view that feeds into both the annual DHB publication and its accompanying interactive dashboard.
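As an illustration of the fuzzy-matching step, the sketch below shows one way facility names from a finance or HR extract could be matched to a reference list of facility codes. It is a minimal example, not the production code: the facility names, the codes, the `best_match` helper, and the 0.85 threshold are all hypothetical, and Python's standard-library `difflib` is assumed here only because the case study does not name the matching library actually used.

```python
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not affect the score."""
    return " ".join(name.lower().split())


def best_match(name: str, candidates: dict[str, str], threshold: float = 0.85):
    """Return (code, score) for the closest candidate name.

    The code is None when the best similarity score falls below the threshold,
    so that uncertain matches can be sent for manual review instead.
    """
    target = normalise(name)
    best_code, best_score = None, 0.0
    for code, candidate in candidates.items():
        score = SequenceMatcher(None, target, normalise(candidate)).ratio()
        if score > best_score:
            best_code, best_score = code, score
    if best_score < threshold:
        return None, best_score
    return best_code, best_score


# Hypothetical reference list: facility code -> canonical facility name.
reference = {
    "FAC001": "Riverside Clinic",
    "FAC002": "Hilltop Community Health Centre",
    "FAC003": "Northgate District Hospital",
}

# Hypothetical raw names as they might appear in a source extract.
raw_names = ["riverside  clinic", "Hilltop Comunity Health Centre", "Seaview Mobile Unit"]

for raw in raw_names:
    code, score = best_match(raw, reference)
    print(f"{raw!r} -> {code} (similarity {score:.2f})")
```

Returning `None` below the threshold keeps doubtful matches out of the unified view until someone has checked them, which is one way to preserve accuracy while still removing most of the manual work.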

Previous HST design
New HST design with Wimmy

The Impact

The new data pipeline has transformed the production of the DHB. Data are now up to date, eliminating delays caused by once-off downloads and ensuring the latest DHIS updates are reflected. Automated merging and fuzzy matching in Python have reduced a task that previously took three weeks to less than ten minutes, and have improved accuracy by removing manual copying and pasting. The process now requires less human labour, freeing staff to focus on analysis and insights rather than repetitive data preparation. This solution delivers faster, more reliable reporting of information that is critical for policy makers and implementers of health programmes in South Africa.

“Automating the DHB data pipeline has transformed a laborious process into a fast, reliable, and scalable workflow, allowing our team to focus on analysis and action rather than manual data preparation.”
- Noluthando Ndlovu, Programme Manager, HST