Defining Data for Health Justice
Introduction: Data is a powerful tool for justice and a language for accountability. However, data can easily be manipulated and used to entrench inequities. In this lesson, I'll cover some of the fundamentals of how we define data and what this means for health justice.
Why this lesson is important: Organisations seek data to fill knowledge gaps when making decisions, but their data practices often fail to account for more human and complicated factors, such as community lived experience and health impacts. These factors may be at odds with commercial and political agendas that prioritise ownership, capital, and the benefit of a few over the health of communities. To help communities evidence the impacts of structural injustices and shape community-favoured health justice solutions, we need ways for communities to “speak this language” in their advocacy while still maintaining their priorities and values.
Story: Let’s begin with the definition of data. We can look at data as factual information, such as measurements, statistics, or documented responses, used as a basis for reasoning, discussion, or calculation. One useful definition is from the UN, which calls data “the lifeblood of decision-making and the raw material for accountability.” This definition shows why it's important for communities experiencing injustice to feel empowered to create data that serves their advocacy and justice needs.
We can break data down into two primary uses:
- Creating understanding of a phenomenon. In the case of health justice, this could be your health, your community’s needs, or your relationship with your environment.
- Informing decisions that individuals or institutions make based on their knowledge of people, places, or systems. 
Two fundamental ways of classifying data are quantitative versus qualitative and primary versus secondary. Each type has its pros and cons, and its own reasons for being chosen.
Quantitative data can be measured or expressed as numbers, like air quality readings or the number of residents. Qualitative data includes responses, observations, or descriptions, like how a space feels or the professions found in a community.
Quantitative data is easier to calculate with but often lacks context. Qualitative data offers depth and humanity, but is harder to compare or quantify. Neither is superior; we often need both.
Primary data is what you collect yourself. Primary data collection gives you control over the narrative and framing of the data, and it can draw on your proximity to the community for relevance and accuracy. But primary data often takes more effort to produce beyond a certain scale, and institutions may question it if they do not see it as “scientific,” even when it is more representative of the phenomenon.
Secondary data comes from others, often with wider coverage (e.g., city or national data). Secondary data can be easier to access, but you rely on how ethically and accurately others collected and labeled the data. Again, neither is better—they serve different roles and people often use a mix of these types.
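For anyone who keeps community records digitally, the two ways of classifying data described above can be sketched as simple labels attached to each record. This is only an illustration, not a required method: the names `Form`, `Origin`, `Record`, and `summarise` are hypothetical, and the entries are placeholders based on the examples in this lesson rather than real data.

```python
from dataclasses import dataclass
from enum import Enum


class Form(Enum):
    QUANTITATIVE = "quantitative"  # numbers or measurements
    QUALITATIVE = "qualitative"    # responses, observations, descriptions


class Origin(Enum):
    PRIMARY = "primary"      # collected by you or your community
    SECONDARY = "secondary"  # collected by someone else


@dataclass
class Record:
    description: str
    form: Form
    origin: Origin


# Illustrative placeholder entries only; none of these come from a real dataset.
records = [
    Record("Air quality reading from a community-run monitor",
           Form.QUANTITATIVE, Origin.PRIMARY),
    Record("Number of residents taken from a city dataset",
           Form.QUANTITATIVE, Origin.SECONDARY),
    Record("A resident's description of how a space feels",
           Form.QUALITATIVE, Origin.PRIMARY),
]


def summarise(items):
    """Count how many records fall under each (form, origin) pair."""
    counts = {}
    for item in items:
        key = (item.form.value, item.origin.value)
        counts[key] = counts.get(key, 0) + 1
    return counts


summary = summarise(records)
for (form, origin), count in sorted(summary.items()):
    print(f"{form}, {origin}: {count}")
```

A summary like this can show at a glance whether a collection leans heavily on one type of data, which may prompt a community to add the types it is missing.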
You’ve likely heard of lived experience, which refers to insights from people’s actual lives, like self-reported symptoms or a shop owner noting common customer questions. These everyday observations are data, too.
Lived experience matters. Data doesn’t need to be technical or algorithmic. A note in your phone tracking when you cough is valid data for that experience. This leads to the idea of data culture: the comfort and confidence that you and your community have in using data to make informed decisions. It’s about the agency to gather, analyse, and communicate data to support your goals. Data is not neutral; it always serves a purpose, whether that’s understanding, aligning people, or expressing lived realities.
The Ada Lovelace Institute coined the term data divide during the pandemic, referring to the inequality created as more data-driven systems are adopted and many communities are left behind due to complexity or lack of access. This divide often arises when the purpose of data collection becomes disconnected from the people it’s meant to serve. To close this divide, communities need strategies for deciding what data matters to them, what’s ethical to collect, and what language or format best supports their needs. No matter how advanced the data system, data is not the phenomenon. If you’re sick, no data perfectly reflects how sick you feel—it’s just a reference point to help inform understanding or decisions.
Communities often know the phenomena they’re experiencing, but existing mechanisms may not capture them, or their data may be dismissed as unscientific. A useful example is from Clean Air for Southall and Hayes (CASH). In Southall, residents reported worsening respiratory health after nearby construction began. Public Health England reviewed specific pollutant readings and concluded there wasn’t significant harm. But if they had used a holistic approach that included pollution levels, NHS visits, and construction timing, they might have reached a different conclusion. The question becomes: does adding pollution help people who are already sick, even if the reviewers did not believe that the pollution was causing their sickness?
This shows no single dataset tells the whole story. That’s why in other case studies, we emphasize using a mix of data, not just air monitors quantifying air quality. An ideal data culture for health justice includes:
- Trusted, transparent resources for understanding health and environment. 
- Clear instructions for self-reporting strategies. 
- Community repositories, like local libraries, to store and reflect shared experiences. 
- Responsive stakeholders who use updated community data to support health. 
- A sense of agency—that anyone can contribute to and interpret data. 
Ultimately, data is not just digits and monitor readings—it’s a language that everyone can generate. Defining your intentions with data starts with questions like: What are we trying to understand? What decisions are being made that need our voices? What kind of data supports those discussions?
Learning Points:
- Data can be seen as a language to create understanding and inform decisions. Different types of data (such as quantitative vs qualitative, primary vs secondary) can contribute towards these two goals in different ways. No type of data is superior to another.
- Developing your data culture as a community is important for being able to understand and affect macro decisions that can impact your health and environment. 
- Lived experience data is a crucial and valid source of data that does not need high-tech methods to create.
- Data is a proxy for a phenomenon. The data about your health, for instance, is just a metric to create understanding and support decisions but cannot fully represent the experience. 
- An ideal data culture for health justice involves a mix of transparent and accessible sources of current knowledge, intuitive and achievable strategies for activities such as self-reporting, trusted sources to collect and store this information, and a greater involvement in using these processes to embed civic participation in decisions that affect the health and environments of communities. 
Segueing into the next Lesson theme: In the next lesson, we'll build on this understanding of data and explore how it relates to health. We'll discuss how air pollution and health are separate yet interacting phenomena, and what that means for the data used to frame the health of communities who are being polluted.
 
                        