MODULE 1 | LESSON 4

Data & Community Health

INTRODUCTION

Data refers to individual items of information, facts or statistics that can often be communicated in a numerical language.

Due to its numerical articulation, data is often perceived as neutral, unbiased, or accurate. It receives these descriptors with little resistance, often verging on the dogmatic belief that data cannot be wrong. This creates a supremacy culture around data, where it rules over other methods of observation such as an individual’s accounts of their lived experience, which are often dismissed as anecdotal or not rigorous enough to count. There is another strand to this supremacy culture, which is that those who can conduct this type of work are seen as above those who do not. In the context of community health, this has put millions of people at risk of poor health outcomes.

For example, at Centric Lab we have been following two communities, in Southall, London and Newcastle-Under-Lyme, Staffordshire, who have been experiencing acute air pollution due to the remediation of land on a former gasworks site and an open landfill site respectively. Although very different communities, there are similarities in how their lived experiences contradict the establishment’s methods and evidence. Both communities have been told by authorities, such as Public Health England and the Environment Agency, that they are not at risk of poor health outcomes. Despite the authorities’ proclamations, both communities are seeing spikes in hospitalisation, and people are experiencing nausea, headaches, problems breathing, chest pains, and various other poor health outcomes which they directly relate to the activity they are being told is “safe”.

It is not good enough to simply say “the data says” and walk away as it contributes to health injustice and inequity. These communities are being told that their lived experience does not matter, that their data does not matter, and that their intellect does not matter.

Learning Points

  • According to Merriam-Webster Dictionary, data is “factual information (such as measurements or statistics) used as a basis for reasoning, discussion, or calculation” (source). An important distinction to make is that data itself is an often unstructured set of symbols and/or characters. This essentially means data is not inherently good or bad, as those layers are added when data starts to get interpreted. By the time someone uses data to make a decision, present a fact, or justify an opinion, the data has gone through some form of framing to make it meaningful to some stakeholder(s).

    Understanding that health is in relation to our external environment means accepting that data alone cannot provide us with the full phenomenological experience of health. The context, collection, and communication of data should:

    • Consider various systemic factors, such as where a person lives, the type of work they do, their social infrastructure, and their access to resources, in order to build a more complete picture of health through data.

    • Consider the lived experience, which means that data alone cannot tell us the full picture of a person’s health or ability to heal. To fully capture lived experience, we need to create an equitable data ecosystem where a community can gather their own data and have it be as important as the data gathered by practitioners (source). 

    • Consider that the assessment of health is not “input-output”. For example, a low-level reading of air pollution does not inherently equate to low risk to health, as the person could already be experiencing asthma or have other biological burdens such that even low levels present a high risk to their health (source).

    Data-driven health is a process of creating a health infrastructure that relies on analysing data on individuals and communities in order to create campaigns, policies, and solutions. For example, taking a test to determine if a prescription or referral to a specialist is needed is data-driven health. Creating a new policy to keep pollutants in line with national or global targets is data-driven health. Since data is itself a neutral tool and its uses vary, data-driven health is neither good nor bad but can become either based on how it is executed. 

    Data-driven health is simply a reality of the way we create modern health knowledge that we need to approach with a critical eye if we hope to move towards health justice and healing. We have to be critical of data-driven health because of the role data can play in creating marginalised communities, cultural asymmetry, and destructive narratives.

  • The UK COVID-19 response showed some data-based disparities, what the Ada Lovelace Institute refers to as ‘the data divide’, which includes factors such as service accessibility, digital infrastructure, awareness of technologies, and familiarity with or attitudes towards technology (source, source). Many of these outcomes are a result of using data as evidence in a situation where there is already a power inequity and distrust in structural stakeholders.

    This inequity can be the difference between asking an open-ended question (such as “how do you spend your evenings?”) and a misleadingly or offensively closed question (such as “how often do you participate in anti-social behaviour?”) to create data that frames a community in a certain light. Sometimes, this is purely unintentional due to a lack of self-awareness on research methods. Other times, it is an intentional power play that is steeped in a culture of supremacy. Regardless of intention, data can become a tool for injustice when it is used on people rather than with people to create a narrative.

  • The DIKW model, which was created in the late 1980s, is referenced in Milan Zeleny’s Management Support System (source). Though not originating in the health domain, the DIKW model is a useful framework, because it provides stages in which insight is needed to provide greater purpose to what starts as data.

    The model presents four stages: Data, Information, Knowledge, and Wisdom. We acknowledge that in practice the lines between each layer may not always be clear cut. There are two approaches to navigating from one stage of the DIKW model to the other (source):

    • The first is to increase the metrics. This means that if you gather enough isolated data points, it becomes information. Create enough information by paired sets of linked data and it becomes knowledge. Gather enough knowledge from those knowledge clusters and it becomes wisdom. In order for this flow to work at each step there must be discussion and reassessment, otherwise, there is a risk that we carry misinformation from the data stage to the wisdom stage.

    • The second approach involves dimensions. This movement is more of a qualitative transcendence up the model. That is because it relies more on our intuition and cognitive parsing of concepts. For instance, you might decide that transcending from information to knowledge is understanding the model of a car, the owner of a car, and the times it is parked outside as information contributing to knowledge that your neighbour parks their car of a certain model in that location every weeknight. In a health context, knowledge could be that someone is ill or that you know what the general traits of an ill person are. Wisdom then becomes your heuristic for how you respond to these situations. Wisdom grows as more experiences develop your knowledge. In a health sense, you might develop wisdom on how to maintain your health during the week by various situational knowledge of your environment and your own body.

    This second approach is in alignment with how communities can develop community sovereignty over their own form of health justice. 

    Health Data

    According to the Data Protection Act of 2018, ‘data concerning health’ is “personal data relating to the physical or mental health of an individual, including the provision of health care services, which reveals information about their health status” (source). Therefore, when we talk about data, we are referring to what is directly being collected as a neutral, apolitical term. But when we mention health data, it is data that can inform and affect decisions regarding our health. While health data starts unstructured like any other data, it quickly becomes valuable and ceases to be apolitical in action.

    To build data practices that enable and promote health justice, ethics and due diligence must start with ethical health data collection. In ethical data practices the data is collected with community consent and comfort. Even though the data itself may be unstructured and of little value in isolation, a community must feel safe in the data being generated and collected. External collectors need to be mindful of cultural barriers around certain data collection practices and treat community concerns as an opportunity to create novel data collection practices. A community may, for instance, not allow researchers to talk to children alone but be happy to develop prompts that they can provide to children and return with answers. Keeping an open mind at the beginning increases the likelihood that the data collection practices are in cultural alignment, the first step towards a community being comfortable using that data for health justice purposes.

    Health Information

    Health information has explicit meaning around the data, which makes the data identifiable. Security and community consent become fundamental to the development of health information. Common practice would be to use terms from the scientific literature to describe each piece of information. If our goal is community health justice, there needs to be a translational phase where practitioners either teach the community the terminology or translate the terminology to how the community already identifies that type of information. In action, this would include being knowledgeable about the linguistic and cultural frameworks a community has around the data being collected.

    On a more technical level, you may have to format information to be more digestible. An example would be presenting data as daily average temperatures or caloric intake to respond to a community’s preference as hourly readings are too discrete and a community does not think of most health information in hours. Sometimes, the best option might be to collect at a discrete level then put the information forward to a community and have them put it into their own context. Another technical consideration is that some communities will respond to diagrams, charts, or scales better than discrete numbers. With all of this in mind, a community is also capable of changing its culture around certain types of information.
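    As a minimal sketch of the formatting step described above, the Python below groups hourly readings by calendar day and averages each day. The readings, the function name daily_averages, and the choice to round to one decimal place are all assumptions made for illustration; they are not drawn from any real dataset or prescribed method.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical temperature readings (ISO timestamp, degrees Celsius).
# These values are invented for illustration only.
hourly_readings = [
    ("2023-06-01T06:00", 12.0),
    ("2023-06-01T12:00", 20.0),
    ("2023-06-01T18:00", 16.0),
    ("2023-06-02T06:00", 14.0),
    ("2023-06-02T12:00", 24.0),
    ("2023-06-02T18:00", 19.0),
]


def daily_averages(readings):
    """Group timestamped readings by calendar day and average each day."""
    by_day = defaultdict(list)
    for timestamp, value in readings:
        day = datetime.fromisoformat(timestamp).date()
        by_day[day].append(value)
    # Round to one decimal place so the figures are easy to read aloud or chart.
    return {
        day.isoformat(): round(mean(values), 1)
        for day, values in sorted(by_day.items())
    }


daily = daily_averages(hourly_readings)
for day, average in daily.items():
    print(f"{day}: {average} °C")
```

    Keeping the original hourly readings alongside the daily figures means a community can still ask for a finer or coarser view later.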

    All of these considerations lead to the conclusion that health information needs to be cognitively and physically accessible. This can be the focus of campaigns and interventions in its own right, as many communities, when given the avenue, would happily develop their understanding of key health information as the research develops and lived experience is documented. Health information can be presented at cultural events, as school curricula, or even in popular media online where people are already engaged. This health information strategy sets communities up to be comfortable with the information as they develop the insight to achieve health justice.

    Health Knowledge

    Health knowledge means that people have the framing and relevant information to start making a running assessment of their health. If we have accessible and equitable health information, health knowledge is how this information is put together and viewed in relation to others. Health knowledge is the stage where the data/phenomena dichotomy becomes important. Research tends to base our health knowledge relative to an expected norm. In general, this is useful because if someone is trying to determine if they are healthy, we can use the average physiological readings of researched populations.

    The data/phenomena dichotomy emphasises why health knowledge needs to be framed by communities. There is less agency in critically assessing information and prioritising phenomena when an external party, such as a health organisation or local authority, determines the health knowledge of a community.

    Health Wisdom

    When we are able to co-create health wisdom with communities, we will have taken a significant step towards health justice. A community that is enabled in health wisdom understands how they can make decisions about their health and communicate their current health based on all the resources provided. It changes the way organisations convey information from current data and the role that information plays in updating health knowledge.

  • Large data studies on population health come with several limitations. The collection of large amounts of data on environment and health is useful to capture generic patterns and trends - to make sense of the vast amount of information, scientists describe, correlate, and model the data. These analyses often are associated with error, uncertainty, and bias. We will now point out how considering people’s lived experience can address and mitigate such limitations.

    • Day-to-day changes: Living through a phenomenon is a daily experience with nuances that develop through the passage of time. For example, some days may be tougher due to weather, a change in a person’s life, or other changes in the environment. A monitoring device cannot capture these important changes (source).

    • Phenomenology: Being exposed to an environmental pollutant or any other health threat is not just about units of measurement; it is a full sensorial experience, all of which has an effect on the psychosocial experience of the pollutant. There is stress and trauma in the experience, which also has to be considered as part of the pollutant’s impact (source).

    • Varied degrees of impact: Each person has a different biological tolerance to environmental stressors and even to disease, depending on what is already happening in their body and life. For example, if a person has obesity, they will experience air pollution differently and could be impacted at much lower levels than a person with no health conditions. Therefore, generalised thresholds set by health organisations or policy cannot speak for the varied biological impacts of a pollutant. The individual impact of a pollutant is a necessary piece of information in every research project involving environmental pollution (source).

    • Expertise through time: Expertise takes time to build; therefore, a person living in an environment for a sustained period of time develops an expertise through the act of spending time in that environment. The time in the environment gives them the opportunity to observe and chronicle the changes and activities related to the environmental pollutant in a manner that researchers would not be able to do as non-residents.

    In research, scientists often work with averages to describe a given phenomenon. These averages are useful to grasp and communicate large amounts of data. However, depending on the distribution of the data, some observations may be very different from the average. For example, on average, a community may be described as ‘healthy’. However, not all individuals of the community have the same health. Some have better health; some have poorer health. Similarly, average air pollution levels in an area may be below a given official threshold. However, some individuals living in the area are exposed to more air pollution than others. This is problematic at different levels. First, one may have individual data on an outcome (e.g., health) and aggregate data on an exposure (e.g., air pollution). If one investigates the link between outcome and exposure, the results will likely be biased because of so-called exposure misclassification (source), leading to false conclusions about the relationship between outcome and exposure. Second, one may base policy and planning decisions on (biased or unbiased) averages, neglecting that these averages do not apply to all individuals. For example, the WHO has established ‘safe’ levels of air pollution which are often used to justify policy and planning decisions. However, first of all, even if the average level of air pollution is below a threshold, at times, air pollution levels can be much higher. Second of all, even if a given level of air pollution is considered ‘safe’, it may not at all be safe for individuals who are more vulnerable due to existing health conditions, exposure to air pollution in other places, or accumulated exposure to a range of stressors. This is closely related to the so-called Ecological Fallacy, which describes the mistaken assumption that relationships between variables at an aggregate level imply the same relationships at the individual level (source).
    Considering people’s lived experience would allow for the collection of individual data and, therefore, mitigate the many limitations associated with averages and aggregated data.
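    The gap between an area average and individual exposure can be illustrated with a short Python sketch. Every number here, including the threshold, is invented for the example and does not represent any official limit or real measurement; the resident labels are likewise hypothetical.

```python
from statistics import mean

# All numbers below are invented for illustration; they are not measurements.
THRESHOLD = 40.0  # hypothetical 'safe' limit, in arbitrary pollution units

# Hypothetical average exposure for each of five residents of one area.
resident_exposure = {
    "resident_a": 18.0,
    "resident_b": 22.0,
    "resident_c": 25.0,
    "resident_d": 30.0,
    "resident_e": 65.0,  # e.g. someone living beside the busiest road
}

# The single figure a report might quote for the whole area.
area_average = mean(resident_exposure.values())

# The individuals that the single figure hides.
above_threshold = sorted(
    name for name, level in resident_exposure.items() if level > THRESHOLD
)

print(f"Area average: {area_average:.1f} (threshold {THRESHOLD})")
print(f"Average is below threshold: {area_average < THRESHOLD}")
print(f"Residents above threshold: {above_threshold}")
```

    In this made-up case the area would be reported as below the threshold even though one resident is well above it, which is the kind of individual detail that aggregated data cannot show.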

    Exposure: the individual moves beyond their ‘neighbourhood’

    In large, quantitative studies, researchers often aim to find quantifiable patterns and correlations in the data. In place and health research, scientists often define an area of interest, for example, an area of exposure. They then collect data on the exposure of interest (e.g., air pollution) for that area and link it to data on the outcome of interest (e.g., health). Above, we have already identified some limitations of this approach. Another major limitation is that the exposure area (e.g., a postcode, an LSOA, or a ward) is not actually the true geographic context that an individual experiences (source): the individual is unlikely to be exposed to all of the defined area; the individual is likely to be exposed to areas outside the defined area; the individual is likely to spend different amounts of time in different areas; and the individual is likely to engage in different activities in different areas and at different times. In brief, capturing static, average information on a given exposure in one area is not sufficient to understand the individual’s true exposure. This, again, is associated with so-called exposure misclassification, which biases our understanding of a phenomenon (source). Considering people’s lived experience would allow for the collection of individual data on their true exposure throughout the day, the year, and their life, in turn allowing for a better understanding of the true effect of an exposure on health.
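    One simple way to picture the difference between an area-based estimate and a person’s own exposure is a time-weighted average across the places they spend their day. The Python below is a sketch under stated assumptions: the places, pollution levels, and hours are hypothetical, and a time-weighted average is only one of several ways such exposure could be summarised.

```python
# Hypothetical average pollution level (arbitrary units) in each place.
place_levels = {"home": 20.0, "commute": 60.0, "workplace": 40.0}

# Hypothetical hours one person spends in each place on a typical day.
hours_spent = {"home": 14.0, "commute": 2.0, "workplace": 8.0}


def time_weighted_exposure(levels, hours):
    """Average the level of each place, weighted by the hours spent there."""
    total_hours = sum(hours.values())
    weighted_sum = sum(levels[place] * hours[place] for place in hours)
    return weighted_sum / total_hours


# Estimate that only looks at the area where the person lives.
home_only_estimate = place_levels["home"]

# Estimate that follows the person through their day.
personal_estimate = time_weighted_exposure(place_levels, hours_spent)

print(f"Home-area estimate: {home_only_estimate:.1f}")
print(f"Time-weighted personal estimate: {personal_estimate:.1f}")
```

    With these invented figures the home-area estimate understates the person’s daily exposure, because the commute and workplace are not counted.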


KEY LEARNINGS

  1. Whilst data and its processes can be seen as apolitical and unbiased, their application often is not. We must be aware of this risk in order to avoid propagating health inequities and injustice. We can do this by collaborating with communities and breaking down the supremacy rapport that is the norm.

  2. Data intentionally used for health justice considers human and societal factors when creating ranges and values. We should also consider the lived experience, which means that data alone cannot tell us the full picture of a person’s health or ability to heal. To fully capture lived experience, we need to create an equitable data ecosystem where a community can gather their own data and have it be as important as the data gathered by practitioners.

  3. If communities have no agency and understanding over the data used to make decisions about their health, they are likely not the ones involved in making those decisions nor do they have an avenue to genuinely critique the process or impact of data-driven health decisions.

QUESTIONS TO ASK YOURSELF

  1. Based on your experience of your neighbourhood, list elements of it for which you have not already seen data and that either impact your health and/or the health of your community (e.g. road/vehicle activity, presence of risky infrastructure, exposure to risky infrastructure) or would make a good indicator of the health impacts in the neighbourhood (e.g. community sentiment, community activity, presence of agriculture).

  2. What principles would you want to see embedded into data collection and analysis frameworks?

  3. Can you create your own datasets based on your experiences of a pollutant or environmental nuisance in your neighbourhood, such as tracking in a spreadsheet every time a car horn sounds in the night or an odour from outside comes into your home?
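    For anyone who would like to try the third question with a little code instead of a spreadsheet application, the Python below is one possible sketch. The column names, the example observations, and the helper names write_log and count_by_event are all hypothetical; the output is plain CSV text, which most spreadsheet programs can open.

```python
import csv
import io

# Hypothetical column names for a personal observation log.
FIELDS = ["date", "time", "event", "notes"]

# Hypothetical observations, written the way a resident might note them down.
observations = [
    {"date": "2023-06-01", "time": "23:40", "event": "car horn", "notes": "woke me up"},
    {"date": "2023-06-02", "time": "07:15", "event": "odour", "notes": "smell came into kitchen"},
    {"date": "2023-06-02", "time": "22:05", "event": "car horn", "notes": ""},
]


def write_log(rows, destination):
    """Write the observations as CSV, with a header row, to a file-like object."""
    writer = csv.DictWriter(destination, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def count_by_event(rows):
    """Count how many times each type of event was recorded."""
    counts = {}
    for row in rows:
        counts[row["event"]] = counts.get(row["event"], 0) + 1
    return counts


# An in-memory buffer stands in for a real file here; to save to disk,
# one could pass open("my_log.csv", "w", newline="") instead.
buffer = io.StringIO()
write_log(observations, buffer)
csv_text = buffer.getvalue()
event_counts = count_by_event(observations)

print(csv_text)
print(event_counts)
```

    A log like this stays in the hands of the person who made it, and the notes column leaves room for the lived experience that a count alone would miss.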