A hand is extended, wearing a pink rubber glove. The hand is holding an unlabeled spray bottle aimed at an unseen area off screen. A subtitle appears in the open space next to the spray bottle: "Scrub gently: On data scrubbing in a community survey."

Scrub gently: On data scrubbing in a community survey.

Recently, a data concern emerged for my team at the CHAOSS Project while I was working on a project to run a community survey. This community had never run a survey before, and it was the first notable occasion where the project made an explicit, structured ask for feedback from the community. As a result, this first experience was also a calibration event that could guide this kind of work in future years.

Survey says: What?

At some point after we opened the survey, a question emerged about how to handle an unruly response. Among the incoming responses, our data manager noticed one that was objectively harmful. The person was strongly against the D.E.I. initiative that organized the survey. The response was written in a hostile tone, made insulting and derogatory comments about groups of people, and was entirely opposed to the project spending any time and resources on diversity, equity, and inclusion. The question put to our group was whether we would include this response in the published data or omit it.

There were two perspectives. Some wanted to remove this response from the final report and any published data. Others felt it was important to wait and see whether this response would become a pattern as the survey ran. I found myself in the second group. I want to unpack this rationale, both for future me and perhaps for someone else reading.

On discarding the survey response

There were good points about removing the harmful response.

First, the response used harmful language and was likely triggering. This particular response included angry rhetoric that was reflective, to a degree, of the social and political “climate” of our world today. Including the response in our final reporting could also give it a platform, which would arguably be a harmful act. It would validate that input as acceptable. Our group agreed that the response was harmful and not behavior the community should tolerate.

Second, the response did not provide actionable insight or useful asks to the project and community. It was written in an aggressive, angry tone towards the reader and did not offer workable suggestions other than ending and divesting from all D.E.I. work immediately. Given this was not an acceptable option, there wasn’t much there for us to learn or understand about CHAOSS from this individual response. So, why include or save this response?

There is an option to ignore feedback by intentionally discarding it, but what if the individual feedback represents a larger trend?

What is community culture?

It is important to be aware of threats to community culture. What is community culture? My improvised definition is any organizational culture oriented towards the care, well-being, and thriving of others (including the self) within a single, shared community environment. Regardless of other values and goals in a project, the shared culture of the project can either lean towards a collective, communal-oriented approach or an independent, individual-oriented approach. The communal approach that prioritizes the well-being of all instead of a privileged view could also be considered as community culture. Many traditional “Open” projects skew toward a strong community culture.

On monitoring survey responses for a pattern

Coming back to the survey response, what if omitting the data leaves holes in the story of your community? If there is not just one, but several of these kinds of responses, what does that say about the community culture? Is there already a strong community culture, or are there resistance and challenges to building a more cooperative, caring environment? There is real work to do at both ends of the spectrum, but what that work might look like depends on which side you are on.

I posit that omitting the “unhappy” or harmful responses can create a dangerous blind spot to toxicity within a community culture. When it comes to direct, interpersonal interactions with others (e.g. meetings, emails, chats), stewards of the community culture need to take direct action against visible challenges and threats to the community culture. If someone starts swearing at someone in a meeting, that is a hard-to-miss action. It is visible, and anyone could observe it or even record it.

In anonymous surveys, you might find a more subtle layer of the community culture than what is shown by the actions of a small few. There can be greater trust that someone’s comments will not be tied back to their identity, so some responders may feel emboldened to share their words and true opinions.

Don’t discard a blind spot.

The point is that, especially in larger communities, it is worth noting negative and harmful responses rather than ignoring them entirely. Communities that organize in more decentralized ways will always have supporters, users, and contributors from both the core and the periphery. The core project membership may not interact or engage often with the periphery, so there can be a blind spot to parts of the project that identify with the community but are a few degrees removed from the inner ring of the project community.

Noting whether something is indicative of a larger pattern is important. If your community has a ton of jerks, you need to know that your community is full of jerks so that you don’t waste time persuading people otherwise, when the lived experience is very different.

In the original conversation with the CHAOSS Project team, this data scrubbing question emerged in the process of running the survey instead of after the data collection concluded. The survey later closed and our data manager confirmed that the flagged response from earlier was the only one of its kind. As a group, we then felt more confident in discarding that one outlier as an anomaly since the survey was open to the general public.


Feature photo by JESHOOTS.COM on Unsplash. Modified by Justin W. Flory.
