Are your survey respondents using Generative AI?

Practical tips and ethical considerations for qualitative researchers in the age of AI

The emergence of generative AI tools like ChatGPT has revolutionised various fields, including qualitative research. While these tools offer immense potential for data analysis and participant engagement, they also introduce new complexities and ethical dilemmas. A fascinating conundrum we’ve encountered is the apparent use of ChatGPT by survey respondents to generate their answers. This raises important questions about authenticity, representation, and the integrity of qualitative data. In this blog, we will explore some considerations for qualitative researchers who might want to identify AI-generated responses and handle them with care and ethical rigour.

Recognising AI-generated responses

One of the first challenges researchers face is identifying AI-generated responses. In our experience, these responses often exhibit certain tell-tale signs:

⚠️ Length and structure: AI-generated responses tend to be longer and more structured than typical human responses, often including introductory and concluding sentences and sometimes even numbered paragraphs. Itemised paragraphs produced by ChatGPT usually begin with a short title followed by a colon (just like this section of the blog!). Whereas human survey responses are often written in phrases and sentence fragments, generative AI is more likely to produce complete sentences.

⚠️ Language and grammar: These responses are usually free of typos and grammatical errors, which is uncommon in spontaneous human replies. Generative AI also overuses certain words relative to typical human writing, such as “delve”, “tapestry”, and “embark” in the case of ChatGPT.

⚠️ Specific terminology: AI tools may use formal or precise terms, and may expand acronyms that are used in the survey question. For example, if a survey is about gender-based violence (GBV) and the survey question itself uses the acronym GBV, we would normally expect respondents to use the acronym rather than spelling out “Gender-Based Violence (GBV)”. 

While there is no conclusive way to determine whether a response is AI-generated, AI content detection tools can provide a percentage likelihood that a response is AI-generated based on signals such as those above.
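To make these signals concrete, here is a minimal sketch of the kind of heuristic screening such tools perform. It is a hypothetical illustration, not a validated detector: the word list, thresholds, and equal weighting are our own assumptions and would need calibrating against your own data.

```python
import re

# Words ChatGPT is known to overuse relative to typical human writing
SIGNAL_WORDS = {"delve", "tapestry", "embark"}


def ai_likelihood_score(response: str, acronym: str = "GBV") -> float:
    """Rough heuristic score (0-1): fraction of AI tell-tale signs present.

    Thresholds, word list, and equal weighting are illustrative
    assumptions, not validated detector values."""
    text = response.lower()
    signals = 0

    # 1. Length and structure: unusually long, fully formed answers
    if len(response.split()) > 120:
        signals += 1
    # Numbered paragraphs beginning with a short title and a colon
    if re.search(r"^\s*\d+\.\s+\w[\w\- ]*:", response, flags=re.MULTILINE):
        signals += 1

    # 2. Language: vocabulary ChatGPT is known to overuse
    if any(word in text for word in SIGNAL_WORDS):
        signals += 1

    # 3. Terminology: acronym from the question spelled out in full,
    #    e.g. "Gender-Based Violence (GBV)" instead of just "GBV"
    if re.search(rf"\w[\w\- ]+\({re.escape(acronym)}\)", response):
        signals += 1

    return signals / 4


# A suspiciously polished (invented) response triggers 3 of 4 signals
print(ai_likelihood_score(
    "1. Visibility: Attending the forum allowed us to delve into new "
    "networks and raise awareness of Gender-Based Violence (GBV)."
))
```

A score like this is best used to shortlist responses for human review, not to exclude anything automatically.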

Ethical considerations

Authenticity vs. accessibility

A primary concern is that AI-generated responses do not reflect the authentic voice of the respondents. However, we must also consider the inclusivity of our research. Some respondents might use AI tools to help articulate their thoughts in English if it is not their first language or if they are concerned about their fluency. Excluding these responses could inadvertently marginalise non-native speakers and those with literacy challenges. By not acknowledging the potential need for some participants to use AI assistance, we might overlook systemic inequities that affect data collection.

Transparency and integrity

Maintaining transparency about the nature of our data is crucial. If AI-generated responses are identified, this should be disclosed in the research findings. Researchers should explain how these responses were identified and the steps taken to address them. When possible, verify with respondents whether they used AI tools to assist with their responses. This can be done through follow-up questions or interviews.

Methodological adjustments

To effectively navigate the complexities introduced by AI-generated responses, consider the following methodological adjustments:

Data segmentation

Segment your data to differentiate between human-generated and AI-assisted responses. Analyse these segments separately to understand how AI might influence the themes and patterns that emerge. Compare AI-generated responses with human-generated ones to identify any significant differences in content, tone, or perspective. AI might extrapolate and exaggerate the perspective the respondent fed into it. See the example below.


Prompt to ChatGPT: “I'm answering a survey question about the impact organization Y had on me. Can you write my response for me? I essentially feel that they helped my organization get more visibility through their forum that I attended”

ChatGPT’s output: “Certainly, here's a response for your survey: ‘Organization Y has had a significant impact on our organization by providing a valuable platform for visibility. Attending their forum enabled us to increase our visibility within the community and beyond. This exposure has been instrumental in raising awareness about our work and connecting us with like-minded individuals and organizations. We're grateful for the opportunity to participate and for the positive impact it has had on our organization's growth and reach.’”


As you can see, the AI model introduced a sentiment of gratitude that the respondent did not express originally. ChatGPT also extrapolated that the visibility the organisation gained resulted in its growth. Treating such a response as on par with human-generated responses risks exaggerating the impact of organisation Y.
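In practice, the segmentation step can be as simple as the sketch below. It assumes a hypothetical ai_flag column that you have populated yourself (for example, from detector scores or manual review); the column names and the surface-level comparison are illustrative only.

```python
import pandas as pd

# Hypothetical survey export: free-text answers plus an ai_flag column
# populated by your own screening (detector scores or manual review)
df = pd.DataFrame({
    "response": [
        "helped us get more visibility thru their forum",
        "Organization Y has had a significant impact on our organization "
        "by providing a valuable platform for visibility.",
    ],
    "ai_flag": [False, True],
})

human = df[~df["ai_flag"]]
ai_assisted = df[df["ai_flag"]]

# Compare the segments on simple surface features before coding themes
for label, segment in [("human", human), ("AI-assisted", ai_assisted)]:
    lengths = segment["response"].str.split().str.len()
    print(f"{label}: n={len(segment)}, mean length={lengths.mean():.1f} words")
```

Analysing the two segments side by side makes it easier to spot where AI-assisted responses shift the tone or inflate the claims in your coded themes.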

Contextual analysis

Examine the context in which AI-generated responses appear. Look for consistency in the use of AI tools across different segments of your respondent pool. Are there particular demographics or groups more likely to use AI assistance? For instance, have respondents with AI-generated responses identified themselves as located in a non-English speaking country? Have they selected a lower level of education than the sample average? This can help you understand why respondents may have opted to submit AI-generated responses and can guide your corrective action.
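A simple cross-tabulation can surface these patterns. In the sketch below, the demographic columns are hypothetical placeholders for whatever your survey actually collects, and the ai_flag column is the same assumed screening flag as in the earlier sketch.

```python
import pandas as pd

# Hypothetical data: ai_flag from your screening, demographics from the survey
df = pd.DataFrame({
    "ai_flag": [True, False, True, False],
    "english_first_language": [False, True, False, True],
    "education": ["secondary", "tertiary", "secondary", "tertiary"],
})

# Share of AI-flagged responses within each demographic group
print(pd.crosstab(df["english_first_language"], df["ai_flag"],
                  normalize="index"))
print(pd.crosstab(df["education"], df["ai_flag"], normalize="index"))
```

If flagged responses cluster among, say, respondents without English as a first language, that points towards an accessibility motive rather than carelessness, and towards remedies like translation rather than exclusion.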

Using AI to analyse AI-generated responses

Using AI for qualitative research analysis has become all the rage. It has been hailed as cost-saving and effective, provided the researcher supplies a sufficiently detailed prompt and checks the quality of the outputs. However, we have found that ChatGPT has a bias towards responses that were generated by AI. For instance, when asked to pick the best quotes from a dataset containing both human-generated and AI-generated responses, ChatGPT consistently picked AI-generated responses. It likes the sound of its own voice, essentially. This can compound the problems of exaggeration and misrepresentation of the human respondents’ voices, and potentially marginalise the voices of those whose responses contained grammatical or spelling errors due to language barriers.

To avoid this, a human eye is still necessary to read through all the qualitative data, particularly for tasks such as selecting human-generated quotes. 
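One practical guard against this bias is to keep flagged responses out of any pool the model is allowed to quote from, so it can only choose among human-written answers. A minimal sketch, reusing the same hypothetical ai_flag column as above:

```python
import pandas as pd

df = pd.DataFrame({
    "response": [
        "helped us get more visibility thru their forum",
        "We're grateful for the opportunity to participate and for the "
        "positive impact it has had on our organization's growth.",
    ],
    "ai_flag": [False, True],
})

# Only human-written answers enter the AI-assisted quote-selection step;
# flagged responses are read and handled by a human reviewer instead
quote_pool = df.loc[~df["ai_flag"], "response"].tolist()
print(quote_pool)
```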

If you do choose to use AI to analyse qualitative data, ensure that participants are informed about the researcher’s intended use of AI tools and the implications this can have for their responses and privacy. Obtain explicit consent where necessary. It might also be prudent to advise participants not to use generative AI themselves. Instead, consider translating your survey into other languages (a process that can itself be AI-powered).

Conclusion

As we navigate the age of AI in qualitative research, it is imperative to balance the benefits of AI tools with the need for ethical integrity and inclusivity. Recognising and addressing AI-generated responses is a step towards more transparent and equitable research practices. By doing so, we can harness the power of AI to enhance our research while honouring the authentic voices of all participants.

We encourage other researchers and practitioners to share their experiences and strategies for dealing with AI-generated responses. By collaborating and discussing these issues, we can develop robust guidelines and best practices that uphold the values of authenticity, inclusivity, and integrity in qualitative research.

 

Written by Becky Zelikson, Analyst, Kore Global, and ChatGPT 4.0
