The InfraNodus thematic analysis app can be used for studying interview transcripts in order to reveal the main themes and sentiment. This can be particularly useful for patient interviews in healthcare, user interviews in product research, customer feedback and review analysis, and open-ended survey studies.
InfraNodus uses AI-powered knowledge graphs to provide an interactive visualization of the main topical clusters found across the responses, giving a researcher a better understanding of the recurrent concepts and themes that emerge.
Its powerful filtering feature can be used to study specific segments (e.g. by country, sentiment, or age bracket). Built-in tags can be used for coding the text with researcher-defined categories or themes. The AI-powered knowledge graph can be used to retrieve specific relationships from the data.
Below we present a step-by-step workflow for analyzing interview transcripts. As an example, we will use the dataset provided by the University of Bath, UK: "Qualitative Interviews with Drug Service Users and Addiction Therapists, 2017-2022". This dataset contains six transcripts of interviews with drug users who talk about their experience with addiction, which can be very illuminating for understanding their challenges and coping mechanisms.
We will show how InfraNodus can be used to get insights from these interviews.
Step by Step Qualitative Analysis Workflow
1. Import the data into InfraNodus
You can import text files or a CSV spreadsheet; the spreadsheet format lets you add segments to each speaker's text: age bracket, gender, location, product rating, etc.
2. Visualize the transcripts as a knowledge graph
You can build a graph using the words (for detailed analysis) or entities (for relational overview).
3. Clean the data using the graph
Focus on the most important elements: hide the unnecessary text using the filter panel (e.g. the interviewer's questions) or unimportant concepts (e.g. "uhm", "ehr", "yeah", etc.)
4. Identify the main topical clusters and get a general overview
You will get a good understanding of the main themes and concepts contained in your data
5. Analyze sentiment and filter content by emotion
Discover how the discourse is different for positive / negative statements.
6. Coding the interview data: use the graph to explore ideas
Add the insights you discover from the text, tag the statements, and export them to other analysis tools
1. Import the Data into InfraNodus
The first step is to import the data into InfraNodus. Consider the different import options available, depending on your objectives.
1/a: Importing text files, one file per distinct interview
In our example, we have a collection of 6 files in .docx format: the interview transcripts we will use. Each file is an interview with a different person, which makes it easy to filter them later during analysis using the internal tagging system.
We will specify that we want to process them as interview transcripts so that every paragraph that starts with the name of the respondent followed by a colon (:) will be converted to a category tag. This will allow us to filter the statements by interviewer or respondent and clean our data afterwards.
1/b: Importing a single file and using the speaker names as tags
You can also import the interviews as a single file, but in that case you need to prepare them in a certain format so that InfraNodus can automatically tag the speakers. Use the name of the speaker followed by a colon, as in the example below:
Interviewer: So what do you think about this situation?
Interviewee 1: I don't really know what to say. I had a strange experience with it.
Interviewer: Tell me more?
Interviewee 2: Uhm... I think that he just got confused.
If you format the text files this way, InfraNodus will tag every statement with the name of the speaker it belongs to (even if they are all in the same file), which will make it easier for you to filter them afterwards and segment the analysis by speaker.
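If you prepare the transcripts programmatically, a minimal Python sketch like the one below (this is not InfraNodus code, just an illustration of the speaker-name convention) can help you check that every statement is prefixed with a speaker name and a colon before you import the file:

import re
from collections import defaultdict

# Illustrative sketch (not part of InfraNodus): group transcript lines by speaker,
# assuming each statement starts with the speaker's name followed by a colon.
SPEAKER_LINE = re.compile(r"^(?P<speaker>[^:]+):\s*(?P<text>.+)$")

def group_by_speaker(transcript: str) -> dict:
    statements = defaultdict(list)
    for line in transcript.splitlines():
        match = SPEAKER_LINE.match(line.strip())
        if match:
            statements[match.group("speaker")].append(match.group("text"))
    return statements

sample = "Interviewer: Tell me more?\nInterviewee 2: Uhm... I think that he just got confused."
print(group_by_speaker(sample))  # maps each speaker to their statements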
1/c: Importing a CSV file for in-depth segmenting
Alternatively, you can save your data in a spreadsheet format and import it as a CSV file into InfraNodus. This is a more powerful approach, because you can introduce more categories: for instance, an age bracket, location, gender, product rating, or any other segment you'd like to use for filtering later. Your data would then look something like this:
Person | Response | Gender | Age bracket | Location
Interviewer | So what do you think about this situation? | Female | 25-35 | US
Interviewee 1 | I don't really know what to say. I had a strange experience with it. | Male | 35-45 | US
Interviewer | Tell me more? | Female | 25-35 | US
Interviewee 2 | Uhm... I think that he just got confused. | Female | 35-45 | UK
In this latter case, you have the option to filter not only by the person (interviewee / interviewer) but also by gender, age bracket, and location. This way, you can see, for example, what people of a certain gender are saying, or how the discourse of respondents from the UK differs from that of respondents from the US.
This is particularly useful if you're conducting research with multiple respondents and want to see how the discourse differs by gender, age, or location.
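If your responses already live in a spreadsheet, a short pandas sketch shows how such a file could be assembled and saved for import; the column names below simply mirror the example table above and are not a format required by InfraNodus:

import pandas as pd

# Illustrative sketch: assemble segmented interview data and save it as a CSV for import.
# The column names mirror the example table above; adjust them to your own segments.
rows = [
    {"Person": "Interviewer", "Response": "So what do you think about this situation?",
     "Gender": "Female", "Age bracket": "25-35", "Location": "US"},
    {"Person": "Interviewee 1", "Response": "I don't really know what to say.",
     "Gender": "Male", "Age bracket": "35-45", "Location": "US"},
]
pd.DataFrame(rows).to_csv("interviews_segmented.csv", index=False)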
2. Visualize the Interview Transcripts as a Knowledge Graph
Once imported, InfraNodus will visualize the interview transcripts as a knowledge graph. The words used in the responses are the nodes, their co-occurrences are the connections between them. You can read more about the algorithm in our peer-reviewed whitepaper.
We use state-of-the-art network analysis metrics to identify the clusters of nodes that tend to occur together (iterative community detection algorithms based on modularity) and rank the concepts (nodes) by their betweenness centrality, so the terms that connect different topics are shown bigger on the graph. They serve as the junctions for meaning circulation.
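As a rough illustration of the general idea (not the exact InfraNodus algorithm), a few lines of Python with the networkx library show how word co-occurrence, modularity-based community detection, and betweenness centrality fit together; the sample statements are made up:

import itertools
import networkx as nx

# Illustrative sketch of the general approach (not the InfraNodus implementation):
# build a word co-occurrence graph, detect communities by modularity,
# and rank words by betweenness centrality.
statements = [
    "relapse makes recovery harder",
    "recovery needs support and time",
    "time away from old habits helps recovery",
]

graph = nx.Graph()
for statement in statements:
    for a, b in itertools.combinations(set(statement.split()), 2):
        graph.add_edge(a, b)

clusters = nx.community.greedy_modularity_communities(graph)   # topical clusters
centrality = nx.betweenness_centrality(graph)                  # junction words
print([sorted(c) for c in clusters])
print(sorted(centrality, key=centrality.get, reverse=True)[:3])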
Such a representation helps you quickly identify the recurring themes and concepts in any interview, and it can be used in conjunction with AI algorithms to generate cluster names that you can use in coding.
Another option in InfraNodus allows you to build a knowledge graph from the entities detected in a text. This can be interesting if you want to get a sparser graph and focus on retrieving specific relations between the various phenomena, places, and individuals mentioned in the content. We recommend using this option in a business context. Personal interviews will often focus on feelings and experiences, and entity detection may be a bit too formal for such applications. However, it can give you a good insight into the context and allow you to start with a sparser graph.
In order to build a knowledge graph of entities, select "only the detected entities" during the import, when choosing what data you'll analyze. Our algorithms will detect entities (phenomena, people, places, scientific concepts) and build a graph based on their co-occurrences.
The result will look like this:
As you can see, the entities are also tagged in the actual interview data using the [[wiki links]] format, which means you can also export your analysis into other tools such as Obsidian. This can be interesting if you want to find relations between different interviews and the concepts within them.
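Since the [[wiki links]] format is plain text, other tools can pick up the entities easily; here is a tiny regex sketch (the example sentence is invented) showing how the tagged entities could be collected from an exported statement:

import re

# Illustrative sketch: collect [[wiki-linked]] entities from an exported statement.
# The example sentence below is invented for demonstration purposes.
exported_statement = "I moved back to [[Bristol]] after leaving the [[rehab centre]]."
entities = re.findall(r"\[\[([^\]]+)\]\]", exported_statement)
print(entities)  # ['Bristol', 'rehab centre']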
3. Clean the data
Before conducting the analysis, we need to clean the data. InfraNodus' knowledge graph is fully interactive, so you can select the statements using the graph (filtering by concepts or a topic) or by provided filters (e.g. a speaker's name), and remove unnecessary data from the graph.
In our case, we will do two things:
a) Remove all the statements made by the interviewer, because we want to focus on what the drug users said:
- click Filter By > Tag: Interviewer
- the statements that match this filter will be shown on the graph
- click "Select All" to select all these statements
- then click the Delete icon at the bottom to remove all of them.
- the graph statistics will automatically be recalculated
b) You can also remove some of the irrelevant words from the graph: in our case, we removed "ehm", "uhm", "kind", and "yeah", as it's a spoken interview and those words didn't carry additional meaning. However, there may be instances where it is interesting to keep them, as they indicate moments of hesitation, which may be relevant for the analysis.
The new, "clean" version of the graph will look something like this:
4. Identify the main topical clusters and get a general overview
Using the graph above, we can identify the main topical clusters that recur across all six interviews with the drug users:
1. Recovery journey (how they are trying to recover from drugs and change their habits)
2. Time reflection (thinking about their history and the future)
3. Past struggles (life struggles)
4. Addiction cycle (talking about their addiction cycle)
Interestingly, the concept of relapse has the highest betweenness centrality, which means it links multiple clusters together. This is an indicator that getting back into drugs is one of the biggest challenges that the interviewees face.
The advantage of using this approach for extracting the main topics is that it prevents us from imposing our own bias on the data: we derive the topics from the content itself (inductive coding).
Topical clusters as well as the main concepts shown in the knowledge graphs provide a good summary of the main ideas contained in the documents.
You can save them in the Project Notes tab for future reference. You can also generate short topical summaries using the built-in AI for each topic to understand what they are about:
5. Analyze sentiment and filter content by emotion
Sentiment analysis can indicate the emotional background of the interview statements. Using the built-in sentiment analysis algorithm in InfraNodus, you can tag the statements as positive, negative, or neutral using the AFINN dictionary. This is the default option and works well for English-language responses.
For a more in-depth analysis, you can use the built-in Google NLP AI sentiment analysis, which takes longer, but provides much more detailed insights and works for multiple languages.
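As a rough illustration of how AFINN-style scoring works (the lexicon below is a tiny invented sample, not the real AFINN dictionary or the InfraNodus implementation), each word carries a valence score and the statement's sentiment is the sign of the sum:

# Illustrative sketch of AFINN-style scoring with a tiny sample lexicon.
# This is not the real AFINN dictionary or the InfraNodus implementation.
SAMPLE_LEXICON = {"support": 2, "good": 3, "struggle": -2, "relapse": -2, "confused": -2}

def sentiment(statement: str) -> str:
    score = sum(SAMPLE_LEXICON.get(word.strip(".,!?").lower(), 0) for word in statement.split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("I had good support from my family"))  # positive
print(sentiment("The relapse left me confused"))        # negative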
To access sentiment analysis, go to Analytics > Sentiment:
For our example, we can see that in those interviews about 60% of statements have a negative connotation, which is understandable as the subjects are talking about their drug-related experience.
We can now use the built-in filter in InfraNodus to see how the positive discourse is different from the negative one.
In order to do that, go to Filter > Sentiment > Negative, make sure to switch on the Filter & Recalculate toggle, and then click Filter:
We can see that the negative interview statements are related to the topics of:
- Recovery journey
- Time perspective
- Addiction struggles
- Backtrack days
All of these are connected to being pulled back into the drug habit.
Now, let's apply the same filter, but for the positive statements:
We can see that the positive statements are more connected to:
- Going back home
- Emotional support
This provides an interesting insight into how people actually find a way out of their drug habit: connecting with family and friends and finding emotional support from other people.
We can combine the filters (for example, all positive statements that belong to a certain topic) and then use the built-in AI to generate a summary. To do that, click on the topic while in the "positive sentiment" filter and then click the "AI: Summarise Visible" button at the top:
We can see that 24 (out of 424) statements are both positive and belong to the topical cluster of "Back home". The AI summary provides an account where the interviewees talk about some positive experiences from the past (even from childhood) that actually led them to try drugs. This is an interesting insight, indicating that drug use is not only related to negative experiences but also to the excitement of trying something new or meeting new friends.
We can write down those insights in project notes or use the coding approach to add tags to some of the statements we uncover through our analysis.
6. Coding the interview data: use the graph to explore ideas
Now that we have a general overview of the main topics, we can start making interpretations and coding the data.
Coding is a term used in thematic analysis to describe the process of annotating the text in order to identify the main themes or patterns within unstructured text data.
With InfraNodus, a good place to begin is with the topics identified by the software.
For example, we can click on the "Recovery journey" topical cluster and InfraNodus will filter all the statements that belong to it. We can then click AI: Summarize button to generate a summary of the statements that belong to this topic to get a better idea of what they're about.
In fact, the topics identified by InfraNodus are also available in the filters and can already be exported with the interview data. For example, we can select a certain topic (e.g. "Addiction cycle") and export all the statements that have this topic as a CSV file with a column containing the topic name, so it can be treated as a code in other software:
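Once exported, the CSV can be opened in any analysis environment. A brief pandas sketch shows how the statements could then be grouped by their topic code; the file name and column name used here are assumptions, so check them against the headers of your actual export:

import pandas as pd

# Illustrative sketch: load the exported statements and group them by topic code.
# The file name and the "topic" column name are assumptions; check them against
# the headers of your actual InfraNodus export.
df = pd.read_csv("addiction_cycle_statements.csv")
for topic, group in df.groupby("topic"):
    print(topic, len(group), "statements")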
Once we select the topic and highlight it in the graph along with the statements that belong to it, we can zoom in deeper by clicking the concepts inside the topic to see what the interviewees say about them.
For instance, once we highlight the "Recovery journey" cluster, we can select the concepts of "change" and "behavior" within it to see what the interviewees are saying about them specifically.
This allows us to explore the text in a non-linear way, focusing on the content pertinent to our study:
In our example, we can see that two different subjects (we can tell from the tags that indicate different interview files) are talking about how changing the drug habit requires a complete change of behavior, context, habits, etc. This is a very good insight that gives us a clear idea about the notion of change in relation to drug use.
After we read the statements, we can select them and add new tags to them (which is analogous to coding).