InfraNodus represents any text as a network, so you can gain insight into the structure of its discourse.
Based on the graph's modularity and the distribution of influence across its communities, we can measure the diversity of the discourse structure and estimate its bias.
How it Works:
1. Add any text (a book, a PDF file, a news article) that you'd like to analyze to InfraNodus.
2. InfraNodus will represent the text as a network. The words are the nodes and their co-occurrences are the connections between them.
3. This representation reveals the structure of the discourse: which terms tend to co-occur in the same context (next to each other) and which terms are more influential than others.
4. Using graph theory, InfraNodus obtains the following measures of the text's structure and uses them to calculate the network diversity score:
a) Modularity: if it is above 0.4, the community structure is pronounced and the text contains several distinct clusters of meaning circulation.
b) Entropy: how are the most influential nodes distributed among the biggest communities? If all influence is concentrated in one topic, the entropy is low. If influence is equally distributed among the different topics (e.g. each of the most influential words belongs to a distinct topic), the entropy is high.
c) Concentration: how many nodes are in the top community? If more than half, the structure is highly biased.
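The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not InfraNodus's actual implementation: the tokenizer, the adjacent-word window, the use of weighted degree as a stand-in for influence, the top_k parameter, and all function names are hypothetical choices made for this example. The community assignment is supplied by hand, because the text does not say which community detection method is used.

```python
import math
import re
from collections import Counter, defaultdict


def cooccurrence_graph(text):
    """Return {(word_a, word_b): weight} for adjacent words (undirected).

    Assumption: a naive lowercase tokenizer and a window of two words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    edges = Counter()
    for a, b in zip(words, words[1:]):
        if a != b:  # skip self-loops caused by a repeated word
            edges[tuple(sorted((a, b)))] += 1
    return edges


def weighted_degree(edges):
    """Sum of edge weights per node (assumed stand-in for 'influence')."""
    degree = defaultdict(int)
    for (a, b), w in edges.items():
        degree[a] += w
        degree[b] += w
    return dict(degree)


def modularity(edges, community):
    """Newman modularity Q for a given node -> community mapping.

    Expects a graph with at least one edge.
    """
    m = sum(edges.values())
    inside = defaultdict(float)  # edge weight inside each community
    total = defaultdict(float)   # summed degree of each community
    for (a, b), w in edges.items():
        if community[a] == community[b]:
            inside[community[a]] += w
    for node, d in weighted_degree(edges).items():
        total[community[node]] += d
    return sum(inside[c] / m - (total[c] / (2 * m)) ** 2 for c in total)


def influence_entropy(edges, community, top_k=4):
    """Shannon entropy (bits) of the top_k nodes' spread over communities."""
    degree = weighted_degree(edges)
    top = sorted(degree, key=lambda n: (-degree[n], n))[:top_k]
    counts = Counter(community[n] for n in top)
    return sum((c / len(top)) * math.log2(len(top) / c) for c in counts.values())


def concentration(community):
    """Share of all nodes that belong to the largest community."""
    sizes = Counter(community.values())
    return max(sizes.values()) / len(community)


# Toy graph: two triangles (a-b-c and d-e-f) joined by the bridge c-d.
toy_edges = {("a", "b"): 1, ("a", "c"): 1, ("b", "c"): 1,
             ("d", "e"): 1, ("d", "f"): 1, ("e", "f"): 1,
             ("c", "d"): 1}
toy_community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

q = modularity(toy_edges, toy_community)                      # 5/14, about 0.357
h = influence_entropy(toy_edges, toy_community, top_k=2)      # 1.0 bit
share = concentration(toy_community)                          # 0.5
```

In the toy graph the two most connected nodes (c and d) sit in different communities, so the entropy over the top two nodes is at its maximum of 1 bit, while neither community holds more than half of the nodes.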
The network diversity score is then calculated from these three parameters:
High modularity, high entropy, low concentration — the network is dispersed.
Medium modularity, medium-high entropy, low concentration — the network is diversified.
Low modularity, low entropy, high concentration — the network is focused (all the influence is in one topic or only around a few nodes / words).
Very low modularity, low entropy, high concentration — all the influence is in one cluster. The network is biased.
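The four cases above can be read as a simple decision rule. The sketch below is one hypothetical way to encode it in Python. Only two cut-offs come from the text (modularity above 0.4 and concentration above one half); every other threshold, the function name diversity_label, and the "unclassified" fallback are assumptions made for illustration. The entropy cut-offs in particular depend on how entropy is measured and would need tuning.

```python
def diversity_label(modularity, entropy, concentration,
                    high_mod=0.4,       # from the text: pronounced communities
                    low_mod=0.2,        # assumed boundary for "low" modularity
                    very_low_mod=0.1,   # assumed boundary for "very low"
                    high_entropy=1.5,   # assumed, in bits
                    low_entropy=0.5,    # assumed, in bits
                    high_conc=0.5):     # from the text: more than half
    """Map the three measures to one of the four structure labels."""
    concentrated = concentration > high_conc
    if not concentrated:
        if modularity > high_mod and entropy >= high_entropy:
            return "dispersed"
        if low_mod <= modularity <= high_mod and entropy > low_entropy:
            return "diversified"
    elif entropy <= low_entropy:
        if modularity < very_low_mod:
            return "biased"
        if modularity < low_mod:
            return "focused"
    # Combinations the text does not describe fall through to here.
    return "unclassified"


labels = [
    diversity_label(0.55, 1.8, 0.3),
    diversity_label(0.30, 1.2, 0.3),
    diversity_label(0.15, 0.3, 0.6),
    diversity_label(0.05, 0.0, 0.8),
]
```

Keeping the thresholds as keyword arguments makes the assumed values easy to replace once the real boundaries are known.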
It is important to note that "bias" in this context does not refer to subjectivity versus objectivity, or to a leaning towards a certain ideological stance. The measure is based solely on the structure of the discourse, not on its content. Interestingly, even the most "objective" texts may therefore turn out to be "biased" in our system of measurement, because they rely too strongly on a certain discourse in their objectivity.