One thing that’s missing from the otherwise excellent LLM Wiki
Don’t just feed the AI more files; give it a roadmap for thinking
Feeding an AI a large volume of data does not necessarily yield better results.
Most people tend to think:
- “If I feed it more data, the results will improve.”
- “If I collect more notes, the AI will become smarter.”
- “If I integrate RAG, that will solve our knowledge management issues.”
That’s half true and half false.
As the volume of data increases, so does the amount of information available for the AI to draw upon.
However, simply having more information
does not automatically generate good questions.
Good insights do not come from the sheer volume of data.
They come from the structure within the data.
What Karpathy’s LLM Wiki has shown
The idea for the LLM Wiki, proposed by Andrej Karpathy, is quite powerful.
The key is simple.
Place research papers, notes and materials into a single folder.
- The AI reads the materials.
- It extracts key concepts.
- It organises the connections between these concepts.
- It then compiles the results into a wiki that can be viewed in Obsidian or an IDE.
For building lasting knowledge, this is a clear step up from typical RAG.
This is because RAG typically searches for relevant snippets and constructs an answer whenever a question is posed.
And when the next conversation begins,
it forgets what it learnt from the previous conversation.
In contrast, LLM Wiki organises the material.
It creates concept pages,
provides links to sources,
and builds a knowledge base that you can return to later.
So far, so good.
However, there is one crucial problem remaining.
While the wiki organises things,
it doesn’t really spot where there are gaps.
A wiki alone generates an average answer
If you feed a structured wiki to an LLM and ask it a question, it will produce an answer.
However, in most cases, that answer is
little more than ‘the most plausible connection’.
This is because an LLM is, at its core, a system designed to predict the next token.
It identifies concepts within the data,
links those concepts together naturally,
and arrives at a plausible conclusion.
This isn’t a bad thing.
It’s good for summarising.
It’s good for organising research.
It’s good for drafting.
But if you want to come up with new ideas, it’s not enough.
New ideas do not come from places that are already well-connected. They come from places that are not yet connected.
Why we need a knowledge graph
This is where knowledge graph tools such as InfraNodus come into their own.
A knowledge graph does not view texts and notes as mere lists.
It treats concepts as nodes,
and the relationships between concepts as links.
This reveals the following:
- Which topics are central
- Which clusters are too dominant
- Which ideas have been given insufficient attention
- Which two areas have not yet been linked
- Where connections can be made to generate new questions
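To make the node-and-link idea concrete, here is a minimal sketch in Python. It is not how InfraNodus works internally. It assumes each note has already been reduced to a list of concepts (the `notes` data below is made up), links concepts that appear in the same note, ranks concepts by how many links they have, and treats connected components as a crude stand-in for clusters.

```python
from itertools import combinations

# Hypothetical toy data: each note reduced to the concepts it mentions.
notes = {
    "note_a": ["rag", "retrieval", "embeddings"],
    "note_b": ["rag", "embeddings", "chunking"],
    "note_c": ["pricing", "sales page"],
    "note_d": ["pricing", "customer queries"],
}

def build_graph(notes):
    """Link every pair of concepts that appear together in a note."""
    graph = {}
    for concepts in notes.values():
        unique = sorted(set(concepts))
        for concept in unique:
            graph.setdefault(concept, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def clusters(graph):
    """Group concepts into connected components."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result

graph = build_graph(notes)
# Concepts with the most links first; ties broken alphabetically.
central = sorted(graph, key=lambda c: (-len(graph[c]), c))
found = clusters(graph)

print("Most connected:", central[:2])
print("Clusters:", found)
```

In this toy data the retrieval concepts and the pricing concepts never share a note, so they end up in two separate clusters. That separation is the kind of gap worth asking about.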
Seeing these gaps is the crux of the matter.
Don’t simply ask the AI, ‘Come up with a good idea using this data.’
Ask instead: ‘There are two disconnected clusters here. Generate a question that bridges the gap between them.’
If the question changes, the answer changes too.
To be more precise,
the AI’s focus changes.
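As a rough illustration of changing the question, the sketch below turns two clusters into a gap-bridging prompt. The function name `bridge_prompt`, its parameters and the example clusters are all hypothetical; the resulting text can be pasted into whichever LLM you use.

```python
def bridge_prompt(cluster_a, cluster_b, n_questions=3):
    """Build a prompt that points the model at the gap between two clusters."""
    a = ", ".join(sorted(cluster_a))
    b = ", ".join(sorted(cluster_b))
    return (
        "My notes contain two clusters of concepts that are not yet connected.\n"
        f"Cluster A: {a}\n"
        f"Cluster B: {b}\n"
        f"Generate {n_questions} questions that bridge the gap between them."
    )

# Hypothetical clusters, for illustration only.
prompt = bridge_prompt({"rag", "embeddings"}, {"pricing", "sales page"})
print(prompt)
```

The point of the design is that the prompt names the gap explicitly instead of handing over the whole dataset and hoping the model finds it.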
A good AI workflow establishes a structure before asking questions
This is how the workflow works.
1. Create a new research folder.
2. Place the source material in the ‘raw’ folder.
3. Convert PDFs, notes and texts into Markdown.
4. The AI generates pages for concepts, connections, sources and questions.
5. InfraNodus visualises this wiki as a graph. (*For a detailed introduction to InfraNodus, please refer to .)
6. Identify weak clusters and unconnected gaps.
7. Feed those gaps back into the LLM.
8. Save the resulting questions and insights as a to-do list and a graph.
This is not merely a summary of knowledge.
It is a system in which knowledge gives rise to further questions.
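Steps 4 to 6 can be approximated by hand to see what is going on. The sketch below assumes the generated wiki pages use Obsidian-style `[[wikilinks]]` (an assumption about the output format, with made-up page contents held in memory rather than read from disk). It extracts the links between pages and flags pages that nothing connects to. It does not call InfraNodus.

```python
import re

# Matches [[Target]], [[Target|alias]] and [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")

# Hypothetical wiki pages: title -> Markdown body.
pages = {
    "RAG": "Retrieves snippets per question. See [[Embeddings]] and [[Chunking]].",
    "Embeddings": "Vector representations used by [[RAG]].",
    "Chunking": "Splitting sources before indexing. Related: [[RAG|retrieval]].",
    "Customer Queries": "Recurring questions from students.",
}

def extract_links(pages):
    """Return undirected links between existing pages."""
    edges = set()
    for title, body in pages.items():
        for match in WIKILINK.finditer(body):
            target = match.group(1).strip()
            if target in pages and target != title:
                edges.add(frozenset((title, target)))
    return edges

def link_counts(pages, edges):
    """Count how many links touch each page."""
    counts = {title: 0 for title in pages}
    for edge in edges:
        for title in edge:
            counts[title] += 1
    return counts

edges = extract_links(pages)
counts = link_counts(pages, edges)
unlinked = sorted(title for title, n in counts.items() if n == 0)

print("Link counts:", counts)
print("Unlinked pages:", unlinked)
```

Here the customer-queries page has no links at all, which makes it a candidate gap to feed back into the LLM in step 7.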
Why this approach is important for sole traders
This isn’t just a story for researchers.
It applies just as much to solo entrepreneurs.
For example, let’s say my files are scattered all over the place like this:
- YouTube scripts
- Draft X posts
- Bootcamp lecture notes
- Customer queries
- Product ideas
- Sales pages
- Retrospective notes
If you simply feed this into an AI and say,
“Give me 10 content ideas,”
it will produce a plausible list.
However, most of the ideas feel like something you’ve seen before.
Conversely, if you view this data as a knowledge graph, the questions change.
“What are the missing links between the customer query clusters and the lecture outline clusters?”
“What concepts remain unlinked between the arguments that went down well on X and the actual product features?”
“What are the points where bootcamp students repeatedly get stuck, and what topics have I not yet covered in my content?”
These are the questions that really pay off.
This is because I’m looking for gaps that already exist within my data
but haven’t yet been linked to content, products or sales.
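One simple way to surface such a gap, sketched under the assumption that customer queries and published content have already been tagged with concepts (all data below is made up): count how often each concept appears in queries and keep the ones the content never covers.

```python
from collections import Counter

# Hypothetical data: concepts tagged on each customer query.
customer_queries = [
    ["refunds", "onboarding"],
    ["onboarding", "api keys"],
    ["api keys", "pricing"],
    ["onboarding"],
]
# Hypothetical data: concepts my published content already covers.
content_concepts = {"pricing", "prompting", "automation"}

def uncovered_concepts(queries, covered):
    """Rank query concepts by frequency and keep those with no content yet."""
    counts = Counter(c for query in queries for c in set(query))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(concept, n) for concept, n in ranked if concept not in covered]

gaps = uncovered_concepts(customer_queries, content_concepts)
for concept, n in gaps:
    print(f"{concept}: asked about in {n} queries, no content yet")
```

The most frequent uncovered concept is where a new piece of content or a product tweak is most likely to pay off.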
The next step for RAG is not search, but steering
Many people understand RAG as ‘a technology that allows AI to search through my data’.
However, the next step is not simply to improve search accuracy a little further.
It is about telling the AI where to look.
Rather than having it average out the entire dataset,
it is about making it focus on specific gaps.
Rather than having it summarise scattered notes,
it is about making it connect clusters that are not yet linked.
Before demanding an answer,
it is about creating a structure that generates good questions.
This is why you need to use the LLM Wiki and the knowledge graph together.
If I were to apply this straight away, this is how I would do it
Applied to running my business, the process is straightforward.
First, I create folders for each topic.
For example:
- AI Bootcamp
- Solo Entrepreneur Automation
- X Content Experiments
- Customer Queries
- Product Ideas
I place all the source material inside these folders.
I format scripts, notes, feedback, comments and retrospectives in Markdown.
Next, I use LLM Wiki to establish connections between concepts.
Then I look at the result through InfraNodus.
Which topics are being repeated too often?
Which topics are weak?
Which customer queries are not linked to product features?
Which content claims are not linked to the sales page?
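The first two questions can be roughed out with a simple count before opening any graph tool. This sketch assumes every note has been assigned one main topic; the data and the `high` and `low` thresholds are arbitrary illustrations, not values taken from InfraNodus.

```python
from collections import Counter

# Hypothetical data: the main topic assigned to each of ten notes.
note_topics = ["automation"] * 6 + ["pricing"] * 3 + ["onboarding"]

def topic_balance(topics, high=0.4, low=0.15):
    """Return each topic's share of notes plus the dominant and weak topics."""
    counts = Counter(topics)
    total = sum(counts.values())
    shares = {topic: n / total for topic, n in counts.items()}
    dominant = sorted(t for t, share in shares.items() if share >= high)
    weak_topics = sorted(t for t, share in shares.items() if share <= low)
    return shares, dominant, weak_topics

shares, dominant, weak_topics = topic_balance(note_topics)
print("Repeated too often:", dominant)
print("Weak:", weak_topics)
```

A graph view adds what a count cannot: whether the weak topic is also disconnected from the dominant one.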
Next, I instruct the AI as follows:
“Generate five topics for X Article that bridge the gap between these two clusters.”
“What customer case studies are needed to strengthen this weak cluster?”
“Create a three-step framework to explain this connection.”
Now the AI no longer makes things up at random.
Working within my knowledge framework,
it fills in the gaps where they are most needed.
Conclusion
What matters going forward is not simply feeding AI more data.
It is about enabling AI to see the structure of my data.
LLM Wiki organises the data.
The knowledge graph reveals missing connections.
InfraNodus transforms those gaps into questions.
The LLM then uses those questions to generate new insights.
Once this flow is established, knowledge management becomes an engine rather than a repository.
It is not a system where notes simply pile up,
but one that continuously generates ideas for future content and products.
In my view, this is the true direction of personal knowledge management following RAG.
Let me leave you with one question.
Is my current note-taking system merely storing information, or is it generating the next question?