I'm thinking of a book that you can ask questions of. It can explain topics in more detail, or it can tell you that the thing you asked about will be explained later in the book. It would also let you skip material that you are already familiar with, provide references to other resources, etc.
Maybe ingesting an entire book is too much for current LLMs, but I'm sure there are ways around that.
Note: I am __not__ trying to build such a tool myself.
> Distill $content by condensing sentences without loss of meaning and then restructuring as a Minto Pyramid. Use point headings, not trite labels; prose, not bullet points; and up to four outline levels, as appropriate. Provide summary umbrella sections to enumerate upcoming sub-points.
That usually gets the main points toward the top and logically organizes the supporting details. It lets me quickly grok the gist. But it's always lossy and doesn't convey every level of detail in the text.
If the text is worth reading and understanding, then I like to paste the original text into a plain text file or outliner; put each sentence on its own line (you can use an LLM for this but regex is better, faster, and cheaper); and both rewrite and reorganize it myself. (Sublime Text and outliners such as Bike and OmniOutliner make moving and indenting lines easy with keyboard shortcuts.)
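For the sentence-per-line step, something like this is roughly what I mean by "regex is better, faster, and cheaper" (a rough sketch; the exact pattern depends on your text, and abbreviations like "e.g." will over-split):

```python
import re

def one_sentence_per_line(text: str) -> str:
    """Rough sentence splitter: break after ., !, or ? when followed by whitespace
    and something that looks like the start of a new sentence."""
    paragraphs = re.split(r"\n\s*\n", text)
    lines = []
    for para in paragraphs:
        # Collapse hard-wrapped lines into one string before splitting.
        flat = re.sub(r"\s+", " ", para).strip()
        sentences = re.split(r"(?<=[.!?])\s+(?=[\"'A-Z(])", flat)
        lines.extend(s for s in sentences if s)
        lines.append("")  # keep a blank line between paragraphs
    return "\n".join(lines).rstrip()

print(one_sentence_per_line("First sentence. Second one! A third? Yes."))
```

From there it's easy to paste the result into Sublime Text or an outliner and start moving lines around.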
PS: People need to learn to either write better in the first place or have an LLM properly structure their writing, so that the work gets done correctly once "on the server side" instead of forcing each reader to redo it, many times over and more expensively, on the reader / "client side".
PPS: Kagi's Research and Research (Experimental) models work well with the prompt suggested above.
Most models, local or cloud, have issues with very long contexts even if they "support" a 1M-token context window.
I've tried local models, and at around 30K tokens of context they start making up or summarizing content rather than retaining it, and will not fully reproduce the input.
You could try re-training a local model on the book, or implementing RAG.
I don't know how the latest local models would handle a 200K context window, but RAG may help keep the context clean.
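If you go the RAG route, the idea is roughly this (just a sketch; `embed` and `generate` are placeholders for whatever embedding model and local LLM you run):

```python
import numpy as np

def chunk(book_text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split the book into overlapping character windows so nothing is cut mid-thought."""
    step = size - overlap
    return [book_text[i:i + size] for i in range(0, len(book_text), step)]

def top_k(question: str, chunks: list[str], embed, k: int = 5) -> list[str]:
    """Return the k chunks whose embeddings are closest (cosine similarity) to the question."""
    vecs = [np.asarray(embed(c)) for c in chunks]
    q = np.asarray(embed(question))
    sims = [float(v @ q / (np.linalg.norm(v) * np.linalg.norm(q))) for v in vecs]
    best = sorted(range(len(chunks)), key=lambda i: sims[i], reverse=True)[:k]
    return [chunks[i] for i in best]

def ask_the_book(question: str, book_text: str, embed, generate) -> str:
    """Keep the model's context small: retrieve a few relevant passages, then ask."""
    excerpts = "\n---\n".join(top_k(question, chunk(book_text), embed))
    prompt = f"Answer using only these excerpts:\n\n{excerpts}\n\nQuestion: {question}"
    return generate(prompt)
```

That way the model only ever sees a few thousand tokens at a time instead of the whole book, which sidesteps the long-context degradation.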