Hey, I've been thinking about this recently. Since there must be plenty of experts in how LLMs work and plenty of avid Buddhism followers on HN, this seems like an idea that could take shape in this forum. I'm not claiming this would work, so cut me some slack; I want to see what y'all think.
Some ideas that got me thinking about this:
- LLMs build a world model and are not 'just' parrots
- The Buddha spoke and people wrote it down; maybe the 'original' world model can be extracted from the writing?
- The Pali Canon has 15-17k pages. That sounds like a lot, but it's a small dataset compared to how much data is used to train models these days
- There's orders of magnitude more commentary that could be used, but would it detract from or dilute the 'original' world model?
- Say we dump in everything written about Buddhism: there's probably a lot of redundancy, but maybe enough material overall to get 'enough' data to train on
E.g. there's https://chat.openai.com/g/g-WxckXARTP-astrology-birth-chart-... for astrology
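
To put the dataset size point in rough numbers, here's a back-of-envelope sketch. Everything except the 15-17k page range is an assumption on my part: the words-per-page figure is a guess, the tokens-per-word ratio is a common rule of thumb for English text, and the 2 trillion token comparison is just the order of magnitude publicly reported for some recent models (e.g. Llama 2).

```python
# Back-of-envelope estimate; every constant below is an assumption, not a measurement.
WORDS_PER_PAGE = 400        # assumed average for a printed page
TOKENS_PER_WORD = 1.3       # rough rule of thumb for English text
PRETRAINING_TOKENS = 2e12   # assumed order of magnitude for a recent LLM's training set


def estimate_tokens(pages, words_per_page=WORDS_PER_PAGE, tokens_per_word=TOKENS_PER_WORD):
    """Rough token count for a corpus of the given page count."""
    return pages * words_per_page * tokens_per_word


low = estimate_tokens(15_000)   # lower end of the page range from the post
high = estimate_tokens(17_000)  # upper end of the page range from the post

print(f"Estimated canon size: {low:,.0f} to {high:,.0f} tokens")
print(f"Share of an assumed pretraining corpus: {high / PRETRAINING_TOKENS:.2e}")
```

If those guesses are anywhere near right, the canon alone is a tiny fraction of a from-scratch training set, which is why the commentary question matters, and why fine-tuning an existing model might be a more realistic starting point than training from zero.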