Does anyone know why all of these WebGPU LLM demos have you download the models to browser storage rather than letting you open a gguf already on your local drive? I have several models downloaded already that I would be interested in trying.
You can change this through settings, command-line arguments, build flags, etc. But you can't really expect people to do that just to use your website.
You can open a local file for performant access in all major browsers. It's the same API used for uploading files (<input type="file" />), but nothing forces you to upload: you can also just read the file into memory and work with it locally.
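To make that concrete, here is a rough sketch of reading a user-picked file without uploading it. It assumes the file comes from an <input type="file" /> element and only checks the 4-byte "GGUF" magic at the start of the file; `hasGgufMagic` and `looksLikeGguf` are names I made up for illustration, not part of any library.

```typescript
// GGUF files start with the four ASCII bytes "GGUF".
const GGUF_MAGIC: number[] = [0x47, 0x47, 0x55, 0x46];

// Hypothetical helper: true if the buffer begins with the GGUF magic.
function hasGgufMagic(bytes: Uint8Array): boolean {
  return bytes.length >= 4 && GGUF_MAGIC.every((b, i) => bytes[i] === b);
}

// Hypothetical helper: reads only the first 4 bytes of the picked file.
// Blob.slice() returns a view of that range, so the whole model is not
// loaded into memory just to check the header.
async function looksLikeGguf(file: Blob): Promise<boolean> {
  const head = new Uint8Array(await file.slice(0, 4).arrayBuffer());
  return hasGgufMagic(head);
}

// Browser wiring (assumed markup: <input type="file" id="model" />):
//
//   const input = document.getElementById("model") as HTMLInputElement;
//   input.addEventListener("change", async () => {
//     const file = input.files?.[0];
//     if (file && (await looksLikeGguf(file))) {
//       // hand `file` (or slices of it) to the inference code
//     }
//   });
```

A File is a Blob, so the same slice() call can be used to read a multi-gigabyte model in chunks instead of all at once.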