
I’m no expert but Florence2 has been my go-to. It’s pretty great at picking up art styles and IP stuff - “The image depicts Goku from the anime series Dragonball Z…”

I don’t believe you can really prompt it, though the other models that I could prompt didn’t work well on that front anyway.

TagGui is an easy way to try out a bunch of models.



Yeah, BLIP mostly ignores the prompt too. I tried to take it apart and feed it my own prompts, to no avail. I did find, though, that the default kohya GUI arguments are far from the best. Here are my args:

  finetune/make_captions.py ... \
    --num_beams=12 \
    --top_p=0.9 \
    --max_length=75 \
    --min_length=24 \
    --beam_search \
    ...
With these, I very often just take its caption as is, or add only a little.
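
If you script this instead of going through the GUI, a rough sketch like the one below keeps those flags in one place. It only assembles the flag strings listed above. The arguments hidden behind the `...` are left out because they depend on your setup. `CAPTION_FLAGS` and `build_flag_args` are made-up names for illustration, not part of the kohya scripts.

```python
import shlex

# Flag names and values copied from the args above.
# CAPTION_FLAGS and build_flag_args are hypothetical helper names.
CAPTION_FLAGS = {
    "num_beams": 12,
    "top_p": 0.9,
    "max_length": 75,
    "min_length": 24,
    "beam_search": True,  # assumed to be a bare switch, as written above
}


def build_flag_args(flags):
    """Turn a dict of options into a list of command-line flag strings."""
    args = []
    for name, value in flags.items():
        if value is True:
            # Bare switch with no value, e.g. --beam_search
            args.append(f"--{name}")
        elif value is False or value is None:
            # Disabled or unset options are skipped entirely
            continue
        else:
            args.append(f"--{name}={value}")
    return args


flag_args = build_flag_args(CAPTION_FLAGS)
# Print the flags as a single shell-safe string for copy-pasting
print(shlex.join(flag_args))
```

The resulting list can be appended to whatever command you already run, so trying a different beam count means changing one dict entry rather than editing a long shell line.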

> TagGui

Oh, interesting, thanks!



