Thanks. Yeah I played with your app early on and just fired it up again to see the progress. Frankly I find the interface pretty intimidating but it is cool that you can easily stitch generations together.
Unsolicited UX recs:
- strongly recommend a default model. The list you give is crazy long. It kind of recommends SD 1.5 in the UI text below the picker but has the last one selected by default. Many of them are called the same thing (ironically the name is "Generic" lol).
- have the panel on the left closed by default or show a simplified view that I can expand to an "advanced" view. Consider sorting the left panel controls by how often I would want to edit them (personally I'm not going to touch the model but it is the first thing).
You are doing great work but I wouldn't underestimate the value of simplifying the interface for a first-time user. It seems to have a ton of features but I don't know what I should actually be paying attention to / adjusting.
Is there a business model attached to this or do you have a hypothesis for what one might look like?
Agreed on the UX feedback. It accumulated a lot of cruft moving from the old technologies to the new. This just echoes my early feedback that co-iterating the UI and the technology is difficult; you'd better pick the side you want to be on, and there is only one correct side (and unfortunately, the current app is trying hard to be on both sides).
I am. I thought this was obvious. My statement is objective. I would go as far as to say it is the only app that works on 8GiB macOS devices with SDXL-family models.
"You should check out this thing" has a very different implied context than "You should check out this thing I made". The first sounds like a recommendation from an enthusiastic user, not from the the author. Because of this, discovering that you are the author makes your recommendation feel deceptive.
I am sorry if you feel that way. I joined HN when it was a small, tight-knit community without much of a marketing presence. The "obvious" comment is more of a "people know other people" kind of thing. I didn't try to deceive anyone into using the app (and why would I?).
If you feel this is unannounced self-promotion, yes, it is, and can be done better.
---
Also, for the "objective" comment: it was meant to say "the original comment is still objective", not that you can be objective only by being the developer. Being the developer can obviously bias your opinion.
Should be in a few days. I asked Stability to clarify whether I can deliver weights through my Cloudflare bucket, and whether qualifying as non-commercial depends on who runs the model, not who delivers it.
2008, when we both joined, was 15 years ago. In the interim, the userbase has grown. Most people aren't recognizable as the author of an app under discussion, so a simple "Developer here" is appreciated as it was not obvious to me.
Whoa, well let me just say thanks for the awesome app!! It's pretty entertaining to spin this up in situations where I don't have internet (airplane, subway, etc.).
I was also surprised by how well it ran on my iPhone 11 before I replaced it with a 15 Pro.
(Let me know if you're looking for some Product Design help/advice, totally happy to contribute pro bono. No worries if not of course!)
Nice app - but for future reference it is very much not obvious to any native English speaker. "You should check out X" sounds like a random recommendation.
I've been generating stuff non-stop in Draw Things for a few days, it's very good. Agree with the comments elsewhere about the rather overwhelming UI, and I have only one feature request: let us input the number of images we want to generate - the 100 limit means I keep having to check if it's finished to restart it.
If you want an absolute beast, especially for this stuff, you probably want Intel + Nvidia. Apple Silicon is a beast in power efficiency, but a top-of-the-line M3 does not come close to a top-of-the-line Intel + Nvidia combo.
If it's just for hobby/interest work, then just a heads-up that even the 1st generation Apple Silicon will turn over about one image a second with SDXL Turbo. The M3s of course are quite a bit faster.
The performance gains in recent models and PyTorch are currently outpacing hardware advances by a significant margin, and there are still large amounts of low-hanging fruit in this regard.
I like ComfyUI the most now, but it's probably not the most beginner-friendly. It has great features, though, is extensible, and you can build workflows that work for you and save them, so you don't have to click a million times like in Auto1111.
I just installed InvokeAI and wish I hadn't. It installs -so much- outside of its target directory. A1111 and ComfyUI are fairly self-contained wherever you put them.
It's all isolated in a single directory, though, right? I set it up ages ago, but my recollection is that it installs itself in ~/invoke on Linux and stays contained there.
There are already a number of local inference options that are (crucially) open source, with more robust feature sets.
And if the defense here is "but Auto1111 and Comfy don't have as user-friendly a UI", that's also already covered. https://github.com/invoke-ai/InvokeAI
I switched to InvokeAI and won't go back to the basic A1111 webui. I like how everything is laid out, there are workflow features, you can easily recall all properties (prompt, model, LoRA, etc.) used to generate an image, things can be organized into boards, and all of the boards/images/metadata are stored in a very well-designed SQLite database that can be tapped into via DataGrip.
AUTOMATIC1111: great for the fast implementation of the most recent generative features.
ComfyUI: excellent for workflows and recalling the workflows, as they're saved into the resulting image metadata (i.e. sharing images shares the image generation pipeline).
InvokeAI: great UX and community; arguably they were a bit behind in features while focused on making the UI work well, and are now at the stage of bringing in the best features of competitors. Like you, I can easily recommend it above all other options.
> recalling the workflows, as they're saved into the resulting image metadata (i.e. sharing images shares the image generation pipeline)
Doesn't A1111 already do this? There's a PNG Info tab where you can drag and drop a PNG and it will pull the prompt, negative prompt, model, etc., and then there's a button to send it to the main generation tab. It doesn't automatically load the model, but that may be intentional because of how long it takes to change loaded models.
Not in a way that provides the same thing, no, largely because of fundamental design differences.
> There's a PNG Info tab where you can drag and drop a PNG and it will pull the prompt, negative prompt, model, etc., and then there's a button to send it to the main generation tab.
A1111, by nature, has a bunch of disconnected operations in separate tabs and scripts. Even if the PNG captures all of a generation operation that would be executed by a single launch-button click, it's not really equivalent to capturing a whole ComfyUI workflow, which can be the equivalent of a process spanning numerous different tasks in A1111, with data manually shuttled between tabs and scripts.
A1111 has a bunch of manual "send to X" buttons for the output of runs, so that they can become the input of another task, whereas in Comfy those operations are part of one workflow, with a pipeline connecting the output of one to the input of another. And when saving generation data, those manual shuttle points in A1111 are barriers to what counts as a single generation that can be saved.
Can you actually use those workflows through some sort of API to automate them from, let's say, a Python script? Played around with Comfy. Really nice, but I would like to automate it within my own environment.
It's just missing too many features for me still, even though I like what it has better. I use things like Segment Anything and custom upscalers, and I prefer how inpainting is controlled in A1111, where you can say whether you want the whole image or the masked area only, etc.
I've personally been using SD.Next, which is a fork of A1111 with support for the diffusers backend, a cleaned-up UI, and sometimes support for newer things before A1111, though not always. It's plugin-compatible with A1111.
No idea whether or not the UI is user-friendly, but the installation steps alone for InvokeAI are already a barrier for 99.9% of the world. Not to say Noiselith couldn't be open-source, but it's clearly offering something different from InvokeAI.
I can't even figure out how one would install Noiselith. It has some text that says "Download for free on your PC", but it's not a button or a link. Maybe they're doing some weirdly locked-down user-agent sniffing and refuse to allow me to even attempt to download any version on Linux?
InvokeAI is installed via a script, sure, but it's also just a few clicks: download, extract, double-click on a specific file, enjoy.
There are a bajillion local SD pipelines, but this one is, by far, the one with the highest-quality output out of the box, with short prompts. It's remarkable.
And that's because it integrates a bajillion SDXL augmentations that other UIs don't implement or enable by default. I've been using Stable Diffusion since 1.5 came out, and even having followed the space extensively, setting up an equivalent pipeline in ComfyUI (much less diffusers) would be a pain. It's like a "greatest hits and best defaults" for SDXL.
I was afraid of the Python setup (even though I'm a Python developer), but yep: make the virtualenv, install the dependencies, done. This is amazing; the images it generates are immediately beautiful.
It does look bad that it bundles GTM, though, as a sibling commenter says.
> Be sure to try the styles as well. That's actually a separate input from the prompt for SDXL.
No, it's not.
There are two text encoders, but they aren't really “prompt” and “style” inputs.
> and most other UIs don't implement the style prompting.
Most UIs' default mode of operation sends the same input to both text encoders, but Comfy at least has nodes that support sending separate text to them. OTOH, while there may be some cases where sending different text to the two encoders helps in a predictable way, AFAIK most of the testing people have done has shown that optimal prompt adherence usually comes from sending the same text to both.
I'm not sure of the origin, but using ViT-L (the encoder shared with SD 1.x) for what you might call the main prompt and ViT-G (the new SDXL encoder, and also a successor to the encoder used as the single encoder in SD 2.x) for a style prompt was a common idea shortly after SDXL launched, so it's understandable.
OpenCLIP ViT-G and CLIP ViT-L. The latter is the same encoder used in SD 1.x; OpenCLIP ViT-H was used as the encoder in SD 2.x, and ViT-G is, as I understand it, a successor to and improvement on ViT-H.
Yeah, it is; you just need to set an env var of GRADIO_ANALYTICS_ENABLED=False to stop it. That should probably be added into launch.py along with the other env vars being set at launch.
Just spent about 10 minutes building it on a MacBook Pro M1. I come with significant bias against Python projects, but getting Fooocus to run was very, very easy.
Did you get Fooocus to run on an Apple Silicon Mac with MPS support? I can't get MPS support running for the love of god. Any help would be much appreciated by me (as well as the 20 or so people currently looking for a solution on GitHub) to achieve normal generation speed, compared to the current 15 minutes per image :) Thank you!
Mine takes about 3 min per image; I didn't do anything special. Left everything at default settings after my initial install (about 5 days ago). Not speedy, but certainly not 15 min. Running on an M1 with 32GB RAM.
Yeah, Fooocus is much better if you are going for the best locally generated result. Lvmin puts all his energy into making beautiful pictures. Also, it is GPL-licensed, which is a + in my book.
Eh, I messed around with it for a while - it's okay and good for beginners, but without much more effort you can get better results out of A1111 or ComfyUI.
Not really. There is a very fast LCM model preset now, but it's still going to be painful.
SDXL in particular isn't one of those "compute-light, bandwidth-bound" models like llama (or Fooocus's own mini prompt-expansion LLM, which in fact runs on the CPU).
I use the same. Any ideas where to find current (not outdated) guides on how to create your own "model" out of the most similar pictures of my dream model? I want to keep using it to generate the same face.
Looks like a complete contraption to set up, and very unpleasant to use at first glance when compared against Noiselith.
The hundreds of Python scripts and having the user touch the terminal show why something like Noiselith should exist for normal users rather than developers or programmers.
I would rather take a packaged solution that just works over a bunch of scripts requiring a terminal.
Okay. I'll need to install it? What package might that be in, hmm. Moving on, I already know it's Python.
> /usr not writeable
Guess I'll use sudo...
= = =
Obviously I know better than to do this, but very few people would. This is not 'dead simple'! It's only simple for Python programmers who are already familiar with the ecosystem.
Now, fortunately the actual documentation does say to use venv. That's still not 'dead simple'; you still need to understand the commands involved. There's definitely space for a prepackaged binary.
The people that make software that does useful things, and the people that understand system security live on different planets. One day they'll meet each other and have a religious war.
This said, it's nice when developers attempt to detect the executable they need and warn what package is missing.
There are projects that set up "fat" Python executables or portable installs, but the problem in PyTorch ML projects is that the download would be many gigabytes.
Additionally, some package choices depend on hardware.
In the end, a lot of the more popular projects have "one-click" scripts for auto-installs, and there are some for Fooocus specifically, but the issue there is that they're not as visible as the main repo, and not necessarily something the dev wants to endorse.
You have to make trade-offs in software development. Fooocus trades for the best picture and simplicity of use rather than the most beautiful interface. I think it is a good trade-off given that the technology is improving at breakneck speed.
Look, DiffusionBee is still maintained but still has no SDXL support.
Anyone who bet that the technology is done and it is time to focus on the UI is making the wrong bet.
This project is really cool and I like the stated philosophy on the README. I think it's making the right trade-off in terms of setting useful defaults and not showing you 100 arcane settings. However, the installation is too hard. It's a student project and free so I'm not criticizing the author at all but I think it's a pretty fair and useful criticism of the software and likely a significant bottleneck to adoption.
Huh? It has a really simple interface, much much much simpler than anything else that uses SD/SDXL locally. Installation is also simple for Windows/Linux, don't know about macOS.
I realize it may be good marketing, but it's odd to have the fact that it's on device and offline be the primary differentiator when that's probably how most people use Stable Diffusion already.
I'd probably focus more on it being easy to install and use, as that's something that isn't done much. For me, if it doesn't have ControlNet, upscaling, some kind of face detailer, and preferably regional prompting, I'm out.
I also kind of wish all of these people that want to make their own SD generators would instead work on one of the open source ones that already exist.
While an app store might be a good idea, in a world with Auto1111 and all of its extensions I think it's going to go over poorly with the Stable Diffusion community, for what it's worth.
You hit the nail on the head when you said it's good marketing, but go all the way. The thing you find odd tells you who they want to use their product: you're not their target audience. They are trying to convert people from using online-only services like DALL-E, not people who already use SD.
I think there's probably a bunch of people who don't use things like A1111 because of the complexities of the download-this-which-downloads-this-which-downloads-this-then-you-manually-download-this-and-this setup model.
I can see how something simpler might appeal to new users, even if it doesn't appeal to existing users.
I've found oddly many cloud wrappers around Stable Diffusion, so I like the upfront on-device/offline description.
It was weird, when I was first playing with SD, how many packages phoned home heavily or ran in VMs or whatever instead of just downloading a bunch of stuff and running it.
Sales prompt: "Young woman with blonde curls in front of a fantasy world background, come hither eyes, sitting with her legs spread, wearing a white shirt and jeans hot pants."
If the prompt wasn't somewhat sexual, divisive, or offensive it would be wide open to the chorus of "still not as good as midjourney/dall-e/imagen". Freedom from restriction is one of the main selling points.
I'm genuinely curious how many people in the open source community are pouring their sweat and blood into these projects that are, at the end of the day, enabling guys to transform their macbooks into insta-porn-books.
After installation, it wouldn't run on my Windows machine unless I granted public and private network access. Kinda tripped me up since it says "offline".
On the first run it downloads about 30GB of data. I don't know if it would work offline on subsequent runs because for me it never ran again without crashing!
Also, upon uninstallation it left behind all its data (not user data, mind you, but the executable itself, its Python venv, its updater, and all the models; uninstall basically just removed the shortcut in the Start menu).
Definitely exciting to see more local clients come out. As mentioned in other comments, there are some great ones out already. I've used Automatic1111, which is quick and doesn't require a ton of tuning. But it still has lots of knobs and options, which makes it difficult initially. Fooocus is super quick but of course offers less customization.
Then there's ComfyUI, the holy grail of complicated, but with that complication comes the ability to do so much. It is a node-based app that allows you to create custom workflows. Once your image is generated, you can pipe that "node" somewhere else and modify it, eg: upscale the image or do other things.
I'd like to see if Noiselith or some others offer support for SDXL Turbo -- it came out only a few days ago but in my opinion is a complete game-changer. It can generate 512x512 images in roughly half a second on consumer GPUs. The images aren't crazy quality, but the ability to make a prompt like "fox in the woods", see it instantly, and then add "wearing a hat" and see it instantly generate again is so valuable. Prior to that, I'd wait 12 seconds for an image. Sounds like not a big deal, but the value of being able to iterate so quickly makes local image gen so much more fun.
I'm being kind of tongue in cheek because I understand that this is for just making things really easy and ComfyUI is a node based editor that most people would have trouble with. But the best UI for local SD generation that the community is using is https://github.com/comfyanonymous/ComfyUI
If you are a programmer at heart, ComfyUI will feel very comfortable (pun intended). It's basically a visual programming environment optimized for the type of compositional programming that machine learning models desire. The next thing this space needs is someone to build an API hosting every imaginable model on a vast farm of GPUs in the cloud. Use ComfyUI and other apps to orchestrate the models locally, but send data to the cloud and benefit from sharing GPU resources far more efficiently.
If anyone has a spare thousand hours to kill, I would build that and connect it up with the various front-ends, including ComfyUI, A1111, etc. Not a small amount of effort, but it would be rewarding.
Agreed. It's worth the learning curve for the sheer power it adds to your workflows. I've always wanted to toy around with node-based architectures, and this seemed quite easy after using A1111 extensively. The community providing ready-to-go workflows has made it quite enjoyable too.
I can't seem to get myself to switch over to Comfy because it looks rather intimidating. I've only used A1111 a dozen times, and only for funny work images…
Haven't gotten to test it, but given I use CoreML on Comfy, I wonder if we'll see more optimization and performance work on the back end of these platforms as more useful frontends come out. The 1-4 it/s on a 512 image is just sad, and so is the 2-3 s/it on 1024 in this modern day; hell, the ANE can't even run SD at 1024x1024 on a MacBook Pro M3 :S
As others have stated, local AI (completely offline after the model/weight download) is the way to go. If I have the hardware, why shouldn't I be able to run all this fancy software on my own machine?
There are many great suggestions and links to other similar/better packages, so follow the comments for more info, thanks :-)
What's the privacy and licensing like for this? I'm honestly too ignorant to know if someone's allowed to use this for commercial purposes, or even if it's sending the generated images/prompts somewhere even if it's rendering locally.
The 16GB (base model) M2 Pro Mini, despite its overall awesomeness (running DiffusionBee.app, etc.)... does not meet the minimum system requirements (Apple Silicon requires 32GB RAM).
So now I have to contemplate shopping for a new mac TWICE in one year (never happened before).
Good lord. I can get a 2048x2048 upscaled output from a very complex ComfyUI workflow on a 4090 in 15 seconds. This includes three IPAdapter nodes, a sampling stage, a three-stage iterative latent upscaler, and multiple ControlNets. Macs are not close to competitive for inference yet.
I mean, a 4090 would appear to cost $2000, and came out a year ago; it has about 70bn transistors. The M1 could be had for $700 in a desktop, $1000 as part of a laptop, came out three years ago, and has 16bn transistors, some of which are spent on the CPU.
An M3 Ultra might be a more reasonable comparison for the 4090.
When choosing a machine with non-expandable RAM, you went with the minimum configuration? That's a choice, I suppose, but the outcome wasn't exactly hard to foresee.
At the time, it seemed that ANY upgrade made the M2 Pro mini cost-ineffective.
One of my cooler Q1 2023 ChatGPT experiences was having it help me "reason through" which machine was most "upgrade-proof," dollar-for-dollar.
Now, in Q4 2023, I would definitely have decided to purchase an M2 Studio (base model) instead; those additional upgrades (vs. a similar M2 Pro mini config) were much more cost-effective. Overall, I'm extremely satisfied with my M2 Pro base model.
The main issue with AMD is that to get reasonable performance you need to use ROCm, and ROCm is only available on Linux. They have started porting parts of ROCm to Windows, but it's not enough to be usable yet; that might be different in a few months.
I think it is a matter of why AMD does not support these projects. NVIDIA is involved everywhere; they could easily do the same. At least from what I have observed on the internetz.
Pros:
- seems pretty self contained
- built in model installer works really well and helps you download anything from CivitAI (I installed https://civitai.com/models/183354/sdxl-ms-paint-portraits)
- image generation is high quality and stable
- shows intermediate steps during generation
Cons:
- downloads a 6.94GB SDXL model file somewhere without asking or showing the location/size. Just figured out you can find/modify the location in the settings.
- very slow on first generation as it loads the model; no record of how long generations take, but I'd guess a couple of minutes (M1 Max MacBook, 64GB)
- multiple user feedback modules (a very intrusive chat thing in the bottom left that I'll never use, plus a call for beta feedback in the top right)
- not open source like competitors
- runs 7 processes, idling at ~1GB RAM usage
- non-native UX on macOS, missing the hotkeys and Help menu you'd expect. Electron app?
Overall 4/5 stars, would open again :)