
I’m confused why there is so much focus on text-to-image models. If you spent five minutes talking to anyone with artistic ability, they would tell you that this is not how they generate their work. Making images involves entirely different modes of reasoning than speech and language do. We seem to be building an entirely faulty model of image generation (outside of things like ControlNet) on the premise that text and images are equivalent, solely because that’s the training data we have.


Can you share some of what you have found about the creative process by talking to people with artistic ability?

What are your ideas about the differences between a human's and an AI's creative process?

Are there any similarities, or analogous processes?

Do you think creators have a kind of latent space where different concepts are inspired by multi-modal inputs (what sparks inspiration? e.g. sometimes music or a mood inspires a picture), and that creators then make different versions of their idea by combining different amounts of different concepts?

I am not being snarky; I am genuinely interested in views comparing human and AI creative processes.


I used to work as an illustrator. Most images appeared to me as image concepts, somewhere between fuzzy and clear, unaccompanied by any words. I then had to take these concepts and translate them using principles of design, color, composition, abstraction, etc., such that they’re coherent and understandable to others.

Most illustration briefs are also not rote descriptions of images, because people are remarkably bad at describing what they want in an image beyond the most general sense of its subject. This is why you see DALL-E doing all kinds of prompt elaboration on user inputs to generate “good” images. Typically, the illustrator is given the work to be illustrated (e.g. an editorial), distills key concepts from the work, and translates these into various visual analogues, such as archetypes, metaphors and themes. Depending on the subject, one may have to include reference images or other work in a particular style, if the client has something specific in mind.


Project briefs to an artist typically contain both text and reference images. Image diffusion models and the like likewise typically use a text prompt together with optional reference images.


Project briefs are generally not descriptions of images, and reference tends to be more style- than content-focused. Source: I used to be an illustrator for major media outlets like the NYTimes, etc.


So how is the desired content communicated to the artist?

(Also, reference images can absolutely be used to communicate style to a diffusion model)
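
A rough sketch, purely to illustrate that point (not anything from this thread): in Hugging Face diffusers, an IP-Adapter lets a reference image condition generation alongside, or instead of, the text prompt. Checkpoint names and file paths below are placeholders.

    # Sketch: steering style with a reference image via an IP-Adapter (diffusers).
    # Checkpoints and file names are illustrative, not a recommendation.
    import torch
    from diffusers import StableDiffusionPipeline
    from diffusers.utils import load_image

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Attach an image-prompt adapter so a reference image conditions generation.
    pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                         weight_name="ip-adapter_sd15.bin")
    pipe.set_ip_adapter_scale(0.6)  # how strongly the reference steers the output

    style_ref = load_image("style_reference.png")  # hypothetical local file
    image = pipe(
        prompt="a quiet harbor at dusk",  # the content
        ip_adapter_image=style_ref,       # the style/feel, taken from the reference
        num_inference_steps=30,
    ).images[0]
    image.save("out.png")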


The artist is given the content to be illustrated, extrapolates themes and overarching rhetorical or narrative aspects of the work, creates visual representations or metaphors corresponding to these aspects, generates 3-5 interpretations, and shows them to the AD (art director), who provides feedback on what has been extrapolated as well as various design considerations.


"The artist is given the content to be illustrated" and the content is in what form?


Not even wrong, in the Pauli sense: to engage requires conceding the incorrect premises that image models only accept text as input and that the generation process relies on this text.


Text prompts aren't an essential part of this technology. They're being used as the interface to generation APIs because they're easy to build and easy to moderate, and because, for Discord-based models like Midjourney, they make it easy for people to copy your work.

With a local model you can find latent-space coordinates any way you want, and patch the pixel-generation model any way you want too (these techniques are usually called textual inversion and LoRAs).
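
A minimal sketch of what that looks like in practice, assuming Hugging Face diffusers; the embedding and LoRA files below are placeholders for whatever you have trained or downloaded:

    # Two non-text ways to steer a local model (diffusers; paths are placeholders).
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Textual inversion: a learned embedding acts as a new "word" that points at
    # coordinates found from example images rather than from a description.
    pipe.load_textual_inversion("./my_concept.safetensors", token="<my-concept>")

    # LoRA: small learned patches to the model's weights that change how it paints.
    pipe.load_lora_weights("./my_style_lora.safetensors")

    image = pipe("a portrait of <my-concept>", num_inference_steps=30).images[0]
    image.save("out.png")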

I would personally like to see a system that can input and output layers instead of a single combined image.


It’s good for stock images.

And for in-painting I think you’ll find text-to-image is still useful to artists. It’s extra metadata to guide the generation of a small portion of the final image.
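
A minimal sketch of that workflow, assuming Hugging Face diffusers (file names are placeholders): the prompt guides only the masked region, and the rest of the artist's image is left untouched.

    # Text-guided inpainting: the prompt shapes only the masked patch.
    import torch
    from diffusers import StableDiffusionInpaintPipeline
    from diffusers.utils import load_image

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
    ).to("cuda")

    base = load_image("artwork.png")  # the artist's existing image
    mask = load_image("mask.png")     # white = region to regenerate

    result = pipe(
        prompt="a small red lighthouse on the headland",  # guidance for the patch
        image=base,
        mask_image=mask,
        num_inference_steps=30,
    ).images[0]
    result.save("patched.png")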


Not sure what these cars are all about. Everyone travels by horse and buggy…

We’re building a model optimized for the machine, not people.

Artists can go collect clay to sculpt and flowers to convert to paint. Computers are their own context and should not be romantically anthropomorphized.

In the same way that fewer and fewer people go to church, fewer and fewer will feel nostalgia for being a data entry worker all day. Society didn’t stop when we all got our first beige box.


This is an incredibly dull, unthinking regurgitation of the nonsense people say online to feel better about their own lack of creative ability. My point wasn’t that computers can’t do the same thing as artists (they already can), it’s that computers won’t achieve the same result by having people describe the images they want to see because that’s fundamentally not how images are made or even perceived.


I play three instruments, draw, sculpt, and used to build houses.

No one ever set a goal for AI to achieve the same result; the goal is just to replace labor.

Your post is the same dull strawman non-engineers (I also have a BSc in engineering and an MSc in math) repeat about AI.

Find me a formal proof of how “images are made” and I’ll show you one possible model out of an infinite number of possible models that explain it, given a few axiomatically correct twists to the math, since all of our symbolic logic is a leaky abstraction that fails to capture how anything is “fundamentally made”.

Pretentious semantic wank is all you’re shipping


Check out invoke.ai for an example of something much closer to a professional tool.


Nah, that has absolutely nothing to do with what they're saying. I've used it for over a year, and this is a weird way for it to appear in conversation. I hope you're not astroturfing.



