
My hypothesis is that until they can really nail down image-to-text and text-to-image, such that training on diagrams and drawings produces fruitful multimodal output, classic engineering is going to be a tough nut to crack.

Software engineering lends itself to LLMs because code fits so nicely into tokenization. Mechanical drawings and electronic schematics, by contrast, are more like a visual language: image art, but with exacting and important pixel placement and a precise underlying logical structure.

In my experience so far, only o3 can kind of understand an electronic schematic, and really only at a "Hello World!" level of difficulty. I don't know how hard it will be to get to the point where it can render a proper schematic, or edit one it is given to meet some specified electrical characteristics.

There are programming languages used to define drawings, but the training data would be orders of magnitude smaller than what has been written for humans to learn from.




My experience is that SOTA LLMs still struggle to read even the metadata from a mechanical drawing. They're getting better -- they're now mostly OK at reading things like a BOM or a revision table -- but moderately complicated title blocks often trip them up.

As for the drawings themselves, I have found them pretty unreliable at reading even quite simple things (e.g. what's the ID of the thru hole?), even when those features are specifically dimensioned. As soon as spatial reasoning is required (e.g. there's a dimension from A to B and from A to C, and one asks for the dimension from B to C, which is just the difference of the two), they basically never get it right.

This is a place where there's a LOT of room for improvement.


I'm scared of something like the Xerox number-corruption bug [0], where some models will subtly fuck everything up in a way that is too expensive to recover from by the time it's discovered.

[0] https://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres...


Try having it output the circuit in SPICE. It actually works surprisingly well: it does a good job picking out component values and can describe the connectivity well. It falls apart when it writes the SPICE itself (professionally, there isn't really one well-accepted syntax) and wires up the components; like you say, it's missing the mind's eye. But I can imagine adding a ton of SPICE schematics with detailed descriptions, maybe in an LLM-optimized SPICE syntax, to the training data set... it'll be designing and simulating circuits in no time.


Yeah, how do you think that schematic is represented internally? How do you think the netlist is modeled? It's SPICE and HDL all the way down!

There are good reasons not to vibecode Verilog, but a lot of test cases are already being written by LLMs and the big EDA vendors (Cadence, Synopsys, Siemens) all tout their new AI capabilities.

It's like saying it can't read handwritten mathematical formulas, when it solves most math problems in markup (and if you aren't using markup, you're asking for trouble).


I brainfarted a bit and mixed up my earlier attempts at making LTspice .asc schematics (which are text representations of the GUI schematic, wires and all) with normal node-based SPICE syntax. I just tried again, specifically asking for SPICE to run with ngspice in a CLI. Seemed to run great! Going to play around with this for a bit now...
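A netlist along these lines (an RC low-pass with made-up component values; a minimal sketch I wrote by hand, not LLM output) is the kind of thing ngspice happily runs in batch mode with ngspice -b rc.cir:

    * rc.cir -- RC low-pass, minimal made-up example
    V1 in 0 AC 1
    R1 in out 1k
    C1 out 0 100n
    .ac dec 10 1 1Meg
    .print ac v(out)
    .end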


Problem #1 with text-to-image models is that the focus is on producing visually attractive, photo-realistic artistic images, which is completely orthogonal to what is needed for engineering: accurate, complete, self-consistent, and error-free diagrams.

Problem #2 is the low degree of control over text-to-image outputs: the models don't follow prompts closely.


Mechanical drawings and schematics are visualizations for humans.

If you look at the data structure of a Gerber or DWG file, it's vectors and metadata, which happen to be great inputs for LLMs.
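For illustration, here's a hand-written minimal Gerber file (a single made-up 5 mm trace, not from any real board): a coordinate-format spec, units, one aperture definition, a move, a draw, and end-of-file. It's all text.

    G04 minimal example: one 5 mm trace*
    %FSLAX26Y26*%
    %MOMM*%
    %ADD10C,0.200*%
    D10*
    X0Y0D02*
    X5000000Y0D01*
    M02*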

My hypothesis is that we haven’t done the work on that yet because the market is more interested in things like Ghibli imagery.


More like there isn't a resource of trillions of user generated schematics uploaded to the big tech firms that they can train on for free by skirting fair use laws.


Are you being facetious, or is that really your hypothesis?


Not the OP, but Ghibli imagery doesn't kill people or make things stop working if it falls into uncanny-valley territory, so the bar for a useful product is lower than for a "designer" based on an NN which has ingested annotated CAD files...


Programming languages don't really define drawings. There are several standards for the data models behind the exchange file formats used in engineering though.

Someone could try training an LLM on a combination of the STEP AP242 [1] data model and sample exchange files, or do the same for the Industry Foundation Classes behind Building Information Modeling [2].

[1] http://www.ap242.org/
[2] https://en.wikipedia.org/wiki/Industry_Foundation_Classes


Electrical schematics can be represented with linear algebra and Boolean logic... Maybe getting models to "understand" such schematics is just a matter of their becoming better at mathematical logic... which is pretty objective.
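For the linear-algebra half, a purely resistive schematic reduces to nodal analysis (a sketch, with N(j) the set of nodes adjacent to node j):

    G\,\mathbf{v} = \mathbf{i}, \qquad
    G_{jj} = \sum_{k \in N(j)} \frac{1}{R_{jk}}, \qquad
    G_{jk} = -\frac{1}{R_{jk}} \quad (j \neq k)

where v is the vector of node voltages and i the injected currents; Boolean logic covers the digital side.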


This paper works because it targets a problem domain that was intentionally constrained to ensure safety in the amateur high-power rocketry hobby, specifically with constraints and standards developed so that teenagers of varying skill could do the work with paper and pen, well before they had access to computers. While modern applications have added more functions, those core constraints remain.

It works precisely because it doesn't hit the often counterintuitive limits on generalization in pure math.

Remember that Boolean circuit satisfiability is NP-complete, and is beyond the expressibility of UHATs (unique hard attention transformers) + polynomial-length CoT, which is capped at PTIME.

Even integer logic with Boolean circuits is in PSPACE.

When you start to deal with values, you are going to have to add in heuristics and/or find reductions, and those will cost you generalizability.

Even if you model analog circuits as finite directed graphs with labelled vertices, similar to what Shannon used, stripping away some of the real-world electrical effects and treating them purely as computational units, the complexity can get crazy fast.

Those circuits, with specific constraints (IIRC local feedback, etc.), can be simulated by a Turing machine, but require ELEMENTARY space or time, and despite its name ELEMENTARY is iterated exponential: a tower 2^2^...^2^n of k twos.
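In symbols, ELEMENTARY is the union of the k-fold exponential time classes:

    \mathrm{ELEMENTARY} = \bigcup_{k \ge 1} k\text{-}\mathrm{EXP}
    = \bigcup_{k \ge 1} \mathrm{DTIME}\Big(\underbrace{2^{2^{\cdot^{\cdot^{2^{n}}}}}}_{k \text{ twos}}\Big)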

Also note that P/poly, viewed as the problems solvable by small circuits, is not a practical class: it contains every unary language, including ones we know real computers cannot solve in the general case.

The apparent paradox that P/poly, defined by small Boolean circuits, also contains those undecidable unary languages is a good entry point into that rabbit hole.

While we will have tools and models that are better at mathematical logic, the constraints here are actually limits on computation in the general case. Generalization often has these kinds of costs, and IMHO the RL benefits in this case relate to demonstrating exactly that.


Not entirely true. Routing is a very important part of electrical schematics.


Is it? Isn’t that more like PCB design? The schematic is just the abstract connection of components, right?


I would consider a PCB schematic to be part of an electrical schematic. Even if you don't, you still have to consider the final layout, because some lines will need EMF protection. The linear equations and Boolean algebra are just an (extremely useful) model, after all.


You can describe a diagram with a markdown-friendly language like Mermaid, so a model can at least capture state changes and processes, which are core to engineering.
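For instance, a small invented Mermaid state diagram, which a model can both read and emit as plain text:

    stateDiagram-v2
        [*] --> Idle
        Idle --> Heating : setpoint raised
        Heating --> Idle : setpoint reached
        Heating --> Fault : overtemp
        Fault --> [*]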


Try one of the models with good vision capabilities and ask it to output code using build123d.
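As a minimal sketch of the kind of code you'd hope to get back (all dimensions invented; uses build123d's Box, Cylinder, and export_step):

    # A plate with a centered thru hole; every dimension here is made up.
    from build123d import BuildPart, Box, Cylinder, Mode, export_step

    with BuildPart() as plate:
        Box(40, 30, 5)                                    # 40 x 30 x 5 mm plate
        Cylinder(radius=3, height=5, mode=Mode.SUBTRACT)  # 6 mm dia thru hole

    export_step(plate.part, "plate.step")  # hand the result back as STEP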


Tell it how to read schematics in the prompt



