I think there's a lot of potential for AI in 3D modeling. But I'm not convinced text is the best user interface for it, and current LLMs seem to have a poor understanding of 3D space.
So there's OpenSCAD, which is basically programming the geometry parametrically. But... at the current level of LLMs, I'd liken it to generating an SVG of a pelican on a bicycle.
Text being a challenge is a symptom of the bigger problem: most people have a hard time thinking spatially, and so struggle to communicate their ideas (and that's before you add on modeling vocabulary like "extrude", "chamfer", etc)
LLMs struggle because I think there's a lot of work to be done translating colloquial speech. For example, someone might describe creating a tube in fairly ambiguous language, even though they can see it in their head: "Draw a circle and go up 100mm, 5mm thick" as opposed to "Place a circle on the XY plane, offset the circle by 5mm, and extrude 100mm along the Z axis"
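To make the contrast concrete, here's a minimal sketch of what the precise phrasing pins down, emitting the OpenSCAD from Python the way these LLM workflows tend to; the 20mm outer radius is an assumption, since neither phrasing above actually states one.

    # A sketch of the unambiguous version of "draw a circle and go up 100mm, 5mm thick".
    # The outer radius is assumed (the casual phrasing never gives one).
    outer_r = 20   # assumed
    wall = 5       # "5mm thick"
    height = 100   # "go up 100mm"

    scad = f"""
    // circle on the XY plane, offset {wall}mm inward, extruded {height}mm in Z
    linear_extrude({height})
        difference() {{
            circle(r = {outer_r});
            circle(r = {outer_r - wall});
        }}
    """
    print(scad)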
When I use Claude to model I actually just speak to it in common English and it translates the concepts. For example, I might say something like this:
I'm building a mount for our baby monitor that I can attach to the side of the changing table. The pins are x mm in diameter and are y mm apart. [Image #1] of the mounting pins. So what needs to happen is that the pin head has to be large, and the body of the pin needs to be narrow. Also, add a little bit of a flare to the bottom and top so they don't just get knocked off the rest of the mount.
And then I'll iterate.
We need a bit of slop in the measurements there because it's too tight.
And so on. I'll do the little bits that I want and see if they look right before asking the LLM to union them into the main structure. It knows how to use OpenSCAD to generate preview PNGs and inspect them.
Amusingly, I did this just a couple of weeks ago and that's how I learned what a chamfer is: a flat, angled transition. The adjustment I needed to make to my pins, where they flare out (but at a constant angle), is a chamfer. Claude told me this as it edited the OpenSCAD file. And I can just ask it in-line for advice and so on.
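For anyone curious what that looks like in practice, here's a rough sketch of a flared pin plus the render-and-inspect step; every dimension is invented, and it assumes the openscad CLI is on your PATH.

    # Rough sketch of a pin with chamfered (flat, constant-angle) transitions, plus the
    # preview step the LLM uses between edits. All dimensions are made up for illustration.
    import subprocess

    pin_r, head_r, chamfer_h = 3, 6, 1.5   # assumed: narrow body, wide head, chamfer height

    scad = f"""
    union() {{
        cylinder(h = 8, r = {pin_r});                                // narrow body
        translate([0, 0, 8])
            cylinder(h = {chamfer_h}, r1 = {pin_r}, r2 = {head_r});  // chamfer: constant-angle flare
        translate([0, 0, 8 + {chamfer_h}])
            cylinder(h = 3, r = {head_r});                           // wide head
    }}
    """

    with open("pin.scad", "w") as f:
        f.write(scad)

    # The same kind of preview PNG the model can look at before deciding what to change next.
    subprocess.run(["openscad", "-o", "pin_preview.png", "pin.scad"], check=True)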
> LLMs seem to have a poor understanding of 3D space.
This is definitely my experience as well. However, in this situation it seems we are mostly working in "local" space, not "world" space wherein there are a lot of objects transformed relative to one another. There is also the massive benefit of having a fundamentally parametric representation of geometry.
I've been developing something similar around Unity, but I am not making competence in spatial domains a mandatory element. I am more interested in the LLM's ability to query scene objects, manage components, and fully own the scripting concerns behind everything.
Opus 4.5 seems to be a step above every other model in terms of creating SVGs. Before, most models couldn't make something that looked half decent.
But I think this shows that these models can improve drastically on specific domains.
I think if there were some good datasets/mappings for spatial relations and CAD files -> text, then a fine-tune/model with this in its training data could improve the output a lot.
I assume this project is using a general LLM with a unique system prompt/context/MCP for this.
(Of course, you and I know it is, it's just that you're asking it to do a lot)
I guess it's all in the perspective
Curious what you think is the best interface for it? We thought about this ourselves and talked to some folks but it didn't seem there was a clear alternative to chat.
Solidworks's current controls are the best interface for it. "Draw a picture" is something you're going to find really difficult to beat. Most of the other parametric CADs work the same way, and Solidworks is widely known as one of the best interfaces on top of that. They've spent decades building one that is both unambiguous and fast.
Haha yes, I've never heard any engineer describe any CAD package as anything other than slow and full of bugs. But of the alternatives, I think most would still pick Solidworks.
Maybe some combination of visual representation with text. For example, it's not easy to intuitively come up with the names of operations that could be applied to some surface. But if you could say something like 'select top side of cylinder' and it showed you some possible operations (with illustrations/animations) that could be applied to it, then it's easy to say back what you need it to do without actually knowing what's possible. As a result it may just be a much quicker way to interact with CAD than what we're using currently.
The clear alternative is VR. You put on hand trackers and then physically describe the part to the machine: "It should be, hmm, this wide," gestures, and moves hands that wide.
https://shapelabvr.com/
It's the Star Trek future way of interfacing with things. I don't know SOLIDWORKS at all. I'm a total noob at Fusion 360, but I've made a couple of things with it. Sketch and extrude. But what I can do is English. Using a combination of Claude and OpenSCAD and my knowledge of programming, I was able to make something that I could 3D print, without having to learn SOLIDWORKS. Same with Ableton for music. It's frustrating when Claude does the wrong thing, but where it shines is when you give it the skill to render the SCAD it's generating to a PNG and look at it, so it can iterate until it actually makes what you're looking for. It's where the human is out of the loop that the leaps are being made.
I've been experimenting with Claude Code and different code-to-CAD tools, and the best workflow yet has been with Replicad. It allows for realtime rendering in a browser window as Claude makes changes to a single code file.
Here's an example I finished just a few minutes ago:
https://github.com/jehna/plant-light-holder/blob/main/src/pl...
I think we are targeting different ends of the market, but I'm trying to do a full product development pipeline end to end. PCB, enclosure, peripherals, firmware.
https://github.com/MichaelAyles/heph/blob/main/blogs/0029blo...
I need to redo this blog, because I did it on a run where the enclosure defaulted to the exploded view and kicanvas bugged out. Either way, the bones of it are working. Next up is to add more subcircuits, do cloud compilation of firmware, and go from kicad_pcb to gerbers.
Then order the first prototype!
The UI is the inverse of whatever intuitive is. It's built on convention after convention after convention. If you understand the shibboleths (and I'm guessing most people take a certified course by a trainer for it?), then it's great, but if you don't, it really sucks to be you (i.e. me).
I would LOVE to try out what you've built, but I am afraid that if the model misinterprets me or makes a mistake, it'll take me longer to debug / correct it than it would to just build it from scratch.
The kinds of things I want to make in solidworks are apparently hard to make in solidworks (arbitrarily / continuously + asymmetrically curved surfaces). I'm assuming that there won't be too many projects like this in the training dataset? How does the LLM handle something that's so out of pocket?
CAD and machining are different fields, true, but I see a lot of the same flaws that Adam Karvonen highlighted in his essay on LLM-aided machining a few months ago:
https://adamkarvonen.github.io/machine_learning/2025/04/13/l...
Do any people familiar with what's under the hood know whether the latent space produced by most transformer paradigms is only capable of natively simulating 1-D reasoning, and has to kludge together any process for figuring out geometry with more degrees of freedom?
This is incredible. Coincidentally, I've just started using Claude Code to model things using OpenSCAD and it's pretty decent. The fact that it can generate preview PNGs, inspect them, and then cycle back to continue iterating is pretty valuable. But the things I make are pretty simple.
My wife was designing a spring-loaded model that fits in our baby gates so that we can attach them to our walls more modularly, and she used Blender. Part of it is that it's harder to make a slightly more complex model with an LLM.
Solidworks is out of our budget for the kind of things we're building, but I'm hoping that if this stuff is successful, people will work on things further down-market. Good luck!
Totally relatable pain: getting LLMs to reliably drive precise CAD operations is surprisingly hard, especially when spatial reasoning and plane/extrude/chamfer decisions go wrong 70%+ of the time.
I've tried ChatGPT and Claude on datasheets of electronic components, and I'm sorry to say that they are awful at it.
Before that is fixed, I don't have high hopes for an AI that can generate CAD/EDA models that correctly follow some specification.
For people looking at a different angle on the "text to 3D model" problem, I've been playing with https://www.timbr.pro lately. Not trying to replace SolidWorks-level precision, but great for the early fuzzy "make me something that looks roughly like X" phase before you bring it into real CAD.
Solidworks might be as close to a final form for CAD as you're going to get.
don't worry, the one thing they did change was adding "the cloud" called 3DEXPERIENCE which is universally hated and gets more and more intrusively jammed into each new release.
oh and they changed the price as well, it went up, and up, and up
> but when poking around CAD systems a few months back we realized there's no way to go from a text prompt input to a modeling output in any of the major CAD systems.
This is exactly what SGS-1 is, and it's better than this approach because it's actually a model trained to generate Breps, not just asking an LLM to write code to do it.
I've been working on this exact same thing with both Solidworks and Altium! There has definitely been a step change in Opus 4.5; I first had it reverse engineer the Altium file format using a Ghidra MCP and was impressed at how well it worked with decompiled Delphi. Gemini 3 Pro/Flash also make a huge difference with data extraction from PDFs like footprints or mechanical drawings, so we're close to closing the whole loop on several different fields, not just software engineering.
For the most part they still suck at anything resembling real spatial reasoning, but they're capable of doing incredibly monotonous things that most people wouldn't put themselves through, like meticulously labeling every pin, putting strict design rule checks on each net, or setting up DSN files for the autorouter. It even makes the hard routing quite easy, because it can set up the DRC using the Saturn calculator so I don't have to deal with that.
If you give them a natural language interface [1] (a CLI in a Claude skill, that's it) that you can translate to concrete actions, coordinates, etc., it shines. Opus can prioritize nets for manual vs autorouting, place the major components using language like "middle of board" which I then use another LLM to translate to concrete steps, and in general do a lot of the annoying things I used to have to do. You can even combine the visual understanding of Gemini with the actions generated by Opus to take it a step further, by having the latter generate instructions and the former generate the JSON DSL that gets executed.
I'm really curious what the defensibility of all these businesses is going to be going forward. I have no plans on entering that business but my limit at this point is I'm not willing to pay more than $200/mo for several Max plans to have dozens of agents running all the time. When it only takes an hour to create a harness that allows Claude to go hog wild with desktop apps there is a LOT of unexplored space but just about anyone who can torrent Solidworks or Altium can figure it out. On the other hand, if it's just a bunch of people bootstrapping, they won't have the same pressure to grow.
Good luck!
[1] Stuff like "place U1 to the left of U4, 50mm away" and the CLI translates that to structured data with absolute coordinates on the PCB. Having the LLM spit out natural language and then using another LLM with structured outputs to translate that to a JSON DSL works very well, including when you need Opus to do stuff like click on screen.
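A hedged sketch of what that two-stage split might look like; the command schema and the resolver below are invented for illustration and are not the commenter's actual CLI or DSL.

    # Hypothetical sketch: natural language placement in, structured data with absolute coordinates out.
    # The JSON shape and helper are made up; they only illustrate the "LLM emits relationships,
    # a downstream step resolves numbers" pattern described above.
    import json

    instruction = "place U1 to the left of U4, 50mm away"   # what the planning model says

    placement = {                                           # what the structured-output model returns
        "op": "place",
        "ref": "U1",
        "anchor": "U4",
        "relation": "left_of",
        "offset_mm": 50.0,
    }

    def to_absolute(cmd, positions):
        """Resolve a relative placement against known absolute board coordinates (mm)."""
        ax, ay = positions[cmd["anchor"]]
        if cmd["relation"] == "left_of":
            return (ax - cmd["offset_mm"], ay)
        raise ValueError(f"unhandled relation: {cmd['relation']}")

    print(json.dumps(placement, indent=2))
    print(to_absolute(placement, {"U4": (120.0, 80.0)}))    # -> (70.0, 80.0)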
Thanks for the input! Haven't done much with Altium but it seems like you get at least somewhat of a boost for it being slightly more about the logic and less about the spatial reasoning.
2 things related to what you said I hadn't put in the original post:
1. In our experience, the LLMs were awful at taking actions directly with any of the SolidWorks API scripting formats (C#, VBA, etc.). Probably 75% of what they wrote just failed to run, and even when they had access to browse the documentation it wasn't much better. If you're getting Opus or anything else to interact with SolidWorks from the CLI, can you say more about how you're getting it to interface effectively?
2. The LLMs are indeed surprisingly bad at spatial reasoning unless prompted specifically and individually. The most notable case of this is when they need to choose the right plane to sketch on. When creating revolve features, they'll often choose the face that would've only worked if they were going to extrude rather than revolve, and when creating sweeps they'll often try to put the sketch that's going to be swept on the same plane as the path that's being swept. If you go back and ask them why they did that and point out that it's wrong, they can fix it pretty fast, but when left to their own devices they often get quite stuck on this.
> If you're getting Opus or anything else to interact with SolidWorks from the CLI, can you say more about how you're getting it to interface effectively?
I just have a Solidworks plugin that translates CLI calls to JSON to Solidworks API calls and back again.
What really started working was creating a bunch of high level CLI commands so that Claude Code could query the part/assembly by asking stuff like "What is the closest distance between the southmost surface of Extrusion1 and the surface of Cylinder2" which would then be translated to a specific high level command or a bunch of lower level commands by Gemini 3 Flash. Those would then be translated to Solidworks API calls, as would any editing commands. It also really helps to give it the ability to parametrize the queries so instead of "all features smaller than 2mm" it can say "all features smaller than $MinFeatureSize", with some logic and a downstream LLM to translate that parameter into values in the design and review it with the human in the loop before committing it to the project.
The key is to absolutely minimize how often the LLMs think about numbers and have them think in relationships instead. The hard part is translating those relationships back to the CAD API calls, but LLMs are much better at not hallucinating if you resolve all the parametrized equations last.
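As a toy illustration of "resolve the numbers last" (the parameter names and query format here are invented, not the actual plugin):

    # Invented example: the model emits a relationship with named parameters; concrete values
    # are substituted only at the end, with the human reviewing the parameter table.
    design_params = {"MinFeatureSize": 2.0, "WallThickness": 3.0}   # mm, chosen up front

    def resolve(expr: str, params: dict) -> float:
        """Swap $Name tokens for concrete values, then evaluate the arithmetic."""
        for name, value in params.items():
            expr = expr.replace(f"${name}", str(value))
        return eval(expr, {"__builtins__": {}})   # fine for a toy sketch, not for untrusted input

    query = "all features smaller than $MinFeatureSize"            # what the LLM says
    threshold = resolve("$MinFeatureSize", design_params)          # what the CAD call receives
    print(query, "->", threshold, "mm")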
Interesting! How did you make Claude "know" the PCB layout/schematic specifics? Is just giving it a "reference prompt" enough to produce any interesting results? One thing that comes to mind is to use LLMs' understanding of SVG to build a "spatial representation" of the board layout by presenting components and tracks as SVG primitives of different types, etc.
I imagine a practical "PCB Design Copilot" would come from specialized models, trained from the ground up on a large design dataset. There is no GitHub for free professional grade PCB design sources to scrape though.
> How did you make Claude "know" the PCB layout/schematic specifics? Is just giving it a "reference prompt" enough to produce any interesting results?
I have a mini query language in the CLI that implements a lot of spatial queries, both structured and via prompts (another LLM translates the prompt to a structured query), against the Altium file format and an intermediate representation I keep. Most queries and editing commands use relative positioning ("to the left of"), units ("right edge minus 10mm"), and parameters (U1.P1.Top + MinSpacing * 5), etc. The LLM rarely needs to use concrete units because it's mostly parametrized by component clearances and design rules - I just choose some numbers at the beginning like board size and layer stackup (mostly set by the fab).
The CLI has over a hundred subcommands and I use Claude skills to split up the documentation, but the agent actively explores the schematic and PCB itself. The Claude skills include instructions to use the measurement subcommands to sanity check after making a move or when doing a review at the end, although I'm in the process of implementing a GPU-based design rule checker. My Solidworks interface works the same way, but there are many more "verbs" there for the LLM to manage.
At the end of the day it's mostly just orchestrating another tool which does most of the spatial logic and calculations. It's definitely easier with Altium than Solidworks so far.
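For flavor, here's a tiny invented resolver for the kind of dotted references and relative expressions described above ("U1.P1.Top + MinSpacing * 5"); it's illustrative only and not the commenter's actual query language.

    # Invented sketch of resolving a relative expression like "U1.P1.Top + MinSpacing * 5"
    # against an intermediate representation of the board. Names and geometry are made up.
    pads = {"U1.P1": {"Top": 42.0, "Left": 10.0}}   # pad edges in mm, from the IR
    rules = {"MinSpacing": 0.2}                     # design rules set up front (mm)

    def lookup(token: str) -> float:
        if "." in token:                            # dotted pad reference, e.g. U1.P1.Top
            ref, edge = token.rsplit(".", 1)
            return pads[ref][edge]
        return rules[token]                         # otherwise a named design rule

    # "U1.P1.Top + MinSpacing * 5" resolved by hand for clarity:
    y = lookup("U1.P1.Top") + lookup("MinSpacing") * 5
    print(y)   # 43.0 mm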
Very interested in your workflow - do you have anything online to share? Also looked into automating altium more and found having to do a lot of GUI work to guide the models along. How much of going from ‘design in head’ to schematic and layout have you automated?
> Very interested in your workflow - do you have anything online to share?
Not yet but once I'm ready it's all going to be open source.
> Also looked into automating altium more and found having to do a lot of GUI work to guide the models along.
Have you tried the Windows UI Automation/Accessibility APIs? You can download Accessibility Insights for Windows to see the data structure and it's well supported by Altium. It has everything you need to tell the LLM what's on screen without ever sending a screenshot (except for the 2d/3d CAD view) and the UIA provides an API that can actually click, focus, etc. without sending fake keyboard events manually. When reverse engineering the file format I put Opus on a loop and it just kept fiddling with the binary file format until Altium stopped throwing parsing errors.
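If it helps anyone, a minimal pywinauto sketch of reading the UIA tree (assuming pywinauto is installed and the Altium window title contains "Altium Designer"; adjust to taste):

    # Minimal UIA sketch: enumerate controls so an LLM can be told what's on screen
    # without a screenshot. Assumes `pip install pywinauto` and a running Altium instance.
    from pywinauto import Desktop

    win = Desktop(backend="uia").window(title_re=".*Altium Designer.*")
    for ctrl in win.descendants(control_type="Button"):
        print(ctrl.element_info.control_type, repr(ctrl.window_text()))

    # Real programmatic interaction, no fake keystrokes:
    # win.child_window(title="OK", control_type="Button").click_input()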
> How much of going from ‘design in head’ to schematic and layout have you automated?
At this point everything but the final PCB routing. My autorouter and autoplacer are very much a work in progress and LLMs aren't great at routing complex traces, but they can brute force it given enough time and DRC checks. Right now I just shell out to autorouters like Specctra with a DSN and route the most important nets by hand in Altium. Since the LLM sets up all the design rules and can spend hours on a loop figuring out the optimal placement, it's usually a breeze. Especially compared to when I first started out in EE and spent days with part selection and footprint capture. Now a lot of that tedium is completely automated.
Soon I'll be integrating probe-rs into the whole workflow and making end-to-end vibe PCBs, adding support for FPGAs, and integrating Solidworks so it does enclosures and molds too.
> the LLMs aren't as good at making 3D objects as they are at writing code
I am still hoping that OpenSCAD or something similar can grab hold of the community. OpenSCAD needs some kind of npm, as well as imports for McMaster-Carr parts and the like, but I think it could work.
Would you use something like this if it worked well in Fusion 360? We chose to start with SolidWorks because when talking with people in mechanical engineering, almost everyone was using SolidWorks and no one even mentioned Fusion (despite online surveys saying it's like 45/45).
The people who use Solidworks buy Solidworks to use Solidworks. You'll go up to one of them and they'll have the whole model done in like 8 key presses before you even finish typing in your prompt.
All the people who aren't professional CAD users have Fusion, which they can get for free. They would probably benefit the most from text-to-model AI, but would probably also be the least willing to pay.
Solidworks is the GOAT though
It's definitely interesting, but the demo of the coffee mug has a lot of flaws. Are there some concrete examples you can think of where the hosted LLMs really shine at this problem?
I've watched the video a couple times and the only thing I can see that it does wrong is the fillets on the handle (and maybe the way it used a spline & sweep for the handle could have been improved but it's no worse than you'd see from a new Solidworks user).
Honestly, the out-of-the-box models aren't great at CAD. We were mostly trying to figure out (1) how well it could do with the best harness we could give it and (2) whether people would want and use this if it worked well.