A Codebase That Makes Codebases

thunky · on May 10, 2023

The struggle I have with project generators is that you can only run them once.

If you didn't think you needed the billing module when you ran it but later you find that you need it, the generator is of little use. You end up running it again to create a new project and manually copying in the billing pieces.

Or, if the generator is upgraded after you used it, it's to late for you.

I wonder if this tool tries to mitigate those issues at all.

r3trohack3r · on May 10, 2023

We went down this path on a project I worked on at Netflix and, in hindsight, I consider it a mistake. The result was a massive number of services with dashboards, metrics, code bases, etc. all stuck in a snapshot of time. Upgrades were manual, tedious, and error prone. As a central team, that work scaled linearly with the number of services that used our generator.

Pulling all of that down into a module that is managed and versioned allows you to upgrade your users. I now consider that almost table stakes for anything I build/manage as a platform team.

sramam · on May 10, 2023

Have you found good examples of this in other languages/frameworks? I'd be very interested in links/references.

robertlagrant · on May 10, 2023

Exactly. Generation of anything without round-tripping is going to be a problem. Libraries > frameworks :D

czue · on May 10, 2023

Yeah, you can re-run it, but then you have to merge the code. It's often not too bad, but can be tedious and complicated to figure out the first time around.

What I recommend people do is immediately commit the "clean" generated project to a new branch. Then you can apply an upgrade directly onto that branch, so it only has the "pure" diff. Then you can merge that to your main branch.

There's more info here, and a video walkthrough if you're curious: https://docs.saaspegasus.com/upgrading.html#using-branches-r...

I've dreamed of automating it more, but would have to build an entire github integration to manage other people's compiled projects...

RangerScience · on May 10, 2023

Have you ever looked into Rails generators? They’re meant to be run over the lifetime of the project, and many libraries (specifically thinking of Devise, since it’s a large feature) have “install me” generators.

noworriesnate · on May 10, 2023

Interesting, thanks for the tip! I think what grandparent may have been referring to is if you run a generator with one set of arguments, make changes to the generated code, then you want to rerun the generator to regenerate the code with a different set of arguments.

For example if you have a generator that has an argument for which logging library to use, you might want to change the logging library later. Maybe a bad example.

I think this is where LLMs can thrive. Has anyone investigated using LLMs for resolving merge conflicts?

Ethan_Mick · on May 10, 2023

I was thinking of building this for Next.js in a very similar way. I love the fact that this exists for a different stack! I think your implementation of the project template is definitely the way to go. As a customer, when I buy something to jump-start me, I want it to be as clean and simple as possible. Generating the project from a configuration handles that nicely.

Outside of the generator itself, what have you found most beneficial for converting customers and convincing people to buy? Documentation? Videos? The community access?

czue · on May 10, 2023

Yeah, as I mention in the post, I started without the template option and just couldn't stomach how much extra cruft would be lying around...

Re: converting, it's a mixed bag and often hard to know. But some factors that customers have mentioned:

1. Positive comments (there have been a few threads about Pegasus/boilerplates on HN and Reddit where people have had good things to say).

2. Some trust in me. I have a blog, write Django guides, publish YouTube videos, am active on Twitter, have been on podcasts, etc. I think people realizing that the creator is a serious person who has at least some credibility goes a long way.

3. I think the high-quality docs and regular release history help a lot to convince people it's a solid product.

Those are the first ones that come to mind. But honestly, I'm lucky if I find out why a customer picked me! And I'm sure some just google "django saas boilerplate" and buy the first thing that comes up, which, thankfully, at the moment is Pegasus.

Ethan_Mick · on May 10, 2023

Thanks for the response! It sounds like (in a good way) there's no shortcut. You build something, make it better, make it great, and talk about it for a long time to establish credibility.

Probably a good lesson in there for everyone.

eclipticplane · on May 10, 2023

For the first Jinja2 template pass, you could swap the Jinja2 delimiters rather than mess with {% raw %} tags: https://jinja.palletsprojects.com/en/3.1.x/api/#jinja2.Envir...

czue · on May 10, 2023

Whoa, are you saying I can tell Jinja to use some other combination of tag markers instead of {% %} and {{ }}? I had no idea you could do that. Where were you four years ago?!

Seriously though, thanks. That sounds like a much better option. Wasn't totally clear how to do it from the link you sent but will look closer into it.

eclipticplane · on May 10, 2023

For Django specifically, I believe you need to pass in a dotted path to a method that returns a Jinja2 `Environment`. There's an example in the docs: https://docs.djangoproject.com/en/4.2/topics/templates/#djan...

And you'd set something like `Environment(block_start_string="[%", block_end_string="%]")`

czue · on May 10, 2023

Awesome, thanks! I think I'll have to do the equivalent thing in cookiecutter, but this gives me a good path to go down.

czue · on May 10, 2023

Hello, author here. Happy to answer any questions about the post if you have them.

williamdclt · on May 10, 2023

I've built a "project generator" that worked pretty much exactly the same way (using plop.js rather than cookiecutter), generating various parts of various stacks (Django/NestJS/React/Elastic beanstalk/React Native...).

It's simple enough, as you say it's mostly templating, but the hardest part I found is maintaining dependencies. I needed to keep all dependencies up-to-date, which was a big amount of manual work. For something like React-native, rather than templating I was generating a project using create-react-native-app then applying patch files.

How do you manage this?

Also, how's the Pegasus devx? Another annoying part of templating is that I was never really writing Python or Javascript, I was writing Jinja, so syntax highlighting or autoformatting didn't really work well or at all.

MoreQARespect · on May 10, 2023

>It's simple enough, as you say it's mostly templating, but the hardest part I found is maintaining dependencies. I needed to keep all dependencies up-to-date, which was a big amount of manual work. For something like React-native, rather than templating I was generating a project using create-react-native-app then applying patch files. How do you manage this?

I did something like this before by building integration tests against the templater project and performed various smoke tests on them.

The integration tests I would schedule once or twice a week on the latest dependencies.

It was somewhat complicated by the fact that often plugin B to framework A required a maximum of version 4.1 for framework A, so at some point you had to make a decision:

* Wait for plugin B to catch up to framework A before upgrading framework A.

* Drop support for plugin B.

It really helped having the tests run periodically to get quick notification when something broke, coz then you could quickly adjust the dependencies (e.g. change >=3.5 to >=3.5, <4.1).

czue · on May 10, 2023

Yeah, dependencies are a pain. I'll do security patches ~immediately, and have gotten into a ~quarterly rhythm of just making a big push to upgrade everything all at once. Usually it's pretty painless, but sometimes it opens up a big rabbit hole. But I've found the more I stay on top of them, the easier it is. Big version jumps are always hard/scary.

Re: devx... it kinda sucks, but I've made it work. Honestly, that could be a whole follow up blog post. But basically I do dev in a generated project, and then have some scripts that use git patches to apply the changes back into Pegasus itself. After that I have to add the cookiecutter/templating logic. I learned early on to do as little dev as possible in the actual generator repo, because - as you said - nothing works.

RangerScience · on May 10, 2023

Hi hi! Nice work :)

Thinking over some of the other conversation here, about the difficulties in the devx what with the templates being invalid python -

Could you pull some shenanigans with https://docs.python.org/2/library/inspect.html#inspect.getso... and siblings?

I'm thinking - at least in some places - you might be able to have lists of functions/modules/etc to get copied, rather than lists of templates to get rendered. Might even be able to do some code-as-strings manipulation during the copy.

The entire point would be so that the code from Pegasus that ends up in the generated project is valid Python code from the get-go, rather than only one it's rendered.

czue · on May 10, 2023

Interesting idea. I could see that being a viable option, though not necessarily any easier to maintain than the current situation.

japhyr · on May 10, 2023

Hi Cory, this was a fantastic writeup! Your explanation of the value of generating a codebase for people instead of having them use a project template is compelling; it articulates that aversion I've always felt towards large templated projects.

The section about testing is particularly interesting to me. I'm working on django-simple-deploy [0], a project that automates people's initial deployments of a Django project. When people first hear about this project, they assume it's just for beginners. But when a lot of people started migrating away from Heroku, we saw how much work it is even for experienced developers to read through a new platform's docs and configure their project correctly for deployment. simple_deploy does all that configuration in one pass, and then all of the platform-specific configuration is contained in one commit.

A fun question I've shared when talking about this project is "How do you test a standalone management command, whose goal is to act on a variety of Django projects?" In this project there's no settings file, no models, no urls, no views, etc. The project also needs to support multiple dependency management systems, multiple target platforms, and since it works on the user's local system it needs to be cross platform as well.

To deal with this efficiently, the unit test suite copies a sample project to a temp directory. It then builds a venv, and runs a commit. It runs simple_deploy against the project, and then does as many tests as it can against the configuration that was done for that project. Then, instead of starting over, it uses git to reset the sample project to the original state. This lets it run ~90 tests in about 10 seconds.

Even more interestingly, that temp project is still available when the test suite finishes. pytest takes care of deleting it at some point, but not immediately. So, debugging can be really efficient. If you run `pytest -x`, you can open that test project and poke around to see what didn't work. The setup uses an editable install, so you can modify simple_deploy, reset the test project manually, and run the command again (outside of the test suite).

pytest is an amazing tool. Just yesterday I wrote a plugin that lets you run `pytest -x --open-test-project`. When the test suite exits, it pops open a new terminal window at the test project, with an active virtual environment and the output of `git status` and `git log`. This turns the test suite into a development tool.

I'm curious if you use your test suite for development in any way, or if it's just for catching bugs and regressions? Also, with that tree of configuration options, have you integrated randomization into your tests at all? That seems like it might be a good way to hit branches you haven't tried before, but then again the majority of the 33M branches are probably not meaningful to test.

- [0] https://django-simple-deploy.readthedocs.io/en/latest/

czue · on May 10, 2023

Wow, that sounds like a really nice test setup! And makes sense that simple-deploy would be a very complicated thing to properly test.

For now the tree-like test suite is mostly for regressions. The inner test suite for the built projects is useful for dev, but not the one that does all the permutations.

Randomization is a fun idea! It would be interesting to randomize some number of runs with each build. I wonder if that would catch anything. Biggest problem is keeping my Github Actions bill manageable...

rtetryrtyhrt · on May 11, 2023

[flagged]

psnehanshu · on May 11, 2023

Spam alert!