
Scenarios where an IDE with full syntactic understanding is better:

- It's your day to day project and you expect to be working in it for a long time.

Scenarios where grepping is more useful:

- Your language has #ifdef or equivalent syntax which does conditional compilation, making syntactic tools incomplete.

- You just opened the project for the first time.

- It's in a language you don't daily drive (you write backend but have to delve into frontend code, it's a 3rd party library, it's configuration files, random json/xml files or data)

- You're editing or searching through documentation.

- You haven't even downloaded the project and are checking things out in github (or some similar site for your project).

- You're providing remote assistance to someone and you are not at your main development machine.

- You're remoting via SSH and have access to code there (say it's a python server).

Yes, an IDE will save you time daily driving. But there's no reason to sabotage all the other use cases.



Further important (to me) scenarios that also argue for greppability:

- greppability does not preclude IDE or language server tooling; there are often special cases where only certain e.g. context-dependent usages matter, and sometimes grep is the easiest way to find those.

- projects that include multiple languages, such as for instance the fairly common setup of HTML, JS, CSS, SQL, and some server-side language.

- performance in scenarios with huge amounts of code, or where you're searching very often (e.g. in each git commit for some amount of history)

- ease of use across repositories (e.g. a client app, a spec, and a server app in separate repos).

I treat greppability as an almost universal default. I'd much rather have code in a "weird" naming style in some language but with consistent identifiers across languages, than have normal-style-guide identifiers in each language but differing identifiers across languages. If code "looks weird", that's often actually a _benefit_ in such cases, not a downside: most serialization libraries I use for this kind of stuff do a lot of automagic mapping that can break in ways that are hard to detect at compile time if somebody renames something, or sometimes even just changes casing or a type. Having that fragility visible at a glance, even in dynamically typed languages, is a nice side effect. Very speculatively, I wouldn't be surprised if AI coding tools deal with consistent names better than context-dependent ones too; greppability is likely not just about the tool grep itself.
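To make that concrete, a rough Python sketch (all names invented) of what I mean by keeping the wire/DB identifier verbatim in code instead of relying on automatic renaming:

    from dataclasses import dataclass

    @dataclass
    class Order:
        # Same identifier as the JSON payload and the DB column, so
        # `grep -r shipping_address_id` finds producer, consumer and schema.
        shipping_address_id: int

    def parse_order(payload: dict) -> Order:
        # No automagic camelCase <-> snake_case mapping to silently
        # break when somebody renames something.
        return Order(shipping_address_id=payload["shipping_address_id"])

The "weird" snake_case field in otherwise conventionally named code is exactly the at-a-glance hint that the name is load-bearing across a boundary.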

And the best part is that there's almost no downside; it's not like you need to pick either a language server, IDE or grep - just use whatever is most convenient for each task.


I'd say that writing this example from the article is a pretty big downside, especially if you had e.g. 20 different cases and not just 2.

```ts
const getTableName = (addressType: 'shipping' | 'billing') => {
  if (addressType === 'shipping') {
    return 'shipping_addresses'
  }
  if (addressType === 'billing') {
    return 'billing_addresses'
  }
  throw new TypeError('addressType must be billing or shipping')
}
```


Grep is also useful when IDE indexing isn't feasible for the entire project. At past employers I worked in monorepos where the sheer size of the index caused multiple seconds of delay in intellisense and UI stuttering; our devex team's preferred approach was to better integrate our IDE experience with the build system such that only symbols in scope of the module you were working on would be loaded. This was usually fine, and it works especially well for product teams, but it's a headache when you're doing cross-cutting work (e.g. for infrastructure projects/overhauls).

We also had a livegrep instance that we could use to grep any corporate repo, regardless of where it was hosted. That was extremely useful for investigating failures in build scripts that spanned multiple repositories (e.g. building a Go sidecar that relies on a service config in the Java monorepo).


If you run into this, make sure to enable 64-bit IntelliSense and increase the RAM limit; by default it is 4 GB.


As someone who runs into that daily, I'm surprised I never heard of this before.

I seem to have found the 64-bit mode under "Tools > Options" then "Text Editor > C/C++ > IntelliSense". The top option is [] Enable 64-bit IntelliSense.

But I can't seem to find the RAM limit you mentioned, and searching for it just keeps bringing up stuff related to VS Code. Do you know where it is off the top of your head, or of a page that might describe it?


The RAM limit is because IntelliSense is a 32-bit process: 2^32 is 4 GiB.

Edit: I take that back, this was a first-principles comment. There's a setting 'C_Cpp: Intelli Sense Memory Limit' (space included).


Thanks for that. Searching Google for it only led me to VS Code's IntelliSense settings, and searching for the "Intelli Sense Memory Limit" setting in Visual Studio didn't lead me right to the result either, but it did give me a whole settings page that "matched". The setting in Visual Studio turns out to be "IntelliSense Process Memory Limit", which is under "Text Editor > C/C++ > Advanced", under the "IntelliSense" header towards the bottom of the section.


> It's your day to day project and you expect to be working in it for a long time.

I don't think we need to restrict the benefits quite that much—if it's a project that isn't my day-to-day but is in a language I already have set up in my IDE, I'd much prefer to open it up in my IDE and use jump to definition and friends than to try to grep and hope that the developers made it grepable.

Going further, I'd equally rather have plugins ready to go for every language my company works in and use them for exploring a foreign codebase. The navigation tools all work more or less the same, so it's not like I need to invest effort learning a new tool in order to benefit from navigation.

> Yes, an IDE will save you time daily driving. But there's no reason to sabotage all the other use cases.

Certainly don't sabotage, but some of these suggestions are bad for other reasons that aren't about grep.

For example: breaking the naming conventions of your language in order to avoid remapping is questionable at best. Operating like that binds your business logic way too tightly to the database representation, and while "just return the db object" sounds like a good optimization in theory, I've never not regretted having frontend code that assumes it's operating directly on database objects.
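A minimal sketch of the kind of explicit boundary I'd rather have (field names made up for illustration):

    def address_row_to_api(row: dict) -> dict:
        # Deliberate, visible mapping: if a column is renamed, this breaks
        # here, loudly, instead of somewhere deep in frontend code that
        # assumed it was holding a raw database row.
        return {
            "addressId": row["address_id"],
            "street": row["street_line_1"],
            "city": row["city"],
        }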


> if it's a project that isn't my day-to-day but is in a language I already have set up in my IDE, I'd much prefer to open it up in my IDE and use jump to definition and friends than to try to grep and hope that the developers made it grepable.

It's funny, because my preference and actual use is the exact opposite: for a project that isn't my day-to-day, I'm much more likely to try to grep through it rather than open it in an IDE.


> if it's a project that isn't my day-to-day

Another overlooked advantage of greppability is to be able to fuzzy the search, or discover related code that wasn't directly linked to what you were looking for.

For instance, if you were hunting for the method updating a `foo_bar` instance, grepping for it will also give you instances of `generic_foo_bar` and `shim_foo_bar`. That can be noise, but it can also be stuff you wouldn't have seen otherwise that saves your bacon. If you're not familiar with a project, I think it's quite an advantage.

> hope that the developers made it grepable

hopefully it's enforced at an organization level.


- You're fully aware that it would be better to be able to use tooling for $THING, but tooling doesn't exist yet or is immature.


you would not believe the amount of time i spent pretty-printing python dicts by hand last week



yeah, pprint is why i was doing it by hand ;)


I used to pipe things through black for that. (a script that imported black, not just black on the command line.)

I also had `j2p` and `p2j` that would convert between python (formatted via black) and json (formatted via jq), and the `j2p_clip`/`p2j_clip` versions that would pipe from clipboard and back into clipboards.

It's worth taking the time to build a few simple scripts for things you do a lot. I used to open up the repl and import json to convert between json and python dicts multiple times a day, so spending a few minutes throwing together a simple script to do it was well worth the effort.
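For the curious, a minimal sketch of the core conversion (without the black/jq formatting or the clipboard plumbing) might look like:

    #!/usr/bin/env python3
    # j2p/p2j core: stdin -> stdout conversion between JSON text and
    # Python literals. A fuller version would pipe the output through
    # black or jq afterwards.
    import ast, json, pprint, sys

    def j2p():
        # JSON in, Python literal out (true/null become True/None).
        pprint.pprint(json.load(sys.stdin))

    def p2j():
        # Python literal in, JSON out.
        print(json.dumps(ast.literal_eval(sys.stdin.read()), indent=2))

    if __name__ == "__main__":
        p2j() if sys.argv[1:] == ["p2j"] else j2p()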


part of what i ended up with was this:

    {'country': ['25', '32', '6', '37', '72', '22', '17', '39', '14', '10',
                 '35', '43', '56', '36', '110', '11', '26', '12', '4', '5'],
     'timeZone': '8', 'dateFrom': '2024-05-01', 'dateTo': '2024-05-30',

black is the opposite extreme from what i wanted; https://black.readthedocs.io/en/stable/the_black_code_style/... explains:

> If a data structure literal (tuple, list, set, dict) or a line of “from” imports cannot fit in the allotted length, it’s always split into one element per line.

i'm not interested in minimizing diffs. i'm interested in being able to see all the fields of one record on one screen—moreover, i'd like to be able to see more than one record at a time so i can compare what's the same and what's different

black seems to be designed for the kind of person who always eats at mcdonald's when they travel because they value predictability over quality


My understanding of black is that it solves bikeshedding by making everyone a little unhappy.

For aligned column readability and other scenarios, # fmt: off and # fmt: on become crucial. The problem is that like # type: ignore, those start spreading if you're not careful.
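For the aligned-column case that looks something like this (made-up data):

    # fmt: off
    TRANSITIONS = [
        # state      event     next_state
        ("idle",     "start",  "running"),
        ("running",  "pause",  "paused"),
        ("paused",   "start",  "running"),
    ]
    # fmt: on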


My only complaint with black is that it only splits long definitions into one element per line if they exceed a limit. That's probably configurable, now that I write it down.

Other than that, I actually quite like its formatting choices.


Line length is definitely configurable. All it takes is adding the following to pyproject.toml[1]:

  [tool.black]
  line-length = 100

Aside from matrix-like or column-aligned data, the only truly awful thing I've encountered has been broken f-string handling[2].

[1]: Example from https://github.com/pythonarcade/arcade/blob/808e1dafcf1da30f...

[2]: https://github.com/psf/black/issues/4389


yeah; unless your coworkers are hindu, you can solve 'bikeshedding' about which restaurant to go to by going to mcdonald's, too


Fair. I spent some time trying to figure out how to make it do roughly that before giving up.


i kind of get the vibe from the black documentation that it's written by the kind of person who thinks we're bad people for wanting that, and perhaps that everyone should wear the same uniform because vanity is sinful and aesthetics are frivolous


we keep having similar problems, lol.


You forgot massive codebases. Language servers really struggle with anything on the order of the Linux kernel, FreeBSD, or Chromium.


I honestly suspect that the amount of time spent dealing with the issues monorepos cause is net larger than the gains most get from what a monorepo offers. It's just harder to measure, because it tends to degrade slowly, affect things you didn't realize you were relying on (until you need them), and leave no clear way to point fingers at the cause.

Plus it means your engineers don't learn how to deal with open-source code concerns, e.g. libraries, forking, dependency management. Which gradually screws over the whole ecosystem.

If you're willing to put Google-scale effort into building your tooling, sure. Every problem is solvable. Only Google does that though, everyone else is getting by with a tiny fraction of the resources and doesn't already have a solid foundation to reduce those maintenance costs.


The projects mentioned were all single projects with single repos.


Sure. But those are far from the only massive codebases out there, and many of the biggest are monorepos because sorta by definition they are the size of multiple projects.


IMO the biggest problem with big projects is finding the right file and finding the right lines of code, so greppability is even more helpful with big repos.


clangd works fine for me with the linux kernel. For best results build the kernel with clang by setting LLVM=1 and KERNEL_LLVM=1 in the build environment and run ./scripts/clang-tools/gen_compile_commands.py after building.


Ok, but now every time you switch commits you have to wait for clangd to reindex. Grepping in the kernel is just as fast, and you can do it without running a 4 GB+ process that takes 10+ minutes to index.


Sure. I wasn't intending to claim that there is no reason to care about greppability. Just providing some tips about getting clangd to work with linux for those who might find that useful.


> It's your day to day project and you expect to be working in it for a long time.

Bold of everyone here to assume that everyone has a day-to-day project. If you're a consultant, or for other reasons you're switching projects on a month-to-month basis, greppability is probably the top metric, second only to UT coverage.


They said the scenario in which that would be useful was IF: "It's your day to day project and you expect to be working in it for a long time". The implication being that if neither of those hold then skip to the next section.

I don't think anyone is assuming anything here. I've contracted for most of my career and this didn't seem like an outlandish statement.

Also, if you're working in a project for a month, odds are you could set up an IDE in the first few hours. Not sure how any of this rises to the level of being "bold".


> - Your language has #ifdef or equivalent syntax which does conditional compilation making syntactic tools incomplete.

You need a better IDE.

> - You just opened the project for the first time.

Go grab a coffee

> - It's in a language you don't daily drive

Jetbrains all products pack, baby.

> - You haven't even downloaded the project and are checking things out in github (or some similar site for your project).

On GitHub, press `.` to open it in a web-based vscode. Download it & open it in your IDE while you are doing this.

> - You're remoting via SSH and have access to code there (say it's a python server).

Don't do this. Check the git hash that was deployed and check out the code locally.


> You need a better IDE.

No IDE will resolve this if the same code is preprocessed more than once to produce different files; or if you often build with different and conflicting preprocessing values; or if your build uses a tool your IDE doesn't know about; or if some of the preprocessing and compilation occurs at runtime.

> Go grab a coffee

So, you're saying "wait".

> Jetbrains all products pack, baby.

JetBrains CLion won't even try to properly index C/C++ files which aren't officially part of the project.

Plus, if you have errors in some places - which happens while you're editing your code - things break very badly with JetBrains IDEs, e.g. a missing closing paren or #endif.


> - You're remoting via SSH and have access to code there (say it's a python server).

VSCode SSH Extension for the win.


"I've not personally found the things you mentioned to ever be a problem, therefore they aren't actually real problems and you just need to get good"


> - Your language has #ifdef or equivalent syntax which does conditional compilation making syntactic tools incomplete.

LSP-based tools are generally fine with this (as long as compile_commands.json or equivalent is available). A purely syntactic understanding is an incomplete solution; I suspect GP meant LSP.

Many of those other caveats are non-issues once LSPs are widespread. Even GitHub has LSP-like go-to-def/go-to-ref, though it's not perfect.


"Go to definition" often doesn't work in dynamic languages like Python without type hints; it might not work when the code is dynamically generated.


It always works in VSCode if your environment is correctly configured.


No; Python is a dynamic language, and when you see a method call like

    x.do_something()

you might not know what the type of x is or which method do_something refers to.
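A toy illustration (invented class names):

    class Emailer:
        def do_something(self): ...

    class Logger:
        def do_something(self): ...

    def handle(x):            # no annotation: x could be either, or anything else
        x.do_something()      # which definition should "go to definition" pick?

    def handle_typed(x: Emailer):
        x.do_something()      # with the hint, this resolves to Emailer.do_something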


> Your language has #ifdef or equivalent syntax which does conditional compilation making syntactic tools incomplete.

Your other points make sense, but in this case, at least for C/C++, you can generate a compile_commands.json that will let clangd interpret your code accurately.

If building with make, just do `bear -- make` instead of `make`. If building with CMake, pass `-DCMAKE_EXPORT_COMPILE_COMMANDS=1`.


Does it evaluate macros? Because macros allow for arbitrary computation.


The macros I see in the real world seem to usually work fine. I’m sure it’s not perfect and you can construct a macro that would confuse it, but it’s a lot better than not having a compilation db at all.


- you just switched branch/rebased and the index is not up to date.

- the project is large enough that the IDE can't cope.

- you want to also match comments, commented-out code, or in-project documentation

- you want fuzzy search and match similarly named functions

I use clangd integration in my IDE all the time, but often brute force is the right solution.



