Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> Really, many power users of Excel ought to be looking at more capable, testable and readable solutions from professional data analysts and scientists. Python would be a good start.

I can write python, but I can get the answer out of excel in a fraction of the time for most use cases.

As an added bonus, excel documents are far more ubiquitous and I can share them with clients, and they can see my workings. Python isn’t very transparent and business users can’t usually check the logic.

Excel is also much more interactive and allows for much more “discovery” and playing with the data.

What I would say though, is most power users don’t know about functionality in excel like PowerQuery, relational data models and DAX that actually do turn it into a serious and repeatable tool.

(I’m not arguing that Python isn’t better for many use cases, and excel definitely isn’t right for most critical applications, but I also think Python is much slower for ad-hoc data analysis so both have different places in the market. Anyone think python is faster? I’m down for a race!).



> they can see my workings

How? Do they check that every cell in a column actually has the same formula? Do they check data format everywhere?

Python script is something that is possible to be actually reviewed, and results - reproduced, and "formula's" there are actually readable.

But for some stupid reason excel files are still shared over email.


It's only "stupid" if one suffers from a catastrophic lack of empathy for people who are not programmers and for the incentives to which they are exposed--which do not include using a software developer's preferred tools, nor do they include the carving out of time with which to learn them.

I would hope that we would be generally wiser than that here.


Like being a "programmer" is some genetic trait or something, or like using excel in any more or less productive manner doesn't require carving out hell a lot of time, it's just that time is taken from user's lives in small pieces, and doing PROGRAMMING seems like taking a university course.

I actually have a lot of empathy for people who forced to deal with all those problems, otherwise I just wouldn't care.


> Do they check that every cell in a column actually has the same formula?

They just have to check the top cell as most of my formulas are array formulas.

You don’t have to drag a formula down - that’s a common misconception in the latest versions. If you do =A1 + B1 and want to apply it to the 1000 cells below you just write = A1:A1000 + B2:B1000.

That’s still not that readable though, so I’ll apply those cells two named ranges “Sales” and “Taxes”.

Then the formula is = Sales + Taxes once and that will populate the whole column of data.

Then there’s M Code and PowerQuery which literally allows you to review the data cleaning line by line and even see the data state at any intermediary step. It also has

> Python script is something that is possible to be actually reviewed

The problem for me is that, as someone who works in consulting, it can’t be reviewed by my boss or a client, neither of whom can program. But they can review a tidy excel sheet.

And then they can’t edit it either, so if I go on holiday and I’ve built some sort of model nobody else can make progress until I come back, or if I move projects I’m also stuck maintaining the model on the old one.


That's terrible practice, ironically, as it's extremely unreadable. How could someone looking at it know what Sales and Tax actually are? You have to go into the formula name box and dig in to find sales = "yada yada" etc. That doesn't seem too bad until you have a decently sized file and you have to dig into 40 formulas to find the one you want, and go check that it's actually referencing what you want.

I work as a banker, and what you do is one of the very first things new employees are taught not to do in excel. It's an amazing solution for the person that built it, but a terrible one for anyone looking to check the work.


> That's terrible practice, ironically.

Can you point me to the 'best practice' guide you are referring to? What authority on excel standards said this?

Personally I tried to find articles saying it's not best practice by typing in "dont use named ranges best practice" or "named ranges in excel are bad" into google, but it mostly brings up articles stating that using named ranges is best practice and improves readability!

> How could someone looking at it know what Sales and Tax actually are?

If you really want to use that example, you would click in the formula bar and it will highlight the ranges, and colour code them automatically.

If it's on a different sheet, you just hit ctrl + g and type in the name, and it will take you directly to the cell it's linked to (which usually in my case, is linked to a sheet that contains all my model's assumptions in one place, each one with a named range describing what it is). It's much easier and quicker than going to =Assumptions!G52.

> I work as a banker, and what you do is one of the very first things new employees are taught not to do in excel.

Seems like a silly thing to teach people IMO. In my experience it makes formulas much more readable (both writer and reader), makes it much faster to build models, and cuts down errors substantially.

I personally find that formulas are much easier to review, because the named ranges provide some intent. If someone writes =(A2Assumptions!92)/Assumptions!91 I've got to really unpick it to work out if it's right, but if someone labels it =(A2Miles_Per_Hour)/Average_Miles_Per_Vehicle then I can see that the formula is wrong almost instantly.

Additionally if I want to write another formula using those values, I can just type it straight into the formula bar without having to go and click on the right cell reference in another sheet.


>Can you point me to the 'best practice' guide you are referring to? What authority on excel standards said this?

Well investment bankers tend to be pretty darn good at excel, and the banks I've been at would scorn you for doing it, as well as all the standardized training given out to fresh recruits to the industry. We had a buy-side deal recently fall through partially because the (fairly sophisticated) model the company selling itself used was absolutely unreadable and unaccountable. They did exactly what you said (naming), including with their assumptions. It was 20 sheets of unpenetrable mass, and we were all turned off by the fact that you couldn't follow it whatsoever, and was basically unaccountable.

>If you really want to use that example, you would click in the formula bar and it will highlight the ranges, and colour code them automatically. If it's on a different sheet, you just hit ctrl + g and type in the name, and it will take you directly to the cell it's linked to (which usually in my case, is linked to a sheet that contains all my model's assumptions in one place, each one with a named range describing what it is). It's much easier and quicker than going to =Assumptions!G52.

This how I can tell you've never seriously worked with excel, because that's MUCH slower than using the native/macabacus auditing tools. Adds up over thousands of times. And what if Sales isn't just a simple cell in another sheet, but is actually tied to a named formula itself, tied to a named formula itself, tied to another worksheet (with named formulas in them!). That's a deep rabbit hole and you to be going down with little transparency, where you're having to look up the formula manager to find whatever the fuck the named variables are actually referring to (what if someone follows your advice and names the tax rate as "Tax" and now ctrl-g doesn't work!)

>Seems like a silly thing to teach people IMO. In my experience it makes formulas much more readable (both writer and reader), makes it much faster to build models, and cuts down errors substantially.

It makes formulas much easier to read, but with zero accountability, and considering someone will very likely be looking at the excel at some point, with no idea what you did, they have to check each and every one out to make sure it's not bullshit. Plus, it doesn't really make modeling faster assuming you've built out your source numbers/assumptions well and use best practices (like A24+A25+A26 instead of A25+A26+A24).

>I personally find that formulas are much easier to review, because the named ranges provide some intent. If someone writes =(A2Assumptions!92)/Assumptions!91 I've got to really unpick it to work out if it's right, but if someone labels it =(A2Miles_Per_Hour)/Average_Miles_Per_Vehicle then I can see that the formula is wrong almost instantly.

Use tracing, and if the assumption tab is well built out it really isn't faster, at all. +alt w,n

>Additionally if I want to write another formula using those values, I can just type it straight into the formula bar without having to go and click on the right cell reference in another sheet.

Use multiple windows, makes life much easier.


> Well investment bankers tend to be pretty good at excel

> This is how I can tell you have never seriously worked in excel.

Ah, so the style guide and best practice is based on “Trust me - I rock and you suck”.

I can guarantee that I do use excel seriously, and my personal experience is the exact opposite. You wouldn’t see beautiful code with variable names like A2, and it’s exactly the same for me with excel.

Besides - a feature you can turn off in about 5 minutes stopped you from doing a deal?! Why not just explode the named ranges out? I suspect the issue here is an overly complex model rather than named ranges - I’ve seen lots of these without named ranges too and they are even less manageable in that state!


This debate about whether to use named-ranges in Excel formulas brings back some distant memories. I was on a team at Goldman Sachs that built Excel tools for the investment bankers and because of that we'd help out with the new banker Excel training held at the giant Chelsea Piers sports complex.

Every year freshly minted MBA's would begin the training and immediately become aghast when the trainers told them to never, ever, use named-ranges in formulas. Not only had the trainees been explicitly taught to use named-ranged in their MBA programs, but any idiot could plainly see that [=enterprise_value/ebitda] was better in every way than [=C13/F22]. More expressive, more readable, easier to spot errors, etc.

The trainers would argue that in an MBA program you build your models, submit them, and move on to the next assignment, so you don't get a view of the longer-term problems that arise from named-ranges. What does [=ebitda] actually tell you? Is that the last quarter's actual number, is it the current estimate of the next fiscal quarter, is it a 12-month blended forward estimate?

And as you try to solve those questions with [=ebitda_est_next_fiscal_qtr] you often end up with two more problems: an unambiguous name for you can unambiguously mean something else to someone else and as you update your model over time, if you forget to update your named-range references, you have formulas that look right, but are wrong, e.g., [=ebitda_est_next_fiscal_qtr] now refers to an actual reported ebitda number, not next quarter's estimate.

[=C47] while not telling you much at all, is unambiguous. It doesn't look "right" or "wrong". It can't mislead you. If you want to know what it is, go look at cell C47. The new MBA's would argue, "Wait a second, C47 could now point to the wrong data as well, and it could be mislabeled with a stale row or column header. And the trainers would reply, "Exactly, and when you go to see what C47 represents you will have the context to recognize those errors and fix them."


You put it in much better words than I could, thanks haha.

Also, very cool experience at Goldman!


heh, your exchange looked pretty much like a holywar between programmers. Except problems looked like if you're discussing merits of different old varieties of BASIC, with GOTO and line numbers. I really don't see how excel is more visual, or easier to grasp after adding just a several columns of data or formulas.

I wonder if there could be made some middleground language, visual enough to not being scary for spreadsheet people, and debuggable and readable enough to not become a mess when model becomes big. Jupyter comes to mind, but it looks like it didn't get any traction outside data science.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: