I work in critical infrastructure planning. My organization builds software in R, Python, and other programming languages customized for these major organizations.
So many critical infrastructures, billions of dollars in planning, and the systems are just built out of Excel. It's amazing. You'd assume something that serves millions of people a day would have a more sophisticated, customized solution, but you'd be wrong.
The reason is that most people know how to use Excel. Most people know how to handle it, use it, modify it, build up from it. You don't need a regular support contract from a company for your custom-built spreadsheets. If something critical in that "Excel software stack" breaks (aka Microsoft Excel itself), most of the time you just need to restart Excel. It's amazing. You don't need a new server, you don't need support contracts; it's just like riding a bike.
In my opinion, the real limitation of Excel is big-data analytics. The way Excel handles large quantities of data is slow (due to the nature of its software structure). That's where we come in and build out these models and systems. In the end, however, our data outputs are fed back into Excel, because that's what most people are used to.
I have deep respect for Excel. After all, Excel empowers so many users who don't know how to code to provide amazing plots and perform major calculations with ease.
Yeah it works and you don't need support contracts and then:
- Fidelity's "Minus Sign Mistake": loss of $1.3 billion
- TransAlta "Clerical Error": loss of $24 million
- Fannie Mae "Honest mistake": loss of $1.3 billion
Then you get employee turn over where new employees don't get "arcane" knowledge passed down by people who left and took their spreadsheet foo with them.
Excel does not have "access control", "auditing", "change tracking". Just getting work done is not enough.
> - Fidelity's "Minus Sign Mistake": loss of $1.3 billion
> - TransAlta "Clerical Error": loss of $24 million
> - Fannie Mae "Honest mistake": loss of $1.3 billion
Weigh that against <insert any company> makes $X billion due to correct usage of Excel, multiplied thousands and thousands of times over. Survivorship bias at its finest.
Excel isn't perfect and has notable downsides, but work gets done all of the time on them for the aforementioned reasons.
I’m not convinced there’s such a thing as “correct usage of Excel.”
What a lot of companies do is just scare entry level employees into being very careful. My friend’s girlfriend works at a place like this and people just accept all the problems with manually editing large spreadsheets because “that’s just life.”
Anything in Excel is a hack, and most of its users don’t know any better.
The manager goes to the people who really know how to make a report (maybe that's "accounting"?) and asks them to make it, taking into account the report-specific things they discuss. Then those people deliver. They might have someone on their team who knows how to work with dataframes, which would come in handy. Those people store the program written to generate the report from the data and are ready to modify it when more is requested. Instead of coding in Excel, they simply write their code in whatever language they use to process spreadsheets or other data sources.
If they have knowledgeable people there, it will take no longer than coding something up in Excel and will be more reliable.
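As a sketch of what such a stored, repeatable report script might look like (all names and sample rows here are hypothetical; plain stdlib Python over exported rows, standing in for a dataframe library):

```python
from collections import defaultdict

def monthly_report(rows, month):
    """Hypothetical report: total amount per category for one month."""
    totals = defaultdict(float)
    for row in rows:
        if row["month"] == month:
            totals[row["category"]] += float(row["amount"])
    return dict(totals)

# Rows as they might come out of a data export (made-up sample)
sample = [
    {"month": "2019-05", "category": "travel", "amount": "120.50"},
    {"month": "2019-05", "category": "travel", "amount": "30.00"},
    {"month": "2019-06", "category": "office", "amount": "75.25"},
]
print(monthly_report(sample, "2019-05"))  # {'travel': 150.5}
```

Because the script is stored, next month's report is a one-argument change rather than a fresh round of manual spreadsheet edits.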
I guess no one has used Power BI for reporting... Link the data source, save the view that "Manager A" wants every week or month, set up an on-premises gateway, and your new reports are there before you are asked for them again. Always up to date. Then share your dashboard.
Also, you guys should be ashamed of yourselves. "Not sure who does the reports, maybe finance?" Everyone should know how to make their own reports.
If you regularly produce reports, one would suppose you'd have an existing project template and the know-how to let a reasonable individual deliver a result that isn't all wrong, in a length of time not THAT dissimilar from throwing up a spreadsheet.
This also wouldn't silo the data on the user's laptop, subject it to destruction because the user dropped the machine, or limit the frequency of updates and access to how fast the user can respond to emails.
And meanwhile the manager that broke the rules, connected their personal copy of excel straight into the database and made the company 8 figures in profit over the course of the last year just outdid you and your team of report programmers.
And meanwhile an actual data scientist looked at the data and outdid the manager by a factor of 10, by creating a more meaningful report, that is based on an actual statistically sound basis, preventing a disaster.
Logically, most profit-making activities may be enabled by tech, but the profit usually isn't created by tech. The manager's rule-breaking is probably unrelated to his success.
In my experience, business success is often hinged on the reports and the insights they provide. Without accurate numbers, you've got nothing to go on.
I think it will be easier to solve those problems in the Excel development space than it will be to export the Excel solutions to another solution domain.
Also the kind of mistakes you've described can (and do) happen in other programming domains that are deemed more respectable.
Excel is hamstrung by having to maintain backwards compatibility for an endless number of hacks.
For instance, it is notorious that
0.1 + 0.2 != 0.3
in binary exponent floating point math. Excel does funny stuff with number formatting that hides this, but it is like having a bubble under a plastic sheet that moves someplace else when you push on it -- numeric strangeness appears in different places.
The right answer is to go to decimal exponent floating point math, but that is only HW accelerated on IBM Mainframes, maybe on RISC-V at some point. You'll probably crash Excel if you have enough numbers in it for performance to matter, but Microsoft would be afraid of any performance regression and it would break people's sheets so it won't happen.
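The float quirk above is easy to reproduce, and Python's stdlib decimal module shows what decimal arithmetic gives you instead (a minimal illustration, not a statement about Excel's internals):

```python
from decimal import Decimal

# Binary floating point: 0.1 + 0.2 picks up rounding error
binary_sum = 0.1 + 0.2
print(binary_sum)         # 0.30000000000000004
print(binary_sum == 0.3)  # False

# Decimal arithmetic behaves the way an accountant expects
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))  # True
```

Excel's number formatting hides the binary result by rounding for display, which is exactly the "bubble under the plastic sheet" effect: the error is still there, just somewhere you aren't looking.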
On a technical basis we could use an Excel replacement that has some characteristics of Excel, and other characteristics of programming languages; one old software package to look to for inspiration is
What makes it almost impossible to do on a marketing basis is that Excel is bundled into Microsoft Office so if you have an Office subscription you have Word, Powerpoint, Excel, Access, etc.
GP was about how Excel lacks "access control", "auditing", "change tracking". Then OP says it'll be easier to add those things to Excel than to make non-Excel suit the purpose.
Then you say no, backwards compatibility problems. But how does Excel's need for backward compatibility make it hard to add "access control", "auditing", or "change tracking"?
"access control" by definition means "make things not work some of the time"
Excel has change tracking, but like Jupyter notebooks and similar products it doesn't make the clean distinction between code and data that is necessary for it to be useful. (e.g. if I develop an analysis pipeline and use it for May 2019 it should be as easy as falling off a log to run it for June 2019)
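The code/data distinction above can be sketched in a few lines: the pipeline is fixed code, the month is just a parameter (a toy example with made-up data; the "analysis" is a stand-in average):

```python
# The code is fixed; only the data selected (the month) changes per run.
def run_pipeline(data_by_month, month):
    rows = data_by_month[month]
    return sum(rows) / len(rows)  # stand-in for the real analysis

data = {"2019-05": [10, 20, 30], "2019-06": [40, 60]}
print(run_pipeline(data, "2019-05"))  # 20.0
print(run_pipeline(data, "2019-06"))  # 50.0
```

In a spreadsheet, by contrast, formulas and values live in the same grid, so "rerun for June" usually means copying a sheet and hoping every formula came along intact.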
"At its heart, VisiCalc is about numbers. One of the early decisions we made was to use decimal arithmetic so that the errors would be the same one that an accountant would see using a decimal calculator. In retrospect this was a bad decision because people turn out to not care and it made calculations much slower than they would have been in binary."
That sounds like you want to rebuild the whole paradigm of how 99% of software world treats numbers (which may be, at the end, a right thing to do but unlikely to happen at least in the near term) and you are laying the fault of it not happening at Excel's door. That would be very misplaced blame.
You don't need access control because it's just an Excel file, and whoever has the file has the planning model. In most organizations like this, there are only 5 or 8 people who all work together on a team; it's not 50-some people who are separated. Excel is a tool, a tool that you use to get things done. Even programming solutions have issues, such as Knight Capital Group's $460 million loss due to mistakes in their code and deployment processes. Planning modules are not operational modules; it's a big jump between the two.
Again, these are all tools. What's important is how you use them and how you properly validate the calculations. The thing you also need to understand is that no critical-infrastructure planning produces perfectly to-the-dot numbers. Our systems are far too complex, built off of human operational decisions, and have so many unknown losses that planning modules are based on scenarios. Also, we're talking about thousands of these agencies around the world, and their critical-infrastructure planning teams are mostly 5 to 8 people. The current agency I'm working with has hired around 3,000 employees to operate the systems, but only 5 or so are really in the deep, running these models and building these plans.
These people are not programmers. They spent their time learning about resource planning, mathematics, operation theory, physics, and engineering. Excel is an incredibly user-friendly tool that lets them automate a tremendous amount of their tasks. They're all very intelligent and can definitely learn how to code, but that takes their time away from more critical skills and tasks they need to accomplish.
The most important solution to the problems you've stated is the workflow. You set up a proper workflow, and the risks of these problems should be minimized.
>Then you get employee turn over where new employees don't get "arcane" knowledge passed down by people who left and took their spreadsheet foo with them.
Yeah, this never happens with a proprietary codebase!
Who needs historical knowledge when you can just rewrite the program every few years in a different environment with all new bugs? (and a smattering of the old ones, too)
Few work things have bemused me more in the last couple of years than watching someone re-implement a slow SQL process on an HDFS big-data platform, forgetting all of the things that went wrong the first time, fixing them again, and then running into weeks' worth of new bugs, just to get non-time-critical data faster.
Fidelity didn't lose $1.3B, they miscalculated by $1.3B. The same with the other numbers you quote. Spreadsheets are error prone, but they still make sense for a lot of use cases.
The Fidelity "minus sign mistake" didn't create a loss.
The mistake was in relaying the information to the end user. It didn't actually cause a loss of that magnitude.
That is like saying if I mistyped in a word doc, that word created the loss.
Arcane knowledge will be an issue with magical spreadsheets, as it is with any software (and I could argue that with custom software you have this legacy knowledge over both code AND ui, which is worse than just a spreadsheet).
The last paragraph is key, though - auditing and access control are tough ones for excel, and there are many tasks that require this functionality.
The trade-off is customizability: any Excel sheet can be changed to your specific need, whereas that nice and shiny enterprise software you build may lack one or two things which you will never be able to change.
> Excel does not have "access control", "auditing", "change tracking". Just getting work done is not enough.
Doesn’t the web version of Excel already solve that? I’m not familiar with it but Google Sheets does these things, at least to some extent - I’d expect Office 365 to do so as well.
While I haven't encountered localized names, localized formats make Excel an absolute pain for me.
My language uses the "European" number format of a comma for decimals and periods for thousands separators. Excel tries to adjust to that by using semicolons as argument separators (e.g. ADD(1.5, 3.5) -> ADD(1,5; 3,5)).
The problem is that their locale detection is wildly inconsistent and there isn't a good way to override it without changing system settings. When moving a file between computers, this is an utter disaster.
Excel goes one step further, and assumes that users in some locales (e.g., Dutch) want semicolons as separators in every CSV file you open in Excel.
I don't care (I just use LibreOffice Calc, which accepts any delimited-values file just fine and simply asks which separator to assume). But it means that when you develop an option for users to download some statistical data as a comma-separated values file (which is easy to generate), you now have a problem if that user wants to open it in Excel (which makes sense for CSV) and is using one of Excel's broken locales (e.g., Dutch), where comma-separated values are not understood (again, it wants semicolons in Dutch).
I.e., the expected file format changes because of the locale Excel uses! So where someone in the US can just double click the CSV file (not to mention anyone using LibreOffice/OpenOffice Calc anywhere in the world), someone using Excel in the Netherlands will just get a spreadsheet where every line consists of that whole line of values stuffed into column A.
And that's not even the worst. You can teach people the few buttons necessary to read in the text, but once they save it again, the file is changed! Now you have a different separator, and chances are good Excel also mangled date columns, removed leading zeros from numbers, and switched out decimal separators! So you may not be able to further process the file. It's also sad, because if they just made some of these changes optional, Excel would be a good tool to quickly work on a CSV file and pass it on. But as is, I am always hesitant to work with CSV in Excel, because it mostly breaks and becomes unusable for further processing.
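For comparison, reading a "European-locale" file outside Excel just means stating the separator explicitly, which is essentially the manual step LibreOffice's import dialog asks you to confirm (a minimal sketch with made-up sample data):

```python
import csv
import io

# A "Dutch-locale" file: semicolon separators, decimal commas
dutch_csv = "name;value\npi;3,14159\n"

# Pass the delimiter explicitly instead of relying on locale guessing
rows = list(csv.reader(io.StringIO(dutch_csv), delimiter=";"))

# Convert the decimal comma back into a machine-readable float
value = float(rows[1][1].replace(",", "."))
print(rows[1][0], value)  # pi 3.14159
```

The point being: the ambiguity is resolvable in one line when the tool lets you say what the separator is, rather than silently deciding for you based on the OS locale.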
Oh, I didn't know that's a localization thing. I use the German version and got used to opening CSVs in Notepad first (search and replace ; with ,).
I always thought that's some kind of the usual MS vs. the rest of the world thing.
You can specify what separator to use as the first line in the CSV file:
sep=,
My experience is that Excel 2007 and later will correctly parse the file and use the specified separator. However, other software, such as Google Sheets, will simply render the declaration as-is.
Then there's the issue of what character encoding to use to encode the file, whether to include a byte-order-mark with UTF-8 to make Excel recognize the file as UTF-8 and the effect that has on whether the separator line is recognized (spoiler: it isn't).
Here's part of the documentation I wrote for Calcapp's CSV exporter, which digs into these issues in more detail (Javadoc):
/**
* The prologue of files containing comma-separated values (CSV). This
* prologue contains an instruction detailing the separator character that is
* used in the file. This instruction is known to be understood by Microsoft
* Excel 2007 and later versions, but is rendered as-is by other spreadsheets,
* including Google Sheets.
* <p>
* Microsoft Excel expects either a comma or a semicolon to separate values in
* CSV files, depending on the Windows locale. The only way to produce a CSV
* file that can be read by Excel regardless of what locale Windows is set to
* use is to use a prologue similar to this one, which explicitly tells Excel
* which separator is used.
*/
private static final String PROLOGUE = "sep=" + SEPARATOR_CHARACTER + "\n";
/**
* The character set used to encode files containing comma-separated values
* (CSV): UTF-16LE (UTF-16 for little-endian systems). Using UTF-16LE allows
* characters that cannot be represented by the ASCII character encoding to be
* correctly read by Microsoft Excel and other spreadsheets.
* <p>
* There is no way to formally specify the character set used by a CSV file.
* With one exception, Excel assumes that CSV files use the ASCII character
* encoding, unless the first three bytes consist of a byte-order mark, in
* which case the UTF-8 encoding is used. (A byte-order mark is redundant for
* UTF-8, as it does not depend on endianness, but is traditionally used by
* Microsoft Windows applications to detect whether a text file uses the UTF-8
* encoding.)
* <p>
* Excel only recognizes a file as being encoded with UTF-8 if a byte-order
* mark is included, but doing so prevents Excel from recognizing the
* information of the {@linkplain #PROLOGUE prologue}, which in turn prevents
* CSV files from being produced which work regardless of the locale Windows
* is set to use. This is likely due to a bug, present in Excel 2007 and
* likely later versions as well (based on anecdotal evidence).
* <p>
* Fortunately, Excel does recognize another character set which can encode
* all of Unicode: UTF-16. Excel likely uses heuristics to determine that a
* file is encoded using UTF-16. (Text written using Western languages and
* encoded using UTF-16 tend to include many null bytes for various reasons,
* making the detection of UTF-16 trivial, but only for Western languages.)
* Unfortunately, UTF-16 is dependent on endianness, meaning that it would be
* desirable to include a byte-order mark at the beginning of the file.
* However, that does not work due to the aforementioned bug.
* <p>
* In other words, using UTF-16LE should work well for CSV files containing
* mostly Western text and parsed on little-endian systems. It is probable,
* though, that files produced using this converter will not work if the text
* mostly contains Chinese, Japanese or Korean characters or if the file is
* parsed on a big-endian system.
*/
public static final Charset CHARACTER_SET;
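A minimal Python sketch of the same approach described in the documentation above, i.e. a "sep=" prologue plus UTF-16LE without a byte-order mark (file name and sample rows are made up):

```python
import csv
import io

SEPARATOR = ","
rows = [["city", "rtt"], ["Zürich", "21.5"]]

# "sep=" prologue first, then the data rows
buf = io.StringIO()
buf.write(f"sep={SEPARATOR}\r\n")
csv.writer(buf, delimiter=SEPARATOR, lineterminator="\r\n").writerows(rows)

# Encode as UTF-16LE, deliberately without a byte-order mark: per the
# behavior described above, a BOM would keep Excel from recognizing
# the "sep=" line.
data = buf.getvalue().encode("utf-16-le")
print(buf.getvalue().splitlines()[0])  # sep=,
```

Writing `data` out with a `.csv` extension then gives a file that Excel 2007+ should parse with the declared separator regardless of the Windows locale, subject to the caveats about big-endian systems and CJK text noted in the documentation.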
If I'm going to mangle CSV into a non-standard form to conform to whatever Excel expects, I might as well go the extra mile and just provide an XLSX file.
That was my solution to a data-export function in an API (it switches on the HTTP Accept header). It now offers a choice of CSV (RFC 4180), ODS, and XLSX, and people can just pick XLSX if their Excel is using one of its broken locales.
It's not too hard to generate the XML for ODS (OpenOffice/LibreOffice etc.) and XLSX (Excel) from a bunch of tabular records once you've set up minimal empty template ODS/XLSX files to inject it in. It's certainly less work than explaining to Excel users that their software is broken and how to work around it.
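The Accept-header switch mentioned above can be sketched roughly like this (a hypothetical helper with simplified MIME matching; real content negotiation also handles q-values and wildcards):

```python
# Map an HTTP Accept header to an export format (simplified sketch)
def pick_format(accept_header: str) -> str:
    mapping = {
        "application/vnd.openxmlformats-officedocument"
        ".spreadsheetml.sheet": "xlsx",
        "application/vnd.oasis.opendocument.spreadsheet": "ods",
        "text/csv": "csv",
    }
    for mime, fmt in mapping.items():
        if mime in accept_header:
            return fmt
    return "csv"  # RFC 4180 default

print(pick_format("text/csv"))  # csv
print(pick_format("application/vnd.oasis.opendocument.spreadsheet"))  # ods
```

Users on one of Excel's broken locales then simply request XLSX and sidestep the CSV separator problem entirely.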
+1000. This is ridiculous. Couldn't they at least make this optional, so you'd be able to switch it off somewhere in a configuration dialog, the registry, or the command line? I hate localized apps altogether (who even needs a localized Visual Studio, srsly? I can't believe anybody can be competent in a .NET programming language and the relevant frameworks and practices without being able to read English). Localizing some apps makes sense for some people, but localizing functions feels beyond reason. I feel thankful they didn't localize C# and command-line commands :-)
The worst of it is that there are some very specific (and undocumented) edge cases where the translation won't be done properly.
It creates very 'fun'-to-debug issues where a spreadsheet works in a French Excel and not in an English one.
(For instance if you use the "Row-Column" cell reference, in English it is `R1C1` while in French `L1C1` - and it won't translate)
Even worse, the token used to separate arguments is localized. So you may have to use semicolons instead of commas. But in some places you need to pass a string representation of a formula and there Excel will only understand the non-localized commas. Fun all around!
I once had a contracting gig where I was tasked with updating a bunch of Word BASIC macros (this was Office 95 days...) used to maintain the ISO 9001 documentation for a major company.
Problem was the original version was written in their Danish office, and whoever wrote it had handed it to the Norwegian office exported as text before he left, and they'd imported it and started munging it and adding lots of stuff before someone bothered to try running it and realized it was completely broken. I wish I'd dug more into how they'd managed to get themselves into a situation where they'd significantly modified the code before trying to run it even once...
Version control? What is version control?
So they had this broken version that had some Danish keywords with a bunch of Norwegian updates, that they didn't want to just re-import into a Danish version and copy over properly in a tokenized form and having to try to identify and re-apply the Norwegian updates. So I got the fun job of properly translating it and figuring out what the new code was meant to do and make it do it in the process, with no access to any of the people who had worked on it.
[it is worth pointing out that this was an office for several hundred staff of a major international company that developed large scale information systems for things like police departments; while version control wasn't everywhere in the mid 90's, a large systems integrator certainly ought to be using it... Particularly amusing that it was their ISO 9001 documentation that was being handled in such a haphazard manner; happy to have never been a customer of theirs, though]
The task was extra "fun" for certain values of fun, because unlike, say, English vs Norwegian, or English vs Danish, Danish and Norwegian are close enough that 80% of the time a term from the Danish version might look right and be valid Norwegian, but the odds seemed to be (and maybe my memory is exaggerating it due to my lasting memory of extended pain) about 50/50 that the Norwegian translator of Word BASIC had chosen a different function name to the Danish translator for no good reason. Or there were slight, hard to spot, spelling differences. And sometimes what looked like a mistake was a function defined locally.
I spent a couple of weeks reading through the whole thing and mostly rewriting code that could have been written from scratch in a couple of days if they'd just told me what the end result was meant to be instead of dumping a pile of non-functioning code on me. But I guess I was cheap compared to their regular staff back then.
It was how I learned Word BASIC (I'd taken the contract, figuring it'd be easy to pick up, so I told the recruiter that sure I knew it, on the basis that I certainly knew a couple of variants of BASIC). Never to use it again (at it was replaced by a Visual Basic variant a few years later anyway)
On the same hardware, with the same table, Excel in Windows 7 does a cross-tab faster than MySQL in Ubuntu does. Indeed, MySQL chokes if there are more than a few hundred columns, but Excel just keeps going. And it uses all CPU cores.
Edit: In case anyone is wondering why I did that, I wanted a simple visualization of ping-location results for thousands of IPv4 from several hundred ping-probe locations. So that meant aggregating (getting minimum rtt for) millions of observations, and displaying min(rtt) in a IPv4 by probe location cross-tab. MySQL did a great job at the aggregation, but choked (as in, errored out) with several hundred numeric columns. Even if I converted them all to SMALLINT.
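The aggregation described above, min(rtt) per (IP, probe) pair arranged as a cross-tab, can be done with plain dicts; a toy sketch with made-up observations:

```python
# (ip, probe, rtt) observations -> {ip: {probe: min_rtt}} cross-tab,
# roughly what the pivot described above computes
observations = [
    ("1.2.3.4", "ams", 12.1), ("1.2.3.4", "ams", 11.7),
    ("1.2.3.4", "nyc", 80.2), ("5.6.7.8", "ams", 25.0),
]

crosstab = {}
for ip, probe, rtt in observations:
    row = crosstab.setdefault(ip, {})
    row[probe] = min(rtt, row.get(probe, float("inf")))

print(crosstab["1.2.3.4"])  # {'ams': 11.7, 'nyc': 80.2}
```

This is the shape of workload a columnar engine (or Excel's pivot machinery) is optimized for, which is why a row-oriented RDBMS with a hard column limit struggles with the wide output table.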
The other thing people don't realize about Excel is that it has a compressed, in-memory, columnstore analytics database built into it. It was called Power Pivot then Power Query, not sure if it changed names again, but it's actually a SQL Server Analysis Services Tabular Model, the same thing that's in Power BI as well. It's going to perform much better than a row-oriented RDBMS at aggregating columns because that's all it was designed to do.
It was a separate data processing engine within Excel developed by another team (SQL server) for the purposes of self service BI (Power Pivot/Query). Most probably, this engine is not used for normal (traditional) calculations.
You mean if you create the pivot table from data in a sheet? It's not needed for that because you'll run out of memory trying to fit any more than about a couple million rows in a sheet first. Meanwhile you can easily load 10s and possibly 100s of millions of rows into Power Query.
You're blowing my mind here a little bit. I don't use MySQL but do use Excel and the R/Python data science stacks quite a bit. I regularly open data with tens or even hundreds of thousands of columns in the latter by using lazy computations. Is this not the case with database approaches? What kind of hardware are we talking? Presumably Excel has to load everything into memory at once since it's immediately viewable, right?
In MySQL, maximum row size for non-text columns is 65,535 bytes. MySQL is limited to 4096 columns per table, and 1017 columns per table if the InnoDB engine is being used.
Excel's limits are 16384 columns and 1048576 rows.
I'm talking wimpy hardware here, I admit. Basically, VirtualBox VMs on a quad-core i5 box with SSD and 8GB RAM. With the VM having three cores and 6GB RAM. But it was the same wimpy hardware for Windows 7 and Ubuntu.
Notwithstanding the row length and column count limits, did you also give MySQL more memory? Setting InnoDB buffer pool size to 3 or 4 GB might help; its default is about 134 MB.
> Presumably Excel has to load everything into memory at once since it's immediately viewable, right?
Depends on the file format. An XLS or XLSB file can contain special markers for where each logical row starts, so it can randomly access rows. Both can also persist "calculation chains", which are a simplified dependency graph. The binary formats also store formulae in a parsed representation, allowing easy scans to see which cells have to be inspected if a file needs to be recalculated.
Aha! As far as persisting calculations, this makes sense. I guess in my head I was thinking "pure" data where everything opened was precomputed. Forgive my ignorance about database solutions, but do they not implement something similar?
But now you've got me thinking, it would be nice if libraries like Dask could allow for flagging of symbolic operations like this to be written to disk for quickly saving metadata where intermediate steps don't explicitly need to be saved.
Might be worth a try to look into Postgres. For the column problem, PG has a limit of 8 kilobytes for column width (instead of a count of columns). You can circumvent some of the limits by using arrays, types, JSON, or hstore. It also has PL/Python, which lets you code up queries containing Python fragments for data processing.
Meanwhile <side rant> open any document in Photoshop with a few layers and effects and the computer grinds to a halt, no matter the specs, no matter the year, through the ages. New hardware comes out? Booya, new Photoshop XYZ -> let's put your fancy hardware on its knees, begging for air.
>Meanwhile <side rant> open any document in Photoshop with a few layers and effects and the computer grinds to a halt, no matter the specs, no matter the year, through the ages.
Hasn't been my experience ever. Photoshop is one of the speediest image manipulation programs out there...
The monthly fee for Photoshop is around 1/100th of the old retail price. That works out at about 8 years of use before you have to 'buy' the software again. Seems fair to me.
And for those who don't like paying for Photoshop - which, given it's an astonishingly powerful piece of software, probably means "people who don't actually need Photoshop" - there's always cheap or free alternatives that provide about half the functionality.
>The monthly fee for Photoshop is around 1/100th of the old retail price. That works out at about 8 years of use before you have to 'buy' the software again. Seems fair to me.
That's because you only think of yourself. Creatives are one of the most struggling professions, and can have widely different returns per year, and salaries in different parts of the world can be much lower.
People who could afford to buy Photoshop at some point (perhaps even a used copy), later down the road might not be able to pay the subscription for a few months, but in the new scheme they lose access to the program altogether.
8 years? There are creatives that use 10 and 20 year old versions of Photoshop on Windows.
You pay per year. $120 gets you access to the photo bundle for one year. That's 30x cups of Starbucks drip coffee a year. Creative people who are 'struggling' blow more than that a month on weed, beer, coffee or Tinder. We all make choices that show what we prioritize.
Anyway, the subscription model for this kind of software works really well. Software keeps becoming better and so far Photoshop has been running circles around the limpy thing known as Gimp
P.S. I just got hit with the next year charge. I was annoyed so I tried installing CS 6. It's OK. It still beats Gimp but I would rather not drink coffee or smoke weed for a month to have money to pay for the yearly subscription for the photo bundle.
P.P.S. And if you are really really really struggling "creative" you are probably young and either go to school or have a friend that goes to school which can get you it under the EDU discount, which is peanuts.
> And for those who don't like paying for Photoshop - which, given it's an astonishingly powerful piece of software, probably means "people who don't actually need Photoshop" - there's always cheap or free alternatives that provide about half the functionality.
Now? Absolutely! 20 years ago this was much harder. Adobe used to dominate the market so that you used Photoshop even if you often didn't really need all its complexity. But these days, there is so much competition that you can probably find something that works well enough. I would assume that hurts their bottom line, but as a consumer, I am happy.
> Adobe used to dominate the market so that you used Photoshop even if you often didn't really need all its complexity.
Still does. Part of the reason is that in the past, they turned a blind eye on people pirating Photoshop. Every kid with even passing interest in graphics or photography got themselves a bootleg copy of Adobe tools to work on; it was an easy guess what they'll be using later, as adult professionals.
So true. I used to run European cross-docking ops, planning, inventory, forecasting, scheduling, in Excel back in the day. Successfully scaled by 100% YoY for three years. Was quite a lot of fun.
The issue I have with that approach is not using Excel per se. It is using Excel in addition to whatever system is in place in the first place (SAP, for example, is a pretty popular thing to circumvent with Excel sheets). That, and over-engineering spreadsheets to the point where nobody but the creator can use them anymore. At that point your data just gets suspect. And circumventing existing systems with local offline spreadsheets just screws up everything. Now combine these two.
But that has less to do with spreadsheets themselves, see my first point regarding cross-docking, and more with the application. Used correctly, spreadsheets can be incredibly powerful. And at least for my purposes, I have yet to encounter an analysis issue I failed to dig through using Excel. Sure, something like Python might have worked better, but consider me an empowered user (read: I have no clue about Python or SQL or...). And at least I knew I could trust that the analysis was done as intended.
> It is using Excel in addition to whatever system is in place in the first place (SAP, for example, is a pretty popular thing to circumvent with Excel sheets). (...) And circumventing existing systems with local offline spreadsheets just screws up everything.
I usually side with the circumventers on this one. This misbehavior happens for a reason, which usually is that it's impossible or infeasible to do the work with "proper" systems. Excel sheets on pendrives is what you get when people can't exchange data using "correct" software, or when people need to continuously iterate on the shape of the data set, while getting the schema changed in the "correct" system would involve a ticket to IT and 2 months of waiting.
I think instead of paying people to do a job and then doing everything possible to make that job difficult, companies should embrace the fact that people will work around any deficiencies of top-down systems, whether software or procedural. But I guess this tug-of-war has existed since the time of the first corporations.
True that. I used to do both back then, first as a user and then as a SAP system guy. For me it is always a death spiral: people don't trust the system or it is too cumbersome, they use Excel, the system gets even less reliable, they use Excel even more,... and so on. I always saw it as my role as a system guy to give users a system they can use and then get them to actually use it. Because when providing the system I talked to them first, so in the end it was kind of their system too.
And then there are examples like the production planning done in Excel because SAP PP just isn't good enough. Then the guy who created said Excel tool left. Years later, when production had to switch to weekend shifts, they couldn't, because no one could adapt the Excel tool. Once you reach that point you are screwed.
Agree but slight modification - goal should always be for the users to be able to do their main functional tasks in-application and maintain it as the system of record.
But there will always be a use for Excel as a tool for informal data-monkeying and analysis: exporting a table and doing some pivots to summarize, or using a different view. Non-technical users will always expect this capability and flock back to it. Not everyone's a programmer or has the interest to be; you'll have to pry Excel out of most accountants' cold, dead hands, for example.
You describe the ideal world. Tools like SAP or Business Warehouse suck at data crunching, and not everyone can use Python, for example. Even ops can, under the right circumstances, be OK-ish offline in Excel, as long as results are imported into e.g. SAP to assure data integrity.
I feel like PowerBI is a good solution for overcoming Excel's deficiencies while working well with Excel. It forces you to structure your data and define measures while being dynamic, efficient and friendly to less structured inputs.
I have to give PowerBI a serious shot. We have it at my current place but nobody is really using it, so I need some spare time to dive into it.
Now, the more I think about it, the combination of MS Office, Power BI and MS Dynamics with a shopfront from, say, Shopify would make for quite a nice e-commerce setup.
I do something along the same lines as you describe (build better tools in R/Python for larger data problems) but I have a deep aversion to Excel. It's proprietary, has horrible standards (it does not support native UTF-8 CSV files, for example), and is stuck in the '80s in terms of paradigm. And this is precisely the tool that prevents people from doing things more efficiently, because "they can do it manually in Excel even if it takes a long time". It makes people develop really, really bad habits.
Excel is a tool. It empowers those who don't know how to code to do their jobs more effectively than before without "learning to code". It's something you keep in your garage in case you need it. It depends on what their current workflow is. Sometimes I'll use Excel, sometimes I'll use R/Python.
What's important is that Excel gives you access to the computing power of modern technology with a lower barrier of entry. That's one of the most valuable things that many people fail to understand. Not everyone is a programmer and, for lack of a better term, "doesn't really care about UTF-8 and just wants the calculations to work". Also, in many cases enterprise IT in critical infrastructure is resistant to supporting R or Python due to their security models (firewall rules prevent install.packages('tidyverse'), for example).
This is what makes us specialists and someone they bring on board to help them out and explain to them about what the "best-use-case" is, and then watch as people try to cram a square peg into a round hole.
Well, if you're using Python, you're following an imperative/OOP paradigm which goes back to the sixties, so I don't know why you'd criticize Excel for using a dataflow model. I can understand not wanting to use Excel because it's proprietary (but LibreOffice isn't), and I can understand not wanting to deal with a lot of its legacy cruft, but for the stuff that Excel does well, it does it a lot more quickly, more interactively, and more intuitively for most people than a similar solution in Python. In fact, one of the key benefits of an Excel solution is that you can see every step of your calculation, which creates a degree of self-documentation which you don't get with Python, for example.
> that you can see every step of your calculation which creates a degree of self documentation which you don't get with Python, for example.
If you want to do that in Python then Jupyter Lab/notebook is a good solution to document your code and see the result of each operation. And way faster than Excel for large data tables.
Think of all the open, modern, nicely standards compliant tools that you use. Now think of the fact that Excel lets people solve their problems more easily than all the things you like.
I'm not saying it's good, but it's better than everything else (for the people who use it.)
I am talking about importing CSV files in UTF8. If I use Japanese characters for example it fails utterly to display properly unless I import it as "Text" first.
The secret trick for importing data into Excel (without actually writing a file in the Excel file format): don't try CSV - as you have already noticed, that will fail in all kinds of fragile ways depending on your character encoding and operating system locale.
Instead, convert your data into an html table and then paste or import that into Excel.
But not any html table, you need to have the appropriate proprietary css properties in there, such as the mso-data-placement:same-cell. Those are actually documented ...in the Office HTML and XML Reference published in 1999.
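If you want to generate such tables programmatically, here's a minimal sketch using only Python's standard library (the function name and sample data are mine, not from any official tooling; the `mso-data-placement:same-cell` property is the one documented in the Office HTML reference mentioned above):

```python
import html

def rows_to_excel_html(rows):
    """Render rows (lists of cell values) as an HTML table for pasting
    into Excel, keeping multi-line values inside a single cell."""
    body = []
    for row in rows:
        tds = "".join(
            # mso-data-placement:same-cell tells Excel not to spill
            # line breaks into extra rows on paste/import.
            '<td style="mso-data-placement:same-cell;">%s</td>'
            % html.escape(str(v)).replace("\n", "<br>")
            for v in row)
        body.append("<tr>%s</tr>" % tds)
    return "<table>%s</table>" % "".join(body)

print(rows_to_excel_html([["name", "note"], ["日本語", "line1\nline2"]]))
```

Because the values are HTML-escaped and the encoding is carried by the markup rather than the OS locale, non-ASCII text like the Japanese example above tends to survive the trip.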
I frequently go from SQL query results to HTML table to excel in a similar manner, but I haven't had issues w/ any proprietary CSS properties. Using Paste Special -> Match Destination Formatting it always comes out without any issues
But how do you handle sharing? The main issue with Excel, and its main downfall, is the sharing of data and versioning: person X has a version that's a few hours older than person Y's, and needs an email from person Y to keep doing her job.
If it's O365 Excel, then you can share and work on the same spreadsheet in a similar way to Google Sheets (seeing other peoples real time cursor in your spreadsheet etc).
Be very, very careful with this. There are some update/race condition bugs in Excel Online that result in multiple users causing bulk updates to incorrectly update the filtered/sorted view of other users.
This race condition happens often. In fact, about every time I try to rely on it with my team. Such a shame Google Drive isn't seen as a "trusted platform" in my client environment.
Sharing Excel sheets is extremely easy - you just mail them, or host them in some online place, like SharePoint if you've bought into the whole MS ecosystem like many companies do. Not sure what your point is there.
Versioning is of course more difficult, though Excel does support diffing in principle. I expect, though, that what most people end up doing is simply keeping track of versions manually, same as they would have in the days before Excel.
Honestly, people handle sharing via Google Drive/Dropbox/Enterprise solution plus file naming convention.
The current enterprise network I'm on? They block Box, Dropbox, and a lot of other solutions. They instead use Citrix's ShareFile and a NAS setup with file naming.
Also remember: surprisingly, most of this critical infrastructure planning isn't handled by hundreds of people; at most it's a team of 5 people who then present their findings and suggestions to the C-suite decision makers. It's not as big of a problem as many people think.
We tend to keep ours on Sharepoint - which means we could use Excel Online to work collaboratively, but usually we don't need them all at the same time, so open in Excel desktop and "check it out". If someone else needs it, they can just ping on Skype and see if you're done.
It is amazing, just not in a good way. Maybe I should try to find and post a link to that paper where it was pointed out that Excel was munging things like gene names because they get interpreted as dates in the 'untyped' cell input, and these things were showing up in published research.
Excel is a disaster, it will seem to work until it turns out it's doing something weird behind your back, or maybe you've made a careless mistake somewhere and it has zero tools to help you catch it. And you won't know until it's far too late. Please say no, use anything else - Python, JavaScript or heck even QBASIC or whatever - but just don't use Excel.
A friend of mine works converting spreadsheets to Python/Java programs. Basically, people at the company solve the problem with Excel, and when that gets too slow, he codes it up to read from a CSV and they're off to the races.
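That conversion is often less dramatic than it sounds. As a hypothetical sketch (the column names and the SUMIF-style aggregation are made up for illustration), a spreadsheet's core calculation can frequently be reduced to a few lines over a CSV using only the standard library:

```python
import csv
from collections import defaultdict

def totals_by_category(path):
    """Replicate a typical spreadsheet SUMIF: total 'amount' per 'category'."""
    totals = defaultdict(float)
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            totals[row["category"]] += float(row["amount"])
    return dict(totals)
```

The two columns a SUMIF would reference become an explicit, testable function, and swapping `csv` for a database cursor later doesn't change the shape of the code.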
It's amazing how far these business types can get before needing a programmer to optimize things. As a programmer, I tend to think in terms of logic, and then I tried to build a complicated budget tool in Python and the turnaround was just atrocious. So, I did it in a spreadsheet instead and was far more productive.
Spreadsheets are awesome, and whoever came up with the concept should get some kind of award. I can't think of a single software tool that has enabled more productivity than a spreadsheet.
And is it just me, or does hopping onto OpenOffice Calc or Google Sheets feel like a different device altogether? They all look like bikes, but accelerate differently, change gears differently, etc.
I love working with spreadsheets! In a past life, I was all about taking data from various sources, processing it, and spitting out CSVs that I would then work on with a data analytics person using Excel. Curious if y'all use a similar approach of generating CSVs, or do you go even deeper and actually create Excel sheets with Excel-native stuff in them (like formulas)?
What I specifically handle is Water. Starting from the natural system (weather generator + Hydrology) through to the human system (Demand model (showing increase over time and temperature increase) + system operations (dams, reservoirs, treatment facilities, pipes, etc.) to delivery to customers.
The big problem we answer is "How can we continue delivering water to people while under climate change uncertainty? What policies or new infrastructure can we build so we remain robust and resilient under these new conditions?"
My issue with spreadsheets is that they could be improved a lot with minimal changes, but nobody seems to do it in popular office packages. Making them more database-like and making table data first-class (at least you can make named tables in Excel on Windows) could be used to push people a bit more towards organised data, without changing how anything works.
Half of the mess that makes excel hell comes from the fact it's too easy to put two tables of data + some random constants on a single sheet and refer to them by H3. Now it's hard to add more data, hard to move anything, and hard to create space which expands to the next row with existing formulas.
Airtable (and Access) implements this idea, but unfortunately sacrifices the generic, free-form spreadsheet along the way.
I'd be even happy with clippy popping up with "It looks like you're adding a new table in the same sheet. Would you like to learn about using multiple sheets?"
The link is to "You Suck at Excel" by Joel Spolsky, which is an excellent tutorial.
I sincerely wish it had a different name, though!
I would love to send this video to some of my business colleagues inside a large enterprise. They need this information and they would enjou everything about this video. However, it would not be acceptable to send them a video entitled "You Suck at Excel."
If it had a more enterprise-friendly name I would even have a link and description to it in my email footer inside the enterprise. It would really help a lot of people.
Genius idea. I have a bunch of domains, combine it with the right subdomain... and we're onto something. Adding this to my list of things to do. Cheers!
I'd be over the moon to see one change to Excel. Native support for a language other than VBA. Perhaps the CLR. Perhaps Java/Typescript. Just something (optional) for programmers who want to use Excel and not want to deal with the garbage that VBA is.
Some workarounds exist, but AFAIK they require collaborators to also have the tool installed, which is dead in the water.
I know MS has considered it; I'm still pretty surprised they haven't followed through.
I've built plugins with VSTO (for Outlook) when I was at MS, but my memory is that it’s dependent on a plugin being distributed and not available for native use embedded into the excel workbooks.
You have to install Visual Studio, compile a plugin, register the plugin, and then from VBA you can call out to that COM interface if you'd like. Even if Excel users could overcome those hurdles, you still break the collaboration flow (i.e. just sharing an excel file).
Looking a little deeper, it looks like they're starting to support Javascript for the newer cross-platform add-in system (and also VSTO), but it looks like you still have to distribute your JS add-in via a web-service as opposed to being fully integrated into Excel and XLSM files.
Just being able to share the XLSM file and have users open it in Excel with no other installs or special network access is really the make-or-break for me and the non-professional-programmer user scenarios I've seen.
Microsoft would also like to see that change, that’s why they tried to kill VBA. The reason for not following through (in killing VBA, because they definitely support and prefer other extension mechanisms) was pushback from users.
Wait, really? I've been working in a thoroughly MS ecosystem for the first time this year, and I can't seem to find any evidence that they've tried to kill VBA based on how often I run into situations where VBA (or worse, VBscript) is the ONLY option, especially when working with excel's developer functions or writing expressions in SSRS utilities. Did the users push back because the options they were presented with were use VBA or have no scripting capabilities?
Although I don't like JS, I use that too, but it's pretty half-baked tooling compared to what Excel has. I would rather like to see just being able to point out some Google Cloud Functions and use those as spreadsheet commands. That would be powerful. Now you can but you have to do plumbing and you cannot reuse libs over sheets unless you make them public (last I checked, I could be wrong on this one but I read posts from as far back as 2011 where 'they are working on it'); you have to copy/paste. Yuck.
I wish someone (MS, ...) would do something, but I work on Linux so no Excel for me. And I do like very large datasets to be in the cloud anyway and not killing my laptop.
There should be more competitors in the space. I know there are a few, but they are not really competitors, just more niche / boutique products that attack a specific case, so you need to go back to Sheets or Excel anyway.
You've been able to control all the Microsoft Office and other apps like Internet Explorer through "out of process" OLE automation for ages, and Python has a great win32com module (part of pywin32) for controlling "in process" and "out of process" OLE controls / ActiveX components.
Intgrating COM into Python is one approach, but another approach is integrating Python into Active Scripting. (The age old extending/embedding debate.)
And Active Scripting (1996) let you plug different "in process" interpreters into the web browser and other multi-lingually scriptable applications, and call back and forth between (many but not all) ActiveX components and OLE automation interfaces more directly, without using slow "out of process" remote procedure calls. (Some components still require running in separate process, like Word, Excel, etc, which work, but are just slower to call).
>The Microsoft Windows Script Host (WSH) (formerly named Windows Scripting Host) is an automation technology for Microsoft Windows operating systems that provides scripting abilities comparable to batch files, but with a wider range of supported features.
>It is language-independent in that it can make use of different Active Scripting language engines. By default, it interprets and runs plain-text JScript (.JS and .JSE files) and VBScript (.VBS and .VBE files).
>Users can install different scripting engines to enable them to script in other languages, for instance PerlScript. The language independent filename extension WSF can also be used. The advantage of the Windows Script File (.WSF) is that it allows the user to use a combination of scripting languages within a single file.
>WSH engines include various implementations for the Rexx, BASIC, Perl, Ruby, Tcl, PHP, JavaScript, Delphi, Python, XSLT, and other languages.
>Windows Script Host is distributed and installed by default on Windows 98 and later versions of Windows. It is also installed if Internet Explorer 5 (or a later version) is installed. Beginning with Windows 2000, the Windows Script Host became available for use with user login scripts.
I dislike VBA too, but it would be easier to stomach if Excel at least had a decent IDE. Compare the VBA editor in Excel with Visual Studio and the difference is stark.
> Half of the mess that makes excel hell comes from the fact it's too easy to put two tables of data + some random constants on a single sheet and refer to them by H3.
That is a feature. If you can keep all the data you need on one screen, you reduce the cognitive overhead of having to switch back and forth between tabs. Of course, if you start needing to add more data you need to "refactor". But it does not seem a lot more different than starting with a prototype and having to change everything to support the features of a real product.
> it does not seem a lot more different than starting with a prototype
People underestimate this point. Spreadsheets are one of the best possible approaches to prototyping data-driven applications, thanks to their dual data/code nature and the ease of building and extending data types (just add more columns to a table); it's like building the application inside a debugger.
Non-developers don't have anything else that resembles a debugger, anywhere in the common approaches of software for end-users in industry.
Combine this with spreadsheet applications working as integrated environments (a single tool that handles all the computing needs of a project, without having to build a toolchain), and it's no wonder that it's the preferred method for non-programmers to build custom automation workflows when their needs aren't supported by any specific software.
> Spreadsheets are one of the best possible approaches for prototyping data-driven applications,
Bingo. Back when I worked in big corp Excel was the prototyping tool for the masses. If someone in a remote office built something in Excel that solved a problem that was useful to other offices my team would come in and use that work to build a real system. It was great because a lot of the requirements discovery work happened organically before we would even hear about the process/Excel.
Admittedly, we were a small team and could only take on so much work. This led to Excel being used beyond its capabilities which leads to a host of other problems.
> That is a feature. If you can keep in a screen all the data you need, you reduce the cognitive overhead of having to switch back and forth between tabs.
That's true, but it doesn't necessitate different tables living in the same row/column grid.
My main issue with them is that they often aren't reproducible as the data transformations aren't logged. As a computational biologist I often deal with spreadsheets that experimentalists give me. I have no idea how the data were transformed from the raw data, and often the experimentalists themselves can't remember either. That's why I infinitely prefer real programming languages for use in data manipulation -- you can look at the code.
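One lightweight remedy, sketched here in plain Python (the `pipeline` helper and the gene-expression example are hypothetical), is to make every transformation a named step so an audit trail travels with the result:

```python
def pipeline(data, steps):
    """Apply named transformation steps in order, keeping an audit log."""
    log = []
    for name, fn in steps:
        data = fn(data)
        log.append("%s -> %d rows" % (name, len(data)))
    return data, log

# Hypothetical example: filter raw measurements while recording each step,
# so anyone (including the experimentalist) can see how the data changed.
raw = [{"gene": "SEPT2", "expr": 3.1}, {"gene": "MARCH1", "expr": -1.0}]
clean, log = pipeline(raw, [
    ("drop negative expression", lambda d: [r for r in d if r["expr"] >= 0]),
])
print(log)
```

The log is the thing a spreadsheet never gives you: a statement of exactly which operations produced the final numbers, in order.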
PowerQuery is integrated into excel, gives a visual editing platform (not purely code), and is quite powerful in terms of linking up to any datasource, manipulating data, and giving you the output you need. All the steps are listed on the right side, and you can click back to any previous step in the transformation.
It's not super popular, but it does serve as a nice middle ground for people in the space who aren't strong in coding.
> Half of the mess that makes excel hell comes from the fact it's too easy to put two tables of data + some random constants on a single sheet and refer to them by H3
This is certainly true, but that is also where almost all the power comes from, its generic nature. I had a startup that tried to reinvent Excel for 6 months and we kept moving closer and closer to the excel "sack of undifferentiated data" paradigm...
That's what Excel Tables are for.
You get a table within your spreadsheet, and instead of using cell references like =SUM(C2:C7), you can use a structured reference like =SUM(DeptSales[Sales Amount]).
I have quite actively tried to avoid excel lately (which means also avoiding to learn new tricks...), and I am not that familiar with tables. Do the tables:
1. Enforce same data type per column. I.e., if you have a number in C4, you can't enter text into C5?
2. Enforce that if there is a formula, it is applied identically to each row?
(By enforcing I mean that there is _no_ way around that short of copying the data to new sheet)
Give me those two things (preferably in a sheet context instead of a table context) and decent version control, and I'll reconsider my policy of doing nothing but disposable ad-hoc work in Excel.
Unfortunately the answer is "no" to both #1 and #2.
It does do something close to those. Column data types are automatically carried over to the next row when inserting or appending rows to an existing table. But you're free to override it if you want for a particular cell
Same with formulas. When you enter a formula in one cell, Excel will automatically copy across the whole column. But you're able to undo that auto copy if you really just want the formula in that one cell. It'll also flag cells that have inconsistent (compared to rest of table) formulas.
So, no, it doesn't enforce. But it does encourage.
The warning you receive when a formula is not consistent is a nice feature, and an advanced user will quickly learn to be very sensitive to those warnings, will spot them quickly, and avoid causing them when possible.
To strictly enforce that the formula must be the same in each cell of the column would not be very excel-like, I can't see it being done.
Yep. The problem is that I have this conspiracy theory that Microsoft has designed Excel to be the ultimate booster of the Dunning-Kruger effect, so that most people think they are advanced users [1] while they actually have no clue what they are doing. All the while giving no protection whatsoever against those Dunning-Kruger cases.
Totally. I know about it. The problem is that approximately nobody using excel does. (Yes, finance, data processing, analysts etc. do. Compared to the numbers with excel installed - still ~nobody.)
We (https://sheetjs.com/) almost exclusively deal with those types of complex applications built in spreadsheets
> they could be improved a lot with minimal changes
They can be improved for very specific use cases at the expense of others.
> Making them more database-like and making table data first-class (at least you can make named tables in excel on windows) could be used to push people a bit more towards organised data, without changing how anything works.
There are "database functions" like DAVERAGE, PivotTables and other systems for dealing with more structured data. Getting users to use them is a challenge outside of the scope of the tool. "You can lead a horse to water but you can't make it drink"
> Half of the mess that makes excel hell comes from the fact it's too easy to put two tables of data + some random constants on a single sheet and refer to them by H3.
Half of the reason Excel is so popular is the loose structure and the power that it enables.
> Now it's hard to add more data, hard to move anything, and hard to create space which expands to the next row with existing formulas.
Most of the hairy workbooks start as a solution to a problem at hand and eventually accumulate cruft after people try expanding it, not too dissimilar to other forms of software development.
> Airtable (and Access) implements this idea, but unfortunately sacrifices the generic, free-form spreadsheet along the way.
You either enforce the constraints and weaken the platform, or you give users the power to do what they want.
> You either enforce the constraints and weaken the platform, or you give users the power to do what they want.
I disagree with that. All the pieces are already there. There's nothing to weaken by education. But nothing tells the new users about them. They see the main screen and rarely ever know about multiple sheets. I've seen people using excel at work for years without knowing that. You don't have to take anything away from them, just go: hey, did you know there's a better way?
I know it's possible because I introduced a few people to named ranges, sheets, and data tables. All were happy to apply them later. (On their own)
Clippy, Genie, Plany, Peedy the Parrot, Merlin the Wizard, Milton the Bear, Oscar the Cat, Max the Search Doggie, and all of their other happy friends were cheerfully retired in 2009, and are now merrily frolicking on a nice farm in the countryside with a very loving family. Or so we are told.
My endgame is to build an Excel+Access-like dev tool. I'm working on a relational language as the foundation (http://tablam.org), because Excel has another problem: the Excel UI paradigm (reactive) is not the VBA macro one (almost OO), so new functions don't follow the way of Excel.
I also think a spreadsheet is an interface to a database/source and must not have issues handling gigabytes of data. So internally it should be, say, a SQLite db with the spreadsheet control on top. Make it virtual and reactive and it's done.
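As a rough illustration of that idea (the table name and helper functions are invented for the sketch), Python's built-in sqlite3 can already play the storage role: the grid becomes a windowed query, and a "formula" becomes SQL:

```python
import sqlite3

# Toy "sheet" backed by SQLite: the grid is just a query result, so large
# data stays in the database and the UI only fetches what it can display.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("EU", 120.0), ("EU", 80.0), ("US", 200.0)])

def visible_window(offset, limit):
    """Fetch only the rows the virtual grid needs to draw right now."""
    return con.execute(
        "SELECT region, amount FROM sales LIMIT ? OFFSET ?",
        (limit, offset)).fetchall()

def formula_sum(region):
    """A 'cell formula' expressed as SQL instead of =SUMIF(...)."""
    return con.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?",
        (region,)).fetchone()[0]
```

Scrolling then just shifts the OFFSET, and recomputing a "cell" is re-running its query, which is what makes the virtual/reactive combination plausible.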
> at least you can make named tables in excel on windows
I built a spreadsheet that loaded external data sources (filtered log files from a production system) into Excel tables. I then combined two log files into another table, and created a pivot table from this, which I then filtered (date ranges etc) and analysed.
That worked great at the start - until I came to update the external data and found the pivot table didn't update. Then when I force an update it lost my groupings, filtering etc.
I suggest you use Power Query to load external data. If you do, pivot tables and tables connected to those queries should all update automatically on refresh.
Power Query is really the best feature in Excel to me, eventually users will stop copy pasting from Access to Excel and I'll be happy.
It had its problems, but Apple's Numbers product got this right I feel - it was a kind of blend between desktop publishing and spreadsheets, where you could scatter arbitrary tables of data around a page and then have them refer to each other. I suspect a lot of people who are throwing multiple tables of data in the same sheet are doing so because they want both of them to be visible at once.
In my opinion, "No Code" tools will fill this gap: building real-world apps while staying as simple as Excel.
While Airtable is just a DB as a service, a real no-code tool would ideally have all the basic building blocks:
a. DB as a service (like Airtable)
b. BPM capability -> to add workflows to the data
c. Ui Building to abstract data
d. API integration (may need advanced learning for power users)
Excel is the closest we have come to no-code tools, and it's been around for years.
MS Access actually has a fairly decent way to build apps without really touching code, but it requires understanding relational databases well enough to use it properly. Most (non-technical) people end up using Excel as a database instead.
I haven't looked at Access lately, but when I was forced to use it extensively, there were limits to how far you could push input validation without writing VBA. So I suspect that most Access databases have severe issues with data consistency. But at least the data is structured.
No, I fully understand that this is great for many users and that it cannot be made harder in any way to make whatever mess you want on a single sheet - or excel would die. I only want it easier to not do that. (There's nothing guiding people to better design right now)
That's by design. People don't want to be "guided to better design," they want to get their math done and over with, leaving some trace (the Excel sheet) for the next time they or somebody else needs to revise that math.
I'd rather want it to stay easy, but offer better refactor tools to clean up the model after the fact.
My main complaint against spreadsheets is that, to get a robust and clean model, you must plan it from the start, negating the main value of using a spreadsheet to begin with. When moving data and formulas around, it's too easy to break the formulas and not even notice.
named ranges and DGET "sql-like" queries are in google sheets as well. (or perhaps what you meant is that few people know about these things, and don't use them, so you end up with unreadable messy code...)
Yeah, I meant that they're not right in your face and ready to be used. I know it's hard to force people to use new methods, but if anyone could do it, MS / Excel could.
It's all possible though, if you know where to look (I think excel even has table sheets now?)
I see what you're saying, but this is actually a deliberate usability principle: make the basic features easy to access (visible without clicking into a menu), and put advanced features inside menus. Then basic users who don't need more advanced features can do what they need without getting lost in a super complex UI, and power users can still get to the advanced stuff.
If people want to do advanced stuff with software, they really do need to invest the time to learn where the advanced features of the software are. I think this is reasonable.
I worked with excel for some years early in my career. I did not know how to program back then beyond if else and for loops.
We used it for budget estimation on multi-million projects, up to 100M. I have seen other uses in the corporate world across many departments.
IMHO I think it’s good for specific use cases. But it is routinely abused beyond that.
The key shortcomings of Excel use by non-programmers for key business applications:
- How do you test that your calculation is correct after an update? (The same argument holds for business applications without tests.)
- How do you maintain knowledge of the inner workings without having to follow the arrows around cells? There are probably good practices, but how likely are you to enforce them in your product/team?
- The data/computation is spread across user space. If your app is useful, many people will want to use it. How do you manage updates and bug fixes beyond "it's the user's responsibility"?
- How do you avoid the black-box effect: mission-critical software that no one knows how to touch inside without a full rewrite?
- How do you convince stakeholders that they need to migrate to an adapted solution while they have a working one now, and are often oblivious to the hidden costs of users copy-pasting data / reformatting for hours to fit an existing tool that is no longer adapted?
The data frame structure in R or Python solves a large portion of these issues. Yes, you need training; it's the same for Excel, where avoiding the infernal-machine case requires the same concepts.
Why not train people to program with data frames, with the option to gradually extend to a web app without a full IT project that needs three levels of validation?
Data frames are going to be huge. This year I started using them for a personal project -- python, jupyter notebooks, and pandas -- and was astonished at how easy it was to get running, and how powerful.
I feel like lots of number people, once exposed to it, will realize that there's a place to go after Excel that's worthwhile.
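For anyone curious what that first pandas experience looks like, here is a minimal sketch of spreadsheet-style work in a DataFrame. The project/budget data is invented purely for illustration:

```python
import pandas as pd

# Invented data standing in for a typical spreadsheet of project costs.
df = pd.DataFrame({
    "project": ["A", "B", "C", "D"],
    "budget":  [120, 80, 200, 45],
    "spent":   [100, 95, 150, 40],
})

# Spreadsheet-style work in a few lines: a derived column, a filter, a sort.
df["remaining"] = df["budget"] - df["spent"]
over_budget = df[df["spent"] > df["budget"]]
ranked = df.sort_values("remaining", ascending=False)
```

Each step is an ordinary expression you can re-run, inspect, and put under version control, which is a big part of the "place to go after Excel" appeal.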
There's nothing WRONG with the spreadsheet model, and never has been. It's immensely powerful and useful, and the growth in capability over the life of a product like Excel is remarkable and enviable. Excel is perhaps Microsoft's greatest product.
The problem with it -- and there IS a problem -- is really a problem of applicability. Excel, like Lotus before it, is the first place many people encounter the ability to create their own logical conditions.
For a huge subset of these people, it's also the LAST such tool they learn. And now they have a hammer, so everything is a nail. I don't just mean finance people who live and die by spreadsheets; I've met engineers who would solve problems in Excel macros that would've been better attacked in perl or Python.
The other "problem" is the degree to which a horrifying number of organizations end up depending on very, very complex spreadsheets full of arcane and undocumented formulas and macros, and for which change control is "save-as."
But, again, none of this is a problem with the product itself, or with the idea of spreadsheets.
I've been showing a team of economists working on some economic modeling how to use git and R. They were amazed that they could keep working on a file without creating backup copies, or worrying about team mates overwriting their work. It's a work in progress, but they are eager to learn and understand the value there.
We sell & implement a project management financial metrics tool (supporting earned value analysis; google if curious). A key input is always the actual costs of work done, which has to come from the financial system of record.
A horrifying amount of the time, the actual path is through goofy undocumented Excel sheets, because nobody knows how to extract it natively from $FinSys.
I think there is a tipping point in complexity where Excel needs to be replaced by something else. In the IT world I've seen businesses that have essentially created bespoke ERP applications for themselves by evolving a web of interlinked Excel files over years.
A food distributor I worked for was handling their upstream vendors, warehouse inventories, truck inventories, customer list, order history, deliveries, invoicing, and more this way. These businesses suffer because the "Excel ERP" is constantly breaking or losing data, and no one knows how it really works anymore.
The food distributor eventually migrated to an actual ERP made for their industry, and the amount of effort and stress saved at all levels of the company was massive.
The sheer number of everyday business use cases that can be solved quickly and efficiently with spreadsheets is pretty astounding. Of course, there comes a point when heavily-used spreadsheets become intricate behemoths on which core business functions are run, and then things can get bad in a hurry. The challenge many programmers face is thinking that every business problem is best solved with code, when in reality the majority are solved faster, more efficiently, and more maintainably with a spreadsheet. Then the rest are worth developing "real" solutions for.
One of my first 'real' programming tasks involved replacing an Excel sheet with a simple application.
Because the Excel sheet was doing a lookup in a HUGE dataset (I am talking about a few million rows of raw data) - basically a full-text search in a database with unstructured address data.
The idea was to fuzzy join two datasets, and to make manual joining a bit easier with search functionality.
It worked fine on the dev's machine, but on users' toasters it took an hour to load.
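The fuzzy-join idea can be sketched with the standard library's difflib; the addresses below are invented, and a real tool over millions of rows would need indexing/blocking rather than pairwise comparison (which is exactly why the Excel version was so slow):

```python
import difflib

# Two invented address lists to be matched approximately.
canonical = ["12 Main Street, Springfield", "9 Oak Avenue, Shelbyville"]
messy     = ["12 Main St, Springfield", "9 Oak Ave Shelbyville"]

# For each messy entry, find the closest canonical address.
# Note: this pairwise approach is O(n*m) - fine as a sketch, hopeless
# at millions of rows without blocking or an index.
matches = {
    addr: difflib.get_close_matches(addr, canonical, n=1, cutoff=0.6)
    for addr in messy
}
```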
> The challenge many programmers face is thinking that every business problem is best solved with code, when in reality the majority are solved faster, more efficiently, and more maintainably with a spreadsheet.
It's not like you can't do both. Excel supported VBA macros for a long time. Now it supports JavaScript.
It still supports VBA. I've seen multiple manufacturing facilities which are absolutely reliant on VBA macros to be able to produce and ship their products.
For the last few years I've believed Excel is the most important piece of software written, no hyperbole. The reason is that it's so damned accessible. Software is eating the world, and letting people who don't write "true" code still kinda write code opens up software's productivity and automation benefits for a bunch of use cases and an insane amount of people.
For programmers, the analogy is Python. It's almost never the best tool for any specific thing, but it's a really solid tool for a ton of things. It's easy enough to learn and has enough depth to keep learning. It's easy to prototype a quick answer, and it can not-too-painfully grow into a large complex system. I think that's why you get people from all backgrounds using Python. It enables people who don't know much about software to start writing software.
Excel's the same way, but with an even lower barrier (and probably lower ceiling). It enables people who don't know anything about software to get many of software's benefits. That's huge.
Just like Python can be great for a biologist whose focus is biology, not software, Excel is great for the accountant whose focus is accounting, not software. It makes them a programmer, or at least close enough for many, many purposes.
My biggest problem with Excel as a "solution" is that document control is kind of an afterthought. You've got X number of people in your organisation and they each have these adhoc collections of Excel spreadsheets. Usually the devil is in the details and you'll have one person who has calculated gross revenue one way and another person who has done it another way. The two approaches are never reconciled, but their output will invariably be compared as if they are discussing apples and apples. I've seen so much analysis done that was really just garbage because nobody understands where the data came from and what it means.
It's not really fair to blame Excel -- it's a calculation tool. However your solution space needs to address this problem and I very, very rarely see it happen.
It’s absolutely fair to blame Excel. It obfuscates the code in favor of displaying the calculated values - it’s designed to hide the code - which, along with a number of other design decisions, makes it really hard to build an error-free spreadsheet beyond trivially sized cases.
Further, its design makes reproducible data practices difficult - in contrast to R or Python, which do a lot to separate code from data and let you re-run the same code on new or updated data. Python and R (and other non-spreadsheet tools) encourage practices that keep raw data pristine, with work being done on copies of the data. In contrast, it’s really easy to make mistakes in Excel in ways you’ll never catch. Sorting within filtered columns is a good example. Did you add another column after creating the auto filter? Surprise: data in that column won’t sort with all the other data when you re-sort one of the original columns. Just like that, poof, silent data corruption with no easy way of reverting if the error isn’t caught quickly.
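The "keep raw data pristine, work on copies" practice can be sketched in a few lines of pandas; the data here is invented, and in practice `raw` would come from something like `pd.read_csv` on the original file:

```python
import pandas as pd

# The raw data is loaded once and never mutated (built in memory here;
# in practice it would come from pd.read_csv("raw_data.csv") or similar).
raw = pd.DataFrame({"value": [3, 1, 2]})

# All work happens on a copy, so re-running the script on updated raw
# data reproduces the whole analysis from scratch.
work = raw.copy()
work = work.sort_values("value").reset_index(drop=True)

# The original stays pristine - unlike in-place edits to a spreadsheet.
assert list(raw["value"]) == [3, 1, 2]
```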
I almost cracked the org-mode code. Almost. Used it for some sheets but now I only use it for my journalling. Hopefully it will keep the embers hot until I decide to pick it up for real again.
I feel the same way about Access, too. It doesn't matter whether they're "real programming", it matters that they let real people solve real world problems for themselves, who otherwise would have to pay real programmers real money to develop real applications.
I'm in that category of people that regularly have to unwind the web of Excel or deconstruct the Access databases that grow to rival SAP in terms of intricacy...
... I actually agree with the overall premise that these tools do let non-tech types do real work and I think department power user types should have these things.
But we must also recognize that these tools are a bit like a black hole, too. Not unlike many applications created by professional developers, these users will tend to continually add features and bells and whistles until they pass the event horizon that exists between good judgement and bending these tools to be what they aren't (the most common is turning Excel into a database). On one side of that horizon you can pull back and reasonably find better technology to implement sophisticated capabilities; on the other, you start to lose efficiency as the errors and issues with the misuse of these systems overwhelm the benefits. Worst part is... you never really know when you've passed through that horizon... until you're spaghettified, that is...
So in-house technologists need to be aware of these things and give good support, and then also provide the technical judgement on when these approaches start to break.
I once built a Rails app that translated a giant Google Sheets document into a web application.
What I found really fascinating was that:
- 95% of my development time was spent building CRUD, access control, and UI that was already more-or-less provided for free in Sheets.
- 5% of my development time was spent writing unit tests and services that implemented the actual logic and formulae in the spreadsheet. Even though this was the tricky, business-critical "thinking carefully" part, it was also the least time consuming.
It just goes to show, even compared with Rails, applications like Excel and Sheets give you a LOT for free, out of the box, accessible to everyone.
Has Access solved the corruption issue that plagued it for tens of years? While you can connect it to a "real" database engine trivially, a common use-case is via a single MDB file that could become damaged.
You could damage the MDB a number of ways including:
- Leave Access running before shutdown (or power loss)
- Intermittent network outage to the file server
- Multiple users trying to access the same MDB (or anti-virus scans/locks, even on another user's machine)
The biggest headache though with Microsoft Access was never the product itself. It was that the product didn't really have a natural evolution. You'd start with a single MDB/single employee, but one day you'd need two employees or more (and security, and more tables, and this and that), and while you could migrate the MDB into a real database engine ($$$) and use Access as the front end, the record locking was funky and scaled poorly (plus control was limited).
The whole product felt a bit like a mouse-trap. A nice shiny piece of cheese, that genuinely tasted good/worked well, but as soon as you tried to move it SNAP. I don't dislike Access, but it was always painful when a business outgrew it (whereas Fortune 500 companies live and die on Excel).
Access and FileMaker made sense before the advent of open source, specifically Rails, Django, PostgreSQL, and MySQL. Both products make little sense now.
These are much harder for non-programmers to use than Access. With Access, almost completely non-technical people can set up their own database and make the queries they need to answer their own questions. With Postgres accessed from a general-purpose programming language, non-technical people need to hire someone to help them with every basic task.
As far as I can tell Access has nothing to do with making websites, so it’s unclear what Django or Rails has to do with anything.
> As far as I can tell Access has nothing to do with making websites, so it’s unclear what Django or Rails has to do with anything.
You can make apps with Django and Rails. They're not just "websites". Web apps are also still useful even when they're not public facing.
> These are much harder for non-programmers to use than Access. With Access, almost completely non-technical people can set up their own database and make the queries they need to answer their own questions.
That's the thing. Access and FileMaker aren't typically run by non-technical people. Yes, they aren't initially programmers, but they're usually technical people. FileMaker & Access users, whether they realize it or not, become programmers. I feel that you're confusing Access & FileMaker users with users of Excel.
> With Postgres accessed from a general-purpose programming language, non-technical people need to hire someone to help them with every basic task.
A relational database is not a big leap from either Access or FileMaker.
I know all of this because I used to work closely with a team of these people, and at some points I've even helped maintain their code.
> For example, my anthropologist parents used Access for analyzing their manually gathered census data for a small rural village.
That's surprising. Most of the time, this is what Excel & Wordpress are used for by non-techies, since both FileMaker and Access feel daunting to most of them. Maybe this is exclusive to museums? This is anecdotal, but I've worked in a lot of different industries, and in all of them everyone maintaining FileMaker or Access were also knowledgeable enough to code in those platforms ie. they were techies before they started using FM or Access
Yeah, I hear this a lot, but it really does not make a lot of sense. I hear the same about WYSIWYG builders, CMSes, and 'codeless/low-code' platforms, but they are all alive and kicking despite programmers thinking it is much easier to whip up something with Rails/Django.
Those platforms are very far removed from 'the normal user'; you cannot just install ONE (!) installer file and then open a visual drag & drop editor and whip up what you want. In the worst case you have to install several things and some of this need to be configured; you already lost 99% of the business people there.
You forget 'little things' like: they have to know HTML/CSS/JS as well, or at least know how to add and work with Bootstrap. For business use you need Bootstrap controls; you need to know how to add and integrate them (people are not going to understand what it means to just 'bind to the onclick event'), while in the aforementioned packages that is just drag & drop. Installing new controls is too.
Then, if they got as far that they run into having to securely deploy it somewhere. It is all just too hard.
The web made them less used - not because of Rails/Django, but because, legacy-wise, they are less well suited to the web. Still, there is a huge market for them, more and more tools are appearing that solve some of these issues for the web, and quite a substantial group will choose them over hiring 'real coders' to make things quickly. There are plenty of web shops (with big clients) doing their db work with FileMaker (and FileMaker does support the web).
Citizen developers in an organization are not a bad thing. But it is important to properly support them, with the right tools and training, but especially when the harder problems need to be addressed by specialists.
I second the "right training" part. I think any organization working with any kind of data should invest in making all their office employees Excel power users.
Wrt. right tools, I think Excel is the right tool. It may lack safety affordances, but it does its job cheaply and efficiently (and you already paid for it since you probably need to open Word documents anyway).
As for "harder problems addressed by specialists" part, it happens, but I'd argue not that often. Are they using Excel to half-ass CFD sims of their rocket engine nozzles? Sure, they probably could use a proper software for that. Are they doing their own specific munging of customer or process data? Their workflow is probably too specific and too fluid for it to be shoehorned into a "serious" application without a tremendous loss of productivity.
It's interesting how often people's response to the manifest problems caused by people using Excel in unfortunate ways is to either suggest tools with much steeper learning curves or to suggest additional Excel features. 99% of these problems can be solved by proper training and standards appropriate to the problem and context at hand.
For instance:
- Always separate and label inputs, calculations, and outputs.
- Document where source data has come from, where a cell has had an ad-hoc adjustment made, and what formulas do.
- Use some type of version control and don't keep loads of concurrent versions floating around on email and local hard drives.
- If the spreadsheet loads data from external sources, try to make that load automatic and live to prevent staleness.
- Apply consistent formatting rules.
- If data is tabular, put it in an Excel table. If data is tabular, we always run the same queries on it, and it is large, then we move it to a database - but that rarely happens.
- Make it clear who owns each spreadsheet and is responsible for keeping its data and its functions/formulas working.
Do all of this first, only then start thinking of replacing Excel with something else.
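For the "move it to a database" case, a minimal sketch of what that graduation can look like, using Python's built-in sqlite3 module; the sales table and its columns are invented for illustration:

```python
import sqlite3

# An in-memory SQLite database standing in for the database a heavily
# queried spreadsheet table would graduate to.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("East", 100.0), ("West", 250.0), ("East", 175.0)],
)

# The "same query we always run" becomes a reusable, auditable statement
# instead of a formula buried somewhere in a workbook.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```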
The hardest part of everything I do in my life is making other people follow rules, or any kind of guidelines. If you can get even one of the rules you described consistently followed by an organization of 10 people for 6 months... please tell me how you did it, because then I am going to be a millionaire.
I am the team lead of a small dev team; if you force people to follow rules they will start following them maliciously, up to the point where no work gets done but your spreadsheets are perfect.
If we lived in a world where your suggestions were followed Excel would indeed be an OK tool. Here in the real world however Excel has just the right combination of power and usability to shoot off every left foot in a five cube radius, and frequently does.
Note that the reason why Excel is popular is precisely because people "don't try to fix the user", but instead develop software solutions tailored to address (author's misunderstanding of) a specific use case users have. The problem is, the specifics of that use case changes much faster than the dedicated software can be updated.
Processes that are thoroughly understood are easy to automate. Those that aren't, require competent people with flexible tools. Excel is a flexible tool. Competence can be gained through training.
In our context, we use Excel as a team in a work context where our rules make our specific tasks easier. It is much easier for people joining who are already proficient with Excel to follow some formatting guidelines than to invent a custom tool and learn to use that.
Hmm, I'm not sure about that! Formatting in Python, sure - because it's not designed for outputting colours and typefaces. But the popular Python package Pandas makes Excel-like tasks a breeze. I'll often help colleagues who can't code to do something with their data that is nothing more than a single "groupby" or "apply" expression in Python+Pandas.
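As a concrete (invented) illustration of that kind of one-expression help - the departments and hours below are made up:

```python
import pandas as pd

# Invented data: the kind of sheet a colleague might bring over.
df = pd.DataFrame({
    "department": ["ops", "ops", "sales", "sales", "sales"],
    "hours":      [5, 3, 8, 2, 6],
})

# The whole task is often a single expression:
hours_per_dept = df.groupby("department")["hours"].sum()
```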
I always like these threads where programmers talk about tools like Excel and say they’re fine for small stuff but grow out of hand, as if 90% of codebases built by professional programmers aren’t exactly the same.
But professional programmers (emphasis on professional) are well aware of how codebases can grow out of hand and employ entire categories of theories, strategies and tools to keep their software maintainable, even when under delivery/time pressure. Excel doesn't really have much in this category of theories and tools.
It's unfortunately not available in the Mac client, but Excel also has an amazingly powerful and useful tool called PowerPivot/Get & Transform[1].
You can connect to external data sources such as CSV files, Excel files, any database with an ODBC connector, APIs, all kinds of neat things. Ingest that data into your Excel file, create and enforce constraints and relationships within your data model, get incredibly robust data munging and analysis functionality[2], and then expose all of it as a PivotTable. The functionality itself bypasses Excel's limitations such as max data size and computationally inefficient formula implementations, as it uses a separate storage and computation engine: a highly compressed columnar data store.
The PowerPivot work is also mostly transferable to PowerBI and Analysis Services. Taken together, you've got all the tools to apply progressive enhancements for end users. Let them create their Excel-based stuff. When it starts to become more mission critical, non-performant, or error-prone, provide them with support to clean it up in the ways that video from Joel Spolsky mentions. When it hits growing pains from that, refactor it further to leverage the built-in data modeling capabilities to enforce some integrity, automation, and potential data volume scaling. And when you hit growing pains with that, or the underlying process/usage finally matures to a state of stability, or you need to address security/access/audit-ability concerns, transfer that data model and everything to either PowerBI or an Analysis Server deployment and migrate the management to IT.
I don't see it in practice very often, but it's an incredibly effective and frictionless way to both enable your business users to innovate their work processes via the tools they know, while also providing a non-disruptive way to mitigate your business becoming reliant on apocryphal spreadsheets being passed around to support critical business functions. And by design alleviates many of the causes of "automation" projects failing.
I used to work at a bank that built a huge integrated application development and deployment stack on Python. One aspect of it was a custom library implementing a Directed Acyclic Graph, basically a way to link up properties so that when the result of one property changed, any dependent properties on the DAG would automatically recalculate. This was explicitly implemented so that we could directly translate the logic from Excel sheets into Python code.
A lot of our projects were built to accept an excel sheet as input, then generate another Excel sheet as the output. The idea was the logic and data manipulation and transformation should be in source control, but the Quants would still have Excel available to build graphs, pivot tables and do statistical analysis on the results.
The DAG also turned out to be pretty handy for implementing web applications and all sorts of other apps though.
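As a toy illustration of that reactive-DAG idea - a from-scratch sketch, not the bank's actual library - here is a minimal "cell" whose cached result is invalidated and recomputed when any of its inputs change:

```python
class Cell:
    """A toy spreadsheet-style cell: a plain value or a formula over
    other cells. Changing an input marks every dependent cell dirty,
    so its result is recomputed on next access - like Excel recalc."""

    def __init__(self, value=None, formula=None, inputs=()):
        self._value = value
        self._formula = formula
        self._inputs = list(inputs)
        self._dependents = []
        self._cache = None
        self._dirty = True
        for cell in self._inputs:
            cell._dependents.append(self)

    def get(self):
        if self._formula is None:
            return self._value
        if self._dirty:
            self._cache = self._formula(*(c.get() for c in self._inputs))
            self._dirty = False
        return self._cache

    def set(self, value):
        self._value = value
        self._invalidate_dependents()

    def _invalidate_dependents(self):
        # Propagate dirtiness down the DAG, stopping at already-dirty nodes.
        for dep in self._dependents:
            if not dep._dirty:
                dep._dirty = True
                dep._invalidate_dependents()


# Usage: c recalculates automatically when a changes, like a spreadsheet.
a = Cell(value=2)
b = Cell(value=3)
c = Cell(formula=lambda x, y: x + y, inputs=[a, b])
```

The same structure is what makes the pattern reusable beyond spreadsheet translation - web apps and other reactive UIs are just more consumers of the graph.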
We also found using Excel as an output to be a powerful design pattern. Excel is a great language for explaining the flow of calculations. We used to have WPF screens where a portfolio manager / trader / analyst could right-click on a number and select "show details" and he/she would get an Excel workbook where that same number would be selected and they could see the derivation by following the cell's precedents. It was not always feasible to show a complete derivation, but we'd try to show a useful amount of elaboration. This was a heavily used and loved feature, often leading to the users making edits or additions to these detail sheets and giving it back to us as a spec. Building the 'details' Excel books programmatically was a lot of work, but part of the payoff was that we also used them as a nightly check on our primary system, i.e., we'd automatically run 'show details' on everything and make sure the 'detail' result matches the primary system result. That comparison caught many bugs. Kent Beck: "More important than tests vs types is the principle of double checking. If you say something twice in independently derived ways, you're more likely to be correct than if you just say it once. Tests are a form of double checking. So are types."
I love spreadsheets. When I'm working out a method for analyzing some dataset, I just hack it out in Calc. Or if it gets too big, Excel. Once I've blocked it out, I move to MySQL. But I can do any database operation in Excel, or even Calc. Maybe not efficiently, but well enough.
But I've also learned that basic Linux tools (grep, sed, tr, awk, sort, uniq, etc) are far more efficient for cleaning and preparing data for spreadsheets or databases.
And then I use spreadsheets for final analytic steps, stuff that SQL doesn't do efficiently, and for charting. I could learn Python and R, I suppose, but SQL and Calc/Excel have always been enough. And Gnumeric, sometimes, because it can do some amazing charts that the others don't.
I was thinking of min-max. You can select an area of cells as a series, and so display numerous min-max lines. Also matrix and contour. Those aren't in Calc 5. Maybe they're in Excel now. I have an old Windows 7 version.
Excel is powerful. It does so many things really well. Sharing data between sheets in real-time isn’t one of them though (maybe I just haven’t found the right tool, so we went ahead and built one).
I worked at a large fintech company a few years ago. During my interview Excel popped up as a topic, the interviewer quipped “Excel is terrible!”, he was referencing the heavy reliance of his customers on spreadsheets rather than the “better” functionality they offered via their platform. A few years before I would have agreed, but Excel really is amazing, it allows almost anyone to just-get-things-done.
There are plenty of cases where Excel projects had grown to the point where specialised software would be better for the business and the users... and if you’re in an industry with heavy and advanced usage of excel (like the fintech space), it’s a great place to mine solid ideas for a startup. Just don’t try to recreate everything Excel does! Focus on areas it doesn’t excel in.
Having worked a decent amount in Excel, here are my key issues:
- VBA macros and sheets are two distinct paradigms. Within an Excel file, it is not always clear how the two interact, and working it out requires meaningful digging.
- Once a numerical model has been calculated in a sheet, it is difficult to scale it. Yes, it is possible to copy sheets, but if you make a change or want to do something hundreds of times, it's a pain.
- Data integrity is a problem. Oops, I pressed the wrong key and deleted some data. Oh shit, I don't have Git to compare what was changed.
For a long while, my dad's job was as a consultant, and his job was to convert Lotus spreadsheets to Excel spreadsheets (and sometimes small databases). That was it. The spreadsheets he worked on were responsible for moving millions and millions of dollars around.
One of his jobs was for a major movie studio updating their sheets that calculated royalty payments. Every actor that ever worked on a show distributed by that studio relied on the accuracy of that single spreadsheet for their "money mailers".
Funnily enough, Excel is a programming language. I don't mean VB Script or macros. I mean the equations, ranges, and constraints - that's a programming language right there.
A cell is simply a computational variable (as opposed to the notion of variables in lambda calculus). A named range is a data structure (a struct). The rest is term rewriting.
Isn’t the most relevant distinction when comparing a spreadsheet to programming that it’s non-procedural? Writing spreadsheets reminds me of writing Makefiles. There’s no start or end, just a bunch of declarations about relationships. And it just... happens.
Reactive programs don't have to be declarative. You can express a reactive program imperatively.
The defining feature of a reactive program is that relationships between inputs and outputs are automatically tracked, and changing an input will automatically update the dependent outputs.
Facebook had a whole experimental language (now abandoned) for writing reactive programs with imperative code: http://skiplang.com/
The portion of comment I was replying to was "no start or end" . This is reactive programming.
Having said that, other than Excel and some cute UI things, I've not seen much utility in reactive paradigms. Hence I am bearish on reactive programming. Someone please correct me.
And that's why, when you start to use VBA or JS in Excel, it's a fail.
Excel power users (I am not one of them) know how to solve almost every problem they are presented with using only Excel - like SQL with relational data, with orders of magnitude better performance.
The main remaining issue is scalability, when the dataset gets too big.
At my company we've built a slack bot that sits in front of a google sheet to manage transactions (eg. borrows) in our office library.[1]
I'd love to know whether this could be a Glide app, or whether we'd run into a technical limitation where we can't have users scan a book's ISBN and have that do a lookup in the sheet.
Few minor notes:
- You've featured an app "Tournament of Books" but there are no links there. Both the title and the mock text conversation image can be links to that app
- Interspersed fixed width font is jarring
(OP concludes ‘programming should be more like Excel’. Glide’s mission is to apply spreadsheet programming to software in general. This is the most relevant comment, downvoters be darned!)
Huh, I have the exact opposite feeling. If a SaaS app can be boiled down to a specific spreadsheet + web UI bolted on top, it's making money by wasting everyone's time.
And as I grow older I get more and more convinced that Excel is overrated. I ask myself, WHY is Excel so popular? And I keep coming back to the same answer: because it's live. But we do have this concept in the programming community (see links 1 and 2) and it's not theoretical. When I teach Python I see people quickly understand everything that can be made live. Do this: create a variable, then create a function, then let them play with the variable and see how the output changes. This is also why SQL is so popular: because you "declare" it and forget it and you see the changes live.
Don't get me wrong, I love Excel, but it has its problems and it definitely is not a solution to everything data.
One other thing in defense of spreadsheets is that they are a tremendous introduction to data manipulation and management for people who don’t think of themselves as being ‘programmers’. Purpose built SaaS apps pre-solve problems, and that can be great, but spreadsheets allow users to become owners and there is a lot of growing that can start from that place.
That's all very well and good, but can I make a plea for organisations to stop using spreadsheets to produce tabled documents.
This seems to be endemic for information security surveys and would be fine for questionnaires with preset answers, but when a question starts "Describe..." and wants a full description of your software development cycle from an InfoSec perspective, it's painful edit-wise - even if you can copy/paste an answer from a different spreadsheet completed earlier.
As the guy on the other end who has to read the responses; it's not fun here either. Though, there's movement in the SaaS space that looks to solve a lot of this. Proper text input fields, automated yearly requests for update, audit logging and notifications (x party has responded, y party asks for clarification on question z, etc).
Expensive though, but enterprise solutions always are...
> From there, you can calculate literally anything, and transmit not only the results of those calculations, but the actual environment itself, to anyone in the world, and expect that if they have a computer, they can replicate your results.
...unless they use Excel in a different language than you, in which case it can't interpret the commands or sometimes even the syntax. Like with the german version of Excel which requires ; instead of , as separator for arguments, among other things. This alone makes me prefer pretty much everything else.
Your strength is your weakness is your strength. Spreadsheets excel at ad-hoc things because they aren't systems; they don't have restrictions and limitations on what data you can pull in, where you store it, or what you calculate. They are horrible for all the same reasons: people end up spending tons of time semi-manually joining up data from different sheets and copy-pasting numbers from emails into calculations. The sheets take on a life of their own as they are emailed around between people, and errors crop up: "oh, I think you have the version of the sheet that doesn't work in October [real example], that got fixed in a version that was sent to Peter, check the mail stream to see who the original was sent to..."
And they grow over time without any obvious way of optimizing old dependent calculations, so you end up with sheets that do company-wide economic calculations but take 10 minutes just to open, so IT ends up setting up batch jobs to open the sheet every day, copy in new data, and generate a PDF of the results [also a real example].
Of course you can just put systems and rules around your spreadsheet practices and start adding passwords and accompanying documentation of "what you must do when using this document". But when you get into these situations it's almost always a better decision to set up an actual system to handle your data.
While it’s possible to “easily” write a computer program that is hard to understand later, the gap is much wider with Excel: many more people can create stuff, and the risk of being left with an undecipherable mess seems greater.
I am also still, in 2019, risk-averse about these complex file formats, even if they are widely used. I can open a real computer program in dozens of editors and run it on many platforms. Yet with Microsoft's own SharePoint solution, half the time Excel files can't be opened: my web browser just hangs and then I have to download and open the file in Excel. That's just crap.
For me, spreadsheets are also frustrating because they can make very poor use of space (and this happens in some other user interfaces too). I shouldn’t be forced to see only 3 numbers at once on a giant display just because they happen to be in cells that are a million miles away and separated by useless empty/unused cells. This feels like what you’d see if web sites decided to just dump their raw database tables onto the screen instead of presenting the data usefully. My theory is that people are just really adaptable, to an unpleasant degree; I’m amazed when I see people squint and tolerate absurd truncation of data and other unhelpful displays.
In cases where Data Science is a replacement for astrology ("just tell me some reassuring mumbo jumbo to alleviate the burden of decision making"), the inevitable bugs may be irrelevant.
I appreciate the theoretical power of Excel. I just don't see how keeping logic in Spreadsheets bug free could be possible. The logic is hidden away and hard to get to. It is already hard enough to debug classical code. Spreadsheets seem impossible.
I worked in spreadsheet DevOps at a financial company. We had a Python script to spin up Windows VMs with the exact version of Windows that was known to be good and the exact version of Excel, and then automatically kick off the computation.
The reason for this was that someone once got the order of execution wrong and screwed up a couple billion dollars' worth of trades.
Excel is great until you have to maintain it. Same reason why you don't let people build bridges with Lego.
Is there anything like a spreadsheet to Python converter, with cells becoming made up variable names? I wonder if it’d be useful for quickly noodling together a set of expressions, then “baking” it into code.
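I don't know of a standard tool for this, but the core of the idea is small enough to sketch. The snippet below is a hypothetical toy: cells are given as Python expressions keyed by made-up cell names (a real converter would parse actual Excel formulas, e.g. read via openpyxl), and "baking" just evaluates them in dependency order:

```python
# Toy sketch of "baking" spreadsheet cells into Python variables.
# Cells are Python expressions that may reference other cells by name.
cells = {
    "A1": "10",
    "A2": "32",
    "B1": "A1 + A2",      # like =A1+A2
    "B2": "B1 * 2",       # like =B1*2
}

def bake(cells):
    """Evaluate cells in dependency order; return {cell_name: value}."""
    env, pending = {}, dict(cells)
    while pending:
        progressed = False
        for name, expr in list(pending.items()):
            try:
                env[name] = eval(expr, {"__builtins__": {}}, env)
            except NameError:
                continue  # depends on a cell not evaluated yet
            del pending[name]
            progressed = True
        if not progressed:
            raise ValueError(f"circular or unresolvable cells: {list(pending)}")
    return env

print(bake(cells)["B2"])  # 84
```

A real version would also need to invent readable variable names and handle ranges, which is where it would get hard.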
One major drawback of using spreadsheets is that it is harder to spot calculation mistakes. A famous/infamous recent example is the Reinhart and Rogoff incident in 2013, involving their heavily cited paper "Growth in a Time of Debt". When a research student named Thomas Herndon tried to replicate the paper's findings, he found, after careful inspection of the Excel spreadsheet the two Harvard professors used, that the spreadsheet was riddled with errors. Some of them were even rather trivial in principle, such as mistaking the sum of one column for the sum of another. The incident was widely reported in many business newspapers, and Paul Krugman commented on it in the New York Times.
Had the steps of the computations been written as a program, it might have been easier for the authors to discover their mistakes. With a spreadsheet, if you put formulae inside data cells, you need to click all these cells one by one to see if the formulae have been input correctly. This is tedious.
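As a hedged illustration (with invented numbers, not the actual Reinhart-Rogoff data), the wrong-column mistake is exactly the kind of thing a code review can catch, because the intent is written down:

```python
# Summing the wrong column is easy to miss in =SUM(B2:B4) vs =SUM(C2:C4),
# but hard to miss when each aggregate is a named, reviewable line.
rows = [
    {"country": "A", "growth": 2.1, "debt": 60.0},
    {"country": "B", "growth": 1.4, "debt": 95.0},
    {"country": "C", "growth": 0.8, "debt": 120.0},
]

total_growth = sum(r["growth"] for r in rows)
total_debt = sum(r["debt"] for r in rows)

print(round(total_growth, 1), total_debt)  # 4.3 275.0
```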
>With a spreadsheet, if you put formulae inside data cells, you need to click all these cells one by one to see if the formulae have been input correctly. This is tedious.
Not really: you can choose to show formulas in the settings. (Reading them without making a mess of the formatting/size of the cells is another thing, but you can always make a copy and inspect the formulas there, "ruining" the formatting.)
It is still tedious, actually very tedious, but about as tedious as reviewing source code.
If spreadsheets work for your use-case or industry, then by all means keep using them. Occasionally, I'll use one myself for some business purpose. However...the problem with spreadsheets is that people routinely use them for the wrong reasons. Those wrong reasons are the cause of the spreadsheet hate. It's not spreadsheets themselves that cause this, it's people with limited knowledge and limited ambition to learn how to use the correct tool for a task.
The saying "when all you have is a hammer, everything looks like a nail" applies.
Often, someone using spreadsheets will suspect that their task may be better accomplished in some manner of programming. This leads them to try to do it. If they are in this situation, chances are, they are not a professional programmer. They are going to have a poor experience, and likely fail. This is due to sub-par or non-existent software engineering skills, and not due to spreadsheets being the correct tool for every task.
Try to find duplicates or do a VLOOKUP in a table/sheet with more than 10,000 rows.
It will only look at the first part of your data, and skip a lot if you don’t remember to sort your columns first.
Yes, spreadsheets are a great start. I often ask people to "prototype" in Excel; after that, SQL rules.
I don't know what you mean by "find duplicates", but VLOOKUP has an option to specify an exact match, so you can use it even if the data is not sorted (MATCH does as well).
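For what it's worth, both tasks are also only a few lines outside the spreadsheet; a sketch on made-up data:

```python
from collections import Counter

# Made-up rows: (key, value) pairs, deliberately unsorted.
rows = [("b", 2), ("a", 1), ("c", 3), ("a", 9)]

# "Find duplicates": keys that appear more than once.
counts = Counter(k for k, _ in rows)
duplicates = [k for k, n in counts.items() if n > 1]
print(duplicates)  # ['a']

# Exact-match lookup (like VLOOKUP with range_lookup = FALSE):
# no sorting required, and a dict scales well past 10,000 rows.
table = dict(rows)       # later pairs win, a last-match rule
print(table.get("c"))    # 3
print(table.get("z"))    # None instead of #N/A
```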
This morning I was just thinking about how spreadsheets are a great tool that would be worth a blog post.
In the past I built server dashboards on Google Sheets that consume live data through JSON and automatically refresh.
I wrote a real-time labeling system for ML to categorize images in less than a day.
I even wrote a behavioral reinforcement system to build good habits that shows in real time the results on an Android widget.
A spreadsheet is a relatively simple tool that most people understand (and often undervalue). It has a lot of flexibility, lets you build "good enough" interfaces for internal tools in a fraction of the time it takes to build a web UI, and is easy to iterate on.
You get a lot of value for the time invested in a Google Sheet. They are free, and from a user perspective they do not require any infrastructure to run.
Spreadsheets are the best -- specifically Google Sheets. The addition of QUERY really allows Sheets to be superior to Excel, at least for the things I use it for.
Combine that with an IMPORTJSON script, and you can create really clean, structured data sets for nearly anything you can dream up. I've been tracking movies and TV shows, for instance. Combining a few of the APIs out there to get the best possible results really makes the set shine.
I don't code beyond modifying basic things, but I rarely find something I cannot do with a spreadsheet.
I've been doing Excel and VBA work for my internship this summer, and I have to say, I have an immense amount of respect for the engineers behind Excel. Excel is incredibly extensible, it has very nice built-in error handling for mangled data (but horrible exception handling in VBA), and overall it is simply a joy working with it.
Sure, it has some pain points, but for my use case (financial analysis), it complements Bloomberg very well. Bloomberg's Excel add-in is very well-engineered, and there is even a way to hook into Bloomberg through VBA. Cheers to the MSFT devs who crafted this stuff.
I think spreadsheet usage can be done in good ways, but not when it is done in Excel, because Excel is a typical MS lock-in product.
Things that should be obvious to do when having something important in a spreadsheet:
* Use well-labeled cells (by row name, column name, the cell above, below, or to the left, whatever) for intermediate results.
* Add mistake-checking formulas that show you a warning when there seems to be a calculation or entry mistake, like assertions in normal programming.
* Use plain text formats and use version control. Do not run around with 10 copies of the file named after what the index of the copy is.
* Have explanations of the formulas and the reasoning behind them somewhere, maybe even best inside the spreadsheet.
* Only use macros if there is no other way. Macros simply break things, at least in Excel. Only a few days ago I witnessed a case where someone simply could not run a macro, even after reinstalling Excel, using an MS cleaning tool to "completely" remove Excel, and various other attempts. And it only happened on that person's machine. Macros are not as reliable as plain formulas.
* Have your data elsewhere as well. Do not use a spreadsheet as your single database.
And those are only the few things that come to my mind, although I am not a daily spreadsheet user. More frequent users might have many more guidelines.
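The assertions bullet can also be enforced from outside the sheet: export to a plain-text format (which helps with the version-control bullet too) and run sanity checks over it. A sketch with invented column names and figures:

```python
import csv
import io

# Pretend this came from exporting the sheet; the columns are made up.
exported = """item,qty,unit_price,total
widget,2,3.50,7.00
gadget,1,10.00,10.00
"""

checked = 0
for row in csv.DictReader(io.StringIO(exported)):
    qty = float(row["qty"])
    price = float(row["unit_price"])
    total = float(row["total"])
    # Assertion-style check: the 'total' column must equal qty * unit_price.
    assert abs(qty * price - total) < 1e-9, f"inconsistent row: {row}"
    checked += 1

print(f"{checked} rows consistent")  # 2 rows consistent
```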
I also recommend people to take a look at Emacs Org mode spreadsheets in combination with various other Org mode functionality. Those can be quite neat.
I agree, as someone who spent quite some time building fancy admin dashboards for our team, in my most recent project I chose to abandon this and just write an Excel exporter and importer so they could edit the data in a familiar tool. The flexibility of this was huge for our team - everyone in an office has some experience of Excel and they were more productive with it than they ever were with my previous data grids and clunky web based editors.
Just don't try to use dates before January 1, 1900 (or 1904 on Mac). You can work with dates in Excel 1,000 years in the future, but if you try to do that for dates from the 1800s it will completely screw them up. I don't understand why this problem still exists today. If you work in a museum, or anywhere else where you deal with old dates, you have to constantly be on the lookout for this "gotcha."
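If you do have to handle historical dates, one workaround is to keep them as text in the sheet and do real date arithmetic elsewhere; Python's datetime, for example, has no 1900 floor (the specific dates below are just illustrative):

```python
from datetime import date

# A 19th-century date that Excel cannot represent as a date value.
opened = date(1857, 5, 24)
moved = date(1990, 3, 1)

# Date arithmetic works fine across the 1900 boundary.
print((moved - opened).days)
```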
The spreadsheet is the original 'serverless' programming environment. It allows relatively untrained people to augment their brains to levels that they would otherwise definitely not achieve. This has obvious limitations but I still believe that the spreadsheet is the 'killer app' for all of computing, back when VisiCalc was first released the sales of small computers went up in an incredible way.
Serverless is really the wrong term to use here. That refers to a specific way of handling cloud infrastructure (the point being that there are still servers, you just don't care)
Excel is used for client-side-only programs. There are no servers at all. All programs were originally client-side only, because networks didn't exist.
Google sheets (and other google docs) can be programmed in "serverless" JavaScript that runs in the cloud somewhere. It's hellishly slow making sheets API calls, though. Feels like some kind of remote procedure call. (Slower than driving Excel via OLE Automation even, and that's saying something!) Then it times out on a wall clock (not cpu time) limit, and breaks if you take too long.
A CS grad student friend of mine was in a programming language class, and the instructor was lecturing about visual programming languages, and claimed that there weren't any widely used visual programming languages. (This was in the late 80's, but some people are still under the same impression.)
He raised his hand and pointed out that spreadsheets qualified as visual programming languages, and were pretty darn common.
They're quite visual and popular because of their 2D spatial nature, relative and absolute 2D addressing modes, declarative functions and constraints, visual presentation of live directly manipulatable data, fonts, text attributes, background and foreground colors, lines, patterns, etc. Some even support procedural scripting languages whose statements are written in columns of cells.
Maybe "real programmers" would have accepted spreadsheets more readily had Lotus named their product "Lotus 012"? (But then normal people would have hated it!)
What in spreadsheets is not testable? It's trivial to write a test that opens the spreadsheet in Excel and tests any formula calculations via the Excel Automation API.
You store them in the same place where you place your other tests. It's just code. You version the data as you version any other test data. Never used Travis myself; I guess it's some kind of CI.
Many people do this. It's perfectly normal.
Of course if you don't even use Excel in the first place, I don't know why you're asking.
A bit short on details. What exactly was it about this spreadsheet that made it hard to recreate in a programming language? Personally I find this hard to believe unless one of the key deliverables is "pretty much every parameter to this calculation must be tweakable by the user with the results being visualised instantly".
Spreadsheets are great tools, but they are not suitable for every kind of problem. I use Excel daily too, but I also use a lot of other programming languages as well.
Most of the time, complex calculations are easier to do in a programming language than in a spreadsheet, but viewing them in anything else than a spreadsheet is a royal pain in the butt.
This is one of the reasons I use Google Sheets regularly for such tasks at work. The API allows you to do all the hard work in your language of choice and then post the output into a spreadsheet for viewing or other manipulation better suited to spreadsheet-style interaction. We have implemented all sorts of ad-hoc interactive sheets that can be viewed by anyone by emailing the shared URL around. It's a powerful tool for transcending the code-to-reporting boundary.
Love them or hate them, I think Google has done a great thing here. I'll be sad when they decide to suddenly drop support for it ;)
My first job out of University was working for a large US custodial bank (not in the IT dept). We were responsible for monitoring corporate actions happening on the securities held by their clients and accounting for the proceeds.
Most of the work was done in some old IBM 390 terminal emulators. The workflow was generally to scrape a bunch of information from the terminal into Excel, reformat it and figure out any discrepancies, then copy some adjustments into another workbook which would automatically enter the data into some screen or other in the terminal emulator to fix the discrepancies.
Someone had built a COM object which could be scripted quite easily in Office VBA. I found a few of the scripts they had written for it hadn't been protected and quickly learned to write my own.
Spreadsheets are awesome, and I think that another thing people are dismissing is grep, awk and friends.
I get it that both spreadsheets and unix tools might not be trendy, but they fulfill the Taco Bell Programming idea [1]; they provide simple, scalable, and efficient solutions.
Summing data from column B for each unique value in column A would be as easy as
`awk -F ',' '{a[$1] += $2} END{for (i in a) print i, a[i]}' my-input-file.csv`
Sure, it would take a few tries to get right, but could for example R do something this concise and dynamic that can run in parallel?
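For comparison, roughly the same group-and-sum in Python is a few lines longer but arguably easier to read back later (the sample data here is made up):

```python
import csv
from collections import defaultdict

def sum_by_key(lines):
    """Sum column 2 for each unique value in column 1, over CSV rows."""
    totals = defaultdict(float)
    for key, value, *_ in csv.reader(lines):
        totals[key] += float(value)
    return dict(totals)

# Same job as the awk one-liner, run on inline sample data;
# pass open("my-input-file.csv") instead for a real file.
print(sum_by_key(["a,1", "b,2", "a,3"]))  # {'a': 4.0, 'b': 2.0}
```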
Yes and no. That command is easy to understand and use only if you know what all those symbols do, whereas most Excel functions are written in a way most people will understand without major difficulty.
Grep, awk, and friends are powerful (only) in the right hands, but they are not in the same ballpark as Excel.
We were building our zero-code app creation service when we faced the situation where users needed to add complex logic, calculations, etc., and creating separate draggable modules and buttons for every requirement would not really solve the problem.
So we added draggable blocks and provided autogenerated variables for all elements just like the cell number in Excel. For all complex logic and calculations, the users can simply write a formula the way they write in Excel and complex apps can now be created without any coding language.
I can hardly imagine what is easy in Excel but hard in Python with Pandas, SQLAlchemy/PyTables/whatever, and matplotlib.
What really annoys me about Excel is that they won't replace the fossil VBA with Python, F#, or a new language designed from scratch for exactly this. The VBA environment is fun to touch if you want the feeling of time-travelling back to the years of your childhood, but it feels quite clumsy for actual programming.
Excel's plotting feature also feels fairly weird. I could never make it produce exactly the plot I want. Perhaps that's because I haven't mastered it, but that means it is harder to master than matplotlib.
I'm a professional software developer and I open excel every day.
Let's say someone emails me a list of figures and I want to quickly add them up.
Sure, I could write an incantation in awk, but then I can't see if it's wrong: maybe on one row they 'accidentally' put an extra blank column before the number, or marked it with a letter, or there's an extra tab, or whatever else might cause that.
It's far quicker and less error-prone to pop open Excel, paste it in, and sum the column; most importantly, there is clear visual feedback if that doesn't work. In awk or a programming environment you'd either get an incorrect figure and never know it was incorrect, or you'd get an error (e.g. trying to add a letter and a number) and then have to debug what should have been an instant thing.
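To be fair to the programming side, you can buy back that visual feedback with a few extra lines, surfacing the bad rows instead of silently miscounting (made-up figures):

```python
# Sum a pasted column of figures, but surface the rows that don't parse
# instead of silently skipping them or producing a wrong total.
pasted = ["100", "250", "x 37", "42"]   # one row has stray junk in it

total = 0.0
bad_rows = []
for i, raw in enumerate(pasted, start=1):
    try:
        total += float(raw)
    except ValueError:
        bad_rows.append((i, raw))       # the Excel-style "that looks wrong" signal

print(total)      # 392.0
print(bad_rows)   # [(3, 'x 37')]
```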
Paine has long worked with the European Spreadsheet Risks Interest Group (eusprig.org) and has used category theory to develop a system in Prolog called Excelsior about which he says "Excel lacks features for modular design. Had it such features, as do most programming languages, they would save time, avoid unneeded programming, make mistakes less likely, make code-control easier, help organisations adopt a uniform house style, and open business opportunities in buying and selling spreadsheet modules. I present Excelsior, a system for bringing these benefits to Excel."
Examples of Excelsior:
"Less Excel, More Components: presentation to EuSpRIG 2008 [by] Jocelyn Ireson-Paine":
(This might be a different take on the topic, but just a thought)
I have just dipped my toes into the corporate world and, to be honest, I had not heard the term 'Excel' in the workspace or otherwise. Actually I did, but that was when its major pain point was discussed: online capability. A customer needed an on-the-go Daily Sales Report app. (What I learnt, and it is obvious as well, is that depending on the office/Excel expert to send the report is not ideal.)
My point is, the shift to "the cloud" has already begun right? Freshers today will be up the hierarchy tomorrow and are they really going to use Excel?
This is what I am currently working on: Making programming fluid and easy like spreadsheets (or alternately, elevating the niche occupied by spreadsheets into something that can be called "real programming").
There is LibreOffice Calc, which you can program with Python [1]. The best of both worlds. I recently created a suggestion [2] to also support the Julia language. Hopefully they will consider it.
did you ever "program" in LibreOffice with Python. Last time I checked, the API were veeery thin wrappers around the Java-API itself - I guess that this is hindering adoption (and not not having bindings to X)
Our company had sophisticated mathematical models written for our industry, developed in the late '90s/early 2000s. When they tried to port these models to "a more modern format", i.e. programming-language code, they realised that it was far more complex than they had anticipated, and even small mistakes would be disastrous for the company.
Their solution? Dozens of virtual servers around the world, each one hosting a copy of Excel with a 20-30MB workbook, communicating with the outside world via TCP/IP and COM interface code.
Spreadsheets are great. I don't know any alternative to spreadsheets that does what spreadsheets do so well. And believe me, I have been searching for a long time.
That said, like any other tool, spreadsheets have their perks and their drawbacks. It's important to consider those drawbacks when considering spreadsheets.
My only real problem with spreadsheets is that once they get too big, there's almost no turning back. There's no easy fix if someone screwed up at the beginning.
Even though conventional programming languages are less visual, it's so, so much easier to modify models.
But with that said, I do use spreadsheets for 90% of my daily needs, as far as calculations go. Just type "sheets" into the search bar, and I can start working (google sheets) in 2 seconds. It's even faster than firing up notepad.
Anyone have any advice on how to learn to use Excel well? I've only ever done simple column operations before, but I've seen spreadsheets that do crazy complicated things.
Yes, check out the two books by John Walkenbach, Excel 2013 Formulas and Excel 2013 Power Programming with VBA. Not too much has changed since those versions-- I originally read the 2007 versions of them and it all transfers over because Microsoft is incredibly good about maintaining backwards compatibility.
There was a new app posted here a few weeks ago that combined a Jupyter notebook with a spreadsheet-like automatic recalculation. Does anyone here know what it's called?
I convert cell equations to visual basic so that I don't screw up the cells and so I can share my results without giving away all my code.
There are also methods to animate simple graphics for the purpose of providing high level presentations.
I 'learnt to code' by recording macros and then reworking the syntax to suit my needs. After a while it becomes much less difficult than it appears at first. And I will admit my code isn't a model of perfection, but it gets the job done.
Excel conflates the ideas of data and presentation. This leads to an entire class of headaches that just aren't necessary. It's the desktop application equivalent of the string 'null'.
If the spreadsheet layer (calculations and formatting) was separate from the data layer (types and values) then we could all be happy.
Wrap that up with a UI that wasn't designed at an office in Redmond (or for a web browser) and you'd really have a winner.
Not really, maybe I should write something. It's a thought I frequently have in the depths of Excel Hell. I thought for a long time about why Excel is so frustrating to me. I think fundamentally it comes down to Excel putting the spreadsheet ahead of the data.
If the data was clearly separated from the spreadsheet itself then the logic becomes much easier to reason about and test and the data is no longer susceptible to the Excel data loss.
The display layer really needs to be logically separate from the data. For example if I change a column format from text to number and back I should not lose the leading zeros in the actual data.
Spreadsheet tools are great visual programming environments but they are lousy databases. The data layer in Excel leaves a lot to be desired but there's really no reason a spreadsheet has to be so limited.
Basically I'd like to see some kind of hybrid SQL(database) client/Spreadsheet UI/Pivot Table builder where all three concepts are first class.
The thing that's missing from "Google Docs" is a decent collaborative outliner called "Google Trees", that does to "NLS" and "Frontier" what "Google Sheets" did to "VisiCalc" and "Excel".
And I don't mean "Google Wave", I mean a truly collaborative extensible visually programmable spreadsheet-like outliner with expressions, constraints, absolute and relative xpath-like addressing, and scripting like Google Sheets, but with a tree instead of a grid. That eats drinks scripts and shits JSON and XML or any other structured data.
Of course you should be able to link and embed outlines in spreadsheets, and spreadsheets in outlines, but "Google Maps" should also be invited to the party (along with its plus-one, "Google Mind Maps").
It should be like the collaborative outliner Douglas Engelbart envisioned and implemented in his epic demo of NLS:
Dave Winer, the inventor of RSS and founder of UserLand Software, developed a wonderful outliner on the Mac, originally called "ThinkTank" and then "MORE", which later evolved into the "Frontier" programming language, and ultimately the "Radio Free Userland" desktop blogging and RSS syndication tool.
More was great because it had a well designed user interface and feature set with fluid "fahrvergnügen" that made it really easy to use with the keyboard as well as the mouse. It could also render your outlines as all kinds of nicely formatted and stylized charts and presentations. And it had a lot of powerful features you usually don't see in today's generic outliners.
>MORE is an outline processor application that was created for the Macintosh in 1986 by software developer Dave Winer and that was not ported to any other platforms. An earlier outliner, ThinkTank, was developed by Winer, his brother Peter, and Doug Baron. The outlines could be formatted with different layouts, colors, and shapes. Outline "nodes" could include pictures and graphics.
>Functions in these outliners included:
>Appending notes, comments, rough drafts of sentences and paragraphs under some topics
>Assembling various low-level topics and creating a new topic to group them under
>Deleting duplicate topics
>Demoting a topic to become a subtopic under some other topic
>Disassembling a grouping that does not work, parceling its subtopics out among various other topics
>Dividing one topic into its component subtopics
>Dragging to rearrange the order of topics
>Making a hierarchical list of topics
>Merging related topics
>Promoting a subtopic to the level of a topic
After the success of MORE, he went on to develop a scripting language whose syntax (for both code and data) was an outline. Kind of like Lisp with open/close triangles instead of parens! It had one of the most comprehensive implementations of Apple Events client and server support of any Mac application, and was really useful for automating other Mac apps, earlier and in many ways better than AppleScript.
Then XML came along, and he integrated support for XML into the outliner and programming language, and used Frontier to build "Aretha", "Manila", and "Radio Userland".
He used Frontier to build a fully programmable blogging and podcasting platform, with a dynamic HTTP server, a static HTML generator, structured XML editing, RSS publication and syndication, XML-RPC client and server, OPML import and export, and much more.
He basically invented and pioneered outliners, RSS, OPML, XML-RPC, blogging and podcasting along the way.
>UserLand's first product release of April 1989 was UserLand IPC, a developer tool for interprocess communication that was intended to evolve into a cross-platform RPC tool. In January 1992 UserLand released version 1.0 of Frontier, a scripting environment for the Macintosh which included an object database and a scripting language named UserTalk. At the time of its original release, Frontier was the only system-level scripting environment for the Macintosh, but Apple was working on its own scripting language, AppleScript, and started bundling it with the MacOS 7 system software. As a consequence, most Macintosh scripting work came to be done in the less powerful, but free, scripting language provided by Apple.
>UserLand responded to Applescript by re-positioning Frontier as a Web development environment, distributing the software free of charge with the "Aretha" release of May 1995. In late 1996, Frontier 4.1 had become "an integrated development environment that lends itself to the creation and maintenance of Web sites and management of Web pages sans much busywork," and by the time Frontier 4.2 was released in January 1997, the software was firmly established in the realms of website management and CGI scripting, allowing users to "taste the power of large-scale database publishing with free software."
Spreadsheets are phenomenally flexible, but they obviously don't scale well or behave nicely with source control.
I work at Anaplan, and the most common way that our biggest customers discover us is when they've been bitten by spreadsheets as they've scaled, and now they have users emailing spreadsheets around and someone with a full time job collating them.
We've modelled the product around that flexibility, but with rigor and scale on top of it.
Excel is everywhere where I work. It frustrates me to see how it's used and how much it's used, but maybe that's unfair: I could rant about how it's unmaintainable, error-prone, and time-consuming, and how it causes costly mistakes, but I wonder if the company would be as large as it is without Excel.
I dislike a lot about Excel, but it's so easy to be immediately productive with it.
> and transmit not only the results of those calculations, but the actual environment itself, to anyone in the world, and expect that if they have a computer, they can replicate your results.
I think this is the real secret. It's really the only kind of "model" where the environment travels with it. Everyone has the same Excel setup. If it works for you it will work for them.
I wish this were true. The biggest problem is different language versions, which cause all sorts of trouble. Did you know that Excel formula names are localized? For example, AVERAGE is MITTELWERT if you have a German Excel, and there are some with umlauts as well.
But there are subtler problems too. I'm an engineer and I used to work a lot with radians and degree values. There is a nice little trick in Excel where you can enter radians and have them displayed as degrees. It works using the date format, but only if you have 1900 dates enabled. I used that for a short while, until I noticed that all dates in spreadsheets from other people were off by 30 years.
Edit: I see someone made the same point below, but I definitely haven’t had Excel installed in years. And that’s across multiple billion-dollar employers.
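For anyone tempted by the date-format trick above: Excel also has DEGREES() and RADIANS() worksheet functions, and outside the sheet the explicit conversion is trivial, e.g. in Python:

```python
import math

# Explicit angle conversions; no date system involved.
print(math.degrees(math.pi / 2))  # approximately 90.0
print(math.radians(180))          # approximately 3.14159...
```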
I think one of the main takeaways is: included batteries are underrated. This is seemingly one of the main appeals of Python.
Another is: an integrated solution that's been designed holistically, instead of blindly assembled out of modules, does wonders for reliability. See: the difficulty of setting up a JS build environment vs a Rust build environment. Also, the success of Atom vs VSCode.
Difficulty recreating an Excel spreadsheet in a general-purpose programming language isn't evidence of the superiority of spreadsheets, it's just a consequence of the object/relational impedance mismatch. I strongly suspect that reimplementing OP's spreadsheet in MySQL wouldn't have presented any deep problems.
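To make that concrete, here's a sketch of a typical sheet's "calculated column" as a relational query, using SQLite in place of MySQL for brevity; the schema and numbers are invented:

```python
import sqlite3

# A spreadsheet tab as a table, and a calculated block of cells as a query.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (region TEXT, qty INTEGER, unit_price REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    ("north", 3, 2.0),
    ("north", 1, 5.0),
    ("south", 2, 4.0),
])

# Roughly the equivalent of a SUMPRODUCT-per-region region of cells:
result = db.execute("""
    SELECT region, SUM(qty * unit_price) AS revenue
    FROM orders GROUP BY region ORDER BY region
""").fetchall()
print(result)  # [('north', 11.0), ('south', 8.0)]
```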
But sometimes it is necessary to extract the logic, processes, and behaviors from a spreadsheet into an application that can add the stability, traceability, and redundancy that an enterprise tool requires from a legal, compliance, and technical point of view.
I was once tasked with the thankless job of doing exactly that.
I can't help but think that people using Excel to accomplish their task instead of writing a program are a lot like Pythagoras doing geometry without algebra. Getting away from the numbers/code makes it easier to visualize things.
The thing about finance trader types removing the F1 key from their keyboards so they don't accidentally open Help while using Excel at high speed says a lot about how handy Excel is (and how slowly Help loads).
I think spreadsheets are an absolutely great idea. The problem is the current implementations. I have to deal with Excel files a lot in my work, and it's a nightmare.
This is why I’ve built React Spreadsheet[1]. To empower users and programmers to build interactive computational interfaces.
[1]: iddan.github.io/react-spreadsheet
Worked on a simulation game in the '80s. My boss coded the entire world simulator as an Excel spreadsheet. I then wrote a tool to compile that spreadsheet and its formulas into 68k assembly language that would run the sim for one tick. It worked amazingly well. I have never doubted the power of spreadsheets since, and whenever a problem in a similar domain comes up, it's one of my go-to tools.
With Excel, chances are I either have it on my computer, it is a single click download away, or I have another program already on my computer that can open it.
I certainly don't have Excel installed on my computer. It's not even free. And it's far harder for me to install it compared to free software, like Docker or Python (for which I can just type `brew install ...`).
If you own a Mac, it's already got Python installed on it.
The Venn diagram of people who can spin up docker containers + Jupyter notebooks vs. people who prefer to use Excel for similar scenarios has very little overlap...
Yes, that's why the statement "Reactive programming is useful" is interesting to me, and "You should model your problems in Excel" is not. The article makes both points, and I disagree with the second.
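The reactive part is easy to illustrate outside a spreadsheet. A toy sketch of reactive cells in Python, where each cell is a node in a dependency graph and edits push recomputation downstream (this is an illustration of the model, not how Excel is actually implemented, and it skips cycle detection and recomputation ordering):

```python
class Cell:
    """A toy reactive cell: holds a value, or a formula over other cells."""
    def __init__(self, value=None, formula=None, inputs=()):
        self.formula = formula      # callable over the input cells' values
        self.inputs = list(inputs)  # cells this one depends on
        self.dependents = []        # cells that depend on this one
        self._value = value
        for cell in self.inputs:
            cell.dependents.append(self)
        if formula:
            self._recompute()

    def _recompute(self):
        self._value = self.formula(*(c.value for c in self.inputs))
        for dep in self.dependents:  # push the change downstream
            dep._recompute()

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, v):
        self._value = v
        for dep in self.dependents:
            dep._recompute()

# The equivalent of putting "=A1+B1" in C1:
a1 = Cell(2)
b1 = Cell(3)
c1 = Cell(formula=lambda a, b: a + b, inputs=(a1, b1))
print(c1.value)  # 5
a1.value = 10    # editing a cell recalculates its dependents
print(c1.value)  # 13
```

The appeal of the spreadsheet is that users get this dataflow model for free, with a grid as the UI, without ever writing a dependency graph themselves.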
This specifically is nonsense in the article:
> From there, you can calculate literally anything, and transmit not only the results of those calculations, but the actual environment itself, to anyone in the world, and expect that if they have a computer, they can replicate your results.
I certainly can not expect that people I'm interested in have Excel. I don't even have Excel.
Ah yes, the confident programmer that would DIY everything.
I feel like underestimating Excel is a classic example of how an expert can have holes in their thinking.
With VBA, Excel becomes a ton more useful, since it adds real programming. At its core, Excel is a visual database you can share with coworkers, encapsulated in a single file.
There is a time and place for everything. Excel is very useful for workplace data sharing and manipulation.
I like spreadsheets because they are reliable and logical, unlike whatever trash programmers tend to use. Spreadsheets do what they say they do. Outputs are functions of inputs, like math.
Meanwhile software “engineers” are using programming languages that eschew logic, reject reason, reject mathematics, reject determinism, and do different things each time you run them.
Yeah I’ll take a spreadsheet over whatever ball of confusion software engineers are chasing their tail with.