Probably some of the worst code I ever worked on was a 12k+ line single file Perl script for dealing with Human Genome Project data, at Bristol-Myers Squibb, in the late 1990s.
The primary author of it didn't know about arrays. I'm not sure if he didn't know about them being something that had already been invented, or whether he just didn't know Perl supported them, but either way, he reimplemented them himself on top of scalars (strings), using $foo and $foo_offsets. For example, $foo might be "romemcintoshgranny smithdelicious" and $foo_offsets = "000004012024", where he assumes the offsets are 3 digits each. And then he loops through slices (how does he know about slices, but not arrays?) of $foo_offsets to get the locations for $foo.
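A rough reconstruction of that scheme, next to what a real array would look like (the variable names come from the comment above; the helper function is my own invention, not the original code):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Values concatenated into one scalar, 3-digit offsets packed into another.
my $foo         = "romemcintoshgranny smithdelicious";
my $foo_offsets = "000004012024";

# Fetch element $i: slice the offset string, then slice the data string.
sub fake_array_get {
    my ($data, $offsets, $i) = @_;
    my $start = substr($offsets, $i * 3, 3) + 0;    # "004" -> 4
    my $next  = ($i + 1) * 3 < length($offsets)
              ? substr($offsets, ($i + 1) * 3, 3) + 0
              : length($data);
    return substr($data, $start, $next - $start);
}

print fake_array_get($foo, $foo_offsets, 1), "\n";   # mcintosh

# The same data as an actual Perl array:
my @apples = ("rome", "mcintosh", "granny smith", "delicious");
print $apples[1], "\n";                              # mcintosh
```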
By the time I was done refactoring that 12k+ was down to about 200 ... and it still passed all the tests and ran analyses identically.
We should use the Antarctic highlands as a prison colony for people found guilty of writing Stringly Typed Code. Siberia isn’t awful enough for them. I thought people who stuffed multiple independent values into a single database column were the worst, and then I saw what people can accomplish without even touching a database.
Ha - we did that too at BMS. We were paying Oracle by the column or something like that, so people would shove entire CSV rows into a single value (because corporate said everything HAD to be in Oracle) and then parse them application-side.
We got rid of most of the DBAs because they got overfond of saying 'no' and they became a major source of friction for doing reasonable architecture. People started figuring out ways to work around the DBAs like stuffing more shit into existing columns or adding secondary tables that established new relationships between the sacrosanct tables.
A lot of the worst sins happened on Oracle databases, mostly because they were the most popular at the time and the only real DB cult, so the cottage industry of specialists who told other developers not to worry their pretty little heads was especially bad.
I have to work with a DBA who has decided that nothing new gets developed using Postgres and is confident that our use cases are best served with a document db… all without knowing any of our use cases, requirements or constraints.
Now I just don’t involve him in anything unless forced.
A friend once picked up a tiny contract from Uncle Sam, requiring some changes to a Perl script that stashed values as "this/that/thenext" in one column of an Oracle table. She had not previously dealt with Perl, so I explained to her about "split" and "join".
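The split/join fix really is a one-liner in each direction. A minimal sketch, using the "this/that/thenext" value from the comment above:

```perl
use strict;
use warnings;

my $column = "this/that/thenext";

# split turns the delimited string into a real list...
my @parts = split m{/}, $column;     # ("this", "that", "thenext")

# ...and join reassembles it once you've changed something.
$parts[1] = "other";
my $updated = join "/", @parts;

print "$updated\n";                  # this/other/thenext
```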
I don't know why the developers had chosen that method.
No, we should reserve the Antarctic highlands for type-system fetishists who write abstract protocol adaptor factory repositories for every damn simple piece of functionality.
That sounds more like an OOP thing than a type system thing - functional languages make excellent use of type systems without mention of anything remotely sounding like a “Design Pattern (tm)”
People balk at the idea of licensure, but the bar is so low that all you have to do is ask "what is an array?", and you'll filter out vast swaths of people like this.
If I were being charitable (sometimes I try), I'd guess he was concerned that an array of scalars would have a high per-scalar size overhead and perhaps worse cache locality, whereas the non-Unicode Perl string implementation at the time probably just stored the string as a large char* with a length prefix.
In a similar vein, an alternate explanation may just be that a lot of bioinformatics algorithms are described as string algorithms. It would be a pity if that was just taken too literally, but I've seen a lot of bioinformatics perl that works on strings in this way.
While trying to be charitable, though, it's almost impossible to come up with a rationalization for the three-character offsets string. Sometimes I think where people really go wrong is when they hear of some problem, or try to be too clever, and end up going beyond their abilities and straying into farce.
Being wrong with malicious intent and being wrong because you have an interstate-number IQ are indistinguishable; the outcome is still terrible. I don't care who's playing 4D chess, the pieces are still all over the floor and I have to clean them up.
I am somewhat surprised that a programmer who was unaware of arrays in Perl managed to have tests. But then again, he managed to implement his own version of arrays, so maybe he came up with the concept of software testing by himself :-P
It sounds more like the refactoring started with creating tests for the old code base. We do this with our legacy code base, too, sometimes. Of course, it's always the high-risk code of an application that's never been accompanied by automatic tests.
As an outsider to any reasonably large application, can you really ever expect to grasp 200 random lines of code in any language?
Maintenance programming is all about understanding the software's context and implicit design, or a model thereof, even if it is a 20-year-old patchwork.
Young developers tend to be amazed when I, as a senior, find the source of an obscure bug in our 1M-line Perl application. But the thing is, I just happen to "get the application" and have developed an intuition about its inner workings. It's rarely a matter of knowing Perl particularly well. The thing could have been written in Tcl or even Brainfuck: with a working model of the software in your head (not a memory of the total code base, mind you), you will eventually find the problem in a piece of software written in any language.
If you get a typically structured Java/Spring application, then often yes. By typically structured I mean domain objects, DTOs, a service layer, a controller layer, and a view layer. The idiomatic use of dependency injection frees the reader's mind from tracing all the dependencies in code: you just use the @Autowired class you want, and some other part of the application has the knowledge to configure it and does so.
Admittedly I've somehow only ever worked in Perl, but the worst code I tried to fix felt similar. They knew about arrays, but every map and grep used Perl's default $_, and there was enough nesting that the function was nearly 1k lines, if I remember right.
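A toy illustration (made-up data, not the code in question) of why nested map/grep on the implicit $_ gets hard to follow, and how naming a lexical at each level helps:

```perl
use strict;
use warnings;

my @batches = ([1, 2, 3], [4, 5, 6]);

# Nested map/grep all sharing the implicit $_ -- a couple of levels deep,
# you can no longer tell at a glance which $_ an expression refers to.
my @evens = map { grep { $_ % 2 == 0 } @$_ } @batches;

# Naming a lexical at each level keeps each scope readable:
my @evens2 = map {
    my $batch = $_;                          # $_ here is the array ref
    grep { my $n = $_; $n % 2 == 0 } @$batch;
} @batches;

print "@evens\n";    # 2 4 6
print "@evens2\n";   # 2 4 6
```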