Plotting the source code “TODO” history of the most popular open source projects (schleiss.io)
235 points by nreece on May 17, 2021 | 112 comments



I always suggest TODO's to be replaced during a Code Review by: 1. A ticket number that will be picked up shortly if it should still be part of a larger change. In a healthy team, this is done within two weeks and you know where to perform changes when you pick it up. 2. You do not add a TODO, but explain your current understanding of what is wrong and what should be done. This way you can refresh the knowledge if it is ever touched again. With a simple TODO, this knowledge is usually not written down.


I don't think I can agree with this, since some TODOs have a different target audience.

The kind of TODO you're talking about is splitting a ticket into parts so you can hit an artificial deadline (ie, it's no longer 'done done', it's just 'done'). If the artificial deadline is your boss, then we're in a bad place. If it's another team needing a feature, that's pipelining and that's often okay.

For the TODOs that make it to PR without human error, I write most of them for the next person who adds functionality to an area, to either encourage them to do so or at least not make things worse. But sometimes it's for the person who hits the Rule of 3.

Those TODO's should be addressed in six months, not two weeks, and having someone call me on them in a PR is not particularly helpful. No, I'm not going to quadruple the scope of this story because you don't like the word TODO.


Or do both. Might as well add TODO: to make it stand out as a thing that can be improved while also making it greppable.


Might also be cool to automatically create these tickets (when committed to master?). Then you don't forget, and even if they end up not that detailed, you at least get a nice list of all of them.


TODO: Automatically create tickets based on TODO comments.

We'll get to it. Someday.


https://en.wiktionary.org/wiki/round_tuit

There used to be an ASCII art version occasionally slapped on Usenet posts: "here's a round tuit, you can go ahead."


Caption: "An artist's impression of a round tuit."



That is really cool! Thanks for mentioning.


If you want a cli tool for this I have one here https://github.com/Schell/todo_finder

You can output to markdown or to GitHub issues with a token.

I used to operate a service that did this for you.
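For anyone who wants to try the "automatically create tickets" idea without extra tooling, here is a minimal sketch in plain shell. It is not how todo_finder works; `todos_to_markdown` is a made-up helper name, and it assumes file paths contain no colons. It only prints a Markdown task list, and feeding that list to an issue tracker is left out because that depends on the tracker's API.

```shell
# Hypothetical helper (not part of todo_finder): print every TODO comment
# under a directory as a Markdown task list, one entry per matching line.
todos_to_markdown() {
  local dir="${1:-.}"
  # -r recurse, -n show line numbers, -I skip binary files, -w whole word only
  grep -rnIw 'TODO' "$dir" |
    # "path:line:...TODO: text"  ->  "- [ ] `path:line` text"
    sed -E 's/^([^:]+):([0-9]+):.*TODO:?[[:space:]]*(.*)$/- [ ] `\1:\2` \3/' |
    LC_ALL=C sort
}
```

Running `todos_to_markdown src > TODOS.md` in a CI job would at least give the "nice list of all of them" mentioned above.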


I like this idea, but I would add that my preference for scenario 2 is to add HACK (or FIXME) along with the explanation instead. This way it is still searchable, but you make it a bit more clear that there is no obvious fix at the moment.


OP here. I used `git log -G TODO --reverse -p -- . > ~/Desktop/test.txt` and used the results in PHP to aggregate the data as I couldn't think of the bash one liners in the other comments :(


Stupid question, is "TODO" in this instance case sensitive?


It's case sensitive by default, you can add -i/--regexp-ignore-case to disable that.


You might also want whole-word matching (-w in grep and git grep), otherwise commits by your colleague Todor might cause some confusion.
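To make the case and whole-word trade-off concrete, here is a small sketch using plain grep, where -i ignores case and -w matches whole words only. The helper name and the sample lines are made up, and the equivalent switches for `git log -G` may differ, so check git's documentation before relying on them there.

```shell
# Hypothetical helper: count the lines on stdin that match TODO.
# Any extra grep flags (for example -i or -w) are passed as arguments.
count_todo_lines() {
  # grep -c exits non-zero when the count is 0, so mask that for callers
  grep -c "$@" -e 'TODO' || true
}

sample='TODO: fix this
todo: lower-case marker
Reviewed-by: Todor
TODOS.md updated'

printf '%s\n' "$sample" | count_todo_lines        # 2: substring, case-sensitive
printf '%s\n' "$sample" | count_todo_lines -i     # 4: Todor and TODOS.md count too
printf '%s\n' "$sample" | count_todo_lines -i -w  # 2: only the real markers
```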


Interesting to see that most of them are almost always growing. It would be interesting to compare them to some other metrics, such as TODO/lines_of_code or TODO/num_contributors, to compare the TODO's with the size of the project. I guess that as a project gets bigger, it also gets more TODO's


> TODO/lines_of_code

At least this is needed for any meaningful comparison between projects (or even within projects themselves, as some double in SLOC count over a few months)


Ignoring the numbers on vertical axis, plots look like they are normalized to fit the plotting area. TODO/LoC should basically have the same form.


I don't think so. These plots are normalized by the max number of TODO's, but that may vastly differ from max TODO/LoC of each project.
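A rough sketch of the TODO/LoC idea for a single checkout, in case someone wants to compare projects themselves. `todo_density` is a made-up name, "lines of code" here simply means every line of every non-binary file, and paths are assumed not to contain colons, so treat the number as a ballpark figure only.

```shell
# Hypothetical helper: print TODO lines per 1000 lines of text under a directory.
todo_density() {
  local dir="${1:-.}" todos lines
  # Lines that contain TODO as a whole word (-I skips binary files)
  todos=$(grep -rIw 'TODO' "$dir" | wc -l)
  # The empty pattern matches every line, so -c yields a per-file line count
  lines=$(grep -rIc '' "$dir" | awk -F: '{ s += $NF } END { print s + 0 }')
  awk -v t="$todos" -v l="$lines" \
    'BEGIN { if (l == 0) print "0.00"; else printf "%.2f\n", 1000 * t / l }'
}
```

Running it once per tagged release would give the normalized curve, which need not have the same shape as the raw counts.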


Indeed! When those numbers jump up/down in the graphs, there's prolly a code merge/purge cause to it.


It was interesting to see that Swift has 2K+. Seemed kinda high when I consider its youth and uptake relative to some of its plotted peers.

I don't know whether to suspect that's because it has

a) an overly parliamentary development process that just creates lots of bookkeeping side effects

b) a very aspirational development community that is busy writing tons of "try to take over the world" goals to improve in various and sundry ways

c) indicates a lot of short sighted/highly focused language evolutions that leave a long trail of todos because that kind of "we have no big picture" creates a lot of corner/edge cases that need "todo" signs to document them

d) something else?


Interesting that some of them don't grow, or don't grow much, over time.

Someone at PostgreSQL and Django is actually reading and fixing the TODOs.


It would also be interesting to plot the number of TODOs against the size of the code base too. One would assume that as a project grows, the number of outstanding TODOs would grow too. Where and when this isn't true, might reveal something more interesting.


We took some action on this internally at a place I worked.

We had a couple of projects whose unit tests had been neglected, so we enforced that, for your merge to be approved, you had to round the coverage threshold in package.json up to the next 1% (as well as adding some unit tests)

In 3-4 weeks, code coverage slowly went from 20% to 73%.


I bet it's more to do with rate of growth than total volume.

I believe that one of the cognitive dissonances with people who think a lot of code is good news is that they become overwhelmed by how much they would actually write if they stuck to their convictions and so they start using TODOs to make themselves feel better about doing the wrong thing.

Projects that grow slower I suspect have fewer TODOs.


I suspect that in postgres' case that is just because a TODO list was moved out of the code...


While it's probably the case for postgres or Django, todos getting less could also mean high code churn.


Yeah I noticed several cases where the to-dos drop dramatically. I wondered if that was because that was a module that had a lot of to-dos that was also low quality, and the subject of a complete rewrite or replacement at some point. Golang was also interesting because of a huge increase followed by an almost equally huge decrease a short time later, both of them seemingly vertical. I wonder if something got reverted, or if they actually went back and addressed a bunch of to-dos.


Might be some type of automation / tool adding a bunch of TODOs (as part of a migration?) which are then automatically removed in a later step.


Well, I'm glad I'm not the only one who never gets around to my //TODOs.

One related tip for devs: I've started adding "You are here" as a placemark for where I'm working in the code. So for example, on Friday, if I want to pick up quickly next Monday, I add "//TODO: YOU ARE HERE Finish doing foo". Then on Monday, I search for "You are here" and pick up where I left off more easily. Saves me a few cycles, though if I'm honest, I have quite a few of those hanging out too.


I've been annotating my work several times a day, with `INK`, from "leave some water in the well", a productivity hack from Hemingway[0].

I forgot how I went from "Water in the well" to "ink the well", though. It's been a while since I started doing it, and I wrote a blog-post[1] with some scripts and helpers that I still use.

[0] https://www.fastcompany.com/3021905/hemingways-secret-to-mai...

[1] https://berk.es/2012/05/30/leave-some-ink-in-the-well/


> in other words, never end a day’s work without knowing how you are going to start the next day.

This turns out to be very hard advice for developers to follow. I don't say that as a complaint about Hemingway, but as a complaint about developer neuroses.

I have tried many, many times to convince people to associate their 'sense of completion' not with the act of getting their changes into master but the act of committing their changes (or pushing it to a branch, if you are PR-driven). It works less than half the time, and almost always with the more junior people...

So many incidents of someone staying late to finish something, pushing it, then coming in late the next day (because they stayed late) to a bunch of upset coworkers who had to clean up the mess they made.

Completionism will be the death of us all.


> but the act of committing their changes (or pushing it to a branch, if you are PR-driven)

I'm not sure if I understand you entirely correctly, but it seems the "you have to commit" is a large part of the problem.

You don't.

Your hard drive is not going to crash between tonight and tomorrow. And if it may, and this is crucial, git is not a backup system, so get a proper backup system in place.

Commit when you have a coherent piece of work done. Or maybe commit a "WIP: working on foo, halted for emergency hotfix X" if stashing is out of the question.

Part of my "INK" workflow is not committing to RCS. Leave the dirty state as another mental nudge what you were working on; and add your current memory-state as annotations in the code behind an INK.

I've mentored a junior and he was the opposite, never committed until he was finished, sometimes work of multiple weeks, then he committed it with "finished the FooBar". We agreed he would try to commit at least daily. So then he made one commit a day. "17:00 going home. commit" in the logs. Every day. Needless to say, we did not keep him for long.


> Commit when you have a coherent piece of work done.

No, that's the problem.

Commit when you are available to deal with the consequences of commit. Having a coherent piece of work reduces, but does not eliminate the chance for problems. If you leave before the code builds, you've created problems for someone else to fix. If you haven't budgeted time for the rollback to go wrong on the first try, you're creating problems for other people.

This is why reliability and responsiveness of CI/CD systems matter much more than most people allow. On a good project, pushing things after 4 is probably a bad idea. On less well run systems, anything after 3 might be risky. On a bad one, 2:30 might be pushing things. So depending on meetings and lunch time there may be very small intervals where you can push things, and when the clock is ticking we begin to rationalize, which just makes the likelihood of failure increase.


> If you leave before the code builds, you've created problems for someone else to fix.

I'm curious about your workflow now. How can it be someone else's problem if you do not commit (and/or don't push) your work? Is everyone working only on master/main/develop branch? Do you work in a shared drive? Is your entire team working off a dropbox or network drive (I've seen this, I needed eye bleach)?


A commit does not have any consequences. A push may.


even better, don't leave it as a comment. Leave it uncommented, as a syntax error, so when you try and run your code on Monday morning you are reminded of it!


Just today I announced a sweep through all of the TODO's and to either turn them into issues, stories or remove them. Biggest problem in Xcode is that it clogs up the warning list and the real warnings get swamped by them. But it's funny to see that Rust has less outstanding TODO's than our brand new 45KLoC project.


TODOS are great documentation. They often reveal design decisions, suboptimal implementations and the thought process of the creator.

This information is often completely lost in issues that no one will ever look at again.

I also like to do TODO sweeps, but with a bias towards rewriting them into documentation or leaving them in if they are actionable.


Instead of TODO, I often add a NOTE instead now, e.g.:

   NOTE: Attempting to re-synchronize here would cause an infinite loop because...
This prefix distinguishes such info from normal comments, which exclusively pertain to the code-as-written.


Agreed but that's not quite what's being discussed here.

Notes are great but the context is to make a note about something that can be improved in the future (something actionable), hence it being a TODO.


Huh, if I add a TODO to typical code then it's something that ideally needs to be done there (e.g. "todo handle errors from invalid inputs" -- it'll work if you don't but... it's not ideal), or in LaTeX reports it's something that needs to be resolved still before shipping to the customer (e.g. "todo set \endDate variable"). Never is it a design decision or documenting my thought process. Why would that ever be labeled todo or, if you mean that todos indirectly convey that, why would having todos throughout the code as a means of conveying design decisions be "great documentation"?

"This information is often completely lost in issues" - of course, because an "issue" is "a vital or unsettled matter" and are not meant to be revisited once resolved. Writing design decisions or thought processes in issues seems equally weird to using TODOs to document that. They're kept around because it costs nothing and you might want to refer back to it for details on some past problem, but I've never heard of that being considered documentation.


The Rust graph is inaccurate since Rust uses FIXME instead of TODO. Almost all of the usage of "TODO" in rustc relates to the todo!() macro, usually to stub out parts of tests.


Yeah you get vastly different numbers if you grep for FIXME (instead of the hundred or so todo's):

    $ rg FIXME compiler | wc -l
    1036
    $ rg FIXME | wc -l
    18656


Can't you configure Xcode to filter out such warnings?

TODOs offer valuable insight but they may not warrant turning into issues or stories. Removing valuable information because it's cluttering seems like a shame. Better if you can filter it in such a way that it doesn't clutter


Quick idea: You could use a different wording that doesn't get recognized by Xcode but sticks out in a similar way. It would take some time to get used to it though.


I think it's actually my linter generating these warnings! But I do like them to be resolved so I consider them issues.


everybody wants todos resolved but chances are that if a decision is made to fix, make issues or nuke them, they'll just get nuked instead


> Just today I announced a sweep through all of the TODO's and to either turn them into issues, stories or remove them.

I think it makes sense to get rid of inline TODOs.

On new unshipped projects I've gone down the road many many times where it starts with a TODO comment above a chunk of code and then it turns into a few lines of context or a summary of my thoughts. Then comes linking to references, recapping my thought process on the "why", potentially writing a couple of versions of the code and keeping them commented out, or even having a chat with a friend over IRC and copy / pasting the conversation next to the code.

Now you open a file and suddenly it's your code mixed with a massive brain dump of notes, research and a ton of other things that have no business being in your code base and of course your intent is to remove all of that stuff once you get the well thought out implementation but eventually all of this stuff builds up. Then it happens across multiple files and eventually it becomes really hard to figure out what you need to do.

Putting all of that stuff into a kanban board has been a huge win for me. Now I just drop all of that contextual info into a "research" list and I pick things off that list when I'm ready to really do them.

I made a video about this process a while back at https://youtu.be/HHOkcCqsipE?t=77. It shows an example of the before and the after.


Sorry to break it to you but these TODOs are a measure of technical debt you've accrued by limiting dev work to issues and stories.


"How no one could have seen how we got hacked (then we checked Git and found a `TODO: Fix Authentication`)"


"Yes, but authentication won't make the boat go faster"


Is it controversial to say that those with low TODOs are pretty clearly the cleanest packages I enjoy working with most? (Postgres, Rust, Django, VueJS, maybe Python)


Use and semantics of TODOs are decidedly not consistent across these projects.

Just because you (and I) see correlation with our expected biases shouldn't be construed as proof: merely interesting chart wiggles.

I think the shape (monotonic increase or sawtooth) can be used to see how a project handles either missing features or technical debt.


But if the semantics speak to project philosophy differences that result in a worse or better experience, then it is important that they don't have the same semantics.


Well, is the converse true? Are the packages with high TODOs the dirtiest packages that you least enjoy working with?


"I'll do it later" is one of the more challenging personality faults to deal with in coworkers. There are very few people you can trust to make that statement, while most of the rest behave as if you should trust them as well, even though everybody knows they won't in fact do that later.

Not only will they not do that later, but at some point they will compare their productivity to someone who ended up having to 'do that later' for them, causing their other work to suffer.


I was just thinking the other day that searching for TODO is probably a very good way to search a project for potential bugs or security issues. E.g., I see a bunch of todos in Firebase iOS SDK that look kind of interesting to an attacker. Without looking into how the methods are called I can't say if they are actually exploitable (and I am sure Firebase is fuzzed to high-hell) but it was a little seed planted in my head.


for a great example of this just have a look at the following macOS privesc the source of which came with the handy comment "deal with OOB".

https://blog.zecops.com/research/from-a-comment-to-a-cve-con...


Am security tester. Can confirm.

Sometimes the vulnerabilities are just handed to you on a dark-themed platter and I don't look them in the mouth.


The growing number is hardly surprising, yet I don't know exactly what the implications are. There are a lot of different categories of TODOs, ranging from harmless ("it would be nice if this was improved at some point") to critical ("this is a really bad solution that needs to be fixed ASAP"). I wonder if these repos have official definitions of what a TODO entails.


Perhaps we should be including a priority in such comments, eg:

TODO: high: don't use bogosort

Would only be useful if it became a de facto standard though
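Until an IDE understands such a convention, a few lines of shell can already sort by it. This is only a sketch: the `TODO: high|medium|low:` format is the one proposed above rather than any standard, `todos_by_priority` is a made-up name, and paths are assumed not to contain colons or spaces.

```shell
# Hypothetical helper: list "TODO: <priority>: text" comments, most urgent first.
todos_by_priority() {
  local dir="${1:-.}"
  grep -rnIE 'TODO: (high|medium|low):' "$dir" |
    # "path:line:...TODO: prio: text"  ->  "prio path:line text"
    sed -E 's/^([^:]+:[0-9]+):.*TODO: (high|medium|low): *(.*)$/\2 \1 \3/' |
    # Prefix a numeric rank so sort can order high < medium < low
    awk 'BEGIN { rank["high"] = 1; rank["medium"] = 2; rank["low"] = 3 }
         { print rank[$1], $0 }' |
    LC_ALL=C sort -k1,1n -k3,3 |
    cut -d' ' -f2-
}
```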


At that point, make an issue and add it to the tracker.


Yes, I agree. Priority and some context on what to do instead. Because as soon as it's a standard, big software vendors like Jetbrains could automatically categorize and mark the various lines.


Back when I used Eclipse there was another level called “FIXME”. Maybe it works in IntelliJ as well.


It does.

At least it works in PyCharm, so most likely it works in all IntelliJ based IDEs.


Yes, it does. In fact, you can configure it to use whatever you want. (I have it also match BUG as another priority)


Some projects use FIXME for things that need more urgent improvement to distinguish them from 'would be nice' TODOs. I wonder how many of these repos that is the case for?


I use a priority number after my tags: @todo2, @todo9, etc.


Damn, what happened to Typescript in May/June 2018? It jumped from around 730 TODOs to 3000, and has never really come back down.

Wikipedia's list of versions suggests that this was probably related to version 3 happening in July 2018.


I'd like to see a todos added / todos disappeared graph that would account for the velocity of development of a project. For some, while the todos grow, the relative "todo per LOC" might decrease etc.


Todos are generally terrible practice in code, as they often don’t give any indication how they are going to get to done.

They never get prioritised and very rarely does anyone get to do them.

So what’s the point? It feels like a todo is really only there to serve as an excuse for suboptimal solutions.


> It feels like a todo is really only there to serve as an excuse for suboptimal solutions.

That's not the only way to use a TODO, but even if it was, it's still valuable to mark suboptimal code.

Imagine you're investigating a performance bug and see a comment in related code which says "TODO: use a faster sorting method", that's probably going to be helpful.


Delivering something is far more preferable to never delivering a perfect solution, most of the time. TODO is a way of dealing with guilt about imperfection while actually shipping.


I dunno... I think they have their place.

I'm working on a personal project right now and in order to see how things look, if they work etc, I've got a bunch of vanilla js frontend code and //TODO in a few places to call an API and get actual data. It works great for now as I've hardcoded everything, got it looking broadly like the finished product, and it means that I just have to do the API calls (and programme the API too, of course).

I use them a lot.


On a personal project, I can see this working, on a team project, you're not gaining anything with TODOs, because the chances that you or your colleagues actually go back and fix/implement the TODO are close to zero.

Not only are you rarely gonna have the time to fix the TODO, but weeks, months, years later when you run into a TODO in your code, you have no idea what the actual requirements were, why it was not implemented, why it hasn't hurt anyone and whether anyone is actually needing it. Thus the TODO comment will remain forever, as you can't figure out what to do about it, without investing a lot of time and energy on requirement engineering.

Personally, I've started to block PRs with TODO comments that aren't directly mentioning the future implementation story/bug. As such, even if the TODO is forgotten in some way, you at least will find a reference point to what should have been done here.


> because the chances that you or your colleagues actually go back and fix/implement the TODO are close to zero.

This depends entirely on the team, company and/or work, though. It certainly is not a given.

> TODO in your code, you have no idea what the actual requirements were, why it was not implemented, why it hasn't hurt anyone and whether anyone is actually needing it

This depends on the task you are TODO-ing. Sure, if it is a "TODO: seems broken, fix." or "TODO: make sure that users don't see this", you are putting not just the wrong things in TODOs you are not giving them enough context. Compare that with a "TODO: this duplicates the routine in FooBars#bar_bar, but we cannot move this to a generic helper until the BarBar can handle both ActiveUsers and PendingUsers. Once that polymorphism is implemented, this can be DRYd up", which gives context, predicaments, and communicates that the author knows it is suboptimal, and explains how the author would've fixed it.


It certainly all "depends". My main point is that if you don't actively plan to fix your TODOs, then they usually won't be fixed, which you can also kind of see in a lot of the charts where the amount of TODOs just goes up. And as such the question arises whether you really gain anything from them.

I much rather have someone finalize their implementation and create follow up stories/tasks to indicate what needs to be done next, than having hints of what should've/could've been done and nobody ever going back and cleaning those up.


Here's my most common use case:

1) Need to do a hotfix on some bug. Fastest fix is to just turn off something.

2) I turn it off (maybe commenting out the line) then add a TODO above it with the ticket number that corresponds to the ticket for turning it back on once the issue has been investigated.

3) Once I start working on the ticket to turn the thing back on, having the todos makes it easy to know exactly what to do (context isn't lost) and I don't end up missing some things because I can just search all of the TODOs.

4) If a TODO is missed, we have a script that will re-open any ticket for which a TODO ticket number is still in the codebase. I.e., let's say we have "TODO: xy-123 ..." If I close xy-123 without deleting that line, the script will re-open the ticket and comment saying that there is a remaining todo
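The check in step 4 can be sketched roughly like this. It is a guess at the shape of such a script, not the one described above: both function names are made up, the "TODO: xy-123" format is taken from the example, and the actual re-opening is left out because it depends on the tracker's API.

```shell
# Hypothetical helper: ticket IDs that still have a "TODO: <id>" comment under DIR.
todo_ticket_ids() {
  local dir="${1:-.}"
  # -h drops file names, -o prints only the matching part of each line
  grep -rhoIE 'TODO: [A-Za-z]+-[0-9]+' "$dir" | sed 's/^TODO: //' | LC_ALL=C sort -u
}

# Reads closed ticket IDs (one per line) on stdin and prints the ones that
# are still referenced by a TODO, i.e. the candidates for re-opening.
tickets_to_reopen() {
  local dir="${1:-.}" ids
  ids=$(todo_ticket_ids "$dir")
  [ -n "$ids" ] || return 0
  # With -F a newline-separated pattern is a list of fixed strings;
  # -x requires the whole input line to equal one of them.
  grep -Fx -e "$ids" || true
}
```

Usage would be something like `tickets_to_reopen src < closed_ids.txt`, where `closed_ids.txt` is an export of recently closed tickets.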


> they often don’t give any indication how they are going to get to done

It would be a really bad practice to merge procedures and policy with technical artifacts. TODOs embedded in the code are technical artifacts, they say what should change, they really shouldn't say how and when.

> They never get prioritised and very rarely does anyone get to do them

Well, you are faulting the tool for your development practices. If your team doesn't look at TODOs, you indeed should avoid them. But that doesn't mean anything for other people.

At my workplace embedded TODOs would be bad too. At personal projects I find them quite useful with a similar life-cycle to warnings: you keep them there while the feature is being developed, but they must be gone by the time it's complete. Other people have different practices, and may successfully use them in different ways.


I'd rather have a todo (that marks the line with a nice bright color in my IDE) than the immediate alternatives of #1: Not implementing a feature because it has room for improvement, or #2: Leave out the deficiency marking because it makes the code look more complete.


Cool! That date axis though...

PHP's 3 years 2011-2014 are much shorter than the 2 years 2014-2016. NodeJS's years 2017-2020 are ~50% longer than 2013-2017.


I guess that each data point is a commit and they just made more commits in the 2014-2016 period than in 2011-2014. But it's just a guess.


I wonder what other strings people use like this. So far I have seen FIXME TODO HACK XXX BROKEN.


With Vim, each filetype has its own set of rules for syntax highlighting (mostly contributed by third-party maintainers). A quick grep through the `/usr/share/vim/vim82/syntax` directory shows the most popular set of such strings that are to be syntax highlighted by Vim are:

    TODO FIXME XXX
Other strings that appear are:

    BUG NOTE CHECK DEPRECATED HACK TBD FIX TEMP REFACTOR REVIEW HACK Todo
I guess these strings are most likely a mix of popular idioms for programming languages or particular to the habits/culture of the individual contributors to the syntax files.


At work we use "TODO(JIRA-XXXX):", where JIRA-XXXX is the Jira ticket for the TODO. Every TODO needs an accompanying Jira ticket. Otherwise it won't pass code review.
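A rule like this can also be checked mechanically before a human reviewer gets involved. The following is a sketch only, not the setup described above: `check_todo_tickets` is a made-up name and the `TODO(KEY-1234):` pattern is an assumption based on the format quoted.

```shell
# Hypothetical pre-review check: print every TODO that lacks a ticket
# reference of the form "TODO(KEY-1234):" and fail if there is any.
check_todo_tickets() {
  local dir="${1:-.}" bad
  # All whole-word TODO lines, minus those already in the ticketed form
  bad=$(grep -rnIw 'TODO' "$dir" | grep -vE 'TODO\([A-Z]+-[0-9]+\):')
  if [ -n "$bad" ]; then
    printf '%s\n' "$bad"
    return 1
  fi
}
```

Wired into CI as `check_todo_tickets src || exit 1`, it would report the offending file and line numbers.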


If it's got an accompanying JIRA ticket, what do you experience as the value of also including the `TODO` in a source comment, over just the jira ticket alone?

[edit: reasonable answers below, thanks!]


Suppose a future developer is working on a separate ticket that touches the same code. If there’s an inline TODO, the future developer knows that the TODO needs something changed, which can help them understand how the code works and they might wind up resolving that TODO as part of the second ticket. If there isn’t an inline TODO, the issue might be resolved without that first ticket ever being touched. I see it inevitably leading to a lot of dead meaningless tickets crowding up the backlog. If an even later developer then was assigned one of those unknowingly-resolved tickets, they might spend a significant amount of time looking through the codebase to find where the code needs to be fixed, only to realize later that they’re looking for nothing.


When reading the source code, you immediately see an acknowledgement of the deficiencies, instead of assuming that everything is as it should be, or needing to investigate what is ok and what is not. It also maintains a link between the ticket and the location in source code throughout future source code changes.


From the opposite direction as the other responses, I've recently been running across years-old weirdness that involves digging through commits and merges to find the original Jira issue to explain it, then add a comment and reference to it to the code. They're not TODOs, they're explanations for future devs about why something was done in an odd way, so explicitly distinguishing a TODO would be better for those cases.


I tend to use my initials for the TODOs I introduced.


The Golang Todo chart is interesting. It has a sharp peak in April 2018. Linux and Swift seem to be the most in number and uniform in growth.


I wonder what happened in Go there.


most likely code generation?


Anyone got a (git-based) one liner to get this info for an arbitrary repo?


I'll throw this into the mix

  git --no-pager grep -I --full-name --line-number 'TODO' |\
   sed 's/\(^[^:]*\):\([0-9]\+\):.*/\1\n\2/' |\
   xargs -d '\n' -n2 sh -c 'git --no-pager blame "$0" -L $1,$1'
Not blazing fast but I think it does okay-ish.

What it does:

1) git-grep for files that are checked in, not "binary" that contain the string 'TODO'

2) sed away the actual line contents (git-grep doesn't seem able to only output file:line-nr)

3) use xargs and sh to call git blame on that file:line-nr

This shows the last time the TODO line was modified, i.e. it may have been created 10 years ago but somebody modified the line yesterday

edit: one might want to throw in --cached to git-grep to search the index and not just the current working-tree


You've asked for it ;)

This prints the years - you can then group and plot them as you wish (it should be fairly easy, but I wrestled enough with git). It's a non-rigorous script (e.g. it assumes nobody's email/name includes a `YYYY-MM-DD`-like string, and that filenames don't include the colon character):

    grep -P '\bTODO\b' -n -R -- * | awk -F: '{ system("git blame "$1" -L "$2","$2) }' | perl -lne 'print /(\d{4})-\d\d-\d\d/'
The working is actually fairly simple:

- grep prints filenames and lines

- awk captures the filename and line, and executes a git blame on it

- perl matches the year and prints it

In the Perl matching expression, month and day are not strictly necessary, but disambiguate potential 4-digit numbers in the email/name.

I'm very underwhelmed by the lack of customization of the `git blame` command - `--porcelain` is also uncustomizable, which makes things even uglier.

Note that `git blame` also mishandles some edges (printing "fatal: file [...] has only 1 line").


What do you want `blame --porcelain` to do that it doesn't? Using:

    git blame --line-porcelain "$1" -L "$2,$2" |
    perl -MPOSIX=strftime -lne '/^author-time (\d+)/ and print strftime("%Y", localtime($1))'
I suppose it would be a little more convenient if you could ask `git blame` to format the whole line itself, but that wouldn't be part of the `--porcelain` output.

All that said, that pipeline is quite slow on something like linux.git, as it runs a series of blames which will walk over the same history many times. I think:

    git log -STODO --format=%ad --date=short
would be much faster (it's not _quite_ the same thing, as it counts TODOs which went away, but is a reasonable variant).


> What do you want `blame --porcelain` to do that it doesn't? Using:

It increases the complexity, due to time conversion. One can certainly solve the general problem by throwing enough awk/perl/sed at it, but an option to customize the blame output would make it significantly more ergonomic (and the oneliner much simpler).


As a starting point,

    git log --format=format:"%at %H" | sort -n
gives a list, with the oldest entry first, of just "timestamp hash" pairs, one per line.

You can then use e.g.

    date -I --date='@1620720025'
to convert the timestamp back to a human-readable date in ISO format, i.e. "2021-05-11".

The next step would be to loop over the list, checkout each revision, grep and wc the TODOs, and collect into date buckets. Anyone? :)


here you go

  git rebase -i --exec 'ack TODO | wc -l >> log' HEAD~20
(create a temp branch first!, then substitute your starting revision, and save-quit the editor that'll pop up)
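If rewriting history on a temp branch feels heavy, here is a sketch of the "loop over revisions and bucket by date" idea that leaves the working tree alone. It relies on `git grep` being able to search a given revision directly; both function names are made up, only every Nth commit is sampled to keep it tolerable on large repositories, and it has not been tuned for something the size of linux.git.

```shell
# Hypothetical helper: print "YYYY-MM-DD count" for every Nth commit reachable
# from HEAD, oldest first. Nothing is checked out.
todo_history() {
  local step="${1:-50}" n=0 day rev count
  git log --reverse --format='%ad %H' --date=short |
    while read -r day rev; do
      n=$((n + 1))
      [ $((n % step)) -eq 0 ] || continue
      # -I skips binary files, -w matches TODO as a whole word
      count=$(git grep -I -w 'TODO' "$rev" | wc -l | tr -d ' ')
      printf '%s %s\n' "$day" "$count"
    done
}

# Hypothetical helper: reduce "YYYY-MM-DD count" lines to one row per month,
# keeping the last sample seen in each month.
bucket_by_month() {
  awk '{ m = substr($1, 1, 7); if (!(m in last)) order[++k] = m; last[m] = $2 }
       END { for (i = 1; i <= k; i++) print order[i], last[order[i]] }'
}
```

Inside a repository, `todo_history 100 | bucket_by_month` would print one "YYYY-MM count" row per month, ready for whatever plotting tool you prefer.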


I thought Golang's number of TODOs was much much lower than the others until I looked at the scale


I'd be quite interested in seeing data on the age of TODOs over time - for instance, are there lots of old TODOs sitting around gathering dust while newer TODOs get fixed, or are they getting worked through?


Django doing really well (or nobody dares to write TODOs anymore)


What about amount of TODOs per character or line of code?


Agreed, this could make the graphs much more comparable and maybe reflective of project culture. Comparing something like the linux kernel to VueJS is nonsensical without any normalization with respect to overall repo size.


Golang has/had TODOs in generated code I assume?


What about FIXME ?


Tangential: magit-todos is a great package for Emacs/magit users to keep track of todos in a project.


TODO: Look Ma Im doling out work for whoever reads this. Are you the intern? You are it.


Would be interesting to add major release versions to the graphs as well


How did golang gain and lose 12k todos in an instant?



