Having worked in a government agency that ditched IBM, let me offer a view of what that looks like from the customer side:
IBM bought a company whose product we'd been using for a while, and had a perpetual license for. A few years after the purchase, IBM tried to slip a clause into a support renewal that said we were "voluntarily" agreeing to revoke the perpetual license and move to a yearly per-seat license. Note: this was in a contract with the government, for support, not for the product itself. They then tried to come after us for seat license costs. Our lawyers ripped them apart, as you can't add clauses about licensing for software to a services contract, and we immediately tore out the product and never paid IBM another dime.
I tell this story not to be all "cool story, bro", but to point out that IBM does focus on renewal growth, but they're not geniuses...they're just greedy assholes who sometimes push for growth in really stupid ways.
Until passkeys can pass the test of "my non-technical friends and family don't call me for help about them", passkeys aren't ready. Vendors keep making assumptions about how users behave which are not safe assumptions, and that keeps blowing up the interactions of non-technical users. (I'm sure there's an "assumptions developers make about user accounts" blog out there somewhere.)
For example, my family has had to call me for help on the interaction between passkeys on Apple & Amazon multiple times. They have a shared Amazon account, which neither Amazon nor Apple seems to like. The first problem came when they didn't even know they'd been moved to passkeys - there was a popup that one of them didn't understand, they clicked OK to get it to go away, and suddenly the other partner couldn't log in, and neither of them could figure out how to log into Prime Video on their AppleTV. Another time one of them got "nudged" to add a fingerprint to the account, again freezing out the other person.
Until that nonsense stops happening, Passkeys aren't ready.
By that metric, passwords are even less ready, as I seem to always have to field calls for passwords getting stolen or compromised or accounts getting phished. I guess we're back to faxing ID.
I have a non-technical father with dementia, and passwords+TOTP are almost frictionless for him, with minor exceptions. We are able to share around passwords and TOTP codes without any problems so I can properly monitor his online activities to keep him out of trouble. He’s a cranky old guy with almost zero trust, so having to input all that stuff satisfies him that security is being employed.
It's funny, I started with rPis for the same reason, but I'm about to replace them. I bought 20 rPi 4Bs for my homelab, and I just couldn't get them to do what I needed. I was looking to run a home k8s cluster and the Pis were just not suited to it at all (don't use sd storage for k8s 'cause it'll burn out the card w/writes, booting off usb was unstable even with powered usb hubs, netboot turned into an enormous pain in the neck).
If you have another machine with a SSD or at least a fast-ish HDD and want to give it another go, you could try running k3s with an external datastore (e.g. postgres).
That's the setup I've been using on 3 x rPi since 2021 and I'm super happy with it as I can host all my own personal projects, some OSS ones (changedetection, n8n, etc), even longhorn for distributed storage and still have capacity left -- and this with just microSDs (to be fair, A1s, but still).
Yeaaaahhhh...NLM is/was a special beast inside NIH.
At the time I was there they had a budget to die for, and PubMed was in the top 300 websites in the world (both are probably still true). NLM pushed so much traffic because of PubMed that they had their own internet connection to avoid DoS'ing the commodity desktop traffic of the rest of the NIH.
PubMed is self hosted in a datacenter located within NLM on the NIH campus.
They also have their own fiber connections as you mentioned.
There is also a legit storage vault located within the building that houses non-digital records. We were told that there were medical records going back thousands of years kept there, possibly as far back as the Egyptian era.
I'm going to regret this, I bet, but here we go: Many years ago I was the manager of the team inside NIH/CIT that ran both the border firewalls and the DNS servers in question (to be specific, at the time it was NIH/CIT/DNST/NEB/NSS). Obviously, since I'm not there anymore I don't have any special inside information, but given that the DNS servers were responding to TCP and not UDP during the outage, my bet would be a simple firewall screwup, rather than malice.
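For anyone who wants to check the "answers on TCP but not UDP" symptom themselves, here's a minimal sketch using only the Python standard library. It hand-builds a bare DNS query, sends it once over UDP and once over TCP (with the two-byte length prefix that RFC 1035 requires for TCP), and reports which transport got a reply. The function names and the server address in the usage comment are placeholders I made up for illustration, not anything NIH actually runs.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query (RFC 1035) for `name`; qtype 1 is an A record."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, AN/NS/ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    return header + question


def frame_for_tcp(packet: bytes) -> bytes:
    """DNS over TCP prefixes each message with its length as a 16-bit integer."""
    return struct.pack("!H", len(packet)) + packet


def probe(server: str, name: str, timeout: float = 3.0) -> dict:
    """Return {'udp': bool, 'tcp': bool} saying which transports got any reply."""
    query = build_query(name)
    results = {}
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(timeout)
            s.sendto(query, (server, 53))
            data, _ = s.recvfrom(4096)
            results["udp"] = len(data) >= 12  # at least a DNS header came back
    except OSError:  # includes timeouts
        results["udp"] = False
    try:
        with socket.create_connection((server, 53), timeout=timeout) as s:
            s.sendall(frame_for_tcp(query))
            results["tcp"] = len(s.recv(2)) == 2  # got the length prefix back
    except OSError:
        results["tcp"] = False
    return results


# Usage (hypothetical server address from the documentation range):
# print(probe("192.0.2.53", "example.com"))
```

A result of `{'udp': False, 'tcp': True}` would match the pattern described above, which is what you'd expect from a firewall rule that passes TCP/53 but drops UDP/53. It doesn't tell you why the rule is there, of course.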
Stuff happens - maintenances have unintended consequences, people typo stuff, etc. Don't freak out about every event - save your powder for the real outrages (like 18F).
I’ve thought about it and you’re right, this was just confirmation bias on my part. There is in fact no evidence that DOGE is in any way involved. Apologies.
I've popped back to this thread again (in the midst of another argument:) and noticed you've edited a preceding comment (perhaps unnecessarily but certainly magnanimously), which, along with this apology, I think, shows you in a good light and HN in general. Thank you, it's so good to see people being reasonable in amongst the tribalism.
This is exactly how the HN community responded to Twitter, too. The amount of baseless speculation paraded as fact is crazy for a supposedly rationalist community.
Maybe if I remembered or looked at how HN handled Trump's first election win then I'd have a more accurate impression, but both the Twitter/Musk takeover and now Trump's win and DOGE seem to have sent a lot of people into strange hysteria that I've rarely seen on HN. I just want people to calm down and provide better commentary but I think that would be downvoted to hell with responses like "don't you know <insert apocalyptic event here> is happening? Are you <select from stupid/bigoted/evil>?"
The other day, I had a chat with a friend (IRL) and he told me his friend told him that planes are falling from the sky because Trump had fired all the air traffic controllers (which, as Trump would put it, is fake news, driven by irresponsible headlines). The commentary here is on that, very low, very misguided, tribal (not even partisan, worse) and very unhelpful level. That's not the kind of plurality of opinion HN needs, in my opinion.
The first election was very different as he had more accountability in the other branches of government. Now we are seeing Nazi salutes right from the inauguration, the alienation of every ally, the opening of mass detention centres outside the US, executive orders to place himself above the courts while siding with Putin, and unofficially created branches of government acting without oversight to dismantle large swathes of the state.
The reaction is completely different because the reality is completely different. People are "hysterical" because we grew up with the explicit agreement that we wouldn't let these things happen again and now the systems which we built to prevent it are rapidly falling apart. When fascism comes knocking at the door the time for plurality of opinion is gone, there are only two sides here and only one of them is in any way a moral choice.
I'm sorry to say, but I find your comment an example of the hysteria I mentioned.
Firstly, that is not fascism and I do wish people would stop misusing that word. The Doctrine of Fascism[1] is an interesting and elucidative read. I'm sure that tyrannical, authoritarian or any of the other more appropriate (yet still inaccurate, in my view) words would be a much better choice. The thesaurus lists arbitrary, which might fit Trump much better than any of them, but fascist, no. That would be a risible choice. Fascists don't cut the size of government for a start, and they certainly don't fight for the right to do so in court. What a thought!
> the opening of mass detention centres outside the US
Are you referring to the Guantanamo Bay detention camp? That was opened in 2002 and has run continuously till now. Two Republican presidents, two Democrat. I was against it in 2002 and I'm glad people are finally noticing, but its use as a processing centre for migrants goes back even further to the early 90s (and less officially, the 1970s)[2].
> executive orders to place himself above the courts
That is not what the executive orders do, nor is that how the system works. Even if you think the executive ignoring court orders would be a crisis, Andrew Jackson ignored them[3] almost 200 years ago, and America didn't descend into fascism (and not because it hadn't been conceived yet).
From[4]:
> The Trump administration is still entrenched in legal fights in the lower courts and, so far, has not defied any orders from the U.S. Supreme Court, the nation's highest court, she noted.
> Firstly, that is not fascism and I do wish people would stop misusing that word.
It is fascism. The treatment alone of the trans community is enough to earn it that title. If you want to make an argument from your book then please do so but linking to it alone is not an argument.
> Are you referring to the Guantanamo Bay detention camp?
Are you deciding to pick and choose from the list I gave you? The Nazi salutes mean nothing?
I'm referring to the directive to house 30,000 people there, up from a current population of zero. Coupled with dehumanising language like "illegals," the "delegation of immigration authority" to untrained officers, and the careless attitude to false positives make these developments very much incomparable to anything that's gone before (at least outside of places like Nazi Germany). Do ask yourself as well how easily the use of these facilities can be broadened in the future, once they're there, to house other types of "illegals."
They are already ignoring the courts [0]. What the Executive Order does "legally" does not matter to someone who claims "he who serves his country breaks no law." The only thing that matters is intention. You're thinking about this from a legalist perspective but those days are gone.
> If you want to make an argument from your book then please do so but linking to it alone is not an argument.
From[1]:
> "The Doctrine of Fascism" (Italian: "La dottrina del fascismo") is an essay attributed to Benito Mussolini. In truth, the first part of the essay, entitled "Idee Fondamentali" (Italian for 'Fundamental Ideas'), was written by the Italian philosopher Giovanni Gentile, while only the second part "Dottrina politica e sociale" (Italian for 'Political and social doctrine') is the work of Mussolini himself.
Please, ignorance is natural, but brazen and wilful ignorance in support of an argument based on nothing but pejorative and performative tribalism is something for Twitter or Reddit, not here. If it were Twitter I’d definitely add a facepalm emoji instead of babying you through this.
> The Nazi salutes mean nothing?
If we’re referring to Musk then I must say that I don’t agree that it was a Nazi salute. In my country, we grew up with the war as a constant cultural reference so I’ve seen thousands of Nazi salutes, I’ve never seen one like that. Wishful thinking for more fodder for those performative pejoratives does not a Nazi salute make.
> The only thing that matters is intention.
Let me know when you do the Show HN for your mind reading device.
I know what the book is, but I need more than the title and the author to take an argument from it. Generally with references like this people will point to a page number or offer a summary of what they feel are the relevant parts.
Yes, I'm referring to the Nazi salutes he gave. I can't help you see what you refuse to see right in front of you. Even Hitler gave it that way. I grew up in an apartheid conflict zone and have seen this othering before so let's not get into appeals to authority based on our backgrounds.
Calling me a baby is not an argument and I'm not sure that saying "I would totally emoji you but I'm too mature for that" is actually indicative of the maturity you're claiming. I want to point out as well that the tone of the conversation was set by yourself when you started it all with misogynistic accusations of hysteria.
1) Hysteria is a human emotion, not exclusive to females. To imply that it is certainly verges on misogyny.
2) If you knew what the Doctrine of Fascism was then you wouldn’t respond as you did. It contains the definition of fascism by the two figures that brought it to fame, one of whom was called “the godfather of fascism”.
3) Hitler did not salute that way and Musk did not make a Nazi salute.
4) I didn’t call you a baby, but I am having to baby you through basic knowledge. “To baby” is a verb, not an adjective.
5) I would use the facepalm emoji if this were Twitter. Have you noticed it’s not? Misquoting other users is not for HN.
> It was reasonable to expect that most of them didn't have the technical capacity to accomplish that in the available timeframe.
So what? Just because the owner can't respond in a given timeframe does not give you (or anyone) the right to appropriate other people's property. By your argument at the height of the Covid lockdowns I would be justified in taking your car & loaning it out to people because you weren't using it and didn't "have the technical capacity to accomplish it in the available timeframe."
The fact that it was a license that IA was assigning to themselves rather than a physical object makes no difference whatsoever.
> Just because the owner can't respond in a given timeframe does not give you (or anyone) the right to appropriate other people's property.
Inter-library loans are common. It was reasonable to think that more than enough libraries would have agreed to provide the number of books they lent out if it was feasible to contact them.
> By your argument at the height of the Covid lockdowns I would be justified in taking your car & loaning it out to people because you weren't using it and didn't "have the technical capacity to accomplish it in the available timeframe."
That would have deprived the vehicle's owner of the use of the vehicle, and created a risk that it could be damaged or worn out through use. You're using an analogy that hinges on the very thing that makes copyright different than personal property.
Also, doing things like that often is permissible, even with personal property, in an emergency.
Yes, we throw a lot of the normal rules out the window in an emergency.
However, to me (and the law), an emergency is "someone is going to die or be seriously injured, and imminent intervention is needed to prevent that."
I know that it sucked that most public libraries were closed for several months.
But nobody needed a copy of my book "Experimenting With Babies: 50 Amazing Science Projects You Can Perform on Your Kid" to prevent imminent serious harm.
If reading my book could have prevented injury or starvation, sure. But there was no "literary emergency" here that required pirating copyrighted material as the only reasonable response.
That can have downstream effects, though. When I talked to them about doing it to my more recent Subaru, they told me I'd lose the front speakers and the in-car microphone as well, since all of them went through the same fuse.
The title is a bit misleading. The point of the article isn't that their ethics board resigned - that happened a while ago (right after the Uvalde shooting). The point of this article is that even after the board resignation over taser-carrying drones and a shareholder proposal to stop drone development, Axon bought a drone company anyway.
> the gold standard should be that anything generated by ai is public domain
That's close to what the Copyright Office is laying out as their actual policy for AI-generated work: images or text created by AI prompting with no other human interaction are not eligible for copyright [1]. I like the Copyright Office's approach, since it's pretty straightforward - did a human do this? If yes, it can be copyrighted. If no, it cannot. It follows clearly from the monkey selfie thing from a couple years ago, as well.
Unsurprisingly, though, some people have a problem with that idea. The Washington Post ran an op-ed that they put on the front page of the opinion section for a couple days claiming that this ruling was awful, and would destroy everything. [2] Personally, I'm taking the op-ed writer's opinion with a grain of salt, since he seems proud to have written a book praising NFTs, but that's just me.
Edward Lee is doing such a cynical pearl clutch. Artists do creative work; if AI replaces 90% of that, there's no "creative work" left. It's like a record label calling itself a culture creator despite doing maybe 10% of the creative work.
And as you point out, it's not as if you can't copyright individual pieces and plan out the parts you are willing to make non-copyrightable (such as backgrounds).
1) They talk about "active" accounts (meaning have tweeted in the last 90 days), and do a bunch of filtering against that. That seems like a huge bias - lurkers exist, and in my experience are usually the majority of users...this step removes them or ignores them entirely. Frankly, until recently, my twitter account would have been one of the ones they would have discarded as inactive. This one thing alone makes me question all of the rest of their results.
2) By the same token, the rate or frequency with which a user sends tweets has no relation to whether a user is monetizable. If they're seeing ads, they're monetizable...lurkers are just as monetizable as high-volume posters.
You seem to be arguing against something that the article doesn't claim. The article isn't equating inactivity and fake/spam, but that: of the accounts that actively send tweets ~20% are fake/spam.
Sure that's a different question from what proportion of all users are fake/spam, but this is still a perfectly valid question to ask, and the fact that they're only considering active users is in the title so I really don't get your complaint.
If you want an analysis that attempts to answer a different question go find or write one that addresses the question you want answered...
The article clearly states (emphasis mine):
> This represents the largest set of accounts on Twitter we could acquire, but it includes analysis of many older accounts that haven’t sent tweets in the last 90 days and thus, likely don’t fit Twitter’s definition of mDAUs (monetizable Daily Active Users).
From the linked Twitter earnings report:
> We define monetizable daily active usage or users (mDAU) as Twitter users who logged in or were otherwise authenticated and accessed Twitter on any given day through Twitter.com or Twitter applications that are able to show ads.
EDIT: rephrased "accounts that are active" to "accounts that actively send tweets" to clarify what the article addresses.
That edit was made 40m before you joined the conversation. Noting your edits is a social convention and voluntary concession offered by a post's author to validate replies that were made before the edit, while clarifying the author's intended message for future readers. If those future readers use the content of the edit message to shallowly refute the post, consider the incentive this creates to not follow that convention for all authors in the future. If you have a valid refutation, surely you can find evidence for such in the body of the message rather than nitpicking the edit history.
I think you misunderstood their response. They are saying that the study has an unusual definition of "active", and that your need to clarify the definition proves that it is unusual.
Though personally I think filtering specifically for users that actively send tweets makes sense, since that's really what matters when it comes to measuring how healthy and authentic the discourse is.
It seems like everyone is arguing about different metrics and it makes more sense to discuss different, specific measures that might fall into a range of behaviors that are "active" in some sense rather than focusing on which definition of "active" is somehow the best one.
What would be more interesting would be to adapt this and answer several different questions about the proportion of spam among accounts with different metrics of activity to see how things change. For example, does the percentage of spam accounts go down a lot if we lower the bar for "active"? How much & how fast?
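As a rough sketch of what that comparison could look like: take a labelled sample of accounts, then recompute the spam share under progressively looser cutoffs for "active". Everything here is made up for illustration (the `Account` fields, the `spam_share` helper, and especially the toy data), and it assumes you already have a spam label per account, which is the hard part.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Account:
    days_since_last_tweet: int
    is_spam: bool


def spam_share(accounts: List[Account], max_days: int) -> Optional[float]:
    """Share of spam among accounts that tweeted within the last `max_days` days."""
    active = [a for a in accounts if a.days_since_last_tweet <= max_days]
    if not active:
        return None  # nobody meets this definition of "active"
    return sum(a.is_spam for a in active) / len(active)


# Toy data, invented purely to exercise the function. Not real measurements.
TOY_ACCOUNTS = [
    Account(1, True), Account(3, True), Account(10, False), Account(45, False),
    Account(80, True), Account(200, False), Account(300, False), Account(700, False),
]

for cutoff in (30, 90, 365, 1000):
    share = spam_share(TOY_ACCOUNTS, cutoff)
    print(f"tweeted within {cutoff:>4} days: {share:.1%} spam")
```

With real data, the interesting output is the shape of that curve: how quickly the spam share falls (or doesn't) as the bar for "active" is lowered. Note that accounts that have never tweeted can't be represented by this field at all, so pure lurkers would need a different activity signal.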
Twitter's quarterly earnings define active users thusly:
> Twitter defines monetizable daily active usage or users (mDAU) as people, organizations, or other accounts who logged in or were otherwise authenticated and accessed Twitter on any given day through twitter.com, Twitter applications that are able to show ads, or paid Twitter products, including subscriptions.
I'm pretty sure I've heard a similar definition from Facebook.
This definition supports g-clef's critique that the article picks an unorthodox way to measure active users, resulting in an inflated percentage of accounts being measured as spam/fake accounts, vs what the percentage would be if measured against Twitter's definition of 'active', which includes lurkers.
Strange rant. It's not about you editing your post in general. It's that your edit shows that saying "active accounts" when you really mean "accounts that have recently tweeted" is wrong, like the very title of this submission.
Look, there are dozens of potentially interesting and valuable questions to ask on this subject. Answers to which may produce a wide range of insights and conclusions. And there's a whole potential conversation about which questions are most important, that may have different answers depending on the context.
But there's no reason to pin the whole frame of the conversation to the one question for which Twitter corporate chose to publish an answer, unless the only question we are interested in is "did Twitter technically lie" which is the most uninteresting question in this whole situation. If this is the sole context you are using to frame this issue then maybe you should consider if you're following the current news cycle a little too closely.
The idea that there is such a thing as an 'inaccurate definition of active' is silly.
In the light of Musk's statements, which presumably precipitated this timely article, I would say the question of whether Twitter technically lied is the most important question for Musk doing the things he does.
If you're more interested in Twitter's ecosystem as a whole, it is less interesting.
At every company I've worked at, any time someone has asked "How many active users do we have?" it was a difficult question to answer because everyone's idea of "active user" was different.
"Active, as in logs in regularly? Wait, what is 'regularly'? Once a week? Once a month? Every day? Does 'active user' mean, online right now?"
Etc, etc...
Their definition of "active user" is relative, not inaccurate.
> I don't know, that seems more interesting than most questions that could be asked about Twitter.
Why? Twitter is a for-profit corporation. If, on balance, lying serves their interests (I'm sorry, I meant "is consistent with their fiduciary duty to their shareholders") more than edging up to the line without crossing it, that's what they will do.
Even the watchdog organizations such as the FTC and SEC that police the speech of corporations more or less limit themselves to material statements that move markets or influence consumer behavior in ways that can be considered fraudulent. The FTC, FDA, and others are concerned with a fairly narrow reading of consumer harm; the SEC is motivated by the health and trustworthiness of the public market. In any case, there pretty much always has to be some sort of alleged harm. Lying per se is hardly ever forbidden. So if the advantages of a lie outweigh the (risk adjusted) penalties and reputational risks, that's that.
I think a conversation about what ways we expect and permit corporations to lie, either specifically in financial statements or to the general public, is much more interesting than a discussion of exactly how many fake tweets there are and exactly how many accounts are making them, though I guess you could construe that as broadly part of the same conversation.
> I think a conversation about what ways we expect and permit corporations to lie, either specifically in financial statements or to the general public, is much more interesting than a discussion of exactly how many fake tweets there are
I agree, that would also be a much more interesting conversation than "did Twitter technically lie."
Sure. But if I'm looking to purchase Twitter, I think I'd be much more interested in and concerned about this "white" lie than you are as a general consumer.
I think it's pretty easy to argue that their definition is intentionally misleading, which may not be technically inaccurate, but is arguably just as bad.
The big story in the news last week was "Elon Musk says deal on hold while verifying twitter's 5% Monthly Active Users stat", or something to that effect.
That's the context this article was published in. It is transparently obvious they are re-using the phrase "active twitter accounts" to cause confusion with the definition of "active" that has been bandied around. The post is using such a title as clickbait, to hop aboard a trend.
I think the title, and lack of significant clarification in the article, make it clearly misleading, and I don't think pedantic "well technically active can have multiple definitions" changes the reality of the situation meaningfully.
> Why? Twitter included lurkers in its dataset, this article didn't, why should that impact stats in the direction of fake accounts being smaller?
Because you usually don't create fake accounts to lurk, but to do "something".
I'm speculating, but even when you create bots to boost follower counts you'd probably make them post now and then so as to seem "active".
It makes sense that the proportion of tweeting accounts that are bots is much higher than the proportion of lurkers that are bots. And since there are also more lurkers than posters, I would say that the real number across all accounts is much lower than that.
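To put rough numbers on that reasoning: the overall spam rate is just a weighted average of the rate among posters and the rate among lurkers. Here's a back-of-the-envelope sketch. The function name and every input below are assumptions picked for illustration (only the ~20% figure for posters comes from the article), so treat the output as showing the shape of the argument, not an estimate.

```python
def overall_spam_rate(posters: float, poster_spam_rate: float,
                      lurkers: float, lurker_spam_rate: float) -> float:
    """Weighted average of the spam rate across posting and lurking accounts."""
    spam_accounts = posters * poster_spam_rate + lurkers * lurker_spam_rate
    return spam_accounts / (posters + lurkers)


# Assumed inputs: 4 lurkers per poster, 20% spam among posters (the article's
# headline figure), and a guessed 2% spam among lurkers.
rate = overall_spam_rate(posters=100, poster_spam_rate=0.20,
                         lurkers=400, lurker_spam_rate=0.02)
print(f"{rate:.1%}")  # 5.6%
```

The conclusion hinges entirely on the guessed lurker spam rate. If bot accounts sit idle and only log in, that rate could be much higher than 2%, and the overall figure moves up accordingly.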
I don't buy the speculation as obviously accurate.
Let's say I own a twitter bot farm. I make 20k accounts, have a system setup that logs into each of them from a unique IP each month at random times to make sure they're not banned yet, and advertise it out. On month 1, someone buys 1000 of them as followers. On month 2, someone buys 1000 of them to tweet spam. etc etc.
Each month, there's 20k active bot accounts (logged in to verify they weren't banned). Only a small number may actually tweet though since buyers may have not gotten them yet. Bot accounts lurk too, for months on end, before ever acting.
I'm not claiming this is accurate, but I am claiming this is a reasonable alternative which doesn't align with the view of bot accounts being more prevalent in tweeting accounts than lurking accounts.
A metric they've artificially inflated by gating tweets, which works to their advantage when calculating spam. With that in mind, I think I'm more inclined to look at spam as a percentage of active tweeters and ignore lurkers.
I thought the parent was criticizing Twitter's active monthly user definition, which only includes people who have tweeted in the past 90 days. The article used this definition of active use as well.
Twitter requires users to log in before lurking so their definition of activity is intentionally selective. I'd be surprised if Twitter doesn't know how active their users actually are, even the lurkers.
I read lots of tweets and don't have a Twitter account, or at least one that I've logged into in the last 10 years... The philosophical question seems to be, "am I a Twitter user"?
You could probably argue that most of the world reads Twitter and hence are users, account or none. It's that pervasive.
But then there's the next question: "am I a user that reportedly matters to Twitter's business?". What people are trying to land on, in light of Elon's tweet that the deal is on hold pending investigation of Twitter's metrics reporting, seems to be a framework for carving out what exactly constitutes a user that brings the platform revenue that shows up in quarterly reports and hence would directly relate to the tangible value of the enterprise.
In reality, nobody knows what numbers are being thrown around behind closed doors. This article is just one framing.
It's not an active account if by "active" they mean "generating content". While Twitter isn't a typical content aggregation site like Youtube or Reddit, tweets are still "content" in the sense that they drive further user engagement on the site.
Words used to mean things. The current HN submission title just says active, heavily implying accounts with any kind of activity (eg. like, follow/unfollow), not "users who Tweet".
Sure, clickbait headlines are the norm and the devil lurks in the details, but still, many comments have been spent on this, because it's clearly misleading.
~80% of email is spam, it doesn't surprise anyone, because it's so cheap to send spam. Similarly it's easy to create fake accounts and spam, yet it doesn't mean much.
Who's counted as "engaged"? The people reading, or only the people writing? More to the point, if Twitter moved to a subscription model, would zero lurkers buy in?
Seems like social networks aren't interested in counting those that don't use all potential features of the platform. I'd say a lurker/ghost member is definitely an active account.
I would say that if someone is able to be advertised to (since that is what makes the business money) then they should be counted. So yes, there should be no requirement to tweet to be counted.
"Active user" is a common industry term with a well-defined meaning. It's misleading to use it to mean something else, particularly when there are a number of more appropriate choices, e.g. "20% of Twitter posters".
The article clearly defines those accounts as "active" because it's the only way an external observer can somehow isolate an "active" group. Only twitter can know how many users are "lurkers".
And since they are trying most probably to get some PR for their company, they use their specific definition of "active Twitter account".
When the context is:
- Twitter determines the active status of an account using logins
- People are wondering about the % of active users as defined by Twitter's metrics
and you then use your own definition of active, with only a one-liner on the difference, no reflection on the impact it might have, and no warning that you are answering a different question,
then my conclusion is that you want people to make this mistake.
> EDIT: rephrased "accounts that are active" to "accounts that actively send tweets" to clarify what the article addresses.
Made me laugh because you had to add it and made more effort than the author of the article to prevent the confusion :D.
Interesting. This could be a bracketing error, because I read
> it includes analysis of many older accounts that haven’t sent tweets in the last 90 days and thus, likely don’t fit
> Twitter’s definition of mDAUs (monetizable Daily Active Users)
As implying that they think accounts that haven't tweeted in the past 90 days don't fit Twitter's mDAU definition. Given the placement of the qualifying phrase, I think that's a reasonable parsing of the sentence, but I see your point that they could be trying to imply their set doesn't fit the definition. If so, that sentence is very badly constructed.
The full quote doesn't do SparkToro and Followerwonk any credit:
> Followerwonk selected a random sample from only those accounts that had public tweets published to their profile in the last 90 days, a clear indication of “activity.” Further, Followerwonk regularly updates its profile database (every 30 days) to remove any protected or deleted accounts. We believe this sample is both large enough in size to be statistically significant, and curated to most closely resemble what Twitter might consider a monetizable Daily Active User (mDAU).
The fact that they don't even consider the concept of a non-tweeting lurker to be an mDAU brings their entire analysis into question. Let's face it - Twitter is an emotionally-charged enough place, and tweets have such a way of living forever and being taken out of context, that there are many who use it to consume (and perhaps Like) content but will not tweet publicly. These people are still viewing and engaging with advertisements! Twitter absolutely should consider them monetizable!
But of course, engagement data on lurkers is internal only, and Likes data counts against global API caps: https://developer.twitter.com/en/docs/twitter-api/tweets/lik.... Which means that SparkToro and Followerwonk are incentivized to ignore these users. That they do ignore them, and don't address it anywhere in their methodology, is highly suspect.
The article is just clickbait. The title is obviously clickbait (based on your edit you've realized that "active account" !== "accounts that tweet"). Then they try to define active account:
> “Spam or Fake Twitter accounts are those that do not regularly have a human being personally composing the content of their tweets, consuming the activity on their timeline, or engaging in the Twitter ecosystem.”
Ok, but "consuming the activity on their timeline" is essentially unknowable outside of Twitter, since you can't see what tweets people are viewing. It turns out they're trying to infer this through some other signals like follower count, etc. But you can imagine why that might be sketchy.
Then they constrain the analysis:
> A more fair assessment of Mr. Musk’s Twitter following would only include accounts that have tweeted in the past 90 days
Let's be real, if you look at a list of Elon tweet replies, they might as well all be spam. Just search @elonmusk and sort by latest. Then compare that to the sorted tweet replies under an actual tweet. IDK how many millions of dollars and man-hours went into the AI that sorted this list, but it seems to just be putting the blue checks at the top and shrugging at the rest. I doubt this three man team is doing any better at spam detection.
For manipulation / spam purposes I don't really care about accounts that don't actively post/like/retweet/follow. The mDAU isn't useful at all for determining if the activity on Twitter is done largely by bots.
I do wonder how "fake" is calculated. Is @tweetsfrommydog fake? It's a real person making tweets that are funny and provide value to the platform, but it's not a real person as an individual tweeting their personal thoughts. Are corporate accounts or parody accounts fake?
It is valid criticism because the context of this article is that Elon Musk wants to know whether Twitter's own claim of ~5% fake/spam accounts is accurate. We really do want an analysis that investigates that precise question and not a related one.
According to Matt Levine, that's "not how any of this works". The $1B is if he could not secure financing, but it appears we are now past that point. The relevant question is whether the Twitter board wants to sue in court to compel a sale.
Given what Musk does to the personal lives of his opponents, I'm not sure I would want to fight him. But given how many laws and rules he's broken at this point, I think there is a clear failure of justice if he can just do whatever he feels like without repercussions due to his popularity.
Lurkers are also the most important people. They consume the content. They are the meat of the business, the ones that respond to advertising and political messaging. If I were Twitter I would champion all the lurker accounts, all the eyeballs to which Twitter serves content. Nobody ever faulted the Nielsen ratings scheme for "lurker" viewers who only watched but didn't themselves create television shows.
Definitely agree. I joined Twitter four months ago. I haven't tweeted yet, but I'm reading it daily on the app and occasionally liking tweets.
I've been so surprised at how effective the advertising has been on me. I've never experienced this level of engagement with online marketing.
Ads for TV shows, movies, live shows, musicians and comedians have been particularly effective.
I've found myself following a lot of show writers I've never heard of, and I even signed up for some new streaming services because of it. Google and Facebook ads never felt like they impacted me, though I know how important and dominant they are to business marketers. I've never clicked on a banner ad and my eyes glaze over sponsored links. Twitter's level of engagement with their marketing content is new to me, and I'm impressed.
I actively work to block or prevent ad tracking. When youtube serves me an ad for retirement planning or feminine hygiene products, that is my little victory. That is me successfully preventing them from knowing enough about me to target ads.
Yes and no. Just like on any major media platform, the huge majority of tweets being seen come from a very small group of influencers and popular people. That's why, when you join Twitter, it suggests a lot of people to follow who are already big.
You don't really need that many people to submit content though. I imagine most YouTube users have never uploaded a single video, and they don't need to, since there's basically no end to available content there.
Twitter specifically added the annoying feature of your likes being shown to your followers so that lurkers would be actively contributing to the algorithm though.
As long as lurkers are "liking" content, their local network will see an engagement increase.
This is a second-order objective though. The goal is to show ads to humans on the platform. Having a lot of human authors (or any kind of content authors) generating content is a way to achieve the goal, not a goal in itself.
There are other ways to achieve the goal, such as making ads more relevant (targeted advertising), having users consume more of the same content (recommendation), having the same content take longer to consume (periscope). Growing the number of human posters is definitely not a requirement.
And I thought it was common knowledge that lurkers always vastly outnumber people who post content on any platform. If lurkers outnumber posters by at least 3:1, then 20% goes to 5% and twitter’s “<5%” figure is accurate.
Lurkers are probably anywhere between 8:1 and 12:1. People actually posting stuff on the internet are in the vast, vast minority, which creates a sort of echo chamber.
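The dilution arithmetic in the two comments above can be sketched in a few lines. This is a back-of-the-envelope illustration, not SparkToro's or Twitter's method: the function name `overall_spam_share` and the default assumption that lurkers contain no spam accounts at all are hypothetical, chosen only to reproduce the 3:1 example.

```python
def overall_spam_share(spam_among_posters, lurkers_per_poster, spam_among_lurkers=0.0):
    """Spam share across posters and lurkers combined.

    Assumption (hypothetical): every account is either a poster or a lurker,
    and there are `lurkers_per_poster` lurkers for each poster.
    """
    posters = 1.0
    lurkers = float(lurkers_per_poster)
    # Spam accounts contributed by each group, per one poster.
    spam = spam_among_posters * posters + spam_among_lurkers * lurkers
    return spam / (posters + lurkers)


if __name__ == "__main__":
    # 20% spam among posting accounts, at the lurker ratios mentioned in the thread.
    for ratio in (3, 8, 12):
        share = overall_spam_share(0.20, ratio)
        print(f"{ratio}:1 lurkers to posters -> {share:.1%} spam overall")
```

Under those assumptions, 20% among posters becomes 5% overall at 3:1 and roughly 2% at 8:1. If lurker-looking accounts do contain spam (purchased followers, accounts being "aged"), raising `spam_among_lurkers` pushes the overall figure back up.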
I am technically "logged into" twitter so I can click through and read the postage stamp-sized charts linked to through various articles and blogs, or watch a video about a riot in some far flung part of the planet. Once a year I tweet at airlines when they lose my luggage or whatever but otherwise don't tweet. Twitter isn't a good social media service, it just happens to be the image/video sharing platform of choice for journalists to promote themselves.
I created an account 5 years ago, followed one or two people, got bored and never logged in again.
Presumably their intention is to exclude abandoned accounts, like mine - is there any way they, viewing Twitter externally, could tell lurker accounts like yours and abandoned accounts like mine apart?
As a third party? Probably not. Which is why it's going to be very hard to disprove Twitter's assertion unless Twitter chooses to share their data.
That's part of why I find articles like this frustrating: I don't think they have the data to actually answer the question they're attempting to answer. Knowing that, what's the purpose of the article?
> Which is why it's going to be very hard to disprove Twitter's assertion unless Twitter chooses to share their data.
It's impossible to disprove Twitter's assertion because they never claimed that less than 5% of their accounts are spam. From their quarterly earnings:
>We define monetizable daily active usage or users (mDAU) as Twitter users who logged in or were otherwise authenticated and accessed Twitter on any given day through Twitter.com or Twitter applications that are able to show ads.
>... mDAU does not include users accessing Twitter through third-party applications.
Their statement said that less than 5% of their monetizable daily active users are spam. There very well could be 50% of the entire user base as bots or spam, but that doesn't negate the metric Twitter releases.
This doesn’t resolve the issue the article has though. I’m an mDAU because I’ve logged in, yet there is no way for the people writing the article to know that I’m active.
They could maybe use like activity in addition to just tweets? Inherently though this system is going to be less accurate than the dataset that Twitter has access to. If a large chunk of users only engage in Twitter through DMs then an external organization isn’t going to have insight into that.
I would imagine Twitter would have access to analytics that third parties don't have, which would allow them to pretty easily work out which accounts are logged in and used for browsing and which are actually abandoned.
Opening a Twitter link in a private tab is the low complexity solution, or there's nitter.net, or deleting cookies, or various browser extensions that delete cookies for you.
After posting that, I went back and retested. It looks like they have swapped back to a soft nag popup. For a few months it was hard blocking any further scrolling, at least with Chrome.
If an account is in lurk mode, then it's not a spammer, so I'm okay with it being left out of that equation.
Where I might agree with you is a lurk mode account could become collateral damage in being considered fake. Lurkers don't retweet though. An account with a million followers isn't seen by everyone. Having a portion of that million like/retweet amplifies even further with their network now possibly seeing something from someone they are not following directly.
I'd be willing to accept that the number of lurkers that get lumped in with fake accounts when deciding the percentage of actual eyeballs on posts is not harmful. Those numbers are made up stats anyways. Like the old days of TV/Radio stations that covered large cities with millions of citizens. They would claim they have an audience in the millions even though a small fraction were actually watching/listening.
Except the question isn't about the pure number of spam/bot accounts, it's about the ratio of spam/bots to "authentic" users. If you leave out the lurkers, that ratio gets skewed to mistakenly inflate the bot count.
First off, I don't give 2 shits about twitter, so I don't care if the numbers are skewed in either direction. This is more of an interest in seeing how SV stats/metrics are just a game. Just so that's out there.
A lurker isn't an active user in my opinion. Maybe that's not the same as the accepted definition. The lurkers might be absorbing some of the ad content, but they are not helping create new avenues for ads to be shared. Twitter's ad share surface area would increase tremendously if every user was actively producing tweets. That's the only metric that they are concerned with. They don't care about how many people actually see the ads once they are there. They make their money on the potential eyeballs alone. Lurkers are not helping increase those numbers.
> They make their money on the potential eyeballs alone. Lurkers are not helping increase those numbers.
I don’t follow this. Lurkers are the eyeballs, presumably.
If everyone on twitter tweeted the same amount it would probably just drown out the popular accounts and create a more diffuse and less profitable ad space I think.
>> They make their money on the potential eyeballs alone. Lurkers are not helping increase those numbers.
>I don’t follow this. Lurkers are the eyeballs, presumably.
The number of eyeballs allows the price per ad to increase, while the number of places ads can be placed increases the volume of ads. If lurkers are not helping to increase the volume, it doesn't make the platform as much money. Proving the lurkers are actually consuming the ads and making the ad buyer happy is non-trivial. Proving the lurkers are worth increasing the price per ad is also non-trivial. In the end, I personally feel like lurkers being overly represented in the fake account numbers is a wash.
Volume of ads is irrelevant. An additional tweet to attach an ad to does not generate revenue if there is nobody looking at it. On the other hand, though, an additional set of eyeballs on an existing monetized tweet does generate additional revenue.
As an extreme example, a single monetized tweet with a billion viewers generates money. A billion monetized tweets with one viewer obviously do not.
Why are lurkers not helping numbers? It's the exact same as YouTube: do you expect the majority of lurkers on YouTube to not be counted because they didn't create a video? People follow what is already out there and ads target the people watching.
Edit: I guess it's true that lurkers won't be bots, unless they are clicking on ads or trying to simulate engagement to help certain twitter accounts seem popular.
That means that 20% of the posts that I see, as a lurker, are generated by bots. The bots are having a huge influence on conversations, and that's important to know.
> That means that 20% of the posts that I see, as a lurker, are generated by bots
I don't see how you can arrive at this conclusion. It depends on who you are following, with some additions by the algorithm (unless you use the chronological feed) and (speculating here) the algo pushes content from real humans.
I read tweet replies, not just tweets (apologies if I'm not using the correct terms, I'm not an active Twitter user). The original tweet may be a real user, but I often dive deep into all of the comments. If 20% of those comments are from bots, then that's a lot.
No, since you choose who you follow, you're most likely filtering for interesting stuff. I'd wager that most of the spam bots are pretty obvious to spot, and make up very little of a user's feed.
I don't know how many original tweets are made by bots but 20% of the replies to anyone with a 5 figure follower count seems to fall on the low side of what I would guess.
It sounds like you are genuinely a non-active user, and probably not interesting from the PoV of Twitter/acquirers or the GP poster. This thread is about lurkers: people who regularly log in and read their feed (thus consuming ads and being relevant from Twitter's business perspective), but who don't post and would thus be excluded using the methodology of TFA.
I was offered $300 for my Twitter account, I suppose partially on the basis that I haven’t tweeted much. I use it daily to weekly but don’t tweet often: one tweet in the last 2 years or so.
Well, I've been actively trying to create a new Twitter account for a little under a month and Twitter thinks I'm a bot. I've made 1 tweet and followed 5 people.
Even paid for Twitter Blue...still thinks I'm not real. Support is unreachable.
My current plan is to wait til Elon completes the takeover and then build an entire site dedicated to getting Elon's attention to unlock my account...because that's the only way to contact somebody apparently.
Edit add: I find it horrible that we have companies that you cannot contact; in fact, they seem to be going out of their way to make it hard to contact them.
Even things you pay money for, like airline tickets. They want you to email them, make the phone number hard to find. So you do, they don't respond and then you have to search and call them, wait an hour or more on hold. The agents are nice but the entire process is terrible.
Earlier I had to do that for a damaged luggage claim. Went through the automated phone assistant to get to damaged luggage claims and it gave the option to use text messages. So I give it a try, nope. They can't resolve the issue through text, has to be on the phone. So I had to call back, re-enter all the info through the automated system and then ignore its pleadings to use the text system.
They were probably forced to, since they do not have access to login information. Especially since, if you log in but do not post, you are certainly not a spammer ^^ (though you could still be a bot crawling).
But they probably should expand more on this and reflect on how much inaccuracy it adds. With a quick search you can find that less than 50% of US users tweet five times a month (https://www.pewresearch.org/fact-tank/2022/03/16/5-facts-abo...). Or the study which reported that the top 25% of users produce 97% of the content, with the median user in the bottom 75% posting 0 tweets a month (https://www.pewresearch.org/internet/2021/11/15/2-comparing-...). Those studies were done using surveys, I believe, so they should include only active users and no spam/bots.
So with random invalid maths, if you make the assumption that the less active 25% of users might not even post every two months (exponential decrease of activity?), then you need to add back a quarter of the 80% they found as active.
Not to say I believe the 5% number from Twitter. I was going to use the price for a thousand followers as an example, but seeing that it appears to be at $30 now (https://socialboss.org/buy-twitter-followers/ ?) when I remembered it at about $5, the Twitter team might have done some good work ;).
But one can say that 20% of the content on the platform was distributed by bots.
Meaning that all the lurkers have to consider whether they are really interested in content that was pushed by some bot farms.
Technically, every user of this platform has to take a step back and evaluate whether anything they have seen was content pushed by bots.
20% is huge, and I am curious if there will ever be some comparable "official" numbers.
No - you can say that 20% of the accounts actively posting are spam/bots.
It's possible they are posting MUCH more or less than 20% of the content.
If these are skewed toward the high end of producers - the 80/20 rule would say that as much as 80% of the content could come from them. Still - it's possible this content isn't interacted with much outside of other bots. You can't draw many conclusions from such a limited data point.
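To make the accounts-versus-content distinction above concrete, here is a minimal sketch. The function `bot_content_share` and the per-account posting rates are hypothetical inputs for illustration; nothing in the article or this thread measures them.

```python
def bot_content_share(bot_account_share, tweets_per_bot, tweets_per_human):
    """Fraction of all tweets written by bots.

    Assumption (hypothetical): every bot posts `tweets_per_bot` tweets and
    every human posts `tweets_per_human` tweets over the same period.
    """
    bot_tweets = bot_account_share * tweets_per_bot
    human_tweets = (1.0 - bot_account_share) * tweets_per_human
    return bot_tweets / (bot_tweets + human_tweets)


if __name__ == "__main__":
    # Same 20% of accounts, three different posting-rate assumptions.
    for per_bot, per_human in ((1, 1), (16, 1), (1, 10)):
        share = bot_content_share(0.20, per_bot, per_human)
        print(f"bots post {per_bot}x, humans {per_human}x -> {share:.1%} of tweets from bots")
```

With equal posting rates the content share matches the account share (20%). If bots post 16 times as often as humans, the same 20% of accounts produce 80% of tweets; if humans post 10 times as often, bots produce under 3%. The account percentage alone does not pin down the content percentage.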
There was a suggestion to run a sting operation: display a captcha to a sample of users to determine the % of bots.
Picking the sample is probably still challenging, but it can at least somewhat tell whether the accounts in the sample are genuine.
The method in this article is so flawed that Larry Ellison, founder of a famous law firm, would count as an inactive account since he hasn't tweeted since 2012[0], and he is apparently looking into investing in Twitter[1]. How can he be investing a billion in Twitter when he doesn't use Twitter at all?
They point out that their definition of active accounts is a flaw in their methodology (inside the article). However, I think it's fair to say that while TWTR has better internal insight into an "active user", it's the best approximation one can do from the outside.
I do wonder how, given perfect knowledge, the bot accounts would shake out. What percentage produce content (presumably propaganda, automatic tweets using it as an RSS-like announcement service, and spam) vs follow people (boost follower counts, sell likes)?
>They talk about "active" accounts (meaning have tweeted in the last 9 weeks), and do a bunch of filtering against that. That seems like a huge bias - lurkers exist, and in my experience are usually the majority of users...this step removes them or ignores them entirely.
All true. However, do you really believe that a bot is more likely to be active than a real user? If so, fair play to you. If not, then we would expect inactive users to be bots in an even greater proportion than what we see among active users.
We can argue about what the article did and didn't imply, but what's interesting to me about the issue you raise is that among lurkers there is probably a much lower rate of fake/spam activity, since there are fewer reasons for a bot to log in and not tweet. Couple that with the fact that lurkers are generally the vast majority of users on any platform, and that alone could explain the discrepancy between Twitter's 5% number and SparkToro's 20%.
Services that sell followers and spammers "aging" accounts generally would look like lurkers. Twitter could probably get an accurate estimate with the amount of analytics they have for internal use only, but of course they might be incentivized to not try very hard.
> lurkers exist, and in my experience are usually the majority of users...this step removes them or ignores them entirely.
I've spent many, many hours lurking on twitter, don't have an account at all, and mostly access it through nitter instances. Are they "biased" for not including me?
edit: should inactive users be counted as active users?
Yeah, and I fully expect that these numbers went up recently with Twitter requiring login to view threads.
The fact that they add a .42% is a red flag in itself, especially when they admit in their own post that they agree that their analysis is deserving of critique. Very misleading stuff.
Their analysis using purchased bots seems a bit more reasonable.
“Passive” accounts may actually be more likely to be bots as many services sell fake followers. It’s just harder to detect with public information rather than their IP addresses etc.
Similarly I don’t think there is any way to separate active vs abandoned passive accounts as a 3rd party.
> They talk about "active" accounts (meaning have tweeted in the last 9 weeks),
This is not their definition, that's what Twitter considers an active account in their revenue reports.
> has no relation
It has some relation, no? I wouldn't be surprised if there is a strong correlation between how frequently a user sends tweets and how monetizable that user is.
Their TL;DAbstract refers to this as a 'conservative' methodology that is 'rigorous' and 'likely undercounts'.
Their definition:
> “Spam or Fake Twitter accounts are those that do not regularly have a human being personally composing the content of their tweets, consuming the activity on their timeline, or engaging in the Twitter ecosystem.”
They note the following to differentiate fake and spam:
> Many “fake” accounts under this definition are neither nefarious nor problematic. ... By contrast, most “spam” accounts are an unwanted nuisance.
Some general data analytics notes from their post:
* They lump together fake and spam in their analysis, and this really matters! Somewhere like NYT is both 'fake', meaning it isn't a real person, and A HIGHLY VALUABLE ACCOUNT for Twitter to have.
* They use a sample of 44,058 accounts (of ~1.047B)
* They look at a number of classifying variables (17), spam accounts met 10+ of those 17 criteria. They don't list all 17.
* The criteria were developed from a "machine learning process" that is undescribed, and was developed from a sample of 35,000 'known' fake twitter followers bought from 3 vendors and 50,000 claimed non-spam accounts. They appear (imply?) to have used 50% training 50% real data but don't specify explicitly.
* They say their model is about 65% accurate, and unlikely to produce false positives ("almost never includes false positives") - however they don't list any specificity, sensitivity, etc. that would be useful to evaluating that claim.
* The analysis does no statistical tests, no confidence intervals, minimal information about how the model was tested or validated.
* Critically: they note, but do not describe or quantify, that a lot of the criteria are highly correlated
* then later in the article they suddenly seem to switch to a 10 point scale for quality away from their 17 point scale? with a threshold of 3 or below as low quality?
* My personal twitter account meets most of the metrics where they have listed a quantifiable threshold. And their fake followers tool lists it as pretty f'ing suspicious - i.e., low quality.
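As a rough illustration of the kind of numbers the list above says are missing, here is a sketch of a sampling confidence interval and a misclassification correction. The inputs are assumptions: the 20% observed rate is the round figure discussed in this thread, and the sensitivity and specificity values are hypothetical placeholders, since the post does not report them. The interval is a plain normal-approximation (Wald) interval and the correction is the standard Rogan-Gladen estimator; neither is claimed to be what the authors did.

```python
import math


def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion."""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def rogan_gladen(observed_rate, sensitivity, specificity):
    """Correct an observed positive rate for classifier error (Rogan-Gladen)."""
    return (observed_rate + specificity - 1.0) / (sensitivity + specificity - 1.0)


if __name__ == "__main__":
    n = 44_058        # sample size quoted in the post
    observed = 0.20   # round figure from the thread, not an exact value

    low, high = wald_interval(observed, n)
    print(f"sampling-only 95% interval: {low:.2%} to {high:.2%}")

    # Hypothetical classifier properties; the post gives no such figures.
    for sens, spec in ((1.00, 1.00), (0.65, 1.00), (0.65, 0.95)):
        corrected = rogan_gladen(observed, sens, spec)
        print(f"sensitivity={sens:.2f}, specificity={spec:.2f} -> corrected rate {corrected:.1%}")
```

With a sample this large, sampling error alone is only a fraction of a percentage point, so sample size is not the weak point. The corrected rate, by contrast, swings widely with the assumed sensitivity and specificity, which is why leaving them unreported matters.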
I'm not saying they're wrong, but I am saying good luck getting this from a blog post to any sort of respectable science publication. As they note at the end, they aren't even calculating the same metric - Twitter uses monetizable daily active users - remember NYTimes? Absolutely a monetizable account - even if it isn't a real person.
anyone who thinks this is proof of Elon's 4D chess based on this article is, to me, frankly delusional.
Turning my cynicism switch on a bit. The author is a very good content marketer. A hot topic in our corner of the world, which is the author’s target audience, is Elon Musk buying Twitter. Musk tweeted that the percentage of bots is the main issue of the deal. He disputed Twitter’s number of 5%.
I believe the author's writing prompt was just: a headline about fake Twitter accounts showing a number significantly higher than 5%. That’s it. Whatever the methodology, that was the author’s goal.
The article achieved this goal. Everything else is completely irrelevant. Even for the person who wrote it.
My account was active until recently (deleted when Twitter accepted Musk's offer, I don't need to be a participant in a right wing cesspool). I have 0 tweets. I don't like things, because I don't want my name attached to someone else.