There is a leaked video from Chainalysis; they basically deploy rogue nodes or reverse proxies that capture the IP address along with the Monero tx. I suggest watching that leak before reading the article.
I thought the entire point of cryptocurrency is that you operate in an adversarial environment where you can trust no one. In that light, calling nodes that log IP addresses rogue seems foolish (it’s not like they’re trying to undermine the protocol, in which case rogue might be the right word); they are exactly what you should expect.
Yes, in the Chainalysis video they say clearly that Dandelion++ is very effective so they have no confidence in IP addresses collected in those cases. You need to be foolish enough to directly connect to a malicious node while using your home IP, which obviously will leak the IP. I think a lot of people are just confused by the video because they go through examples that seem very 'constructed' and unrealistic. I mean, this is all very well known information to monero users and I can't believe anyone would be doing important transactions without using their own node and/or I2P/Tor etc
> I mean, this is all very well known information to monero users and I can't believe anyone would be doing important transactions without using their own node and/or I2P/Tor etc
And I wonder what an estimate is of % of transactions (by volume and value) that are sent from a full node vs public remote node and public web vs tor/i2p.
Obviously won’t be able to get an accurate answer, but one of the remote nodes in that pool might be able to provide some absolute numbers and a rough estimate of their share of connections in that pool.
Should be easy for them to differentiate clearnet vs tor exit node (and dunno how detectable i2p is).
Even the geo-dns mentioned in the article would be interesting data to see geo-source of transactions.
There are multiple attacks in that video. IP is one, but a much bigger one is knowing the real output (or maybe the real output plus one other output) among the 16 ring signature members. They do not explain how they achieved it, but one could guess it was by simply making a lot of transactions themselves: 15 decoys is just way too few, and flooding the blockchain with transactions all but ensures that anyone picking random ring members will pick a lot of your outputs (and thus have little privacy from you). It is also for sure too few for targeted active attacks (e.g., it is not safe to have repeat interactions with the same entity), see https://www.youtube.com/watch?v=9s3EbSKDA3o. You really want more than 16 ring members, preferably all outputs ever created like in Zcash. The FCMP work promises to bring similar privacy to Monero but is years out on the roadmap.
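To put rough numbers on the flooding guess, here is a back-of-the-envelope sketch. It assumes each of the 15 decoys is independently one of the flooder's outputs with probability `f` (the flooder's share of recent outputs). That is a simplification: real decoy selection is weighted by output age, and the function names here are made up for illustration.

```python
from math import comb

RING_SIZE = 16           # ring members per input, as stated in the comment
DECOYS = RING_SIZE - 1   # 15 decoys plus the real output


def effective_ring_size(f: float) -> float:
    """Expected ring members left after discarding the flooder's own outputs.

    Assumption: each decoy is attacker-owned independently with probability f.
    """
    return 1 + DECOYS * (1 - f)


def prob_at_most_unknown(f: float, k: int) -> float:
    """P(at most k decoys are NOT attacker-owned), a binomial tail.

    k=0 means the real output is fully exposed; k=1 corresponds to the
    "real output plus one other output" case.
    """
    return sum(
        comb(DECOYS, j) * (1 - f) ** j * f ** (DECOYS - j)
        for j in range(k + 1)
    )


if __name__ == "__main__":
    for f in (0.25, 0.5, 0.75, 0.9):
        print(
            f"f={f:.2f}  effective ring size={effective_ring_size(f):.2f}  "
            f"P(<=1 unknown decoy)={prob_at_most_unknown(f, 1):.4f}"
        )
```

Under this toy model the ring only collapses to one or two candidates when the flooder owns a large majority of outputs, so flooding alone may not explain the video; it would more plausibly be combined with other heuristics.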
Your own node is connected to other nodes to get the latest blocks and publish transactions to the network. These peers are selected randomly among the pool of available nodes. If the attacker has enough nodes, there is a good probability that your node's peers are partly controlled by the attacker. When you publish a new transaction and broadcast it to your peers, the attacker can detect that it is indeed a new transaction (since it is the first time it's seen by the attacker nodes) and that the IP address of your node is the IP address of the transaction sender.
It's not going to work 100% of the time (except if _all_ your node's peers are controlled by the attacker) but with a few transactions it's eventually going to lead the attacker to your IP address.
It's the same kind of attack that is used to deanonymize people on Tor.
If you want to protect yourself from that, you need to add a few layers of trusted no-logs VPN in front of your node, so that the attacker is led to a dead end.
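The peer-sampling argument above can be sketched with some simple probabilities. This models only the naive "broadcast to all peers" behaviour described in this comment and assumes peers are drawn independently from a pool in which a fraction `f` is attacker-controlled. The peer count of 12 and the independence between transactions are illustrative assumptions, not measured values.

```python
def p_any_attacker_peer(f: float, peers: int) -> float:
    """P(at least one of your peers is attacker-controlled)."""
    return 1 - (1 - f) ** peers


def p_all_attacker_peers(f: float, peers: int) -> float:
    """P(every peer is attacker-controlled), the '100% of the time' case."""
    return f ** peers


def p_linked_after(p_single: float, n_tx: int) -> float:
    """P(at least one of n_tx transactions is linked to your IP).

    Assumption: each transaction is linked independently with probability
    p_single, e.g. because the peer set changes between transactions.
    """
    return 1 - (1 - p_single) ** n_tx


if __name__ == "__main__":
    PEERS = 12  # assumed number of peers, for illustration only
    for f in (0.01, 0.05, 0.2):
        p = p_any_attacker_peer(f, PEERS)
        print(
            f"f={f:.2f}  P(any attacker peer)={p:.3f}  "
            f"P(all attacker peers)={p_all_attacker_peers(f, PEERS):.2e}  "
            f"P(linked within 5 tx)={p_linked_after(p, 5):.3f}"
        )
```

Having an attacker peer only means the attacker sees the transaction arrive from you; whether that pins you as the sender depends on how transactions are actually relayed.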
> When you publish a new transaction and broadcast it to your peers, the attacker can detect that it is indeed a new transaction (since it is the first time it's seen by the attacker nodes) and that the IP address of your node is the IP address of the transaction sender.
You're assuming that peers will relay new transactions to all their peers, but that is not the case with the Dandelion++ protocol that Monero adopted [1].
What proportion of nodes? There are papers that analyse it but I haven't read closely or found a clear answer.
I suppose even if they controlled all but 2 nodes - the extreme case - even then they couldn't know with certainty which of the 2 nodes sent the transaction, so it could be argued that there is always plausible deniability.
Let’s call these 2 nodes N1 and N2.
The case you mention only works if N1 is connected to the network only through N2, in which case when the attacker’s nodes receive a new transaction from N2 there is plausible deniability for both N1 and N2.
In any other network topology, N1 and N2 are broadcasting their transactions to the attacker's nodes, which can then link them directly to N1 or N2.
So no, this attack doesn’t require owning the whole network.
I don’t know which threshold makes the attack practical though. I guess there is probably no threshold: the bigger the share of the network you own, the bigger your percentage of successful IP tagging is.
I don't think that's how Dandelion++ works; one of us is mistaken. In any network topology, I think it is possible that in the first step of the stem phase the transaction is propagated only from N1 to N2. It will be impossible for the other nodes to know whether that happened or not, so they can't know whether N1 or N2 transmitted the transaction first. I could be mistaken, but this is how I understand it.
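A toy simulation of the stem-phase argument, to show why an attacker node that receives a stem transaction from some node cannot be sure that node is the origin. This is not Monero's implementation: it ignores real graph structure, timing, and embargo timers, treats each stem hop as landing on an attacker node with probability `attacker_fraction`, and uses an assumed `continue_prob` for staying in the stem phase. All names and parameter values are hypothetical.

```python
import random


def first_spy_precision(attacker_fraction: float, continue_prob: float,
                        trials: int, seed: int = 0) -> float:
    """Fraction of stem-phase interceptions where the sender is the origin.

    The attacker's guess is "whoever handed me the transaction during the
    stem phase created it". Returns 0.0 if nothing was ever intercepted.
    """
    rng = random.Random(seed)
    intercepted = 0
    correct = 0
    for _ in range(trials):
        hops_from_origin = 0  # 0 means the current sender is the true origin
        while True:
            # The current sender forwards the tx to a single stem peer.
            if rng.random() < attacker_fraction:
                intercepted += 1
                if hops_from_origin == 0:
                    correct += 1
                break
            # An honest relay received it; it either keeps stemming or fluffs.
            hops_from_origin += 1
            if rng.random() >= continue_prob:
                break  # fluff: the relay floods the tx, origin stays hidden
    return correct / intercepted if intercepted else 0.0


if __name__ == "__main__":
    for f in (0.05, 0.2, 0.5):
        precision = first_spy_precision(f, continue_prob=0.9, trials=100_000)
        print(f"attacker share={f:.2f}  guess correct={precision:.3f}")
```

In this model the guess is right with probability `1 - (1 - attacker_fraction) * continue_prob`, so it only approaches certainty as the attacker's share approaches 1, which fits the plausible-deniability reading above.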
The entire system seems like a scam. Thankfully, the COVID-19 pandemic and the Ukraine war have deflated startup valuations a bit. I believe another crisis is looming in the near future, and once it passes, we should be in a better place for the next 5-10 years.
More individuals are increasingly recognizing that we are currently in a medieval era of the Internet. The content and creations we share online today are essentially not entirely our own. Many are eagerly awaiting a transformative moment akin to the French Revolution to rectify this situation.
I personally hope that nothing changes in this regard. I think it’s great if everyone can use the content that was voluntarily posted online for the whole world to see.
Speaking as someone with 100k rep on SO, I don't mind them using my answers to train the model. What I object to is when the resulting model is not then itself posted online for the whole world to see. If OpenAI actually shared the weights for GPT-4, I would consider it an ethical trade.
>More individuals are increasingly recognizing that we are currently in a medieval era of the Internet. The content and creations we share online today are essentially not entirely our own.
Currently? Even in the GeoCities days, companies could presumably crawl/scrape your site to train their LLMs. The only way of preventing that is by paywalling your site, which I doubt many hobbyist bloggers are going to do.