Hacker News new | past | comments | ask | show | jobs | submit login
A peek under Bitcoin’s hood: Writing a client that can create a transaction (samlewis.me)
241 points by samlewis on June 5, 2017 | hide | past | favorite | 63 comments



Article says

> This is also why address reuse in Bitcoin is encouraged as to sign a transaction you need to reveal your public key. If you don't reuse an address after sending a transaction from the address, you don't need worry about the private key of that address being exposed.

Shouldn't that say "address reuse in Bitcoin is discouraged"? Otherwise I don't think I understand what he's trying to say.


It should say This is also why address reuse in Bitcoin is "discouraged" as to sign a transaction you need to reveal your public key. If you don't reuse an address after sending a transaction from the address, you don't need worry about the "public key" of that address being exposed

The reason being without revealing public key, with only the bitcoin address the attacker first need to guess the public key from the address, then guess the private key from there. So just breaking one of the hash algorithm or ecdsa algorithm is not enough to steal funds. at least that's in theory, in reality if either algorithm is broken we have a much bigger problem.


I was under the impression that ecdsa was potentially broken by quantum computers, but SHA-256 was not. Is that not the case?


Yes, in theory. See https://en.wikipedia.org/wiki/Elliptic_curve_cryptography#Qu...

ECDSA is vulnerable to a modified version of Shor's quantum integer factorization algorithm. However, nobody on Earth is known to be close to producing such a computer. Adiabatic quantum computers like the ones produced by D-Wave are not known to be capable of running Shor's algorithm. See https://en.wikipedia.org/wiki/Adiabatic_quantum_computation

SHA-256 and hashing algorithms have no known quantum attack against them, but one could theoretically gain a sqrt(n) advantage in brute-force search using Grover's quantum search algorithm. https://en.wikipedia.org/wiki/Grover%27s_algorithm


Huh I knew that Grover's algorithm would yield speed ups in db searches, but apparently I didn't read the wikipedia article closely enough. It allows for inversion of any function in sqrt time!


Address reuse is most certainly discouraged from a security standpoint! I expect that you are correct as to the article's intended meaning.


Address reuse only becomes a problem in the theoretical case that someone can determine the private key from a signature. This would only happen if there was some sort of breakthrough in cryptoanalysis of elliptic curve cryptography, which while theoretically possible, is unlikely.


The idea is that you don't want to be secure now, but also in the future. The ECC Bitcoin uses is vulnerable to Shor's algorithm, and thus quantum computation. Quantum computing at that scale is a fair ways off, most likely, but on a timescale of a couple decades or more, a successful attack seems quite reasonable.

Bitcoin can soft-fork, allowing address generation with a different algorithm. It's not any sort of existential threat to Bitcoin. But it does require that you move your coins to a new address if you reuse an address, else you are vulnerable.


Not to mention, that such a discovery would be significant enough - that a protocol upgrade and fork would be inevitable, in order to protect addresses retroactively.


or you reuse the nonce in ECDSA.

This burned sony, and burned people that had faulty wallet code that submitted transactions with the duplicate nonces.

If you only published, and signed a transaction once, you would be immune to fail by ECDSA nonce reuse.

Its good to rotate the publickey/address per transaction in bitcoin


Oops, you are correct. Good pick up, I'll get to changing that!


Very cool. This could be used to implement offline wallet with minimal amount of third party code.

One issue I'd like to point is I wish the author used proper random generated private key. Examples on the internet have a nasty habits of being copy pasted and reused verbatim :(


Not using cryptographically safe random key is very stupid, even though it is only an example. It should be very trivial to fix this example.

I think in python the right way is to use the os.urandom() function.


My intention in using "hand crafted"/memorable private keys was just to make the article and what I was doing easier to follow.

I don't think there's any harm in people copying and pasting examples, it only becomes an issue if they start depositing money to either of the addresses that I listed the private keys for. I hope that's an obvious enough mistake not to make! Thanks for the feedback anyway, I'll keep an eye on those addresses and if they start getting mysterious deposits change the article to include more cryptographically secure keys.


Hi Sam,

If you're reading the comments, can you do this treatment but for Ethereum and the smart contracts built on top of it?

I find that your article gives a good explanation especially for beginner to understand from the code perspective.


FWIW, I was looking at eth code in the weekend and it's pretty accessible (I was looking at go -- I've read that eth python should be even more readable).

The address part is very similar, even simpler, than bitcoin. It's the same elliptic curve, same way to build private/public keys.

For the address, it uses a different hash function, Keccak256, and then it's just serialized in hex. Code here [1]. Pointers to key generation [2, 3].

[1] https://github.com/ethereum/go-ethereum/blob/release/1.6/cry...

[2] https://github.com/ethereum/go-ethereum/blob/release/1.6/acc...

[3] https://github.com/ethereum/go-ethereum/blob/release/1.6/acc...


Keccak256? Isn't that just SHA3 with a digest of 256 bits?


No, it's slightly different from SHA3-256. There was a tweak for standardization.


I'm unsure. I think this came before sha3 was defined so some parameters may have changed. But the core is essentially that.


There are minor padding differences but that's it.


No promises but I was actually considering having a crack at this as my next project. stay tuned!


Looks like there was bitcoin left in the address he published the private key for until about 30 minutes ago. I hope that was a deliberate giveaway.

https://blockchain.info/tx/2685ff794de17cebdf94eb0f111e8b8c0...


I did deliberately leave that there in the hope someone might take it using what they'd learnt from the article. there's a short note about this in the conclusion to the article. From the "non whole" fee amount it looks as though someone might have just used some sort of wallet software to take it though, which is a shame.


As of the time of publishing, roughly ~8.58 USD.


>> 0xFACEBEEF and sent it 0.0005 BTC.. 1 month later and someone had stolen my 0.0005 BTC! I guess people must occasionally trawl through addresses with simple/common private keys.

This made it worth reading the article. Such cyberpunk!


For fun[1], you can occasionally trawl through addresses with any private key!

http://directory.io/

[1] For various definitions of "fun."


Their FAQ (unlisted on the main page) is interesting as well..

https://directory.io/faq


This is an excellent writeup! If any readers are looking for an already-written and tested bitcoin client library and can use javascript, https://bitcoinjs.org/ is great. I wrapped a simple cli tool around the library to make the 'coindust' npm package to do simple operations with bitcoin addresses and public bitcoin APIs.


I am actually really surprised it took a month for someone to swipe the BTC he sent to an address with a private key of 0xfacebeef. That could be found via an incremental search in under an hour of CPU search time on one computer.

I note that compressed public keys are not being used in these examples - it's highly recommended to use them, since they reduce transaction size and cost.

Regarding weak keys - there have been a lot of weak key generation techniques in bitcoin where hiding the public key won't do you any good.


Just a warning for people, I was playing around with BTC the other day and not generating random entropy from urandom, just putting in a short string.

I found my BTC was stolen immediately when I did this. It wasn't a lot of coins, since I was just testing, but it certainly cost me some debugging time as I wondered how the extra transaction had occurred.

Basically, someone out there has already generated the keys corresponding to short strings and is keeping an eye on any transactions on them. Maybe more than one group, who knows?


There are many of us with bots sweeping weak keys. Some of us give the coins back to a safe address, some people just keep it. It's like finding money on the sidewalk.


A lot of people are equating this to theft. While I can see their point, the concept of ownership in bitcoin is closely tied to control of the private keys. In this case many people control certain weak private keys so in a way you are giving your money away by sending to these addresses.


Taking money from someone who did not intend to give it to you is a dick move, regardless of legality or technical difficulty of doing so.


If it can be taken, somebody is going to take it. It can be people that intend to keep it, or could be a person to intends to give it back. If you left a bunch of money unsecured, which would you prefer? Not having it taken isn't an option if you use a weak key.


This is like admitting to stealing things from people who leave their houses unlocked. If you think that is all good then you need to re-evaluate your life choices.


You can try to create whatever analogy you want that makes it seem worse, but it's a fact that if you use a weak key somebody is going to sweep it. The only thing preventing it from being lost forever is people getting it first and returning it to secure keys.


No, it's like picking money off the sidewalk when someone drops their coins and then running away.


It depends on if they "run away" and keep it, or if they return it to the proper owner in a secure manner.


It's more like stealing from an unlocked locker at a public pool.


Not incorrect, but only if you assume the pool is available to every person on Earth and that no government has jurisdiction over things. In that case, I would personally take the items from the locker and leave information on how to securely retrieve them.


There are multiple parties doing this, at least tens of billions of keys in their databases.


Can someone explain why bitcoin adresses can be created (offline) without blockchain-validation for duplicates?

Even if the chances are very small I mean...like....gone is gone...no bank to call for a false transaction.


Addresses are created offline for scalability reasons, meaning that not having to interact with the blockchain is a feature, not a bug.

There are 2 levels of collision that are theoretically possible. The tldr is that both are really hard, way harder than mining itself, so you'd better spend your time mining that trying to find collisions in the address space.

The first level is that you can generate the same private key, i.e. guessing exactly 256 bit. Prob is 1/2^256.

The second level is that you find a collision in hashing the public key onto the address. Hash is a combination of sha256+rimemd160, but in fact it's a hash onto 160 bits, so the probability of finding a collision is 1/2^80 because of the birthday paradox.

When you generate a new address, you can certainly add an extra step and verifiy if it's used already in the blockchain. If you find a collision, though, please send it to me before discarding it :)


Bitcoin addresses are basically a public key, so you can generate them offline the same way you can generate pgp keys, openssl keys, SSH keys you use to access GitHub etc.

The chances of a collision are so astronomically low that our sun will probably run out of fuel and explode before two identical keys are generated.


With the comfort of a public blockchain at hand why not eliminate this case at address-creation time?


Let's do some back-of-the-envelope calculations.

Let's say the probability of someone cloning a Velociraptor within the next 20 years is 1 in 1 trillion. Let's say that given a Velociraptor has been cloned, it has a probability of 1 in 1 million of escaping. Let's say that given a Velociraptor is on the lam, it has a 1 in 100 billion chance of breaking into your house. Let's say that if there's a Velociraptor in your house, checking for it increases your chance of survival by 1%.

Let's assume there are 1 quadrillion Bitcoin addresses created.

Under the assumption of correct implementation, checking for velociraptors in your closet is literally more rational than checking for duplicate addresses.

Now, there is some value in checking for duplicate keys as a safeguard against implementation bugs, but that's best covered by generating a billion addresses on your own and checking for duplicates within your personal set.


From now on, I'm going to start my explanations about how Bitcoin addresses work with opening a closet to look for velociraptors.


A very succinct way of explaining it, thanks!


Not worth the effort. You could absolutely make a wallet that does this, but you'd have to either do a lookup with a trusted third party, or keep around a bloom filter of used addresses to query against (probably about a gigabyte). We're talking about "winning a lottery jackpot several times in a row" odds.


Blockchain's function is only to prevent double spend. Note that addresses are just a virtual term, in reality when you send money to an address you are setting up a "lock" (predicate) that can be unlocked only by a private key associated with public key that hashes to this address.


You clearly don't understand the scale, here.


You could, but it's simply not necessary. I have not verified the numbers, but I think this answer gives a better idea of the scale we are talking about:

https://bitcoin.stackexchange.com/a/3205


Consider that if you do find a collision, you would be able to take any bitcoin outstanding to said address on the blockchain.

Framed this way, you can see that in fact if your defensiveness is necessary, addresses would be extremely unsafe against targeted attack.



doesn't answer my question.

(accepted answer includes) „...It may be "theoretically" possible...“


Wrt transactions, I don't understand why you need to send yourself the unspent BTCs. Why does it make things significantly easier?


It's essentially for efficiency but not merely; it would be impractical to do with partial spends. Transactions are immutable and spend all of their inputs because it's easier to track and verify no double spend has occurred. If you allowed for partial spends, you'd have to trace the ancestry and descendants of all the transactions to verify whether there was enough bitcoin to spend it. This way if you have a transaction and it hasn't been spent since it originated, you know exactly how much can be spent.

I think it's natural to think of addresses as having a certain amount of bitcoin in them like an account at a bank. But bitcoin doesn't actually have accounts with balances. It just has transactions that have yet to be spent. You can calculate the amount of bitcoin that is spendable by an address so it looks like an account, which is what every wallet does but underneath it's just a collection of transactions that point to some address that you have the private key of.


The high-level meaning of a transaction is to send all the contents of an address. So if you only want to send partial payment, then you need two transactions - one to the foreign destination address and one back to yourself for the remaining funds. The analogy is closer to law and property title, rather than the mathematical/accounting actions of addition or subtraction.

In fact addresses don't really exist, other than as named state in the chain of title. It is the prior, unspent transaction output (utxo) that is unlocked during a transfer, and not some abstract bitcoin address.

Other blockchains, can and do handle these things differently.


Because the miner almost always expects a fee, so why not just infer how high it is and you don't need to spend block bytes on it?


Ok, but the unspent output takes space as well. I think you could just replace it with the fee. Not sure if I'm missing something or if it was just an arbitrary design decision.


It has to do with the way transactions and the blockchain ledger works, reread the transaction part of the article and it might make more sense. Basically, transaction outputs can either be spent or unspent, you can't half spend an output (or perhaps more accurately, you can't spend a previous output twice).


Yes I understand that's how it works, but I don't get why it's important. Surely you could double spend an output as long as you don't exceed its value; and that looks easily verifiable.


Ah, my bad. Sorry if I was being condescending in my previous reply then.

I suspect it was a design decision, yeah. If outputs could be spent multiple times then you'd need to traverse the chain of outputs backwards to see how much an output has left in it. Bitcoin's implementation is less complex and more elegant, imo.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: