Great work — the functional spin on top of Rebol's clean syntax is really nicely done!
Any chance you'll be adding something like the parse function? The ability to have domain-specific dialects was one of the best things about Rebol and would be great to see in a modern language.
And speaking of functions, could you make the "Function Reference" that's on the homepage available as a standalone page too? It's a very useful reference and it would be great to access it easily. Thanks!
Regarding 'parse', it's absolutely on my to-do list. The language, by its very nature, already has the means to do it (even you, as a user, could) - but I have to prioritize things.
Over the last decade, Redis has been my go-to example of beautifully written code. Thank you for such an amazing creation @antirez, and best of luck for whatever you do next.
2. Oracle problem: In practice you never have full specifications (let me be provocative: Facebook has no specification in principle, only its ever-changing code), and especially in early phases you rarely have specifications at all.
3. You always pay the price for dependent types (e.g. having to prove termination), even if you don't use interesting specifications.
4. Like Java's exception specifications, dependent types are 'non-local', meaning that a change in one part of the code may ripple through the entire code base. This stands in the way of quick code evolution, which is important -- at least in the early phases of large-scale software projects.
5. Despite dependent types being half a century old by now, and despite intensive effort, they have not really been successfully generalised to other forms of computing (OO, stateful, concurrent, parallel, distributed, timed) -- unlike competing approaches.
Summary: dependent types are wonderful as a foundation of mathematics, but have failed to live up to their promise as a basis for general-purpose languages.
1. Bidirectional typing can be easily generalized to dependent types. You can't always infer the type, but an 80% solution is definitely good enough for inference.
2. Your types don't have to be full specifications, and you can refine them later, no?
3. Idris, at least, doesn't force you to make all your functions total, and functions are actually partial by default.
4. Again, in Idris, implementations are hidden by default across modules. Maybe there's a nice middle-ground between hiding everything and exposing everything, but I don't know what that would look like.
5. What are the competing approaches and how are they better at dealing with, for example, distributed computing?
I'm not very familiar with dependent types, but none of these seem like fundamental restrictions.
1. Bidirectional isn't magic. If I program MergeSort or Quicksort and give it the signature

    forall T. Array[T] -> (T -> T -> Bool) -> Array[T]

bidirectional typing will not infer what's missing towards a full spec.
2. Yes, you can refine later, but if you want to refine later (and in practice you almost always
have the full spec -- if it exists at all -- only at the end), then
why pay the cost of dependent types from the start? Once you have the
full spec, verify with Hoare logic.
3, 4: I'm not sufficiently familiar with Idris to comment.
5. Alternative approaches, in addition to what "pron" mentioned, include program logics, e.g. TLA.
1. Yes, but the situation regarding type inference is only a bit worse than the situation in non-dependently-typed languages. Of course the compiler can't magically figure out what theorems you want to prove.
This is my point. Adding dependent types comes with costs that you will always have to pay, even if you don't use the power of dependent types. If you verify with e.g. program logic in the few real-world cases where formal verification is financially viable, then you only pay the price when you do the verification.
Every single one of your claims is addressed by the programming language Idris [1], which has dependent types. This is not to say that Idris is the only dependently typed language which solves these issues, only that it exists as a counterexample to your claims.
1. Idris does not have type inference at the top level, but it does infer types within the body of functions. It can even disambiguate identically-named functions with different types, something Haskell cannot do.
Moreover, Idris (and other dependently typed languages) can perform implementation inference, something which is much more powerful and useful to the programmer than type inference. It lets you work interactively with the Idris compiler whereby you specify a type and Idris creates a skeleton of the function for you, filled with holes that you replace with expressions. You can ask Idris what the type of a hole is and it will helpfully tell you the types of all other identifiers in scope. You can even ask Idris to attempt to fill in the hole for you, something it can usually do when there is only one possible expression with that type (occurs extremely often in practice).
Watch Edwin Brady's demo of Idris 2 [2] to see this form of interactive development in action. Note that although Idris 2 isn't finished yet, a great deal of these examples (such as zip) work in Idris 1.3.2 (the current release).
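For a flavour of the hole-driven workflow, here's a minimal sketch (myZip is a made-up name so it doesn't clash with the prelude's zip):

    -- Declare the type and leave a hole for the body:
    myZip : List a -> List b -> List (a, b)
    myZip xs ys = ?myZip_rhs
    -- Asking Idris for the type of ?myZip_rhs shows the goal List (a, b)
    -- with xs : List a and ys : List b in scope; from there you can
    -- case-split on xs and ys and let proof search suggest the bodies.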
2. Idris lets you specify your program gradually. When problems arise, you can refine your types and Idris will guide you through all of the refactoring you need to do to enforce your new invariants.
3. Idris does not force all of your functions to be total, only the ones you want to use in types. Idris checks the totality of all your functions and can produce compilation errors based on totality annotations attached to your functions. In practice, it is not very hard to make every function in your program total, apart from the main loop (see the totality sketch after this list).
4. Idris has modules and gives you powerful tools (such as views) that allow you to define a stable API at the module boundary. This means you can mess around with any of the types of private functions to your heart's content, without any visible change to the outside world. This is entirely the opposite of 'non local' Java exceptions.
5. Idris has extremely powerful tools for working with state and writing concurrent programs which preserve invariants that cannot otherwise be checked easily with other tools. As for OO, well, that's a concept so nebulous as to be almost incoherent at this point. In practice, Idris's interfaces can dispatch on as many parameters as you want, unlike most OO languages which only do single dispatch.
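As a small illustration of point 3, here's a hedged sketch of what the totality annotations look like (names are mine):

    %default total          -- totality becomes the default for this file

    len : List a -> Nat     -- structurally recursive, so accepted as total
    len []        = Z
    len (x :: xs) = S (len xs)

    partial
    loop : Nat              -- must be marked partial or it won't compile
    loop = loop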
I'm not sufficiently familiar with Idris to comment -- I will learn Idris in the future, hopefully in December.
Be that as it may, I'm somewhat sceptical of implementation inference: after all, you need to give a spec, exactly the thing you typically do not have in large-scale software engineering. The spec emerges at the end and often the program is the spec itself. Moreover, as far as I gather, implementation inference works by program synthesis, and unless Brady has had a massive breakthrough here, that's an exponential search process, so it won't scale much beyond inferring small programs like the zip function. (Still, I think implementation inference is a useful tool -- but it's not magic.)
Definitely give Idris a go. I'm working through the book Type-Driven Development with Idris [1] by Edwin Brady, the author of Idris, and I'm having a ton of fun with it. Idris is a lot more practical than I thought. It fixes some annoyances I had with Haskell, such as record naming (Idris lets you reuse record names in different types; it resolves ambiguity by type checking).
It's also eagerly evaluated (unlike Haskell), with a very simple Lazy annotation you can tack on to a type when you want laziness. I think most people will find this much easier to reason about than Haskell's laziness everywhere.
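For example, here's roughly what that looks like (myIf is a hypothetical stand-in for the prelude's if-then-else):

    -- Both branches are Lazy, so only the branch that is returned
    -- gets evaluated; Idris inserts the Delay/Force coercions for you.
    myIf : Bool -> Lazy a -> Lazy a -> a
    myIf True  t _ = t
    myIf False _ e = e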
> [...] Idris is a lot more practical than I thought.
This may sound like a stupid question (but it is probably necessary in a thread about the convergence of proof systems and statically typed functional programming languages):
"How can i implement something like unix' wc?"
So just a program that counts lines (forget words and bytes for now) of a file. This means we need to support (at least in some kind of standard library) basic IO stuff like files, processes and environments. If the language only talks about how "simple" (in terms of code/term length) it is to define a binary tree, the integers >= 0 (the famous Nat = Zero | Succ Nat), or (beware) the Fibonacci sequence, most people won't be impressed.
Because that's what is associated with "programming language" (as compared to "proof assistants").
Sorry, that sounds a bit like a rant, but after 5 minutes of searching I couldn't find an example of how to read a file in Idris. The closest I found was: http://docs.idris-lang.org/en/latest/effects/impleff.html but it did not tell me how to open a file ...
I don't know about Idris, but the basic idea would be to transform the file's type from (a|b)* into (a* b)* a*, where b is a newline and a is any non-newline. Then your answer simply counts the first star of the transformed type (change that star to a counter exponent as you are forming the type).
Yeah ... but how does that look? A "naive" wc implementation (counting the character '\n') is <50 lines of code (if we have modern file IO in the standard library) in a practical PL.
But maybe there is too big a difference between the concept of "files" as seen by OS developers (who view them as byte blobs in a hierarchical name system) and as seen by language designers (who want to represent a file as a collection of lines, and would probably like your command-line interpreter to not allow calling "wc" on a file that contains non-UTF8-encoded data) - but this is still a pretty big step for our current computation environment (where I can call programs by name that are also files).
For many people (including me) a "practical" programming language needs an implementation that can generate (compile) an executable or interpret a file representation of its source. Otherwise it's a proof assistant.
Maybe we have to merge the concept of data in terms (no pun intended) of bytes (types?) in memory and collections of bits and bytes that have a lifetime longer than the program (if it's not a daemon)?
I'm still a beginner at Idris, but here's my attempt at a "naive" wc:
    module Main

    wordCount : String -> IO ()
    wordCount file = let lineCount = length $ lines file
                         wordCount = length $ words file
                         charCount = length $ unpack file in
                     putStrLn $ "Lines: " ++ show lineCount
                             ++ " words: " ++ show wordCount
                             ++ " chars: " ++ show charCount

    main : IO ()
    main = do (_ :: filename :: []) <- getArgs
                | _ => putStrLn "Usage: ./wc filename"
              (Right file) <- readFile filename
                | (Left err) => printLn err
              wordCount file
It counts lines, words, and characters. It reports errors such as an incorrect number of arguments as well as any error where the file cannot be read. Here are the stats it reports on its own source code:
Wow, thanks for the response - that actually looks like Haskell and if its performance is not that abysmal (let's say slower than Python), that is pretty cool.
This actually looks quite reasonable - usage string included, I like it. Hats off to you, I will try to compile it to a binary now and do some benchmarking.
These might be reasonable functions to implement in a standard library, so given that they exist, it makes sense to use them. I rather feared Idris did not have any IO abstraction for reading/writing files at all. Maybe I am conflating languages and libraries here, but they often go hand in hand.
My practicability expectations for languages implementing dependent types are pretty low.
If you implement it by manipulating the types directly, it's a few operations with a loop (just for counting lines, mind you). Again, I'm not sure how this would be done in Agda, Idris, or Coq, and the system design is a bit different from other type systems.
But also, any discrete one-dimensional problem like wc is going to benefit from a type system with types that resemble regular expressions.
Agreeing with you: I'm only familiar with Idris insofar as I've watched talks on it, but my recollection is that its implementation inference is a search across pre-listed implementations. Edwin Brady says that that's surprisingly useful. I believe him and look forward to trying it out some time soon, but it is not as magical as we might guess.
Type inference in Idris works well if the types of the terms in question are non-dependent. The moment you introduce complicated types, type inference suffers, which is expected, because even Idris cannot break the fundamental nature of logic.
> Adding dependent types comes with costs that you will always have to pay, even if you don't use the power of dependent types. If you verify with e.g. program logic in the few real-world cases where formal verification is financially viable, then you only pay the price when you do the verification.
1. Problems with type inference are introduced by polymorphism with implicit type variables and aren't exclusive to dependent types; 'f = show . read' shouldn't compile because of type ambiguity. What is so specific about dependent types in this regard?
1. Dependent types, because they offer 100% certainty in the propositions they express (assuming they're proven), suffer from the same problem all sound verification methods suffer from, and that is the fundamental computational complexity cost of verifying something with absolute certainty. I summarized some of them here: https://pron.github.io/posts/correctness-and-complexity (there are some properties that are easy to verify, called inductive or compositional properties -- not surprisingly, the properties verified by simple type systems or Rust's borrow checker fall in that category -- but, unfortunately, most correctness properties aren't compositional).
2. Dependent types rely on deductive proofs for verification, and deductive proofs (the "proof theory" part of a logic) are the least scalable verification method as they are least amenable to automation. That's why model checkers (that provide proofs based on the "model theory" of the logic, hence the name model checkers) scale to larger programs/specifications. In addition, deductive proofs are much less forgiving of partial specifications. In other words, the contract (or type, in the case of dependent types) must express every property of the algorithm you rely on (unlike, say, concolic testing). This makes writing specifications very tedious. Unsurprisingly, deductive proofs are the least used formal verification technique in industry, and it's usually used when all other options have failed or to tie together results obtained from model checkers.
Those were the theoretical limitations. Here are the practical ones:
3. Unlike contracts, type systems tie the specification method (types) to the verification method (type-checking, i.e. deductive proofs). This means that everything specified with a dependent type must be deductively proven, or it has to be specified in some other way (i.e. you also need contracts), or else you risk making even your simple types unsound (although I hear some people are working on relaxing that problem, and that's what I mentioned in my previous comment). But properties differ greatly both in the confidence you need as well as the effort they require to achieve that confidence. Contracts give you the freedom to specify your program however you like, and then choose, for each property, which method you'd like to use to verify it -- deductive proofs, model checking, static analysis, concolic tests, randomized tests, manual tests or even inspection, all without risking the soundness of the type system, on which the compiler relies to generate correct machine code. In short: dependent types are inflexible in that they force you to use a particular verification method, and that verification method happens to be the least scalable, most tedious one.
4. Because dependent types are so tied to the program, the program must be written with the specific properties you want to verify in mind. If, after writing the program, you want to prove other properties about it, you'll probably need to rewrite it.
I don't think human programmers have any hope of ever being able to maintain noncompositional program properties for nontrivial programs. A model checker may indeed be able to prove some program property without the programmer having to understand why or how that property holds (in contrast to using dependent types) - but I submit that this is not actually much of an advantage; a human working on the codebase will still need a deductive-proof-like understanding.
> This means that everything specified with a dependent type must be deductively proven, or it has to be specified in some other way (i.e. you also need contracts), or else you risk making even your simple types unsound (although I hear some people are working on relaxing that problem, and that's what I mentioned in my previous comment). But properties differ greatly both in the confidence you need as well as the effort they require to achieve that confidence. Contracts give you the freedom to specify your program however you like, and then choose, for each property, which method you'd like to use to verify it -- deductive proofs, model checking, static analysis, concolic tests.
What practical distinction are you drawing here? Surely with any approach you either have a fully sound proof of a given proposition, or you have a potentially unsound proposition that you've tested at whatever level you feel appropriate. Like, if I have a SortedList type that can produce an assertion that, given n <= m, l[n] <= l[m], either the constructors of SortedList bake that in as a deductive proof, or they unsoundly assert it and I apply some level of static analysis and/or testing to them - how's that any different from what I'd do using any other technique?
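For concreteness, here's a hedged sketch of the "bake it in" option (sketching in Idris, with names of my own, and using a common lower-bound encoding rather than the indexed n <= m formulation):

    -- Each cons carries a proof that the new element respects the bound,
    -- so sortedness holds by construction rather than by assertion:
    data SortedList : (bound : Nat) -> Type where
      SNil  : SortedList b
      SCons : (x : Nat) -> LTE b x -> SortedList x -> SortedList b

    example : SortedList 0
    example = SCons 1 LTEZero (SCons 3 (LTESucc LTEZero) SNil)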
> 4. Because dependent types are so tied to the program, the program must be written with the specific properties you want to verify in mind. If, after writing the program, you want to prove other properties about it, you'll probably need to rewrite it.
I'm not entirely convinced that this is any harder than any other approach - you'd have to propagate the new properties you were interested in through the code, sure, which would likely involve changing the types that the relevant points in the program were annotated with. But isn't it much the same process with a detached proof of the program property?
> I don't think human programmers have any hope of ever being able to maintain noncompositional program properties for nontrivial programs
We do it all the time. Whatever it is that you want your program to do, and any interesting functional property of it, is a non-compositional property. I know that because if it were compositional, you wouldn't need to write the program. A property P is compositional with respect to terms A and B and composition ∘, if P(A) and P(B) implies P(A∘B) (or more strongly, iff). Compositional and inductive are the same thing if A and B can be any language term. For example, type preservation is a compositional property for, say, ML, and memory safety is a compositional property for, say, Java.
> a human working on the codebase will still need a deductive-proof-like understanding.
I think this is not true, either in principle or in practice: https://pron.github.io/posts/people-dont-write-programs I know I certainly don't have anything close to a deductive understanding of the program I'm working on, but I do have such understanding of the specific pieces of code I'm writing. This is insufficient for a useful formalization of a program, because correctness properties aren't compositional, meaning that, in general, a sound specification of a program component would either be insufficient or unhelpful (i.e. no easier than the code itself) for the writing of other components. Sure, in some cases it can be, but those are usually the easy cases. We need help with the hard ones.
I totally agree that we try to isolate components and expose them with APIs, but 1. the specification here is imprecise and "soft", and 2. as the results in my "correctness and complexity" post show, this does not necessarily make deduction easier (https://news.ycombinator.com/item?id=20714920).
But what you should be asking is this: as there haven't been any major relevant breakthroughs in formal logic in the past couple of decades, nor in any relevant proof techniques (we still use deduction for safety properties), our deductive proof capabilities aren't significantly higher now than they were 30 years ago. If it's all a matter of discipline, then, how come we're still not able to affordably (or at all, if the programs are not tiny) verify programs with deductive proofs?
> how's that any different from what I'd do using any other technique?
I'm not sure I understand the question, but the last thing you want is a rich specification language that forces you to use a particular verification technique, and that's exactly what dependent types do. You want to specify whatever helps you, and then you want to verify using whatever technique you need or can afford, to reach the appropriate level of cost/benefit for your requirements.
> But isn't it much the same process with a detached proof of the program property?
To be honest, I don't know that dependent types are harder than "detached" deductive proofs, but this is something that Neel Krishnaswami (who's a dependent types advocate) mentioned to me as his observation (he's also the one who told me about attempts to build "gradual" dependent types to alleviate some of the other downsides I mentioned). This can also be an upside, because it may make writing some proofs easier, as you write the program with them in mind. Either way, I think deductive proofs are, at the moment, too far behind other verification techniques to contrast different kinds of deductive proofs. But this certainly strikes me as an engineering problem.
I also mentioned this in my other post, but the CompCert compiler is a highly non-trivial example of a program verified using dependent types, and which would not really be possible using automated methods. Of course, not very affordable, since it took years of work from people with expert knowledge. But you get what you pay for, although the price isn't worth it for most real programs in industry.
Furthermore, you don't necessarily need to have an intuition about why the program you are working on is correct to formalize it using dependent types. You just need an intuition about a method for CHECKING that the program is correct.
The four-color theorem was proved in Coq, and it required considering over a thousand cases. No one can keep a proof of this magnitude in their head, but that didn't prevent it from being formalized in a proof assistant. What they did was they just formalized the method for checking and then used that. You can similarly formalize the kind of reasoning that a model checker (or any other automated method) does and then use it in your proof assistant.
> but the CompCert compiler is a highly non-trivial example of a program verified using dependent types, and which would not really be possible using automated methods.
First of all, it wasn't really verified using dependent types; it was verified in Coq. Coq's theory is, indeed, based on dependent types, but Coq is used more like a traditional proof assistant (HOL or TLA+) than a dependently typed programming language like Agda or Idris.
Second, CompCert, while certainly non-trivial, is also about 1/5 the size of jQuery, which means it's at least one to two orders of magnitude away from the average business application.
> You can similarly formalize the kind of reasoning that a model checker (or any other automated method) does and then use it in your proof assistant.
Yes. In the few cases where deductive proofs are used in industry, they're often used to tie together results from model checking.
> We do it all the time. Whatever it is that you want your program to do, and any interesting functional property of it, is a non-compositional property. I know that because if it were compositional, you wouldn't need to write the program.
I don't think I understand what you're saying here?
Say I want to write a program that tells me the duplicated members of a list. My program might call helper routines like, "stable sort the list," and then "collect adjacent duplicates."
I might want to prove my program is "correct", e.g., a statement like "the output is a list that contains exactly those elements of the input list whose duplicity exceeds 1". To do this, I can first reason about the helper functions in isolation, then compose these results about the helper functions to establish my desired property.
Why is this not an interesting property, or a program that I don't need to write?
> Why is this not an interesting property, or a program that I don't need to write?
It's both of those things, but it's not compositional, because "finds duplicate members in a list" is not a property of either of your components. If it were a property of one of them, you wouldn't need the other.
All proofs proceed in stages, and "compose" known facts. But that's not the meaning of compositional, and the fact that the property you want easily arises from the properties of the components is luck. I gave an example on this page where this is not the case (the `foo bar` example), and it is provably not the case in general.
Thanks. I think we may differ in our understanding of "compositional."
My understanding of "compositional" reasoning is something like, an approach where we (1) establish some properties about our helper functions, and then (2) reason about larger functions using these properties, instead of the definitions of the helper functions.
It seems like you have a different notion of compositional, which I'm still not sure I can articulate in a clear way... something more akin to transitivity?
A property P is compositional with respect to terms A and B and composition ∘ if P(A) ∧ P(B) ⇔ P(A∘B); the equivalence is sometimes replaced with implication in one of the directions (this is a special case for a predicate, i.e. a boolean-valued function; a function F is compositional if F(A∘B) = F(A) ⋆ F(B)).
What you're saying is that proofs can be composed. That is true, but unhelpful, as it doesn't make proofs easy; all proofs work this way. Compositionality does make them easy (in fact, all properties that are easily proven by type inference are compositional; type safety itself is compositional). I've seen compositionality described somewhere as the opposite of emergence[1], which makes sense. A property is emergent in a composition if it is not a property of the components.
Thanks, this seems like a perfectly coherent definition of compositional. I think my earlier confusion was a purely terminological matter.
Regarding "What you're saying is that proofs can be composed. That is true, but unhelpful, as it doesn't make proofs easy." I agree that the ability to compose proofs does not make proof easy. But I would push back somewhat against the idea that this is unhelpful.
The ability to prove, in isolation, the relevant properties about some entity (e.g., a function, a hardware module, or what have you), and then to make use of these properties, without ever looking inside the entity again, to prove properties of some larger system, seems to me to be absolutely essential for scaling proofs to large systems.
This ability is of course completely common in tools like interactive proof assistants, but is less common in more automated procedures.
> The ability to prove, in isolation, the relevant properties about some entity (e.g., a function, a hardware module, or what have you), and then to make use of these properties, without ever looking inside the entity again, to prove properties of some larger system, seems to me to be absolutely essential for scaling proofs to large systems.
I understand the motivation, and of course you're right that it would be "essential", it's just that it has been mathematically proven (see my post) that this is not the case. Results in computational complexity theory, that studies how hard it is to do things, show that "making use of these properties, without ever looking inside the entity again" does not, in fact, make verification easier. More precisely, the difficulty of proving a property of A_1 ∘ ... ∘ A_n is not any function of n, because we cannot hide the inner details of the components in a way that would make the effort lower even though that is what we want to do and even though it is essential for success.
In practice, we have not been able to scale proofs to large systems, which means that maybe there is some truth to those mathematical results.
> This is insufficient for a useful formalization of a program, because correctness properties aren't compositional, meaning that, in general, a sound specification of a program component would either be insufficient or unhelpful (i.e. no easier than the code itself) for the writing of other components. Sure, in some cases it can be, but those are usually the easy cases. We need help with the hard ones.
Or we need to transform the hard cases into easy ones, which is what the whole dependent typing project is about: turning program correctness into just a matter of type safety, at which point it is just compositional. Maybe there are important cases that we can't solve this way, but I've not encountered any yet.
> But what you should be asking is this: as there haven't been any major relevant breakthroughs in formal logic in the past couple of decades, nor in any relevant proof techniques (we still use deduction for safety properties), our deductive proof capabilities aren't significantly higher now than they were 30 years ago. If it's all a matter of discipline, then, how come we're still not able to affordably (or at all, if the programs are not tiny) verify programs with deductive proofs?
I think the article itself is a pretty direct answer to this, in as much as it outlines the programme of work that's necessary to enable us to use these techniques in practical programs with more interesting propositions. I'd argue that we already do huge amounts of program verification with deductive proofs - that is, typechecking - many orders of magnitude more code is typechecked than has any other formal method applied - and the frontier of progress is how much we can represent important program properties within that. In terms of theoretical proof techniques, the dependent Haskell type system is indeed much the same as, say, the Java type system. But in terms of what properties are practical to encode and verify that way, the dependent Haskell type system is streets ahead. We've only had practical dependent-typed languages very recently - arguably we still don't. And many of the techniques for encoding important properties in our type systems (whether dependent or not) are also relatively new (e.g. the monadic region papers are relatively recent).
> the last thing you want is a rich specification language that forces you to use a particular verification technique, and that's exactly what dependent types do. You want to specify whatever helps you, and then you want to verify using whatever technique you need or can afford, to reach the appropriate level of cost/benefit for your requirements.
Assuming that we do always have an escape hatch, I don't see that using types forces anything any more than any other approach to specification? If I want to prove a larger overall proposition using types then of course I have to express any smaller propositions that I depend on in the language of types (as would be the case with any proof technique - there's no getting away from needing to express our premises and conclusions), but I still have the freedom to verify that smaller proposition using any technique or none.
> Turning program correctness into just a matter of type safety, at which point it is just compositional.
It isn't. "Truth-preservation" in all deduction systems is also compositional (if the premises are true, so is the conclusion); that doesn't make the propositions compositional with respect to program components (if I conclude that X > 3 and Y > 2, then I can also conclude that X + Y > 5; the conclusion preserves the truth of the premises, not the property itself, as neither X nor Y is greater than 5). If you have a property of a program that isn't a property of a component, then the property is not compositional (note that type safety is: A∘B is type-safe iff A and B are; but type safety here is merely the inference step, similar to truth preservation). If it's not compositional, it can be arbitrarily hard to prove.
> Maybe there are important cases that we can't solve this way, but I've not encountered any yet.
If you're able to affordably prove any correctness property of any real-world program, this knowledge is worth billions. I am not aware of anyone who's ever made a similar claim.
> I'd argue that we already do huge amounts of program verification with deductive proofs - that is, typechecking - many orders of magnitude more code is typechecked than has any other formal method applied - and the frontier of progress is how much we can represent important program properties within that.
But the properties proven are easy. It's like saying, I can jump 50cm, so jumping all the way to the moon is just a matter of degree. No. There are real, fundamental, theoretical barriers between proving compositional properties and non-compositional ones (or those that require one or two inferences and ones that are more); between proving properties with zero or one quantifier, and properties with two quantifiers.
Also, while you can be pleased with Haskell's formal accomplishments, have you compared them to those of model checking in safety-critical real-time systems? Because once you do, you realize it doesn't look so good in comparison.
And your comparison of Haskell and Java is particularly interesting. You say that Haskell's type system is "streets ahead". I agree (although I think this comes at a great cost). However, Haskell's correctness or development cost is not "streets ahead" or even noticeably ahead. This shows that strengthening some deductive property is no guarantee of improving quality. Why? Because software correctness is very complicated. Its theory is complicated, and its practice is even more complicated.
> I don't see that using types forces anything any more than any other approach to specification
I think dependent types are just redundant. You can get all or most of what they give you, without the inflexibility and the bad defaults, using other means.
But look, I'm personally not too fond of dependent types in particular and deductive proofs in general, and consider them something you use when all else fails and you have the money to spare. Others can be as bearish on model checking or concolic testing, paths that I think are more promising and that I believe are seeing bigger research investments. But what is certain is that no one has any idea whether we will find a way to drastically increase software correctness in the next twenty years, let alone which method will get us there. The one thing you can be sure of is that anyone who tells you they do is simply unfamiliar with the different avenues explored or with many of the challenges involved (including social challenges).
> And your comparison of Haskell and Java is particularly interesting. You say that Haskell's type system is "streets ahead". I agree (although I think this comes at a great cost). However, Haskell's correctness or development cost is not "streets ahead" or even noticeably ahead.
I'd argue correctness is in fact streets ahead: orders of magnitude fewer defects (in terms of the code not doing what the programmer intended; the programmer not understanding what some other human intended is still as big a problem as ever) in practice. It's interesting how little value that actually brings in ordinary commercial development, but the correctness techniques will be there when the industry is ready for them. Development cost is much harder to measure; I'd argue that some systems developed with Haskell-like type systems would simply not have been possible to make without them, but that's the sort of question that's very difficult to answer objectively.
> orders of magnitude fewer defects (in terms of the code not doing what the programmer intended; the programmer not understanding what some other human intended is still as big a problem as ever) in practice.
I don't see that, and I'm not aware of any report that shows that.
> It's interesting how little value that actually brings in ordinary commercial development, but the correctness techniques will be there when the industry is ready for them.
Except that in some cases it's already true. Companies like Amazon and Microsoft are seeing great value in using TLA+ and other formal methods, and now management encourages engineers to use them. I just don't see it for Haskell. Facebook is a good example, as they invest a lot in correctness research (both formal methods, like Infer, as well as inductive methods for automatic bug fixing and codegen[1]) and they also use some Haskell. They're pushing hard on formal and inductive methods, but not so much on Haskell. I don't think correctness is a problem that cannot be addressed, and I believe there are techniques that address it. I just don't see that Haskell is one of them.
> Development cost is much harder to measure; I'd argue that some systems developed with Haskell-like type systems would simply not have been possible to make without them, but that's the sort of question that's very difficult to answer objectively.
First of all, I would agree with you that full formalization of program correctness using dependent types is generally extremely costly in terms of time and money, and that there are often other formal methods which might get you results that are "good enough" while requiring much less effort and expertise. However, I do think that there are some inaccuracies in some of the statements you are making about proof assistants based on dependent types.
One major benefit that proof assistants have over other formal methods is that they can be used to prove essentially arbitrary properties of programs, which is not true of any automated methods. Basically, if we as humans are able to convince ourselves that a program has a certain property, then we could also show this using a (dependently-typed) proof assistant.
(Yes, theoretically all proof systems have limitations due to Gödel’s incompleteness theorem, but this limitation applies just the same to ALL formal methods. Furthermore, you’re not that likely to run into it in practice unless you are doing metatheory, and even there many interesting statements CAN still be shown.)
Consider the formally verified C compiler CompCert, where the assembly output of the compiler has been proven to have the same behavior as the input source code. A proof like this simply cannot be done using automated methods currently, and requires human assistance. This kind of correctness proof is also highly non-compositional, since the correctness proof of a compiler does not mirror the structure of the compiler very closely. Hence this also invalidates your point about dependent types not being able to handle non-compositional properties very well. There are plenty of other examples, like the seL4 microkernel.
I also do not understand what bearing computational complexity results have on humans manually formalizing their informal intuitions about program correctness in a proof assistant. Humans do not work on arbitrary problems where they have no intuition at all about why a program is correct, hence referring to worst-case computational complexity results does not make much sense here. In fact, if you have no intuition about why a program works, then how did you manage to write it in the first place? You surely did not write random garbage in the hopes that you would get a working program.
You might say that there are programs you can check using a model checker, but which humans cannot reasonably keep in their heads or form an intuition about, and therefore these proofs cannot be formalized in a proof assistant. But what you can do is implement a model checker in your proof assistant and then show that the model checker is correct. After you've done that, you can simply use the model checker to do your proofs for you. And of course it should be possible to formalize the model checker itself, since the people that wrote the model checker must have some intuition about why it works and produces the correct results.
In this sense, proof assistants subsume all other formal methods, since they can be "scripted" to include them.
Furthermore, it is generally not the case that you have to rewrite your program in case you want to prove some more properties about it. You may be thinking of a programming style where the type of a program encodes all the properties you are interested in, but you can avoid this style where necessary, by separating your program and its correctness proof.
> One major benefit that proof assistants have over other formal methods is that they can be used to prove essentially arbitrary properties of programs, which is not true of any automated methods.
It is true of model checkers.
> There are plenty of other examples, like the seL4 microkernel.
Yep, there are plenty of such examples, I'd say 2 or 4, and they're all less than 10KLOC, i.e. 1/5 of jQuery.
> I also do not understand what bearing computational complexity results have on humans manually formalizing their informal intuitions about program correctness in a proof assistant.
Because humans can't outperform Turing machines.
> Humans do not work on arbitrary problems where they have no intuition at all about why a program is correct, hence referring to worst-case computational complexity results does not make much sense here.
For one, the results aren't just worst-case. One of the most important ones is on parameterized complexity. Second, problems that are very hard in the worst case have rarely been proven easy in the "real world" case, unless the real world case was somehow categorized in a neat subset, and this isn't the case here. It's possible that we're usually far from the worst case, but if so, automated methods could also exploit that, and still be better than manual ones in most cases.
> In fact, if you have no intuition about why a program works, then how did you manage to write it in the first place? You surely did not write random garbage in the hopes that you would get a working program.
> In this sense, proof assistants subsume all other formal methods, since they can be "scripted" to include them.
The same can be said about all other formal methods. Indeed, various sound static analysis tools (like Frama-C) optionally fall back to manual proofs when needed. In the end, though, in practice deductive proofs don't scale as well as other methods. It's a good tool to have in your toolbox, but you don't want it as your default. Your model checker could just as easily ask for help from a proof assistant. So it's not the case that any one method subsumes all others, but you could combine them. Which is why dependent types are problematic, as they make these combinations harder than, say, contract systems, which are flexible in the choice of verification method.
> You may be thinking of a programming style where the type of a program encodes all the properties you are interested in, but you can avoid this style where necessary, by separating your program and its correctness proof.
True, but that's not the style discussed here, of dependent Haskell or Idris.
I agree that verifying programs using dependent types is impractical for large programs. I think that the reasons are mostly practical rather than theoretical, however. There's too little benefit for too great a cost if you want essentially 100% correctness guarantees for every single part of a large, existing program.
Humans regularly do "outperform Turing machines." Most of the complicated mathematical proofs that have been produced by humans cannot currently be produced automatically by an algorithm in reasonable time. Pick almost any non-trivial mathematical theorem, and you would be hard pressed to write a program that could prove it in a reasonable amount of time given just a list of the axioms. In theory, humans might not be able to outperform TMs, but in practice they regularly do, as attested to by the large number of mathematical proofs that we have produced by hand.
Model checkers also outperform humans for some tasks, but then again this is because human intuition has been used to write the model checker in the first place. My point is that using theoretical computer science results to talk about the difficulty of finding mathematical proofs in real-world cases is a bit of a non-sequitur. Yes, the Linux kernel would be practically impossible to verify completely using semi-automated tools, because it would take too much time and effort. But is this because of some theoretical result about the model checking problem? I don't think so.
One difference between having a model checker call a proof assistant and using a proof assistant to implement a verified model checker is that if you write a verified model checker implementation, you can be almost 100% certain that it's bug-free, because it creates proof terms that can be independently checked by the (ideally small and simple) proof kernel of the proof assistant. You don't have to trust that the model checker has been implemented correctly. This difference does not matter that much in practice, however, since model checker implementations probably receive a lot of testing and are quite trustworthy.
You're right that I am mostly talking about Coq, and that a different style tends to be used in Haskell/Idris. Note that dependent Haskell is actually logically inconsistent for backwards-compatibility reasons anyway, so there's not much to lose there by cutting a few corners for pragmatic reasons. It is not necessary to verify every single part of the program. Dependent types in Haskell give you the freedom to do more. You can choose not to use that freedom where it doesn't make sense.
> Most of the complicated mathematical proofs that have been produced by humans, cannot currently be automatically produced using an algorithm in reasonable time.
That's true, but humans have not done that in a time frame that's even remotely reasonable for verifying programs. Automatic verification methods outperform humans by orders of magnitude, on average, as far as program verification is concerned. My point is that if it is true that real-world programs are far from the worst case -- and note that we do not currently have much evidence to support that belief, because we have not been able to regularly and affordably verify programs with deductive proofs -- then automated tools will be able to exploit that as well (once we know what "that" is, assuming "that" exists at all).
> My point is that using theoretical computer science results to talk about the difficulty of finding mathematical proofs in real-world cases is a bit of a non-sequitur.
Why? That's the entire point of the field of computational complexity. But don't think I'm pessimistic. I just think software verification is an extremely complex subject, and anyone who says they can certainly tell now which approach will "win" simply does not know what they're talking about.
As Fred Brooks wrote, realizing the limitations and understanding the magnitude of the challenge is not pessimistic. Physicists don't feel that the speed of light is a non-sequitur. Quite the opposite: understanding the limitations and challenges allows us to focus our efforts where they'll do the most good, rather than bang our heads against the wall just because that's what we happen to know how to do.
> Dependent types in Haskell give you the freedom to do more. You can choose not to use that freedom where it doesn't make sense.
Given that only 0.01% of programmers use Haskell in production I don't really care what approach Haskell uses. But as someone who takes verification seriously, I think contract systems give you the freedom to use deductive proofs -- just as dependent types do -- as well as other methods, which currently have proven much more practical and useful in the real world.
What I meant by non-sequitur is that I agree with your premises (that certain problems are computationally intractable or undecidable, and this is good to know), and I agree with your conclusion (full verification using deductive methods/dependent types is unfeasible for large software projects). However, I do not agree that the conclusion is true because of the premises.
That is, I don't think one can say that the reason why human-aided verification is unfeasible for some projects (say, the entire Linux kernel) is because of theoretical limitations. Yes, verifying the Linux kernel would take too much time, but not because of any specific theorem in theoretical computer science.
The reason why I am emphasizing this point is that I believe that referring to theoretical results in discussions like this makes it seem as though there is some kind of fundamental limitation that prevents people from verifying the Linux kernel in a proof assistant. Even though it is quite a jump to conclude that from theoretical results, especially when human intuition is involved.
I would say the reason that people haven't verified a program the size of the Linux kernel is that it's just too much work. Just like if I were to try to write the Linux kernel from scratch on my own, it would take me years and years. But not because of a mathematical theorem. Some things are just a lot of work without there being a deep reason behind it.
We can go much further with deductive verification combined with automated verification compared to automated verification alone, precisely because human intuition can be used to guide the automated tools.
EDIT: To give another example, I would have no problem with the following reasoning:
A SAT solver that is efficient for arbitrary SAT problems cannot be implemented, because SAT is NP-complete (assuming P =/= NP).
However, I would have a problem with the following line of reasoning:
Intel CPUs cannot be fully verified by people because SAT is NP-complete.
> That is, I don't think one can say that the reason why human-aided verification is unfeasible for some projects
That's not exactly what I said, or at least not what I think. I think that full, 100% confidence verification (the method is irrelevant) has some serious challenges as suggested by those theorems, and that it is, therefore, unjustified to believe that there would be a clear solution any time soon. One way to circumvent those challenges is, for example, to reduce the level of confidence.
> Yes, verifying the Linux kernel would take too much time, but not because of any specific theorem in theoretical computer science.
I'm not saying it definitely is, but I don't think anyone can say it isn't, either (well, technically, a theorem won't likely apply to a specific piece of code, except as a sample from some distribution).
> The reason why I am emphasizing this point is that I believe that referring to theoretical results in discussions like this makes it seems as though there is some kind of fundamental limitation that prevents people from verifying the Linux kernel in a proof assistant.
I think that the fundamental limitation (about the general case) suggests great challenges. And again, the specific method doesn't matter as long as it's sound. A proof assistant and a model checker are both bound by those results, as both are sound.
> Even though it is quite a jump to conclude that from theoretical results, especially when human intuition is involved.
I'm not sure human intuition is very relevant. We've had human intuition and proof assistants for the past 40 years, and still we haven't been able to prove software beyond a very small size. So I think that at the very least we should consider the possibility that if we can jump two feet up, then even though it seems like jumping all the way to the moon is a matter of degree, there just could be some fundamental obstacles in the way. I'm not saying we can't overcome them, but I'm saying we can't assume they're not there.
> But not because of a mathematical theorem. Some things are just a lot of work without there being a deep reason behind it.
OK, but people have written the Linux kernel, and that's a lot of work, too. Don't you think that if a mathematical theorem says that, in the worst case, A is much harder than B, and that in practice, A is, indeed, much harder than B, then maybe theory has something to do with it?
> We can go much further with deductive verification combined with automated verification compared to automated verification alone, precisely because human intuition can be used to guide the automated tools.
I think that's speculation and that it's unjustified by either theory or practice. It's unjustified by theory because if human intuition can consistently prove successful (to the point of being an industry technique), then whatever method is used suggests some recognized easier-than-worst-case subset, and so automated tools could take advantage of that, too.
It's unjustified by practice because you could have said the same thing thirty years ago, and here we are now, and automated tools are doing better on average (it's kind of complicated to compare, because there are cases where either one is much better than the other, but overall, when we look at "verification throughput", automated methods verify more software more cheaply). It is true that automated tools have some limitations that we haven't been able to break through, but so do manual proofs.
You could indeed make the case that the intractability/undecidability-type theorems suggest that the problem of program verification is inherently difficult, and not something that can be solved in most cases using a trick. It requires some deep thinking in many cases, the same kind of thinking required to find complex mathematical proofs.
Currently humans are much better at proofs that require this kind of deep thinking. Automated tools are better for proofs that are based on a simple structure, like a large enumeration of cases (as in SAT solving and model checking, oversimplifying things a bit). This is the very reason they are automated: the reasoning performed by them is so simple/well-structured that we are able to write a program to do it for us.
However, many proofs fall outside of this easy-to-automate subset, and require human assistance. Human intuition does consistently do better than automated tools, but there is no single easily recognizable "trick" or "method" to it, which makes it hard to implement in a tool.
In my view, the reason that theoretical results do not apply very directly to human reasoning is precisely that human reasoning is "intuitive" and hard to capture in a program. We are not really Turing machines in any practical sense. We are more adept at proving things (perhaps creating automated tools to help us along the way) than any known algorithm.
I absolutely agree that automated tools make much more sense for your average industry application, but if you want to be able to use the full power of human intuition, instead of restricting yourself to program properties that are easy to prove automatically, then there is no way around manual proofs.
> Human intuition does consistently do better than automated tools, but there is no single easily recognizable "trick" or "method" to it, which makes it hard to implement in a tool.
Except in software verification, the facts on the ground are that it is the automated tools that do better than humans. We're not talking hypotheticals or speculation here. If you've written software proofs you won't find this particularly surprising. They're tedious and frustrating, but rarely require much ingenuity. Attention to detail plays a far more important role [1]. They're not at all like proving mathematical theorems. Software proofs are to math proofs what painting a hospital is to painting the Sistine Chapel.
> In my view, the reason that theoretical results do not apply very directly to human reasoning is precisely that human reasoning is "intuitive" and hard to capture in a program.
Perhaps, but so far humans do not seem to outperform the expectations. So you're counting on a human ability that, to date, we've not seen. In fact, humans rarely if ever beat computational complexity expectations. SAT solvers do much better than humans at supposedly intractable NP-complete problems, for example, and so do various approximation algorithms. Why doesn't our intuition help us outperform algorithms there?
And again, if intuition were the decisive factor, how come we're doing such a terrible job? If algorithms beat us today, what kind of boost in intuition do you expect that will allow us to beat them tomorrow?
Look, formal proofs are great. They're clearly the gold standard. So I encourage everyone here who thinks proofs are the future of software correctness to try them out to write correct large and complex programs. The problem is that they'll fail, and either abandon formal methods altogether, or worse, waste their time proving what they can at the expense of verifying what they should.
> but if you want to be able to use the full power of human intuition, instead of restricting yourself to program properties that are easy to prove automatically, then there is no way around manual proofs.
Maybe. It's certainly possible that one day we'll figure out how to make manual proofs work reasonably well for software verification. It's also possible that by then automated tools will get even better, especially considering that, as they've shown greater promise over the past few decades, that's where most of the research is focused. Sure, there will always be some niches where tedious manual work is required, but I see no indication that that niche would grow beyond the size it occupies now, an important yet small part of formal verification.
I think ?? can already be used for Result<Result<_,_>,_>.
I agree that the syntax should have `await` somewhere, and the observation that it "clearly" is not a field access is actually credible to me. This also opens the possibility of having other kinds of postfix keywords, e.g. .try or .match.
We'll be limiting it to open source projects — but not just software ones: whether it's an open hardware project or an open source book, it'll all be welcome.
As someone who was a "child prodigy", people making a big deal of my age only made me feel good about myself and made me want to achieve more. Of course, this may not be a universal experience, but it has certainly been common among the other child prodigies that I've since met.
Any chance the Smart Routing would be available as a TCP/UDP proxy service? It would be great to take advantage of Cloudflare's network without having to hand over your SSL keys.
> Because of our physical and virtual presence around the world, Cloudflare is uniquely positioned to rebuild the core of the Internet. Every customer we bring on increases the size of our [surveillance] network.
Google was founded by CIA[1]. There’s no way Cloudflare isn’t.
The linked article does little to establish that, other than early Google got some funding from NSF programs that were of interest to the CIA.
That's not unusual, given that Sergey was a CS grad student at Stanford. Cloudflare has a different history... it didn't sprout out of a university project.
No, but it is available as part of Google's cloud services [0]. Particularly noteworthy is the fact that they offer an SLA with 99.99% availability. Even crazier is their pending multi-region version due later this year which will come with a 99.999% SLA!!
This is the exact thing that I've been working on for the last year and a half. The biggest hurdle is actually handling the flow of money.
Because, unlike say Kickstarter/Patreon, where the money goes to a single entity/person, open source collaborations tend to be distributed and dynamic. So you effectively need to build a payments company and handle all of the headache that goes along with it.
But, having said that, I very much believe that something like this is needed — which is why I'm still persevering with the project. Happy to chat further if cperciva/others are interested.
I think there are enough solo developers -- solo either because they're the only person working on a project, or because they're the only person working on a specific part of a project -- that this could be very useful even without the headache of splitting money between teams.
I agree that it would be good to be able to handle larger teams at some point, but it might be easier to figure out how to do that after seeing how people used a "solo developer" version of this.
tav, have you looked at Stripe's Connect product (https://stripe.com/connect)? Stripe already is a payments company, and Connect is built to let you use it somewhat like your own payments company, to pay others. Lyft uses it to pay its drivers for example, who also tend to be distributed and dynamic.
(Disclaimer: I work for Stripe, as of like last week. :)
Yes, that was actually the original plan. It all seemed so easy! Let developers connect their standalone Stripe accounts and have sponsors pay directly into those accounts. But this proved problematic for a few reasons, e.g.
* Most developers didn't want to handle matters like global VAT reporting individually, so ideally the money would flow through a central entity which took care of that for them so that they only have to deal with a single source of income.
* The need to support ad-hoc transfers of funds between standalone accounts when maintainers approve transfers to other developers they are collaborating with.
Stripe Connect with managed accounts helps to solve some of these issues, e.g. by creating charges on the platform account and then creating individual transfers to connected accounts. But according to https://stripe.com/docs/connect/charges-transfers:
"This approach is only fully supported when both your platform and the connected account are in the U.S. If your platform or the connected account is outside of the U.S., you can only use this approach for less than 10% of your total volume."
So, is there some other method that's available? Because right now, I'm looking at handling the payouts myself and using Stripe to handle just the incoming payment processing. And it'd be great to use Stripe for both instead!
tav, I'm sorry for my delay getting back to you. I just sent you an email at your profile address to introduce you to someone on our New Business team, who may be able to help.
Open source funding is a cause near and dear to my heart, and I hope something is able to work out.