The use of ‘=’ for assignment in programming languages comes, not directly from mathematics, but indirectly from the use of mathematics in science and engineering. As an example, consider the formula for kinetic energy, commonly written
K = mv²/2
Why isn't it written 2K=mv², which expresses the same mathematical equality in a smaller, simpler form? Or any of the other equivalent rearrangements? It's because formulas have a convention, where the LHS is a single term naming the value you want, and the RHS contains the terms for values you have. That is, a formula doesn't just state an equality, it states a method for calculating something. That usage predates programming, and was explicitly copied by early programming languages like For[mula]tran[slator] that were designed for scientific & engineering calculations.
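That convention maps directly onto assignment. As a sketch in C (the function and variable names are my own), the formula reads as a one-way recipe:

```c
/* K = m*v^2 / 2: the LHS names the value you want,
   the RHS is built from the values you have. */
double kinetic_energy(double m, double v) {
    return m * v * v / 2.0;
}
```

The rearranged form 2K = mv² has no direct translation, because assignment only goes one way.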
Because I started programming before taking maths at school, I didn't properly appreciate equality for a while.
Sure, algebra was fine, a(x+y)=ax+ay can go either way; but not ratios and other relationships.
What helped me was geometry, where you can see it's just a relationship. All the components move together; one part isn't privileged as the result.
e.g. you enlarge a circle.
It doesn't make sense to ask whether the radius made the circumference bigger, or the circumference made the radius bigger.
I remember starting with QBasic when I was around 7 or 8 years old, and I quickly got an idea – just put in the equations from math homework to find out the answer! Fighting through the error messages without English and trying to wrap my mind around the basic concepts of procedural programming was a world of pain.
Nice. I had fun in some junior high math competitions in part by writing TI BASIC programs on my calculator to brute force hard problems while I did the easy ones. Eventually they stopped allowing calculators altogether and focused on cheap memorized tricks instead of generalized problem solving.
I didn't downvote, but the reason is likely because your comment is off-topic. The parents are talking about experiences un-learning one-sided equality that they picked up programming before learning algebra; your comment is about programming a TI-83 to help with schoolwork.
I believe it comes directly from conventional math exposition. In a general form it’s about emphasis.
“The change of subject from “The dog bit the boy” to “the boy was bitten by the dog” is similar to the change of subject in a formula, as for example … In each case, the two sentences state the same relationship, but with different emphasis.” [1]
While this reasoning may be common, I don't think it's to anyone's benefit. Talking about "the" formula for kinetic energy seems nonsensical, when there are so many ways to state that relationship. Another option is p² = 2mK.
But there is a good reason to write it as K=mv^2/2 which has nothing to do with specifying a computation. It is the result of symbolic integration of p=mv with respect to v.
If the expression is supposed to be the definition and not just a derived equality, at least Mathematicians (and some Physicists) would use := instead, just as Niklaus Wirth tried to establish with Pascal.
There are many strange symbols in math already that can be difficult to type without something like LaTeX. I feel like something pronounceable is usually better unless you have a sufficiently general idea.
> "why haven't scientists come up with more symbols yet?"
Take an advanced math text and try to type it out. Being in no sense historically limited by typewriters/keyboards, you might be surprised at the complexity of the symbols.
> It's because formulas have a convention, where the LHS is a single term naming the value you want, and the RHS contains the terms for values you have.
I've seen that, too, but there is usually a reason for writing it in that form. The most common would be that you have a different equation with 2K in it, and so you want to make variable substitution simpler. Alternatively, if you are reading older papers, typesetting inline equations that don't fit in a single line was painful. For that reason, a formula might be rearranged to avoid needing any fractions.
While I don’t think the terminology is explicitly standardized, I think most people in the relevant fields would call that statement of the ideal gas law an equation but not a formula, the latter being a special case of the former.
Can you find any reference to the ideal gas law as a "formula?" As far as I can tell, equations without a single variable on the left side are referred to as simply equations, while solutions of such equations in terms of one variable are referred to as formulas. This seems to be the case for every well-known identity I can think of, like the quadratic equation/formula. Can you think of any counterexamples?
Some more food for thought on the meaning of =, from Girard's "Proofs and Types" [0]:
> There is a standard procedure for multiplication, which yields for the inputs 27 and 37 the result 999. What can we say about that?
A first attempt is to say that we have an equality "27 x 37 = 999". This equality makes sense in the mainstream of mathematics by saying that the two sides denote the same integer [...] but it misses the essential point:
There is a finite computation process which shows that the denotations are equal.
> [...] if the two things we have were the same then we would never feel the need to state their equality. Concretely we ask a question, 27 x 37, and get an answer, 999. The two expressions have different senses and we must do something (make a proof or a calculation, or at least look in an encyclopedia) to show that these two senses have the same denotation.
Just a bit of background: Girard is paraphrasing Frege's famous paper On Sense and Reference[1] which is an investigation into the meaning of equality. As a result of that investigation, Frege shows that terms in a language have at least two kinds of meanings (sense and reference or denotation), which Girard presents in a programming context.
These quotes from Girard are great, as is the mention of Frege below.
Typically, the objects related by equality can be thought to have the same meaning with respect to extension and different meanings with respect to intension. Further, the difference in intension reveals something of the computational content of the extensional object being referred to.
Further topics to explore: the BHK interpretation of intuitionistic proof and the univalence axiom in homotopy type theory. Both of these topics give one some insight on the relationship between the computational content of mathematical objects and how this content pertains to the question of whether two objects are “the same.”
Finally, I did skim the article itself and found it lacking. The author seems to be aware of the fact that there are surprising, highly non-trivial properties of the (seemingly trivial) notion of equality in mathematics. And also to be aware of the fact that the use of ‘=‘ in CS contexts isn’t some sort of abuse of notation. But there seems to be very little of interest here beyond some circumstantial verification of these two general (and well-known) facts about equality in the mathematical and computational contexts.
> Further topics to explore: the BHK interpretation of intuitionistic proof and the univalence axiom in homotopy type theory.
Thanks! For the interested, Girard's book focuses on the Curry-Howard isomorphism, another great result linking computer programs (actually, the typed lambda-calculus) to mathematical proofs.
I agree that “=“ as interpreted by people doing math requires context, but in most situations they are able to translate it into a “correct” or formal notion of equality. For example, translating on the fly these ad hoc notions of equality into precise notions of equality in first order logic and/or set theory. For example,
f(x) = 2x + 3
Might be translated into something like,
For all x in the domain of f, f(x) = 2x + 3
Or maybe further,
f = { (x, y) in Cartesian product of domain and codomain | y = 2x + 3 }
Where equality is, I think, strictly defined here as set equality.
The article's other point in this example is that we might say "when x = 2, f(x) = 7," claiming that x is used both as an indeterminate value and a concrete value. Again, I think the ambiguity is resolved when translated using the correct quantifiers, something like "for all x in the domain of f, if x = 2, then f(x) = 7."
Or perhaps you might claim, “there exists an x in the domain of f such that f(x) = 7.” The important point being that the function f is formally NOT the formula f(x) = 2x + 3, but a particular set of ordered pairs, of which you can make formal statements about in first order logic.
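To make that concrete, here's a sketch in C (the restricted domain and all the names are my own) of f as an extensional object, a finite set of ordered pairs, rather than the formula f(x) = 2x + 3:

```c
/* f(x) = 2x + 3 restricted to the domain {0,...,4},
   written out extensionally as the set of pairs it "is". */
struct pair { int x; int y; };

static const struct pair f_pairs[] = {
    {0, 3}, {1, 5}, {2, 7}, {3, 9}, {4, 11}
};

/* Membership test: is (x, y) an element of f? */
int f_maps(int x, int y) {
    for (int i = 0; i < 5; i++)
        if (f_pairs[i].x == x && f_pairs[i].y == y)
            return 1;
    return 0;
}
```

"There exists an x in the domain of f such that f(x) = 7" then becomes a statement about set membership: the pair (2, 7) is in f.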
Another example used was
A = {n^2 | n = 1, 2, ... 100}
But again this is just "syntactic sugar" that a reader would translate into perhaps A = { m | there exists an n in {1, 2, ..., 100} such that m = n² }.
This comment contains an important key distinction between different usages of "=" that are often casually intermixed in such discussions: There is a major difference in how we quantify the logical variables that occur in formulas.
For example, if we consider the atomic formula x = 5+y, then we may mean the identity ∀x∀y (x = 5+y), where all variables are universally quantified.
Or we may mean ∃x∃y (x = 5+y), where the variables are existentially quantified. To determine whether this holds, we can search for a solution given by a substitution that makes the terms equal modulo some theory E we associate with =. If E is empty, then this corresponds to syntactic unification.
Confusingly, in the literature, sometimes "equation" is used for both, and an entire subthread in this discussion is due to this issue.
When one is asked to "solve for x" etc., then one answers whether there is any solution, thus solving the existentially quantified version. When one means "this identity holds", then one states the universally quantified sentence.
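The two quantifications amount to two different checks. A sketch in C over a small finite domain (the bound of 10 and the function names are my own):

```c
/* Universal reading of x = 5 + y: does it hold for ALL x, y
   in the domain {0,...,9}?  (An identity would pass this.) */
int holds_universally(void) {
    for (int x = 0; x < 10; x++)
        for (int y = 0; y < 10; y++)
            if (x != 5 + y)
                return 0;
    return 1;
}

/* Existential reading ("solve it"): is there SOME x, y
   in the domain with x = 5 + y? */
int holds_existentially(void) {
    for (int x = 0; x < 10; x++)
        for (int y = 0; y < 10; y++)
            if (x == 5 + y)
                return 1;
    return 0;
}
```

x = 5 + y fails as an identity but has solutions, e.g. the substitution x = 5, y = 0.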
You're trying too hard. The problem with "=" is usability. If you read the previous article about "why does = mean assignment" you'll find the true reason:
Since assignment is about twice as frequent as equality testing in typical programs, it’s appropriate that the operator be half as long.
Of course, that's wrong. Trying to make it easier to type, a bigger usability problem is created: assigning counterintuitive symbols to functionality that most people would not associate with them.
So the problem is not what = means in maths, it's what it means for people learning a language. Anyway, smadge is right: most of the uses in maths are of the kind "a little imprecision saves tons of explanation". In other words: that's not a question of what = is, but a question of how we use it to get things done.
Oh and then there is the "I'm used to it so it must not be so bad" crowd and the rationalization ensues.
I disagree with what the previous article states: There are languages such as Prolog where = doesn't mean assignment. In logic, = means equality, often modulo some theory. In Prolog, = means (syntactic) unification, which is distinguished from identity only by the different quantification of variables.
I would like to emphasize the distinction that smadge made, since distinguishing between universal and existential quantification is important to properly discuss different meanings and usage modes of equations.
> Trying to make it easier to type, a bigger usability problem is created: assigning counterintuitive symbols to functionality that most people would not associate with them.
How do you know? It seems no such problem has been reported, so the burden is on you to show it exists. I don't think people assume that the same token has the same meaning in different languages.
Oh, please. This has been a flamewar since the 80's. Every angle you can think of has been tried before. The real question is why are we still using plain text. A lot of the difference between languages is this kind of absurd minutiae taken too seriously as if we're discussing about the reality fabric instead of mere conventions.
A flamewar is a disagreement of opinion. I am not aware of any actual problem reported as a result of using the `=` sign for things other than equality or anything suggesting it is generally counterintuitive.
In every formal system, such as FOL/ZFC, all operators must have a precise meaning. And while it is true that most informal uses of the `=` sign in mathematics could be translated into some formal expression making use of a precisely-defined `=` sign, I think the point was that some (or most) mathematical notation is not formal, and therefore saying that `=` always means a precise notion of equality is wrong, even if in a particular formal language it always has this precise meaning. Therefore, it is not a sin to use the same `=` sign in some other formal language (say, a programming language) to mean something other than its meaning in a particular mathematical formal language.
I agree with your interpretation of the article, but I find that the article doesn't match the title. I think the equals sign almost always represents usage of an RST (reflexive, symmetric, transitive) equivalence of some sort, even if the formula containing it isn't an equality (with the irritating exception of big-O notation).
However, I think that using this as a justification for programming languages' use of the equals sign for assignment (whether or not it needs any justification at all) is a moot point, because mathematics is more expository and isn't always strictly rigorous, as long as it's clear that everything is well defined and correct in the end (although there is expressiveness in the formalities as well).
> Rather than precisely say, f(2) = 7, we say that for x=2, f(x) = 7. So x is simultaneously an indeterminate input and a concrete value
This seems like a perfectly by-the-book piece of second-order logic with two equality predicates.
i.e., the statement asserts that if you look at the space of all possible values for x, then for each value where the predicate "x = 2" holds, the other predicate "f(x) = 7" will also hold. It happens there is only a single value that will satisfy "x = 2", but that's not the equality's problem.
Fair point. I'd add that f(x) = 7 can be both equality of functions and equality of evaluations, and binding x=2 suddenly changes the meaning of the equality and the expression.
I think "f(x) = 7" is always equality of the evaluations, and if you want equality of the functions you should write "forall x, f(x) = 7". Admittedly mathematicians are lazy and don't always write that when it's clear from contex, but I don't think the ambiguity is caused by the "=" sign. The expression "f(x) < 7" would be equally (ha) ambiguous.
I wouldn't call that ambiguity. While several shorthand notations use =, it is always clear from the context which one and only one is referring to (and if there are multiple that all of them agree).
This touches on another point, one that one sees much of in mathematics lectures but little in computer science lectures.
Mathematical notation needs to be unambiguous but also facilitate communication and hence tends to be very terse. Thus when discussing addition on a finite field F_5, one usually starts defining equivalence classes [j] and an addition operator "+" and then says that [3]"+"[3]=[3+3]=[5+1]=[1]
followed by a disclaimer like:
"We hence see that this notion is compatible with the previously defined one when identifying ... . Hence, if there is no possible confusion, we will simply write ..."
Yeah, I think that's a good point and, I think, an important difference between math notation and programming languages.
I think mathematicians tend to use a lot of "notational slang" or "ad-hoc syntactic sugar", if you will. This will often make the notation look incomprehensible to people not familiar with the exact domain - however, there is usually a consistent meaning behind it. If one wanted, the notation could be "desugared" into a more rigid (but more verbose) form that expresses the same thing.
To take big-O-notation as an example, if you write something crazy-looking like
x + 5 = O(x)
What you mean is
The function "f(x) := x + 5" is a member of the set of functions "O(x)".
Where "O(x)" is itself shorthand for a convoluted set expression.
Similarly, if you write
x^3 + O(x)
You mean: take x cubed, then add the value of some function from the set O(x). (Whether you mean any function or one specific function is again context-dependent, but usually it's made clear which of the two is meant.)
Contrast that with programming languages, where you often have a rigid syntax but higher-level semantics being quite fuzzy.
E.g., equals() and toString() in Java are straightforward to write, but they can "mean" quite a lot of different things depending on which kinds of objects you call them on.
E.g., the "obvious" meaning of equals() (value equality) works only with immutable value types - yet the method is defined on any kind of object. So it might also mean instance equality - or even entity equality if you deal with ORM proxies - or even wilder things...
I noticed this recently when I was trying to define a note-taking syntax for my math classes. I thought it would be smart to use := for definitions and = for equality, but then I was frustrated when = didn't always mean equals in the same way, and some things didn't really fit into either category. I ended up just giving up and switching back to abusing = in all situations. I think math has a really cool human aspect, it's very rigorous but also relies on the fact that your notes/proofs/whatevers are going to be read by a person.
The author himself admits in the postscript that he embellished the article a bit, but allow me to take it at face value: to me, it seems that the article confuses mathematics with its notation (and the same for computer science, but at this level CS is just a branch of mathematics). All the funny stuff he goes on describing follows from this confusion. When a mathematician does mathematics, they have very well defined concepts for "equality", "equality up to some equivalence relation" (my favorite: "equality up to diffeomorphisms that are isotopic to the identity") and so on. However, notation is chosen for clarity and conciseness, sometimes at the expense of a direct mapping to the underlying mathematical concepts. Thus in some cases the sign "=" is meant to mean equality (in a certain sense), and in some other cases it is not.
Computer languages are no exception: they are nothing other than formalisms for expressing computations. As with every other formalism, the meaning of signs is chosen to be whatever appears most comfortable in that context to the formalism's designer. The statement "x = x+1" has very different interpretations depending on whether you consider it written in C or in standard polynomial equation theory; but in both cases there is a well-known meaning for it. In exactly the same way, the word "case" has different meanings depending on whether you are reading English or Italian.
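The contrast is easy to demonstrate: the same string of symbols is a meaningful command in C and an unsatisfiable equation over the integers. A sketch:

```c
/* Read as a C statement, x = x + 1 is a command: take the
   old value, add one, store the result back under the name x. */
int increment(int x) {
    x = x + 1;
    return x;
}

/* Read as an equation over the integers, x = x + 1 has no
   solution: the test x == x + 1 is false for every x. */
int is_solution(int x) {
    return x == x + 1;
}
```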
There are few more examples that come to mind, like statements about intervals (π = 3.14 ± 0.01) and the usual notation for modular arithmetic; 3 * 3 = 1 (mod 4).
Oh, and the wonderful notation for integrals, ∫ 2x dx = x² + C
Well, the (mod n) notation is just an implicit homomorphism, phi(3*3) = phi(1) where phi : Z -> Z_4. But during lectures it was pretty common for professors to just drop all the Greek and squiggly lines and write things like 9 = 1, using '=' with whatever arcane RST relation happened to be relevant at the time.
I don't think "=" is being overloaded in this example. The symbols "1", "3" and "*" are, since they're working in Z_4 rather than Z, but equality is just equality.
Sure; I think we’re in violent agreement here, it’s absolutely the case that people write the simpler version when the meaning is clear from context. I’ve definitely done that a bunch.
Piling on with a bit more pedantry, my experience is a bit different.
In my current ring theory course, we have indeed written things like 2 * 3 = 1 when working in F_5, but it's not the equality symbol that is overloaded, but the numbers themselves. Rather than using = to mean both numeric equality and equality w.r.t. equivalence classes, we just use the numbers themselves as shorthand for their equivalence classes.
That seems odd to me. I don't think I've read any ring/algebra/module theory text that doesn't explicitly denote equivalence classes with, for example, square brackets.
There's a canonical ring homomorphism from the integers into any commutative ring with 1. When such a homomorphism is unique, you often omit it, hence mathematicians sometimes just write numbers without equivalence class brackets.
In my experience, mathematicians tend to drop the third dash.
I should also add that, as generally presented in higher level math, modular arithmetic actually does use literal equality. The abuse is in the numbers.
That is to say, when we write "9" in modular arithmetic, we mean the set of all numbers congruent to 9. In this case in "9=1 mod 4" the equality is literal because the set of numbers congruent to 9 is literally the same set as the set of numbers congruent to 1.
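In code, that literal equality is just equality of residues. A small C sketch (the helper name is mine):

```c
/* "9 = 1 (mod 4)" read literally: 9 and 1 leave the same
   remainder mod 4, so they name the same equivalence class. */
int congruent_mod(int a, int b, int n) {
    /* ((a - b) % n + n) % n normalizes negative remainders */
    return ((a - b) % n + n) % n == 0;
}
```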
I probably should have said "common" notation; it's certainly what I learned. I think you're right in that many authors prefer ≡ to emphasize that it is an equivalence rather than equality relation.
I don't understand the schism. For example, 11=1 gives the equivalence of two different programs. Whereas 33 mod 4 and 1 are equal in Z_4 ...
It bothered me a lot when I took logic lectures that equality wasn't treated as an operator or even as anything within a theory (whereas the turnstile was described, at least).
Yeah, sorry, I didn’t mean to split hairs. They’re definitely both in use. I do plenty of work in Zp and I feel bad every time I don’t write \equiv or the \pmod for that matter :)
This. It's even more obvious in linear algebra where mathematicians routinely start with the premise "Ax = b", even if there is no solution x that would satisfy the equation exactly.
The supposition in instances like these is one expressing a notion of equivalence; whether or not an equality results in a contradiction doesn't mean that the meaning of the symbol has changed.
CS already abuses equality all the time with big-O notation. Often you see stuff like f(n) = O(n²), when what's meant is f ∈ O(n²). It's fine because everyone knows what's going on, but it's not using it in the sense of equality.
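Unfolded, the membership reading is a claim about constants c and n0. A sketch in C that spot-checks one witness pair for a made-up f (all names and constants here are my own):

```c
/* "f = O(g)" unfolded: there exist c and n0 such that
   f(n) <= c * g(n) for all n >= n0.  Here f(n) = 3n^2 + 10n
   and g(n) = n^2; the pair c = 4, n0 = 10 works, since
   10n <= n^2 exactly when n >= 10. */
long f(long n) { return 3 * n * n + 10 * n; }
long g(long n) { return n * n; }

/* Spot-check the witness over a finite range of n. */
int witness_holds(long c, long n0, long n_max) {
    for (long n = n0; n <= n_max; n++)
        if (f(n) > c * g(n))
            return 0;
    return 1;
}
```

Of course a finite check is not a proof; it just makes the quantifier structure hiding behind the "=" explicit.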
Sigh; it seems it's only programmers who think CS has a monopoly on big-O notation, or keep calling it abuse of notation and trying to use ∈, when it's really = that's the standard notation in mathematics (and for good reason).
Before Knuth popularized Big O notation in CS and started the field of analysis of algorithms, already in 1958 N. G. de Bruijn wrote an entire book on Asymptotic Methods in Analysis (not CS): see a few of its leading pages here: https://shreevatsa.wordpress.com/2014/03/13/big-o-notation-a...
And the notation was already being used by Bachmann in 1894 and Landau by 1909 in analytic number theory, well before computers. It was perfectly commonplace to use big-O notation with the equals sign very quickly: see e.g. this paper by Hardy and Littlewood (https://projecteuclid.org/download/pdf_1/euclid.acta/1485887...) from 1914, well before even Turing machines or lambda calculus were formulated, let alone actual computers or analysis of algorithms.
The statement about sine above is not something mathematicians would write. It makes little sense to use big-O notation in this context, as it doesn't say anything useful here: the O(x^7) element absolutely dominates the remaining explicit elements of lower order, so including them tells us absolutely nothing. In fact, sin(x) = O(1).
However, mathematicians do indeed use similar notation in this context, that is, little-o notation. It is in fact true that
Notice that in your example, you have o(x^5) and an explicit x^5 term. In my example I have O(x^7), but no explicit x^7 term.
It is true that I cannot think of a circumstance where you want to do this abuse of notation and would care if you were forced to use little-o or big-O instead of the other.
In my experience, it happens to be more common to use big-O.
I was taught to use tilde in big-O notation. Your use of "set element" operator is not quite correct either, because of the limit that's going on there.
I don't think the limit is really an issue here. Most CS textbooks define big-O with the limit ->infinity part baked into the definition. So, this is more of an issue of different people using different definitions than an issue of abuse of notation.
While operators are 'overloaded' all the time, equality is a bit of a special case as it is part of logic rather than some algebraic operation.
In model theory you don't require your models to have an equality operator, they have one simply by being logical constructs.
Then again mathematicians use quotient spaces so transparently that you might as well consider:
5 = 1 (mod 4)
as 'overloading the equality operator' even though the technical definition implies that those 5 and 1 are different from the 5 and 1 in the set of natural numbers, and in Z/4Z the symbols 1 and 5 refer to the same object.
Interestingly, when using mathematics to describe the semantics of programming languages (say, operational structural semantics for an imperative language), the assignment tends to use an arrow, i.e.
S[ x ↦ V ]
indicates that the new state is equal to old state S, but with variable x now bound to value V.
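A toy version of that update in C (the representation is my own): the "new state" is a fresh value and the old S is left untouched, which is exactly what the arrow notation describes.

```c
/* A toy state mapping single-character variable names to ints.
   update(s, x, v) is S[x |-> V]: the same bindings as s,
   except that x is now bound to v. */
struct state { int vals[256]; };

struct state update(struct state s, char x, int v) {
    /* s was passed by value, so the caller's state is unchanged */
    s.vals[(unsigned char)x] = v;
    return s;
}
```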
Math symbols and expressions are inconsistent just like regular languages. But, unlike math, other languages don't claim to be consistent.
It's not surprising that John von Neumann said "in mathematics you don't understand things. You just get used to them." - I've never heard a software developer say this about coding.
For example, I did not enjoy integrals at school because of the 'dx' at the end which means 'with respect to x' but which actually looks like a multiplication (* d * x).
I think that the reason why I never got deep into math is because the language of math is too inconsistent and has too many logical shortcuts and I can't operate in such environment.
My understanding was that the "d" is an operator, and there are notations in which an operator on an operand is notated by just putting the one before the other. Also, the dx corresponds to delta x in the limit definition of the integral. One reason for keeping it, is that it makes the units of measure work out if the integral involves things that have units. So, notational consistency aside, it saved my arse when doing physics problems. ;-)
(even in my pure math classes, I sometimes imagined that the variables had units, to help find mistakes).
But your point is well taken about the consistency of math notation. Math spent most of its history being scribbled by hand and read by humans. It got the job done. And it was not uncommon to invent a new notation on the fly to replace an abstraction with a single symbol. That's the precursor to the subroutine.
The need for perfect formality of notation is a new thing, brought on by the computer age. This may illustrate the point that programming is not "just math," and math is not a form of programming.
> I've never heard a software developer say this about coding.
This is sarcasm, right? We say this all the time about legacy spaghetti code, the excessively clever magic of our languages, or (closest to von Neumann's statement, perhaps) the mountain of APIs, libraries, and OSes we use...
All these discussions of "=" are missing the point.
The notation "x = x + 1" is awful because at lhs x denotes a reference to an integer while at rhs x denotes the value hold by the reference. If you know C, it is similar to the difference between an integer pointer *x and an integer x. As an illustration, here are two programs that are doing the same thing, one in C and one in Haskell.
#include <stdio.h>

int main() {
    int x = 0;
    x = x + 1;
    printf("%d\n", x);
    return 0;
}
import Data.IORef

main :: IO ()
main = do
    xref <- newIORef 0
    x1 <- readIORef xref
    writeIORef xref (x1 + 1)
    x2 <- readIORef xref
    print x2
We don't have that problem in Prolog. In Prolog, X can never be equal to X+1. X will always be X. Once X has a value, it never changes. One of the things that gives most procedural programmers a headache.
Geometry distinguishes between equivalence and value. An "angle" isn't its degrees, but the geometric figure (two rays or segments meeting at an end-point of each). It's the measure of the angle that is the degrees.
You don't say "angles are equal" - you say they are congruent. It's their measures that are "equal".
Although congruency implies measure equality, it doesn't really mean that, but that the shapes are the same (can be rotated/translated to coincide).
Congruent means that there is some relevant information you are discarding (projecting away), such as position and rotation. It's just as correct to say equal if you've already established the relevant context / quotient space.
Two triangles on a page are called congruent, rather than equal, because they have different positions. The three corners of an equilateral triangle are congruent, because their corner angles are equal. If you take angle to mean corner, you say congruent. If you aren't also talking about their position, you say equal.
In math you can slice things every which way, so it's impossible to use a different word for every different concept; you have to establish a context.
Equal vs. congruent (and equal vs. equivalent, which are also synonymous in math) is a crutch for beginners who are over-reliant on their informal intuition.
I agree with the thoughts on the = sign but I'm not so sure about mutations.
> If mutation is so great, why do mathematicians use recursion so much? Huh? Huh?
> Well, I’ve got two counterpoints. The first is that the goal here is to reason about the sequence, not to describe it in a way that can be efficiently carried out by a computer.
Most high level languages try to avoid making the programmer describe the most efficient way to handle variables. The idea is to describe your algorithms and how they connect and allow the compiler (or interpreter) to figure out how to use registers etc to implement it. Of course that ideal breaks down sometimes but most high level programmers don't normally need to stress the low level details too much.
> My second point is that mathematical notation is so flexible and adaptable that it doesn’t need mutation the same way programming languages need it. In mathematics we have no stack overflows, no register limits or page swaps, no limitations on variable names or memory allocation, our brains do the continuation passing for us, and we can rewrite history ad hoc and pile on abstractions as needed to achieve a particular goal.
It's true that there's a limit to abstractions even the highest level languages can make if they want to remain general purpose. However I think languages can handle immutable variables as a default.
That's not to say I agree that programming should always follow mathematical notation. But I also don't think it's a bad ideal in many cases.
The thing that blew my mind was that there are some mathematical programming languages where the point is not to ever actually run the program. Just type-checking it is enough to prove the result. (In the "programs are proofs" sense.) In these languages, it's important not to allow infinite loops because you will never test the code.
Even though there's a correspondence, there's always going to be a difference between writing a program so that you can actually run it and writing proofs, where you don't, and ridiculously inefficient algorithms don't matter at all, so long as they don't diverge.
> To achieve the same result with recursion requires a whole other can of worms: memoization and tail recursive style and compiler optimizations to shed stack frames. It’s a lot more work to understand all that (to get to an equivalent solution) than it is to understand mutation!
It's true, you can't rely on tail call optimizations in every language. But practically all modern compilers for imperative languages transform your code into SSA form[1] in one of their intermediate stages, so the code:
int x = 1
x = x + 1
will invariably be transformed into:
int x0 = 1
int x1 = x0 + 1
before applying further optimizations and finally converting it to stack-based bytecode or register-based machine code.
This means that mutating a variable, at least when dealing with local primitive values, is equivalent to assigning the value to a new variable.
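To make the renaming concrete, here is a toy sketch of that idea in Python (the `ssa_rename` helper is hypothetical, just for illustration; real compilers do this on a proper IR with control flow and phi nodes):

```python
import re

def ssa_rename(statements):
    """Toy SSA-style renaming for straight-line code.

    Each statement is a (target, expression) pair. Every use of a
    variable is rewritten to its latest versioned name, and each
    assignment gets a fresh version of its target - so mutation
    becomes assignment to a new variable.
    """
    versions = {}  # variable name -> latest version number
    renamed = []
    for target, expr in statements:
        def use(match):
            name = match.group(0)
            return f"{name}{versions[name]}" if name in versions else name
        new_expr = re.sub(r"[A-Za-z_]\w*", use, expr)
        versions[target] = versions.get(target, -1) + 1
        renamed.append((f"{target}{versions[target]}", new_expr))
    return renamed

# The example from above: x = 1; x = x + 1
print(ssa_rename([("x", "1"), ("x", "x + 1")]))
# [('x0', '1'), ('x1', 'x0 + 1')]
```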
Even in other cases, in-place mutation is not necessarily more efficient than copying and modifying values. For multi-threaded code, mutation often requires relying on synchronization which can be more expensive than copying around data.
> Simply stated, the goals of mathematics and programming are quite differently aligned. The former is about understanding a thing, and the latter is more often about describing a concrete process under threat of limited resources.
In the current day and age, if a programmer wants to write the most efficient code, they need to understand a lot about their multi-stage optimizing compiler, out-of-order CPUs, OS threads and so on - and they would still need to benchmark their implementation against others. It is no longer clear that mutation is always faster.
I think that in 99% of cases, clear, safe and maintainable code trumps the micro-optimizations that may (and often may not) be gained from using mutable data. Immutable data is generally easier to reason about, always safer from data races and other bugs, and - in most languages - more maintainable.
The age of limited resources and straightforward compilers is long over, but it left mutable variables as its legacy. I'd argue they're still common not because they are necessary to deal with hardware limitations, but because generations of programmers have gotten so used to them that it's hard to give them up.
>> If mutation is so great, why do mathematicians use recursion so much?
> Huh? Huh?
The deal here is surely that induction and other recursive approaches are conducive to being reasoned about in traditional mathematical contexts (e.g. taking a walk). Mutation is impossible to keep track of, mentally. Though others' mileage will vary on that.
That it was a quote from the article seemed clear in the post in which the quote was presented, so while I agree that it can be difficult to clearly present quotes in some situations on HN and a good blockquote formatting facility would be preferable, I don't think that's really a problem here.
As I read the comments on this post, the top reply contains multiple mathematical symbols which aren't rendering on my recent Android phone. You can't pretend that HN has a huge proactive team working on these issues.
I'm loath to infer that you think that the problem is "me" because I think the underlying question is whether the comments should be a friendly place for people who do not have time or inclination to read the article. You may see that as an appalling, lazy degeneration in discourse; the reality is that reading the comments without wading through a blog post is a valid tactic. If clearer methods of quoting were available the two or three of us involved here would have wasted less time.
The "engineer-forward" alternative in which everyone strives to speak from a totalizing position of authority is just, frankly (as a technical person myself) unattainable.
Just to be clear, I am literally saying that IMHO it should be OK to converse in the comments without reading the link in question. If HN can't accommodate this, it's not a friendly platform for humans.
Let me confess that I’ve not read every answer. But the ones that I did read were all very concerned with “squaring” etc.
The simplest answer — and I think the reason many people have difficulty with both arithmetic and especially algebra — is that you need to deeply internalize just what the “=” sign symbolizes and asserts: that there is the very same number on each side.
In other words don’t be distracted by the symbols and operations. One way to think about this is that “a number is all the ways you can make it” (i.e. it can be thought of as “processes” (an infinite number of them) as well as a “value”).
This means whatever you can do to any number can be done on both sides of the “=” because there is just the same number underneath the gobbledygook on both sides.
This is what “=” actually means. And it’s why algebra is actually quite easy rather than mysterious or difficult. [0]
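A minimal way to see this mechanically (a Python sketch; the helper name `apply_both` is made up): an equation is two expressions denoting the very same number, so applying any function to both sides preserves the equality.

```python
def apply_both(f, lhs, rhs):
    """Apply the same operation f to both sides of an equation."""
    return f(lhs), f(rhs)

# 6 = 2 * 3 is true; halving both sides keeps the same number on each side.
lhs, rhs = apply_both(lambda n: n / 2, 6, 2 * 3)
assert lhs == rhs  # 3.0 on both sides
```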
Typically we write that the solution is "x = –2". This to me is the most abusive form of usage for "=" in mathematics. The solution to the equation is –2. The solution to the equation x = –2 is also –2.
Solving the equation x = –2 is very easy. We can solve it just by looking at the equation. What we are really doing when solving an equation is transforming the original equation into a simpler equation with the same solution set. It gets tedious to write this all out so we just say things like "the solution is x = -2" when we've transformed the original equation to x = –2. This is weird because x is not the number -2. x is a variable that can assume a myriad of values. The only value of x that solves the equation is –2.
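The "same solution set" claim is easy to spot-check by brute force over a small window of candidate values (a Python sketch, assuming integer solutions in that window):

```python
# Both equations pick out exactly the same x values: solving is just
# rewriting x + 3 == 1 into the trivially readable form x == -2.
candidates = range(-10, 11)
solutions_original = {x for x in candidates if x + 3 == 1}
solutions_reduced = {x for x in candidates if x == -2}
assert solutions_original == solutions_reduced == {-2}
```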
As the article states the abuse of the = sign in mathematics is rampant. We do it mostly without realizing it. In this sense mathematical language mimics human languages. All human languages are prone to abuse of rules and to shifting with the times.
The notation in mathematics, while much more precise than spoken human languages, is abused frequently and the purpose is to make things cognitively easier. The ancient Greeks didn't have symbols for numbers and in their mathematics they wrote everything out in Greek. This makes it very hard to do tedious calculations. Using symbols in lieu of writing out all the minutia makes doing math easier provided you learn the contextual meaning of the symbols. Over the centuries symbols have been introduced as a shorthand for complex ideas/objects/operations. If you want everything precisely stated then reading Principia Mathematica ought to cure you of this desire. Mathematics is written by humans for humans.
Code is written by humans for computers, and hence the notation needs to be rigorously defined in the language you are using; that is also why your code needs to be commented.
Suppose x is an integer such that x + 3 = 1. Then, x is -2. There are no solutions here, just implications and an alternative way of defining the value of x.
I think variables in equations are not meant to express the existence of variance within an equation, but a sense of context-dependency of the value of x. At least, IMO.
There is a solution. An equation is really a question.
x+3 = 1
is asking the question, “what value for x makes x+3 the number 1?”
The polynomial x+3 is defined for all values in R, the base ring you are working in. We are trying to find the elements of R for which x+3 is the element 1.
A solution necessitates a question, and questions associated with equations involving variables aren't restricted to: what is the set of possible values which satisfy those equations?
I disagree. If you are saying 'the solution is -2', you have to be clear what the problem is. This becomes clearer when you have a problem with multiple variables. Then saying 'x=-2, y=3' makes clear the value each variable is taking in the solution.
I feel like that example counters your claim in two different ways.
First, (1, 2) only makes sense if you assume that x is first and y is second, or in other words that they correspond to X_1 and X_2 for some vector X. This is probably a reasonable assumption for x and y, but what if you have some other arbitrary choice of variables?
a ρ + 2 = 0
Then the only unambiguous way to write a solution is with explicit labels: a = 1, ρ = -2.
Second, another way to “solve” the equation “x y + 2 = 0” is: “y = -2/x, for any x ∈ ℝ \ {0}”. You could argue that this is also a different kind of “equation” than the one you started with: the reason it can be seen as a solution is not (just) that it’s simpler than the original equation, but that it provides an algorithm to enumerate the set of individual solutions, as well as the set of solutions given some proposed x value. (That is, a set with one or zero members depending on whether x = 0.) However, even if it is a different kind of object, there’s no way to represent it in standard notation that doesn’t conflate ‘questions’ with ‘answers’. If you really want to use tuples, you could go for “{(x, -2/x) | x ∈ ℝ \ {0}}”, which avoids the equals sign – but the ∈ is playing a similar role, an algorithm (enumerate all members of this set) disguised as a test (is this value a member of the set?).
edit: Upon further reflection, I might actually just be expressing violent agreement with the point you were trying to make. shrug
Polynomials in multiple variables always have the variables ordered. Sometimes the ordering is not important or explicitly stated but in reality they are supposed to be ordered.
If we start with x y + 2 = 0 we can apply the function
f(x) = x - 2
to both sides of the equation. This gives us the equation
x y = -2
This is a different equation than the one we started with, but these two equations have the same solution set. We can go a step further and transform this equation to
y = -2/x
This equation has the same solution set as the first equation. All three equations are equivalent. Solutions are ordered pairs of numbers.
Commonly in basic courses like calculus we tell students that the last form is preferable and we write solutions as (x, -2/x). We call this set the graph of the equation but really it’s the solution set of the equation.
In algebraic geometry x y + 2 = 0 is preferable. The solution set is called an algebraic variety. The solutions are ordered pairs in affine space.
The rules of algebra, as taught in low level courses, are rules that allow one to transform a given equation into simpler equation. In one variable the goal is to end up with something like x = 3 because such an equation is easy to solve. The solution is the 1-tuple 3.
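The equivalence of the three forms above can be spot-checked numerically (a Python sketch; the sample points are arbitrary, avoiding x = 0):

```python
# Every pair (x, -2/x) from the last form satisfies the original
# equation x*y + 2 = 0, up to floating-point rounding.
for x in (-3.0, -1.0, 0.5, 2.0, 10.0):
    y = -2 / x
    assert abs(x * y + 2) < 1e-9
```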
Not all sets are recursively enumerable, so using "enumerate" as you did can cause problems.
Each polynomial has only finitely many variables in it. If you were to work with uncountably many variables I assume one would impose a well ordering on those variables. I’ve never worked with a polynomial ring with uncountably many variables. No one is going to solve an equation with uncountably many variables, so I’m not sure what point you are trying to make.
> Typically we write that the solution is "x = –2". This to me is the most abusive form of usage for "=" in mathematics. The solution to the equation is –2.
I disagree, though I think it's fine to think of a bare “-2” as a solution when you have a single variable, when you deal with equations or systems of multiple variables it breaks down. Sure, you can think of the solution in terms of untagged tuples when the variables have conventional orderings, but that just highlights that the lack of tagging the variable with the value is a shorthand, not the “true” form of the solution. And it's as much a shorthand in the one-variable case.
Let me be more precise. Assume x+3 is an element of R[x] with x an indeterminate and R a ring with characteristic not equal to 3. Then x+3 defines a natural map from R to R. The equation x+3=1 is just a shorthand way of asking for the pre-image of 1 under this map.
In two variables we get a map from R^2 to R, and solutions are ordered pairs. By definition of an element of a polynomial ring over R, the variables are ordered.
> Typically we write that the solution is "x = –2"
That is not overloading "=" at all. It is overloading the word "solution".
What your teacher probably told you is:
"x+3=1 is equivalent to x=-2 by .... Hence, the solution of x+3=1 is the same as the solution of x=-2 (which is -2 in both cases). Since the solution of x=-2 is so obvious, we also say the solution is x=-2 to mean that it is the same solution as that of the equation x=-2."
Same difference as in "Q:Would you like a large coffee or a small coffee? A: Large "
The correct answer should be either the phrase "A large coffee" or "a small coffee", but the answer was an adjective.
I don’t know the precise definition of overloaded. I was merely claiming that the use of = in an equation, like a polynomial equation, is very much different than the use of = in something like the statement of the distributive property of the real numbers.
I’m a mathematician, so it’s not something my teacher said as such but something that I noticed consciously when I started doing some programming.
> x is a variable that can assume a myriad of values.
No it can’t. x in this case is a bound variable. It’s not that it just so happens to take on -2, but that it is already bound to -2 to make the statement true.
It really says: There exists some x in Z such that x + 3 = 1. What is that x, or what is a proof of the statement? The answer is in the form of a logical implication.
Let’s assume we are talking about Q, the rationals. x+3 is an element of Q[x]. This element of Q[x] defines a natural map from Q to Q. The equation x+3 = 1 is equivalent to finding the pre-image of 1 under this natural map.
x is actually just, in the language of computer science, syntactic sugar. In reality x+3 is really the infinite tuple (3, 1, 0, 0, ....).
Let's continue to assume we are working over Q. Without further context I would take "x+3=-1" to mean that x, 3, and -1 are all elements of Q; 3 and -1 being the obvious elements, and x being an a priori unknown element which we can easily derive to be -4.
Notably, x+3 is not a polynomial in the technical sense. If we wanted to consider x+3 a polynomial, we would be asking for the t value such that (x+3)[t] = (-1)[t]. Where (-1) is also a polynomial, and (g)[t] is the map Q[x] X Q -> Q given by standard polynomial evaluation.
Sure, this question is equivalent, but I see nothing in the original equation "x+3=-1" to suggest any involvement of formal polynomials.
x+3 is an element of the polynomial ring Q[x]. More precisely it is syntactic sugar for the infinite tuple
(3, 1, 0, 0, ....)
A polynomial ring in one variable is an infinite direct sum of the base ring with addition component wise and multiplication defined in a certain way. The expression x+3 meets the definition of a polynomial.
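In code, that definition is direct: represent a polynomial by its coefficient tuple (constant term first, the infinite tail of zeros left implicit), and the evaluation map falls out of it. A Python sketch with a hypothetical `evaluate` helper:

```python
def evaluate(coeffs, t):
    """Evaluate the polynomial given by its coefficient tuple at t.

    coeffs = (3, 1) stands for 3 + 1*x, i.e. the infinite tuple
    (3, 1, 0, 0, ...) with the trailing zeros omitted.
    """
    return sum(c * t**k for k, c in enumerate(coeffs))

x_plus_3 = (3, 1)
assert evaluate(x_plus_3, 5) == 8    # the evaluation map sends 5 to 8
assert evaluate(x_plus_3, -2) == 1   # -2 is the pre-image of 1
```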
I know what a polynomial ring is. I am not questioning that the string "x+3" can be interpreted as an element of Q[x].
What I am questioning is the necessity to interpret "x+3" as an element of Q[x].
>The expression x+3 meets the definition of a polynomial.
Only if you take x=(0,1,0,...) and adopt the convention that any member q of the base field Q is assumed to represent qx^0 = (q,0,0,...).
That is to say, x+3 only meets the definition of a polynomial because you insist on interpreting it as such.
However, we can also handle "x+3=-1" without ever defining the notion of a polynomial.
E.g., we can say: suppose x \in Q such that "x+3=-1". From this premise, we can derive precisely which specific element of Q x must be.
In a more general setting, we might only be able to derive a set of potential values that x could have, or derive that x cannot possibly exist.
As I mentioned in my prior comment, I see no reason to interpret the "x+3" in "x+3=-1" as a polynomial. If we were to do so, the question would be asking: find t \in Q such that (x+3)[t]=(-1)[t], where (g)[t] is polynomial evaluation.
Applying the definition of polynomial evaluation, we would get that the above equation implies: t+3=-1.
Are you now going to insist that "t+3" is a polynomial? Bearing in mind that we have defined t to be an element of Q, which was necessary to apply it as the second argument of polynomial evaluation; and we only got "t+3" as the output of polynomial evaluation, which is defined to result in an element of the base field.
We could modify our notion of polynomial evaluation to instead be of the form R[x] X R[x] -> R[x], which also gives us (for free) the ability to apply polynomials to other polynomials. But if we were to do this, then when we say that the solution to "x+3=-1" is -4, we are taking "-4" itself to be a polynomial.
In practice this is fine (we identify the base field with the subring of degree 0 polynomials all the time). However, this entire approach breaks down when you start working with functions that do not fit within the framework of polynomial rings.
For instance, suppose I said that "(x+3)! = 120". Are you still going to insist that "x+3" is a polynomial?
What if I define a function id: Q -> Q? In the equation "id(x+3) = 2", are you still going to insist that "x+3" is a polynomial?
> Only if you take x=(0,1,0,...) and adopt the convention that any member q of the base field Q is assumed to represent qx^0 = (q,0,0,...).
> That is to say, x+3 only meets the definition of a polynomial because you insist on interpreting it as such.
That’s what we mathematicians do. In the context of the original post it is absolutely clear that x+3 is a polynomial. There is no other reasonable interpretation.
When you write things like:
> Notably, x+3 is not a polynomial in the technical sense. If we wanted to consider x+3 a polynomial, we would be asking for the t value such that (x+3)[t] = (-1)[t]. Where (-1) is also a polynomial, and (g)[t] is the map Q[x] X Q -> Q given by standard polynomial evaluation.
it gives the impression that you don’t know what a polynomial is. The second sentence I quoted is not true. (EDIT: see note below, my interpretation of what was written was wrong.)
Of course if you change context then different interpretations arise. Which of course is the whole point of my original post. Like all spoken languages mathematical language is nuanced. Things must be interpreted in context.
When presented with the equation x+3=-1 x+3 is a polynomial. -1 is a polynomial.
I gather you do not think x^2 - x + 1 = 0 is a polynomial equation. Is x^3+4x a polynomial? Is there any other reasonable interpretation using accepted mathematical conventions? Perhaps you don’t think 2/(x+3) is a member of R(x). What is it a member of then?
Edit:
x+3 is a polynomial that defines a natural map from R to R. To solve the equation x+3=-1 is asking for the pre-image of -1 of this map. This is what it means to solve this equation. It’s solution set is an algebraic variety. I see no other reasonable interpretation. The whole branch of algebraic geometry is about precisely this. Studying zero sets of polynomial equations.
That we teach people rules they can apply to find the answer does not detract that what is really going is as I’ve described and as you did describe with the second quoted text.
> If we wanted to consider x+3 a polynomial, we would be asking for the t value such that (x+3)[t] = (-1)[t]
Bearing in mind that the example I have in mind is the equation "x+3=-1" with the solution of "-4", in what sense is the above sentence not true?
>When presented with the equation x+3=-1 x+3 is a polynomial. -1 is a polynomial.
Fair enough. In that case, I assume you would consider the equation "x+3=-1" to be false, as it is clear that (3,1,0,0,...) != (-1,0,0,...). Unless of course, as I had suggested, you are looking for the particular element at which polynomial evaluation yields an equal result on both sides. If this is the case then, as far as I can tell, you are introducing the machinery of formal polynomials for the sole purpose of overloading the "=" symbol in a confusing way.
>I gather you do not think x^2 - x + 1 = 0 is a polynomial equation.
Define "polynomial equation". If you are asking whether I would consider that an equation taking place in R[x], then (absent some other context) the answer is no. Even with other context I would say that "x^2 - x + 1 = 0" is false as a polynomial equation. You might be able to get me to call equations done in the quotient ring R[x]/<x^2-x+1> polynomial equations, in which case "x^2 - x + 1 = 0" would be both a polynomial equation and true at the same time.
If you are asking whether I would call "x^2 - x + 1 = 0" a polynomial equation in an informal setting, then the answer is yes. However, I do not see how this is relevant, as the whole point of this comment chain was the formal notion of polynomials.
>Is x^3+4x a polynomial?
Informally, yes. Formally, it depends on context. However, absent some context, I would not consider "x^3 +4x" to be a formal polynomial.
>Is there any other reasonable interpretation using accepted mathematical conventions?
Yes, x^3 +4x is the member of the base ring corresponding to "(x * x * x) + (4 * x)", where x is some other member of the ring.
>Perhaps you don’t think 2/(x+3) is a member of R(x). What is it a member of then?
I am glad you asked. I believe my above answer regarding x^3+4x still applies. However, let me ask you: Is x^3+4x a member of R(x)?
>What is it a member of then?
Again, depending on context. Without context, I would consider 2/(x+3) to be a member of R.
One of the points of my original comment was that when it comes to equations the = does not mean equal in the sense of stating two objects are the same. In the context of an algebraic equation to solve the = sign is really a question. It’s asking, what is the set of values that make the statement true?
I know of no mathematician who thinks x^2-x+1=0 is anything other than a polynomial equation. Specifically, it’s shorthand notation for the variety of the ideal generated by x^2-x+1. And in general you don’t associate this variety with the quotient ring of the ideal generated by the polynomial. You look at the quotient of the radical of that ideal.
Without any further information the only reasonable interpretation of x+3 is that it is a polynomial. Without any further context in an algebraic equation x is a variable and is not assumed to be an element of the base ring.
In the context of function spaces like C(R) it’s a different matter. And viewing x+3 as an element of C(R) the only reasonable interpretation of x+3=1 is that we are finding the pre-image of 1. And to do this for a complicated function means solving an equation. And solving an equation by hand, the context of my original comment, means reducing the equation to a simpler one. In the case I gave this means reducing x+3=1 to the simpler equation x=-2 whose solution is obtained by inspection. That’s the goal of all the algebraic manipulations we bore beginning algebra students with. Reduce complicated equation to simpler equation. One whose solution is obtained by inspection.
Your view of how to interpret x^3+4x is too simplistic because the only way to algebraically manipulate that object is by considering it as an element of R[x] or R(x). You have to view the x as an indeterminate in some larger ring than the base ring.
>It’s asking, what is the set of values that make the statement true?
For this to be the case, there would need to be a statement in the first place. And that statement would involve the =, so you necessarily still have the = symbol representing something other than questioness. I would further say that the equation itself is still just a statement, and any "question" interpretation is based entirely on the context where the equation is presented.
Further, let's take seriously the notion that "x+1" is a polynomial in the formal sense. What does it mean to find the set of values for which x+1=2 is true? Normally, I would say that we are looking for the set of x values which makes that equation true. However, we are insisting that the LHS is a polynomial (again, in the formal sense). This means that there is no variable. The "x" in the LHS is literally a value. It makes no sense to ask what values of (0,1,0,0,...) make that equation true. The fact that we give (0,1,0,...) the standard name of x does not suddenly make the question sensible. Nor does the fact that x is often used to represent variables.
>I know of no mathematician who thinks x^2-x+1=0 is anything other than a polynomial equation.
To be clear, outside of very particular contexts I would still call x^2-x+1=0 a polynomial equation, because it is extremely useful to talk about polynomials without invoking all of the machinery of formal polynomials.
>And viewing x+3 as an element of C(R) the only reasonable interpretation of x+3=1 is that we are finding the pre-image of 1.
I disagree. Viewing x+3 as an element of C(R), the only reasonable interpretation of x+3=1 is the statement (x↦x+3)=(x↦1).
Viewing x+3 as an element of R would allow us to treat the equation x+3=1 in the "obvious" way. We can prove that the statement x+3=1 implies that x=-2.
Further, I would agree with you that, absent other context, when given an equation which contains an "x" in it, there is some implication that we are supposed to solve for x.
>Your view of how to interpret x^3+4x is too simplistic because the only way to algebraically manipulate that object is by considering it as an element of R[x] or R(x).
Why? Suppose I don't know what R[x] or R(x) is. We certainly don't teach highschoolers what either of those are, and they seem to be able to do "algebra" just fine.
Here is a simple approach to dealing with x^3+4x without considering it a member of R[x]. For concreteness, I want to solve x^3+4x=0.
Suppose x \in R such that x^3 +4x = 0.
By the distributive property, this equation is true iff x(x^2 + 4) = 0.
By direct calculation, we can verify that x=0 is consistent with this equation, and therefore consistent with the original equation.
Consider the case where x != 0.
Note that the function f(n) = n/x is a bijection. Therefore, we have x(x^2 + 4)=0 iff f(x(x^2+4)) = f(0).
By direct computation, we get that this is true iff x^2+4 = 0.
We know that x^2 >=0, and 4>0.
Therefore, x^2+4 > 0.
This is a contradiction, which means that the case where x!=0 is impossible.
This means that we have proven that x=0.
Now, suppose we were working over C.
Continuing from x^2 + 4 = 0, we can show that:
x^2+4 = 0 iff
(x+2i)(x-2i) = 0 iff
x+2i = 0 OR x-2i = 0 iff
x=2i OR x=-2i
Since the cases x=0 and x!=0 are exhaustive, we have proven the statement x \in {0, 2i, -2i}.
I solved this using the method we teach school children and without invoking any notion of polynomials.
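For what it's worth, the claimed solution set can be checked directly with Python's built-in complex numbers (2i is written 2j):

```python
# Each claimed root satisfies x^3 + 4x = 0 exactly (small complex
# integer arithmetic is exact here), and a non-root does not.
for x in (0, 2j, -2j):
    assert x**3 + 4 * x == 0
assert (1 + 1j)**3 + 4 * (1 + 1j) != 0
```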
You don't see the problem with this? Saying x=0 and that x is an element of the base ring means that x is the element 0. You can't later in your problem write "...x=2i..." if you are going to persist in your view that x is an element of the base ring. What you have shown is that the variety of the ideal generated by x^3+4x is the same as the variety of the ideal generated by x(x-2i)(x+2i) if we are talking about C as the base ring. If the base ring is Q then a different thing is shown.
You can't logically say, in a consistent manner, that x is in R and x^3+4x = 0 and that x is 3 different values. An element of a ring is not three different elements. An element of a ring is itself. If you want to vary the object x then you need to enlarge your ring to an algebraic structure that admits x so that it behaves the way you wish to view it. Your view of what is really happening when solving an equation is not rigorously sound. The proper way to view this is in the context of algebraic geometry.
> Saying x=0 and that x is an element base ring means that x is the element 0.
I never said that x=0. I said that x could be 0. More specifically, I stated that the statement x^3+4x=0 AND x=0 is not inconsistent.
All I have shown is that (assuming we are working in C), the statement x^3 + 4x = 0 implies that x \in {0, 2i, -2i}.
I suppose you could complain that I have not defined a sense in which {0, 2i, -2i} is correct while {0, 2i, -2i, 7} is incorrect, as it is still a true statement that x^3+4x=0 implies x \in {0, 2i, -2i, 7}.
However, you can easily make this intuition rigorous by saying that the question is to compute the set {x | x^3+4x=0}. Sure, this is invoking machinery not explicitly present in the statement x^3+4x=0. I will even concede that we do not make this machinery explicit when teaching high school students. However, it is far less machinery than your approach.
I am not claiming that the algebraic approach is not rigorously sound; merely that it is not the only rigorously sound approach.
As far as I can tell, you are claiming that it is the only rigorously sound way of stating the question.
>You can't logically say, in a consistent manner, that x is in R and x^3+4x = 0 and that x is 3 different values.
I believe I have made this point clear, but I never claimed x is 3 different values. The claim I made was that x is a member of the set {0, 2i, -2i}
Later you have x=2i or x=-2i. Persisting with the idea that you can view x as an element of the base ring while at the same time allowing its value to change or vary indicates you don’t really understand these issues. If x is in R it is a single value. If you want to vary it you need to expand R to include x and add the appropriate algebraic structure.
It’s shocking that you think x^3+4x is not a polynomial. The whole discussion I started originally was that mathematical language, like all other human languages, is nuanced and there are lots of abuses of notation. This is OK because math is written for humans by humans. The standard interpretation of x^3+4x is that it’s a polynomial. This is not disputable.
To be clear, by the time I got to this statement, I had changed the question (from a base of Q to C)
Further, the complete statement I was making at that point was:
If ((x^3+4x=0) AND x != 0) then ((x=2i) OR (x=-2i))
I am not allowing x to vary at all here. Suppose, for the sake of argument, we had x=2i. It would still be true that ((x=2i) OR (x=-2i)).
> Persisting with the idea that you can view x as an element in the base ring and at the same time allowing its value to change or vary indicates you don’t really understand these issues.
Persisting with the idea that I am doing this indicates that you are not reading what I am writing.
> Why? Suppose I don't know what R[x] or R(x) is. We certainly don't teach highschoolers what either of those are, and they seem to be able to do "algebra" just fine.
We don’t teach high schoolers what is really going on. We mask what is really going on because making it all precise is not effective or helpful at this stage of development. We teach rules to manipulate equations. We don’t use the language of algebraic geometry because that is too complicated. We don’t say to them that the variety of the ideal generated by x+2 is the same as the variety of the ideal generated by 4x+8, so that x=-2 is equivalent to 3x+1=-x-7.
> However, we are insisting that the LHS is a polynomial (again, in the formal sense). This means that there is no variable. The "x" in the LHS is literally a value. It makes no sense to ask what values of (0,1,0,0,...) make that equation true. The fact that we give (0,1,0,...) a standard name of x does not suddenly make the question sensical. Nor does the fact x is often used to represent variables.
You are clearly not an algebraist. The polynomial
(3, 1, 0, 0, 0, ....)
induces a natural map from R to R that is generally called “evaluation”. It maps the number 5 to 8, for instance. We abuse notation and say to beginning students, “replace x with 5”. We dumb things down. Instead of asking for the pre-image of 8 under this natural map, we ask for what values of x we get the value 8. The shorthand way of writing this is to say: solve the equation
x+3=8
That you don’t know this is disconcerting since you’ve obviously had more than an elementary mathematical education. You are confusing the simplistic view of what is taught in basic courses with what is really going on.
Ask a million mathematicians, “is x^2+3x a polynomial” and without hesitation they’ll say yes. Because in standard usage of that expression it is a polynomial. That’s the default interpretation.
Read the first chapter of any beginning algebraic geometry textbook. x^2+3x-1=0 is an algebraic variety. This is the standard interpretation of that equation.
>The polynomial (3, 1, 0, 0, 0, ....) Induces a natural map from R to R that is generally called “evaluation”.
I am well aware of this. If you look at my comments within this very chain, you will see that I made reference to polynomial evaluation, which I will call Ev(f, x). I Am not denying that the partial application given by f' = x -> f(f,x) is naturally induced by the polynomial f. Nor that this is so natural that it often makes sense to identify f with f' so that we would consider f=f', even though they are different types of objects.
>Ask a million mathematicians, “is x^2+3x a polynomial” and without hesitation they’ll say yes.
As will I, because the distinction between formal polynomials and expressions which can be naturally modeled as polynomials is so unimportant that it is almost never worth thinking about.
Put another way, how would you compute the following sets:
{ x \in C | x^3 - x = 8 }
{ x \in C | x^3 = x + 8 }
{ x \in C | log(x) = x^x }
{ x \in C | sin(x) = .7 }
{ x \in N | exists y \in N such that 5x + 3y = 1 }
Are you really claiming that these questions are ill-posed without stating them in terms of algebraic geometry?
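For the last set in the list above, no algebraic-geometry machinery is needed: a brute-force search settles it (a sketch; the search bound 100 is an arbitrary cutoff, safe here because 5x + 3y only grows with x and y):

```python
# { x in N | exists y in N such that 5x + 3y = 1 }
# 5x + 3y is 0 at x = y = 0 and otherwise at least 3, so it never
# equals 1; a small finite search bound is enough to see this.
solutions = {x for x in range(100)
             if any(5 * x + 3 * y == 1 for y in range(100))}
print(solutions)   # → set()
```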
I believe you are the only person trained in mathematics in the world that thinks x^2+3x=0 is not a polynomial equation.
You don’t understand the underlying algebraic theory. This is evidenced by your claim
The "x" in the LHS is literally a value. It makes no sense to ask what values of (0,1,0,0,...) make that equation true.
You can’t write such a statement if you really understand that “what values make x+3 the number 5” is a nicer way of conveying the question “under the natural evaluation map induced by x+3, what is the pre-image of 5”. The text of yours that I quoted is wrong.
> I believe you are the only person trained in mathematics in the world that thinks x^2+3x=0 is not a polynomial equation.
Then you are not reading my comments.
> To be clear, outside of very particular contexts I would still call x^2-x+1=0 a polynomial equation, because it is extremely useful to talk about polynomials without invoking all of the machinery of formal polynomials.
>You don’t understand the underlying algebraic theory.
My claim is that the underlying algebraic theory is not necessary to rigorously state or solve the problems under discussion.
>You can’t write such a statement if you really understand that “what values make x+3 the number 5” is a nicer way of conveying the question “under the natural evaluation map induced by x+3 what is the pre-image of 5”.
My claim is that "what values make x+3 the number 5" translates directly into { x \in F | x+3=5 }. I further claim that the "x+3" in this interpretation is not a polynomial in the formal sense.
I am not disputing that this set is the same set as the pre-image of 5 under the natural map induced by x+3.
Your post goes to the point at the heart of philosophical number theory.
What does equality mean ?
Yup – you got functional equivalence, isomorphism, and temporary assignment of values.
But I think you could prove – that all these types of equality – are “instances” of “different implementations” of “equivalence”.
They are no more equivalent than 1 = 1 is equivalent.
I.e. 1 = 1 means I think we can define a bijective “counting function” that proves there’s the “same number” of “elements” in the “sets”
I think (not sure) – if you define – counting function / same number / elements / sets differently – you get the differing definitions of equivalence you enumerate.
The interesting thing for me is that 1 = 1 is defined clearly in four of Peano’s axioms
And you could mentally – try to develop different (and potentially) – more powerful notions of “equivalence” – with differing axioms
A final point… the prevalence of several “similar” concepts of equivalence in computer science – may point to an underlying “platonic idea” of equivalence – that either exists dormant in the world waiting for us to discover it; or is a useful “technological” construct – that has accelerated “progress”
I have an engineer's understanding of higher maths - overly general and very patchy. Short of taking an undergraduate math course, are there any resources to help me parse math notation? For example, while brushing up on endogeneity/exogeneity, E[B'|X] = 0 completely threw me - I had to search Google for the use cases of a bar/pipe aka latex vert/mid. I usually lose interest in a paper if I get stuck trying to decode the syntax.
Also, if I were you, I would always look up introductory textbooks before trying to read math literature. The example you gave is conditional probability; any textbook on probability theory would cover it.
> The usual way to get half an apple is to chop one into "two equal parts". Of course, the parts are actually NOT EQUAL - if they were, there would be only one part! They are merely ISOMORPHIC.
Sometimes we even leave it as "i" instead of "0 < i < n" or "i = 1 to n". Sometimes the range of the summation doesn't even need to be determined in intermediate steps.
When you say 'i=0', what you mean is that that is the base case, and the sigma specifies a bunch of other cases.
i_1 =/= i_2.
As xg15 noted, it's perfectly fine to say (x=2) => (x + 3 = 5). The problem the first example really addresses is that in mathematics, the namespaces are loosely defined, but in programming they aren't. 'i' can mean several things at once, and it doesn't really matter because those things never really interact in the same context. In programming, you need to specify the name 'i' every time you want to reference it, so it's important that you have a stricter namespace rule.
One can at least say that in formal ZFC the symbol "=" has exactly one interpretation. And it's this interpretation that people talking about Haskell are referring to.
Aren't we all taught to write "y = 3" as the answer to algebra questions? That's how I always thought of it, not as assignment, but as a declaration of truth.
When I learnt programming, I was confused by x=x+1;
After I understood what it really meant, I wondered why they didn’t use some other symbol to capture this semantic, say something like x <- x+1, which implies assignment rather than equality. That way it would be unambiguous and, I feel, clearer. I now guess the choice of using ‘=’ was probably a compromise given the limited symbols that were available back when high-level languages were first written?
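The semantics the arrow spelling is meant to suggest can be seen directly (a minimal sketch, shown in Python for concreteness, but the point is language-independent): the right-hand side is evaluated first, using the old binding, and only then is the name rebound.

```python
x = 5
x = x + 1   # RHS evaluated with the old x (5), then x is rebound
print(x)    # → 6

# Read as a mathematical equation, x = x + 1 has no solution;
# read as an update ("x <- x + 1"), it is perfectly sensible.
```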
Go write a couple hundred lines of R (please don't actually do this R is atrocious imo) and then you'll understand why '=' is used instead of '<-' - because it's a pain in the ass, even with hotkeys
Similar notation is already used for mappings in general albeit from left to right with the bar arrow, and obscures one of the necessary characteristics of recursive functions, which is a mapping to the natural numbers.
Edit: Misunderstood, thought you were talking about math, not programming languages.
> In Python, or interpreting the expression literally, the value of n would be a tuple, producing a type error. (In Javascript, it produces 2.[link] How could it be Javascript if it didn’t?)
That linked JS code uses ^, which is xor, not pow. And Math.pow([], 2) is 0, not 2, since the array coerces to the number 0. Or maybe that was the joke and it flew completely over my head.
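For what it's worth, the arithmetic half of the confusion is easy to reproduce in Python, where ^ is likewise bitwise XOR (a sketch; the JS-specific coercion of [] to 0 is only described in the comments, since Python refuses it):

```python
# In JavaScript, `[] ^ 2` evaluates to 2: ToNumber([]) gives 0
# (via the empty string), and 0 XOR 2 is 2.
print(0 ^ 2)    # XOR: 2 -- what the JS snippet actually computes
print(0 ** 2)   # exponentiation: 0 -- what the notation intended

# Python raises a TypeError instead of coercing the list:
try:
    [] ^ 2
except TypeError as e:
    print("TypeError:", e)
```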
IMO one of the most irritating abuses of notation that I've come across given that it requires no additional effort to use the 'is an element of' symbol instead.
Sigh, even here, on an article whose very purpose is to point out that = does not mean equality as used by mathematicians, there are people calling this an abuse of notation. It's not; it's the standard notation in asymptotics. Using set-theoretic notation is cumbersome — you have to use “∈” in things like 3n^2 + 5 = O(n^2), but “⊆” in things like (n+O(√n))^2 = n^2 + O(n√n) — and defeats much of the point of using O() notation in the first place. I've mostly seen programmers / CS people insist on calling it abuse of notation and try to pedantically and counterproductively use set-theoretic notation, when mathematicians have been freely using the = sign for more than a century, from before the birth of computers. (Not repeating my comment from earlier: https://news.ycombinator.com/item?id=16834297)
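As a concrete check of the first example, the membership claim behind 3n^2 + 5 = O(n^2) can be verified against the underlying definition with explicit witness constants (a sketch; C = 4 and n0 = 3 are one valid choice among many, and a finite sample only illustrates the inequality):

```python
# Witnesses for 3n^2 + 5 = O(n^2): take C = 4 and n0 = 3.
# Then 3n^2 + 5 <= 4n^2 is equivalent to n^2 >= 5, true once n >= 3.
C, n0 = 4, 3
assert all(3 * n**2 + 5 <= C * n**2 for n in range(n0, 10_000))
print("3n^2 + 5 <= 4n^2 holds for all sampled n >= 3")
```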
Wow, so disagreeing with the point of this article immediately provokes a typed-out condescending sigh from you? That's worth a sigh more than anything. Furthermore, I was a mathematics person first, and got into computer science later, so I'm not a "CS person" insisting on calling it an abuse of notation.
I have two points:
Firstly, although the article does a good job making the point that there's no reason to get on programming languages' case for using formalized, ad hoc syntax, it does a terrible job of backing up its actual title. The bottom line is the equals sign is very closely associated with the concept of RST equivalence in mathematics, to the point where a specificity such as the symbol for the isomorphism (which, ironically, he used as a counterexample) is abstracted away, because isomorphism is an RST relation. If two things are isomorphic, they are, in some sense, the same (or equal), so use of an equals sign is natural.
Secondly, if you had read the first source you posted, maybe you wouldn't have claimed programmers/CS people were the ones pedantically insisting on calling it an abuse of notation since, on page 6, de Bruijn writes, "The trouble is, of course, due to abusing the equality sign =."[1] Furthermore, after defining the parameters around the use of the O-symbol, he also writes, on page 7, "It is obvious that the sign = is really the wrong sign for such relations, because it suggests symmetry, and there is no such symmetry."[1]
That's not “worse”, that's the entire point of using O() notation! The beauty of O() notation is that it lets us carry out, fully rigorously, computations like
without dealing with a mess of sets and quantifiers. Please take a look at some works where asymptotic expressions are dealt with proficiently; you'll understand. (de Bruijn's book https://news.ycombinator.com/item?id=16834297 for example, or at least Chapter 9 of Concrete Mathematics.) I recently worked out an example here: https://cs.stackexchange.com/a/88562/891 — replacing all O() equations with “∈” and “⊆”, though it can be done ( https://math.stackexchange.com/a/86096/205 ), is just cumbersome and only distracts from what's going on.
I think you lost a factor log n there. Pretty sure (n + O(log n))^2 is (n^2 + O(n log n)).
In any case your notation works perfectly fine as an equality of sets. It's not that unusual to add and multiply sets together, there's really only one sensible definition, and the slight abuse of notation to use x instead of {x} is perfectly acceptable if you're careful.
The problem is that f = h + O(g) is generally used in a way that's false as an equality of sets, it's usually used to mean f ∈ h + O(g), which just makes using '=' needlessly sloppy notation (especially since you can make it correct by just changing a single character, no mess or quantifiers required).
Still, that's nothing compared to the suggestions on stack overflow which suggest using = to mean ⊆. Which is a terrible idea, equivalent to using = instead of ≤. In fact you point out one of the reasons it's terrible yourself, when you note that all equality signs only work one way, going against all mathematical convention.
Even stronger, ⊆ has some very nice properties. It's a total order on the big O sets, which is one of the nicer properties you can have (it's merely a partial order on the f + O(g) sets, but you can't have everything). Why on earth would you sacrifice all that just so you can avoid a scary math symbol?
(Thanks for the correction, and the polite response.)
The purpose of notation is to communicate effectively, and ideally notation should match the thoughts that the writer has and wishes the reader to have. That is, notation should match thought, rather than humans change their thinking to match notation.
When a mathematician (who works often with asymptotics, say in analytic number theory) writes something like O(n), they are not thinking of a set; they are thinking of “some unspecified quantity that is at most a constant times n”. (As de Bruijn illustrates with his L() example: https://shreevatsa.wordpress.com/2014/03/13/big-o-notation-a...) So for example, the sentence
(n + O(log n))^2 = n^2 + O(n log n)
is thought of (by the person writing it) as something like
“when you take n and add a quantity that is at most a constant times log n, and square it, you get n^2 plus at most a constant times n log n”
and not as something like
“the set of functions obtainable by adding the function n ↦ n to a member of the set of functions that map n to at most a constant times log n, and squaring the sum, is a subset of the set of functions obtainable by adding the function n ↦ n^2 to a member of the set of the functions that map n to at most a constant times n log n”,
even if the latter is the fully formal and precise way of articulating it. (Consider “the sky is blue” versus “the sky is a member of the set of all blue things”, where “the set of all blue things” was never a part of the original speaker's thoughts.)
That's one reason for preferring the equals sign.
I think what's happening is that we lack a good theory or notation for talking about unspecified things the way we think of them, and set theory (and associated notation) is the closest thing that anyone's bothered to develop. (An unspecified thing is “just” a member of some set. And when someone wants to be fully formal, that works fine enough and in fact the differences disappear. As Thurston says (https://shreevatsa.wordpress.com/2016/03/26/multiple-ways-of...): “Unless great efforts are made to maintain the tone and flavor of the original human insights, the differences start to evaporate as soon as the mental concepts are translated into precise, formal and explicit definitions.”) Using the equals sign here captures the original thought better than ∈ or ⊆.
What's so terrible about a one-way = sign anyway? The problem can't be that it goes “against all mathematical convention”, because using it that way is the mathematical convention for over a century, ever since soon after Bachmann introduced it in 1894. It seems that as people learn more mathematics, they learn to accept a great many things and extensions to notation, but the thought of an asymmetric = sign, used like “is”, is just too much to bear for some people. But everyone who works with asymptotics enough does get used to it and comes to appreciate it, so it can't be that either...
I have no problem with using notation in a more 'intuitive' way than the technical definitions allow. However I won't ever acknowledge such notation as correct. If you want to be precise you'll need to use precise notation, and insisting that abusing the '=' sign is precise will do more harm than good. Unless you consider 1+1=3 to be good notation.
If you seriously think you're not missing out by removing the distinction between ⊆ and =, consider that O and subset relations are enough to express all the relations you'd normally use the various big-O cousins for, the ones nobody ever bothers to remember.
So f being of order o(g) is equivalent to O(f) being a proper subset of O(g) (O(f)⊊O(g)), f being of order Theta(g) is simply equivalent to O(f)=O(g), and f being of order Omega(g) is simply O(f)⊇O(g). You might need to restrict yourself to nonnegative monotonically increasing functions, but that's pretty much done in practice anyway. And this might differ from some of the existing definitions that are being used, but those do vary a bit across sources, and these definitions are as sensible as any, and are compatible with the natural partial order on sets of the form O(f) given by inclusion.
Seriously though there's nothing you'll gain from throwing away a nice inequality relation just because it saves you from writing ⊆.
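The inclusion dictionary above can be sketched as a definitional check with explicit witnesses (a heuristic sketch only: finite sampling cannot prove an asymptotic statement, it merely illustrates the definitions; the name `witnesses_O` is mine):

```python
import math

def witnesses_O(f, g, C, n0, N=5_000):
    """Check the definitional inequality f(n) <= C * g(n) on a
    finite range n0 <= n < N.  Sampling cannot prove an asymptotic
    claim; this only illustrates the definition."""
    return all(f(n) <= C * g(n) for n in range(n0, N))

def f(n): return n                   # f(n) = n
def g(n): return n * math.log(n)     # g(n) = n log n

# f is O(g) with witnesses C = 1, n0 = 3 (log n >= 1 once n >= 3):
print(witnesses_O(f, g, C=1, n0=3))   # → True
# The same witnesses fail in the other direction, consistent with
# O(n) being a proper subset of O(n log n):
print(witnesses_O(g, f, C=1, n0=3))   # → False
```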
I'm also not sure what you have against thinking of something as sets, but you can't be precise in mathematics and insist you're not using sets, at least not without going through a lot more trouble than you'd ever avoid that way. In particular it's fine to think of:
>“when you take n and add a quantity that is at most a constant times log n, and square it, you get n^2 plus at most a constant times n log n”
and
>“the set of functions obtainable by adding the function n ↦ n to a member of the set of functions that map n to at most a constant times log n, and squaring the sum, is a subset of the set of functions obtainable by adding the function n ↦ n^2 to a member of the set of the functions that map n to at most a constant times n log n”,
as being the same thing. One is just written in slightly more obnoxious math speak. However if you insist that the meanings are different then something is wrong, because that would imply that the most straightforward mathematical interpretation is apparently different from what you meant.
The notation is also used elsewhere, for instance in math and physics to connote "here's how the next term in this series behaves" when using a series approximation of a function.
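For example, the Taylor expansion sin(x) = x - x^3/6 + O(x^5) promises an error bounded by a constant times x^5 near 0 (a sketch; for 0 < x < 1 the alternating-series argument gives x^5/120 as an explicit bound):

```python
import math

# sin(x) = x - x^3/6 + O(x^5) as x -> 0.  For 0 < x < 1 the Taylor
# series alternates with decreasing terms, so the error of the cubic
# approximation is bounded by the first omitted term, x^5/120.
for x in (0.5, 0.1, 0.01):
    error = abs(math.sin(x) - (x - x**3 / 6))
    bound = x**5 / 120
    assert error <= bound
    print(f"x = {x}: error {error:.2e}, O(x^5) bound {bound:.2e}")
```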
This produces 2 because ^ is the bitwise XOR operator in Javascript. Arrays are not numeric types, so they appear to be coerced to 0 for this operation. In short, they are effectively logging "0 XOR 2", which is 2.
^ is XOR. The author’s “translation” to JavaScript made inconsistent changes in the notation. Here we’re XORing 2 with a non-numeric value, which is coerced into a number: [] becomes 0 (via the empty string), and 0 XOR 2 gives 2. (Even NaN would behave the same way, since bitwise operators truncate it to 0.)