Great read, thanks. It describes how to handle ±0 and works with the integer difference to avoid truncation errors.
The first half of the paper arrives at this correct snippet; the second half is about optimizing it.
bool DawsonCompare(float af, float bf, int maxDiff)
{
    // Type-pun via memcpy (reinterpret_cast through a float* violates
    // strict aliasing). Needs <cstring> and <cstdlib>.
    int ai, bi;
    memcpy(&ai, &af, sizeof(ai));
    memcpy(&bi, &bf, sizeof(bi));
    // Remap negative floats so the integer representations are
    // monotonically ordered; this also makes +0 and -0 compare equal.
    if (ai < 0)
        ai = 0x80000000 - ai;
    if (bi < 0)
        bi = 0x80000000 - bi;
    // Adjacent floats now differ by 1, so this is a ULP comparison.
    return abs(ai - bi) < maxDiff;
}
> The extra half ulp error makes no difference to the accuracy of calculations
It absolutely does matter. The first and most important reason: one needs to know the guarantees of every operation in order to design numerical algorithms that meet some guarantee. Without knowing what the components provide, it's impossible to design algorithms on top of them with any guarantee. And this is needed in a massive number of applications: CAD, simulation, medical and financial systems, control systems, aerospace, and on and on.
And once one has a guarantee, making the lower components tighter allows higher components to do less work. This is a very low level component, so putting the guarantees there reduces work for tons of downstream work.
All this is precisely what drove IEEE 754 to become a thing and to become the standard in modern hardware.
> the problem is that languages traditionally rely on an OS provided libm leading to cross architecture differences
No, they don't, not for things like sqrt and atanh and related functions. They've relied on compiler-provided libraries since, well, as long as there have been languages. And the higher-level libraries, like BLAS, are built on specific compilers whose guarantees come, again, from the libraries the compiler uses. I've not seen OS-level calls that document the accuracy of their floating-point operations, but a lot of languages do, including C/C++, which underlies a lot of this code.
> The first, and most important reason, is one needs to know the guarantees of every operation in order to design numerical algorithms that meet some guarantee
sure, but a 1 ulp guarantee works just as well here while being substantially easier to provide.
> And the higher level libs, like BLAS, are built on specific compilers that provide guarantees
Sure, but BLAS doesn't provide any accuracy guarantees, so building it on components that sort of do has pretty minimal value. For basically any real application, the error you experience comes from the composition of intrinsics, not from the composed error of the intrinsics themselves, and that remains true whether those intrinsics have 10 ULP error or 0.5 ULP error.
I guess you're right; I was probably misled this whole time. I went through my old analysis class book [1] and there doesn't seem to be an explicit definition of elementary functions. The best I can find is this paragraph (my translation from Italian):
> The elementary functions of analysis, that is powers, roots, exponentials, logarithms and their inverses, functions obtained from the former by arithmetic operations or composition, admit the limit f(p) for x → p, for any p in their set of definition. The study of such functions, which is not limited to the sole real functions of real variable, is carried out naturally in the setting of metric spaces.
That said, I'm relatively sure that a definition was given in class and it didn't include arbitrary roots: despite being notoriously difficult, the exam didn't require students to draw the graph of any elementary function including implicitly-defined algebraic roots.
I picked up another one of the old recommended books [2] and it seems to be similarly vague; while the book currently taught in my university [3], gives this definition:
> The following functions (from ℂ to ℂ) are called the elementary functions of the Analysis:
> 1) Rational functions (integral or fractional)
> 2) Algebraic functions (explicit or implicit)
> 3) The exponential function
> 4) The logarithm function
> 5) All those functions that can be obtained by combining a finite number of times the functions of kind 1)...4).
So, roots of arbitrary polynomials implicitly defined are indeed considered elementary. I never knew this.
So, I did a bit of research and I wasn't going crazy: there are apparently two competing definitions of "elementary" in use [1]:
> the class of functions [...] is what I would call exponential-logarithmic functions or EL functions; that is, they are the functions that can be expressed using some finite combination of constant functions, the identity function, exp, log, composition, and arithmetic operations (+−×÷). Some authors call this class of functions elementary functions, but that term is now more commonly used in a different sense, which includes algebraic functions.
Evidently my professor was in the exponential-logarithmic camp.
This isn't unique, nor even the cheapest-to-compute way to do it. For example, let f(x,y) = 1/(x-y). This too is universal. I think there's a theorem stating that for any finite set of binary operators there is a single operator that can replace them all.
write x#y for 1/(x-y).
x#0 = 1/(x-0) = 1/x, so you get reciprocals.
Then (x#y)#0 = 1/((1/(x-y)) - 0) = x-y, so subtraction.
it's a common exercise to show that in (insert various algebraic structure here) reciprocal and subtraction give all 4 elementary ops.
Ooh, that 2nd link has a nice construction by Terry Tao giving a clear way to show infinitely many such functions exist for pretty much any set of operations.
I’m fairly certain that the difference between the approaches is that the f(x,y) function you mentioned requires limits to represent certain concepts while the eml approach is essentially a tree or a chain of computations meant to represent a model of a system.
They're not series, that's just a convenient way to think about defining and calculating them. I've never found it particularly useful to deal with the series definitions either, and none of the (good) approximation methods I'm aware of actually take that approach.
Moreover, EML is complete in a way that your suggested function isn't: if you take a finite combination of basis functions, can it build periodic functions? Hardy proved over a century ago that the real (+,-,/,*,exp,ln) can't do this (answering the paper's unresolved question about similar real-valued functions in the negative). EML being able to build periodic functions is a lot less surprising for obvious reasons, but still pretty neat.
EML goes through complex numbers and infinities internally.
Previous commenter meant that there are infinite series inside log and exp.
To achieve something like sin(x) from his universal expression, yes, you'd need an infinite series of those operations, not a finite set; but to get all 36 basic functions you'd need only a finite set of different infinite series.
So exp and log just happen to be the labels for the two-element set of such infinite series that you'd have to use to make the 36 functions out of the operator the previous commenter proposed.
All that said, I still think it's pretty neat too.
I understand your point. The paper is more about the depth of the tree to represent and audit a model versus the raw CPU clock cycles. It takes the exponent and logarithm as given since for all practical purposes, in a scientific context, they are.
To represent something like sin(x) with f(x,y) requires infinitely many steps. Conversely, with eml you get an exact result in around 4 steps using identities and such.
One could argue that we do Taylor-series approximations in hardware to represent trigonometric functions, but that highlights the key aspect of the eml approach: you can write a paper with those four steps that describes an exact model, here sin(x), and people can take that paper and optimize the result.
This paper is about an auditable grammar that you can compute with.
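For reference, the standard identities behind getting sin and cos from exp (not necessarily the paper's exact four steps, but presumably close in spirit) are Euler's formulas:

```latex
\sin x = \frac{e^{ix} - e^{-ix}}{2i}, \qquad
\cos x = \frac{e^{ix} + e^{-ix}}{2}
```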
You can reduce all Boolean logic to NAND, but that doesn't actually mean that semiconductor fabs translate their HDL to NAND gates, because it is possible to build complex gates that directly implement higher level operations in a single gate.
Your "cost of computation" objection can be easily resolved by adding more operators, which makes it boring from a research perspective.
Meanwhile the loss of expressivity can only be compensated by encoding algorithms directly into the expression tree. Your objection that an infinite series is a bad thing rings hollow, since you now introduce the concept of an infinitely sized expression tree. That sounds much more impractical than implementing an algorithm for the exponential and logarithm functions.
Show me a way to physically compute exp or ln with fewer gates than add. More gates means costlier in dollars, more energy per compute, and, for these functions, higher latency.
You don’t get to make up free ops, claim there is no cost in reality, and hand wave away reality.
There are infinitely many ways to do what the paper did. There’s no gain other than it’s pretty. It loses on every practical front to simply using current ops and architectures.
Conceptual elegance is worth something, isn't it? I don't mean just aesthetic pleasure, as in recreational mathematics, but there's often (not always) value in being able to succinctly describe a wide range of phenomena with a small number of primitives. It could open up new ways of understanding or working that wasn't possible before. Not saying this specific discovery fits the description, but it seems too early to dismiss the idea entirely based on its im/practicality compared to existing solutions.
Aren't there examples in the history of mathematics, where a new idea was criticized for being impractical, then later found to have applications or implications, possibly unexpected even to the person who discovered it?
I feel sad when pragmatism enters scientific discussions... that's not what science should be about (I think).
But I value the discussion this paper is bringing to distinct contexts (even outside academia). That by itself adds value to this work.
No, it approximates exp poorly, and only over an infinitesimally small interval of exp's domain. Resistors and capacitors are nowhere near ideal components, which is why they have spec sheets showing how quickly they diverge.
If we’re making sloppy approximations to a tiny range of exp, then I too can do it with a few terms.
The feasibility of memristor analog circuits is evident, and I believe this paper represents a valuable early exploration. We've been constrained by Boolean logic for quite some time now.
The world has had many types of logic before and after Boolean logic was created, many used in computing. Boolean logic isn’t a constraint; it’s used where it’s useful, and others are used where they’re useful.
> This includes constants such as e, pi, and i; arithmetic operations including addition, subtraction, multiplication, division, and exponentiation as well as the usual transcendental and algebraic functions.
And those come from the infinite series needed to compute exp and ln. They're just as much work either way. The exp-and-ln way is vastly costlier for every op, including simply adding 1 and 2.
It's not about being costly or not; that's completely irrelevant to the point being made. eml is just some abstract function that maps ℝ² to ℝ. Like every other mathematical function, it is really only defined by the infinite set of correspondences from one value to another. It is NOT exp(x) - ln(y), just as exp is not a series (as you wrongly stated in another comment). exp can be expressed (and/or defined) as a series to a mathematician familiar with the notion of a series, and eml can be expressed as exp(x) - ln(y) to a mathematician familiar with exp and ln. Both can also be expressed/defined in multiple other ways.
I am not claiming this is better than 1/(x-y) in any way (I have no idea; maybe it isn't if you look closely enough), but you are simply arguing against the wrong thing. The author didn't claim eml to be computationally efficient (it even feels weird to say that, since computational efficiency is not a trait of a mathematical function but of a computer architecture implementing some program), only that (eml, 1) is enough to produce every number and function that a scientific calculator can produce (admittedly, that's somewhat vaguely defined).
However, I want to point out that it's weird 1/(x-y) didn't appear on that graph in Figure 1, since if it's as powerful as eml, it should have all the same connections as eml, and it's a pity Odrzywołek's paper misses it.
Oh, glad you are still here. Because I kept wondering about 1/(x-y), and came to the conclusion it actually cannot do nearly as much as eml. So maybe you could confirm if I understood your assumptions correctly and help me to sort it out overall.
In your original post you were kinda hand-wavy about what we have besides x # y := 1/(x-y), but your examples make it clear you also assume 0 exists. Then it's pretty obvious how to get the identity function, reciprocal, negation, subtraction, and addition. But I effectively couldn't get anywhere past that. In fact, I got myself convinced that it's provably impossible to define (e.g.) multiplication as a tree of compositions of # and 0.
So here's my interpretation of what you actually meant:
1. I suppose, you assumed we already have ℕ and can sample anything from it. Meaning, you don't need to define 5, you just assume it's there. Well, to level things out, (#, 0, 1) are enough to recover ℤ, so I assume you assumed at least these three. Is that right?
2. Then, I suppose, you assumed that since we have addition, multiplication simply follows. I mean, at this point we clearly have f(x) = 3x, or 4x, or 5x, …, so you decided multiplication was solved. But I couldn't find how to express f(x, y) = x⋅y, and as far as I can tell, it's actually impossible. If I'm wrong, please show me x⋅y defined as a sequence of compositions of (#, 0, 1).
3. This way (assuming №2) we get (ℚ, +, -, ⋅, /). Then, I suppose, you assume we can just define exp(x) as its Taylor series, so we also have all roots, trig functions, etc., and then we obviously have all numbers from ℝ, that are values of such functions acting on ℚ. Exactly as we do in any calculus / real analysis book, with limits and all that jazz.
If that's what you actually meant, I'm afraid you completely missed the point, and 1/(x-y) in fact isn't nearly as good as eml for the purposes of Odrzywołek's paper. Now, I didn't actually verify his results, so I take them for granted (they are totally believable, though, since it's easy to see how), but what he claims is that we can use eml essentially as a gate, like the Sheffer stroke in logic, and express "everything else" as just a sequence of such gates plus the constant 1 (where "everything else" is what I listed in №3). No words, limits, sets, or other familiar mathematical formalism: just one operation and one constant, with "induction" used only to get all of ℕ; everything else is defined as a finite tree of (eml, 1).
All of this is standard fare in abstract algebra classes, and I didn't care to write it all out. Once you have the "inverse" operations, negation and reciprocal, the entire structure follows for a large class of objects, whether N or Q or R or C or finite fields or division rings or a host of other structures. So I only wrote - and 1/x.
Then, subtraction is (x#y)#0 = x-y. Reciprocal is x#0 = 1/x. Addition follows from x+y=x-((x-x)-y). This used the additive identity 0.
Multiplication follows from x^2 = x - 1/(1/x + 1/(1-x)), so we can square things. Then -2xy = (x-y)^2 - x^2 - y^2 is constructible. Then we can divide by -2 via x/(-2) = 1/((0 - 1/x) - 1/x), and there's multiplication. In terms of #, this expression only needed the constant 1, which is the multiplicative identity.
Now mult and reciprocal give x * 1/y = x/y, division.
Any nontrivial ring needs additive and multiplicative identities with 1 ≠ 0, which are the only constants needed above. If you assume this is Q or R or C, it may be possible to derive one from the other, not sure. But if you're in these fields, you know 0 and 1 exist.
Then any element of Q is a finite sequence of ops. R can be constructed in whatever way you want: Dedekind cuts, Cauchy sequences, the usual constructions. Or assume R exists, and compute in it via the f(x,y).
This also works over finite fields (eml does not), division rings, even infinite fields of positive characteristic, function fields (think elements are ratio of polynomials), basically any algebraic object with the 4 ops.
You forgot to construct the numbers. You need 0 (explicitly used in your construction), and you need another number so you can generate numbers other than 0 and infinity.
I don't think this can produce any of the "standard" constants or what we generally consider closed-form expressions, though! (E.g., no e, pi, exp, log, etc.)
Yes it can, by using the same infinite series that exp and ln need in order to be computed. This one just costs less in money, hardware, and energy, and is faster for basically every basic op.
It's only finite because the infinite series is hidden inside an operation.
And the basic monomial basis is not a single binary operation capable of reproducing the set of basic arithmetic ops. If you want trivial and basic, pick Peano postulates. But that’s not what this thread was about.
well, the statement is: is there a single operation, built from elementary operations, such that all _other_ elementary operations have finite representations?
this preprint answers that in the affirmative
otoh, (x, y) -> 1/(x-y) does not answer this question at all. you can argue that the preprint does so "via the infinite series in an operation" (which I have no idea what that means; surely if exp(x) qualifies then so must 1/(x-y) if we pick a monomial basis?) but ¯\_(ツ)_/¯
now, do I think that this is groundbreaking magical research (as I'm currently seeing on twitter) no... But it's neat!
Exp and ln are infinite series. Exp's series is, term for term, the series for cosh plus the series for sinh (cos and sin up to signs). Hiding the fact that every op is an infinite series behind a name doesn't make things free. It just makes even trivial ops like 1+2 vastly more work.
They are not infinite series per se. They can be represented by infinite series in several ways, but there are standard ways to define them that do not involve infinite series. The logarithm, in particular, cannot even be represented by a single infinite series (a Taylor expansion) on the whole complex plane. And knowledge and use of trigonometric functions greatly predates such infinite-series representations.
Moreover, the point is not always numerical computation. I don’t think anybody argues that eml sounds like an efficient way to compute elementary functions numerically. It may or may not still be useful for symbolic computations.
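Concretely, one standard series-free pair of definitions from analysis texts: take the logarithm as an integral and the exponential as its inverse,

```latex
\ln x \;=\; \int_1^x \frac{dt}{t} \quad (x > 0), \qquad \exp \;=\; \ln^{-1}.
```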
The article is about producing all elementary functions, which 1/(x-y) clearly doesn't do, as it doesn't produce any transcendental function. Like many such universality-style results it may not have practical applications, but it may still be interesting in its own right.
Any transcendental function can be produced by arithmetic, since it's complete for R.
Go ahead and show how to compute exp or ln without an infinite series and without circular reasoning. You can't, since they're transcendental.
There are infinitely many ways to make such binary operators. Picking extremely high-compute-cost ones really doesn't make a good basis for computation.
> Any transcendental function can be produced by arithmetic, since it's complete for R.
Not without some form of limit process or construction. You can approximate e with the basic arithmetic operations but not actually get an exact form in finitely many steps. And you definitely cannot traverse an infinite binary tree, so your arguments miss the main point of the result in the article.
Again, you are mixing up separate things. Nobody said that eml is a way to approximate elementary functions more efficiently. It is a way to express elementary functions in a finite number of operations; that is, computing symbolically, not numerically. E.g., I may care that exp(3)*exp(2) = exp(5) without caring to approximate exp(5) numerically. The paper is literally filed under "Computer Science > Symbolic Computation", not "numerical analysis" or "engineering", after all.
And to be precise:
> Go ahead and show how to compute exp or ln without an infinite series without circular reasoning. You can’t, since they’re transcendental.
You don't necessarily need an infinite series; you need some limit process. A basic example: exp(x) can be approximated by (1 + x/n)^n for large n. For the logarithm you can use a formula involving the arithmetic–geometric mean, which you can evaluate with an iterative process/recursion, no infinite series required. You can also approximate the exponential by using Newton's method together with that; see [0].
Yep, I’ve written numerical methods papers, and am very well aware of the field.
A limit process is a definition. Try computing with it. You’ll end up with an infinite sequence, or an approximation.
An iterative process is an infinite series; they're equivalent.
Newton's method is the same: completely equivalent to an infinite series as you increase precision.
And both require infinitely precise constants. So you're still not doing anything the 1/(x-y) operation cannot do, and to evaluate those series you'll compute with things amenable to ops that are easy to do by hand or machine via the 1/(x-y) op.
Do you claim everything you don't understand is LLM output? This is what I mean by these and many of your other comments being extremely poor quality, to the point of deliberate ignorance.
The paper above was published in 2012 [1], so that would be quite a feat for an LLM. This takes about zero effort to check.
Put some thought or effort into your claims; they’ll look less silly.
The compute, energy, and physical cost of this versus a simple x+y is easily an order of magnitude higher. It will not replace anything in computing, except maybe in fringe experiments.
Those “few drones” have completely kept the US military, ships and all, far away since they can damage and sink large expensive vessels with tiny cheap drones.
How did the planes and ships and missiles fare in Iraq or Afghanistan? Oh yeah: decades and trillions spent, and nothing changed. Iran is much larger and well armed everywhere, with support from China and Russia and others…
The old govt was about to be toppled by people sick of it. The US attack unified those people behind the leader's son, someone they'd not have accepted before, and entrenched a new generation against the US. So far the carrot and stick has them openly mocking Trump and the US as Trump makes a threat, draws a line, folds yet again, and repeats.
How’d that plan work out in Iraq or Afghanistan, both much smaller, less armed countries? Decades and trillions spent, and what exactly did the US “win”?