Hacker News

There's some movement in that direction.

However, the R core committers are not only volunteers, they're also (afaik) all academic statisticians. One of the people who made strides in this direction is primarily a computational statistician at Iowa (Luke Tierney / the compiler package). Building a high-performance runtime/JIT is wildly outside their area of expertise.

In retrospect, and I think many of them would agree, building and maintaining their own runtime was a giant mistake. Yet here we are.

Serious compiler people (Jan Vitek, others) have made strides towards a faster implementation (his is on the JVM / FastR, IIRC), but it suffers from the same problem as CPython: there are millions of lines of C code in packages and internal functions that have the details of the R interpreter / C interface deeply embedded in them. In fact, there's probably far more "R" code written in C than in R. Undoing this mess is not easy, and probably not possible.
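To make the coupling concrete, here's a minimal sketch of what package C code typically looks like (`sum_double` is a hypothetical example function; it only compiles against R's own headers, e.g. via `R CMD SHLIB`):

```c
#include <R.h>
#include <Rinternals.h>

/* A .Call-style entry point: takes and returns the interpreter's
 * internal object type (SEXP) directly. */
SEXP sum_double(SEXP x) {
    R_xlen_t n = XLENGTH(x);   /* reads the object's length field */
    double *px = REAL(x);      /* raw pointer into the SEXP's payload */
    double total = 0.0;
    for (R_xlen_t i = 0; i < n; i++)
        total += px[i];
    /* PROTECT/UNPROTECT is the interpreter's GC protocol, which
     * extension code must participate in by hand. */
    SEXP out = PROTECT(Rf_ScalarReal(total));
    UNPROTECT(1);
    return out;
}
```

Every line here assumes GNU R's object layout and memory management, which is why an alternative runtime like FastR can't just run the R code in a package: it also has to emulate this entire C API faithfully, or the package's compiled code breaks.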

Oh, reading Evaluating the Design of the R Language [1] will shed some more light on why it's hard to make R run fast.

[1] http://r.cs.purdue.edu/pub/ecoop12.pdf

edited to correctly describe Luke as per gbrown




I think, and I'm pretty sure most of R core would agree, that building and maintaining their own runtime _was_ the right thing to do. Otherwise R would have been at the mercy of maintainers who were interested in problems other than creating an expressive language for data analysis.


I don't think calling Luke an "agricultural statistician" is at all reflective of his work. Not everything in Iowa is corn, and Luke has been working in computationally intensive statistical methodology and statistical software development for decades.


He created Lisp-Stat in the late '80s:

https://www.jstatsoft.org/article/view/v013i09

"While R and Lisp are internally very similar, in places where they differ the design choices of Lisp are in many cases superior. The difficulty of predicting performance and hence writing code that is guaranteed to be efficient in problems with larger data sets is an issue that R will need to come to grips with, and it is not likely that this can happen without some significant design changes."


Hmm, you're quite right; I'm not sure how I came to believe that.



