Hacker News | armanboyaci's comments

Should we validate before we verify the software?


Yes.


What happens if you have prior knowledge of $k$ in the form of a probability distribution?


>Being able to apply statistics is like having a secret superpower.

I totally agree with this sentence. But if you ask for my opinion, merely knowing a list of statistical formulas is not very helpful. Most of the time, people don’t remember the underlying assumptions, so there is a fair chance they will apply the formulas in inappropriate situations.

I recommend watching these two YouTube videos. The presenters advocate using simulation/bootstrapping/shuffling methods instead of memorizing formulas.

Jake Vanderplas - Statistics for Hackers https://www.youtube.com/watch?v=Iq9DzN6mvYA

John Rauser - Statistics Without the Agonizing Pain https://www.youtube.com/watch?v=5Dnw46eC-0o
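To make the shuffling idea concrete, here is a minimal sketch of a label-shuffling (permutation) test using only the standard library. It is my own illustration, not code from either talk; the function name `permutation_test` and the two score lists are made up for the example.

```python
import random
from statistics import mean

def permutation_test(a, b, trials=10_000, seed=0):
    """Estimate a one-sided p-value for mean(a) - mean(b) by shuffling group labels."""
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(trials):
        # Shuffling the pooled data simulates "the group labels do not matter".
        rng.shuffle(pooled)
        shuffled_a, shuffled_b = pooled[:len(a)], pooled[len(a):]
        if mean(shuffled_a) - mean(shuffled_b) >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / trials

# Made-up scores for two hypothetical groups.
group_a = [84, 72, 57, 46, 63, 76, 99, 91]
group_b = [81, 69, 74, 61, 56, 87, 69, 65, 66, 44, 62, 69]

observed, p_value = permutation_test(group_a, group_b)
print(f"observed difference: {observed:.2f}, estimated p-value: {p_value:.3f}")
```

The only assumption baked in is that the observations are exchangeable under the null hypothesis, which is easier to keep in mind than the conditions attached to a formula.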


IIRC, Jake's video inspired the example section in the Python random module docs. It takes about 15 minutes with those examples to learn how to put Jake's ideas into practice. https://docs.python.org/3/library/random.html#examples .


> The presenters advocate using simulation/bootstrapping/shuffling methods instead of memorizing formulas.

Yeah, I often find it much easier to write a little Python script that runs 10,000 Monte Carlo trials, as opposed to "properly" working things out and then not even being confident enough in my result anyway.
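For example, a script of that kind might look like the sketch below. The question (how likely are at least 22 heads in 30 fair coin tosses?) and the name `simulate_tail` are just illustrative choices, not anything from the thread.

```python
import random

def simulate_tail(n_tosses=30, threshold=22, trials=10_000, seed=0):
    """Estimate P(heads >= threshold) for n_tosses fair coin tosses by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One trial: toss the coin n_tosses times and count the heads.
        heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
        if heads >= threshold:
            hits += 1
    return hits / trials

estimate = simulate_tail()
print(f"P(at least 22 heads in 30 fair tosses) is roughly {estimate:.4f}")
```

The exact binomial answer is about 0.008, so with 10,000 trials the estimate should land in that neighborhood, with some noise from the small number of hits.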


It makes no sense to memorize the formulas when most any statistical formula you'd actually use has a package or three that can run it in a way that's already probably reasonably benchmarked and not prone to you fat fingering some error rolling your own.


Assumptions are the part that matters.


What about assuming the package is correct? Sure, it could be wrong in its implementation, but one could simulate the expected results and compare them with the output of the tool, if one suspects that the community of data science nerds has somehow missed for years that the storied Louvain package, or whatever else, is incorrect.
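A minimal sketch of that kind of cross-check, using the standard library's `statistics.NormalDist` as a stand-in for "the package" (the helper `simulated_cdf` is a hypothetical name for this example):

```python
import random
from statistics import NormalDist

def simulated_cdf(x, mu=0.0, sigma=1.0, trials=100_000, seed=0):
    """Estimate P(X <= x) for X ~ Normal(mu, sigma) by drawing random samples."""
    rng = random.Random(seed)
    below = sum(rng.gauss(mu, sigma) <= x for _ in range(trials))
    return below / trials

x = 1.0
library_value = NormalDist(0.0, 1.0).cdf(x)  # what the packaged implementation reports
simulated_value = simulated_cdf(x)           # what brute-force simulation reports
print(f"library: {library_value:.4f}, simulation: {simulated_value:.4f}")
```

If the two numbers disagree by much more than the simulation noise, either the package or your understanding of its assumptions deserves a closer look.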


While I really liked the video by Vanderplas, I returned to it after a year or two, paused every time he presented a problem, and tried to solve it myself using for loops and thinking hard.

I barely succeeded at any of them. So at that point I might as well just look up the formula instead of bootstrapping.

I’ll give the second one a shot too.


I believe your description of software development is highly aligned with the ideas in Peter Naur's "Programming as Theory Building".

https://pages.cs.wisc.edu/~remzi/Naur.pdf


I love when this paper pops up, because now I get to recommend a relevant episode of a really great, really nerdy podcast:

https://futureofcoding.org/episodes/061.html


Thanks for the pointer, lots of good-looking topics in the backlog!


> I wish there was two types of ipynb files, one for file with just code and markdown (for example ipynbc), and one for keeping code+markdown+results.

I believe you can achieve that if you use the jupytext library, right?


I keep my CV on Overleaf and it is very convenient. Did you try that option?


The issue for me with LaTeX is the amount of time I spend messing around trying to get different packages to play nicely together.

Sometimes you get lucky and everything just works. If it doesn't... you google it, and pray that someone else hit the same problem and solved it, because trying to actually figure out the problem from first principles is doomed -- the language (and ecosystem) makes Perl look sane.


https://www.amazon.com/Solution-Selling-Fieldbook-Practical-...

If you are planning to build enterprise software solutions, then I highly recommend this book. It contains very helpful checklists and templates.


I recommend this blog: https://yetanothermathprogrammingconsultant.blogspot.com/?m=...

Not all posts are business related, but you can learn many practical tricks that are hard to find in books.


GAMS is such a wild language/environment.

I don’t know of anything better, but I’m currently reliving nightmares from my Master's.


JuMP is comparably good I think. People reasonably don’t want to add a new language to their stack. But if you’re just formulating MPs it’s as nice as anything, free, and you have a well designed modern programming language if you need it.


I add my +1 to this. I often come across this blog's posts while working as an OR professional.


You can compute the max-flow of an undirected graph. The edges have capacities and in the undirected case you assume that capacity can be used in both 'directions'.
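A minimal sketch of that idea, assuming the usual reduction where each undirected edge becomes a pair of opposite arcs with the same capacity, solved here with a plain BFS augmenting-path (Edmonds-Karp style) loop. The function name and the small example graph are made up for illustration.

```python
from collections import defaultdict, deque

def max_flow_undirected(edges, source, sink):
    """Max flow where each (u, v, capacity) edge can carry flow in either direction."""
    if source == sink:
        raise ValueError("source and sink must differ")
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, capacity in edges:
        # Undirected edge: the capacity is available in both directions.
        residual[u][v] += capacity
        residual[v][u] += capacity

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the flow and update the residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck

# Made-up example graph.
edges = [("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3)]
flow = max_flow_undirected(edges, "s", "t")
print(flow)
```

Because the capacities are symmetric, swapping the source and the sink gives the same value, which is a handy sanity check.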


I found this explanation: https://www.promptingguide.ai/techniques/rag

> General-purpose language models can be fine-tuned to achieve several common tasks such as sentiment analysis and named entity recognition. These tasks generally don't require additional background knowledge.

> For more complex and knowledge-intensive tasks, it's possible to build a language model-based system that accesses external knowledge sources to complete tasks. This enables more factual consistency, improves reliability of the generated responses, and helps to mitigate the problem of "hallucination".

> Meta AI researchers introduced a method called Retrieval Augmented Generation (RAG) to address such knowledge-intensive tasks. RAG combines an information retrieval component with a text generator model. RAG can be fine-tuned and its internal knowledge can be modified in an efficient manner and without needing retraining of the entire model.


Thank you!

