
Yeah, fortunately pythonpy supports lazy iteration over sys.stdin when you really need it. Just like in Python, the syntax won't be as nice as using a list, but it works:

  py 'itertools.count(1)' | py 'itertools.islice(stdin, 0, 10, 2)'
However, the times when you actually need this are surprisingly rare. Most lazy operations don't require each row to be aware of the surrounding rows, and using the much simpler:

  py -x 'new_row_from_old_row(x)'
will get the job done in a lazy fashion. Usually, when you need rows to be context aware, as in:

  py -l 'sorted(l)'
or

  py -l 'set(l)'
it's just not possible to accomplish your task without reading in all of stdin.
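
To make the distinction concrete, here is a minimal sketch in plain Python (not pythonpy's actual implementation). The names lazy_transform and eager_sorted are made up for illustration, and an endless generator stands in for an unbounded stdin:

```python
import itertools

def lazy_transform(lines, fn):
    """Row-wise transform: yields each result as soon as its input line arrives."""
    for line in lines:
        yield fn(line)

def eager_sorted(lines):
    """Sorting cannot emit anything until every line has been read."""
    return sorted(lines)

# Hypothetical stand-in for an unbounded stdin: "1", "2", "3", ...
endless = (str(n) for n in itertools.count(1))

# Works on the endless source because only five rows are ever pulled.
first_five = list(itertools.islice(lazy_transform(endless, lambda s: s + "!"), 5))
print(first_five)

# eager_sorted(endless) would never return; it only makes sense on finite input.
print(eager_sorted(["pear", "apple", "fig"]))
```

The row-wise version never looks past the current line, so it can stream; sorted() and set() have to see the last line before they can say anything about the first.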


Cool :), glad it's supported, at least for the simple case of line-wise transforms.

Some things can't be done without reading everything. But there are still a number of operations on "all of stdin" that can safely be done lazily. I'm particularly fond of "divide stdin into chunks of lines separated by <predicate>" [0], which does need context, but only enough to determine where the current chunk ends (typically a few lines).
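
Roughly what I mean, as a sketch in plain Python rather than piep's actual code. The divide function and its is_separator argument here are my own illustration, and I'm assuming that lines matching the predicate act as separators and are dropped; piep's own divide may define chunk boundaries differently:

```python
import itertools

def divide(lines, is_separator):
    """Lazily yield chunks (lists of lines), split wherever is_separator(line) is true.

    Only the current chunk is held in memory, so this works on unbounded input.
    Separator lines are dropped (an assumption made for this sketch).
    """
    chunk = []
    for line in lines:
        if is_separator(line):
            if chunk:
                yield chunk
            chunk = []
        else:
            chunk.append(line)
    if chunk:
        # Emit whatever is left once the input ends.
        yield chunk

# Finite input: blank lines separate paragraphs.
paragraphs = list(divide(["a", "b", "", "c"], lambda s: s == ""))
print(paragraphs)

# Endless input where every third line is blank; only two chunks are pulled.
endless = ("" if n % 3 == 0 else str(n) for n in itertools.count(1))
first_two = list(itertools.islice(divide(endless, lambda s: s == ""), 2))
print(first_two)
```

The second call only terminates because each chunk is yielded as soon as its closing separator shows up, which is all the context it needs.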

`py` seems to be aimed at a single expression per invocation (nice and simple), while `piep` recreates pipelines internally (more complex but also means pipelines can produce arbitrary objects rather than single-line strings). So I'm not really sure how you'd do the above in `py` anyway.

[0] http://gfxmonk.net/dist/doc/piep/#piep.list.BaseList.divide



