
> text "scraping" is just better for most cases.

Unix pipes handle bytes, not just text. For instance, copy a hard disk partition to a remote hard disk via ssh:

  (dd if=/dev/hda1) | (ssh root@host dd of=/dev/sda1)
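
The same point can be checked without touching a real disk. This is only a minimal sketch: the scratch file and the gzip/gunzip stages are stand-ins picked for illustration, not part of the command above. It pushes random bytes through a pipe and compares checksums before and after:

```shell
# Hypothetical scratch file filled with random bytes; any binary file would do.
tmp=$(mktemp)
head -c 4096 /dev/urandom > "$tmp"

# Checksum the original, then the copy that went through a compress/decompress pipe.
before=$(cksum < "$tmp")
after=$(gzip -c < "$tmp" | gunzip -c | cksum)

# Identical checksums mean the pipe passed the binary data through untouched.
[ "$before" = "$after" ] && echo "bytes survived the pipe unchanged"
rm -f "$tmp"
```
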
The KISS principle ("Keep it simple, stupid") is often the best approach. In Unix you can quickly type a pipeline, without any special coding, that does non-trivial work. For instance,

  find -name "*.xls" -exec echo -n '"{}" ' \; | xargs xls2csv | grep "Miller" | sort 
gets a sorted list of all entries in all Excel files that contain the name "Miller", no matter how deeply the files are nested in the directories. Can you do this in Powershell quickly? I don't know, I am actually curious.

Objects in pipes are convenient and powerful. However, for most applications of pipes they are probably overkill. If things get tough you can simply use files rather than objects.

I would not be surprised if many Windows users end up preferring bash pipes from the Ubuntu subsystem in Windows 10 over Powershell, because they are much handier for most pipe applications.



> Can you do this in Powershell quickly? I don't know, I am actually curious.

You can basically do the same thing with Powershell. I don't know of a 'built-in' module to handle XLS->CSV conversion, so you need to bring one in:

  Install-Module ImportExcel
Then:

  ls *.xlsx -r | %{ Import-excel $_ } | ? { $_.Name -eq "Miller" } | sort
That's my naive, Powershell noob approach anyway.

There's another option, however, where you can leverage Excel itself (granted, this is likely to be a Windows-only approach):

  $excel = New-Object -com excel.application
And you can now open XLS/XLSX files and operate on them as actual Excel documents (including iterating through workbooks, worksheets, etc.). It's all just objects.


Thanks for pointing this out. However, your example is not equivalent since "ls" searches only in the current directory. Nevertheless, it is good to know that it's basically possible, since that helps a lot when managing mixed Windows/Linux networks.


Not true. The -r flag I specified means "recurse": it searches the current directory and all subdirectories.


A better way to write that find command, one that also copes with file names containing spaces or quotes, is:

    find -name '*.xls' -print0 | xargs -0 xls2csv
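
To see why the NUL-delimited form matters, here is a rough sketch. The directory and file names are made up for the demo, and printf stands in for xls2csv so that it runs without that tool installed. It counts how many arguments the downstream command receives for a single file whose path contains spaces:

```shell
# Scratch directory with a hypothetical file whose path contains spaces.
dir=$(mktemp -d)
mkdir -p "$dir/sub dir"
: > "$dir/sub dir/sales 2016.xls"

# Whitespace-delimited: xargs splits the single path into several arguments.
# printf stands in for xls2csv and prints one line per argument it receives.
split_count=$(find "$dir" -name '*.xls' | xargs printf '%s\n' | wc -l | tr -d ' ')

# NUL-delimited: the whole path arrives as exactly one argument.
nul_count=$(find "$dir" -name '*.xls' -print0 | xargs -0 printf '%s\n' | wc -l | tr -d ' ')

echo "plain xargs: $split_count arguments, xargs -0: $nul_count argument"
rm -rf "$dir"
```

With plain xargs the one path is broken into pieces at each space, so a converter would be handed file names that do not exist; with -print0 and -0 it gets the path intact.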



