
He actually has a point there. There are so many different versions of "CSV" floating around that I'm not at all sure I'd want to deal with a parser that could handle most of them. Ever generated a CSV file from a spreadsheet or DB interface program? Did it have a big list of options on how the CSV would be formatted, so you could easily read the generated file into whatever downstream you were using?

Yeah.



> I'm not at all sure I'd want to deal with a parser that could handle most of them.

Python's CSV parser will handle almost anything you throw at it and it is widely used to great success.

> Ever generated a CSV file from a spreadsheet or DB interface program? Did it have a big list of options on how the CSV would be formatted, so you could easily read the generated file into whatever downstream you were using?

Just about every single CSV file that I've ever had to read was generated by someone other than me. Frequently (but not always), they come from a non-technical person.

Sometimes those CSV files even have NUL bytes in them. Yeah. Really. I swear. It's awful and Python's CSV parser fell over when trying to read them. (You can bet that my parser won't.)
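For what it's worth, one common workaround (not the parser mentioned above, just a minimal sketch) is to strip the NUL bytes before the text reaches Python's csv module. Depending on the Python version, the csv module may reject input containing NUL characters. The function name `read_csv_without_nuls` is made up for illustration, and silently dropping NULs is an assumption that may not suit every file.

```python
import csv
import io


def read_csv_without_nuls(text):
    """Parse CSV text after removing NUL characters (hypothetical helper).

    Assumption: the NUL bytes are junk and can be dropped safely.
    """
    # csv.reader accepts any iterable of strings, so clean each line lazily.
    cleaned = (line.replace("\x00", "") for line in io.StringIO(text))
    return list(csv.reader(cleaned))


rows = read_csv_without_nuls("a,b\x00\n1,2\n")
print(rows)
```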

> He actually has a point there.

His point is to use regexes instead of a proper CSV parser. I'm hard-pressed to think of a reason to ever do such a thing:

1. A regex is much harder to get correct than using a standard CSV parser.

2. A regex will probably be slower than a fast CSV parser.
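To illustrate the first point, here is a rough sketch comparing a naive regex split with Python's csv module on a row that contains a quoted comma and an escaped quote. The sample data and the function names `parse_with_regex` and `parse_with_csv` are invented for this example; a more careful regex could do better, but it gets complicated quickly.

```python
import csv
import io
import re

# Made-up sample: the second row has a comma inside quotes and a doubled quote.
SAMPLE = 'name,quote\n"Smith, Jane","She said ""hi"""\n'


def parse_with_regex(text):
    # Naive approach: split every line on commas, ignoring quoting entirely.
    return [re.split(r",", row) for row in text.splitlines()]


def parse_with_csv(text):
    # The csv module understands quoted fields and doubled quotes.
    return list(csv.reader(io.StringIO(text)))


print(parse_with_regex(SAMPLE)[1])  # the quoted comma splits the first field in two
print(parse_with_csv(SAMPLE)[1])    # two fields, with quoting handled
```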


"Python's CSV parser will handle almost anything you throw at it and it is widely used to great success."

I like the word "almost". It's kept me in cheesy-puffs for years, now. :-)

I was actually only speaking to the post I replied to. CSV is a mess.


A good CSV parser (Python's, for example) lets you specify the dialect of the CSV file; it doesn't make assumptions about the file's format.
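As a small sketch of what that looks like with Python's csv module: you can register a dialect and pass it to the reader. The dialect name "semicolon", the sample data, and the helper `read_rows` are assumptions made up for this example, not anything from a real file.

```python
import csv
import io

# Made-up sample: semicolon-delimited, single-quoted fields.
SAMPLE = "id;name;note\n1;'Smith; Jane';ok\n"

# Register a named dialect; "semicolon" is a hypothetical name.
csv.register_dialect("semicolon", delimiter=";", quotechar="'")


def read_rows(text, dialect):
    # Parse the text using whichever dialect the caller specifies.
    return list(csv.reader(io.StringIO(text), dialect=dialect))


for row in read_rows(SAMPLE, "semicolon"):
    print(row)
```

The same text read with the default "excel" dialect would come back as one field per line, since it contains no commas.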



