gateProtect is a company providing security solutions focused on unified threat management (all-in-one firewalls).
Backend Software Engineer:
Help us write the control application of a network security device using Clojure. You are an excellent software developer, know many different paradigms from object-oriented to functional, and have used your knowledge to create complex systems in many different languages like C++, Haskell or a Lisp dialect. Prior knowledge of Clojure is not required if you know another Lisp dialect. You also know the details of low-level systems programming under Linux.
Backend Software Test Engineer:
Write automated tests in Python that check whether the production code is working. A strong understanding of network protocols, related tools and Linux is more important than excellent programming skills.
Please contact job@gateprotect.de for more details and mention Hacker News.
The "old" pf_ring is only meant for packet capture and involves packet copies, so it is several times slower than netmap. There is a newer "Direct Network Access" (DNA) version of pf_ring which avoids copies and has the same performance as netmap, but it is much more fragile: in DNA the userspace program writes directly into the NIC registers and rings (so it can crash the entire OS), whereas in netmap the NIC programming is filtered by system calls.
Crawling "only" 120k pages can be done easily with a pure Python solution over a normal home / office internet connection. The packages urllib, urllib2, robotexclusionrulesparser and lxml are a good start.
Important: Don't forget to implement a crawl rate limit.
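To make the rate limit concrete, here is a minimal sketch of a per-host delay plus a robots.txt check. It assumes Python 3 and uses the standard library's urllib.robotparser in place of the robotexclusionrulesparser package mentioned above. The names RateLimiter and allowed_by_robots and the one-second default delay are my own choices for illustration, not something from those packages.

```python
import time
import urllib.robotparser
from urllib.parse import urlparse


class RateLimiter:
    """Enforce a minimum delay between consecutive requests to the same host.

    Hypothetical helper for illustration; the clock and sleep functions are
    injectable so the behaviour can be checked without really waiting.
    """

    def __init__(self, min_delay=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock
        self._sleep = sleep
        self._last_request = {}  # host -> time of the last request

    def wait(self, url):
        """Block until the host of `url` may be contacted again.

        Returns the number of seconds slept.
        """
        host = urlparse(url).netloc
        last = self._last_request.get(host)
        delay = 0.0
        if last is not None:
            # Time still missing until min_delay has passed for this host.
            delay = max(0.0, self.min_delay - (self._clock() - last))
        if delay > 0:
            self._sleep(delay)
        self._last_request[host] = self._clock()
        return delay


def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if the given robots.txt text allows `user_agent` to fetch `url`."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

In a crawl loop you would call allowed_by_robots() on each URL, then limiter.wait(url) right before the actual download (for example with urllib.request.urlopen). Keeping the delay per host means one slow site does not hold back requests to the others.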
Thank you for posting that link to the previous HN discussion. They mention Scrapy http://scrapy.org/ and I looked at it. I like that it is Python based, and the tutorial is very good. They even have a shell to test XPath selectors. Now I have a better understanding of the process. Of course, it is not like filling in a form, as is the case with 80legs, but I am having fun working through the tutorial. I also ran a couple of small jobs with 80legs, but I am unable to see the results. I guess 80legs would be good for huge projects. In any case, I will try to work with both. Thanks again.
http://gateprotect.com/en-GB/company/jobs.html