Onsite Senior Site Reliability Engineer - Paris - FR

Ref. CS-backend-2015-08-SRO
URL http://www.contentsquare.com/en/jobs/#senior-site-reliabilit...

CONTEXT:

  Content Square is one of today's fastest-growing companies, 
  deploying many analytics tools through a critical data 
  pipeline. This infrastructure needs to remain highly reliable 
  and available, with minimal downtime.

RESPONSIBILITIES:

  - Build and maintain alerting tools, metrics, and methodologies to 
    reduce downtime.
  - Ensure production-ready applications fit the expected availability 
    constraints.
  - React to system inefficiencies and resolve issues quickly to ensure 
    system availability and performance.
  - Troubleshoot and track down performance, load, networking, I/O 
    and memory problems.
  - Coordinate engineering and external communications.

REQUIREMENTS:

  - 5+ years of experience with Linux system administration.
  - Experience monitoring systems with tools such as Nagios, Icinga, 
    Shinken or OpenTSDB, and writing health checks.
  - Interest in learning and managing newer technologies like Spark, 
    Hadoop, Elasticsearch, Kafka…
  - Experience with a classic network stack: CDN, DNS, load balancers, 
    TCP/IP...
  - Good understanding of data durability (backups, maximum time to 
    recovery, and generally how to avoid losing data at all costs).

PREFERRED:

  - Experience with system management tools like Puppet or Chef.
  - Experience with Scala and/or the JVM.