At XITE we are about to scale fast, and we are looking for a passionate Senior Data Engineer to take the lead role in our Data Engineering team. You will be responsible for the overall performance of the Data Stack and for the integrity of the data pipelines, messaging platform, and data stores. These include large advertising and interaction logs, features for personalization algorithms, and music video metadata. Our Data Stack includes Kafka, ClickHouse, Superset, the Elastic Stack, and Cassandra. To accomplish your goals you will manage a small engineering team and work closely with other technical teams within XITE.
Act as a technical leader in developing the Data Stack and resolving problems, with end-to-end ownership of data quality in our core datasets and data pipelines
Manage, mentor, coach, and steer colleagues through technical challenges
Design, test, install and maintain highly scalable and data-intensive systems
Orchestrate data projects such as real-time data exchange with third parties, and data migration projects
Translate software designs and user requirements into specific data models that are efficient, scalable, and easy to work with
Identify and solve issues with data pipelines regarding consistency, integrity, and completeness
Review, maintain and extend distributed systems in production, and support other teams in using and integrating with those systems
Maintain the technical excellence of the Data Engineering team
Drive a culture of data quality and best practices across the business, advocating for Data Engineering with both technical and non-technical audiences
Proven professional experience as a Data Engineer or related position, working with systems and data infrastructure at scale
Proficient in Python; Scala proficiency is a plus
Experience designing and building large-scale data and ETL pipelines in distributed environments with technologies such as Kafka, ClickHouse, Elastic, Cassandra, and Spark
Experience optimizing data models, pipelines and procedures for performance, cost, and usability
Knowledge of the main architecture models and concepts, such as replication, sharding, consistency, horizontal and vertical scaling, quorum, and idempotency
Experience in supervising and mentoring team members
Able to drive and take the lead in projects from a technical perspective
Understanding of basic analytics and machine learning concepts
Preferably a university degree in Software Engineering or another relevant field
Excellent communication (written and spoken) and stakeholder management
Preferred skills and tool experience include:
Production-level experience with a multi-regional Kafka platform (brokers, connectors, mirrors).
Proven experience with SQL, NoSQL and OLAP databases, preferably PostgreSQL, Cassandra, and ClickHouse or another OLAP database such as Druid or Pinot.
Pipeline orchestration with Apache Airflow or other tools like Luigi.
Google Cloud environment.
Docker and Kubernetes.
Distributed processing frameworks like Apache Spark, Dask or Hadoop.
Infrastructure provisioning automation tools such as Ansible and Terraform.
Proficiency with the PyData stack is a plus.
Production-level experience with the Elastic Stack (Elasticsearch, Logstash, and Kibana) is a plus.
What do you get out of it?
At XITE we make sure you’re taken care of by giving you the opportunity to develop your career in a young, fast-growing (international) company. We provide a challenging work environment where professionalism, initiative and creativity are a must. We offer a unique combination of music, television and new technology. And let’s not forget: we have chef-prepared lunches, Friday afternoon drinks and rooftop parties!
Do you feel up for the challenge? Then hit that ‘Apply for this Job’ button!