Senior Data Engineer

Job description

At XITE we are about to scale fast, and we are looking for a passionate Senior Data Engineer to take the lead role in our Data Engineering team. You will be responsible for the overall performance of the Data Stack and for the integrity of its data pipelines, messaging platform, and data stores. Our data includes large advertising and interaction logs, features for personalization algorithms, and music video metadata. Our Data Stack includes Kafka, ClickHouse, Superset, the Elastic Stack, and Cassandra. To accomplish your goals, you will manage a small engineering team and work closely with other technical teams within XITE.

Responsibilities

  • Act as a technical leader for developing the Data Stack and resolving problems, with end-to-end ownership of data quality in our core datasets and data pipelines

  • Manage, mentor, and coach colleagues, steering them through technical challenges

  • Design, test, install and maintain highly scalable and data-intensive systems

  • Orchestrate data projects such as real-time data exchange with third parties, and data migration projects

  • Translate software designs and user requirements into specific data models that are efficient, scalable, and easy to work with

  • Identify and solve issues with data pipelines regarding consistency, integrity, and completeness

  • Review, maintain and extend distributed systems in production, and support other teams in using and integrating with those systems

  • Maintain the technical excellence of the Data Engineering team

  • Drive a culture of data quality and best practices across the business, advocating for Data Engineering with both technical and non-technical audiences

Requirements

  • Proven professional experience as a Data Engineer or related position, working with systems and data infrastructure at scale

  • Proficiency in Python; Scala proficiency is a plus

  • Experience designing and building large-scale data and ETL pipelines in distributed environments with technologies such as Kafka, ClickHouse, the Elastic Stack, Cassandra, and Spark

  • Experience optimizing data models, pipelines and procedures for performance, cost, and usability

  • Knowledge of the main architecture models and concepts, such as replication, sharding, consistency, horizontal and vertical scaling, quorum, and idempotency

  • Experience in supervising and mentoring team members

  • Ability to drive and lead projects from a technical perspective

  • Understanding of basic analytics and machine learning concepts

  • Preferably a university degree in Software Engineering or another relevant field

  • Excellent written and spoken communication and stakeholder management skills

Preferred skills and tool experience

  • Production-level experience with a multi-regional Kafka platform (brokers, connectors, mirrors).

  • Proven experience with SQL, NoSQL and OLAP databases, preferably PostgreSQL, Cassandra, and ClickHouse or another OLAP database such as Druid or Pinot.

  • Pipeline orchestration with Apache Airflow or a similar tool such as Luigi.

  • Google Cloud environment.

  • Docker and Kubernetes.

  • Distributed processing frameworks like Apache Spark, Dask or Hadoop.

  • Infrastructure provisioning automation tools such as Ansible and Terraform.

  • Proficiency with the PyData stack is a plus.

  • Production-level experience with the Elastic Stack (Elasticsearch, Logstash, and Kibana) is a plus.

What do you get out of it?

At XITE we make sure you’re taken care of by giving you the opportunity to develop your career in a young, fast-growing, international company. We provide a challenging work environment where professionalism, initiative and creativity are a must. We offer a unique combination of music, television and new technology. And let’s not forget: we have chef-prepared lunches, Friday afternoon drinks and rooftop parties!

Do you feel up for the challenge? Then hit that ‘Apply for this Job’ button!