Senior Data Engineer – Spark

Position Description

Sayari is looking for a mid-level to senior Data Engineer to join our Infrastructure team located in Washington, DC. The Infrastructure team is an integral part of our Engineering division and works closely with our Software Engineering & Data Science teams, as well as other key stakeholders across the business.


What You Will Do:
As a member of Sayari’s engineering team, you will help build the next generation of our entity resolution engine. This work will involve quickly getting up to speed on the existing production engine (implemented in Apache Spark) and then focusing on improvements that will take it to the next level. This is a cutting-edge problem with no definitive solution, and you’ll have the opportunity to implement schemes described in the academic literature and bring your own insights to the table.
What You Will Need:
  • Strong experience with any two of: Python, Java, Scala
  • 2+ years of experience using Apache Spark
  • Solid experience with a NoSQL database like Cassandra or Neo4j
  • Experience working on a cloud platform like GCP, AWS, or Azure
  • Experience working collaboratively with git
What We Would Like:
  • Experience with, or interest in, graph databases
  • Experience with entity resolution/record linkage techniques (e.g., similarity joins, blocking schemes, string similarity)
  • Experience with data orchestration frameworks like Apache Airflow
Who You Are:
  • Strong process-oriented self-starter, with impeccable organizational skills
  • Experienced in supporting and working with cross-functional teams in a dynamic environment
  • Interested in learning from and mentoring team members
  • Passionate about open source development and innovative technology
What We Offer:
  • Limitless growth and learning opportunities
  • A collaborative and positive culture – your team will be as smart and driven as you
  • A strong commitment to diversity, equity & inclusion
  • Highly competitive compensation and a comprehensive benefits package, including full healthcare coverage plans, commuter benefits, 401(k) matching, generous vacation, and a variety of other benefits

Sayari is a venture-backed and founder-led global corporate data provider and commercial intelligence platform, serving financial institutions, legal and advisory service providers, multinationals, journalists, and governments. Thousands of analysts and investigators in over 30 countries rely on our products to safely conduct cross-border trade, research front-page news stories, confidently enter new markets, and prevent financial crimes such as corruption and money laundering.

We at Sayari define our culture by a dedication to our mission of using open data to prevent illicit commercial and financial activity, a passion for finding novel approaches to complex problems, and an understanding that diverse perspectives create optimal outcomes. We embrace cross-team collaboration and open source work, encourage training and learning opportunities, and reward initiative and innovation. If you like working with supportive, high-performing, and curious teams, Sayari is the place for you.

Sayari maintains a graph of global corporate information containing hundreds of millions of entities and relationships. The structures and techniques that we use to store the information in this graph are critical to allowing our clients to benefit from this unique dataset. The application that serves this graph to our clients is powered by several databases fed by ETL pipelines that run in Apache Spark.

We are continually looking to expand our tech stack with new and innovative technologies. Have an idea for something cutting edge? Come talk to us about it.

Application Instructions:

We will only consider complete applications. Please fill out the form below in full to be considered.

  • Resume & any salary requirement
  • Optional: Brief note to highlight relevant experience or skills
  • Optional: Share links to any public repos of your previous work

Apply Now:

Sayari is an equal opportunity employer and strongly encourages diverse candidates to apply. We believe diversity and inclusion mean our team members should reflect the diversity of the United States. No employee or applicant will face discrimination or harassment based on race, color, ethnicity, religion, age, gender, gender identity or expression, sexual orientation, disability status, veteran status, genetics, or political affiliation. We strongly encourage applicants of all backgrounds to apply.