Sayari is a risk intelligence provider that equips the public and private sectors with visibility into complex commercial relationships. They are seeking an Entry-Level Data Engineer to join their Data team, where the role involves writing and deploying scripts, analyzing datasets, and collaborating with other engineers.
Responsibilities
- Write and deploy crawling scripts to collect source data from the web
- Write and run data transformers in Scala Spark to standardize bulk data sets
- Write and run modules in Python to parse entity references and relationships from source data
- Diagnose and fix bugs reported by internal and external users
- Analyze and report on internal datasets to answer questions and inform feature work
- Work collaboratively on and across a team of engineers using basic agile principles
- Give and receive feedback through code reviews
Skills
- Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related technical field — or equivalent hands-on experience
- Working knowledge of SQL and relational databases (such as Postgres)
- Experience writing code in Python (e.g., pandas, NumPy, Scrapy) or Java/Scala
- Familiarity with data processing frameworks like Apache Spark, or strong interest in learning them on the job
- Understanding of object-oriented programming principles and collaborative development in shared repositories
- Ability to work closely with data scientists, analysts, and engineers to help solve complex problems across large, diverse datasets
- Exposure to workflow orchestration tools such as Apache Airflow and CI/CD pipelines
- Familiarity with graph, search, or NoSQL databases
- Experience contributing to data ingestion, transformation, or ETL pipelines
- Comfort working with containerized applications (e.g., Docker)
- Experience using cloud-based data tools in AWS or GCP environments
- Introductory experience or coursework involving machine learning, especially in distributed systems like Spark
- Awareness of entity resolution concepts or interest in learning how entities are linked across data sources
- Experience working with international or non-English datasets
Benefits
- 100% fully paid medical, vision, and dental for employees and their dependents
- Generous time off: we observe all US federal holidays, close our office for a winter break (12/24-12/31), and grant 18 PTO days and 10 sick days
- Outstanding compensation package; competitive commissions for revenue roles and quarterly bonuses for non-revenue positions
- A strong commitment to diversity, equity, and inclusion
- Eligibility to participate in additional benefits such as 401k match up to 5%, 100% paid life insurance (up to $100,000 coverage), and parental leave
- A collaborative and positive culture - your team will be as smart and driven as you
- Limitless growth and learning opportunities
Company Overview
Sayari is a mission-driven company that aims to empower both the public and private sectors with the comprehensive, evidence-based model of global commercial relationships they need to safeguard their economic futures. It was founded in 2015 and is headquartered in Washington, District of Columbia, USA, with a workforce of 201-500 employees. Its website is https://sayari.com.
Company H1B Sponsorship
Sayari has a track record of offering H1B sponsorships, with 1 in 2024, 2 in 2023, and 1 in 2020. Please note that this does not guarantee sponsorship for this specific role.