Associate Data Engineer
Candid
Data Science
United States
USD 70k-95k / year
Position summary
Candid is a nonprofit that provides the most comprehensive data and insights about the social sector. We get you the information you need to do good. Candid currently has an opportunity for an Associate Data Engineer. The Associate Data Engineer supports the day-to-day operations of Candid’s cloud data platform. This role is responsible for maintaining ingestion and transformation pipelines built on Apache Iceberg, validating data outputs after schema and structural changes, and assisting with storage management, platform observability, and metadata operations. The Associate Data Engineer develops foundational skills across the modern data lakehouse stack while taking direct ownership of pipeline maintenance, documentation, and validation activities.
Position: Associate Data Engineer
Reporting to: Data Operations Manager
Supervises: N/A
Schedule: 35-hour work week, Monday through Friday
Compensation: $70,000 - $95,000 (this range is for the NYC area and will be adjusted for other localities; additionally, factors like skills and experience will be considered).
Location: Remote. In-person attendance is expected twice per year for our weeklong all-staff summits. Additional in-person meeting participation is expected at least once per quarter for senior leaders and at least once per month for the executive team. Staff not located in the NYC area are expected to travel for these meetings.
Benefits: Health insurance (medical, dental, vision), retirement contribution with additional option for a match, paid life insurance and AD&D, paid leave time (PTO, compassionate leave, volunteer, holiday, parental), short-term and long-term disability, pre-tax transit, flexible spending accounts, supplemental insurance, summer hours, and employment that qualifies for the Public Service Loan Forgiveness (PSLF) program.
Responsibilities
- Pipeline Maintenance, Documentation, & Validation: Serve as the primary owner of ingestion pipelines and transformation table adjustments. Ensure continued, reliable data delivery and apply routine changes as business and schema needs evolve. Validate transformation outputs against expected results after schema or structural changes, documenting findings and escalating anomalies to the appropriate teams.
- Storage & Platform Support: Assist with scheduling compaction and cleanup jobs to maintain Iceberg table health and query performance. Support partition evolution and snapshot retention management to control storage growth.
- Observability & Metadata: Assist in implementing and maintaining CloudWatch metrics, alarms, and dashboards to ensure pipeline visibility. Contribute to tracking and reporting on platform performance metrics. Help maintain AWS Glue metadata refresh and statistics jobs that support query planning and optimization within the data platform.
- Schema Coordination: Assist with coordinating schema changes across ingestion and transformation layers to maintain consistency end to end. Collaborate with the Data Operations Engineer to communicate impacts and sequence changes safely.
- Infrastructure & Security: Support and maintain role-based (RBAC) and attribute-based (ABAC) access controls, following least privilege, standardized roles, and consistent tagging. Participate in access reviews and audits, documenting changes and escalating risks as needed.
Requirements
- 1–3 years of experience in data engineering, analytics engineering, or a closely related technical role; internships and relevant academic project work considered.
- Solid SQL skills, including writing, reading, and debugging queries against relational or columnar data stores.
- Familiarity with cloud data concepts: object storage (e.g., Amazon S3), columnar file formats (e.g., Parquet), data-interchange formats (e.g., JSON, XML), or open table formats (e.g., Apache Iceberg).
- Experience with or exposure to distributed SQL query engines such as Trino or Starburst.
- Familiarity with AWS services such as S3 and Glue.
- Exposure to Apache Airflow, SSIS, or another workflow orchestration platform.
- Experience writing or maintaining data pipelines in Python.
- Familiarity with on-prem relational data systems (e.g., Microsoft SQL Server).
- Strong attention to detail, especially around data validation and output accuracy.
- Strong analytical and problem-solving skills.
- Excellent written and verbal communication skills; ability to document findings clearly for both technical and non-technical audiences.
- Ability to work independently and collaboratively as part of a distributed team.
- Willingness to perform other duties and special projects as needed/requested.
- Sensitivity to and respect for racial, gender, sexual orientation, and cultural differences.
- Champions and represents Candid’s core values: We’re driven, direct, accessible, curious, and inclusive.
Candid’s mission is to get you the information you need to do good.
The world’s problems are only growing, and change can’t wait. Nonprofits are needed now more than ever, but all too often their work goes without adequate support.
Candid makes it easier and faster for nonprofits and funders to connect in pursuit of solutions to change the world. Candid is where nonprofits find grants, donors find nonprofits that inspire them, and all can gain insights about the work being done for good.
Candid is a qualifying nonprofit organization as defined by the Public Service Loan Forgiveness Program. As such, Candid employees may claim their employment time on their PSLF application. We offer a competitive salary and excellent benefits. Due to the high volume of applicants we typically receive, we regret that we can only contact candidates we would like to interview.
For more information on positions available at Candid, please visit the "Work with us" page on our website.
Candid is an equal opportunity employer. Candid provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws.
This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.