We are seeking a pragmatic and skilled Senior Data Engineer to join a growing team focused on stabilizing and evolving core data infrastructure. This role will play a critical part in enabling machine learning and analytics by improving data quality, reliability, and scalability at the foundational level.
Many strategic initiatives, including those involving AI/ML, are currently blocked by foundational data challenges such as:
Inconsistent or poorly structured data
Manual, fragile workflows
Schema limitations and lack of observability
Low maturity DevOps/data platform tooling
A hands-on, systems-oriented senior engineer will bring the right level of expertise and ownership to address these core issues. In this role, you will:
Refactor and evolve the data schema (e.g., PostgreSQL) to support scalability and data integrity
Build, optimize, and maintain batch and streaming pipelines using tools such as Airflow, Kafka, or equivalent
Develop reliable derived datasets to support analytics and reporting
Enhance data validation, observability, and logging across all pipelines
Support clean, structured inputs for downstream AI/ML applications
Collaborate with backend engineering to integrate data solutions into monolithic or microservice-based architectures
Contribute to internal data documentation and enforce engineering best practices
Required qualifications:
Strong proficiency in Python (e.g., Pandas, SQLAlchemy, PySpark) and SQL
Hands-on experience with PostgreSQL, including schema design, partitioning, and performance tuning
Practical experience deploying and maintaining data pipelines in production environments
Familiarity with ETL/ELT orchestration tools such as Airflow or dbt
Experience working with streaming data platforms (e.g., Kafka, Pub/Sub)
Comfort working in low-maturity environments that may lack CI/CD, GitHub Enterprise, or monitoring setups
Exposure to data validation tools (e.g., Great Expectations) and observability stacks (e.g., Grafana, DataDog)
Awareness of working in regulated data environments (e.g., HIPAA, GDPR)
Nice to have:
Experience in healthcare, mental health tech, or other regulated industries
Familiarity with Django or integration with monolithic web frameworks
Experience supporting data operations in early-stage or startup environments
Infrastructure-as-code familiarity and experience with tools like Docker
The ideal candidate:
Thrives in ambiguity and incomplete systems
Enjoys untangling messy data and building durable solutions
Prioritizes doing things right over using trendy tech
Comfortable working independently and growing into broader ownership over time
What the work looks like:
Re-architecting a basic Postgres-based data layer (not Snowflake-scale)
Writing ingestion pipelines and resolving data inconsistencies
Introducing CI/CD for data jobs and building foundational monitoring/logging
Collaborating cross-functionally with AI, backend, and infrastructure teams
Laying the groundwork for scalable, production-ready systems
This role may not be the right fit if you:
Expect a fully built modern data stack (e.g., Snowflake, data mesh, mature abstractions)
Have only worked in highly mature data environments and need heavy tooling support
Are focused primarily on ML or analytics rather than infrastructure and orchestration
Struggle to collaborate with shared DevOps/platform teams or require full ownership of production environments