AWS Data Engineer

Job Number

190035

Location

The Hague

Offer Salary

Highly market-competitive salary

Job Posted

12 January 2023

  • As an AWS Data Engineer, you will help build out the cloud platform for business intelligence and analytics applications for the entire organization. This platform is central to the company’s strategy for expanding its business intelligence activities. Additionally, the platform and its corresponding big data lake form the backbone of ongoing data science and machine learning efforts.

    This is a 100% hands-on role.

    Key Responsibilities
    • Work with stakeholders and vendors to identify new opportunities for ingesting, processing, and displaying valuable data to inform and drive business decisions across the organization
    • Build reusable pipelines, tools, and applications to support more than 100 technical and business users on an AWS cloud platform
    • Handle big data ETL (>1 TB per day) while maintaining high data quality and a low cost of processing
    • Maintain, update, and improve the existing platform to ensure continuing regulatory compliance, including network security, data quality, and GDPR
    • Migrate legacy applications from on-prem deployments while improving data quality and reducing total cost of ownership

    Must-Haves for the Role
    • 5+ years of professional experience with the Python language
    • 5+ years of experience working on AWS, using services such as:
      ◦ AWS Glue (PySpark jobs and the Data Catalog)
      ◦ Amazon Redshift
      ◦ AWS Lambda
      ◦ AWS CodeCommit, CodeBuild, and CodePipeline (CI/CD tools)
    • 5+ years deploying cloud infrastructure with infrastructure-as-code (IaC) tools such as AWS CloudFormation, AWS CDK, or Terraform
    • Strong knowledge of data querying languages, including the SQL dialects used by Redshift, Athena, RDS, and Hive
    • Strong experience building reusable data pipelines and deploying via CI/CD
    • Strong experience writing reusable Python packages/software
    • Extensive experience with Amazon SageMaker notebooks, including the Glue and PySpark kernels
    • Extensive experience with Alembic database migrations and automation of other database administration tasks
    • Experience collecting source data from a variety of systems to perform ETL, including SFTP, REST APIs, Amazon AppFlow, and AWS DMS
    • Experience migrating legacy systems, such as Hive/Hadoop and Teradata, to the cloud

    Nice-to-Haves for the Role
    • Knowledge of AWS data lake tools, including AWS Lake Formation and Amazon EMR
    • Familiarity with GDPR and encryption of personally identifiable information (PII) at the dataset/column level
    • Experience with data quality monitoring tools, including Deequ, Great Expectations, and/or Amazon CloudWatch
    • Experience with AWS data visualization tools, including Amazon QuickSight

    If you are interested in this position, please send your CV with motivation to Ray Parker at Shuter Smith International – ray@shutersmith.com