Lead Big Data Engineer Remote

Lead Big Data Engineer Description

Job #: 50676
Striving for excellence is in our DNA. Since 1993, we have been helping the world’s leading companies imagine, design, engineer, and deliver software and digital experiences that change the world. We are more than just specialists; we are experts.

A remote Lead Big Data Engineer is needed. This position is a part of our new EPAM Anywhere program for remote workers. EPAM Anywhere offers a variety of IT jobs for remote workers. Join us to work on ambitious and long-term projects, get a stable workload, and enjoy a work-life balance!

Project technologies and tools

  • Programming Languages: Java/Scala/Python/SQL/Bash
  • Big Data stack: Hadoop, YARN, HDFS, MapReduce, Hive, Spark, Kafka, Flume, Sqoop, ZooKeeper
  • NoSQL: Cassandra/HBase/MongoDB
  • Queues and Stream processing: Kafka Streams; Flink; Spark Streaming; Storm; Event Hub; IoT Hub MQTT; Storage Queues; Service Bus; Stream Analytics
  • Data Visualization: Tableau/QlikView
  • ETL & Streaming Pipelines: Pentaho; Talend; Apache Oozie, Airflow, NiFi; StreamSets
  • Operation: Cluster operation, Cluster planning
  • Search: Solr, Elasticsearch/ELK
  • InMemory: Ignite, Redis
  • Solid Cloud experience with 2 or more leading cloud providers (AWS/Azure/GCP): Storage; Compute; Networking; Identity and Security; NoSQL; RDBMS and Cubes; Big Data Processing; Queues and Stream Processing; Serverless; Data Analysis and Visualization; ML as a service (SageMaker; TensorFlow)
  • Enterprise Design Patterns (ORM, Inversion of Control etc.)
  • Development Methods (TDD, BDD, DDD)
  • Version Control Systems (Git, SVN)
  • Testing: Component/Integration Testing, Unit Testing (JUnit)
  • Deep understanding of SQL queries, joins, stored procedures, relational schemas; SQL optimization
  • Experience with various messaging systems, such as Kafka, ZeroMQ/RabbitMQ
  • REST, Thrift, gRPC, SOAP
  • Build Systems: Maven, SBT, Ant, Gradle
  • Docker, Kubernetes, YARN, Mesos

Responsibilities

  • Lead, design and implement innovative analytical solutions using Hadoop, NoSQL and other Big Data related technologies, evaluating new features and architecture in cloud, on-premises and hybrid solutions
  • Work with product and engineering teams to understand requirements, evaluate new features and architecture to help drive decisions
  • Build collaborative partnerships with architects and key individuals within other functional groups
  • Perform detailed analysis of business problems and technical environments and use it to design quality technical solutions
  • Actively participate in code reviews and test solutions to ensure they meet best-practice specifications
  • Build and foster a high-performance engineering culture, mentor team members and provide the team with tools and motivation
  • Write project documentation

Requirements

  • More than 5 years' experience in software development with Big Data technologies (e.g. administration, configuration management, monitoring, debugging and performance tuning)
  • Engineering experience and practice in Data Management, Data Storage, Data Visualization, Disaster Recovery, Integration, Operation, Security
  • Strong experience building data ingestion pipelines (simulating Extract, Transform, Load workloads) and with Data Warehouse or Database architecture
  • Strong experience with data modeling; hands-on development experience with modern Big Data components
  • Cloud: experience in designing, automating, provisioning, deploying and administering scalable, available and fault-tolerant systems
  • Good understanding of CI/CD principles and best practices
  • Analytical approach to problem-solving with an ability to work at an abstract level and gain consensus; excellent interpersonal, leadership and communication skills
  • Data-oriented mindset and awareness of compliance requirements, such as PI, GDPR, HIPAA
  • Motivated, independent, efficient and able to handle several projects; work under pressure with a solid sense for setting priorities
  • Ability to work in a fast-paced (startup-like) agile development environment
  • Strong experience in high-load and IoT Data Platform architectures and infrastructures
  • Extensive experience with containers and resource management systems: Docker, Kubernetes, YARN
  • Experience in direct customer communications
  • Experience in technology and team leadership of data-oriented projects
  • Solid skills in infrastructure troubleshooting, support and practical experience in performance tuning and optimization, bottleneck problem analysis
  • Experienced in different business domains
  • English proficiency – B2 or higher
  • Advanced understanding of distributed computing principles

We offer

  • Competitive compensation depending on experience and skills
  • Long-term work on enterprise-level projects
  • Full-time remote work (you can work from anywhere you are)
  • Unlimited access to learning courses (LinkedIn Learning, EPAM training courses, regular English classes, Internal Library)
  • Community of 30,100+ of the industry’s top professionals
