Lead Machine Learning Ops Engineer
Company: The Friedkin Group
Location: Cypress
Posted on: October 20, 2024
Job Description:
Living Our Values
All associates are guided by Our Values. Our
Values are the unifying foundation of our companies. We strive to
ensure that every decision we make and every action we take
demonstrates Our Values. We believe that putting Our Values into
practice creates lasting benefits for all of our associates,
shareholders, and the communities in which we live.
Why Join Us
- Career Growth: Advance your career with opportunities for
leadership and personal development.
- Culture of Excellence: Be part of a supportive team that values
your input and encourages innovation.
- Competitive Benefits: Enjoy a comprehensive benefits package
that looks after both your professional and personal needs.
Total Rewards
Our Total Rewards package underscores our commitment to
recognizing your contributions. We offer a competitive and fair
compensation structure that includes base pay and performance-based
rewards. Compensation is based on skill set, experience,
qualifications, and job-related requirements. Our comprehensive
benefits package includes medical, dental, and vision insurance,
wellness programs, retirement plans, and generous paid leave.
Discover more about what we offer by visiting our Benefits page.
A Day In The Life
As a Lead Machine Learning Ops Engineer, you will
play a pivotal role in implementing DevOps and ML Ops practices
within the Corporate Data & Analytics Team to support AI/ML
application enablement across The Friedkin Group of companies. Your
primary responsibility will be to drive the adoption of best
practices in DevOps and ML Ops, accelerating the deployment of
AI/ML and data-driven solutions that meet our business needs. We
seek a motivated and skilled individual with a strong background in
DevOps and ML Ops, a deep understanding of Infra Ops, and solid
knowledge of AI/ML data and analytics cloud services and
components. You will collaborate closely with data scientists,
machine learning engineers, data engineers, software engineers, and
platform architects, utilizing the latest tools and technologies to
deploy and maintain AI/ML and advanced analytics solutions, as well
as integrate analytic models with existing business applications.
As a Lead Machine Learning Ops Engineer you will:
- Develop automated build and deployment processes to enable
continuous delivery of software releases, and enhance the existing
CI/CD pipelines for AI/ML application development and
deployment.
- Collaborate with data scientists, data engineers, data
analysts, software engineers, IT specialists, and stakeholders to
accelerate deployment of AI applications via CI/CD pipelines and
maintain the SLAs of those applications at the centralized
platform.
- Design, develop, and maintain infrastructure using
infrastructure-as-code tools such as Terraform, Ansible, and
CloudFormation.
- Templatize existing Databricks CLI code to manage the Databricks
platform as code for AI/ML data pipelines (batch processing, batch
streaming, and streaming) and model serving endpoints.
- Enhance the existing DevOps practices to improve the overall
AI/ML application development lifecycle.
- Work closely with cross-functional teams to ensure that
applications are highly available and scalable.
- Collaborate with development teams and the cloud platform team to
ensure that infrastructure meets the requirements of the
application.
- Establish and maintain best practices for cloud security,
compliance, and cost optimization.
What We Need From You
- Bachelor's degree in Computer Science, Computer Engineering,
Information Technology, Software Engineering, or an equivalent
technical discipline and 10+ years of experience in software
engineering with a strong background in DevOps and Infrastructure
as Code, with experience supporting Machine Learning and Data
Science workloads preferred; or
- Master's degree in Computer Science, Computer Engineering,
Information Technology, Software Engineering, or an equivalent
technical discipline and 5+ years of experience in software
engineering with a strong background in DevOps and Infrastructure
as Code, with experience supporting Machine Learning and Data
Science workloads preferred.
- Expertise in code versioning tools such as GitLab, GitHub,
Azure DevOps, and Bitbucket (GitHub preferred), and familiarity with
branch-level code repository management.
- Experience deploying Machine Learning solutions on cloud
platforms (e.g., AWS, Azure, or GCP); Databricks and AWS
preferred.
- Proficiency with GitHub Actions to automate testing and
deployment of data and ML workloads from the CI/CD provider to
Databricks.
- Strong knowledge of infrastructure automation tools such as
Terraform, Ansible, and CloudFormation.
- Experience with data processing frameworks, tools, and platforms
such as Databricks, Apache Spark, Kafka, Flink, and AWS cloud
services for batch processing, batch streaming, and streaming.
- Experience containerizing analytical models using Docker and
Kubernetes or other container orchestration platforms.
- Technical expertise across all deployment models on public
cloud, private cloud, and on-premises infrastructure.
- Experience in event-driven and microservice architectures for
enterprise-level platform development.
- Expertise in Linux and knowledge of networking and security
concepts.
- Effective communication skills and a sense of ownership and
drive.
- Capable of coaching/mentoring individuals and teams.
Physical and Environmental Requirements
The physical requirements described
here are representative of those that must be met by an associate
to successfully perform the essential functions of the job. While
performing the duties of the job, the associate is required on a
daily basis to analyze and interpret data, communicate, and remain
in a stationary position for a significant amount of the work day
and frequently access, input, and retrieve information from the
computer and other office productivity devices. The associate is
regularly required to move about the office and around the
corporate campus. The associate must frequently move up to 10
pounds and occasionally move up to 25 pounds.
Travel Requirements
20%
The associate is occasionally required to travel to
other sites, including out-of-state, where applicable, for
business.
Join Us
The Friedkin Group and its affiliates are
committed to ensuring equal employment opportunities, including
providing reasonable accommodations to individuals with
disabilities. If you have a disability and would like to request an
accommodation, please contact us at . We celebrate diversity and
are committed to creating an inclusive environment for all
associates.
We are seeking candidates legally authorized to work in
the United States without sponsorship.
Keywords: The Friedkin Group, Pearland, Lead Machine Learning Ops Engineer, Engineering, Cypress, Texas