Published 2024-03-15

HPC Engineer, Machine Learning Infrastructure


Company details

Type of job: Remote
Country: France
City: Paris
Company: Hugging Face


Description of the offer

Here at Hugging Face, we’re on a journey to advance good Machine Learning and make it more accessible. Along the way, we contribute to the development of technology for the better.

We have built the fastest-growing open-source library of pre-trained models in the world. With more than 1 million models and 320K+ stars on GitHub, Hugging Face technology is used in production by over 15,000 companies, including leading AI organizations such as Google, Elastic, Salesforce, Grammarly, and NASA.

 

About the role:

We are looking for an HPC Engineer responsible for developing and scaling our large distributed cluster. The ideal candidate will have experience provisioning large compute clusters for AI workflows and a strong track record of helping teams establish best practices for reliability and scalability.

Responsibilities

  • Design, develop, deploy, and maintain reliable and scalable infrastructure that enables efficient training workloads.
  • Manage large compute clusters for AI training and development.
  • Create tooling and infrastructure that abstract compute and storage in ML workflows.
  • Measure and optimize system performance.
  • Monitor and troubleshoot infrastructure issues, ensuring high availability and performance of AI workloads.
  • Stay up to date with the latest advancements in AI infrastructure technologies and recommend improvements to enhance system efficiency and performance.
  • Work closely with AI software engineering teams to ensure infrastructure can handle all system requirements.
  • Provide primary operational support and engineering for multiple teams.

Qualifications:

  • 7+ years of experience in a DevOps or Infrastructure Engineer role building machine learning infrastructure and working with large GPU clusters.
  • Knowledge of cloud providers such as AWS and GCP, infrastructure-as-code frameworks, and observability tools.
  • Familiarity with the Python scientific stack and PyTorch.
  • Experience with data structures, data modeling, and database management as well as object and file storage systems.
  • Strong communication, collaboration, and documentation skills.
  • Experience with Linux, Git, containers, networking and command line tools.
  • Strong programming skills in Python, Golang, and/or Rust.

About you:

If you are a passionate HPC Engineer with a keen interest in AI and thrive in a challenging and innovative setting, we would love to hear from you. Join our team and contribute to the advancement of AI technologies while working alongside talented professionals in a collaborative and stimulating environment.

 

More about Hugging Face

We are actively working to build a culture that values diversity, equity, and inclusivity. We are intentionally building a workplace where people feel respected and supported, regardless of who they are or where they come from. We believe this is foundational to building a great company and community. Hugging Face is an equal opportunity employer and we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

We value development. You will work with some of the smartest people in our industry. We are an organization that has a bias for impact and always challenges itself to keep growing. We provide all employees with reimbursement for relevant conferences, training, and education.

We care about your well-being. We offer flexible working hours and remote options. We offer health, dental, and vision benefits for employees and their dependents. We also offer 12 weeks of parental leave (20 for birthing mothers) and unlimited paid time off.

We support our employees wherever they are. While we have office spaces in NYC and Paris, we’re very distributed and all remote employees have the opportunity to visit our offices. If needed, we’ll also outfit your workstation to ensure you succeed.

We want our teammates to be shareholders. All employees have company equity as part of their compensation package. If we succeed in becoming a category-defining platform in machine learning and artificial intelligence, everyone enjoys the upside.

We support the community. We believe major scientific advancements are the result of collaboration across the field. Join a community supporting the ML/AI community.

