Senior Site Reliability Engineer
Location: San Diego
Posted on: January 26, 2023
Want to help us help others? We're hiring!
GoFundMe is a global community of over 100 million people with a common
purpose of helping one another. Our mission is to help people help
each other by making it safe and easy for people to ask for help
and support causes for themselves, each other, and their
communities. Since 2010, GoFundMe has become a trusted global
leader in online fundraising, with $17 billion raised from over 200
million donations. Our vision is to become the most helpful place
in the world.
The GoFundMe team is searching for an experienced Site Reliability
Engineer (SRE). You will be responsible for the full system
lifecycle including infrastructure provisioning, system
configuration, deployments, monitoring, and incident response in
production environments. The SRE uses technical analysis to assess
the availability, latency, scalability, and efficiency of a product
or infrastructure and builds reliability into systems. To ensure
the highest level of application performance and availability, the
reliability engineer works closely with development teams, relevant
functional operations teams, network engineers, database
administrators, technology vendors and partners. The successful
reliability engineer effectively guides incident responses, helps
identify root causes and provides recommendations or solutions to
mitigate and resolve issues.
- Design and build out our cloud infrastructure (we run
everything in AWS).
- Participate in software and system performance analysis,
tuning, and service capacity planning.
- Manage the availability, scalability, security, and performance
of our platform and applications.
- Diagnose bottlenecks across the full stack and recommend interim
workarounds while long-term solutions are investigated.
- Periodically assess all monitoring requirements and implement
enhancements to meet or exceed changing business needs.
- Proactively review, recommend, and implement changes to the
live infrastructure after ensuring the right validation has been
- Use data analysis to pick up trends before they become major
- Perform 24/7 on-call duties.
- 5+ years of experience in operating high-traffic SaaS
- Deep expertise in the mentality, processes, and tools needed to
deliver five nines (99.999% availability).
- Skills to build a fully automated, highly elastic cloud
orchestration framework on AWS.
- Strong working knowledge of Linux and its underlying
components, system statistics, performance tuning, filesystems and
- Solid scripting skills (e.g. Bash, Python).
- Development experience (e.g. Python, PHP, Java,
- Experience with continuous integration frameworks.
- Experience with performance diagnostics, performance tuning,
capacity planning, and monitoring.
- BS in Computer Science or equivalent.
- Good verbal and written communication skills.
Technologies you are likely to be working with...
AWS, Ansible, Terraform, MySQL/Aurora, Redshift, Nginx, Apache,
Docker, Kubernetes, Elasticsearch, Kafka, Memcached, Redis,
RabbitMQ, Jenkins, Git, Bash, Python, PHP, Java, Kotlin, Ruby,
Nessus, Nagios, Sumo Logic, New Relic, PagerDuty
Why you'll love it here...
- Market competitive pay
- Rich healthcare benefits including employer paid premiums for
medical/dental/vision (100% for employee only plans and 85% for
employee + dependent plans) and employer HSA
- 401(k) retirement plan with company matching
- Hybrid workplace with fully remote flexibility for many
- Monetary support for new hire setup, hybrid work & wellbeing,
family planning, and commuting expenses
- A variety of mental and wellness programs to support
- Generous paid parental leave and family planning
- Supportive time off policies including vacation, sick/mental
health days, volunteer days, company holidays, and a floating
- Learning & development and recognition programs
- Gives Back Program where employees can nominate a fundraiser
every week for a donation from the company
- Inclusion, diversity, equity, and belonging are vital to our
priorities and we continue to evolve our strategy to ensure DEI is
embedded in all processes and programs at GoFundMe. Our Diversity,
Equity, and Inclusion team is always finding new ways for our
company to uphold and represent the experiences of all of the
people in our organization.
- Employee resource groups
- Your work has a real purpose and will help change lives on a
- You'll be a part of a fun, supportive team that works hard and
celebrates accomplishments together.
- We live by our core values: consider everything, do the right
thing, spread empathy, delight the customer, and give
- We are a certified Great Place to Work, are growing fast and
have incredible opportunities ahead!
GoFundMe is proud to be an equal opportunity employer that actively
pursues candidates of diverse backgrounds and experiences. We are
committed to providing diversity, equity, and inclusion training to
all employees, and we do not discriminate on the basis of race,
color, religion, ethnicity, nationality or national origin, sex,
sexual orientation, gender, gender identity or expression,
pregnancy status, marital status, age, medical condition, mental or
physical disability, or military or veteran status.