We are seeking a dynamic and experienced Lead DevOps Engineer with Observability expertise to join our team at Bridgenext. Your primary responsibility will be to design, implement, and maintain a Datadog-based observability platform for our clients, focusing on Azure, Salesforce, O365, Power Apps, and other SaaS solutions, while also driving core Azure DevOps initiatives such as CI/CD automation, infrastructure as code (IaC), and infrastructure provisioning and maintenance. You will work collaboratively with a team of talented engineers to develop, test, and deploy these solutions, ensuring they are seamlessly integrated and continuously delivered. This is a challenging yet rewarding opportunity for someone who is deeply passionate about Azure and has extensive experience in this field. As we continue to expand our software teams in North America, this is an exciting time to join our fast-growing team and make an impact in the industry.
Core Technology Stack: In this role, you will work primarily on Azure Cloud Infrastructure, CI/CD automation, and the Datadog platform, with a focus on Azure, Salesforce, O365, Power Apps, and other SaaS solutions.
Employment Type: This is a contract-to-hire position.
Workplace: Remote, working from home. Working hours follow Central Standard Time (CST).
Responsibilities include but are not limited to the following:
- Develop a comprehensive understanding of the architecture, infrastructure, and applications of the client's cloud-based solutions across Azure, Salesforce, O365, Power Apps, and other SaaS solutions.
- Plan, design, and maintain an end-to-end Datadog-based solution to monitor the client's infrastructure, applications, and services across the cloud ecosystem.
- Develop and implement the monitoring strategy using Datadog and other relevant tools and technologies to ensure observability across the client's cloud ecosystem.
- Develop and maintain relevant documentation for monitoring architecture, data sources, alerting, and dashboards.
- Implement, maintain, and support the Azure DevOps platform (CI/CD automation), Azure Cloud Services, IaC (Bicep/Terraform), and application deployments, resolving platform-related issues when required.
- Collaborate with the client's development and operations teams to ensure seamless integration of the monitoring solution and troubleshoot issues as they arise.
- Continuously improve the monitoring solution by gathering and analyzing metrics and logs to identify opportunities for automation and optimization.
- Define and implement standards and best practices for observability across the client's cloud ecosystem.
- Ensure compliance with security and regulatory requirements while monitoring the client's cloud ecosystem.
- Stay current with industry trends, technologies, and advancements in monitoring and observability to drive innovation and improvement in the monitoring solution.
- Provide regular status updates to the client and project stakeholders and ensure timely delivery of the monitoring solution and DevOps projects.
Must Have Skills:
- Strong experience with Datadog-based monitoring solutions and other relevant monitoring tools and technologies
- Good knowledge of the Azure DevOps platform and Azure Cloud Services such as Function Apps, Service Bus, Container Apps, and Logic Apps, with the ability to resolve platform-related issues when required
- Hands-on experience with DevOps methodologies, including CI/CD pipeline implementation, infrastructure automation, and application deployment
- Expertise in designing and implementing monitoring solutions across Azure, Salesforce, O365, Power Apps, and other SaaS solutions
- In-depth knowledge of observability concepts, data sources, and data collection methodologies
- Strong experience with log analysis and data visualization tools
- Excellent communication and collaboration skills to work with cross-functional teams and stakeholders
- Ability to analyze metrics and logs to identify automation and optimization opportunities
- Strong analytical and problem-solving skills to troubleshoot and resolve issues as they arise
- Strong understanding of security and compliance requirements for monitoring and observability solutions
Preferred Skills:
- Experience with cloud-native architectures and microservices
- Familiarity with scripting languages like Python, Bash, or PowerShell
- Experience with container and orchestration platforms like Docker or Kubernetes
- Familiarity with automation tools like Ansible, Chef, or Puppet
- Experience with cloud providers other than Azure
- Familiarity with ITIL best practices and IT service management
Professional Skills:
- Solid written, verbal, and presentation communication skills
- Works effectively both as part of a team and independently
- Maintains composure in all types of situations and is collaborative by nature
- High standards of professionalism, consistently producing high-quality results
- Self-sufficient and independent, requiring very little supervision or intervention
- Demonstrates flexibility and openness in bringing creative solutions to address issues
Bridgenext is an Equal Opportunity Employer