Senior Site Reliability Engineer (Remote) - Fairfax, VA

Senior Site Reliability Engineer (Remote)

GovCIO is a team of transformers: people who are passionate about transforming government I.T. We believe in making a difference by developing digital strategies and delivering the technology-related innovation to governmental operations that improve the citizen experience every day. But we can't do it alone. We welcome and nurture an inclusive and diversified work culture, because different backgrounds, experiences, abilities, and perspectives make us better decision-makers, problem solvers, and creators. We're changing the face of I.T., from our diverse staff to the end products we develop. And we're excited to expand our team. Are you ready to be a transformer?

Responsibilities

As a Senior Site Reliability Engineer, you will apply your senior application product expertise to help build processes that manage and improve OIT's response posture to system events impacting end users and Veterans. This includes working with business partners to improve communication and responsiveness to application failures by minimizing performance degradation and availability impacts, working towards a significant reduction in application downtime and impact to users. You will work with a team of site reliability engineers, both junior and senior, supporting an engineering team lead in completing the required deliverables. Areas of support include:

  • Triage Major Incident Management (MIM) and Problem Management (PM) incidents by deconstructing application performance, interoperability, instrumentation, and human factors to facilitate resolution and development of resilient solutions.
  • Support coordination and ensure all High Priority Incidents (HPI) and Critical Priority Incidents (CPI) are triaged properly and routed to the appropriate groups for immediate resolution.
  • Perform enterprise root cause analysis (RCA) and identification in coordination with appropriate OI&T organizations.
  • Capture technical information from relevant stakeholders and synthesize it into useful information in various formats for OIT senior management and other VA components.
  • Support the collection, development, and/or editing of content for white papers and other communication devices, and assess and evaluate the effectiveness of executive communication to effect process improvement.
  • Demonstrate proficiency with DevOps tools, JIRA, ServiceNow, and MS Project, and perform tasks using these tools.
  • Analyze incident record data, research trends, and digest findings into written recommendations and strategies for improving the posture of the VA's information technology services, reducing both MTTR and incident frequency.
  • Provide case management and follow-through after incident resolution for root cause analysis, developing permanent fixes and preventative strategies to reduce MTTR and incident recurrence.
  • Digest and write technical recommendations for case management and trend analysis presentations.

Required Skills and Experience

  • Master's degree is preferred in Business Administration, Business Management, Computer Science, Information Systems, Information Resource Management, Industrial Engineering, Operations Research, or related fields.
  • 5 years of relevant experience.
  • Certifications in relevant UX software plus 3-5 years of relevant experience; 8 to 10 years of relevant experience may be substituted for education (13-15 years total).
  • Be a technical expert with expertise across multiple technology areas and the ability to diagnose complex issues throughout many technologies.
  • Must be able to identify and mitigate risks to the product.
  • Must be able to provide oral and written discussion of analytical findings using narrative and graphic forms.
  • Must be able to use qualitative and quantitative analytical skills to assess the effectiveness of operations.
  • Ability to identify symptoms for process improvement.
  • Communication skills, including the ability to craft content for executive-level presentations.
  • IT background and ability to understand technical content.
  • Experience working with packet capture analysis using tools such as Wireshark or Netscout.
  • Experience with monitoring tools such as Splunk, AppDynamics, SolarWinds, or Dynatrace.
  • ServiceNow experience is nice to have.
  • Understands the RCA process and can work across teams to guide the implementation of solutions to identified incident root causes.
  • Broad understanding of ITIL.

Salary Range: $100K to $150K
Minimum Qualification
DevOps & Site Reliability
Estimated Salary: $20 to $28 per hour based on qualifications.
