Chronicles of an On-Call Engineer: Day 3

Today is day 3 of 7 as the primary on-call engineer.

I almost thought I’d get through today without being paged, but that was not to be.

At 4:19pm, I got paged by Opsgenie. A consumer service owned by a different team was reporting lots of errors and was rapidly burning through its error budget.

Two things went well:

  1. It happened during working hours, so I only had to pull in the team responsible. After checking the logs to see which errors were being reported, they quickly concluded that these should be warnings rather than errors: the duplicate key issues are caused by duplicates the publisher produces, which is outside their control, so it doesn’t make much sense to report them as errors.
  2. I created my first postmortem ticket as incident manager. This was the first incident I’ve managed that had an action item needing attention. The previous incidents were infrastructure glitches that didn’t require any further analysis or action items. Postmortem activities are high priority, so it was great to see the ticket picked up and underway. I can’t wait to read the postmortem doc the team comes up with. My name will be on the document as incident manager (IM) 🥰
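
For anyone curious what the fix in item 1 might look like, here is a minimal sketch of a consumer that logs a duplicate key as a warning instead of an error. I haven't seen the team's actual code, so everything here is an assumption made for illustration: the `messages` table, the `make_store` and `handle_message` names, and the use of SQLite as a stand-in for whatever store the real service writes to.

```python
import logging
import sqlite3

logger = logging.getLogger("consumer")  # hypothetical logger name


def make_store() -> sqlite3.Connection:
    """Create an in-memory store with a primary key on the message id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (id TEXT PRIMARY KEY, payload TEXT)")
    return conn


def handle_message(conn: sqlite3.Connection, message_id: str, payload: str) -> bool:
    """Store one message. Returns True if stored, False if it was a duplicate."""
    try:
        # 'with conn' commits on success and rolls back on an exception.
        with conn:
            conn.execute(
                "INSERT INTO messages (id, payload) VALUES (?, ?)",
                (message_id, payload),
            )
        return True
    except sqlite3.IntegrityError as exc:
        # In this sketch the primary key is the table's only constraint, so an
        # IntegrityError means the publisher sent the same id twice. The
        # consumer cannot prevent that, so it logs a warning, not an error.
        logger.warning("Skipping duplicate message %s: %s", message_id, exc)
        return False
```

Calling `handle_message(conn, "m-1", "hello")` twice on the same connection stores the row once and logs a single warning on the second call. Assuming the error budget only counts error-level logs, the duplicates stop burning it.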

All in all, it was a good day of on-call responsibilities. Loving it so far.

