Monitoring and Observability


Published on January 29, 2024 by David Wardlaw

In this post I am going to talk about monitoring and observability.

Now what are monitoring and observability?

Monitoring is defined as:

"to watch and check a situation carefully for a period of time in order to discover something about it"

From <https://dictionary.cambridge.org/dictionary/english/monitoring>

Observability comes from the verb "observe", which is defined as:

"to watch carefully the way something happens or the way someone does something, especially in order to learn more about it"

From <https://dictionary.cambridge.org/dictionary/english/observe?q=Observing>

So the main difference between the two is that monitoring is about checking in order to discover something, whilst observing is about watching something in order to learn more about it.

Why are both important?

Both are important because they tell us different things about a system. The information they provide can feed into decisions and actions, both now and in the future, that improve software quality as well as testing.

Monitoring

I will use my hobby of photography as an example. Common Kingfishers are birds found in the UK. They sit on branches (as well as other things) not too high above the water, watching and waiting for a fish. When they spot one, they dive into the water to try and catch it. During lockdown I set myself the challenge of getting a picture of a Kingfisher on a branch. Down at my local river there was a particular branch that a Kingfisher would sit on. I started to monitor this branch: every time I went down there, I made a note of the time when I saw the Kingfisher on it.

This was great: I could look for patterns to understand when the Kingfisher was more likely to be on the branch, so I could try to get the photo that I wanted. But it didn’t tell me the whole story. This is where observability comes in.

Observability

Observability enables you to learn more about something. Going back to my Kingfisher example, by observing the bird I was able to learn more about it and was even able to predict when it would fly onto the branch. Monitoring was great, as it gave me an idea of the times the Kingfisher was most likely to be on the branch, but it did not tell me what led it to land there. By observing its behaviour I learnt that it flew to the branch from one of two locations. I also learnt that if someone raised their camera just as it was about to land, it would fly off and not come back for a good 10 minutes. So by observing the Kingfisher I could predict when it would land on the branch, and this in turn gave me a better chance of getting the picture I wanted. For example, if I knew it was going to fly onto the branch from location 1, I could raise my camera ready to take the shot while it was still at location 1, rather than waiting for it to land, which would usually result in it flying off when I raised my camera.

This is all great, but how does this relate to software development?

Software Development

Well, by monitoring and observing we can change what we do, both now and in the future, based upon the feedback we get. This could range from fixing a bug that a customer doesn’t know about to turning off a feature that our users are not using. The key thing is that we make changes to the software that benefit our users.

Monitoring and observability are linked and you need to observe to understand what the monitoring is telling you.

For example, let’s say you have added a feature to an e-commerce site: a page shown prior to checkout that suggests other things the user may want to add to their basket.

Through monitoring, you track how many people add a suggested item to their basket. You discover that hardly anyone does. So you start to observe how users are using the system, to help you find the why. You find out that users are using a quick checkout option that skips this page and takes them straight to the checkout. Great, so now you understand why the feature is not being used. Monitoring told you the what, and observing told you the why. Off the back of this you can make a decision: you may decide to remove the feature or change the flow of the site.
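To make the what and the why a little more concrete, here is a minimal sketch in Python. It assumes the site records a simple log of (session, event) pairs; the event names, the sample data and the function names are all made up for illustration and do not come from any real system. The first function is the monitoring side (a single number you might put on a dashboard), and the second is the observing side (the routes users actually took to reach the checkout).

```python
from collections import Counter, defaultdict

# Hypothetical event log: (session_id, event_name), in the order the events happened.
# The event names are assumptions made for this example only.
EVENTS = [
    ("s1", "view_basket"), ("s1", "quick_checkout"), ("s1", "checkout"),
    ("s2", "view_basket"), ("s2", "quick_checkout"), ("s2", "checkout"),
    ("s3", "view_basket"), ("s3", "suggestions_page"),
    ("s3", "add_suggested_item"), ("s3", "checkout"),
    ("s4", "view_basket"), ("s4", "quick_checkout"), ("s4", "checkout"),
]


def sessions_by_path(events):
    """Group events into one ordered list of event names per session."""
    paths = defaultdict(list)
    for session_id, event in events:
        paths[session_id].append(event)
    return dict(paths)


def suggestion_add_rate(events):
    """Monitoring: share of checked-out sessions that added a suggested item."""
    checked_out = [p for p in sessions_by_path(events).values() if "checkout" in p]
    if not checked_out:
        return 0.0
    added = sum(1 for p in checked_out if "add_suggested_item" in p)
    return added / len(checked_out)


def checkout_paths(events):
    """Observing: count the routes users took on their way to checkout."""
    paths = sessions_by_path(events).values()
    return Counter(" > ".join(p) for p in paths if "checkout" in p)


if __name__ == "__main__":
    print(f"Suggestion add rate: {suggestion_add_rate(EVENTS):.0%}")
    for path, count in checkout_paths(EVENTS).most_common():
        print(count, path)
```

With this made-up data, the rate on its own only tells you that few people add a suggested item. The path counts are what show that most sessions go through the quick checkout and never see the suggestions page.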

Monitoring is more of a way to bring to your attention things going on in the system that you may care about and need to investigate further. Observability is something that can be used to drive change in your system to improve its quality. Now you can have observability without monitoring and vice versa, but that’s like having fish without chips. You can have it, but it kind of doesn’t make sense.

Thought and Planning

Now monitoring and observability are not things you can just tack onto the end of feature development. They require thought and planning early on. You need to answer questions like:

Monitoring

What would be valuable to monitor? 

Why? 

How? 

Who will track what we are monitoring?

Who will collate the results?

How will we communicate what is being monitored?

What do we need to observe should we wish to investigate further?

Observability

What do we need to observe for what we are monitoring?

How will we observe?

How will this link to what we are monitoring?

Who will collate the results?

How will we communicate what is being observed?

Who will we share these observations with?

How can observability drive change?

Sometimes it is difficult to get feedback from users. Asking them how they use the system can be a futile process. But by adding observability you can get that information first-hand. You can actually see what the user is doing, rather than relying on them telling you what they did (or what they thought they did). What you observe can feed into future roadmap discussions about the product, as well as decisions about what feature changes or new features may be needed. This can drive change, which in turn can add value for your users, which in turn will help the business.