System Status Page

At CloudBolt, we believe that software solutions should be easy to maintain, manage, and understand. We also believe they should be self-regulating and self-healing when possible. You will see a focus on this starting in 8.4-Tallman and continuing through our 9.x releases, which will give you better visibility into CloudBolt’s internal status, add management capabilities directly in the web UI, and reduce the number of times you need to SSH to the CB VM to check things or perform actions.

CloudBolt 8.4-Tallman introduces a new Admin page called “System Status”, which provides several tools for checking on the health of CloudBolt itself.

The System Status Page in 8.4-Tallman

To see the System Status page in your newly installed/upgraded CloudBolt 8.4-Tallman, navigate to Admin > Support Tools > System Status. You will see a page that looks a bit like this:

There are three main parts of this page.

1. CloudBolt Mode

This section provides a way to put CloudBolt into admin-only maintenance mode, which prevents any user who is not a Super Admin or CloudBolt admin from logging in to or navigating this CloudBolt instance. This is useful when you need to perform maintenance on CloudBolt (e.g., upgrading it or making changes to the database) and want to keep users out while it is in an intermediate state, but still need to do some preparation and verification yourself within the CB UI before and after the maintenance.

2. Job Engine

This section shows the status of each job engine worker, each running on a different CloudBolt VM now that active-active Job Engines are supported. It also shows a chart of all jobs run in the last hour and the last day, per job engine. When things are healthy and the job engines are not near their max concurrency limit, jobs should be split fairly evenly across the workers.
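To make the "fairly even split" idea concrete, here is a minimal sketch of how you might judge it yourself from per-worker job counts. The function name, the worker names, and the 25% tolerance are all assumptions chosen for illustration; this is not how CloudBolt computes or displays the chart.

```python
def is_evenly_split(jobs_per_worker, tolerance=0.25):
    """Return True if every worker's job count is within `tolerance`
    (expressed as a fraction of the mean) of the mean job count.

    `jobs_per_worker` maps a worker name to the number of jobs it ran.
    The 25% default tolerance is an arbitrary, illustrative threshold.
    """
    counts = list(jobs_per_worker.values())
    if not counts:
        return True  # no workers reported, so nothing is unbalanced
    mean = sum(counts) / len(counts)
    if mean == 0:
        return True  # no jobs ran at all
    return all(abs(count - mean) <= tolerance * mean for count in counts)


# Hypothetical counts for two job engine workers over the last hour.
balanced = is_evenly_split({"worker-1": 100, "worker-2": 110})
skewed = is_evenly_split({"worker-1": 190, "worker-2": 10})
```

A skewed result is only a hint: as noted above, an uneven split can also be expected when a job engine is near its max concurrency limit.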

3. Health Checks

This section has several kinds of checks:

  • Indications of the health of a specific service, as would be seen from the Linux command line when running `service <name> status`
  • Tests of OS-level health, such as a check of available disk space on the root partition
  • Functional tests, which perform some basic action to make sure systems are working properly. Functional tests in 8.4-Tallman include writing a file to disk and deleting it, creating an entry in the database and deleting it, and adding an entry to memcache and deleting it.
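The OS-level and functional checks described above can be sketched in a few lines of standard-library Python. This is an illustrative approximation only, not CloudBolt's implementation: the function names, the 10% free-space threshold, and the choice of a temporary file are all assumptions.

```python
import os
import shutil
import tempfile


def disk_space_ok(path="/", min_free_fraction=0.10):
    """Return True if at least `min_free_fraction` of the partition
    holding `path` is free. The 10% default is an assumed threshold."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction


def file_write_ok(directory=None):
    """Functional test: write a small file, read it back, delete it.

    Returns True only if the round trip succeeds. The file is removed
    whether or not the check passes.
    """
    payload = b"health-check"
    try:
        fd, name = tempfile.mkstemp(dir=directory)
    except OSError:
        return False  # could not even create the file
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        with open(name, "rb") as handle:
            return handle.read() == payload
    except OSError:
        return False
    finally:
        if os.path.exists(name):
            os.remove(name)
```

The database and memcache tests follow the same create, verify, delete pattern, but they depend on the client libraries in use, so they are left out of this sketch.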

Checking the health of the systems that underlie CloudBolt can help you quickly home in on the root cause of an issue, and we hope that the System Status page will shorten the time it takes to troubleshoot and resolve issues with CloudBolt.

What’s Next for the System Status Page

We have some ideas for what we might add next:

  • Uptime metrics for each job engine worker
  • The average time for jobs to complete for each worker
  • Disk space checks for all partitions on the CB VM
  • CPU, memory, I/O, and network utilization for the CB VM
  • Uptime for the CB VM as a whole
  • Network health checks, including:
    • testing DNS lookups
    • testing pinging the gateway
    • testing connections to any configured proxies
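As a rough idea of what the DNS and proxy checks in this list could look like, here is a short standard-library sketch. It is speculative and does not reflect any planned CloudBolt code: the function names and the 3-second timeout are assumptions. The gateway ping is omitted because sending ICMP from Python usually requires raw sockets and elevated privileges.

```python
import socket


def dns_lookup_ok(hostname):
    """Return True if `hostname` resolves to at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def tcp_connect_ok(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened
    within `timeout` seconds, e.g. to confirm a proxy is reachable."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections and timeouts
        return False
```

A successful TCP connection only shows that something is listening on that port; it does not prove the proxy is forwarding traffic correctly.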

If any of these ideas seem especially useful to you, we’d love to hear about it so we can prioritize accordingly. We’d also love to hear any additional ideas you have for this new page!
