Stash It App for Monitoring with Sensu
When I joined AppsFlyer four months ago, I was introduced to the most fascinating real-time environment I’ve ever known. AppsFlyer’s services run on the Amazon cloud as many micro-services, performing many different tasks simultaneously. This complex environment requires constant, round-the-clock monitoring; one of the tools we use to monitor performance is Sensu App (the open source monitoring framework).
Sensu App intends to provide an open framework for building comprehensive monitoring solutions, without imposing restrictions or being overly opinionated. While new user-facing “features” may be developed exclusively in Sensu Enterprise, the framework that makes those features possible will always be a part of Sensu Core.
The AppsFlyer system is monitored 24/7 and the team is always on call, ready to respond if an issue arises. For every warning or error notification, even known ones, regardless of the time of day, our talented and dedicated R&D team would log in to Uchiwa over the VPN (virtual private network) and stash or delete the specific event/client. This mostly happened during the night, when the load on the system is at its peak, affecting our “beauty sleep.”
Seeing the sleepless eyes of my colleagues prompted me to write a small app (for iOS and Android) called “Stash It”, which encapsulates most of the abilities of the admin screen.
Consider the following scenario: there is a known issue that keeps triggering alerts at night. With the mobile app, it’s much easier to stash the specific alert than to get out of bed, find your computer, log in to the VPN and stash it via the admin panel.
Having said that, this provides a solution for known issues only. If a known error keeps triggering alerts, the App is a very good option for stashing it with a defined expiration. Each event carries a detailed description, so the R&D team has the information needed to decide on the proper way to handle it.
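To make "stashing with a defined expiration" concrete, here is a minimal Python sketch of the request body such a stash might use. It assumes the server follows the stash-based silencing convention used by Uchiwa on Sensu Core (a `silence/<client>/<check>` path sent to the `/stashes` endpoint with an `expire` value in seconds). The function name, the `reason` text and the `source` field are hypothetical and not taken from the Stash It code.

```python
import json
import time


def build_silence_stash(client, check=None, expire_seconds=3600,
                        reason="known issue", now=None):
    """Build the JSON body for a stash that silences a client,
    or a single check on that client, for a limited time."""
    if expire_seconds <= 0:
        raise ValueError("expire_seconds must be positive")

    # Assumed convention: silence/<client> or silence/<client>/<check>
    path = f"silence/{client}" if check is None else f"silence/{client}/{check}"
    timestamp = int(time.time()) if now is None else int(now)

    return {
        "path": path,
        "expire": expire_seconds,  # seconds until the stash expires
        "content": {
            "reason": reason,              # hypothetical free-text field
            "timestamp": timestamp,
            "source": "stash-it-sketch",   # hypothetical marker
        },
    }


if __name__ == "__main__":
    # Silence one noisy check on one (made-up) client for two hours.
    body = build_silence_stash("web-01", "check_cpu", expire_seconds=2 * 60 * 60)
    print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as JSON in a POST to the server's stashes endpoint. The expiration matters because it lets the alert come back on its own if the known issue is still there the next day.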
The “Stash It” App can be configured to work with any Sensu server. It requires username and password authentication, for obvious reasons. After a successful login, the App shows a list of all events and warnings sorted by severity, with detailed information and several actions that can be performed on each one.
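As a rough illustration of the severity-sorted event list, the Python sketch below orders events so that critical ones come first, then warnings, then anything else. It assumes each event has the shape returned by the Sensu events API (a `client` object with a `name`, and a `check` object with a `name` and a numeric `status`, where 1 is warning and 2 is critical). The exact ordering the App uses is not documented here, so treat this as one plausible choice; the sample events are invented.

```python
# Lower rank sorts first: critical (2) before warning (1).
SEVERITY_RANK = {2: 0, 1: 1}


def severity_rank(status):
    """Map a check status to a sort rank; any other status counts as unknown."""
    return SEVERITY_RANK.get(status, 2)


def sort_events(events):
    """Return events ordered by severity, then client name, then check name."""
    return sorted(
        events,
        key=lambda e: (
            severity_rank(e["check"]["status"]),
            e["client"]["name"],
            e["check"]["name"],
        ),
    )


if __name__ == "__main__":
    # Invented sample data for illustration only.
    sample = [
        {"client": {"name": "db-01"}, "check": {"name": "disk", "status": 1}},
        {"client": {"name": "web-02"}, "check": {"name": "cpu", "status": 2}},
        {"client": {"name": "web-01"}, "check": {"name": "ping", "status": 3}},
    ]
    for event in sort_events(sample):
        print(event["check"]["status"], event["client"]["name"], event["check"]["name"])
```

Sorting on a tuple keeps the order stable and predictable within each severity level, which is helpful when the list refreshes on a small screen.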
There are also 3 more lists provided to the user:
- List of tests/checks run by Sensu
- List of current machines
- List of stashes
Once the App is available to the general public (via the Apple App Store and Google Play), you can download it and use it with your own Sensu servers. For now, we’ve made it available to everyone as open source (iOS version only).
Looking Towards the Next Phase:
- Android app coming soon
- Add push notifications for critical errors
- Integrate with the Graphite API (Graphite – Scalable real-time Graphing) and show graphs for the time around the related error