Scoutflo Documentation

Alert Management

Alerts Dashboard, Setup, and Firing Alerts

Last updated 4 months ago

Alerts Dashboard (Grafana)

The Alerts Dashboard provides a consolidated view of the alerts configured for your infrastructure. This dashboard integrates Grafana with Prometheus and Alert Manager, offering visual insights into the current state of alerts and metrics.

  • Integrated Grafana Dashboards: The dashboard includes various embedded Grafana panels, each tailored to specific metrics such as cluster health, application resource usage, storage utilization, and scaling activities.

  • Live Data Feed: Prometheus scrapes metrics from your cluster and evaluates them against the configured alert rules. Alert Manager then processes the resulting alerts and routes notifications. This information is visualized on the dashboard in near real time.

  • Customizable Views: Users can switch between different Grafana panels to focus on specific alerts or metrics relevant to their current requirements.

How to Use

  1. From the main Scoutflo screen, open 'Alerts Overview'.

  2. Select the cluster you want to monitor from the drop-down.

  3. Use the embedded Grafana panels to inspect the health and performance of your cluster.

  4. Monitor live alerts triggered by Prometheus and Alert Manager directly on the dashboard.


Alerts Setup with Prometheus and Alert Manager (Cluster Settings Page)

The Alerts Setup page is part of the Cluster Settings. This page provides a set of 15-20 pre-configured Prometheus alert rule templates tailored for your cluster. Users can customize these rules or create new ones based on their requirements.

  • Pre-configured Templates: These templates are designed to address common cluster alerting needs. Examples include:

    • High CPU usage threshold.

    • Memory allocation breaches.

    • Storage nearing capacity.

  • Editable Parameters: Users can modify thresholds, add conditions, or adjust alerting intervals to suit their infrastructure.

  • Rule Management: A user-friendly interface allows easy management of all alert rules from one place.
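
To make the editable parameters more concrete, the sketch below models one rule with a threshold and a hold duration (comparable to the `for` field of a Prometheus alerting rule) and shows how a series of samples moves it from inactive to pending to firing. The `AlertRule` class, its field names, and the sample values are illustrative assumptions; this is not Scoutflo's or Prometheus's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AlertRule:
    """Hypothetical stand-in for one editable alert rule."""
    name: str
    threshold: float      # the condition is true when a value exceeds this
    hold_seconds: float   # how long the condition must hold before firing
    _breach_start: Optional[float] = None  # time the current breach began

    def evaluate(self, timestamp: float, value: float) -> str:
        """Return 'inactive', 'pending', or 'firing' for one sample."""
        if value <= self.threshold:
            # Condition no longer true: reset the breach timer.
            self._breach_start = None
            return "inactive"
        if self._breach_start is None:
            self._breach_start = timestamp
        if timestamp - self._breach_start >= self.hold_seconds:
            return "firing"
        return "pending"


def run(rule: AlertRule, samples: List[Tuple[float, float]]) -> List[str]:
    """Evaluate the rule against (timestamp, value) samples in order."""
    return [rule.evaluate(t, v) for t, v in samples]


# Assumed example: CPU usage in percent, sampled every 60 seconds.
rule = AlertRule(name="HighCPUUsage", threshold=80.0, hold_seconds=120)
samples = [(0, 70.0), (60, 85.0), (120, 90.0), (180, 92.0), (240, 60.0)]
states = run(rule, samples)
print(states)
```

In this model, raising `threshold` or `hold_seconds` makes the rule less sensitive to short spikes, which is the kind of trade-off you adjust when editing a template.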

How to Use

  1. Go to the 'My Clusters' screen > Click on the cluster you want to edit alerts for.

  2. You will be redirected to the Post Deployment screen of that cluster. From there, navigate to the Cluster Settings page.

  3. Review the list of pre-configured alert rules.

  4. Click the 'Edit' button at the top right, then edit the parameters or conditions of any rule.

  5. Click the 'Push' button to apply your changes to the alert rule on your infrastructure.

  6. Use the 'Create' button to create new alerts.


Firing Alerts Page (Cluster Post-Deployment)

The Firing Alerts Page provides real-time visibility into all active alerts triggered by Prometheus and Alert Manager. This page lists alerts that are currently firing due to resource usage exceeding predefined thresholds.

  • Live Alert Feed: The page dynamically updates as new alerts are fired or resolved.

  • Alert Details: Each alert entry includes:

    • Resource affected (e.g., CPU, Memory, Storage).

    • Current value and threshold breach details.

    • Timestamp of when the alert was triggered.

  • Slack Notifications: Users receive immediate Slack notifications for all fired alerts, ensuring timely action.
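
As a rough illustration of how the alert details above could map onto a Slack message, the sketch below formats one alert as a JSON body with a `text` field, which is the shape Slack incoming webhooks accept. The `build_slack_payload` function, the dictionary keys, and the message wording are hypothetical; Scoutflo's actual notification format is not documented here, and the sketch does not send anything.

```python
import json
from datetime import datetime, timezone


def build_slack_payload(alert: dict) -> str:
    """Format one fired alert as a JSON body with a `text` field."""
    # Convert the Unix timestamp to an ISO 8601 string in UTC.
    fired_at = datetime.fromtimestamp(alert["triggered_at"], tz=timezone.utc)
    text = (
        f"[FIRING] {alert['name']}: {alert['resource']} at "
        f"{alert['value']}{alert['unit']} "
        f"(threshold {alert['threshold']}{alert['unit']}), "
        f"triggered {fired_at.isoformat()}"
    )
    return json.dumps({"text": text})


# Assumed example alert; all keys and values are illustrative.
example = {
    "name": "HighCPUUsage",
    "resource": "CPU",
    "value": 93.5,
    "threshold": 80.0,
    "unit": "%",
    "triggered_at": 1700000000,
}
payload = build_slack_payload(example)
print(payload)
```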

How to Use

  1. Go to the 'My Clusters' screen > Click on the cluster you want to view alerts for.

  2. You will be redirected to the Post Deployment screen of that cluster. From there, navigate to the 'Alerts' section.

  3. View the list of active alerts, sorted by severity and timestamp.

  4. Monitor the resolution status as alerts are cleared.

  5. Click the 'Graph Link' to view this metric's monitored data over time.
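
The ordering described in step 3 can be sketched as a two-part sort key. The source does not state the sort direction or the severity levels, so this example assumes three levels (`critical`, `warning`, `info`), most severe first, and newest first within a severity. The `FiringAlert` class and the alert names are illustrative.

```python
from dataclasses import dataclass
from typing import List

# Assumed severity levels; a lower rank sorts earlier.
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


@dataclass(frozen=True)
class FiringAlert:
    name: str
    severity: str
    triggered_at: int  # Unix time in seconds


def sort_alerts(alerts: List[FiringAlert]) -> List[FiringAlert]:
    """Most severe first; within one severity, newest first."""
    return sorted(
        alerts,
        key=lambda a: (
            # Unknown severities sort after all known ones.
            SEVERITY_RANK.get(a.severity, len(SEVERITY_RANK)),
            -a.triggered_at,
        ),
    )


alerts = [
    FiringAlert("StorageNearCapacity", "warning", 1700000300),
    FiringAlert("HighCPUUsage", "critical", 1700000100),
    FiringAlert("MemoryBreach", "critical", 1700000200),
    FiringAlert("PodRestart", "info", 1700000400),
]
ordered = [a.name for a in sort_alerts(alerts)]
print(ordered)
```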


  • Alerting Workflow:

    1. Prometheus scrapes metrics from your cluster.

    2. Prometheus evaluates these metrics against the defined alert rules and passes firing alerts to Alert Manager.

    3. Alerts are visualized on the Alerts Dashboard and the Firing Alerts Page.

    4. Notifications are sent via Slack and displayed in Grafana.

  • Integration Points:

    • Grafana Dashboards: For visual monitoring.

    • Slack: For instant alert notifications.

    • Cluster Settings: For rule configuration and customization.

This feature ensures comprehensive monitoring and alerting for your Kubernetes clusters, empowering users to maintain infrastructure health and respond quickly to issues.