Project Plan - Log Analysis and Alerting for OPNFV-VSPERF (Metrics)
Goals
Help users understand test behavior and analyze performance results using the metrics generated by VSPERF, and provide an alert-management solution that sends alert notifications to VSPERF.
Tasks
Week | Activity |
---|---|
Week 1 - Week 3 | Understanding Prometheus, Alert Manager, and Grafana; understanding Collectd, Collectd-Exporter, and cAdvisor; deployment of the monitoring stack (containers) |
Week 4 - Week 6 | Creating, configuring, and testing alerts - BM; creating, configuring, and testing alerts - OS; alert notification and handling - BM; alert notification and handling - OS; HA deployment of the monitoring stack |
Week 7 - Week 9 | Automated deployment using Ansible; enhancing the alert solution for K8S data; custom dashboards for metrics visualization |
Week 10 - Week 12 | Client-side automated deployment using Ansible; release of the complete monitoring solution; custom analytics - causation, trend/pattern |
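The alert notification and handling tasks above involve receiving notifications sent by Alert Manager. As a rough illustration only, the sketch below parses a webhook-style notification body and lists the alerts that are currently firing. The top-level `alerts` list and the per-alert `status`, `labels`, and `annotations` fields follow Alertmanager's documented webhook payload; the alert names, label values, and the function name `summarize_firing_alerts` are hypothetical, and this page does not state how the project actually delivered alerts to VSPERF.

```python
import json


def summarize_firing_alerts(payload):
    """Return one summary dict per firing alert in a webhook notification body."""
    body = json.loads(payload)
    summaries = []
    for alert in body.get("alerts", []):
        # Resolved alerts are skipped; only active ones need handling.
        if alert.get("status") != "firing":
            continue
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        summaries.append({
            "name": labels.get("alertname", "unknown"),
            "instance": labels.get("instance", "unknown"),
            "severity": labels.get("severity", "none"),
            "summary": annotations.get("summary", ""),
        })
    return summaries


# Hypothetical notification body, invented for illustration only.
EXAMPLE_PAYLOAD = json.dumps({
    "version": "4",
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "HighMemoryUsage",
                       "instance": "node-1", "severity": "warning"},
            "annotations": {"summary": "Memory usage above threshold"},
        },
        {
            "status": "resolved",
            "labels": {"alertname": "HighCpuUsage", "instance": "node-2"},
            "annotations": {},
        },
    ],
})

if __name__ == "__main__":
    for item in summarize_firing_alerts(EXAMPLE_PAYLOAD):
        print(item["severity"], item["name"], item["instance"], item["summary"])
```

A handler like this could sit behind any small HTTP endpoint configured as a webhook receiver; the receiver side is left out here to keep the sketch self-contained.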
Deliverables
Client-Side Ansible Playbook:
Deploy and Configure agents (collectd)
Server-Side Ansible Playbooks:
Deploy K8S Cluster
Deploy and configure PAG stack
Alerting Configuration
Jupyter Notebooks
Metrics Analysis
Visualization and alert management in OPNFV-Airship
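To make the metrics-analysis deliverable more concrete, here is a minimal sketch of one kind of trend check a notebook might run: a least-squares slope over timestamped samples of a single metric. It is an assumption about what such an analysis could look like, not the notebooks' actual method; the function name `trend_slope` and the sample values are hypothetical and are not measured results.

```python
def trend_slope(samples):
    """Least-squares slope of (timestamp_seconds, value) pairs, in value units per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    # Covariance of time and value, and variance of time (both unnormalized).
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples share the same timestamp")
    return num / den


# Hypothetical memory-usage samples (seconds, MiB), invented for illustration.
EXAMPLE_SAMPLES = [(0, 100.0), (10, 110.0), (20, 120.0), (30, 130.0)]

if __name__ == "__main__":
    slope = trend_slope(EXAMPLE_SAMPLES)
    print("memory grows by about %.2f MiB per second" % slope)
```

A steadily positive slope over a long test run would be one simple signal worth a closer look, for example as a possible memory leak; a flat slope suggests the metric is stable.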
Evaluation Criteria
1st Evaluation (end of week 3): Understanding Prometheus, Alert Manager, and Grafana; understanding Collectd, Collectd-Exporter, and cAdvisor; deployment of the monitoring stack
2nd Evaluation (end of week 6): Creating, configuring, and testing alerts - BM & OS; alert notification and handling - BM & OS; starting work on alert visualization
3rd Evaluation (end of week 9): HA deployment of the monitoring solution; completing alert visualization; enhancing the alert solution for K8S data
Final Evaluation (end of week 12): Custom analytics and release of the complete monitoring solution
Deliverables not Completed
Visualization and alert management in OPNFV airship (OS)
Unfortunately, the OPNFV-Airship deployments were not stable, and the LMA components of Airship crashed constantly.
The OPNFV-Airship team could not fix the issues.
Whenever the OPNFV-Airship team is ready and their deployments are stable, I would be more than willing to contribute.
Recommendation for Future Work
Complete the planned OPNFV-Airship (OpenStack) monitoring part
Closed Loop Automation for the complete logs and metrics analysis system
Code examples
Results
HA - Setup for P.A.G stack
Grafana Dashboard: OVS Stats (Avg. RX values Panel)
Grafana Dashboard: Memory Panel
Insights Gained
Logging and monitoring are really important
Management tools and functioning of an Open Source Org
Code and contributions are recognized and will be used by many
People are there to help you
Presentation Slides