IBM Multicloud Manager - Monitoring and Event Management
Author: Rafal Szypulka (rafal.szypulka@pl.ibm.com)
This chapter focuses on the monitoring and event management features delivered with IBM Multicloud Manager 3.1.2.
- Overall MCM dashboard
- MCM Application monitoring
- IBM Cloud Event Management for IBM Multicloud Manager
- Conclusion
Overall MCM dashboard
The IBM Multicloud Manager Overview dashboard is available from the Overview section of the MCM menu.
You can view details of your IBM Cloud Private clusters and other cloud service providers supported by IBM Cloud Private. You can also view details about your applications. The Overview dashboard is continuously refreshed in real time.
The following information about clusters is provided:
- Name of the cloud service with the number of clusters
- Cluster compliance
- Pod details
- Cluster status
- Cluster resources (VCPU/Memory usage)
- Storage usage
You can also view the following information about each application and the clusters where it has been deployed:
- Number of clusters
- Number of Kubernetes types
- Number of regions
- Number of nodes
- Number of pods
The Overview page can be further personalized with the filtering feature. Click the Filter results menu to specify what information is displayed on your page.
MCM Application monitoring
Metrics collection and visualization
To access the Application Health View dashboard (shown below), first open the Applications page from the MCM menu:
Next to each application, under the DASHBOARD column, there is a Launch Health View button. Clicking it opens the Health View Grafana dashboard for that application.
A Grafana Dashboard for MCM applications is generated automatically for each deployed application and shows metrics related to resource utilization (CPU, memory, network) of the application containers and overall resource utilization of the clusters where an application has been deployed.
The MCM federated Prometheus is the data source for the application monitoring dashboards. The MCM Controller installation deploys a federated Prometheus instance that pulls selected metric data from the Prometheus instances located on the managed ICP clusters.
The deployment name for the MCM Controller's Federated Prometheus is: mcm-controller-ibm-mcm-prod-prometheus
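If you want to look at this instance directly, a quick way is to list its deployment and related ConfigMaps on the hub cluster. The sketch below assumes the kube-system namespace and the deployment name quoted above; the exact ConfigMap name may differ in your environment, so adjust the grep patterns as needed.
# List the federated Prometheus deployment on the MCM hub cluster
kubectl get deploy -n kube-system | grep mcm-controller-ibm-mcm-prod-prometheus
# Find the ConfigMap that holds its scrape configuration (name is environment specific)
kubectl get cm -n kube-system | grep prometheus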
Initially, just after MCM installation, the MCM federated Prometheus instance doesn't collect any data. Its configuration is generated dynamically when an application is deployed via MCM: the target ICP clusters are added to the MCM federated Prometheus ConfigMap during application deployment.
The example below shows a dynamic update of the MCM federated Prometheus ConfigMap after deployment of the application to three ICP clusters:
scrape_configs:
- job_name: mcm-dynamic-se-prod-312-ubuntu
  honor_labels: true
  params:
    match[]:
    - '{job="kubernetes-cadvisor"}'
  scrape_interval: 1m
  scrape_timeout: 30s
  metrics_path: /apis/mcm.ibm.com/v1alpha1/namespaces/se-prod-312-ubuntu/clusterstatuses/se-prod-312-ubuntu/monitor/federate
  scheme: https
  static_configs:
  - targets:
    - kubernetes.default:443
    labels:
      cluster_name: se-prod-312-ubuntu
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: false
- job_name: mcm-dynamic-se-stg-312-ubuntu
  honor_labels: true
  params:
    match[]:
    - '{job="kubernetes-cadvisor"}'
  scrape_interval: 1m
  scrape_timeout: 30s
  metrics_path: /apis/mcm.ibm.com/v1alpha1/namespaces/se-stg-312-ubuntu/clusterstatuses/se-stg-312-ubuntu/monitor/federate
  scheme: https
  static_configs:
  - targets:
    - kubernetes.default:443
    labels:
      cluster_name: se-stg-312-ubuntu
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: false
- job_name: mcm-dynamic-se-dev-312-ubuntu
  honor_labels: true
  params:
    match[]:
    - '{job="kubernetes-cadvisor"}'
  scrape_interval: 1m
  scrape_timeout: 30s
  metrics_path: /apis/mcm.ibm.com/v1alpha1/namespaces/se-dev-312-ubuntu/clusterstatuses/se-dev-312-ubuntu/monitor/federate
  scheme: https
  static_configs:
  - targets:
    - kubernetes.default:443
    labels:
      cluster_name: se-dev-312-ubuntu
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: false
In the example above, the generated configuration instructs the MCM federated Prometheus instance to collect cAdvisor metrics from three child ICP Prometheus instances, located respectively on the three ICP clusters se-prod-312-ubuntu, se-stg-312-ubuntu, and se-dev-312-ubuntu.
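To illustrate how the federated metrics can be consumed (the Grafana Health View panels do something similar), the sketch below queries the federated Prometheus HTTP API and aggregates container CPU usage per managed cluster using the cluster_name label injected by the generated scrape configuration. The service name used for the port-forward is an assumption; substitute the service created by the MCM Controller chart in your environment.
# Port-forward to the MCM federated Prometheus (service name is an assumption)
kubectl port-forward -n kube-system svc/mcm-controller-ibm-mcm-prod-prometheus 9090:9090 &
# Sum container CPU usage per managed cluster via the cluster_name label
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum by (cluster_name) (rate(container_cpu_usage_seconds_total[5m]))'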
More information about the Prometheus federation mechanism is available at https://prometheus.io/docs/prometheus/latest/federation/.
IBM Cloud Event Management for IBM Multicloud Manager
IBM Cloud Event Management (CEM) allows you to set up real-time incident management for the applications and infrastructure managed by Multicloud Manager. Incidents are generated from events and alerts which indicate that something has happened on an application, service, or another monitored object. Cloud Event Management can receive events from various monitoring sources, either on premises or in the cloud.
In the MCM environment, CEM collects alerts from the Prometheus instances located on each managed cluster. The Cloud Event Management Controller for MCM (deployed using the alerttargetcontroller Helm chart) automatically configures the managed Prometheus Alertmanager instances to send alert notifications to the central CEM instance installed on the MCM Controller.
Installing the Cloud Event Management for IBM Multicloud Manager
IBM Cloud Event Management for MCM is included in the IBM Multicloud Manager installation package. Unpack the MCM installation archive mcm-3.1.2.tgz; inside you will find two CEM PPA packages:
- Cloud Event Management Controller - alerttargetcontroller-ppa-0.0.2.tar.gz
- Cloud Event Management - cem-mcm-ppa-ibm-cem-2.2.0.tar.gz
Load both PPA packages into the local container registry on the MCM hub cluster, and the alerttargetcontroller package into the registry on every managed cluster.
# Log in to the private image registry on the cluster
docker login <MCM-cluster-hostname>:8500
# Log in to the cluster with the ICP CLI
cloudctl login -a https://<MCM-cluster-hostname>:8443 --skip-ssl-validation -n kube-system
# Load the PPA archive into the local registry and chart repository
cloudctl catalog load-archive -a <ppa-archive> --registry <MCM-cluster-hostname>:8500/kube-system
The following procedure describes the installation and configuration steps on an example ICP 3.1.2 cluster running the MCM Controller. The sequence of the steps is important.
Prerequisites
- MCM Controller and MCM Klusterlet deployed on the MCM hub ICP 3.1.2 cluster, and Klusterlet deployed on the managed ICP clusters.
- CEM PPA packages imported as per instructions above.
- CEM users have an Administrator role within a Team that has resource management assigned to the managed cluster namespaces.
Installation of the Cloud Event Management controller
Go to the ICP Catalog and deploy the alerttargetcontroller chart on both the MCM hub cluster and the managed clusters, in the kube-system namespace.
The MCM Fullname Override option can be obtained using:
kubectl get po -n kube-system | grep klusterlet
Copy the part of the pod name before the second klusterlet word (klusterlet-ibm-mcmk-prod in the example below):
# kubectl get po | grep klusterlet
klusterlet-ibm-mcmk-prod-klusterlet-657958f69f-v7cw9 3/3 Running 0 30h
klusterlet-ibm-mcmk-prod-weave-scope-49sxf 1/1 Running 0 30h
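If you prefer to derive this value programmatically, the following one-liner is a convenience sketch (not taken from the product documentation) that assumes the klusterlet pod follows the naming pattern shown above:
# Print the "MCM Fullname Override" value by stripping the trailing
# "-klusterlet-<hash>-<suffix>" part from the klusterlet pod name
kubectl get pods -n kube-system -o name \
  | grep -- '-klusterlet-' \
  | head -n 1 \
  | sed 's|^pod/||; s|-klusterlet-.*$||'
# For the example above this prints: klusterlet-ibm-mcmk-prod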
The ICP Cluster namespace is the cluster namespace created during Klusterlet deployment. In our case the namespace name is se-dev-31. It can be obtained using:
# kubectl get clusters --all-namespaces
NAMESPACE NAME ENDPOINTS STATUS AGE
se-stg-31 se-stg-31 172.16.40.68:8001 Ready 1d
se-dev-31 se-dev-31 172.16.40.98:8001 Ready 1d
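If you only need the namespace names (for example, to feed them into the chart options), a jsonpath query such as this sketch can be used:
# Print just the cluster namespaces created during klusterlet deployment
kubectl get clusters --all-namespaces \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{"\n"}{end}'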
After chart deployment, make sure the alerttargetcontroller pod is running:
# kubectl get pod -n kube-system|grep alerttarget
atc-alerttargetcontroller-alerttargetcontroller-77f87fb77cx6fph 1/1 Running 1 10h
Installation of the Cloud Event Management for IBM Multicloud Manager
Deploy the ibm-cem chart, which you loaded together with the CEM images into the local-charts repository, in the kube-system namespace on the MCM hub cluster.
- NOTE: The CEM chart provided with ICP's built-in ibm-charts repository is a Community Edition version, which is not designed to work with MCM.
In our setup, we used the ICP console hostname for both the ICP Master IP and Ingress Domain options.
After deployment, wait a couple of minutes until all CEM pods have started, then run the following command to configure OIDC registration with IBM Cloud Private:
kubectl exec -n kube-system -t `kubectl get pods -l release=cem -n kube-system \
| grep "cem-ibm-cem-cem-users" | grep "Running" | head -n 1 \
| awk '{print $1}'` bash -- "/etc/oidc/oidc_reg.sh" "`echo $(kubectl get secret platform-oidc-credentials -o yaml -n kube-system \
| grep OAUTH2_CLIENT_REGISTRATION_SECRET: | awk '{print $2}')`"
Verify that the alerttargets custom resources have been created:
kubectl get alerttargets --all-namespaces
NAMESPACE NAME AGE
se-stg-31 se-stg-31-se-stg-31 2h
se-dev-31 se-dev-31-se-dev-31 2h
At this point, the Alertmanager ConfigMap monitoring-prometheus-alertmanager for the Prometheus instance on each managed cluster should have been automatically updated by the alerttargetcontroller.
To see the ConfigMap YAML for each managed cluster, run the following command on each managed cluster:
kubectl get cm monitoring-prometheus-alertmanager -n kube-system -o yaml
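If you only want the Alertmanager configuration itself rather than the whole ConfigMap object, a jsonpath query like the one below can help. The data key alertmanager.yml is an assumption based on the standard ICP monitoring chart; check the ConfigMap if yours differs.
# Print only the Alertmanager configuration from the ConfigMap
# (assumes the data key is "alertmanager.yml", as in the standard ICP monitoring chart)
kubectl get cm monitoring-prometheus-alertmanager -n kube-system \
  -o jsonpath='{.data.alertmanager\.yml}'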
In the route: -> routes: section you should see the following:
- receiver: cemwebhook
  group_by:
  - alertname
  - instance
  - severity
  continue: true
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1m
And in the receivers: section you should see:
- name: cemwebhook
  webhook_configs:
  - send_resolved: true
    http_config:
      tls_config:
        insecure_skip_verify: true
    url: https://172.16.40.68:8443/norml/webhook/prometheus/cem-1/695c8db7-344c-4af7-84f3-10f99eab440a/Snfok7F3_0ndxxxxxx762ZWnsMmnPtnLG69ID_rzctg
Note that the CEM url: value will be different in your environment.
The CEM Alert Target Controller adds a couple of sample alert definitions to the AlertRules CRD. This can be verified by running the following command:
kubectl get alertrules
NAME ENABLED AGE CHART RELEASE ERRORS
cem-alert-rules true 2h
(...)
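To inspect the full definition of the sample rules, you can dump the resource as YAML; this sketch assumes your current context already points at the namespace where the rules were created (add -n <namespace> otherwise):
# Dump the sample alert rule definitions added by the CEM Alert Target Controller
kubectl get alertrules cem-alert-rules -o yaml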
These alert rules can be customized based on your requirements. We recommend considering our best practice alert definitions for the ICP platform.
Check the Prometheus Alertmanager logs to verify that there are no errors while sending the webhook notifications to CEM.
kubectl logs <alertmanager pod> -n kube-system
User management
CEM console access is managed through the Teams page in the ICP console. To create a team with the Administrator role, complete the following steps:
- On the MCM HUB Cluster, go to Manage -> Identity & Access.
- Click on the Teams menu.
- Create a Team and call it CEM.
- Click the Users tab and assign the Administrator role to the users in the Team.
- Finally, click the Resources tab and add a row of type Namespace with the namespace name for each managed cluster.
First login to the CEM console
The Cloud Event Management console can be accessed from the Multicloud Manager console. Log on to the MCM UI as one of the Team members (mentioned in the User management section) and select Event Management.
You may be asked to authenticate again with your ICP user, and you should see one or more subscriptions.
The example subscription cem-1 shown in the picture above is the name of the ICP Team authorized to manage a cluster namespace.
Click Launch and then open the Incidents tab. If any defined Prometheus alerts are active (you can verify this via the Prometheus Alertmanager UI, available from the ICP console under Platform -> Alerting), you should see those incidents in the CEM UI:
Now click on one of the incidents to see the Prometheus alert details:
Click the Generator URL link to open the Prometheus UI on the managed cluster and see the current result of the PromQL query that generated this alert:
Conclusion
IBM Cloud Event Management for IBM Multicloud Manager enables you to access Prometheus alert information for each of your managed clusters from a centralized location.
To learn more about using and operating Cloud Event Management, check out the Cloud Event Management documentation.