Creating and Alerting on Logs-based Metrics - GSP091

Overview

Log-based metrics are Cloud Monitoring metrics that are based on the content of log entries. These metrics can help you identify trends, extract numeric values out of the logs, and set up an alert when a certain log entry occurs by creating a metric for that event. You can use both system and user-defined log-based metrics in Cloud Monitoring to create charts and alerting policies.

The log-based metrics interface is divided into two metric-type panes: System metrics and User-defined metrics.

System-defined log-based metrics are provided by Cloud Logging for use by all Google Cloud projects. They are calculated only from logs that have been ingested by Logging. If a log has been explicitly excluded from ingestion, it isn't included in these metrics.

User-defined log-based metrics are created by you to track things in your Google Cloud project. For example, you might create a log-based metric to count the number of log entries that match a given filter.

You can create an alerting policy directly from a log-based metric, so that you are notified when the metric meets a condition you define.

Objectives

In this lab you will learn how to:

  • Create a log-based alert

  • Create a system-defined log-based metric

  • Create a user-defined log-based metric

  • Create an alert for the user-defined log-based metric

Task 1. Log-based alert

Log-based alerts notify you whenever a specific message appears in your logs. Try it out by setting up a log-based alert to tell you when a VM stops running.

  1. From Cloud Console, in the Search bar, type in “logs explorer”, then click on the Logs Explorer result.

  2. Click the Show Query slide bar.

  3. Enter the following query to create a log-based alert:

resource.type="gce_instance" protoPayload.methodName="v1.compute.instances.stop"

  4. Click the Create alert link.

  5. Add the following parameters, clicking Next to move from one parameter to the next:

  • Alert name: stopped vm

  • Choose logs to include in the alert: will auto-fill with the query you entered

  • Set notification frequency and autoclose duration: Time between notifications is 5 min and Incident autoclose duration is 1 hr. Click Next.

Who should be notified (optional):

  • Click on the dropdown arrow next to Notification Channels, then click on Manage Notification Channels.

  • A Notification channels page will open in the new tab.

  • Scroll down the page and click on ADD NEW for Email.

  • Enter your personal email in the Email Address field and a Display name.

  • Click Save.

  • When done, return to the Logs Explorer tab you were in previously.

  • Refresh the Notification Channels, then select the channel you just created. Click OK.

  6. Click Save.
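
The console form above can also be approximated from Cloud Shell. The following is a minimal sketch, not part of the lab: it writes an alert policy file that mirrors the form values (name, query, 5 min between notifications, 1 hr autoclose). The helper name `write_stopped_vm_policy` and the output file name are hypothetical, and the JSON field names assume the Cloud Monitoring AlertPolicy resource, so check them against the current API reference before relying on them.

```shell
#!/usr/bin/env bash
# Sketch: write a log-based alert policy file matching the console form.
# write_stopped_vm_policy is a hypothetical helper, not a lab or gcloud command.
write_stopped_vm_policy() {
  local out="${1:-stopped-vm-policy.json}"
  # Quoted heredoc delimiter: the JSON is written exactly as shown.
  cat > "$out" <<'EOF'
{
  "displayName": "stopped vm",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "VM stop request logged",
      "conditionMatchedLog": {
        "filter": "resource.type=\"gce_instance\" protoPayload.methodName=\"v1.compute.instances.stop\""
      }
    }
  ],
  "alertStrategy": {
    "notificationRateLimit": { "period": "300s" },
    "autoClose": "3600s"
  }
}
EOF
  # Print the path so callers can pass it on to another command.
  printf '%s\n' "$out"
}
```

After calling `write_stopped_vm_policy`, a command along the lines of `gcloud alpha monitoring policies create --policy-from-file=stopped-vm-policy.json` should create the policy. Notification channels are not included in this sketch and would still need to be attached.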

Click Check my progress to verify the objective.

Create the Log-based alert

You will now cause your VM to stop.

  1. Go to the 2nd Cloud Console tab, and navigate to Navigation menu > Compute Engine > VM instances.

  2. Check the box next to instance1, then click Stop at the top of the page, then click Stop again in the pop-up window. The green check mark will turn to a gray circle when the instance has been stopped.

  3. In the Search bar, type "monitoring", then choose the Monitoring option.

  4. Click on the Alerting tab. You'll see that your alert has registered. Under Alert Policies click the See all policies link and you'll see the log-based alert you created listed.
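
If you prefer the command line, the stop in step 2 can be sketched as below. This is an illustration under assumptions: the helper name `stop_vm_and_check` is hypothetical, and you need to supply the zone of instance1 yourself because it is not stated here. A stopped Compute Engine instance reports the status `TERMINATED`.

```shell
#!/usr/bin/env bash
# Sketch: stop a VM and confirm it reports TERMINATED.
# stop_vm_and_check is a hypothetical helper; pass the instance name and its zone.
stop_vm_and_check() {
  local instance="$1" zone="$2" status=""
  # Blocks until the stop operation finishes (gcloud's default behavior).
  gcloud compute instances stop "$instance" --zone="$zone" || return 1
  status="$(gcloud compute instances describe "$instance" \
    --zone="$zone" --format='value(status)')"
  printf '%s\n' "$status"
  # Succeed only if the instance is actually stopped.
  [[ "$status" == "TERMINATED" ]]
}
```

For example, `stop_vm_and_check instance1 "$ZONE"` assumes `ZONE` already holds the instance's zone. The stop request still produces the audit log entry that the log-based alert matches.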

Task 2. Log-based metric

Using log-based metrics you can define a metric that tracks errors in the logs to proactively respond to similar problems and symptoms before they are noticed by end users.

  1. At the beginning of the lab you deployed a standard GKE cluster. Run the following command to ensure that the cluster named gmp-cluster has been created:
gcloud container clusters list

If your cluster status says PROVISIONING, wait a moment and run the command above again. Repeat until the status is RUNNING.

  2. Authenticate to the cluster:
gcloud container clusters get-credentials gmp-cluster

You should see the following message:

Fetching cluster endpoint and auth data.
kubeconfig entry generated for gmp-cluster.
  3. Create a namespace to work in:
kubectl create ns gmp-test
  4. Now run the following to deploy a simple application that emits metrics at the /metrics endpoint:
kubectl -n gmp-test apply -f https://storage.googleapis.com/spls/gsp091/gmp_flask_deployment.yaml
kubectl -n gmp-test apply -f https://storage.googleapis.com/spls/gsp091/gmp_flask_service.yaml
  5. Verify that the service has been created and has an external IP address:
kubectl get services -n gmp-test

You should see the following:

NAME    TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)        AGE
hello   LoadBalancer   10.0.12.114   34.83.91.157   80:32058/TCP   71s
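
The external IP can take a minute or so to appear. Instead of re-running the command by hand, you could poll for it. The following is a sketch only: `wait_for_external_ip` is a hypothetical helper, and the retry count and delay defaults are arbitrary. It reuses the same jsonpath expression that this lab uses later to look up the address.

```shell
#!/usr/bin/env bash
# Sketch: poll until a LoadBalancer service in the namespace has an external IP.
# Usage: wait_for_external_ip NAMESPACE [RETRIES] [DELAY_SECONDS]
# wait_for_external_ip is a hypothetical helper, not a kubectl feature.
wait_for_external_ip() {
  local ns="$1" retries="${2:-30}" delay="${3:-10}" ip="" i
  for ((i = 0; i < retries; i++)); do
    ip="$(kubectl get services -n "$ns" \
      -o jsonpath='{.items[*].status.loadBalancer.ingress[0].ip}' 2>/dev/null)"
    if [[ -n "$ip" ]]; then
      printf '%s\n' "$ip"   # print the address and stop polling
      return 0
    fi
    sleep "$delay"
  done
  return 1                  # no address appeared within the retry budget
}
```

For example, `IP="$(wait_for_external_ip gmp-test)"` stores the address once it is assigned, or fails after roughly five minutes with the defaults above.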

Click Check my progress to verify the objective.

Deploy the simple application that emits metrics

  6. Re-run the kubectl get services command until the EXTERNAL-IP column is populated.

  7. Check that the Python Flask app is serving metrics with the following command:

curl $(kubectl get services -n gmp-test -o jsonpath='{.items[*].status.loadBalancer.ingress[0].ip}')/metrics

You should see the following:

# HELP flask_exporter_info Multiprocess metric
# TYPE flask_exporter_info gauge
flask_exporter_info{version="0.18.5"} 1.0
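
The lines beginning with `#` are HELP and TYPE comments in the Prometheus text format; the remaining lines are the actual samples. If you only want the samples, a small filter such as the sketch below can help. The function name `metric_samples` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read Prometheus text-format output on stdin and keep only sample
# lines, dropping "# HELP" / "# TYPE" comments and blank lines.
# metric_samples is a hypothetical helper name.
metric_samples() {
  grep -v -e '^#' -e '^[[:space:]]*$'
}
```

You could pipe the curl command above into it, for example by appending `| metric_samples` to that command.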

Task 3. Create a log-based metric

  1. Return to Logs Explorer.

  2. Click Create metric link.

  3. On the Create metric page, input the following:

  • Metric type: leave the default setting, Counter

  • Log based metric name: hello-app-error

  • Filter selection: enter the following in the Build filter box:

severity=ERROR
resource.labels.container_name="hello-app"
textPayload: "ERROR: 404 Error page not found"

  4. Click Create metric.
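
The same counter metric can be sketched with the gcloud CLI. This is an illustration, not a lab step: the helper name `create_hello_app_error_metric` and the description text are made up here. The three filter lines from the console are joined with explicit `AND`, which is how the Logging query language treats separate lines implicitly.

```shell
#!/usr/bin/env bash
# Sketch: create the hello-app-error counter metric from Cloud Shell.
# The filter is the console filter with the line breaks written as explicit ANDs.
HELLO_APP_ERROR_FILTER='severity=ERROR AND resource.labels.container_name="hello-app" AND textPayload:"ERROR: 404 Error page not found"'

# create_hello_app_error_metric is a hypothetical helper name.
create_hello_app_error_metric() {
  gcloud logging metrics create hello-app-error \
    --description="Counts 404 errors logged by hello-app" \
    --log-filter="$HELLO_APP_ERROR_FILTER"
}
```

Running `gcloud logging metrics list` afterwards should show hello-app-error among the user-defined metrics. If you already created the metric in the console, the create command would fail because the name is taken.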

Click Check my progress to verify the objective.

Create the log-based metric

Task 4. Create a metrics-based alert

  1. In the left pane of the Logging window, select Log-based Metrics. Then, under User-defined metrics, click the three vertical dots next to the hello-app-error metric and select Create alert from metric.

  2. Under Select a Metric, the metric parameters will automatically fill in.

  • Update the Rolling window to 2 min.

  • Accept the other default settings

  • Click Next.

  3. You will need to set Notifications. Feel free to re-use the channel you created earlier in the lab.

  4. Name the alert policy log based metric alert.

  5. Click Create Policy.
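
For comparison with the log-based alert from Task 1, a metric-based policy uses a threshold condition on the metric rather than a log filter. The sketch below writes such a policy file. Several values are assumptions and not taken from the lab: the helper name `write_metric_alert_policy`, the `k8s_container` resource type, the `ALIGN_RATE` aligner, and the threshold of 0. Only the 2 min (120s) window and the policy name come from the steps above. User-defined log-based metrics appear in Monitoring under the `logging.googleapis.com/user/` prefix.

```shell
#!/usr/bin/env bash
# Sketch: write a metric-based alert policy file for hello-app-error.
# write_metric_alert_policy is a hypothetical helper. The resource type,
# aligner, and threshold below are assumptions, not the console defaults.
write_metric_alert_policy() {
  local out="${1:-log-based-metric-alert.json}"
  cat > "$out" <<'EOF'
{
  "displayName": "log based metric alert",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "hello-app-error above threshold",
      "conditionThreshold": {
        "filter": "metric.type=\"logging.googleapis.com/user/hello-app-error\" AND resource.type=\"k8s_container\"",
        "aggregations": [
          { "alignmentPeriod": "120s", "perSeriesAligner": "ALIGN_RATE" }
        ],
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0,
        "duration": "0s"
      }
    }
  ]
}
EOF
  printf '%s\n' "$out"
}
```

The design difference is worth noting: a `conditionMatchedLog` policy fires on each matching entry, whereas a `conditionThreshold` policy evaluates an aggregated time series over the rolling window.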

Click Check my progress to verify the objective.

Create the metrics-based alert

Task 5. Generate some errors

Next you'll generate some errors to match the log-based metric you created and trigger the metric-based alert.

  1. In Cloud Shell, run the following to generate some errors:
timeout 120 bash -c -- 'while true; do curl $(kubectl get services -n gmp-test -o jsonpath="{.items[*].status.loadBalancer.ingress[0].ip}")/error; sleep $((RANDOM % 4)); done'

  2. Return to the Logs Explorer page, and go to the Severity section on the lower left side. Click on the Error severity. Now you can search for the 404 Error page not found error. View more information by expanding one of the 404 Error messages.

  3. Return to the Monitoring page, and click on Alerting. You will see the 2 policies you created.

  4. Click on the Alert policies link, and you should see both alerts in the Incidents section. Click on an incident to see details.

Note: The log-based metric alert will eventually resolve itself. If you need more time to investigate, run the errors script again and wait for the alert to be triggered again.
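
To confirm from Cloud Shell that the error entries actually reached Cloud Logging, you could count recent matches. This is a rough sketch: `count_recent_404s` is a hypothetical helper, the 10 minute freshness window is arbitrary, and the count assumes each matching entry has a single-line text payload.

```shell
#!/usr/bin/env bash
# Sketch: count recent 404 error entries in Cloud Logging.
# count_recent_404s is a hypothetical helper; 10m is an arbitrary lookback.
count_recent_404s() {
  local filter='severity=ERROR AND textPayload:"404 Error page not found"'
  # value(textPayload) prints one payload per entry; grep -c counts the lines.
  gcloud logging read "$filter" --freshness=10m --format='value(textPayload)' \
    | grep -c '404 Error'
}
```

A count of zero shortly after running the error loop may just mean the entries have not been ingested yet, so try again after a short wait.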

Click Check my progress to verify the objective.


Solution of Lab

export ZONE=

curl -LO https://raw.githubusercontent.com/quiccklabs/Labs_solutions/master/Creating%20and%20Alerting%20on%20Logs%20based%20Metrics/quicklabgsp091.sh
chmod +x quicklabgsp091.sh
./quicklabgsp091.sh