Why and How to Use Audit Logs to Secure Kubernetes

Out of the box, Kubernetes doesn’t offer security monitoring or threat detection features. It expects you – the cluster admin – to monitor for and react to security issues yourself.

However, Kubernetes does provide a very important tool for helping to detect potential security events in the form of audit logs. By systematically recording details about access requests that are issued to the Kubernetes API, audit logs provide a centralized resource that you can use to detect suspicious activity across your entire cluster.

This article defines Kubernetes audit logs, explains how to work with them, and provides an example of using audit logs to track security events.

What Is a Kubernetes Audit Log?

Put simply, a Kubernetes audit log is a chronological record of the audit events captured by the Kubernetes API server’s auditing subsystem.

The purpose of audit logs is to help cluster admins track which requests were initiated, who initiated them, which resource(s) were affected, and the result of each request.

Thus, by logging and analyzing audit data, you can gain early visibility into potential security issues within your cluster, such as requests for unauthorized access to resources or requests that are initiated by unknown users or services. Audit logs can also be extremely useful when you’re researching an existing security breach (although, hopefully, you’ll catch problems before they reach that point!).

Audit Logs Only Exist in Kubernetes if You Create Them

Although it’s common to hear folks talk about “Kubernetes audit logs,” this nomenclature is a bit misleading because Kubernetes doesn’t create a specific audit log per se. In other words, it’s not as if Kubernetes automatically logs all audit events to a specific file that you can simply open or tail to keep track of security events.

Instead, Kubernetes provides facilities that admins can optionally use to record security events and stream them to a backend of their choosing. Thus, you can create various types of audit logs in Kubernetes, but their exact nature will depend on the configuration you set up – and until you set one up, no audit log exists at all.

Kubernetes Audit Events and Stages

Kubernetes registers auditing data based on two key concepts: audit events and stages.

An audit event is generated for each request to the API server, and stages correspond to the steps the server passes through as it handles that request.

There are four possible “stages” for each event:

  • RequestReceived: At this stage, the API server has received the request but has not started processing it yet.
  • ResponseStarted: The server has sent the response headers but has not yet sent the response body. This stage occurs only for long-running requests, such as watches.
  • ResponseComplete: The server has finished processing the request and has sent a response.
  • Panic: This stage happens when the API server “panics” in response to a request.

You can tell Kubernetes to record information to an audit log for every stage of every audit event that takes place within your cluster. That way, you can track not just when security-relevant requests occur, but also how they are handled.

This granularity is beneficial because it helps you evaluate the severity of security incidents. For instance, a potentially malicious request that the API server denies is less worrisome than one where the server sends back a response granting the request. You’ll probably want to know about both types of events, but you’ll prioritize the latter. Kubernetes auditing makes it easy to tell the difference.
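
For example, the smallest useful policy (a sketch, in the same Policy format covered later in this article) records every request at every stage; because it sets no omitStages, the API server emits an event for each of the stages listed above:

apiVersion: audit.k8s.io/v1
kind: Policy
# No omitStages, so every stage of every request generates an audit event.
rules:
  - level: Metadata  # record request metadata (user, verb, resource) for all requests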

Audit Log Format

By default, Kubernetes generates data about each audit event in JSON format. An example event that you can store in a log file might look like this:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1beta1",
  "metadata": {
    "creationTimestamp": "2018-10-08T08:26:55Z"
  },
  "level": "Request",
  "timestamp": "2018-10-08T08:26:55Z",
  "auditID": "288ace59-97ba-4121-b06e-f648f72c3122",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/pods?limit=500",
  "verb": "list",
  "user": {
    "username": "admin",
    "groups": ["system:authenticated"]
  },
  "sourceIPs": ["10.0.138.91"],
  "objectRef": {
    "resource": "pods",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": {},
    "code": 200
  },
  "requestReceivedTimestamp": "2018-10-08T08:26:55.466934Z",
  "stageTimestamp": "2018-10-08T08:26:55.471137Z",
  "annotations": {
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": "RBAC: allowed by
    
    ClusterRoleBinding "admin-cluster-binding" of ClusterRole "cluster-
    admin" to User "admin""
  
  }
}Code language: PHP (php)
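
Because each event is a single JSON object, standard JSON tooling works well for ad hoc inspection. As a quick sketch (assuming events are written one per line to /var/log/audit.log, as configured in the next section), jq can summarize who did what and how it turned out:

jq -c '{user: .user.username, verb: .verb, uri: .requestURI,
        decision: .annotations["authorization.k8s.io/decision"],
        code: .responseStatus.code}' /var/log/audit.log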

Enabling Auditing in the Kubernetes API Server

Again, while Kubernetes provides the facilities for logging audit events, it doesn’t actually log them for you by default. You need to enable and configure this feature in order to generate audit logs.

To do this, you need to specify the location of two files in your API server config:

--audit-policy-file=/etc/kubernetes/audit-policy.yaml \
--audit-log-path=/var/log/audit.log

Here, audit-policy.yaml is the file that defines which audit events to log and how to log them, while audit.log is the location (on your master node) where the log data is actually stored.
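
If you run the API server as a static pod (the kubeadm default), these flags belong in the kube-apiserver manifest, and the policy file and log directory must also be mounted into the API server container. The API server can also rotate the audit log for you; a sketch with illustrative retention values:

--audit-log-maxage=30     # days to keep old audit log files
--audit-log-maxbackup=10  # number of rotated log files to retain
--audit-log-maxsize=100   # size in MB at which the log file is rotated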

Defining the Audit Policy File

You typically wouldn’t want to log every single API request that happens within your cluster. Doing so would leave you with so much data that it becomes hard to filter out the noise.

That’s why you create an audit policy file. An audit policy file is a YAML-formatted file that specifies which events to log and how much data to record about each one. Each rule assigns one of four levels: None (don’t log matching requests), Metadata (log only request metadata, such as the user, verb, and resource), Request (additionally log the request body), or RequestResponse (additionally log both the request and response bodies).

For example:

apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log" and "pods/status" access at the Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]
  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]
  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"
  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]
  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]
  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"Code language: PHP (php)

As the comments indicate, this policy file narrows the set of audit events that are recorded. It skips the RequestReceived stage entirely, logs pod changes in full request/response detail, drops routine noise such as kube-proxy watch requests, and records everything else only at the Metadata level.

Detecting Security Events with Audit Logs

As a simple example of Kubernetes auditing in action, imagine that we’ve created an auditing policy file that looks like this:

apiVersion: audit.k8s.io/v1beta1
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  - level: Request
    users: ["admin"]
    resources:
      - group: ""
        resources: ["*"]
  - level: Request
    users: ["system:anonymous"]
    resources:
      - group: ""
        resources: ["*"]

Using this auditing configuration, we can detect when a new user who is not associated with an existing Role or ClusterRole issues a request.

For instance, imagine that the user tries to list pods with:

kubectl get pods

Because the user doesn’t have permission to list pods, the API server will deny the request (kubectl will respond with “Error from server (Forbidden): pods is forbidden: User …”).
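
You can confirm what the audit log will show before it happens by checking a user’s permissions with kubectl’s impersonation support (jane is a hypothetical username here, and your own account needs impersonation rights):

kubectl auth can-i list pods --as=jane
# Output: no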

At the same time, the API server will record an audit event that looks something like this:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1beta1",
  "metadata": {
    "creationTimestamp": "2018-10-08T10:00:20Z"
  },
  "level": "Request",
  "timestamp": "2018-10-08T10:00:20Z",
  "auditID": "5fc5eab3-82f5-480f-93d2-79bfb47789f1",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/pods?limit=500",
  "verb": "list",
  "user": {
    "username": "system:anonymous",
    "groups": ["system:unauthenticated"]
  },
  "sourceIPs": ["10.0.141.137"],
  "objectRef": {
    "resource": "pods",
    "namespace": "default",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": {},
    "status": "Failure",
    "reason": "Forbidden",
    "code": 403
  },
  "requestReceivedTimestamp": "2018-10-08T10:00:20.605009Z",
  "stageTimestamp": "2018-10-08T10:00:20.605191Z",
  "annotations": {
    "authorization.k8s.io/decision": "forbid",
    "authorization.k8s.io/reason": ""
  }
}

By tracking the audit log, then, admins can detect requests like this one, alerting them to activity from a user who shouldn’t have access – or who perhaps shouldn’t exist at all.
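
In practice you’d scan for such events rather than spot them by eye. A quick sketch (again assuming newline-delimited JSON events in /var/log/audit.log) that surfaces every denied request along with who sent it:

jq -c 'select(.annotations["authorization.k8s.io/decision"] == "forbid")
       | {user: .user.username, verb: .verb, uri: .requestURI}' /var/log/audit.log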

Getting the Most from Kubernetes Audit Logs

In a large-scale cluster where the API server handles hundreds or thousands of requests each hour, it’s not practical to tail the audit log by hand in order to detect potential risks.

Instead, you’ll want to stream the event log data to a detection tool that can automatically monitor audit events and generate alerts when something looks awry.

There are two ways to do this:

  • Monitor the audit log file directly: You can configure your intrusion detection tool to monitor the audit log file directly on your master node. You’ll typically need the tool to be running on the master node for this to work, however, which may not be desirable because it increases the load placed on the master. (Of course, you could try to collect the log file from the master node and send it to an external tool using a logging agent that runs locally, but that doesn’t really solve the issue because you still have to run extra software – the logging agent – on the master.)
  • Send events over HTTP: You can use webhooks to send event data to an external security tool over HTTP. This way, your security tool can run entirely separately from your cluster.

For example, in order to stream audit event data to Falco, the open source runtime security tool, you would point the API server at a webhook configuration file – a kubeconfig-format file that identifies Falco as the audit backend:

apiVersion: v1
kind: Config
clusters:
- name: falco
  cluster:
    server: http://$FALCO_SERVICE_CLUSTERIP:8765/k8s_audit
contexts:
- context:
    cluster: falco
    user: ""
  name: default-context
current-context: default-context
preferences: {}
users: []
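
The API server picks this file up through its --audit-webhook-config-file flag. A sketch (the path is an assumption; the batching flags are optional tuning):

--audit-webhook-config-file=/etc/kubernetes/audit-webhook-config.yaml
--audit-webhook-batch-max-size=10   # max events per webhook batch
--audit-webhook-batch-max-wait=5s   # max delay before a partial batch is sent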

Under this setup, you can use Falco (hosted on a server of your choice) to alert you to security events as they happen. You don’t need to worry about monitoring the Kubernetes audit file directly or running security software directly within your cluster.

Auditing is an essential part of any Kubernetes security strategy. Although audit logs – like many things in Kubernetes – are a bit complex to set up and manage, they are also highly configurable. Kubernetes gives you great control over exactly which types of auditing data you record, where it is stored, and how you work with it.

By being strategic in determining which types of audit events to log (avoid noise!), as well as integrating audit data with intrusion detection tools (like Falco) that can alert you to potential risks in real time, you maximize your ability to find and fix Kubernetes security threats before they turn into major problems.