Elasticsearch ILM VS. OpenSearch ISM Policy, Instructions & More

Opster Expert Team - Gustavo

Last updated: Jan 23, 2023

| 3 min read

Opster Team

Last updated: Jan 23, 2023

| 3 min read

In addition to reading this guide, we recommend you run the Elasticsearch Health Check-Up. It will detect issues and improve your Elasticsearch performance by analyzing your shard sizes, threadpools, memory, snapshots, disk watermarks and more.

The Elasticsearch Check-Up is free and requires no installation.

To manage all aspects of your OpenSearch operation, you can use Opster’s Management Console (OMC). The OMC makes it easy to orchestrate and manage OpenSearch in any environment. Using the OMC you can deploy multiple clusters, configure node roles, scale cluster resources, manage certificates and more – all from a single interface, for free. Check it out here.

Introduction

Over time we’ve seen that some of the Elasticsearch functions remain the same in OpenSearch, while others have begun to change. This is mainly because of the X-Pack features in Elasticsearch, so OpenSearch began to take its own path.

One of these features is how to handle the indices across the cluster, and when to delete them. This is commonly known as Data Retention, but Elasticsearch and Opensearch go one step further, also defining where the data should go before being deleted.

Elasticsearch calls it ILM (Index Lifecycle Management), and Opensearch calls it ISM (Index State Management). The goal in both is the same, but we will see that the execution is different.

We will first compare the similarities, differences, and then we will create a simple hot-warm-delete schema in Opensearch.

Similarities between ILM and ISM

Both ILM and ISM work based on rolled over indices (Data Streams make this job easier).
Indices can be moved based on index age, size, or number of documents.
Both support shrinking, merge segments, changing of replicas, close index, index priority, read only, allocation and snapshot.

Differences between ILM and ISM

In OpenSearch, hot/warm node types are configured as node attributes (this is true also for older Elasticsearch versions). Currently in Elasticsearch, hot/warm node types are configured as node roles.
Elasticsearch ILM comes with hot/warm/cold/frozen/delete phases defined out of the box. In Opensearch you need to create and name the phases yourself.
Opensearch exposes actions, states and transitions that you have to configure in the ISM. Elasticsearch requires phases, and actions within each phase.
Opensearch ISM supports notifications by Slack, Email, Chime, etc, for each of the phase transitions and errors. In Elasticsearch, you cannot create notifications from the ILM configuration.
In ISM you can assign indices directly from the policy (or manually, index template is being deprecated). The policy stores which indices will apply, while in Elasticsearch the index itself stores the policy, but the policy doesn’t store the indices. Elasticsearch allows index template, or manual Policy assignment.

Creating an ISM Policy

We will create a policy with the following requirements:

Rollover the index after 30 days or 50gb primary shard size to avoid oversharding the index.
Stay in the hot zone for 7 days.
Move to the warm zone and stay there for 15 days, we will reduce replicas to 0 during this phase.
Delete and notify via email when the index was deleted.

We will use the datalogs-* pattern we created in the Data Streams article.

Run the following in Dev Tools to have the policy created. Make sure to replace the channel IDs with your own channels. For instructions on how to create your own channels, follow this guide. You can also refer to the channel creation docs here.

PUT _plugins/_ism/policies/example_hwd
{
  "policy": {
    "description": "Hot/Warm/Delete example",
    "schema_version": 1,
    "error_notification": {
      "channel": {
        "id": "tmHzgYEB62Ttjfftxmj-"
      },
      "message_template": {
        "source": "Index {{ctx.index}} failed",
        "lang": "mustache"
      }
    },
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          {
            "retry": {
              "count": 3,
              "backoff": "exponential",
              "delay": "1m"
            },
            "rollover": {
              "min_index_age": "30d",
              "min_primary_shard_size": "50gb"
            }
          }
        ],
        "transitions": [
          {
            "state_name": "warm",
            "conditions": {
              "min_rollover_age": "7d"
            }
          }
        ]
      },
      {
        "name": "warm",
        "actions": [
          {
            "retry": {
              "count": 3,
              "backoff": "exponential",
              "delay": "1m"
            },
            "replica_count": {
              "number_of_replicas": 0
            }
          }
        ],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": {
              "min_rollover_age": "15d"
            }
          }
        ]
      },
      {
        "name": "delete",
        "actions": [
          {
            "retry": {
              "count": 3,
              "backoff": "exponential",
              "delay": "1m"
            },
            "notification": {
              "channel": {
                "id": "tmHzgYEB62Ttjfftxmj-"
              },
              "message_template": {
                "source": "Index: {{ctx.index}} Deleted",
                "lang": "mustache"
              }
            }
          }
        ],
        "transitions": []
      }
    ],
    "ism_template": [
      {
        "index_patterns": [
          "datalogs-*"
        ],
        "priority": 100
      }
    ]
  }
}

You can edit it visually by going to “Index policies” under “Index Management” in the left menu.

Notes and good things to know

Each data stream can only have one ISM policy, so it will fail if two policies with the same priority affect the same indices/data streams. Priority was increased to 100 to avoid collisions.

Also note that we are assuming all nodes are equal in terms of hardware, so we don’t really care about allocating our data in different nodes when moving shards from one phase to another (e.g: hot to warm).

In this article we review how to build the architecture to support shard allocation in our ISM policy. In this way, we can use our hardware efficiently and save costs with the least possible impact, moving the more active data to the fastest hardware, and the least searched to the slower hardware.

Conclusion

Both Elasticsearch with ILM and Opensearch with ISM offer simple ways to handle our indices’ data over time. We can move data across nodes to save costs, and move the oldest data to the cheapest hardware by configuring the most rigorous retention policies. With both tools we just have the initial configuration steps, and then the engine will take care of moving the data around.

Elasticsearch made some progress on formalizing the hot-warm infrastructure, creating specific nodes for it, and Opensearch integrated alerts within the ISM policies, which is very useful. We should be seeing more new features in both tools to make this process smoother and more efficient every day.

Elasticsearch Elasticsearch ILM VS. OpenSearch ISM Policy – Comparison, Explanation and Instructions (OpenSearch ILM)

Quick links

Introduction

Similarities between ILM and ISM

Differences between ILM and ISM

Creating an ISM Policy

Notes and good things to know

Conclusion

Find & fix Elasticsearch problems

Try AutoOps to find & fix Elasticsearch problems

Related Articles