Elasticsearch
This section presents the Elasticsearch-related actions that you may need to perform while updating/upgrading OKA.
Backup
First, ensure that the Elasticsearch database is running and accessible.
- Backup your Elasticsearch database. To learn more about Elasticsearch snapshots, see How to create a snapshot.
Note: snapshots are incremental, so not all data is saved during each snapshot/backup process.
You must configure your snapshot directory (here we chose /dir_path/backup_repo) before taking a snapshot for the first time:

Create the snapshot directory:

mkdir -p /dir_path/backup_repo

Grant read/write access to the directory for Elasticsearch to store snapshots:

sudo chown elasticsearch: /dir_path/backup_repo

Add this line to /etc/elasticsearch/elasticsearch.yml:

path.repo: /dir_path/backup_repo

Restart Elasticsearch:

sudo systemctl restart elasticsearch

Query Elasticsearch to register the snapshot repository (in our case it points to /dir_path/backup_repo; the directory does not need to exist beforehand, Elasticsearch will create it automatically). You can replace my_backup by the name you want:

curl -X PUT "http://host:port/_snapshot/my_backup" -H 'Content-Type: application/json' -d '
{
  "type": "fs",
  "settings": {
    "location": "/dir_path/backup_repo",
    "compress": "true"
  }
}'
Backup your Elasticsearch database (take a snapshot). We recommend creating a snapshot whose name includes the current date (see the example below with $(date +%Y%m%d)). The wait_for_completion=true option makes the command synchronous: it waits until the backup has completed:

curl -X PUT "http://host:port/_snapshot/my_backup/snapshot_oka_$(date +%Y%m%d)?wait_for_completion=true" -H 'Content-Type: application/json' -d '
{
  "ignore_unavailable": true,
  "include_global_state": true,
  "metadata": {
    "taken_by": "your_name",
    "taken_because": "backup before upgrading OKA"
  }
}'
If you don’t include the wait_for_completion=true option (making the above command asynchronous), you can still check the completion of your snapshot with:

SNAPSHOT_NAME=snapshot_oka_$(date +%Y%m%d)
curl -s "http://host:port/_snapshot/my_backup/${SNAPSHOT_NAME}"
On completion, check the "state" of the backup in the output; it should be "SUCCESS".
Note
If your Elasticsearch is secured with HTTPS (recommended), you need to replace http:// with https:// in the
above commands.
If you also need to specify a username and password to connect (recommended), add the credentials
with -u username:password on the curl command.
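For example, a sketch of the snapshot-status query over HTTPS with basic authentication (the username, password, and CA certificate path are placeholders to adapt to your deployment):

# username, password and /path/to/ca.crt are placeholders
curl -s -u username:password --cacert /path/to/ca.crt "https://host:port/_snapshot/my_backup/${SNAPSHOT_NAME}"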
Warning
The default behavior for selecting indices in a snapshot operation is to include all indices ("indices": "*").
To snapshot only specific indices, set the "indices" field to the list of desired index names, separated by commas
("indices": "index1,index2,index3").
OKA’s main indices that should be backed up can be seen per cluster directly on the Cluster page.
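For example, a sketch of the snapshot command restricted to specific indices (index1, index2, index3 are placeholders for your actual OKA index names):

curl -X PUT "http://host:port/_snapshot/my_backup/snapshot_oka_$(date +%Y%m%d)?wait_for_completion=true" -H 'Content-Type: application/json' -d '
{
  "indices": "index1,index2,index3",
  "ignore_unavailable": true,
  "include_global_state": true
}'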
Note
Snapshots must be identified by unique names (${SNAPSHOT_NAME}). The above command uses the date of the snapshot as a unique identifier (snapshot_oka_$(date +%Y%m%d)).
Note
You can add multiple metadata entries to your snapshot. We recommend adding at least your name and the reason why the snapshot was taken.
Note
Over time, snapshot repositories can accumulate stale data that is no longer referenced by existing snapshots.
Use this command to clean the snapshot repository:

curl -X POST "http://host:port/_snapshot/my_backup/_cleanup"

In case you need to copy the snapshot to another Elasticsearch (e.g., a preproduction VM), we recommend running this command first.
This command can take a long time to complete.
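If you do copy the snapshot repository to another server, a minimal sketch with rsync (the destination host and paths are placeholders; the target directory must be owned by elasticsearch and registered in path.repo on the destination server, as described above):

rsync -a /dir_path/backup_repo/ user@preprod-host:/dir_path/backup_repo/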
Restore
Important
If restoring to a new version of Elasticsearch, repeat the 4 steps from "configure your snapshot directory" above for this new version.
# Here we restore all indices, apart from the system indices (``-.*`` excludes all indices whose name starts with ``.``)
curl -X POST "http://host:port/_snapshot/my_backup/${SNAPSHOT_NAME}/_restore" -H 'Content-Type: application/json' -d '
{
"indices": "*,-.*",
"ignore_unavailable": true,
"include_global_state": false,
"include_aliases": true
}'
Note
Modify the snapshot name (${SNAPSHOT_NAME}) to match the one provided during the backup procedure.
You can list the available snapshots in a repository with:
curl -X GET "http://host:port/_cat/snapshots/my_backup"
Note
You can make this command synchronous by adding ?wait_for_completion=true after _restore.
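For example, the synchronous variant of the restore call above:

curl -X POST "http://host:port/_snapshot/my_backup/${SNAPSHOT_NAME}/_restore?wait_for_completion=true" -H 'Content-Type: application/json' -d '
{
  "indices": "*,-.*",
  "ignore_unavailable": true,
  "include_global_state": false,
  "include_aliases": true
}'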
Migrating from Elasticsearch 7 to Elasticsearch 9
Important
Elasticsearch does not support migrating directly from version 7 to version 9; you will need to migrate to version 8 first, then to version 9.
First of all, create a snapshot of your Elasticsearch database in v7.
On your new Elasticsearch server, install Elasticsearch 8 (see Elasticsearch).
Copy your snapshot from your v7 server to your v8 server, and import your snapshot in v8 (see above on how to restore).
Check deprecations. You should see deprecations in the output of the following command:
curl -X GET "http://localhost:9200/_migration/deprecations?pretty"
You can then use the following script to reindex the indices so that they are v8 compatible while remaining accessible by OKA (via the creation of aliases):
reindex.oka.sh
#!/bin/bash
################################################################################
# Copyright (c) 2017-2025 UCit SAS
# All Rights Reserved
#
# This software is the confidential and proprietary information
# of UCit SAS ("Confidential Information").
# You shall not disclose such Confidential Information
# and shall use it only in accordance with the terms of
# the license agreement you entered into with UCit.
################################################################################

# ================================================================
# OKA Elasticsearch Reindexing Script for ES9 Migration
# ================================================================
# This script reindexes all Elasticsearch indices to make them
# compatible with Elasticsearch 9.x
# ================================================================

# Default configuration
ES_HOST="localhost"
ES_PORT="9200"
ES_SCHEME="http"
ES_USER=""
ES_PASSWORD=""
CONFIRM_EACH_INDEX=false
SKIP_ALREADY_REINDEXED=true
LOG_FILE="/tmp/reindex_oka_$(date +%Y%m%d_%H%M%S).log"

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# ================================================================
# Helper Functions
# ================================================================

show_usage() {
    cat << EOF
Usage: $0 [OPTIONS]

Options:
  -h, --host HOST        Elasticsearch host (default: localhost)
  -p, --port PORT        Elasticsearch port (default: 9200)
  -s, --scheme SCHEME    Connection scheme: http or https (default: http)
  -u, --user USER        Elasticsearch username (if authentication required)
  -w, --password PASS    Elasticsearch password (if authentication required)
  -c, --confirm-each     Ask for confirmation before each index reindexing
  --no-skip-reindexed    Don't skip indices that are already reindexed (default: skip them)
  -l, --log-file FILE    Log file path (default: /tmp/reindex_oka_TIMESTAMP.log)
  --help                 Show this help message

Examples:
  # Basic usage (local ES without auth)
  $0

  # ES with HTTPS and authentication
  $0 --scheme https --user elastic --password mypassword

  # Confirm each index before reindexing
  $0 --confirm-each

  # Re-run and only process failed indices
  $0

  # Process all indices even if already reindexed
  $0 --no-skip-reindexed
EOF
    exit 0
}

log() {
    echo -e "$1" | tee -a "${LOG_FILE}"
}

log_success() {
    log "${GREEN}✓ $1${NC}"
}

log_error() {
    log "${RED}✗ $1${NC}"
}

log_warning() {
    log "${YELLOW}⚠ $1${NC}"
}

log_info() {
    log "${BLUE}ℹ $1${NC}"
}

log_step() {
    log "${BLUE} → $1${NC}"
}

# Build curl command with authentication if needed
curl_es() {
    local -a auth_param=()
    if [[ -n "${ES_USER}" ]] && [[ -n "${ES_PASSWORD}" ]]; then
        auth_param=(-u "${ES_USER}:${ES_PASSWORD}")
    fi
    # Longer timeouts for reindex operations
    curl --connect-timeout 10 --max-time 600 -s -k "${auth_param[@]}" "$@"
}

# Get full ES URL
get_es_url() {
    echo "${ES_SCHEME}://${ES_HOST}:${ES_PORT}"
}

# Check if index is already reindexed (ends with -v8)
is_already_reindexed() {
    local index=$1
    [[ "${index}" =~ -v8$ ]]
}

# Check if corresponding v8 index exists
has_v8_version() {
    local old_index=$1
    local new_index="${old_index}-v8"
    local es_url
    es_url=$(get_es_url)
    local check
    check=$(curl_es "${es_url}/_cat/indices/${new_index}?h=index" | tr -d '[:space:]')
    [[ "${check}" = "${new_index}" ]]
}

# ================================================================
# Main Reindexing Function
# ================================================================

reindex_with_smart_alias() {
    local old_index=$1
    local new_index="${old_index}-v8"
    local es_url
    es_url=$(get_es_url)

    log "\n========================================="
    log "Processing index: ${old_index}"
    log "========================================="

    # Check if already reindexed
    if [[ "${SKIP_ALREADY_REINDEXED}" = true ]] && has_v8_version "${old_index}"; then
        log_warning "Index already has a -v8 version, skipping"
        return 2
    fi

    # Ask for confirmation if enabled
    if [[ "${CONFIRM_EACH_INDEX}" = true ]]; then
        read -r -p "Reindex this index? (y/n): " confirm
        if [[ "${confirm}" != "y" ]] && [[ "${confirm}" != "Y" ]]; then
            log_warning "Skipped by user"
            return 2
        fi
    fi

    # Step 1: Check if index exists
    log_step "Step 1/10: Checking if index exists..."
    local index_check
    index_check=$(curl_es "${es_url}/_cat/indices/${old_index}?h=index" | tr -d '[:space:]')
    if [[ "${index_check}" = "${old_index}" ]]; then
        log_success "Index exists"
    else
        log_error "Index ${old_index} does not exist, skipping"
        return 1
    fi

    # Step 2: Retrieve existing aliases
    log_step "Step 2/10: Retrieving existing aliases..."
    local alias_json
    alias_json=$(curl_es "${es_url}/${old_index}/_alias")
    local existing_aliases
    existing_aliases=$(echo "${alias_json}" | python3 -c "
import sys, json
try:
    data = json.load(sys.stdin)
    for idx, idx_data in data.items():
        aliases = list(idx_data.get('aliases', {}).keys())
        print(','.join(aliases))
except:
    pass
" 2>/dev/null)

    # Step 3: Determine target aliases
    log_step "Step 3/10: Determining target aliases..."
    local target_aliases=""
    local need_name_alias=false
    if [[ -n "${existing_aliases}" ]] && [[ "${existing_aliases}" != "" ]]; then
        target_aliases="${existing_aliases}"
        log_info "Found existing aliases: ${target_aliases}"
    else
        target_aliases="${old_index}"
        need_name_alias=true
        log_info "No existing aliases, will create alias '${old_index}'"
    fi

    # Step 4: Count documents in old index
    log_step "Step 4/10: Counting documents in old index..."
    local old_count
    old_count=$(curl_es "${es_url}/${old_index}/_count" | python3 -c "import sys, json; print(json.load(sys.stdin).get('count', 0))" 2>/dev/null)
    log_info "Document count: ${old_count}"

    # Step 5: Retrieve settings and mappings
    log_step "Step 5/10: Retrieving index settings and mappings..."
    local index_config
    index_config=$(curl_es "${es_url}/${old_index}")
    log_success "Settings and mappings retrieved"

    # Step 6: Create new index with proper settings including analyzers
    log_step "Step 6/10: Creating new index: ${new_index}..."
    local new_index_config
    new_index_config=$(echo "${index_config}" | python3 -c "
import sys, json
try:
    data = json.load(sys.stdin)
    old_idx = list(data.keys())[0]
    settings = data[old_idx].get('settings', {}).get('index', {})
    mappings = data[old_idx].get('mappings', {})

    # Clean settings (remove those that cannot be set at creation)
    clean_settings = {}

    # Copy important settings including analysis
    allowed_settings = [
        'number_of_shards', 'number_of_replicas', 'refresh_interval',
        'max_result_window', 'analysis', 'similarity',
        'max_ngram_diff', 'max_shingle_diff'
    ]
    for key in allowed_settings:
        if key in settings:
            clean_settings[key] = settings[key]

    # Also check for analysis in the root settings object
    root_settings = data[old_idx].get('settings', {})
    if 'analysis' in root_settings and 'analysis' not in clean_settings:
        clean_settings['analysis'] = root_settings['analysis']

    # Keep original number of shards, set 0 replicas for faster reindexing
    # If number_of_shards is not present in settings, default to 1
    if 'number_of_shards' not in clean_settings:
        clean_settings['number_of_shards'] = 1

    # Always set replicas to 0 during reindexing for performance
    clean_settings['number_of_replicas'] = 0

    result = {
        'settings': {
            'index': clean_settings
        },
        'mappings': mappings
    }
    print(json.dumps(result, indent=2))
except Exception as e:
    import traceback
    print(json.dumps({
        'error': str(e),
        'traceback': traceback.format_exc(),
        'settings': {'number_of_shards': 1, 'number_of_replicas': 0}
    }), file=sys.stderr)
    print(json.dumps({'settings': {'number_of_shards': 1, 'number_of_replicas': 0}}))
" 2>/tmp/python_error_$$.log)

    # Check if Python had errors
    if [[ -s /tmp/python_error_$$.log ]]; then
        log_warning "Python processing had warnings:"
        cat /tmp/python_error_$$.log | tee -a "${LOG_FILE}"
        rm -f /tmp/python_error_$$.log
    fi

    local create_result
    create_result=$(curl_es -X PUT "${es_url}/${new_index}" \
        -H 'Content-Type: application/json' \
        -d "${new_index_config}")
    if echo "${create_result}" | grep -q "acknowledged"; then
        log_success "New index created successfully"
        # Display the number of shards used
        local shards_count
        shards_count=$(echo "${new_index_config}" | python3 -c "import sys, json; d=json.load(sys.stdin); print(d.get('settings', {}).get('index', {}).get('number_of_shards', 'unknown'))" 2>/dev/null)
        log_info "Number of shards: ${shards_count}"
    else
        log_error "Failed to create new index"
        log_error "Response: ${create_result}"
        log_error "Config used:"
        echo "${new_index_config}" | tee -a "${LOG_FILE}"
        # Ask user what to do
        echo ""
        log_warning "An error occurred while creating the index."
        read -r -p "Do you want to (c)ontinue with next index, (r)etry this index, or (a)bort? [c/r/a]: " action
        case ${action} in
            r|R)
                log_info "Retrying..."
                reindex_with_smart_alias "${old_index}"
                return $?
                ;;
            a|A)
                log_error "Aborting script as requested by user"
                exit 1
                ;;
            *)
                log_info "Continuing with next index..."
                return 1
                ;;
        esac
    fi

    # Step 7: Reindex data (ASYNC VERSION WITH PROGRESS TRACKING)
    log_step "Step 7/10: Reindexing data asynchronously (this may take a while)..."
    log_info "Started at: $(date '+%Y-%m-%d %H:%M:%S')"
    local reindex_start
    reindex_start=$(date +%s)

    # Start async reindex
    local reindex_task
    reindex_task=$(curl_es -X POST "${es_url}/_reindex?wait_for_completion=false" \
        -H 'Content-Type: application/json' \
        -d "{
            \"source\": {
                \"index\": \"${old_index}\"
            },
            \"dest\": {
                \"index\": \"${new_index}\",
                \"op_type\": \"create\"
            }
        }")

    # Extract task ID
    local task_id
    task_id=$(echo "${reindex_task}" | python3 -c "import sys, json; print(json.load(sys.stdin).get('task', ''))" 2>/dev/null)

    if [[ -z "${task_id}" ]] || [[ "${task_id}" = "" ]]; then
        log_error "Failed to start reindex task"
        log_error "Response: ${reindex_task}"
        echo ""
        read -r -p "Do you want to (c)ontinue with next index, (r)etry this index, or (a)bort? [c/r/a]: " action
        case ${action} in
            r|R)
                curl_es -X DELETE "${es_url}/${new_index}" >/dev/null 2>&1
                log_info "Retrying..."
                reindex_with_smart_alias "${old_index}"
                return $?
                ;;
            a|A)
                log_error "Aborting script as requested by user"
                exit 1
                ;;
            *)
                log_info "Continuing with next index..."
                return 1
                ;;
        esac
    fi

    log_info "Reindex task ID: ${task_id}"

    # Poll task status
    local completed=false
    local check_interval=5
    local max_wait=3600 # 1 hour max
    local elapsed=0
    local last_progress_time=0
    local reindex_result=""

    while [[ "${completed}" = false ]] && [[ ${elapsed} -lt ${max_wait} ]]; do
        sleep "${check_interval}"
        elapsed=$((elapsed + check_interval))

        local task_status
        task_status=$(curl_es "${es_url}/_tasks/${task_id}")

        local is_completed
        is_completed=$(echo "${task_status}" | python3 -c "import sys, json; print(str(json.load(sys.stdin).get('completed', False)))" 2>/dev/null)

        if [[ "${is_completed}" = "True" ]]; then
            completed=true
            reindex_result="${task_status}"
            break
        fi

        # Show progress every 5 seconds
        if [[ $((elapsed - last_progress_time)) -ge 5 ]]; then
            last_progress_time=${elapsed}
            local progress
            progress=$(echo "${task_status}" | python3 -c "
import sys, json
try:
    data = json.load(sys.stdin)
    status = data.get('task', {}).get('status', {})
    created = status.get('created', 0)
    total = status.get('total', 0)
    if total > 0:
        pct = (created * 100) // total
        print(f'{created}/{total} ({pct}%)')
    else:
        print('In progress...')
except:
    print('Checking...')
" 2>/dev/null)
            log_info "Progress: ${progress} (${elapsed}s elapsed)"
        fi
    done

    if [[ "${completed}" = false ]]; then
        log_error "Reindex task did not complete within ${max_wait}s"
        log_warning "Task ${task_id} may still be running in background"
        log_warning "You can check its status with: curl ${es_url}/_tasks/${task_id}"
        echo ""
        read -r -p "Do you want to (c)ontinue with next index or (a)bort? [c/a]: " action
        case ${action} in
            a|A)
                log_error "Aborting script as requested by user"
                exit 1
                ;;
            *)
                log_info "Continuing with next index..."
                return 1
                ;;
        esac
    fi

    local reindex_end
    reindex_end=$(date +%s)
    local reindex_duration=$((reindex_end - reindex_start))
    log_info "Completed at: $(date '+%Y-%m-%d %H:%M:%S')"
    log_info "Duration: ${reindex_duration} seconds"

    # Extract results from completed task
    local new_count
    new_count=$(echo "${reindex_result}" | python3 -c "
import sys, json
try:
    data = json.load(sys.stdin)
    response = data.get('response', {})
    print(response.get('total', 0))
except:
    print(0)
" 2>/dev/null)

    local failures
    failures=$(echo "${reindex_result}" | python3 -c "
import sys, json
try:
    data = json.load(sys.stdin)
    response = data.get('response', {})
    print(len(response.get('failures', [])))
except:
    print(0)
" 2>/dev/null)

    log_info "Documents reindexed: ${new_count}"
    log_info "Failures: ${failures}"

    if [[ "${failures}" = "0" ]] && [[ "${new_count}" = "${old_count}" ]]; then
        log_success "Reindexing successful: ${new_count} documents"
    elif [[ "${new_count}" = "0" ]]; then
        log_error "No documents were reindexed - possible task failure"
        log_warning "Full task result:"
        echo "${reindex_result}" | python3 -m json.tool 2>/dev/null | tee -a "${LOG_FILE}"
        echo ""
        read -r -p "Do you want to (c)ontinue with next index, (r)etry this index, or (a)bort? [c/r/a]: " action
        case ${action} in
            r|R)
                curl_es -X DELETE "${es_url}/${new_index}" >/dev/null 2>&1
                log_info "Retrying..."
                reindex_with_smart_alias "${old_index}"
                return $?
                ;;
            a|A)
                log_error "Aborting script as requested by user"
                exit 1
                ;;
            *)
                log_info "Continuing with next index..."
                return 1
                ;;
        esac
    else
        log_error "Reindexing issue detected (old: ${old_count}, new: ${new_count}, failures: ${failures})"
        log_warning "Full task result:"
        echo "${reindex_result}" | python3 -m json.tool 2>/dev/null | tee -a "${LOG_FILE}"
        echo ""
        read -r -p "Do you want to (c)ontinue with next index, (r)etry this index, or (a)bort? [c/r/a]: " action
        case ${action} in
            r|R)
                curl_es -X DELETE "${es_url}/${new_index}" >/dev/null 2>&1
                log_info "Retrying..."
                reindex_with_smart_alias "${old_index}"
                return $?
                ;;
            a|A)
                log_error "Aborting script as requested by user"
                exit 1
                ;;
            *)
                log_info "Continuing with next index..."
                return 1
                ;;
        esac
    fi

    # Step 8: Handle aliases and delete old index
    if [[ "${need_name_alias}" = true ]]; then
        # Case: No existing aliases - need to create alias with old index name
        log_step "Step 8/10: Deleting old index before creating alias..."
        local delete_result
        delete_result=$(curl_es -X DELETE "${es_url}/${old_index}")
        if echo "${delete_result}" | grep -q "acknowledged"; then
            log_success "Old index deleted: ${old_index}"
        else
            log_error "Failed to delete old index"
            log_error "Response: ${delete_result}"
            return 1
        fi

        log_step "Step 9/10: Creating alias '${old_index}' pointing to '${new_index}'..."
        local alias_result
        alias_result=$(curl_es -X POST "${es_url}/_aliases" \
            -H 'Content-Type: application/json' \
            -d "{
                \"actions\": [
                    {\"add\": {\"index\": \"${new_index}\", \"alias\": \"${old_index}\"}}
                ]
            }")
        if echo "${alias_result}" | grep -q "acknowledged"; then
            log_success "Alias '${old_index}' created and points to '${new_index}'"
        else
            log_error "Failed to create alias"
            log_error "Response: ${alias_result}"
            return 1
        fi
    else
        # Case: Existing aliases - switch them to new index
        log_step "Step 8/10: Switching existing aliases to new index..."
        log_info "Aliases to switch: ${target_aliases}"

        # Build alias actions JSON
        local alias_actions='{"actions":['
        for alias in $(echo "${target_aliases}" | tr ',' ' '); do
            log_info "  Processing alias: ${alias}"
            alias_actions="${alias_actions}{\"remove\":{\"index\":\"${old_index}\",\"alias\":\"${alias}\",\"must_exist\":false}},"
            alias_actions="${alias_actions}{\"add\":{\"index\":\"${new_index}\",\"alias\":\"${alias}\"}},"
        done
        alias_actions="${alias_actions%,}]}"

        local alias_result
        alias_result=$(curl_es -X POST "${es_url}/_aliases" \
            -H 'Content-Type: application/json' \
            -d "${alias_actions}")
        if echo "${alias_result}" | grep -q "acknowledged"; then
            log_success "Aliases switched to new index"
        else
            log_error "Failed to switch aliases"
            log_error "Response: ${alias_result}"
            return 1
        fi

        log_step "Step 9/10: Deleting old index..."
        local delete_result
        delete_result=$(curl_es -X DELETE "${es_url}/${old_index}")
        if echo "${delete_result}" | grep -q "acknowledged"; then
            log_success "Old index deleted: ${old_index}"
        else
            log_error "Failed to delete old index"
            log_error "Response: ${delete_result}"
        fi
    fi

    # Step 10: Verify aliases
    log_step "Step 10/10: Verifying aliases..."
    for alias in $(echo "${target_aliases}" | tr ',' ' '); do
        local check
        check=$(curl_es "${es_url}/_alias/${alias}" | python3 -c "
import sys, json
try:
    data = json.load(sys.stdin)
    indices = list(data.keys())
    print(','.join(indices))
except:
    pass
" 2>/dev/null)
        if echo "${check}" | grep -q "${new_index}"; then
            log_success "Alias '${alias}' correctly points to ${new_index}"
        else
            log_warning "Issue: Alias '${alias}' does not point to new index"
            log_warning "Current target: ${check}"
        fi
    done

    # Configure replicas
    log_step "Configuring number of replicas to 0..."
    curl_es -X PUT "${es_url}/${new_index}/_settings" \
        -H 'Content-Type: application/json' \
        -d '{"index":{"number_of_replicas":0}}' > /dev/null
    log_success "Replicas configured"

    log_success "Index ${old_index} → ${new_index} completed successfully"
    return 0
}

# ================================================================
# Main Function
# ================================================================

main() {
    local es_url
    es_url=$(get_es_url)

    log "========================================="
    log "  OKA REINDEXING FOR ELASTICSEARCH 9"
    log "========================================="
    log "Date: $(date '+%Y-%m-%d %H:%M:%S')"
    log "Elasticsearch URL: ${es_url}"
    log "Authentication: $([[ -n "${ES_USER}" ]] && echo "Enabled (user: ${ES_USER})" || echo "Disabled")"
    log "Log file: ${LOG_FILE}"
    log "Confirm each index: $([[ "${CONFIRM_EACH_INDEX}" = true ]] && echo "Yes" || echo "No")"
    log "Skip already reindexed: $([[ "${SKIP_ALREADY_REINDEXED}" = true ]] && echo "Yes" || echo "No")"
    log ""

    # Verify Elasticsearch is accessible
    log_step "Checking Elasticsearch connectivity..."
    local es_test
    es_test=$(curl_es "${es_url}/")
    if ! echo "${es_test}" | grep -q "cluster_name"; then
        log_error "Cannot connect to Elasticsearch at ${es_url}"
        log_error "Response: ${es_test}"
        log_error ""
        log_error "Please check:"
        log_error "  - Elasticsearch is running: systemctl status elasticsearch"
        log_error "  - Host and port are correct: ${ES_HOST}:${ES_PORT}"
        log_error "  - Scheme (http/https) is correct: ${ES_SCHEME}"
        log_error "  - Network connectivity: ping ${ES_HOST}"
        if [[ -n "${ES_USER}" ]]; then
            log_error "  - Credentials are valid: user=${ES_USER}"
        fi
        exit 1
    fi
    log_success "Connected to Elasticsearch"

    local es_version
    es_version=$(echo "${es_test}" | python3 -c "import sys, json; print(json.load(sys.stdin)['version']['number'])" 2>/dev/null)
    log "Elasticsearch version: ${es_version}"
    log ""

    # Create safety snapshot
    log_warning "IMPORTANT: Creating safety snapshot..."
    local snapshot_name
    snapshot_name="before_reindex_$(date +%Y%m%d_%H%M%S)"
    log_step "Creating snapshot: ${snapshot_name}"
    local snapshot_result
    snapshot_result=$(curl_es -X PUT "${es_url}/_snapshot/oka_backup/${snapshot_name}?wait_for_completion=true" \
        -H 'Content-Type: application/json' \
        -d '{
            "indices": "*,-.*",
            "ignore_unavailable": true,
            "include_global_state": false
        }')
    if echo "${snapshot_result}" | grep -q "SUCCESS"; then
        log_success "Snapshot created: ${snapshot_name}"
    else
        log_warning "Snapshot creation failed or timed out"
        log_warning "It's recommended to have a backup before proceeding"
        read -r -p "Continue anyway? (y/n): " continue_without_snapshot
        if [[ "${continue_without_snapshot}" != "y" ]] && [[ "${continue_without_snapshot}" != "Y" ]]; then
            log "Operation cancelled by user"
            exit 0
        fi
    fi
    log ""

    # List indices to reindex
    log_step "Retrieving list of indices to reindex..."
    local all_indices
    all_indices=$(curl_es "${es_url}/_cat/indices?h=index" | grep -v "^\." | sort)

    # Filter out indices that are already reindexed (-v8) if skip is enabled
    local indices=""
    if [[ "${SKIP_ALREADY_REINDEXED}" = true ]]; then
        log_info "Filtering out already reindexed indices (ending with -v8)..."
        while IFS= read -r idx; do
            if ! is_already_reindexed "${idx}"; then
                if [[ -z "${indices}" ]]; then
                    indices="${idx}"
                else
                    indices="${indices}\n${idx}"
                fi
            fi
        done <<< "${all_indices}"
        indices=$(echo -e "${indices}")
    else
        indices="${all_indices}"
    fi

    local total
    total=$(echo "${indices}" | wc -l)
    log_info "Number of indices to process: ${total}"
    log ""

    # Show sample of indices
    log "First 10 indices:"
    echo "${indices}" | head -10 | while read -r idx; do
        log "  - ${idx}"
    done
    if [[ "${total}" -gt 10 ]]; then
        log "  ... and $((total - 10)) more"
    fi
    log ""

    # Ask for confirmation
    log_warning "This operation will reindex ${total} indices."
    log_warning "Estimated duration: ~$((total * 2)) minutes (depends on data size)"
    log ""
    read -r -p "Do you want to continue? (type YES in uppercase): " confirmation
    if [[ "${confirmation}" != "YES" ]]; then
        log "Operation cancelled by user"
        exit 0
    fi

    log ""
    log "Starting reindexing process..."
    log ""

    # Process each index
    local counter=0
    local success=0
    local failed=0
    local skipped=0
    local start_time
    start_time=$(date +%s)

    while IFS= read -r index; do
        counter=$((counter + 1))
        log "\n╔═══════════════════════════════════════════════════════════╗"
        log "║ Progress: [${counter}/${total}] - $(date '+%H:%M:%S')"
        log "╚═══════════════════════════════════════════════════════════╝"

        reindex_with_smart_alias "${index}"
        local result=$?

        if [[ ${result} -eq 0 ]]; then
            success=$((success + 1))
        elif [[ ${result} -eq 2 ]]; then
            skipped=$((skipped + 1))
        else
            failed=$((failed + 1))
        fi

        # Show progress summary
        log ""
        log_info "Current progress: Success: ${success}, Failed: ${failed}, Skipped: ${skipped}"
    done <<< "${indices}"

    local end_time
    end_time=$(date +%s)
    local total_duration=$((end_time - start_time))
    local duration_minutes=$((total_duration / 60))
    local duration_seconds=$((total_duration % 60))

    # Final summary
    log "\n========================================="
    log "  REINDEXING SUMMARY"
    log "========================================="
    log_success "Successfully processed: ${success} indices"
    if [[ "${skipped}" -gt 0 ]]; then
        log_warning "Skipped: ${skipped} indices"
    fi
    if [[ "${failed}" -gt 0 ]]; then
        log_error "Failed: ${failed} indices"
    fi
    log "Total duration: ${duration_minutes}m ${duration_seconds}s"
    log ""

    # Final alias verification
    log "Final alias verification (first 30):"
    curl_es "${es_url}/_cat/aliases?v&s=alias" | head -31 | tee -a "${LOG_FILE}"
    log ""
    log "Complete log available at: ${LOG_FILE}"
    log ""

    if [[ "${failed}" -eq 0 ]]; then
        log_success "Reindexing completed successfully!"
        log ""
        log "========================================="
        log "  NEXT STEPS"
        log "========================================="
        log ""
        log "1. Verify OKA is working correctly:"
        log "   curl ${es_url}/_cat/indices?v"
        log ""
        log "2. Upgrade Elasticsearch to version 9:"
        log "   sudo systemctl stop elasticsearch"
        log "   sudo yum update elasticsearch -y"
        log "   sudo systemctl start elasticsearch"
        log ""
    else
        log_error "Some indices failed to reindex."
        log_error "Please check the log file: ${LOG_FILE}"
        log ""
        log "You can re-run this script to retry only the failed indices."
        log "The script will automatically skip already reindexed indices."
    fi
}

# ================================================================
# Parse Command Line Arguments
# ================================================================

while [[ $# -gt 0 ]]; do
    case $1 in
        -h|--host)
            ES_HOST="$2"
            shift 2
            ;;
        -p|--port)
            ES_PORT="$2"
            shift 2
            ;;
        -s|--scheme)
            ES_SCHEME="$2"
            shift 2
            ;;
        -u|--user)
            ES_USER="$2"
            shift 2
            ;;
        -w|--password)
            ES_PASSWORD="$2"
            shift 2
            ;;
        -c|--confirm-each)
            CONFIRM_EACH_INDEX=true
            shift
            ;;
        --no-skip-reindexed)
            SKIP_ALREADY_REINDEXED=false
            shift
            ;;
        -l|--log-file)
            LOG_FILE="$2"
            shift 2
            ;;
        --help)
            show_usage
            ;;
        *)
            echo "Unknown option: $1"
            show_usage
            ;;
    esac
done

# ================================================================
# Execute Main Function
# ================================================================
main
Note
If you have large indices, we recommend running this script in tmux to prevent it from being killed on network disconnect.
Also, reindexing requires free disk space, as data is temporarily duplicated. Ensure that your server has enough space to hold at least twice the biggest index.
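For example, a minimal tmux workflow (the session name oka_reindex is arbitrary):

tmux new -s oka_reindex
./reindex.oka.sh --host localhost --port 9200
# Detach with Ctrl-b d, then reattach later with:
tmux attach -t oka_reindex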
Warning
This script is given as an example and will try to migrate all indices in your Elasticsearch.
If you share Elasticsearch with other tools, adapt the process to your needs; in that case, you must keep the same behavior as this script for OKA migrations: the ID of the indices must remain the same for OKA. We achieve this by creating aliases: the ID of the old index becomes the ID of an alias pointing to the newly migrated index.
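To illustrate the alias pattern on a single index (a sketch with a hypothetical index name my_index; the script above automates this for every index, including error handling):

# Reindex my_index into my_index-v8
curl -X POST "http://host:port/_reindex" -H 'Content-Type: application/json' -d '
{
  "source": { "index": "my_index" },
  "dest": { "index": "my_index-v8" }
}'
# Delete the old index, then make its name an alias of the new index
curl -X DELETE "http://host:port/my_index"
curl -X POST "http://host:port/_aliases" -H 'Content-Type: application/json' -d '
{
  "actions": [
    { "add": { "index": "my_index-v8", "alias": "my_index" } }
  ]
}'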
Upgrade to Elasticsearch 9 and start OKA:
systemctl stop elasticsearch
# See Elasticsearch documentation to enable repos for version 9
dnf update --enablerepo=elasticsearch elasticsearch
systemctl start elasticsearch
systemctl start oka
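You can then verify that the upgrade succeeded (a quick check; switch to https:// and add -u username:password if your cluster is secured):

curl -s "http://host:port/" | grep '"number"'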