Elasticsearch

This section presents the different Elasticsearch-related actions that you may need to perform while updating or upgrading OKA.


Backup

  1. First, ensure that the Elasticsearch database is running and accessible.
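
    For example, you can check connectivity and cluster health with (replace host:port with your Elasticsearch endpoint):

      curl -s "http://host:port/_cluster/health?pretty"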

  2. Backup your Elasticsearch database. To learn more about Elasticsearch snapshots, see How to create a snapshot.

    Note: snapshots are incremental, so only the data that has changed since the previous snapshot is saved during each snapshot/backup process.

    1. You must configure your snapshot directory (here we chose /dir_path/backup_repo) before taking a snapshot for the first time:

      1. Create the snapshot directory:

        mkdir -p /dir_path/backup_repo
        
      2. Grant read/write access to the directory for Elasticsearch to store snapshots:

        sudo chown elasticsearch: /dir_path/backup_repo
        
      3. Add this line in /etc/elasticsearch/elasticsearch.yml:

        path.repo: /dir_path/backup_repo
        
      4. Restart Elasticsearch:

        sudo systemctl restart elasticsearch
        
      5. Query Elasticsearch to register the snapshot repository (in our case it points to /dir_path/backup_repo; the directory does not need to exist beforehand, as Elasticsearch will create it automatically). You can replace my_backup with the name you want:

        curl -X PUT "http://host:port/_snapshot/my_backup" -H 'Content-Type: application/json' -d '
        {
          "type": "fs",
          "settings": {
            "location": "/dir_path/backup_repo",
            "compress": true
          }
        }'
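
      6. Optionally, verify that Elasticsearch can read and write the repository (a quick sanity check using the repository verify API):

        curl -X POST "http://host:port/_snapshot/my_backup/_verify"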
        
    2. Backup your Elasticsearch database (take a snapshot). We recommend creating a snapshot whose name includes the current date (see the example below with $(date +%Y%m%d)). The wait_for_completion=true option makes the command synchronous: it will only return once the backup has completed:

      curl -X PUT "http://host:port/_snapshot/my_backup/snapshot_oka_$(date +%Y%m%d)?wait_for_completion=true" -H 'Content-Type: application/json' -d '
      {
            "ignore_unavailable": true,
            "include_global_state": true,
            "metadata": {
                  "taken_by": "your_name",
                  "taken_because": "backup before upgrading OKA"
            }
      }'
      
    3. If you don’t include the wait_for_completion=true option (making the above command asynchronous), you can still check the completion of your snapshot with:

      SNAPSHOT_NAME=snapshot_oka_$(date +%Y%m%d)
      curl -s "http://${ES_HOST}/_snapshot/my_backup/${SNAPSHOT_NAME}"
      

    On completion, check the "state" of the backup in the output; it should be "SUCCESS".
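
    For example, assuming the standard snapshot response format, you can extract just the state with:

      curl -s "http://host:port/_snapshot/my_backup/${SNAPSHOT_NAME}" | python3 -c "import sys, json; print(json.load(sys.stdin)['snapshots'][0]['state'])"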

Note

If your Elasticsearch is secured with https (recommended), replace http:// with https:// in the above commands.

If you also need a username and password to connect (recommended), specify the credentials by adding -u username:password to the curl commands.
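
For example, with HTTPS and authentication, the repository registration command would become (username and password are placeholders):

# -k may be needed if your certificate is self-signed
curl -k -u username:password -X PUT "https://host:port/_snapshot/my_backup" -H 'Content-Type: application/json' -d '
{
  "type": "fs",
  "settings": {
    "location": "/dir_path/backup_repo",
    "compress": true
  }
}'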

Warning

The default behavior when selecting indices for a snapshot operation is to include all indices ("indices": "*"). To snapshot only specific indices, set the "indices" field to a comma-separated list of the desired index names ("indices": "index1,index2,index3"). OKA’s main indices that should be backed up can be seen per cluster directly on the Cluster page.
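
For example, a snapshot restricted to a few indices (the index names below are placeholders):

curl -X PUT "http://host:port/_snapshot/my_backup/snapshot_oka_$(date +%Y%m%d)?wait_for_completion=true" -H 'Content-Type: application/json' -d '
{
      "indices": "index1,index2,index3",
      "ignore_unavailable": true,
      "include_global_state": true
}'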

Note

Snapshots must be identified by unique names (${SNAPSHOT_NAME}). The above command uses the date of the snapshot as unique identifier (snapshot_oka_$(date +%Y%m%d)).

Note

You can attach multiple metadata entries to your snapshot. We recommend adding at least your name and the reason why the snapshot was taken.

Note

Over time, snapshot repositories can accumulate stale data that is no longer referenced by existing snapshots. Use the following command to clean the snapshot repository:

curl -X POST "http://host:port/_snapshot/my_backup/_cleanup"

If you need to copy the snapshot to another Elasticsearch (e.g., a preproduction VM), we recommend that you run this command first. Note that it can take a long time to complete.


Restore

Important

If restoring to a new version of Elasticsearch, the steps in configure your snapshot directory should be repeated for this new version.

# Here we restore all indices, apart from the system indices (``-.*`` excludes all indices whose names start with ``.``)
curl -X POST "http://host:port/_snapshot/my_backup/${SNAPSHOT_NAME}/_restore" -H 'Content-Type: application/json' -d '
{
      "indices": "*,-.*",
      "ignore_unavailable": true,
      "include_global_state": false,
      "include_aliases": true
}'
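
Once the restore has completed, a quick sanity check is to list the restored indices:

curl "http://host:port/_cat/indices?v"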

Note

Modify the snapshot name (${SNAPSHOT_NAME}) to match the one provided during the backup procedure. You can list the available snapshots with:

curl -X GET "http://host:port/_cat/snapshots/my_backup"

Note

You can make this command synchronous by adding ?wait_for_completion=true after _restore.
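
For example:

curl -X POST "http://host:port/_snapshot/my_backup/${SNAPSHOT_NAME}/_restore?wait_for_completion=true" -H 'Content-Type: application/json' -d '{"indices": "*,-.*", "ignore_unavailable": true, "include_global_state": false, "include_aliases": true}'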


Migrating from Elasticsearch 7 to Elasticsearch 9

Important

Elasticsearch does not support migrating directly from version 7 to version 9: you will first need to migrate to version 8, then to version 9.

  1. First of all, create a snapshot of your Elasticsearch database in v7.

  2. On your new Elasticsearch server, install Elasticsearch 8 (see Elasticsearch).

  3. Copy your snapshot from your v7 server to your v8 server, and import your snapshot in v8 (see above on how to restore).
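
    For example, if both servers use the same repository path (the paths and hostname below are placeholders):

      rsync -av /dir_path/backup_repo/ es8-server:/dir_path/backup_repo/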

  4. Check for deprecations. Any remaining deprecations will be listed in the output of the following command:

    curl -X GET "http://localhost:9200/_migration/deprecations?pretty"
    
  5. You can then use the following script to reindex the indices so that they are v8-compatible while remaining accessible by OKA (through the creation of aliases):

    reindex.oka.sh
    #!/bin/bash
    ################################################################################
    # Copyright (c) 2017-2025 UCit SAS
    # All Rights Reserved
    #
    # This software is the confidential and proprietary information
    # of UCit SAS ("Confidential Information").
    # You shall not disclose such Confidential Information
    # and shall use it only in accordance with the terms of
    # the license agreement you entered into with UCit.
    ################################################################################
    # ================================================================
    # OKA Elasticsearch Reindexing Script for ES9 Migration
    # ================================================================
    # This script reindexes all Elasticsearch indices to make them
    # compatible with Elasticsearch 9.x
    # ================================================================
    
    # Default configuration
    ES_HOST="localhost"
    ES_PORT="9200"
    ES_SCHEME="http"
    ES_USER=""
    ES_PASSWORD=""
    CONFIRM_EACH_INDEX=false
    SKIP_ALREADY_REINDEXED=true
    LOG_FILE="/tmp/reindex_oka_$(date +%Y%m%d_%H%M%S).log"
    
    # Colors for output
    RED='\033[0;31m'
    GREEN='\033[0;32m'
    YELLOW='\033[1;33m'
    BLUE='\033[0;34m'
    NC='\033[0m' # No Color
    
    # ================================================================
    # Helper Functions
    # ================================================================
    
    show_usage() {
        cat << EOF
    Usage: $0 [OPTIONS]
    
    Options:
        -h, --host HOST          Elasticsearch host (default: localhost)
        -p, --port PORT          Elasticsearch port (default: 9200)
        -s, --scheme SCHEME      Connection scheme: http or https (default: http)
        -u, --user USER          Elasticsearch username (if authentication required)
        -w, --password PASS      Elasticsearch password (if authentication required)
        -c, --confirm-each       Ask for confirmation before each index reindexing
        --no-skip-reindexed      Don't skip indices that are already reindexed (default: skip them)
        -l, --log-file FILE      Log file path (default: /tmp/reindex_oka_TIMESTAMP.log)
        --help                   Show this help message
    
    Examples:
        # Basic usage (local ES without auth)
        $0
    
        # ES with HTTPS and authentication
        $0 --scheme https --user elastic --password mypassword
    
        # Confirm each index before reindexing
        $0 --confirm-each
    
        # Re-run and only process failed indices
        $0
    
        # Process all indices even if already reindexed
        $0 --no-skip-reindexed
    
    EOF
        exit 0
    }
    
    log() {
        echo -e "$1" | tee -a "${LOG_FILE}"
    }
    
    log_success() {
        log "${GREEN}$1${NC}"
    }
    
    log_error() {
        log "${RED}$1${NC}"
    }
    
    log_warning() {
        log "${YELLOW}$1${NC}"
    }
    
    log_info() {
        log "${BLUE}$1${NC}"
    }
    
    log_step() {
        log "${BLUE}$1${NC}"
    }
    
    # Build curl command with authentication if needed
    curl_es() {
        local auth_args=()
        if [[ -n "${ES_USER}" ]] && [[ -n "${ES_PASSWORD}" ]]; then
            # Pass credentials as separate arguments so curl parses them correctly
            auth_args=(-u "${ES_USER}:${ES_PASSWORD}")
        fi

        # Longer timeouts for reindex operations; the array expands to nothing
        # when no authentication is configured
        curl --connect-timeout 10 --max-time 600 -s -k "${auth_args[@]}" "$@"
    }
    
    # Get full ES URL
    get_es_url() {
        echo "${ES_SCHEME}://${ES_HOST}:${ES_PORT}"
    }
    
    # Check if index is already reindexed (ends with -v8)
    is_already_reindexed() {
        local index=$1
        [[ "${index}" =~ -v8$ ]]
    }
    
    # Check if corresponding v8 index exists
    has_v8_version() {
        local old_index=$1
        local new_index="${old_index}-v8"
        local es_url
        es_url=$(get_es_url)
    
        local check
        check=$(curl_es "${es_url}/_cat/indices/${new_index}?h=index" | tr -d '[:space:]')
        [[ "${check}" = "${new_index}" ]]
    }
    
    # ================================================================
    # Main Reindexing Function
    # ================================================================
    
    reindex_with_smart_alias() {
        local old_index=$1
        local new_index="${old_index}-v8"
        local es_url
        es_url=$(get_es_url)
    
        log "\n========================================="
        log "Processing index: ${old_index}"
        log "========================================="
    
        # Check if already reindexed
        if [[ "${SKIP_ALREADY_REINDEXED}" = true ]] && has_v8_version "${old_index}"; then
            log_warning "Index already has a -v8 version, skipping"
            return 2
        fi
    
        # Ask for confirmation if enabled
        if [[ "${CONFIRM_EACH_INDEX}" = true ]]; then
            read -r -p "Reindex this index? (y/n): " confirm
            if [[ "${confirm}" != "y" ]] && [[ "${confirm}" != "Y" ]]; then
                log_warning "Skipped by user"
                return 2
            fi
        fi
    
        # Step 1: Check if index exists
        log_step "Step 1/10: Checking if index exists..."
        local index_check
        index_check=$(curl_es "${es_url}/_cat/indices/${old_index}?h=index" | tr -d '[:space:]')
    
        if [[ "${index_check}" = "${old_index}" ]]; then
            log_success "Index exists"
        else
            log_error "Index ${old_index} does not exist, skipping"
            return 1
        fi
    
        # Step 2: Retrieve existing aliases
        log_step "Step 2/10: Retrieving existing aliases..."
        local alias_json
        alias_json=$(curl_es "${es_url}/${old_index}/_alias")
        local existing_aliases
        existing_aliases=$(echo "${alias_json}" | python3 -c "
    import sys, json
    try:
        data = json.load(sys.stdin)
        for idx, idx_data in data.items():
            aliases = list(idx_data.get('aliases', {}).keys())
            print(','.join(aliases))
    except:
        pass
    " 2>/dev/null)
    
        # Step 3: Determine target aliases
        log_step "Step 3/10: Determining target aliases..."
        local target_aliases=""
        local need_name_alias=false
    
        if [[ -n "${existing_aliases}" ]] && [[ "${existing_aliases}" != "" ]]; then
            target_aliases="${existing_aliases}"
            log_info "Found existing aliases: ${target_aliases}"
        else
            target_aliases="${old_index}"
            need_name_alias=true
            log_info "No existing aliases, will create alias '${old_index}'"
        fi
    
        # Step 4: Count documents in old index
        log_step "Step 4/10: Counting documents in old index..."
        local old_count
        old_count=$(curl_es "${es_url}/${old_index}/_count" | python3 -c "import sys, json; print(json.load(sys.stdin).get('count', 0))" 2>/dev/null)
        log_info "Document count: ${old_count}"
    
        # Step 5: Retrieve settings and mappings
        log_step "Step 5/10: Retrieving index settings and mappings..."
        local index_config
        index_config=$(curl_es "${es_url}/${old_index}")
        log_success "Settings and mappings retrieved"
    
        # Step 6: Create new index with proper settings including analyzers
        log_step "Step 6/10: Creating new index: ${new_index}..."
    
        local new_index_config
        new_index_config=$(echo "${index_config}" | python3 -c "
    import sys, json
    try:
        data = json.load(sys.stdin)
        old_idx = list(data.keys())[0]
        settings = data[old_idx].get('settings', {}).get('index', {})
        mappings = data[old_idx].get('mappings', {})
    
        # Clean settings (remove those that cannot be set at creation)
        clean_settings = {}
    
        # Copy important settings including analysis
        allowed_settings = [
            'number_of_shards', 'number_of_replicas', 'refresh_interval',
            'max_result_window', 'analysis', 'similarity', 'max_ngram_diff',
            'max_shingle_diff'
        ]
    
        for key in allowed_settings:
            if key in settings:
                clean_settings[key] = settings[key]
    
        # Also check for analysis in the root settings object
        root_settings = data[old_idx].get('settings', {})
        if 'analysis' in root_settings and 'analysis' not in clean_settings:
            clean_settings['analysis'] = root_settings['analysis']
    
        # Keep original number of shards, set 0 replicas for faster reindexing
        # If number_of_shards is not present in settings, default to 1
        if 'number_of_shards' not in clean_settings:
            clean_settings['number_of_shards'] = 1
        # Always set replicas to 0 during reindexing for performance
        clean_settings['number_of_replicas'] = 0
    
        result = {
            'settings': {
                'index': clean_settings
            },
            'mappings': mappings
        }
    
        print(json.dumps(result, indent=2))
    except Exception as e:
        import traceback
        print(json.dumps({
            'error': str(e),
            'traceback': traceback.format_exc(),
            'settings': {'number_of_shards': 1, 'number_of_replicas': 0}
        }), file=sys.stderr)
        print(json.dumps({'settings': {'number_of_shards': 1, 'number_of_replicas': 0}}))
    " 2>/tmp/python_error_$$.log)
    
        # Check if Python had errors
        if [[ -s /tmp/python_error_$$.log ]]; then
            log_warning "Python processing had warnings:"
            cat /tmp/python_error_$$.log | tee -a "${LOG_FILE}"
            rm -f /tmp/python_error_$$.log
        fi
    
        local create_result
        create_result=$(curl_es -X PUT "${es_url}/${new_index}" \
          -H 'Content-Type: application/json' \
          -d "${new_index_config}")
    
        if echo "${create_result}" | grep -q "acknowledged"; then
            log_success "New index created successfully"
            # Display the number of shards used
            local shards_count
            shards_count=$(echo "${new_index_config}" | python3 -c "import sys, json; d=json.load(sys.stdin); print(d.get('settings', {}).get('index', {}).get('number_of_shards', 'unknown'))" 2>/dev/null)
            log_info "Number of shards: ${shards_count}"
        else
            log_error "Failed to create new index"
            log_error "Response: ${create_result}"
            log_error "Config used:"
            echo "${new_index_config}" | tee -a "${LOG_FILE}"
    
            # Ask user what to do
            echo ""
            log_warning "An error occurred while creating the index."
            read -r -p "Do you want to (c)ontinue with next index, (r)etry this index, or (a)bort? [c/r/a]: " action
            case ${action} in
                r|R)
                    log_info "Retrying..."
                    reindex_with_smart_alias "${old_index}"
                    return $?
                    ;;
                a|A)
                    log_error "Aborting script as requested by user"
                    exit 1
                    ;;
                *)
                    log_info "Continuing with next index..."
                    return 1
                    ;;
            esac
        fi
    
        # Step 7: Reindex data (ASYNC VERSION WITH PROGRESS TRACKING)
        log_step "Step 7/10: Reindexing data asynchronously (this may take a while)..."
        log_info "Started at: $(date '+%Y-%m-%d %H:%M:%S')"
    
        local reindex_start
        reindex_start=$(date +%s)
    
        # Start async reindex
        local reindex_task
        reindex_task=$(curl_es -X POST "${es_url}/_reindex?wait_for_completion=false" \
          -H 'Content-Type: application/json' \
          -d "{
            \"source\": {
              \"index\": \"${old_index}\"
            },
            \"dest\": {
              \"index\": \"${new_index}\",
              \"op_type\": \"create\"
            }
          }")
    
        # Extract task ID
        local task_id
        task_id=$(echo "${reindex_task}" | python3 -c "import sys, json; print(json.load(sys.stdin).get('task', ''))" 2>/dev/null)
    
        if [[ -z "${task_id}" ]] || [[ "${task_id}" = "" ]]; then
            log_error "Failed to start reindex task"
            log_error "Response: ${reindex_task}"
    
            echo ""
            read -r -p "Do you want to (c)ontinue with next index, (r)etry this index, or (a)bort? [c/r/a]: " action
            case ${action} in
                r|R)
                    curl_es -X DELETE "${es_url}/${new_index}" >/dev/null 2>&1
                    log_info "Retrying..."
                    reindex_with_smart_alias "${old_index}"
                    return $?
                    ;;
                a|A)
                    log_error "Aborting script as requested by user"
                    exit 1
                    ;;
                *)
                    log_info "Continuing with next index..."
                    return 1
                    ;;
            esac
        fi
    
        log_info "Reindex task ID: ${task_id}"
    
        # Poll task status
        local completed=false
        local check_interval=5
        local max_wait=3600  # 1 hour max
        local elapsed=0
        local last_progress_time=0
    
        while [[ "${completed}" = false ]] && [[ ${elapsed} -lt ${max_wait} ]]; do
            sleep "${check_interval}"
            elapsed=$((elapsed + check_interval))
    
            local task_status
            task_status=$(curl_es "${es_url}/_tasks/${task_id}")
            local is_completed
            is_completed=$(echo "${task_status}" | python3 -c "import sys, json; print(str(json.load(sys.stdin).get('completed', False)))" 2>/dev/null)
    
            if [[ "${is_completed}" = "True" ]]; then
                completed=true
                local reindex_result="${task_status}"
                break
            fi
    
            # Show progress every 5 seconds
            if [[ $((elapsed - last_progress_time)) -ge 5 ]]; then
                last_progress_time=${elapsed}
                local progress
                progress=$(echo "${task_status}" | python3 -c "
    import sys, json
    try:
        data = json.load(sys.stdin)
        status = data.get('task', {}).get('status', {})
        created = status.get('created', 0)
        total = status.get('total', 0)
        if total > 0:
            pct = (created * 100) // total
            print(f'{created}/{total} ({pct}%)')
        else:
            print('In progress...')
    except:
        print('Checking...')
    " 2>/dev/null)
                log_info "Progress: ${progress} (${elapsed}s elapsed)"
            fi
        done
    
        if [[ "${completed}" = false ]]; then
            log_error "Reindex task did not complete within ${max_wait}s"
            log_warning "Task ${task_id} may still be running in background"
            log_warning "You can check its status with: curl ${es_url}/_tasks/${task_id}"
    
            echo ""
            read -r -p "Do you want to (c)ontinue with next index or (a)bort? [c/a]: " action
            case ${action} in
                a|A)
                    log_error "Aborting script as requested by user"
                    exit 1
                    ;;
                *)
                    log_info "Continuing with next index..."
                    return 1
                    ;;
            esac
        fi
    
        local reindex_end
        reindex_end=$(date +%s)
        local reindex_duration=$((reindex_end - reindex_start))
    
        log_info "Completed at: $(date '+%Y-%m-%d %H:%M:%S')"
        log_info "Duration: ${reindex_duration} seconds"
    
        # Extract results from completed task
        local new_count
        new_count=$(echo "${reindex_result}" | python3 -c "
    import sys, json
    try:
        data = json.load(sys.stdin)
        response = data.get('response', {})
        print(response.get('total', 0))
    except:
        print(0)
    " 2>/dev/null)
    
        local failures
        failures=$(echo "${reindex_result}" | python3 -c "
    import sys, json
    try:
        data = json.load(sys.stdin)
        response = data.get('response', {})
        print(len(response.get('failures', [])))
    except:
        print(0)
    " 2>/dev/null)
    
        log_info "Documents reindexed: ${new_count}"
        log_info "Failures: ${failures}"
    
        if [[ "${failures}" = "0" ]] && [[ "${new_count}" = "${old_count}" ]]; then
            log_success "Reindexing successful: ${new_count} documents"
        elif [[ "${new_count}" = "0" ]]; then
            log_error "No documents were reindexed - possible task failure"
            log_warning "Full task result:"
            echo "${reindex_result}" | python3 -m json.tool 2>/dev/null | tee -a "${LOG_FILE}"
    
            echo ""
            read -r -p "Do you want to (c)ontinue with next index, (r)etry this index, or (a)bort? [c/r/a]: " action
            case ${action} in
                r|R)
                    curl_es -X DELETE "${es_url}/${new_index}" >/dev/null 2>&1
                    log_info "Retrying..."
                    reindex_with_smart_alias "${old_index}"
                    return $?
                    ;;
                a|A)
                    log_error "Aborting script as requested by user"
                    exit 1
                    ;;
                *)
                    log_info "Continuing with next index..."
                    return 1
                    ;;
            esac
        else
            log_error "Reindexing issue detected (old: ${old_count}, new: ${new_count}, failures: ${failures})"
            log_warning "Full task result:"
            echo "${reindex_result}" | python3 -m json.tool 2>/dev/null | tee -a "${LOG_FILE}"
    
            echo ""
            read -r -p "Do you want to (c)ontinue with next index, (r)etry this index, or (a)bort? [c/r/a]: " action
            case ${action} in
                r|R)
                    curl_es -X DELETE "${es_url}/${new_index}" >/dev/null 2>&1
                    log_info "Retrying..."
                    reindex_with_smart_alias "${old_index}"
                    return $?
                    ;;
                a|A)
                    log_error "Aborting script as requested by user"
                    exit 1
                    ;;
                *)
                    log_info "Continuing with next index..."
                    return 1
                    ;;
            esac
        fi
    
        # Step 8: Handle aliases and delete old index
        if [[ "${need_name_alias}" = true ]]; then
            # Case: No existing aliases - need to create alias with old index name
    
            log_step "Step 8/10: Deleting old index before creating alias..."
            local delete_result
            delete_result=$(curl_es -X DELETE "${es_url}/${old_index}")
    
            if echo "${delete_result}" | grep -q "acknowledged"; then
                log_success "Old index deleted: ${old_index}"
            else
                log_error "Failed to delete old index"
                log_error "Response: ${delete_result}"
                return 1
            fi
    
            log_step "Step 9/10: Creating alias '${old_index}' pointing to '${new_index}'..."
            local alias_result
            alias_result=$(curl_es -X POST "${es_url}/_aliases" \
              -H 'Content-Type: application/json' \
              -d "{
                \"actions\": [
                  {\"add\": {\"index\": \"${new_index}\", \"alias\": \"${old_index}\"}}
                ]
              }")
    
            if echo "${alias_result}" | grep -q "acknowledged"; then
                log_success "Alias '${old_index}' created and points to '${new_index}'"
            else
                log_error "Failed to create alias"
                log_error "Response: ${alias_result}"
                return 1
            fi
    
        else
            # Case: Existing aliases - switch them to new index
    
            log_step "Step 8/10: Switching existing aliases to new index..."
            log_info "Aliases to switch: ${target_aliases}"
    
            # Build alias actions JSON
            local alias_actions='{"actions":['
    
            for alias in $(echo "${target_aliases}" | tr ',' ' '); do
                log_info "  Processing alias: ${alias}"
                alias_actions="${alias_actions}{\"remove\":{\"index\":\"${old_index}\",\"alias\":\"${alias}\",\"must_exist\":false}},"
                alias_actions="${alias_actions}{\"add\":{\"index\":\"${new_index}\",\"alias\":\"${alias}\"}},"
            done
    
            alias_actions="${alias_actions%,}]}"
    
            local alias_result
            alias_result=$(curl_es -X POST "${es_url}/_aliases" \
              -H 'Content-Type: application/json' \
              -d "${alias_actions}")
    
            if echo "${alias_result}" | grep -q "acknowledged"; then
                log_success "Aliases switched to new index"
            else
                log_error "Failed to switch aliases"
                log_error "Response: ${alias_result}"
                return 1
            fi
    
            log_step "Step 9/10: Deleting old index..."
            local delete_result
            delete_result=$(curl_es -X DELETE "${es_url}/${old_index}")
    
            if echo "${delete_result}" | grep -q "acknowledged"; then
                log_success "Old index deleted: ${old_index}"
            else
                log_error "Failed to delete old index"
                log_error "Response: ${delete_result}"
            fi
        fi
    
        # Step 10: Verify aliases
        log_step "Step 10/10: Verifying aliases..."
        for alias in $(echo "${target_aliases}" | tr ',' ' '); do
            local check
            check=$(curl_es "${es_url}/_alias/${alias}" | python3 -c "
    import sys, json
    try:
        data = json.load(sys.stdin)
        indices = list(data.keys())
        print(','.join(indices))
    except:
        pass
    " 2>/dev/null)
    
            if echo "${check}" | grep -q "${new_index}"; then
                log_success "Alias '${alias}' correctly points to ${new_index}"
            else
                log_warning "Issue: Alias '${alias}' does not point to new index"
                log_warning "Current target: ${check}"
            fi
        done
    
        # Configure replicas
        log_step "Configuring number of replicas to 0..."
        curl_es -X PUT "${es_url}/${new_index}/_settings" \
          -H 'Content-Type: application/json' \
          -d '{"index":{"number_of_replicas":0}}' > /dev/null
        log_success "Replicas configured"
    
        log_success "Index ${old_index}${new_index} completed successfully"
    
        return 0
    }
    
    # ================================================================
    # Main Function
    # ================================================================
    
    main() {
        local es_url
        es_url=$(get_es_url)
    
        log "========================================="
        log "  OKA REINDEXING FOR ELASTICSEARCH 9"
        log "========================================="
        log "Date: $(date '+%Y-%m-%d %H:%M:%S')"
        log "Elasticsearch URL: ${es_url}"
        log "Authentication: $([[ -n "${ES_USER}" ]] && echo "Enabled (user: ${ES_USER})" || echo "Disabled")"
        log "Log file: ${LOG_FILE}"
        log "Confirm each index: $([[ "${CONFIRM_EACH_INDEX}" = true ]] && echo "Yes" || echo "No")"
        log "Skip already reindexed: $([[ "${SKIP_ALREADY_REINDEXED}" = true ]] && echo "Yes" || echo "No")"
        log ""
    
        # Verify Elasticsearch is accessible
        log_step "Checking Elasticsearch connectivity..."
        local es_test
        es_test=$(curl_es "${es_url}/")
    
        if ! echo "${es_test}" | grep -q "cluster_name"; then
            log_error "Cannot connect to Elasticsearch at ${es_url}"
            log_error "Response: ${es_test}"
            log_error ""
            log_error "Please check:"
            log_error "  - Elasticsearch is running: systemctl status elasticsearch"
            log_error "  - Host and port are correct: ${ES_HOST}:${ES_PORT}"
            log_error "  - Scheme (http/https) is correct: ${ES_SCHEME}"
            log_error "  - Network connectivity: ping ${ES_HOST}"
            if [[ -n "${ES_USER}" ]]; then
                log_error "  - Credentials are valid: user=${ES_USER}"
            fi
            exit 1
        fi
    
        log_success "Connected to Elasticsearch"
    
        local es_version
        es_version=$(echo "${es_test}" | python3 -c "import sys, json; print(json.load(sys.stdin)['version']['number'])" 2>/dev/null)
        log "Elasticsearch version: ${es_version}"
        log ""
    
        # Create safety snapshot
        log_warning "IMPORTANT: Creating safety snapshot..."
        local snapshot_name
        snapshot_name="before_reindex_$(date +%Y%m%d_%H%M%S)"
    
        log_step "Creating snapshot: ${snapshot_name}"
        local snapshot_result
        snapshot_result=$(curl_es -X PUT "${es_url}/_snapshot/oka_backup/${snapshot_name}?wait_for_completion=true" \
          -H 'Content-Type: application/json' \
          -d '{
            "indices": "*,-.*",
            "ignore_unavailable": true,
            "include_global_state": false
          }')
    
        if echo "${snapshot_result}" | grep -q "SUCCESS"; then
            log_success "Snapshot created: ${snapshot_name}"
        else
            log_warning "Snapshot creation failed or timed out"
            log_warning "It's recommended to have a backup before proceeding"
            read -r -p "Continue anyway? (y/n): " continue_without_snapshot
            if [[ "${continue_without_snapshot}" != "y" ]] && [[ "${continue_without_snapshot}" != "Y" ]]; then
                log "Operation cancelled by user"
                exit 0
            fi
        fi
    
        log ""
    
        # List indices to reindex
        log_step "Retrieving list of indices to reindex..."
        local all_indices
        all_indices=$(curl_es "${es_url}/_cat/indices?h=index" | grep -v "^\." | sort)
    
        # Filter out indices that are already reindexed (-v8) if skip is enabled
        local indices=""
        if [[ "${SKIP_ALREADY_REINDEXED}" = true ]]; then
            log_info "Filtering out already reindexed indices (ending with -v8)..."
            while IFS= read -r idx; do
                if ! is_already_reindexed "${idx}"; then
                    if [[ -z "${indices}" ]]; then
                        indices="${idx}"
                    else
                        indices="${indices}\n${idx}"
                    fi
                fi
            done <<< "${all_indices}"
            indices=$(echo -e "${indices}")
        else
            indices="${all_indices}"
        fi
    
        local total
        total=$(echo "${indices}" | wc -l)
    
        log_info "Number of indices to process: ${total}"
        log ""
    
        # Show sample of indices
        log "First 10 indices:"
        echo "${indices}" | head -10 | while read -r idx; do
            log "  - ${idx}"
        done
        if [[ "${total}" -gt 10 ]]; then
            log "  ... and $((total - 10)) more"
        fi
        log ""
    
        # Ask for confirmation
        log_warning "This operation will reindex ${total} indices."
        log_warning "Estimated duration: ~$((total * 2)) minutes (depends on data size)"
        log ""
        read -r -p "Do you want to continue? (type YES in uppercase): " confirmation
    
        if [[ "${confirmation}" != "YES" ]]; then
            log "Operation cancelled by user"
            exit 0
        fi
    
        log ""
        log "Starting reindexing process..."
        log ""
    
        # Process each index
        local counter=0
        local success=0
        local failed=0
        local skipped=0
        local start_time
        start_time=$(date +%s)
    
        while IFS= read -r index; do
            counter=$((counter + 1))
            log "\n╔═══════════════════════════════════════════════════════════╗"
            log "║ Progress: [${counter}/${total}] - $(date '+%H:%M:%S')"
            log "╚═══════════════════════════════════════════════════════════╝"
    
            reindex_with_smart_alias "${index}"
            local result=$?
    
            if [[ ${result} -eq 0 ]]; then
                success=$((success + 1))
            elif [[ ${result} -eq 2 ]]; then
                skipped=$((skipped + 1))
            else
                failed=$((failed + 1))
            fi
    
            # Show progress summary
            log ""
            log_info "Current progress: Success: ${success}, Failed: ${failed}, Skipped: ${skipped}"
        done <<< "${indices}"
    
        local end_time
        end_time=$(date +%s)
        local total_duration=$((end_time - start_time))
        local duration_minutes=$((total_duration / 60))
        local duration_seconds=$((total_duration % 60))
    
        # Final summary
        log "\n========================================="
        log "  REINDEXING SUMMARY"
        log "========================================="
        log_success "Successfully processed: ${success} indices"
        if [[ "${skipped}" -gt 0 ]]; then
            log_warning "Skipped: ${skipped} indices"
        fi
        if [[ "${failed}" -gt 0 ]]; then
            log_error "Failed: ${failed} indices"
        fi
        log "Total duration: ${duration_minutes}m ${duration_seconds}s"
        log ""
    
        # Final alias verification
        log "Final alias verification (first 30):"
        curl_es "${es_url}/_cat/aliases?v&s=alias" | head -31 | tee -a "${LOG_FILE}"
        log ""
    
        log "Complete log available at: ${LOG_FILE}"
        log ""
    
        if [[ "${failed}" -eq 0 ]]; then
            log_success "Reindexing completed successfully!"
            log ""
            log "========================================="
            log "  NEXT STEPS"
            log "========================================="
            log ""
            log "1. Verify OKA is working correctly:"
            log "   curl ${es_url}/_cat/indices?v"
            log ""
            log "2. Upgrade Elasticsearch to version 9:"
            log "   sudo systemctl stop elasticsearch"
            log "   sudo yum update elasticsearch -y"
            log "   sudo systemctl start elasticsearch"
            log ""
        else
            log_error "Some indices failed to reindex."
            log_error "Please check the log file: ${LOG_FILE}"
            log ""
            log "You can re-run this script to retry only the failed indices."
            log "The script will automatically skip already reindexed indices."
        fi
    }
    
    # ================================================================
    # Parse Command Line Arguments
    # ================================================================
    
    while [[ $# -gt 0 ]]; do
        case $1 in
            -h|--host)
                ES_HOST="$2"
                shift 2
                ;;
            -p|--port)
                ES_PORT="$2"
                shift 2
                ;;
            -s|--scheme)
                ES_SCHEME="$2"
                shift 2
                ;;
            -u|--user)
                ES_USER="$2"
                shift 2
                ;;
            -w|--password)
                ES_PASSWORD="$2"
                shift 2
                ;;
            -c|--confirm-each)
                CONFIRM_EACH_INDEX=true
                shift
                ;;
            --no-skip-reindexed)
                SKIP_ALREADY_REINDEXED=false
                shift
                ;;
            -l|--log-file)
                LOG_FILE="$2"
                shift 2
                ;;
            --help)
                show_usage
                ;;
            *)
                echo "Unknown option: $1"
                show_usage
                ;;
        esac
    done
    
    # ================================================================
    # Execute Main Function
    # ================================================================
    
    main
    

    Note

    If you have large indices, we recommend that you run this script in tmux to prevent it from being killed on network disconnect.

    Also, reindexing indices requires free disk space, as data is temporarily duplicated. Ensure that your server has enough free space to hold at least twice the size of the biggest index.
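
    To estimate the required space, you can list your biggest indices with:

      curl "http://host:port/_cat/indices?v&h=index,store.size&s=store.size:desc" | head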

    Warning

    This script is given as an example, and will try to migrate all indices in your Elasticsearch.

    If you share Elasticsearch with other tools, adapt the process to your needs; in that case, keep the same behavior as this script for OKA migrations: the ID of the indices must remain the same for OKA. The script achieves this by creating aliases: the ID of the old index becomes the ID of the alias pointing to the newly migrated index.
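
    After the migration, you can check that each alias points to the expected -v8 index with:

      curl "http://host:port/_cat/aliases?v"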

  6. Upgrade to Elasticsearch 9 and start OKA:

    sudo systemctl stop elasticsearch
    # See the Elasticsearch documentation to enable the repositories for version 9
    sudo dnf update --enablerepo=elasticsearch elasticsearch
    sudo systemctl start elasticsearch
    sudo systemctl start oka
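
    Once Elasticsearch is back up, you can confirm the running version (version.number should report 9.x):

      curl "http://host:port/"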