Elasticsearch - Dumping documents from multi-node to single node
Elasticsearch three node cluster:
Elasticsearch is running as a three-node cluster. The task is to copy its data and restore it on a single-node cluster.
node 1 : "http://node1:9300, http://node1:9200"
node 2 : "http://node2:9300, http://node2:9200"
node 3 : "http://node3:9300, http://node3:9200"
Because the shards are distributed across the nodes, no single node holds the complete data. If we manually copy one node's data and restore it on a single-node instance, the shards that lived on the other nodes remain unassigned. Follow these steps to restore from the multi-node cluster to a single node.
- Create the single node cluster using the docker-compose.yml file. It mounts the following elasticsearch-conf.yml:
cluster.name: jinnabalu_cluster
#node.name: "node-one"
#index.number_of_shards: 1
#index.number_of_replicas: 0
network.bind_host: 0.0.0.0
#network.host: 0.0.0.0
#discovery.zen.ping.multicast.enabled: false
cluster.routing.allocation.disk.threshold_enabled: true
cluster.routing.allocation.disk.watermark.flood_stage: 200mb
cluster.routing.allocation.disk.watermark.low: 500mb
cluster.routing.allocation.disk.watermark.high: 300mb
# docker-compose.yml
version: '2'
services:
  jinnabalu_cluster-elasticsearch:
    container_name: jinnabalu_cluster-elasticsearch
    image: elasticsearch:2.4.1
    environment:
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    volumes:
      - /var/db/elasticsearch/data:/usr/share/elasticsearch/data
      - ./elasticsearch-conf.yml:/usr/share/elasticsearch/config/elasticsearch.yml
    ports:
      - 9200:9200
      - 9300:9300
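- Copy the data folder from node 1 into the host directory that the compose file mounts as the data volume:
scp -r /var/db/node1/elasticsearch/** /var/db/elasticsearch/data
- Start the cluster using docker-compose up -d
- By default, Elasticsearch re-assigns shards to nodes dynamically. Because the single node only has the shard copies from node 1, it starts with unassigned shards.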
- Check the shards with
curl -X GET "localhost:9200/_cat/shards"
- The source cluster had three nodes, but the single node has no other node on which to place replicas or from which to recover the missing shards, so the cluster health turns red. Update the settings of all indices and set number_of_replicas to 0:
curl -X PUT "localhost:9200/_settings?pretty" -H 'Content-Type: application/json' -d'
{
  "index" : {
    "number_of_replicas" : 0
  }
}
'
- Check the shards again and note which shards are still unassigned.
- Manually copy the unassigned shards from node 2 or node 3. For example, to copy shards 2 and 4 of the index client:
scp -r <path_multinode_data_folder>/<cluster_name>/nodes/0/indices/client/2 <path_singlenode_data_folder>/<cluster_name>/nodes/0/indices/client/
scp -r <path_multinode_data_folder>/<cluster_name>/nodes/0/indices/client/4 <path_singlenode_data_folder>/<cluster_name>/nodes/0/indices/client/
- Restart Elasticsearch with docker-compose down and docker-compose up -d
- Check the shards again. If any unassigned shards remain, repeat the copy steps above.
- Once all the shards are restored, the cluster health turns green.
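The repeated "check the shards" step can be scripted. The following is a minimal sketch, not part of the original procedure: `list_unassigned_shards` is a hypothetical helper name, and it assumes the default `_cat/shards` column order (index, shard, prirep, state, ...).

```shell
# Print "index shard prirep" for every shard whose state is UNASSIGNED.
# Reads _cat/shards output on stdin; the state is expected in column 4.
list_unassigned_shards() {
  awk '$4 == "UNASSIGNED" { print $1, $2, $3 }'
}

# Usage (assumes the single node listens on localhost:9200):
#   curl -s "localhost:9200/_cat/shards" | list_unassigned_shards
```

Each printed line names an index and shard number, which map to the `indices/<index>/<shard>` folders copied in the steps above.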
Missing shards can be copied manually into the data folder, as described above. However, if you’ve disabled shard allocation (perhaps you did a rolling restart and forgot to re-enable it), you can re-enable it:
# v0.90.x and earlier
curl -X PUT "localhost:9200/_settings?pretty" -H 'Content-Type: application/json' -d'
{
  "index.routing.allocation.disable_allocation": false
}'
# v1.0+
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "transient" : {
    "cluster.routing.allocation.enable" : "all"
  }
}'
Elasticsearch will then reassign shards as normal. This can be slow; consider raising indices.recovery.max_bytes_per_sec and cluster.routing.allocation.node_concurrent_recoveries to speed it up.
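Both recovery settings can be changed at runtime through the cluster settings API. The sketch below only builds the request body. `recovery_settings_body` is a hypothetical helper name, and the values `100mb` and `4` in the usage comment are illustrative assumptions, not recommendations.

```shell
# Print a transient cluster-settings body that sets the two recovery limits.
# $1: value for indices.recovery.max_bytes_per_sec (for example 100mb)
# $2: value for cluster.routing.allocation.node_concurrent_recoveries (for example 4)
recovery_settings_body() {
  printf '{"transient":{"indices.recovery.max_bytes_per_sec":"%s","cluster.routing.allocation.node_concurrent_recoveries":%s}}\n' "$1" "$2"
}

# Usage (assumes the node listens on localhost:9200):
#   curl -X PUT "localhost:9200/_cluster/settings?pretty" \
#     -H 'Content-Type: application/json' \
#     -d "$(recovery_settings_body 100mb 4)"
```

Because the settings are sent as transient, they do not survive a full cluster restart.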
If you’re still seeing issues, something else is probably wrong, so look in your Elasticsearch logs for errors. If you see EsRejectedExecutionException, your thread pools may be too small.
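A quick way to look for that exception is to count the log lines that mention it. This is a rough sketch: `count_rejections` is a hypothetical helper name, and the usage line assumes the container name from the docker-compose.yml above and that Elasticsearch writes its log to the container output.

```shell
# Count the log lines on stdin that mention EsRejectedExecutionException.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# function usable under "set -e" while still printing 0.
count_rejections() {
  grep -c 'EsRejectedExecutionException' || true
}

# Usage:
#   docker logs jinnabalu_cluster-elasticsearch 2>&1 | count_rejections
```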
Finally, you can explicitly reassign a shard to a node with the reroute API.
# Suppose shard 4 of index "my-index" is unassigned, so you want to
# assign it to node search03:
curl -XPOST "localhost:9200/_cluster/reroute" -H 'Content-Type: application/json' -d'
{
  "commands": [{
    "allocate": {
      "index": "my-index",
      "shard": 4,
      "node": "search03",
      "allow_primary": true
    }
  }]
}'
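When several shards need rerouting, the request body can be generated instead of edited by hand. A minimal sketch, assuming the `allocate` command form used in the example above (which matches the Elasticsearch 2.x image in the compose file; later major versions use different command names). `reroute_allocate_body` is a hypothetical helper name.

```shell
# Print a reroute body containing a single "allocate" command.
# $1: index name
# $2: shard number
# $3: target node name
# $4: allow_primary, true or false (defaults to false when omitted)
reroute_allocate_body() {
  printf '{"commands":[{"allocate":{"index":"%s","shard":%s,"node":"%s","allow_primary":%s}}]}\n' \
    "$1" "$2" "$3" "${4:-false}"
}

# Usage (assumes the node listens on localhost:9200):
#   curl -XPOST "localhost:9200/_cluster/reroute" \
#     -H 'Content-Type: application/json' \
#     -d "$(reroute_allocate_body my-index 4 search03 true)"
```

Defaulting `allow_primary` to false is a deliberate choice in this sketch, so that forcing a primary has to be requested explicitly.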