Elasticsearch Query Performance Optimization: Complete Guide for 2025
Learn proven techniques to optimize Elasticsearch query performance, reduce latency from seconds to milliseconds, and handle 10x more queries per second. Includes real-world benchmarks and configuration examples.

Slow Elasticsearch queries can cripple your application's user experience and waste infrastructure costs. In this comprehensive guide, we'll show you exactly how to optimize your Elasticsearch queries from first principles to advanced techniques.
Why Query Performance Matters
Latency has a direct business cost:
- E-commerce: A widely cited Amazon finding is that every 100ms of added latency cost about 1% in sales
- User experience: Google research found that 53% of mobile visits are abandoned when a page takes longer than 3 seconds to load
- Infrastructure costs: Slow queries need more nodes to sustain the same throughput, which means higher AWS/cloud bills
Understanding Elasticsearch Query Execution
Before optimizing, understand how Elasticsearch executes queries:
1. Query Phase
- Coordinator node receives the query
- Query is broadcast to one copy (primary or replica) of every relevant shard
- Each shard executes the query locally
- Each shard returns the IDs and sort values of its top N hits to the coordinator
2. Fetch Phase
- Coordinator merges the per-shard lists and identifies which documents to retrieve
- Fetch requests sent to the shards that hold those documents
- Full documents assembled
- Results returned to client
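The two phases can be sketched in plain Python. This is a toy model, not Elasticsearch code: the in-memory SHARDS structure, the precomputed scores, and the function names are all assumptions made for illustration. It shows that shards return only lightweight (score, id) pairs in the query phase, and that full documents are loaded only for the global winners.

```python
import heapq

# Hypothetical in-memory "shards": each maps doc_id -> (score, stored document).
SHARDS = [
    {"a1": (0.9, {"title": "alpha"}), "a2": (0.2, {"title": "beta"})},
    {"b1": (0.7, {"title": "gamma"}), "b2": (0.8, {"title": "delta"})},
]


def query_phase(shards, size):
    """Collect each shard's local top `size` hits and merge them on the coordinator."""
    candidates = []
    for shard_index, shard in enumerate(shards):
        local = [(score, shard_index, doc_id) for doc_id, (score, _) in shard.items()]
        # A shard ships only its best `size` (score, shard, id) tuples, not documents.
        candidates.extend(heapq.nlargest(size, local))
    # The coordinator reduces the per-shard lists to a global top `size`.
    return heapq.nlargest(size, candidates)


def fetch_phase(shards, winners):
    """Load full documents, but only from the shards that own a winning hit."""
    return [shards[shard_index][doc_id][1] for _, shard_index, doc_id in winners]


def search(shards, size):
    return fetch_phase(shards, query_phase(shards, size))


print(search(SHARDS, size=2))
```

The model also hints at why deep pagination is expensive: every shard has to produce from + size candidates before the coordinator can discard most of them.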
Top 10 Query Optimization Techniques
1. Use Filter Context Instead of Query Context
Before (Slow):
{
  "query": {
    "bool": {
      "must": [
        { "term": { "status": "active" }},
        { "range": { "price": { "gte": 100 }}}
      ]
    }
  }
}
After (Fast):
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "active" }},
        { "range": { "price": { "gte": 100 }}}
      ]
    }
  }
}
Impact: Often 2-5x faster for filter-heavy queries, depending on your data and cache state. Filter context is:
- Cacheable (reused across queries)
- Binary yes/no (no scoring calculation)
- Perfect for exact matches, ranges, existence checks
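When request bodies are built in application code, a small helper can keep scoring and non-scoring clauses apart. The sketch below is one possible shape; the function name build_bool_query and its parameters are hypothetical, and the title field in the usage example is an assumption.

```python
def build_bool_query(scoring_clauses, filter_clauses):
    """Put relevance-affecting clauses in `must` and yes/no clauses in `filter`."""
    bool_body = {}
    if scoring_clauses:
        bool_body["must"] = list(scoring_clauses)
    if filter_clauses:
        # Filter clauses skip scoring and are eligible for the node query cache.
        bool_body["filter"] = list(filter_clauses)
    return {"query": {"bool": bool_body}}


body = build_bool_query(
    scoring_clauses=[{"match": {"title": "laptop"}}],
    filter_clauses=[
        {"term": {"status": "active"}},
        {"range": {"price": {"gte": 100}}},
    ],
)
print(body)
```

Only the full-text match contributes to the relevance score here; the status and price conditions just narrow the candidate set.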
2. Optimize Field Data Types
Choosing the right field type has a large performance impact:
Text vs Keyword:
- text: Use for full-text search (analyzed, slower)
- keyword: Use for exact matching, aggregations, sorting (faster)
Example mapping:
{
  "mappings": {
    "properties": {
      "email": { "type": "keyword" },
      "description": { "type": "text" },
      "status": { "type": "keyword" },
      "price": { "type": "float" },
      "created_at": { "type": "date" }
    }
  }
}
3. Reduce Result Set Size
Use _source filtering:
{
  "_source": ["id", "title", "price"],
  "query": { "match_all": {} }
}
Impact: Can be 50-70% faster when documents are large and you only need a few fields; the gain shrinks for small documents.
4. Leverage Query Caching
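Elasticsearch has three cache layers:
Node Query Cache:
- Caches the results of filter-context clauses
- Automatically managed (LRU eviction, shared by all shards on a node)
- Enabled by default; controlled by the static index setting:
index.queries.cache.enabled: true
Shard Request Cache:
- Caches the shard-local search response; by default only for requests with size: 0 (aggregations, counts)
- Enable for other requests with:
?request_cache=true
- Perfect for dashboards with repeated queries
Field Data Cache:
- Holds fielddata for aggregations and sorting on text fields, plus global ordinals; most other field types use on-disk doc values instead
- Monitor with:
GET /_stats/fielddata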
5. Optimize Aggregations
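Slow aggregation:
{
  "aggs": {
    "categories": {
      "terms": { "field": "category", "size": 10000 }
    }
  }
}
Faster aggregation (most of the gain comes from the smaller size; the map execution hint only pays off when the query matches few documents, otherwise the default global_ordinals is usually the better choice):
{
  "aggs": {
    "categories": {
      "terms": {
        "field": "category",
        "size": 100,
        "execution_hint": "map"
      }
    }
  }
}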
6. Use Scroll API for Large Result Sets
Don't do this for large datasets:
{ "query": { "match_all": {} }, "size": 10000 }
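Use Scroll API instead (well suited to one-off bulk exports; for user-facing deep pagination on Elasticsearch 7.10+, Elastic recommends search_after with a point in time):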
POST /my-index/_search?scroll=1m
{ "query": { "match_all": {} }, "size": 1000 }
// Then scroll:
POST /_search/scroll
{ "scroll": "1m", "scroll_id": "DXF1..." }
7. Avoid Wildcard Queries on Text Fields
Extremely slow:
{ "query": { "wildcard": { "description": "*search*" }}}
Better alternatives:
- Use n-gram tokenizer
- Use match_phrase_prefix
- Use completion suggester for autocomplete
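To see why the n-gram approach is faster than a leading wildcard, here is a minimal pure-Python sketch of what an edge n-gram analyzer does at index time. It is an illustration only: the function names, the gram sizes, and the sample documents are assumptions, and real analysis happens inside Elasticsearch. Edge n-grams cover prefix matching; a full n-gram tokenizer indexes every substring window and so also covers infix matches, at a larger index cost.

```python
def edge_ngrams(token, min_gram=2, max_gram=10):
    """Return the prefixes of `token` between min_gram and max_gram characters."""
    token = token.lower()
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def build_index(docs, min_gram=2, max_gram=10):
    """Map every edge n-gram to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            for gram in edge_ngrams(token, min_gram, max_gram):
                index.setdefault(gram, set()).add(doc_id)
    return index


docs = {1: "Elasticsearch search tuning", 2: "Relational database tuning"}
index = build_index(docs)

# A prefix lookup is now a single dictionary access instead of a scan over all terms.
print(sorted(index.get("sea", set())))
print(sorted(index.get("tu", set())))
```

The work moves from query time to index time: the index grows, but each lookup becomes an exact term match.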
8. Optimize Index Structure
Shard sizing:
- Target: 10-50GB per shard
- Too many small shards = overhead
- Too few large shards = poor parallelization
Formula for shard count:
Optimal shards = (Data size in GB / 30GB) × 1.5
Round up to nearest integer
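The formula above translates directly into a few lines of Python. The function and parameter names are hypothetical; the 30GB target and the 1.5 multiplier are taken from the formula as written, and the result is a starting point to validate with benchmarks, not a guarantee.

```python
import math


def recommended_shard_count(data_size_gb, target_shard_gb=30.0, multiplier=1.5):
    """Apply (data size / target shard size) x multiplier, rounded up."""
    if data_size_gb <= 0:
        raise ValueError("data_size_gb must be positive")
    # Multiply before dividing to keep the arithmetic exact for round inputs.
    return max(1, math.ceil(data_size_gb * multiplier / target_shard_gb))


for size_gb in (10, 200, 500):
    print(size_gb, "GB ->", recommended_shard_count(size_gb), "shards")
```

For example, 200GB of data gives 200 × 1.5 / 30 = 10 primary shards.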
9. Use Query Profiler
Identify bottlenecks:
{
  "profile": true,
  "query": { "match": { "title": "elasticsearch" }}
}
Look for:
- build_scorer time > 50ms
- next_doc time > 100ms
- Large collector time
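Profile output gets long quickly, so it can help to scan it programmatically. The sketch below walks the query tree of a profile response and flags breakdown entries above a threshold. The function name and the sample response are made up for illustration; the shards, searches, query, breakdown, and children keys follow the profile response layout, where breakdown values are reported in nanoseconds. Collector timings are not covered here.

```python
def slow_profile_entries(profile, thresholds_ms):
    """Return (query type, breakdown key, milliseconds) for entries over their threshold."""
    findings = []

    def walk(node):
        breakdown = node.get("breakdown", {})
        for key, limit_ms in thresholds_ms.items():
            elapsed_ms = breakdown.get(key, 0) / 1_000_000  # nanoseconds -> milliseconds
            if elapsed_ms > limit_ms:
                findings.append((node.get("type"), key, elapsed_ms))
        for child in node.get("children", []):
            walk(child)

    for shard in profile.get("shards", []):
        for search in shard.get("searches", []):
            for node in search.get("query", []):
                walk(node)
    return findings


# Hand-written sample in the shape of response["profile"]; the timings are invented.
sample = {"shards": [{"searches": [{"query": [{
    "type": "BooleanQuery",
    "breakdown": {"build_scorer": 80_000_000, "next_doc": 20_000_000},
    "children": [{
        "type": "TermQuery",
        "breakdown": {"build_scorer": 1_000_000, "next_doc": 150_000_000},
    }],
}]}]}]}

print(slow_profile_entries(sample, {"build_scorer": 50, "next_doc": 100}))
```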
10. Implement Query Result Pagination
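Search After (preferred for deep pagination). Sort on a unique field that has doc values; sorting on _id needs fielddata, which is disabled by default in Elasticsearch 8.x. This example assumes id is a unique keyword field:
{
  "query": { "match_all": {} },
  "size": 100,
  "sort": [{ "id": "asc" }],
  "search_after": ["last_id_from_previous_page"]
}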
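In application code, search_after pagination is a loop that feeds the sort values of the last hit into the next request. The generator below is a sketch: iterate_hits and fake_search are hypothetical names, and the search function is injected so the loop can be exercised without a cluster.

```python
def iterate_hits(search, query, sort, page_size=100):
    """Yield every hit page by page.

    `search` is any callable that takes a request body dict and returns a
    search response dict.
    """
    search_after = None
    while True:
        body = {"query": query, "size": page_size, "sort": sort}
        if search_after is not None:
            body["search_after"] = search_after
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # The sort values of the last hit mark where the next page starts.
        search_after = hits[-1]["sort"]


def fake_search(body):
    """Stand-in for a real client call, backed by an in-memory sorted list."""
    ids = [f"doc-{n:03d}" for n in range(250)]
    after = body.get("search_after")
    start = ids.index(after[0]) + 1 if after else 0
    page = ids[start:start + body["size"]]
    return {"hits": {"hits": [{"_id": doc_id, "sort": [doc_id]} for doc_id in page]}}


all_hits = list(iterate_hits(fake_search, {"match_all": {}}, [{"id": "asc"}]))
print(len(all_hits))
```

With the official Python client, passing something like lambda body: es.search(index="products", **body) as the search argument should work on recent client versions, though that wiring is an assumption to verify against your client release.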
Real-World Performance Improvements
Case Study 1: E-commerce Product Search
Before:
- 500ms average query time
- 200 queries/second capacity
After optimization:
- 50ms average query time (10x improvement)
- 2,000 queries/second capacity
Changes made:
- Switched filtering to filter context
- Enabled request caching
- Optimized field mappings (text → keyword for filters)
- Reduced _source fields returned
Case Study 2: Log Analytics Platform
Before:
- 3-5 second aggregation queries
- Frequent OOM errors
After optimization:
- 300-500ms aggregation queries
- Stable memory usage
Changes made:
- Time-based index strategy (daily indices)
- Reduced aggregation size limits
- Implemented index lifecycle management
- Added more RAM for field data cache
Monitoring Query Performance
Key Metrics to Track
# Query latency
GET /_nodes/stats/indices/search
# Cache hit rates
GET /_stats/request_cache,query_cache
# Slow query log
PUT /my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "500ms",
  "index.search.slowlog.threshold.query.info": "200ms"
}
Alerting Thresholds
- Query latency p95 > 500ms: Investigate
- Query latency p99 > 1s: Critical
- Cache hit rate < 80%: Review caching strategy
- Field data evictions > 10/min: Increase heap or optimize queries
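The thresholds above can be encoded as a small check that a monitoring job runs on each scrape. This is a sketch under assumptions: the function names and message strings are hypothetical, and the inputs are expected to come from your own metrics pipeline. The hit rate is derived from the hit_count and miss_count counters that the cache stats report.

```python
def cache_hit_rate(hit_count, miss_count):
    """Return hits / (hits + misses), or None when the cache has seen no lookups."""
    total = hit_count + miss_count
    return hit_count / total if total else None


def evaluate_alerts(p95_ms, p99_ms, hit_rate, fielddata_evictions_per_min):
    """Compare current metrics against the alerting thresholds listed above."""
    alerts = []
    if p95_ms > 500:
        alerts.append("investigate: p95 latency above 500ms")
    if p99_ms > 1000:
        alerts.append("critical: p99 latency above 1s")
    if hit_rate is not None and hit_rate < 0.80:
        alerts.append("review caching: hit rate below 80%")
    if fielddata_evictions_per_min > 10:
        alerts.append("heap or queries: fielddata evictions above 10/min")
    return alerts


print(evaluate_alerts(
    p95_ms=620,
    p99_ms=900,
    hit_rate=cache_hit_rate(hit_count=70, miss_count=30),
    fielddata_evictions_per_min=2,
))
```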
Advanced Optimization Techniques
1. Index Sorting
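Pre-sort segments at index time so that queries sorted the same way can terminate early. Index sorting can only be configured when the index is created:
{
  "settings": {
    "index": {
      "sort.field": "timestamp",
      "sort.order": "desc"
    }
  }
}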
2. Adaptive Replica Selection
Enabled by default since Elasticsearch 7.0. To set it explicitly:
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.use_adaptive_replica_selection": true
  }
}
3. Frozen Indices
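For rarely-accessed data (Elasticsearch 7.x only; the freeze API was deprecated in 7.14 and removed in 8.0, where the frozen tier with searchable snapshots takes its place):
POST /old-logs-2023/_freeze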
Common Pitfalls to Avoid
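- Deep pagination without search_after: from + size gets more expensive with every page and is capped at 10,000 hits by default (index.max_result_window)
- Too many shards: Each shard = overhead. Aim for 10-50GB per shard
- Not using keyword for aggregations: Aggregating on text fields requires fielddata, which is slow and heap-hungry
- Ignoring heap size: Give the heap no more than 50% of RAM, and keep it below the roughly 32GB compressed-oops threshold
- Script fields in production: Painless scripts run per document at query time; precompute values at index time where you can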
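The heap rule in the list above is simple arithmetic. The helper below is a sketch with hypothetical names; the 31GB default cap is an assumption chosen to stay safely under the 32GB threshold, so adjust it to what your JVM reports.

```python
def recommended_heap_gb(total_ram_gb, heap_cap_gb=31):
    """Half of RAM, capped below the compressed-oops threshold (cap is an assumption)."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    return min(total_ram_gb / 2, heap_cap_gb)


def jvm_heap_options(total_ram_gb):
    """Render matching -Xms/-Xmx flags, using whole gigabytes."""
    heap = max(1, int(recommended_heap_gb(total_ram_gb)))
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]


for ram_gb in (16, 64, 128):
    print(ram_gb, "GB RAM ->", jvm_heap_options(ram_gb))
```

The remaining memory is not wasted: the operating system uses it for the filesystem cache, which Lucene relies on heavily.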
Performance Benchmarking
Rally - Official Benchmarking Tool
# Install
pip3 install esrally
# Run benchmark
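# --pipeline=benchmark-only targets an existing cluster instead of provisioning one
esrally race --track=geonames --target-hosts=localhost:9200 --pipeline=benchmark-only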
Custom Load Testing
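import time
from elasticsearch import Elasticsearch

# 8.x clients require the URL scheme; add authentication as your cluster needs
es = Elasticsearch("http://localhost:9200")

# Repeating one identical query mostly measures a warm cluster;
# vary the search terms for more realistic numbers.
n = 1000
start = time.perf_counter()
for _ in range(n):
    es.search(index="products", query={"match": {"title": "laptop"}})
elapsed = time.perf_counter() - start

print(f"{n} queries in {elapsed:.2f}s")
print(f"Average: {elapsed / n * 1000:.2f}ms per query")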
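An average hides tail latency, which is what the p95/p99 targets in this guide are about. If per-query timings are recorded in a list, percentiles can be computed with the standard library alone. The sketch below uses the nearest-rank method; the function name and the sample latencies are invented for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Invented per-query latencies in milliseconds, with two slow outliers.
latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 13, 900]

print("mean:", sum(latencies_ms) / len(latencies_ms))
print("p50: ", percentile(latencies_ms, 50))
print("p95: ", percentile(latencies_ms, 95))
print("p99: ", percentile(latencies_ms, 99))
```

In this sample the median is 13ms while the mean is pulled above 100ms by two outliers, which is why both the median and the tail percentiles are worth reporting.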
Elasticsearch 8.x Specific Optimizations
1. Use TSDB for Time-Series Data
Fields listed in routing_path must be mapped as dimensions (time_series_dimension: true):
{
  "settings": {
    "index.mode": "time_series",
    "index.routing_path": ["host.name"]
  }
}
Benefits:
- Up to 70% storage reduction
- Faster aggregations
- Better compression
2. Leverage Runtime Fields
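Runtime fields are evaluated at query time, so they trade query speed for flexibility. Use them to prototype a field or avoid a reindex, then index the field properly once it is queried often:
{
  "runtime_mappings": {
    "day_of_week": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['@timestamp'].value.getDayOfWeekEnum().toString())"
      }
    }
  }
}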
When to Get Expert Help
Consider professional Elasticsearch consulting if:
- Queries consistently exceed 1s latency
- Cluster CPU constantly > 80%
- OOM errors despite heap tuning
- Unable to handle traffic spikes
- Planning migration to Elasticsearch 8.x
Our team has optimized 500+ Elasticsearch clusters, achieving average performance improvements of 10-50x.
Conclusion
Query performance optimization is not a one-time task but an ongoing process:
- Monitor continuously: Track p95/p99 latencies
- Optimize iteratively: Start with highest-impact changes
- Test rigorously: Use Rally or custom benchmarks
- Document changes: Know what worked and why
Implementing even 3-4 techniques from this guide can yield 5-10x performance improvements.
Next Steps
- Run query profiler on your slowest queries
- Implement filter context for non-scoring queries
- Optimize field mappings (text vs keyword)
- Enable request caching for repeated queries
- Set up monitoring and alerting
Need help optimizing your Elasticsearch cluster? Schedule a free consultation with our performance engineering team.
Query Quotient Team
Elasticsearch Expert at QueryQuotient