Elasticsearch Query Performance Optimization: Complete Guide for 2025

Learn proven techniques to optimize Elasticsearch query performance, reduce latency from seconds to milliseconds, and handle 10x more queries per second. Includes real-world benchmarks and configuration examples.

15 min read
Query Quotient Team
Tags: query optimization, performance tuning, elasticsearch, query performance, latency reduction, caching

Slow Elasticsearch queries can cripple your application's user experience and inflate your infrastructure costs. In this comprehensive guide, we'll show you exactly how to optimize your Elasticsearch queries, from first principles to advanced techniques.

Why Query Performance Matters

Every 100ms of latency costs you real money:

  • E-commerce: Amazon found that every 100ms of latency costs 1% in sales
  • User experience: 53% of mobile users abandon sites that take over 3 seconds
  • Infrastructure costs: Slow queries require more nodes = higher AWS/cloud bills

Understanding Elasticsearch Query Execution

Before optimizing, understand how Elasticsearch executes queries:

1. Query Phase

  1. The coordinating node receives the query
  2. The query is broadcast to all relevant shards (see the routing example below)
  3. Each shard executes the query locally
  4. Each shard returns its top N results (document IDs and sort values) to the coordinating node

2. Fetch Phase

  1. The coordinating node identifies which documents to retrieve
  2. Fetch requests are sent to the relevant shards
  3. The full documents are assembled
  4. Results are returned to the client

Top 10 Query Optimization Techniques

1. Use Filter Context Instead of Query Context

Before (Slow):

{
  "query": {
    "bool": {
      "must": [
        { "term": { "status": "active" }},
        { "range": { "price": { "gte": 100 }}}
      ]
    }
  }
}

After (Fast):

{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "active" }},
        { "range": { "price": { "gte": 100 }}}
      ]
    }
  }
}

Impact: typically 2-5x faster for pure filtering queries. Filter context is:

  • Cacheable (results are reused across queries)
  • Binary yes/no matching (no relevance scoring)
  • Ideal for exact matches, ranges, and existence checks

2. Optimize Field Data Types

Choosing the right field type has massive performance impact:

Text vs Keyword:

  • text: Use for full-text search (analyzed, slower)
  • keyword: Use for exact matching, aggregations, sorting (faster)

Example mapping:

{
  "mappings": {
    "properties": {
      "email": { "type": "keyword" },
      "description": { "type": "text" },
      "status": { "type": "keyword" },
      "price": { "type": "float" },
      "created_at": { "type": "date" }
    }
  }
}

3. Reduce Result Set Size

Use _source filtering:

{
  "_source": ["id", "title", "price"],
  "query": { "match_all": {} }
}

Impact: 50-70% faster when you don't need all fields.

4. Leverage Query Caching

Elasticsearch has three cache layers:

Node Query Cache:

  • Caches the results of frequently used filters
  • Automatically managed and enabled by default
  • Controlled with: index.queries.cache.enabled: true

Shard Request Cache:

  • Caches entire search responses (by default only for size: 0 requests)
  • Force caching per request with: ?request_cache=true
  • Perfect for dashboards with repeated aggregation queries (example below)

Field Data Cache:

  • For aggregations and sorting
  • Monitor with: GET /_stats/fielddata
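
A minimal sketch of the shard request cache in action (index and field names are illustrative): force caching on an aggregation-only request, then check hit counts via the stats API.

GET /my-index/_search?request_cache=true
{
  "size": 0,
  "aggs": {
    "status_counts": { "terms": { "field": "status" } }
  }
}

GET /my-index/_stats/request_cache,query_cache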

5. Optimize Aggregations

Slow aggregation:

{
  "aggs": {
    "categories": {
      "terms": { "field": "category", "size": 10000 }
    }
  }
}

Faster aggregation (smaller size; execution_hint: map can help when relatively few documents match the query):

{
  "aggs": {
    "categories": {
      "terms": { 
        "field": "category",
        "size": 100,
        "execution_hint": "map"
      }
    }
  }
}

6. Use Scroll API for Large Result Sets

Don't do this for large datasets:

{ "query": { "match_all": {} }, "size": 10000 }

Use the Scroll API instead (on Elasticsearch 7.10+, search_after with a point-in-time is generally preferred for deep pagination, while scroll remains a good fit for one-off bulk exports; see the sketch after the scroll example):

POST /my-index/_search?scroll=1m
{ "query": { "match_all": {} }, "size": 1000 }

// Then scroll:
POST /_search/scroll
{ "scroll": "1m", "scroll_id": "DXF1..." }

7. Avoid Wildcard Queries on Text Fields

Extremely slow:

{ "query": { "wildcard": { "description": "*search*" }}}

Better alternatives:

  • Use an n-gram or edge n-gram tokenizer at index time
  • Use match_phrase_prefix for prefix-style matching (see the sketch below)
  • Use the completion suggester for autocomplete
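
As a hedged sketch of the match_phrase_prefix alternative (index and field names are illustrative), it expands only terms that start with the final word of the query instead of scanning every term the way a leading wildcard does:

GET /products/_search
{
  "query": {
    "match_phrase_prefix": {
      "description": {
        "query": "elastic sear",
        "max_expansions": 10
      }
    }
  }
}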

8. Optimize Index Structure

Shard sizing:

  • Target: 10-50GB per shard
  • Too many small shards = overhead
  • Too few large shards = poor parallelization

Formula for shard count:

Optimal shards = (Data size in GB / 30GB) × 1.5
Round up to nearest integer
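
Worked example using the formula above (data size, index name, and replica count are illustrative): for roughly 200GB of data, 200 / 30 ≈ 6.7, and 6.7 × 1.5 ≈ 10, so the index would be created with 10 primary shards:

PUT /products-v2
{
  "settings": {
    "number_of_shards": 10,
    "number_of_replicas": 1
  }
}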

9. Use Query Profiler

Identify bottlenecks:

{
  "profile": true,
  "query": { "match": { "title": "elasticsearch" }}
}

Look for:

  • build_scorer time > 50ms
  • next_doc time > 100ms
  • Large collector time

10. Implement Query Result Pagination

Search After (preferred for deep pagination). Omit search_after on the first page; each subsequent request passes the sort values of the last hit from the previous page. Sort on a stable field plus a unique tiebreaker rather than on _id alone:

{
  "query": { "match_all": {} },
  "size": 100,
  "sort": [
    { "created_at": "desc" },
    { "id": "asc" }
  ],
  "search_after": ["last_created_at_from_previous_page", "last_id_from_previous_page"]
}

Real-World Performance Improvements

Case Study 1: E-commerce Product Search

Before:

  • 500ms average query time
  • 200 queries/second capacity

After optimization:

  • 50ms average query time (10x improvement)
  • 2,000 queries/second capacity

Changes made:

  1. Switched filtering to filter context
  2. Enabled request caching
  3. Optimized field mappings (text → keyword for filters)
  4. Reduced _source fields returned

Case Study 2: Log Analytics Platform

Before:

  • 3-5 second aggregation queries
  • Frequent OOM errors

After optimization:

  • 300-500ms aggregation queries
  • Stable memory usage

Changes made:

  1. Time-based index strategy (daily indices)
  2. Reduced aggregation size limits
  3. Implemented index lifecycle management
  4. Added more RAM for field data cache

Monitoring Query Performance

Key Metrics to Track

# Query latency
GET /_nodes/stats/indices/search

# Cache hit rates
GET /_stats/request_cache,query_cache

# Slow query log
PUT /my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "500ms",
  "index.search.slowlog.threshold.query.info": "200ms"
}

Alerting Thresholds

  • Query latency p95 > 500ms: Investigate
  • Query latency p99 > 1s: Critical
  • Cache hit rate < 80%: Review caching strategy
  • Field data evictions > 10/min: Increase heap or optimize queries

Advanced Optimization Techniques

1. Index Sorting

Pre-sort segments at index creation time; queries that sort on the same field can then terminate early instead of scanning every matching document (see the query sketch after the settings):

{
  "settings": {
    "index": {
      "sort.field": "timestamp",
      "sort.order": "desc"
    }
  }
}
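
A hedged sketch of a query that benefits (index name illustrative): because segments are already sorted by timestamp, each shard can stop after collecting its top 20 hits instead of scoring every match, particularly when total hit counting is disabled:

GET /logs-2025/_search
{
  "track_total_hits": false,
  "size": 20,
  "sort": [{ "timestamp": "desc" }],
  "query": { "match_all": {} }
}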

2. Adaptive Replica Selection

Adaptive replica selection routes each search to the replica copies that are currently responding fastest. It has been enabled by default since Elasticsearch 7.0; if it was switched off, re-enable it via the cluster settings API:

PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.use_adaptive_replica_selection": true
  }
}

3. Frozen Indices

For rarely-accessed data on 7.x clusters:

POST /old-logs-2023/_freeze

Note that the _freeze API is deprecated as of 7.14 and removed in 8.x, where the frozen data tier backed by searchable snapshots takes its place.

Common Pitfalls to Avoid

  1. Deep pagination without search_after: Kills performance after page 100
  2. Too many shards: Each shard = overhead. Aim for 10-50GB per shard
  3. Not using keyword for aggregations: Analyzing text fields = slow
  4. Ignoring heap size: give the JVM heap at most 50% of RAM and keep it below ~32GB so compressed object pointers stay enabled (snippet after this list)
  5. Script fields in production: Painless scripts run per document at query time; precompute values at index time where possible
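
For pitfall 4, heap is configured outside the API, for example via a file under config/jvm.options.d/ (the 16g value is illustrative; keep -Xms and -Xmx equal):

# config/jvm.options.d/heap.options
-Xms16g
-Xmx16g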

Performance Benchmarking

Rally - Official Benchmarking Tool

# Install
pip3 install esrally

# Run a benchmark against an existing cluster
esrally race --track=geonames --target-hosts=localhost:9200 --pipeline=benchmark-only

Custom Load Testing

import time
from elasticsearch import Elasticsearch

# Point the client at your cluster; recent clients require a full URL.
# (On 8.x clients, pass query={...} instead of body={...}.)
es = Elasticsearch(['http://localhost:9200'])

# Fire 1,000 identical searches and measure total wall-clock time.
start = time.time()
for _ in range(1000):
    es.search(index='products', body={
        'query': {'match': {'title': 'laptop'}}
    })
end = time.time()

print(f'1000 queries in {end - start:.2f}s')
print(f'Average: {(end - start) / 1000 * 1000:.2f}ms per query')

Elasticsearch 8.x Specific Optimizations

1. Use TSDB for Time-Series Data

{
  "settings": {
    "index.mode": "time_series",
    "index.routing_path": ["host.name"]
  }
}
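
In practice, time-series mode is usually wired up through an index template for a data stream, and every field listed in index.routing_path must be a keyword mapped with time_series_dimension: true. A hedged sketch (template name, pattern, and fields are illustrative):

PUT /_index_template/metrics-hosts
{
  "index_patterns": ["metrics-hosts-*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.mode": "time_series",
      "index.routing_path": ["host.name"]
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "host.name": { "type": "keyword", "time_series_dimension": true },
        "cpu.usage": { "type": "double", "time_series_metric": "gauge" }
      }
    }
  }
}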

Benefits:

  • 70% storage reduction
  • Faster aggregations
  • Better compression

2. Leverage Runtime Fields

Runtime fields are computed at query time, so they trade some per-query speed for mapping flexibility; they are best suited to fields you query occasionally rather than to hot filters:

{
  "runtime_mappings": {
    "day_of_week": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['@timestamp'].value.getDayOfWeekEnum().toString())"
      }
    }
  }
}

When to Get Expert Help

Consider professional Elasticsearch consulting if:

  • Queries consistently exceed 1s latency
  • Cluster CPU constantly > 80%
  • OOM errors despite heap tuning
  • Unable to handle traffic spikes
  • Planning migration to Elasticsearch 8.x

Our team has optimized 500+ Elasticsearch clusters, achieving average performance improvements of 10-50x.

Conclusion

Query performance optimization is not a one-time task—it's an ongoing process:

  1. Monitor continuously: Track p95/p99 latencies
  2. Optimize iteratively: Start with highest-impact changes
  3. Test rigorously: Use Rally or custom benchmarks
  4. Document changes: Know what worked and why

Implementing even 3-4 techniques from this guide can yield 5-10x performance improvements.

Next Steps

  1. Run query profiler on your slowest queries
  2. Implement filter context for non-scoring queries
  3. Optimize field mappings (text vs keyword)
  4. Enable request caching for repeated queries
  5. Set up monitoring and alerting

Need help optimizing your Elasticsearch cluster? Schedule a free consultation with our performance engineering team.
