Elasticsearch Query Performance Optimization: Complete Guide for 2025

Learn proven techniques to optimize Elasticsearch query performance, reduce latency from seconds to milliseconds, and handle 10x more queries per second. Includes real-world benchmarks and configuration examples.

15 min read
Query Quotient Team
Tags: query optimization, performance tuning, elasticsearch, query performance, latency reduction, caching

Slow Elasticsearch queries can cripple your application's user experience and inflate your infrastructure costs. In this comprehensive guide, we'll show you exactly how to optimize your Elasticsearch queries, from first principles to advanced techniques.

Why Query Performance Matters

Every 100ms of latency costs you real money:

  • E-commerce: Amazon found that every 100ms of latency costs 1% in sales
  • User experience: 53% of mobile users abandon sites that take over 3 seconds
  • Infrastructure costs: Slow queries require more nodes = higher AWS/cloud bills

Understanding Elasticsearch Query Execution

Before optimizing, understand how Elasticsearch executes queries:

1. Query Phase

  1. The coordinating node receives the query
  2. The query is broadcast to all relevant shards (see the routing example below)
  3. Each shard executes the query locally
  4. Each shard returns its top N results (document IDs and sort values) to the coordinating node

2. Fetch Phase

  1. The coordinating node identifies which documents to retrieve
  2. Fetch requests are sent to the relevant shards
  3. The full documents are assembled
  4. Results are returned to the client

Top 10 Query Optimization Techniques

1. Use Filter Context Instead of Query Context

Before (Slow):

{
  "query": {
    "bool": {
      "must": [
        { "term": { "status": "active" }},
        { "range": { "price": { "gte": 100 }}}
      ]
    }
  }
}

After (Fast):

{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "active" }},
        { "range": { "price": { "gte": 100 }}}
      ]
    }
  }
}

Impact: typically 2-5x faster for pure filtering queries. Filter context is:

  • Cacheable (results are reused across queries)
  • Binary yes/no matching (no relevance scoring)
  • Ideal for exact matches, ranges, and existence checks

2. Optimize Field Data Types

Choosing the right field type has massive performance impact:

Text vs Keyword:

  • text: Use for full-text search (analyzed, slower)
  • keyword: Use for exact matching, aggregations, sorting (faster)

Example mapping:

{
  "mappings": {
    "properties": {
      "email": { "type": "keyword" },
      "description": { "type": "text" },
      "status": { "type": "keyword" },
      "price": { "type": "float" },
      "created_at": { "type": "date" }
    }
  }
}

3. Reduce Result Set Size

Use _source filtering:

{
  "_source": ["id", "title", "price"],
  "query": { "match_all": {} }
}

Impact: 50-70% faster when you don't need all fields.

4. Leverage Query Caching

Elasticsearch has three cache layers:

Node Query Cache:

  • Caches the results of frequently used filters
  • Automatically managed and enabled by default
  • Controlled with: index.queries.cache.enabled: true

Shard Request Cache:

  • Caches entire search responses (by default only for size: 0 requests)
  • Force caching per request with: ?request_cache=true
  • Perfect for dashboards with repeated aggregation queries (example below)

Field Data Cache:

  • For aggregations and sorting
  • Monitor with: GET /_stats/fielddata
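
A minimal sketch of the shard request cache in action (index and field names are illustrative): force caching on an aggregation-only request, then check hit counts via the stats API.

GET /my-index/_search?request_cache=true
{
  "size": 0,
  "aggs": {
    "status_counts": { "terms": { "field": "status" } }
  }
}

GET /my-index/_stats/request_cache,query_cache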

5. Optimize Aggregations

Slow aggregation:

{
  "aggs": {
    "categories": {
      "terms": { "field": "category", "size": 10000 }
    }
  }
}

Faster aggregation (smaller size; execution_hint: map can help when relatively few documents match the query):

{
  "aggs": {
    "categories": {
      "terms": { 
        "field": "category",
        "size": 100,
        "execution_hint": "map"
      }
    }
  }
}

6. Use Scroll API for Large Result Sets

Don't do this for large datasets:

{ "query": { "match_all": {} }, "size": 10000 }

Use the Scroll API instead (on Elasticsearch 7.10+, search_after with a point-in-time is generally preferred for deep pagination, while scroll remains a good fit for one-off bulk exports; see the sketch after the scroll example):

POST /my-index/_search?scroll=1m
{ "query": { "match_all": {} }, "size": 1000 }

// Then scroll:
POST /_search/scroll
{ "scroll": "1m", "scroll_id": "DXF1..." }

7. Avoid Wildcard Queries on Text Fields

Extremely slow:

{ "query": { "wildcard": { "description": "*search*" }}}

Better alternatives:

  • Use an n-gram or edge n-gram tokenizer at index time
  • Use match_phrase_prefix for prefix-style matching (see the sketch below)
  • Use the completion suggester for autocomplete
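
As a hedged sketch of the match_phrase_prefix alternative (index and field names are illustrative), it expands only terms that start with the final word of the query instead of scanning every term the way a leading wildcard does:

GET /products/_search
{
  "query": {
    "match_phrase_prefix": {
      "description": {
        "query": "elastic sear",
        "max_expansions": 10
      }
    }
  }
}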

8. Optimize Index Structure

Shard sizing:

  • Target: 10-50GB per shard
  • Too many small shards = overhead
  • Too few large shards = poor parallelization

Formula for shard count:

Optimal shards = (Data size in GB / 30GB) × 1.5
Round up to nearest integer
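
Worked example using the formula above (data size, index name, and replica count are illustrative): for roughly 200GB of data, 200 / 30 ≈ 6.7, and 6.7 × 1.5 ≈ 10, so the index would be created with 10 primary shards:

PUT /products-v2
{
  "settings": {
    "number_of_shards": 10,
    "number_of_replicas": 1
  }
}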

9. Use Query Profiler

Identify bottlenecks:

{
  "profile": true,
  "query": { "match": { "title": "elasticsearch" }}
}

Look for:

  • build_scorer time > 50ms
  • next_doc time > 100ms
  • Large collector time

10. Implement Query Result Pagination

Search After (preferred for deep pagination). Omit search_after on the first page; each subsequent request passes the sort values of the last hit from the previous page. Sort on a stable field plus a unique tiebreaker rather than on _id alone:

{
  "query": { "match_all": {} },
  "size": 100,
  "sort": [
    { "created_at": "desc" },
    { "id": "asc" }
  ],
  "search_after": ["last_created_at_from_previous_page", "last_id_from_previous_page"]
}

Real-World Performance Improvements

Case Study 1: E-commerce Product Search

Before:

  • 500ms average query time
  • 200 queries/second capacity

After optimization:

  • 50ms average query time (10x improvement)
  • 2,000 queries/second capacity

Changes made:

  1. Switched filtering to filter context
  2. Enabled request caching
  3. Optimized field mappings (text → keyword for filters)
  4. Reduced _source fields returned

Case Study 2: Log Analytics Platform

Before:

  • 3-5 second aggregation queries
  • Frequent OOM errors

After optimization:

  • 300-500ms aggregation queries
  • Stable memory usage

Changes made:

  1. Time-based index strategy (daily indices)
  2. Reduced aggregation size limits
  3. Implemented index lifecycle management
  4. Added more RAM for field data cache

Monitoring Query Performance

Key Metrics to Track

# Query latency
GET /_nodes/stats/indices/search

# Cache hit rates
GET /_stats/request_cache,query_cache

# Slow query log
PUT /my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "500ms",
  "index.search.slowlog.threshold.query.info": "200ms"
}

Alerting Thresholds

  • Query latency p95 > 500ms: Investigate
  • Query latency p99 > 1s: Critical
  • Cache hit rate < 80%: Review caching strategy
  • Field data evictions > 10/min: Increase heap or optimize queries

Advanced Optimization Techniques

1. Index Sorting

Pre-sort segments at index creation time; queries that sort on the same field can then terminate early instead of scanning every matching document (see the query sketch after the settings):

{
  "settings": {
    "index": {
      "sort.field": "timestamp",
      "sort.order": "desc"
    }
  }
}
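
A hedged sketch of a query that benefits (index name illustrative): because segments are already sorted by timestamp, each shard can stop after collecting its top 20 hits instead of scoring every match, particularly when total hit counting is disabled:

GET /logs-2025/_search
{
  "track_total_hits": false,
  "size": 20,
  "sort": [{ "timestamp": "desc" }],
  "query": { "match_all": {} }
}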

2. Adaptive Replica Selection

Adaptive replica selection routes each search to the replica copies that are currently responding fastest. It has been enabled by default since Elasticsearch 7.0; if it was switched off, re-enable it via the cluster settings API:

PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.use_adaptive_replica_selection": true
  }
}

3. Frozen Indices

For rarely-accessed data on 7.x clusters:

POST /old-logs-2023/_freeze

Note that the _freeze API is deprecated as of 7.14 and removed in 8.x, where the frozen data tier backed by searchable snapshots takes its place.

Common Pitfalls to Avoid

  1. Deep pagination without search_after: Kills performance after page 100
  2. Too many shards: Each shard = overhead. Aim for 10-50GB per shard
  3. Not using keyword for aggregations: Analyzing text fields = slow
  4. Ignoring heap size: give the JVM heap at most 50% of RAM and keep it below ~32GB so compressed object pointers stay enabled (snippet after this list)
  5. Script fields in production: Painless scripts run per document at query time; precompute values at index time where possible
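
For pitfall 4, heap is configured outside the API, for example via a file under config/jvm.options.d/ (the 16g value is illustrative; keep -Xms and -Xmx equal):

# config/jvm.options.d/heap.options
-Xms16g
-Xmx16g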

Performance Benchmarking

Rally - Official Benchmarking Tool

# Install
pip3 install esrally

# Run a benchmark against an existing cluster
esrally race --track=geonames --target-hosts=localhost:9200 --pipeline=benchmark-only

Custom Load Testing

import time
from elasticsearch import Elasticsearch

# Point the client at your cluster; recent clients require a full URL.
# (On 8.x clients, pass query={...} instead of body={...}.)
es = Elasticsearch(['http://localhost:9200'])

# Fire 1,000 identical searches and measure total wall-clock time.
start = time.time()
for _ in range(1000):
    es.search(index='products', body={
        'query': {'match': {'title': 'laptop'}}
    })
end = time.time()

print(f'1000 queries in {end - start:.2f}s')
print(f'Average: {(end - start) / 1000 * 1000:.2f}ms per query')

Elasticsearch 8.x Specific Optimizations

1. Use TSDB for Time-Series Data

{
  "settings": {
    "index.mode": "time_series",
    "index.routing_path": ["host.name"]
  }
}
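
In practice, time-series mode is usually wired up through an index template for a data stream, and every field listed in index.routing_path must be a keyword mapped with time_series_dimension: true. A hedged sketch (template name, pattern, and fields are illustrative):

PUT /_index_template/metrics-hosts
{
  "index_patterns": ["metrics-hosts-*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.mode": "time_series",
      "index.routing_path": ["host.name"]
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "host.name": { "type": "keyword", "time_series_dimension": true },
        "cpu.usage": { "type": "double", "time_series_metric": "gauge" }
      }
    }
  }
}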

Benefits:

  • 70% storage reduction
  • Faster aggregations
  • Better compression

2. Leverage Runtime Fields

Runtime fields are computed at query time, so they trade some per-query speed for mapping flexibility; they are best suited to fields you query occasionally rather than to hot filters:

{
  "runtime_mappings": {
    "day_of_week": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['@timestamp'].value.getDayOfWeekEnum().toString())"
      }
    }
  }
}

When to Get Expert Help

Consider professional Elasticsearch consulting if:

  • Queries consistently exceed 1s latency
  • Cluster CPU constantly > 80%
  • OOM errors despite heap tuning
  • Unable to handle traffic spikes
  • Planning migration to Elasticsearch 8.x

Our team has optimized 500+ Elasticsearch clusters, achieving average performance improvements of 10-50x.

Conclusion

Query performance optimization is not a one-time task—it's an ongoing process:

  1. Monitor continuously: Track p95/p99 latencies
  2. Optimize iteratively: Start with highest-impact changes
  3. Test rigorously: Use Rally or custom benchmarks
  4. Document changes: Know what worked and why

Implementing even 3-4 techniques from this guide can yield 5-10x performance improvements.

Next Steps

  1. Run query profiler on your slowest queries
  2. Implement filter context for non-scoring queries
  3. Optimize field mappings (text vs keyword)
  4. Enable request caching for repeated queries
  5. Set up monitoring and alerting

Need help optimizing your Elasticsearch cluster? Schedule a free consultation with our performance engineering team.
