Amazon OpenSearch Service features

Why OpenSearch Service?

Amazon OpenSearch Service is a fully managed service that simplifies the deployment and operation of search, observability, and log analytics applications, enabling customers to focus on deriving insights from their data instead of managing underlying infrastructure. The service provides flexible storage options, a vector engine for rich lexical and vector search, high-performance indexing capabilities, and robust security features to support a wide range of data-driven use cases. Beyond these core capabilities, Amazon OpenSearch Service offers seamless upgrades and patches, allowing customers to stay up-to-date without disruption. The service also enables infrastructure changes for cost optimization without any downtime, and provides a serverless deployment option with automatic scaling to dynamically adjust resources as needed. Additionally, Amazon OpenSearch Service features 24/7 monitoring, self-healing capabilities, and a 99.99% SLA for highly available, multi-AZ deployments with standby. The service integrates with other AWS offerings, including zero-ETL integration with Amazon S3, Amazon DynamoDB, and Amazon DocumentDB for a cohesive data analytics ecosystem. The service also includes visualization capabilities with OpenSearch Dashboards and Kibana (7.10 and earlier). You can deploy and run the latest versions of OpenSearch and 19 versions of ALv2 Elasticsearch (7.10 and earlier).

Next-gen OpenSearch Service UI for enhanced data exploration and collaboration

OpenSearch Service now offers a new, easy-to-use analytics experience that allows you to analyze operational data across multiple managed clusters, serverless collections, and Amazon S3 data sources from a single endpoint. This feature-rich experience supports a variety of use cases, including observability, security analytics, and logs workloads. Teams can analyze data from diverse sources without switching endpoints, reducing complexity and improving efficiency. Additionally, a new collaboration experience called Workspaces enables you to create dedicated views of operational dashboards, saved queries, and other team-related content. Teams can create dedicated environments to collaborate on dashboards, investigations, and other relevant content to increase ease of use and boost productivity.

Search

OpenSearch Service provides real-time document search capabilities that go beyond database search. This fully managed service uses the OpenSearch engine for search. OpenSearch is a full-featured, Lucene-based, portable, platform-agnostic open-source search engine supporting keyword search, natural language search, synonyms, multiple languages, and more. Core search capabilities include the following:

  • Acquires data from a database or content management system, a web or intranet crawler, or a streaming service
  • Provides search APIs to build a frontend on top of the search services
  • Powers searches across many attributes
  • Finds new documents that match a set of saved queries with prospective search (percolation)
  • Assesses usage patterns and performs capacity planning and cost prediction with OpenSearch Service monitoring capabilities
  • Uses built-in machine learning (ML) algorithms for k-nearest neighbors (k-NN) search to accomplish vector search, similarity search, semantic search, and more
  • Uses built-in ML algorithms for Learning to Rank to calculate relevance scores
  • Provides simple, scalable, and high-performing vector storage and search to power ML-augmented search experiences and generative AI applications
  • Uses multiple query languages, including SQL
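As an illustration of the k-NN capability listed above, the following minimal sketch builds a search request body in the shape used by the OpenSearch k-NN query. The field name `item_embedding` and the three-dimensional vector are assumptions made for this example; a real index would use its own `knn_vector` field and an embedding produced by your model.

```python
import json


def build_knn_query(field: str, vector: list[float], k: int) -> dict:
    """Build a search body that asks for the k nearest neighbors of `vector`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return {
        "size": k,
        "query": {"knn": {field: {"vector": vector, "k": k}}},
    }


# Hypothetical field name and a toy 3-dimensional embedding.
body = build_knn_query("item_embedding", [0.12, -0.40, 0.88], k=5)
print(json.dumps(body, indent=2))
```

The resulting body would be sent to the `_search` endpoint of an index that has a vector field with matching dimensions.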

Search resources

Video: AWS On Air for search 

Video: LexisNexis on ML-driven search 

Demo: Improve search results with Amazon OpenSearch Service 

Workshop: Improve search relevance with ML in Amazon OpenSearch Service

Blog: Novartis AG uses OpenSearch Service k-NN and SageMaker to power search and recommendation

Reference architecture diagram: Search-backed applications

Deployment and management

Getting started with OpenSearch Service is easy. You can set up and configure your OpenSearch Service cluster using the AWS Management Console or a single API call through the AWS Command Line Interface (AWS CLI). You can specify the number of instances, instance types, storage options, and modify or delete existing clusters at any time.
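The single-call setup described above can be sketched with the AWS SDK for Python (boto3) instead of the AWS CLI. The example only assembles the arguments for the CreateDomain operation so that it runs without credentials. The domain name, engine version, instance type, instance count, and volume size are illustrative assumptions, not recommendations.

```python
def build_domain_request(name: str, instance_type: str,
                         instance_count: int, volume_gib: int) -> dict:
    """Assemble keyword arguments for the CreateDomain API operation."""
    return {
        "DomainName": name,
        "EngineVersion": "OpenSearch_2.11",  # assumed version for illustration
        "ClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": "gp3",
            "VolumeSize": volume_gib,  # size in GiB per data node
        },
    }


# Hypothetical values; adjust to your own workload.
request = build_domain_request("logs-demo", "r6g.large.search", 3, 100)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("opensearch").create_domain(**request)
print(request)
```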

OpenSearch Service makes it easy to upgrade your OpenSearch and Elasticsearch clusters (up to version 7.10) to newer versions without any downtime, using in-place version upgrades. In-place upgrades remove the hassle of taking a manual snapshot, restoring it to a cluster running the newer version, and updating all your endpoint references.

OpenSearch Service provides built-in event monitoring and alerting, enabling you to monitor the data stored in your cluster and automatically send notifications based on preconfigured thresholds. Built using the OpenSearch alerting plugin, this feature lets you configure and manage alerts using your Kibana or OpenSearch Dashboards interface and the REST API. You can receive notifications through custom webhooks, Slack, Amazon Simple Notification Service (Amazon SNS), and Amazon Chime. You can also view cluster health metrics, including number of instances, cluster health, searchable documents, CPU, and memory, as well as disk utilization for data and master nodes through Amazon CloudWatch, at no additional charge.

With OpenSearch Service, there’s no need for OpenSearch query domain-specific language (DSL) proficiency. Write SQL queries with OpenSearch SQL or use the OpenSearch Piped Processing Language (PPL), a query language that lets you use pipe (|) syntax to explore, discover, and query your data. OpenSearch Dashboards also includes a SQL and PPL workbench.
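To make the two query styles concrete, here is a small sketch that prepares one SQL request and one PPL request for the OpenSearch SQL plugin endpoints. The index name `web_logs` and its fields (`status`, `host`) are assumptions for the example. On Elasticsearch-based domains the endpoint paths may differ, so check the Developer Guide for your engine version.

```python
import json


def sql_request(query: str) -> tuple[str, str, str]:
    """Return (method, path, body) for an OpenSearch SQL query."""
    return ("POST", "/_plugins/_sql", json.dumps({"query": query}))


def ppl_request(query: str) -> tuple[str, str, str]:
    """Return (method, path, body) for a Piped Processing Language query."""
    return ("POST", "/_plugins/_ppl", json.dumps({"query": query}))


# Hypothetical index and field names.
sql = sql_request("SELECT status, COUNT(*) FROM web_logs GROUP BY status")
ppl = ppl_request("source=web_logs | where status >= 500 | stats count() by host")

for method, path, body in (sql, ppl):
    print(method, path, body)
```

The PPL version reads left to right: pick a source, filter it, then aggregate, with each step separated by the pipe character.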

OpenSearch Service offers built-in OpenSearch Dashboards and Kibana (Elasticsearch version 7.10 and previous) and integrates with Logstash, so you can ingest and visualize your data using the open-source tools that you prefer. Perform trace analytics with support from OpenSearch Service for the open-source OpenTelemetry standard and continue to use your existing code with direct access to Elasticsearch APIs and plugins such as Kuromoji, Phonetic Analysis, Ingest Processor Attachment, Ingest User Agent Processor, and Mapper Murmur3.

With OpenSearch Service, you can securely connect your applications to your managed Elasticsearch (version 7.10 and previous) or OpenSearch environment from your Amazon Virtual Private Cloud (Amazon VPC) or through the public internet, configuring network access using VPC security groups or IP-based access policies. You can also securely authenticate users and control access using Amazon Cognito, AWS Identity and Access Management (IAM), or basic authentication with a username and password. OpenSearch Service uses the OpenSearch security plugin, helping you define granular permissions for indices, documents, or fields. You can also extend Kibana with read-only views and secure multi-tenant support. OpenSearch Service also supports built-in encryption for data at rest and in transit, so you can protect your data both when it is stored in your domain or in automated snapshots and when it is transferred between nodes in your domain. OpenSearch Service is HIPAA-eligible and compliant with PCI DSS, SOC, ISO, and FedRAMP standards, making it easy to build applications that meet compliance requirements.
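An IP-based access policy of the kind mentioned above can be sketched as a resource-based policy document. The account ID, Region, domain name, and CIDR range below are placeholders chosen for illustration. A production policy would usually be narrower than allowing every HTTP action.

```python
import json


def ip_access_policy(domain_arn: str, cidrs: list[str]) -> str:
    """Build a domain access policy that allows HTTP requests from given CIDRs."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "es:ESHttp*",
                "Resource": f"{domain_arn}/*",
                # Only requests from these source addresses are allowed.
                "Condition": {"IpAddress": {"aws:SourceIp": cidrs}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder ARN and a documentation-only address range.
policy_json = ip_access_policy(
    "arn:aws:es:us-east-1:123456789012:domain/logs-demo",
    ["203.0.113.0/24"],
)
print(policy_json)
```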

Serverless: Automatically provision and continually adjust to get fast data ingestion rates and millisecond response times during changing usage patterns and demand with Amazon OpenSearch Serverless.

Storage tiering

Hot storage allows for fast retrieval of frequently accessed data. UltraWarm is a warm storage tier that complements the OpenSearch Service hot storage tier by providing less expensive storage for older and less frequently accessed data while still providing an interactive querying experience. UltraWarm stores data in Amazon Simple Storage Service (Amazon S3) and uses custom, highly optimized nodes, purpose-built on the AWS Nitro System, to cache, prefetch, and query that data quickly.

With UltraWarm, you can retain up to 3 PB of data in a single OpenSearch Service cluster while reducing cost per GB by nearly 90% compared to the hot storage tier. You can also easily query and visualize the data in your Kibana (version 7.10 and previous) or OpenSearch Dashboards interface. Analyze both your recent (weeks) and historical (months or years) log data without spending hours or days restoring archived logs.

UltraWarm is a fully managed, low-cost, warm storage tier for OpenSearch Service. It is compatible with OpenSearch, Elasticsearch (until version 7.10), OpenSearch Dashboards, and Kibana (until version 7.10), helping you analyze data using the same tools that OpenSearch Service provides today. UltraWarm seamlessly integrates with existing OpenSearch Service features such as integrated alerting, SQL querying, and more. 

UltraWarm helps you cost-effectively expand the data that you want to analyze on OpenSearch Service. You can gain valuable insights on data that previously may have been deleted or archived. With UltraWarm, you can now economically retain more of your data to interactively analyze it whenever you want.

OpenSearch Service supports two integrated storage tiers, hot and UltraWarm. The hot tier is powered by data nodes that are used for indexing, updating, and providing the fastest access to data. UltraWarm nodes complement the hot tier by providing a low-cost, read-only tier for older and less frequently accessed data.

UltraWarm uses Amazon S3 for storage, which is designed for 99.999999999 percent durability and removes the need to configure an Elasticsearch replica for your warm data. Additionally, if you have more than one UltraWarm node, in the event of a node failure, the other UltraWarm nodes will automatically access the data as needed.

UltraWarm supports up to 3 PB of primary data. UltraWarm is designed to allow you to fully utilize 100% of this storage. And, because UltraWarm stores data on Amazon S3 for durability, you do not need to use additional storage for Elasticsearch replicas.
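The effect of not needing replicas can be shown with simple arithmetic. This sketch compares the storage footprint of the same primary data with and without replica copies. The 100 TB figure and the single replica are assumed values, and the model ignores index overhead and reserved space.

```python
def hot_footprint_tb(primary_tb: float, replicas: int) -> float:
    """Storage needed when every replica is a full copy of the primary data."""
    return primary_tb * (1 + replicas)


def ultrawarm_footprint_tb(primary_tb: float) -> float:
    """Storage counted for UltraWarm, where no replica copies are configured."""
    return primary_tb


primary = 100.0  # assumed amount of primary data, in TB
print("hot tier with 1 replica:", hot_footprint_tb(primary, replicas=1), "TB")
print("UltraWarm:", ultrawarm_footprint_tb(primary), "TB")
```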

UltraWarm delivers an interactive experience in OpenSearch Dashboards and Kibana by implementing granular I/O caching, prefetching, and query engine optimizations to provide similar performance to high-density instances using local storage.

To get started with UltraWarm, create a new OpenSearch Service domain with UltraWarm enabled through the console, CLI, or APIs. Once your domain is created, you can move data from hot to UltraWarm using the OpenSearch/Elasticsearch APIs. For more information, see the OpenSearch Service Developer Guide.
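Moving an index between the hot and UltraWarm tiers comes down to a few REST calls. The sketch below only builds the method and path for each call. The index name `logs-2023-01` is an assumption, and sending the requests (including request signing or authentication) is left out.

```python
def migrate_to_warm(index: str) -> tuple[str, str]:
    """Request that moves an index from hot storage to UltraWarm."""
    return ("POST", f"/_ultrawarm/migration/{index}/_warm")


def migration_status(index: str) -> tuple[str, str]:
    """Request that reports the progress of a migration."""
    return ("GET", f"/_ultrawarm/migration/{index}/_status")


def migrate_to_hot(index: str) -> tuple[str, str]:
    """Request that moves an index from UltraWarm back to hot storage."""
    return ("POST", f"/_ultrawarm/migration/{index}/_hot")


# Hypothetical index name.
for call in (migrate_to_warm, migration_status, migrate_to_hot):
    print(*call("logs-2023-01"))
```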

Cold storage is the lowest-cost storage option for OpenSearch Service, which allows you to retain infrequently accessed data in Amazon S3 and only pay for compute when you need it. Cold storage builds on UltraWarm, which provides specialized nodes that store data in Amazon S3 and uses a sophisticated caching solution to provide an interactive experience. By decoupling compute resources from storage, cold storage helps you retain any amount of data in your OpenSearch Service domain while reducing cost per GB to near Amazon S3 storage prices. Detach historical or infrequently accessed warm data while not in use and free up compute to help lower costs. Discover and selectively attach your cold data to your domain’s UltraWarm nodes in seconds with your choice of a Kibana (version 7.10 and previous) or OpenSearch Dashboards interface and easy-to-use APIs. With cold storage, you can query the attached cold data with a similar interactive experience and performance as your warm data.

OpenSearch includes certain Apache-licensed Elasticsearch code from Elasticsearch B.V. and other source code. Elasticsearch B.V. is not the source of that other source code. ELASTICSEARCH is a registered trademark of Elasticsearch B.V.

Cold storage is the fully managed, lowest-cost storage tier for OpenSearch Service, and it makes it easier for you to securely store and analyze your historical logs on demand. Cold storage helps you fully detach storage from compute when you are not actively performing data analysis and allows you to keep your data readily available at low cost. Cold storage data is available within the OpenSearch Service domain through your UltraWarm nodes. Cold storage seamlessly integrates with OpenSearch and OpenSearch Dashboards, as well as Elasticsearch (versions 7.9 and 7.10) and Kibana (versions 7.9 and 7.10). It helps you analyze data using the same tools that OpenSearch Service provides today.

Cold storage helps you cost-effectively expand the data that you want to analyze on OpenSearch Service and gain valuable insights on data that previously may have been deleted or archived. Cold storage is a great fit if you have the need to do research or forensic analysis on your older data and you want to use all the capabilities of OpenSearch Service to do so, at an affordable price. Cold storage is built for scale and is backed by Amazon S3. Find and discover the data that you need, attach it to the UltraWarm nodes in your cluster, and make it available for analysis in seconds. Attached cold data is subject to the existing fine-grained access control policies that limit access at the index, document, and field level.

With cold storage, OpenSearch Service supports three integrated storage tiers: hot, UltraWarm, and cold. The hot tier is used for indexing, updating, and providing the fastest access to data. UltraWarm seamlessly extends the hot tier with compute nodes that deliver a highly performant interactive experience for data that is durably stored in Amazon S3 and needs to be persistently available, currently supporting up to 3 PB of data in a single domain. With cold storage, you can now detach indices from UltraWarm while not in use and free up compute to help lower costs. With the cold storage APIs and the OpenSearch Dashboards and Kibana interfaces, you can discover indices based on index patterns and data timestamps to easily find what you need for analysis. That data can then be attached to the domain and ready for analysis in seconds. When you are done with analysis, simply detach the data to free up your compute again.

Cold storage is built for scale. While the storage limits for hot and warm data remain at 3 PB, you can store any amount of data in cold storage.

Cold storage builds on UltraWarm, which provides specialized nodes that store data in Amazon S3 and uses a sophisticated caching solution to provide an interactive experience. Cold data must first be attached to the UltraWarm nodes of your OpenSearch Service domain. Once attached, queries on this data are powered by existing UltraWarm nodes offering the same performance as your warm data. Attaching cold indices to your domain takes seconds if there is sufficient UltraWarm capacity available for the requested data. If you need additional capacity, UltraWarm data nodes must be added, which can take up to a few minutes.
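The detach, discover, and attach cycle described above can be sketched as three REST requests. The paths and body fields follow the cold storage API as best recalled from the OpenSearch Service Developer Guide, so treat them as assumptions and confirm them there before use. The index names and the `@timestamp` field are also illustrative.

```python
import json


def move_warm_to_cold(index: str, timestamp_field: str) -> tuple[str, str, str]:
    """Detach a warm index into cold storage (assumed path and body)."""
    body = json.dumps({"timestamp_field": timestamp_field})
    return ("POST", f"/_ultrawarm/migration/{index}/_cold", body)


def list_cold_indices() -> tuple[str, str, str]:
    """Discover indices currently held in cold storage (assumed path)."""
    return ("GET", "/_cold/indices/_search", "")


def attach_cold_to_warm(indices: list[str]) -> tuple[str, str, str]:
    """Attach cold indices to the UltraWarm nodes (assumed path and body)."""
    body = json.dumps({"indices": ",".join(indices)})
    return ("POST", "/_cold/migration/_warm", body)


# Hypothetical index names and timestamp field.
print(move_warm_to_cold("logs-2021-01", "@timestamp"))
print(list_cold_indices())
print(attach_cold_to_warm(["logs-2021-01", "logs-2021-02"]))
```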

Security analytics

Help your security operations (SecOps) teams detect potential threats quickly while having the tools to help with security investigations, all with low data retention costs. Secure your business data and rapidly detect potential security threats. OpenSearch Service provides out-of-the-box support for over 2,200 open-source Sigma security rules to detect potential security threats by filtering through the security findings. You can even customize or use default Sigma rules to rapidly detect potential security threats and send alerts to a preselected destination. Use out-of-the-box support for multiple log sources, including Windows, NetFlow, AWS CloudTrail, DNS, and more.

OpenSearch security analytics is designed to help investigate, detect, analyze, and respond to security threats that could jeopardize the operations of business-critical functions. These threats include the potential exposure of confidential data, cyber attacks, and other adverse security events. It includes the tools and features necessary for defining detection parameters, generating alerts, and responding effectively to potential threats.

We currently support eight log types including NetFlow, DNS logs, Apache access logs, Windows logs, AD/LDAP logs, Linux system logs, AWS CloudTrail logs, and Amazon S3 access logs.

You can use your existing ingestion pipelines that send JSON formatted data to OpenSearch.

OpenSearch security analytics packages over 2,200 Sigma security rules for out-of-the-box use with different types of security detectors. These rules are preselected once you provide minimal configuration about the log source.

Custom rules can be added for the supported log types above. These rules need to be in Sigma rule format and can be imported into OpenSearch before being used with a security detector.

Logs must be in JSON format. We recommend sending them in ECS (Elastic Common Schema) format.
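To give a feel for how a Sigma-style rule relates to a JSON log event, here is a deliberately simplified sketch. It is not the Sigma engine or the OpenSearch security analytics implementation: it handles only a single `selection` of exact field matches, and the rule content and the flat event field names are hypothetical.

```python
# A rule expressed with the usual Sigma sections (title, logsource, detection).
rule = {
    "title": "Console login without MFA (illustrative)",
    "logsource": {"product": "aws", "service": "cloudtrail"},
    "detection": {
        "selection": {"eventName": "ConsoleLogin", "mfaUsed": "No"},
        "condition": "selection",
    },
    "level": "medium",
}


def matches(rule: dict, event: dict) -> bool:
    """Return True when every field in the rule's selection equals the event's value."""
    detection = rule["detection"]
    if detection["condition"] != "selection":
        raise NotImplementedError("this sketch supports only 'condition: selection'")
    return all(event.get(field) == value
               for field, value in detection["selection"].items())


# Hypothetical, flattened JSON log events.
suspicious = {"eventName": "ConsoleLogin", "mfaUsed": "No", "sourceIp": "203.0.113.7"}
benign = {"eventName": "ConsoleLogin", "mfaUsed": "Yes", "sourceIp": "203.0.113.7"}

print(matches(rule, suspicious), matches(rule, benign))
```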

OpenSearch security analytics is available to you for no additional cost or licensing fees. You pay the same cost as you would to ingest other data into OpenSearch Service.

Security analytics comes preinstalled with OpenSearch Service running OpenSearch version 2.5 or higher.

Amazon Security Lake automatically centralizes security data from cloud, on-premises, and custom sources into a purpose-built data lake stored in your account. This aggregated data is normalized into a common format, stored in S3 buckets. This data can be ingested into OpenSearch Service, which allows you to visualize, query, and create reports. Security analytics provides a security rules engine that can help you detect and alert on potential security events, as well as help you correlate them to help with your investigation.

You can bring additional logs from Security Lake into OpenSearch and create a detector to run relevant rules on the ingested logs.

OpenSearch Optimized Instances

OR1, the OpenSearch Optimized Instance family for Amazon OpenSearch Service managed clusters, delivers up to 30% price-performance improvement over existing instances in internal benchmarks and uses Amazon S3 to provide 11 9s of durability. With OR1, Amazon OpenSearch Service uses OpenSearch innovation and AWS technologies to reimagine how data is indexed and stored in the cloud. OR1 enables customers to more economically and reliably scale their OpenSearch deployments without compromising on the interactive analytics experience they expect. OR1 offers pay-as-you-go and reserved instance pricing, with a simple hourly rate for the instance(s) and storage provisioned.

Customers widely use Amazon OpenSearch Service for operational log analytics because of its ability to ingest high volumes of data while also providing rich and interactive analytics over this data. OR1, the OpenSearch Optimized Instance family, delivers up to 30% price-performance improvement over existing instances in internal benchmarks and uses Amazon S3 to provide 11 9s of durability. If you are running indexing-heavy operational analytics workloads, you can benefit from the improved performance and compute efficiency. Additionally, in the event of a failure, OpenSearch can perform automatic data recovery to the last successful operation, improving the reliability of the domain.

Amazon OpenSearch Service supports two replication strategies: logical (document) replication and physical (segment) replication. With logical replication, the data is indexed on all the copies individually, leading to duplication of effort. With physical replication, data is indexed only on the primary copy, and additional copies are created by copying data from the primary. OR1, the new instances for Amazon OpenSearch Service managed clusters, use physical replication to write data to the remote store based on Amazon S3. The Amazon S3 repository, a highly durable data store, serves as the source of truth for all replication and recovery operations. This design leads to indexing performance improvements and an improved durability posture for Amazon OpenSearch Service domains.
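The difference in indexing effort between the two strategies can be shown with a toy count. This is a simplified model for illustration only, not a benchmark: it counts how many times a document is indexed and ignores the cost of copying segments. The document and copy counts are assumed values.

```python
def indexing_operations(docs: int, copies: int, strategy: str) -> int:
    """Count document-indexing operations for `copies` shard copies (primary included)."""
    if strategy == "logical":
        # Document replication: every copy indexes every document itself.
        return docs * copies
    if strategy == "physical":
        # Segment replication: only the primary indexes; other copies receive segments.
        return docs
    raise ValueError(f"unknown strategy: {strategy}")


docs, copies = 1_000_000, 3  # assumed: one primary plus two replicas
print("logical: ", indexing_operations(docs, copies, "logical"))
print("physical:", indexing_operations(docs, copies, "physical"))
```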

Amazon OpenSearch Service supports cluster manager nodes (master nodes), data nodes, and warm nodes. For data nodes, customers can select from general purpose, memory optimized, compute optimized, storage optimized, and now OpenSearch optimized instances, depending on the role and workload characteristics. For warm nodes, Amazon OpenSearch Service provides UltraWarm instances that are optimized to reduce the cost of storing warm data. OR1 instances are the first option in the new OpenSearch Optimized instance family. They are memory optimized and available as data nodes. OR1 instances provide improved indexing throughput over standard memory optimized instances. Additionally, they provide data durability without relying on snapshots and provide fast automated recovery. Both OR1 and UltraWarm instances use a local store (EBS) and a remote store (managed storage based on Amazon S3) to store data. For OR1, a copy of the data is kept in both the local store and the remote store, whereas for UltraWarm, to reduce storage costs, data is kept primarily in the remote store and is moved to the local store depending on the access pattern.

OR1 instances use EBS as a local store and Amazon S3 as the remote store. All data is synchronously written to Amazon S3, which is designed to provide 99.999999999% (11 9s) of data durability.

OR1 instances can be used as data nodes for all new Amazon OpenSearch Service managed clusters that are created on OpenSearch version 2.11 or later and have encryption at rest enabled. At the time of launch, OR1 instances will not be available for managed clusters that were created using other instances for data nodes. For OR1, you need to provision Graviton instances for cluster managers.
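The two configuration requirements stated above (engine version and encryption at rest) can be expressed as a small check. The function name and the `OpenSearch_2.11` style of version string are assumptions for this sketch. It does not check the other conditions, such as the cluster being newly created or the cluster manager instance type.

```python
def or1_eligible(engine_version: str, encryption_at_rest: bool) -> bool:
    """Check the version and encryption requirements for OR1 data nodes."""
    engine, _, number = engine_version.partition("_")
    if engine != "OpenSearch":
        return False
    major, minor = (int(part) for part in number.split(".")[:2])
    # Compare as integers so that 2.9 is correctly treated as older than 2.11.
    return (major, minor) >= (2, 11) and encryption_at_rest


print(or1_eligible("OpenSearch_2.11", True))
print(or1_eligible("OpenSearch_2.9", True))
print(or1_eligible("OpenSearch_2.13", False))
```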

In the event of a red index, OR1 instances automatically restore the missing shards from the remote store (Amazon S3). The recovery time varies with the volume of data to be recovered.