Amazon Data Firehose
Prepare and load real-time data streams into data stores and analytics tools
Overview
Amazon Data Firehose is the easiest way to reliably load streaming data into data lakes, data stores and analytics tools. It can capture, transform, and load streaming data into Amazon S3, Amazon Redshift, Amazon OpenSearch Service (successor to Amazon Elasticsearch Service), generic HTTP endpoints, and service providers like Datadog, New Relic, and MongoDB. It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration. It can also batch, compress, transform, and encrypt the data before loading it, minimizing the amount of storage used at the destination and increasing security.
You can easily create a Firehose stream from the Amazon Web Services Management Console, configure it with a few clicks, and start sending data to the stream from hundreds of thousands of data sources to be loaded continuously to Amazon Web Services – all in just a few minutes. You can also configure your Firehose stream to automatically convert the incoming data to columnar formats like Apache Parquet and Apache ORC, before the data is delivered to Amazon S3, for cost-effective storage and analytics.
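As an illustration of sending data to an existing Firehose stream from code, the following is a minimal Python sketch, not a definitive implementation. It assumes a client object that exposes the `put_record_batch` operation, as the boto3 `firehose` client does. The helper names `chunked`, `to_record`, and `send_events` are hypothetical, and the sketch only respects the 500-records-per-call limit; it does not check the total request size or retry failed records.

```python
import json
from typing import Any, Iterator, List

# PutRecordBatch accepts at most 500 records per call.
MAX_BATCH_RECORDS = 500


def chunked(items: List[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_record(event: dict) -> dict:
    """Encode one event as newline-delimited JSON in the record shape
    expected by PutRecordBatch ({"Data": bytes})."""
    return {"Data": (json.dumps(event) + "\n").encode("utf-8")}


def send_events(client: Any, stream_name: str, events: List[dict]) -> int:
    """Send events in batches and return how many records were rejected.

    `client` is assumed to behave like boto3.client("firehose").
    Rejected records are only counted here, not retried.
    """
    failed = 0
    records = [to_record(e) for e in events]
    for batch in chunked(records, MAX_BATCH_RECORDS):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=batch,
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

With boto3 installed and credentials configured, a call might look like `send_events(boto3.client("firehose"), "example-stream", events)`, where `example-stream` is a placeholder stream name.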
With Firehose, you only pay for the amount of data you transmit through the service, and if applicable, for data format conversion and VPC delivery. There is no minimum fee or setup cost.
Benefits
Easy to use
Amazon Data Firehose provides a simple way to capture, transform, and load streaming data with just a few clicks in the Amazon Web Services Management Console. You create a Firehose stream, select the destinations, and start sending real-time data from hundreds of thousands of data sources simultaneously. The service takes care of stream management, including all the scaling, sharding, and monitoring needed to continuously load the data to destinations at the intervals you specify.
Integrated with Amazon Web Services services and service providers
Amazon Data Firehose is integrated with Amazon S3, Amazon Redshift, and Amazon OpenSearch Service. It can also deliver data to generic HTTP endpoints and directly to service providers like Datadog, New Relic, MongoDB, and Splunk. From the Amazon Web Services Management Console, you can point Firehose to an Amazon S3 bucket, Amazon Redshift table, or Amazon OpenSearch Service domain. You can then use your existing analytics applications and tools to analyze streaming data.
Serverless data transformation
Amazon Data Firehose enables you to prepare your streaming data before it is loaded to data stores. With Firehose, you can easily convert raw streaming data from your data sources into formats required by your destination data stores, without having to build your own data processing pipelines.
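One way to picture this kind of transformation is as an AWS Lambda function that Firehose invokes on each batch of records. The sketch below assumes that setup and follows the Lambda transformation record contract: each incoming record carries a `recordId` and base64-encoded `data`, and each returned record carries the same `recordId`, a `result` of `Ok`, `Dropped`, or `ProcessingFailed`, and base64-encoded `data`. The transformation itself (upper-casing a `level` field in JSON payloads) is a hypothetical example, not something the service prescribes.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform a batch of Firehose records.

    Assumes each record's payload is a JSON object. Records that cannot
    be parsed are returned unchanged and marked ProcessingFailed.
    """
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("payload is not a JSON object")

            # Hypothetical transformation: normalize the log level.
            payload["level"] = str(payload.get("level", "info")).upper()

            # Re-encode as newline-delimited JSON, then base64.
            data = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("ascii")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": data}
            )
        except ValueError:
            # Covers invalid base64, invalid UTF-8, and invalid JSON.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Returning every `recordId` exactly once, even for failed records, lets the service account for each record in the batch.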
Near real-time
Amazon Data Firehose captures and loads data in near real time. It loads new data into Amazon S3, Amazon Redshift, and Amazon OpenSearch Service within 60 seconds after the data is sent to the service. As a result, you can access new data sooner and react to business and operational events faster.
No ongoing administration
Amazon Data Firehose is a fully managed service that automatically provisions, manages, and scales the compute, memory, and network resources required to load your streaming data. Once set up, Firehose loads data continuously as it arrives.
Pay only for what you use
With Amazon Data Firehose, you pay only for the volume of data you transmit through the service, and if applicable, for data format conversion. There are no minimum fees or upfront commitments.
Use cases
Amazon Data Firehose is the easiest way to reliably load streaming data into data lakes, data stores and analytics tools. It can capture, transform, and load streaming data into Amazon S3, Amazon Redshift, and Amazon OpenSearch Service, enabling near real-time analytics with the business intelligence tools and dashboards you already use. Below are examples of key use cases that our customers tackle using Amazon Data Firehose.