Pub/sub messaging: Apache Kafka versus Apache Pulsar
Apache Kafka set the bar for large-scale distributed messaging, but Apache Pulsar has some neat tricks of its own
These days, massively scalable pub/sub messaging is virtually synonymous with Apache Kafka. Apache Kafka continues to be the rock-solid, open-source, go-to choice for distributed streaming applications, whether you’re adding something like Apache Storm or Apache Spark for processing or using the processing tools provided by Apache Kafka itself. But it isn’t the only game in town.
Developed by Yahoo and now an Apache Software Foundation project, Apache Pulsar is going for the crown of messaging that Apache Kafka has worn for many years. Apache Pulsar offers the potential of faster throughput and lower latency, along with a compatible API that allows developers to switch from one to another with relative ease.
How should one choose between the venerable stalwart Apache Kafka and the upstart Apache Pulsar? Let’s look at their core open source offerings and what the core maintainers’ enterprise editions bring to the table.
Developed by LinkedIn and released as open source back in 2011, Apache Kafka has spread far and wide, pretty much becoming the default choice for many when thinking about adding a service bus or pub/sub system to an architecture. Since its debut, the ecosystem has grown considerably, adding the Schema Registry to enforce schemas in messaging, Kafka Connect for easy streaming from other data sources such as databases to Kafka, Kafka Streams for distributed stream processing, and most recently KSQL for performing SQL-like querying over Kafka topics. (A topic in Kafka is the name for a particular channel.)
The standard use case for many real-time pipelines built over the past few years has been to push data into Apache Kafka and then use a stream processor such as Apache Storm or Apache Spark to pull in data, perform the processing, and then publish output to another topic for downstream consumption. With Kafka Streams and KSQL, all of your data pipeline needs can be handled without having to leave the Apache Kafka project at any time, though of course, you can still use an external service to process your data if required.
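To make the consume, process, publish pattern above concrete, here is a minimal sketch in Python. It does not use Kafka or any Kafka client library: `InMemoryBroker`, `run_processor`, and the topic names are all hypothetical stand-ins, with each topic modeled as an append-only list and the consumer tracking its own read offset.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy stand-in for a broker (an assumption, not Kafka's API).

    Each topic is an append-only list of messages.
    """

    def __init__(self):
        self._topics = defaultdict(list)

    def publish(self, topic, message):
        # Append the message to the end of the topic's log.
        self._topics[topic].append(message)

    def read(self, topic, offset):
        # Return the messages at or after `offset`, plus the next offset to read.
        log = self._topics[topic]
        return log[offset:], len(log)


def run_processor(broker, source, sink, transform, offset=0):
    """Pull from `source`, apply `transform`, publish each result to `sink`."""
    messages, next_offset = broker.read(source, offset)
    for message in messages:
        broker.publish(sink, transform(message))
    return next_offset


broker = InMemoryBroker()

# A producer pushes raw readings into the input topic.
for reading in (20, 25, 30):
    broker.publish("temperatures-celsius", reading)

# The processor converts each reading and publishes to a second topic.
offset = run_processor(
    broker,
    source="temperatures-celsius",
    sink="temperatures-fahrenheit",
    transform=lambda celsius: celsius * 9 / 5 + 32,
)

# A downstream consumer reads the processed topic from the beginning.
results, _ = broker.read("temperatures-fahrenheit", 0)
print(results)
```

The processor returns the offset it reached, so a later call only handles messages published since the previous run. In a real deployment the broker, offset tracking, and delivery guarantees would come from Kafka and its stream processing tools rather than from code like this.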
This article originally appeared on InfoWorld.com.