What I Learned From Building Apache Kafka At LinkedIn
Apache Kafka wasn’t the first open source project I was involved in at LinkedIn. We’d also built a key-value store, a workflow system, and a number of other things. The biggest difference with Kafka was that I think we were much more intentional about the space and what would be possible. The other projects began as quick hacks or pet projects. By the time I started working on Kafka, I had much more perspective on the larger technology ecosystem and what problems would be worth solving.
The basic thesis was that advances in distributed systems made it possible to build all kinds of horizontally scalable data systems, and that one of the biggest gaps would be how all the data and applications in a big digital company plugged together. We thought there could be a data fabric for this, in the same way relational databases backed individual apps. This was very much about working backward from the end state of how a company would operate and then attacking the most important part of that.
I think people underestimate this kind of “end state” thinking in the early part of a project because the early progress and impact are nowhere close to that end-state vision, so the difference isn’t immediately observable. In the early use cases both the workflow system and Kafka were pretty valuable. The big difference was what they could grow into.
Investors often think about this as the “addressable market” for a company. If the market is limited that will be the ultimate constraint on the company’s success. I think people should think about investing their time in a similar way.
New categories are hard
When we open-sourced the key-value store LinkedIn built, it was immediately pretty successful and got production adoption. I didn’t originally realize how much of this was because it was easy to place in existing categories. It was basically a database, and people knew what that was; the abstraction it provided was a distributed hash table, and programmers understand hash tables. So the effort required to understand what it was and where it might be useful was really low.
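The point about familiar abstractions can be made concrete. The interface of a key-value store is just a hash table with put/get, which is why the category was so easy to grasp. A toy sketch (my own illustrative names, not the actual API of LinkedIn’s store):

```python
class KeyValueStore:
    """Toy key-value store: the familiar hash-table abstraction.

    A real distributed store would partition keys across nodes and
    replicate them; the programming model stays this simple.
    """

    def __init__(self):
        self._data = {}  # single-node stand-in for a partitioned store

    def put(self, key, value):
        """Associate value with key, overwriting any previous value."""
        self._data[key] = value

    def get(self, key, default=None):
        """Return the value for key, or default if the key is absent."""
        return self._data.get(key, default)
```

Because every programmer already knows this interface, the only new thing to learn was the distribution, not the abstraction.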
Kafka was the opposite: something genuinely new. It addressed some problems message queues had, but was really something quite different: a kind of database or filesystem for large-scale event stream storage and processing. We eventually started calling this an “event streaming platform,” but at the time we didn’t really have a phrase for it.
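The abstraction behind this framing is an append-only log: producers append events, each event gets a monotonically increasing offset, and consumers read sequentially from an offset they control. A minimal sketch of that idea (names and signatures are mine, not Kafka’s actual API, and this ignores partitioning, replication, and retention):

```python
class EventLog:
    """Toy append-only log illustrating the event-streaming abstraction."""

    def __init__(self):
        self._events = []

    def append(self, event):
        """Append an event and return its offset (its position in the log)."""
        self._events.append(event)
        return len(self._events) - 1

    def read(self, offset, max_events=10):
        """Return up to max_events starting at offset.

        Consumers track their own offsets, so many independent readers
        can replay the same log at their own pace.
        """
        return self._events[offset:offset + max_events]
```

Unlike a traditional message queue, reading doesn’t destroy the data: the log is durable storage that any number of consumers can replay, which is what made “queue” the wrong category.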
This made it really hard to get adoption for Kafka early on because there wasn’t really an existing category that answered the “what is it?” question. I often say that Kafka was released to “resounding silence”.
This actually cuts the other way over time, I think. There are a lot of positive aspects to doing something genuinely new because once it starts to catch on, you get to be emblematic of the new trend and category.
People underestimate both the time and impact of doing anything really well
I originally estimated that implementing Kafka would take three months. We’re still working on it many, many years later, so you probably shouldn’t take time estimates from software engineers too seriously.
This article originally appeared on forbes.com.