1. Introduction

After a couple of months learning and researching about Kafka Streams, I wasn't able to find much information about how to test Kafka Streams topologies, so I would like to share how a Kafka Streams topology can be tested with unit or integration tests.
We have the following scenarios:
Bank balance: extracted from a Udemy course. Process all incoming transactions and accumulate them in a balance.
Customer purchases dispatcher: process all incoming purchases and dispatch each one to the topic of the customer identified by the purchase code received.
Menu preparation: for each customer, the stream receives several recipes, and these recipes must be grouped into a menu and sent by email to the customer. A single email should be received by each customer.
For the above scenarios, we have unit and/or integration tests.
Integration tests have been developed with the spring-kafka-test library.
2. Setup
Testing a Kafka Streams topology is only possible from version 1.1.0 onwards, so we need to set this version for all our Kafka dependencies.
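For reference, a minimal sketch of the dependencies involved, assuming a Maven build and a Spring BOM that manages the spring-kafka-test version:

```xml
<!-- Kafka Streams 1.1.0, the first version that ships the test utilities -->
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams</artifactId>
    <version>1.1.0</version>
</dependency>

<!-- Test driver for unit testing topologies -->
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams-test-utils</artifactId>
    <version>1.1.0</version>
    <scope>test</scope>
</dependency>

<!-- Embedded Kafka broker for integration tests -->
<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka-test</artifactId>
    <scope>test</scope>
</dependency>
```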
Some dependencies, such as JUnit, Mockito, etc., have been omitted to avoid verbosity.
3. Unit tests
Unit tests for Kafka Streams are available from version 1.1.0, and they are the best way to test the topology of your Kafka Streams application. The main advantage of unit tests over integration tests is that they do not require a running Kafka ecosystem, therefore they are faster to execute and more isolated.
Let’s suppose that we have the following scenario:
We have a topic to which all the purchases made in our application are sent, and a topic for each customer. Each purchase has an identification code which includes the code of the customer who made the purchase. We have to redirect this purchase to the customer's own topic. To know the topic related to each customer, we receive a map where the key is the customer code and the value is the target topic.
In addition, we have to replace the Spanish character 'ñ' with 'n'.
The solution provided is the following:
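A minimal sketch of such a topology, assuming String keys and values, a hypothetical input topic named purchases-topic, and that the purchase code travels as the record key:

```java
import java.util.Map;

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;

public class PurchaseDispatcherTopology {

    // customerTopics maps each customer code to its target topic.
    public Topology build(Map<String, String> customerTopics) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> purchases = builder.stream("purchases-topic");

        // Replace the Spanish character 'ñ' with 'n' in the payload.
        KStream<String, String> normalized = purchases.mapValues(value -> value.replace("ñ", "n"));

        // Route each purchase to the topic of the customer whose code appears in the purchase code (the key).
        customerTopics.forEach((customerCode, targetTopic) ->
                normalized
                        .filter((purchaseCode, purchase) -> purchaseCode != null && purchaseCode.contains(customerCode))
                        .to(targetTopic));

        return builder.build();
    }
}
```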
Now, we can test our solution.
Following the documentation, we need to create a TopologyTestDriver, plus a ConsumerRecordFactory if we want to pipe messages through the topology and read the output.
The driver configuration should be the same as the one we have in our Kafka environment.
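For example, a sketch of that setup against the topology sketched above (the bootstrap servers value is irrelevant for the test driver, but the property is required):

```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.test.ConsumerRecordFactory;
import org.junit.After;
import org.junit.Before;

public class PurchaseDispatcherTopologyTest {

    private TopologyTestDriver testDriver;
    private ConsumerRecordFactory<String, String> recordFactory;

    @Before
    public void setUp() {
        Properties config = new Properties();
        config.put(StreamsConfig.APPLICATION_ID_CONFIG, "purchase-dispatcher-test");
        config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234");
        config.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        config.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        // Customer "john" has his own topic.
        Topology topology = new PurchaseDispatcherTopology()
                .build(Collections.singletonMap("john", "john-topic"));
        testDriver = new TopologyTestDriver(topology, config);
        recordFactory = new ConsumerRecordFactory<>("purchases-topic",
                new StringSerializer(), new StringSerializer());
    }

    @After
    public void tearDown() {
        testDriver.close();
    }
}
```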
Once we have our TopologyTestDriver, we can test our topology.
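A sketch of the test itself, added as a method of the test class above; it additionally needs the ProducerRecord, StringDeserializer, OutputVerifier and @Test imports. The purchase code and payload are made-up examples:

```java
@Test
public void shouldDispatchPurchaseToCustomerTopicAndReplaceSpanishCharacter() {
    // A purchase whose code contains the customer code "john".
    testDriver.pipeInput(recordFactory.create("purchase-john-001", "año nuevo"));

    ProducerRecord<String, String> output =
            testDriver.readOutput("john-topic", new StringDeserializer(), new StringDeserializer());

    // The purchase ends up in the customer's topic with 'ñ' replaced by 'n'.
    OutputVerifier.compareKeyValue(output, "purchase-john-001", "ano nuevo");
}
```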
Once we have finished our test, we can run it and verify that everything is fine.
4. Integration tests
In the same way that unit tests help us verify that our topology is well designed, integration tests also help us with this task, with the added value of bringing the Kafka ecosystem into our tests.
This implies that our tests will be more "real", but on the other hand, they will be much slower.
The Spring framework has developed a very useful library that provides everything necessary to write good integration tests. Further information can be found here.
Let’s suppose that we have the following scenario:
We have a topic with incoming transactions, and we must group them by customer and build a balance from these transactions. This balance will hold the sum of the transaction amounts, the transaction count, and the timestamp of the last transaction.
The solution provided is the following:
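A minimal sketch of such a topology, assuming transactions arrive keyed by customer with the amount as a Long value, and using a simple `count|sum|lastTimestamp` string as the balance representation (a real solution would typically use JSON). The topic names are assumptions:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.Consumed;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.Serialized;

public class BankBalanceTopology {

    public Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        // Incoming transactions, keyed by customer; the value is the transaction amount.
        KStream<String, Long> transactions =
                builder.stream("transactions-topic", Consumed.with(Serdes.String(), Serdes.Long()));

        // Group by customer and accumulate count, sum and last timestamp.
        KTable<String, String> balances = transactions
                .groupByKey(Serialized.with(Serdes.String(), Serdes.Long()))
                .aggregate(
                        () -> "0|0|0",
                        (customer, amount, balance) -> updateBalance(amount, balance),
                        Materialized.with(Serdes.String(), Serdes.String()));

        balances.toStream().to("balances-topic", Produced.with(Serdes.String(), Serdes.String()));
        return builder.build();
    }

    // Balance encoded as count|sum|lastTimestamp.
    private static String updateBalance(Long amount, String balance) {
        String[] parts = balance.split("\\|");
        long count = Long.parseLong(parts[0]) + 1;
        long sum = Long.parseLong(parts[1]) + amount;
        long lastTimestamp = System.currentTimeMillis(); // simplification: processing time
        return count + "|" + sum + "|" + lastTimestamp;
    }
}
```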
Now, we can test our solution.
Following the documentation, we have to define some configuration beans.
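For instance, a sketch of a test configuration exposing a producer for the transactions topic; the broker address is taken from the spring.embedded.kafka.brokers property set by the embedded broker, and the Kafka Streams topology itself is assumed to be started elsewhere against the same broker:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.LongSerializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.support.PropertySourcesPlaceholderConfigurer;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;

@Configuration
public class KafkaTestConfig {

    // Set by the embedded Kafka broker started by spring-kafka-test.
    @Value("${spring.embedded.kafka.brokers}")
    private String brokerAddresses;

    // Needed so the ${...} placeholder above is resolved in a plain test context.
    @Bean
    public static PropertySourcesPlaceholderConfigurer propertyConfigurer() {
        return new PropertySourcesPlaceholderConfigurer();
    }

    @Bean
    public ProducerFactory<String, Long> producerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, brokerAddresses);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, LongSerializer.class);
        return new DefaultKafkaProducerFactory<>(props);
    }

    @Bean
    public KafkaTemplate<String, Long> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}
```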
Once we have our configuration class in place, we can create our integration test.
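A sketch of such a test, assuming the configuration class above and that the bank balance topology is running against the embedded broker; topic names, group id and customer data are illustrative:

```java
import java.util.Map;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.test.context.EmbeddedKafka;
import org.springframework.kafka.test.rule.KafkaEmbedded;
import org.springframework.kafka.test.utils.KafkaTestUtils;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringRunner;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;

@RunWith(SpringRunner.class)
@ContextConfiguration(classes = KafkaTestConfig.class)
@EmbeddedKafka(topics = {"transactions-topic", "balances-topic"})
public class BankBalanceIntegrationTest {

    @Autowired
    private KafkaTemplate<String, Long> kafkaTemplate;

    @Autowired
    private KafkaEmbedded embeddedKafka;

    @Test
    public void shouldAccumulateTransactionsIntoABalance() throws Exception {
        // Send two transactions for the same customer.
        kafkaTemplate.send("transactions-topic", "john", 100L);
        kafkaTemplate.send("transactions-topic", "john", 200L);

        // Plain consumer subscribed to the output topic.
        Map<String, Object> consumerProps =
                KafkaTestUtils.consumerProps("balance-test-group", "true", embeddedKafka);
        consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        Consumer<String, String> consumer = new DefaultKafkaConsumerFactory<>(
                consumerProps, new StringDeserializer(), new StringDeserializer()).createConsumer();
        embeddedKafka.consumeFromAnEmbeddedTopic(consumer, "balances-topic");

        // Every published balance should belong to our customer.
        ConsumerRecords<String, String> balances = KafkaTestUtils.getRecords(consumer, 30_000);
        assertFalse("Expected at least one balance record", balances.isEmpty());
        for (ConsumerRecord<String, String> balance : balances) {
            assertEquals("john", balance.key());
        }
    }
}
```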
Our test is done, so we can verify that everything is fine!
As we said before, integration tests are much slower, and we can see this issue in the previous test: it takes 16 seconds on my machine, a huge amount of time.
5. Conclusion
If we want to develop quality Kafka Streams applications, we need to test the topologies, and for that goal we can follow two approaches: the Kafka Streams test utilities and/or spring-kafka-test.
In my humble opinion, we should apply both strategies in order to cover as many cases as possible, always maintaining a balance between the two testing approaches.
In this GitHub repo, the tests for scenario 3 are available.
The full source code for this article is available over on GitHub.