Sample service exposing an API that can be used to validate a .csv file containing trading data and enrich it with
additional content if necessary. By design the service needs to be able to handle very large sets of trades (millions)
and a large set of products (10k to 100k).
The list of currently supported products:
| productId | productName |
|---|---|
| 1 | Treasury Bills Domestic |
| 2 | Corporate Bonds Domestic |
| 3 | REPO Domestic |
| 4 | Interest rate swaps International |
| 5 | OTC Index Option |
| 6 | Currency Options |
| 7 | Reverse Repos International |
| 8 | REPO International |
| 9 | 766A_CORP BD |
| 10 | 766B_CORP BD |
To build and run the application you need the following:
There are several ways to run a Spring Boot application on your local machine. One way is to execute the main method
in the com.interview.AbcBankTradeProcessingApplication class from your IDE.
Alternatively you can use the Spring Boot Maven plugin like so:
mvn spring-boot:run
Similarly, the test suite can be run either from your IDE or from a terminal by executing:
mvn spring-boot:test-run
The service expects a multipart/form-data form with a .csv file containing a header row and comma-separated
values in the following format:
| date | productId | currency | price |
|---|---|---|---|
| 20250101 | 1 | EUR | 10.0 |
| 20250101 | 2 | EUR | 20.1 |
| 20250101 | 3 | EUR | 30.34 |
An example .csv file is located in /src/test/resources.
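As an illustration of the row format above, here is a minimal sketch of a per-row check. The class name `TradeRowValidator` and the rules themselves are assumptions made for this example and not necessarily what the service enforces: exactly four fields, a real calendar date in yyyyMMdd form, a numeric productId, an ISO 4217 currency code, and a decimal price.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Currency;

/** Hypothetical helper, not taken from the service: checks one data row of the trade file. */
final class TradeRowValidator {

    private TradeRowValidator() {}

    /** Returns true when the line has the shape date,productId,currency,price (assumed rules). */
    static boolean isValid(String line) {
        // limit -1 keeps trailing empty fields so "a,b,c," still counts as four fields
        String[] fields = line.split(",", -1);
        if (fields.length != 4) {
            return false;
        }
        try {
            // BASIC_ISO_DATE parses yyyyMMdd strictly, so 20250230 is rejected
            LocalDate.parse(fields[0].trim(), DateTimeFormatter.BASIC_ISO_DATE);
            Long.parseLong(fields[1].trim());          // productId must be an integer
            Currency.getInstance(fields[2].trim());    // must be a known ISO 4217 code
            new BigDecimal(fields[3].trim());          // price must be a decimal number
            return true;
        } catch (DateTimeParseException | IllegalArgumentException e) {
            // NumberFormatException is a subclass of IllegalArgumentException
            return false;
        }
    }
}
```

Calling `TradeRowValidator.isValid(line)` on each data row (after skipping the header) would let invalid rows be rejected or logged before enrichment; which of the two the service should do is left open here.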
With the service running locally, and after changing the local directory to this code repository, you can run an example flow like so:
curl --form file="@src/test/resources/trade.csv" --header 'Content-Type: multipart/form-data' http://localhost:8080/api/v1/enrich
- I believe that because parsing a file is inherently sequential, simply adding more threads won't give an easy performance boost (the threads would need to synchronize). Instead, we could easily end up with a much more complex solution.
- Performance is achieved by doing all the work in memory, which means the file has to be streamed line by line to avoid OutOfMemoryErrors when working with large files.
- Off-loading the work to a database engine (for instance Postgres) could bring some benefits if data operations become more complex, but it adds the cost of communicating with the database. How this affects overall performance could be assessed with a simple POC.
- An external cache like Redis is another possibility, especially if RAM usage becomes a burden.
- With this approach we don't track exactly which files arrived and which were returned to the client. I think a safer approach would be to have files uploaded to a directory, move incoming files to preserve them, and then generate a file that can be downloaded by normal means not involving our API.
- Logs below are from a manual performance test with 10 million trades and 100K products on an AMD Ryzen 5 7600X (4.7 GHz, 6 cores).
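The line-by-line streaming mentioned in the notes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the service's actual code: the class `StreamingEnricher`, the output header, the idea that enrichment replaces productId with productName, the `"Missing Product Name"` placeholder, and the choice to skip malformed rows are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Map;

/** Hypothetical sketch of streaming enrichment; names and output format are assumptions. */
final class StreamingEnricher {

    private StreamingEnricher() {}

    /**
     * Copies rows from {@code in} to {@code out}, replacing the productId column with the
     * product name from an in-memory map. Only the current line is held in memory, so the
     * input size is not bounded by the heap. Returns the number of data rows written.
     */
    static long enrich(Reader in, Writer out, Map<String, String> productNames) {
        try {
            BufferedReader reader = new BufferedReader(in);
            BufferedWriter writer = new BufferedWriter(out);
            if (reader.readLine() == null) {   // consume the input header; empty input yields no output
                return 0;
            }
            writer.write("date,productName,currency,price\n");   // assumed output header
            long rows = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(",", -1);
                if (fields.length != 4) {
                    continue;   // assumption: malformed rows are skipped
                }
                // assumption: unknown product ids get a placeholder name
                String name = productNames.getOrDefault(fields[1].trim(), "Missing Product Name");
                writer.write(fields[0] + "," + name + "," + fields[2] + "," + fields[3] + "\n");
                rows++;
            }
            writer.flush();   // push buffered output to the caller's writer
            return rows;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a web endpoint, `in` would wrap the uploaded file's input stream and `out` the response output stream, so neither the input nor the output file is ever fully materialized. The product map is the only structure that grows with data size, which is why an external cache only becomes interesting when the product set no longer fits comfortably in RAM.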
TODO:
- Swagger documentation
- Performance testing with a tool like Gatling
- Maybe changing GC to non-generational could improve performance? We could also perform GC manually after each batch.
curl --form file="@src/test/resources/trade_full.csv" --header 'Content-Type: multipart/form-data' http://localhost:8080/api/v1/enrich --output output.csv -v
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying [::1]:8080...
* Connected to localhost (::1) port 8080
* using HTTP/1.x
> POST /api/v1/enrich HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.13.0
> Accept: */*
> Content-Length: 277781740
> Content-Type: multipart/form-data; boundary=------------------------TFXvmR2A0lG5EvKeqOrpmd
> Expect: 100-continue
>
< HTTP/1.1 100
<
} [65536 bytes data]
46 264M 0 0 46 122M 0 243M 0:00:01 --:--:-- 0:00:01 244M* upload completely sent off: 277781740 bytes
< HTTP/1.1 200
< Transfer-Encoding: chunked
< Date: Fri, 25 Jul 2025 10:04:28 GMT
<
{ [8110 bytes data]
100 644M 0 379M 100 264M 8527k 5955k 0:00:45 0:00:45 --:--:-- 8674k
For further reference, please consider the following sections:
- Official Apache Maven documentation
- Spring Boot Maven Plugin Reference Guide
- Create an OCI image
- Spring Web
The following guides illustrate how to use some features concretely: