| 1 | +--- |
| 2 | +author: piotr |
| 3 | +tags: |
| 4 | + - spring |
| 5 | + - ai |
| 6 | + - vector |
| 7 | + - vectorstore |
| 8 | +date: 2025-04-22T10:00:00.000Z |
| 9 | +meaningfullyUpdatedAt: 2025-04-22T10:00:00.000Z |
| 10 | +slug: leveraging-spring-ai-vectorstore-for-semantic-search |
| 11 | +title: Leveraging Spring AI's VectorStore for Enhanced Semantic Search |
| 12 | +layout: post |
| 13 | +image: /images/spring-ai-vectorstore.jpg |
| 14 | +hidden: false |
| 15 | +comments: false |
| 16 | +published: true |
| 17 | +language: en |
| 18 | +--- |
| 19 | + |
| 20 | +**In this follow-up article, we explore how Spring AI's VectorStore abstraction simplifies vector storage and retrieval, |
| 21 | +eliminating the need for custom repository methods and direct vector manipulation. Learn how this powerful abstraction |
| 22 | +enhances our Support Assistant application with improved document management and attachment handling.** |
| 23 | + |
| 24 | +## Introduction to VectorStore |
| 25 | + |
| 26 | +In our [previous article](/blog/gentle-intro-to-spring-ai-embedding-model-abstraction), we explored Spring AI's |
| 27 | +Embedding Model abstraction for semantic search using pgvector. While that approach works well, it requires custom |
| 28 | +repository methods and direct manipulation of vector embeddings in our application code. |
| 29 | + |
| 30 | +Spring AI provides a higher-level abstraction called `VectorStore` that simplifies vector storage and retrieval, |
| 31 | +offering several advantages over direct vector manipulation: |
| 32 | + |
| 33 | +1. **Abstraction from database details** - No need to write custom SQL queries or understand vector database internals |
| 34 | +2. **Unified API across different vector databases** - The same code works with PostgreSQL, Redis, Pinecone, and other |
| 35 | + vector databases |
| 36 | +3. **Document-based approach** - Work with rich document objects instead of raw vectors |
| 37 | +4. **Built-in metadata filtering** - Filter search results based on document metadata |
| 38 | +5. **Simplified management** - Add, update, and delete vectors with simple API calls |
| 39 | + |
| 40 | +Let's see how we've enhanced our Support Assistant application using VectorStore. |
| 41 | + |
| 42 | +## From Direct Vectors to VectorStore |
| 43 | + |
| 44 | +### Before: Direct Vector Manipulation |
| 45 | + |
| 46 | +Previously, our application stored embeddings directly in the database entities. This approach required: |
| 47 | + |
| 48 | +1. Adding a vector column to our database tables |
| 49 | +2. Creating custom repository methods with native SQL queries to perform vector similarity searches |
| 50 | +3. Manually generating and storing embeddings for each entity |
| 51 | +4. Writing complex SQL queries with vector operators |
| 52 | + |
| 53 | +This direct approach worked but had several drawbacks: |
| 54 | + |
| 55 | +1. It tightly coupled our application to a specific vector database implementation |
| 56 | +2. It required writing and maintaining custom SQL queries |
| 57 | +3. It mixed vector operations with our business logic |
| 58 | +4. It made it difficult to add new features or change the vector database |
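To make these drawbacks concrete, here is a minimal sketch of the kind of hand-written query the direct approach depends on. The table and column names (`support_ticket`, `embedding`, `customer_message`) and the helper `toPgVectorLiteral` are hypothetical and not taken from the previous article; the `<=>` operator is pgvector's cosine distance.

```kotlin
// Hypothetical names: support_ticket, embedding, customer_message, and
// toPgVectorLiteral are illustrative assumptions, not the application's schema or code.

// pgvector accepts vectors as text literals such as "[0.1,0.2,0.3]".
fun toPgVectorLiteral(embedding: FloatArray): String =
    embedding.joinToString(separator = ",", prefix = "[", postfix = "]")

// `<=>` is pgvector's cosine distance operator; smaller means more similar.
val findSimilarTicketsSql = """
    SELECT id, title, customer_message
    FROM support_ticket
    ORDER BY embedding <=> CAST(? AS vector)
    LIMIT ?
""".trimIndent()

fun main() {
    println(toPgVectorLiteral(floatArrayOf(0.5f, 1.0f, -2.0f)))
    println(findSimilarTicketsSql)
}
```

Every such query hard-codes the vector column, the distance operator, and the SQL dialect, which is the coupling the list above describes.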
| 59 | + |
| 60 | +### After: Using VectorStore |
| 61 | + |
| 62 | +With VectorStore, we no longer need to store embeddings in our entities or write custom repository methods. |
| 63 | +Instead, we can use the default autoconfigured VectorStore backed by pgvector. |
| 64 | +If the defaults do not fit, we can tune the configuration |
| 65 | +by providing our own VectorStore bean: |
| 66 | + |
| 67 | +```kotlin |
| 68 | +@Configuration |
| 69 | +class VectorStoreConfig { |
| 70 | +    @Bean |
| 71 | +    fun pgVectorStore(jdbcTemplate: JdbcTemplate, embeddingModel: EmbeddingModel): PgVectorStore { |
| 72 | +        return PgVectorStore.builder(jdbcTemplate, embeddingModel) |
| 73 | +            .vectorTableName("support_vector_store") |
| 74 | +            .distanceType(PgVectorStore.PgDistanceType.COSINE_DISTANCE) |
| 75 | +            .indexType(PgVectorStore.PgIndexType.HNSW) |
| 76 | +            .initializeSchema(true) |
| 77 | +            .build() |
| 78 | +    } |
| 79 | +} |
| 80 | +``` |
| 81 | + |
| 82 | +And our service now uses the VectorStore for similarity search: |
| 83 | + |
| 84 | +```kotlin |
| 85 | +fun suggestResponse(customerMessage: String, limit: Int = 5): String { |
| 86 | +    // Search for similar documents in the vector store |
| 87 | +    val searchRequest = SearchRequest.builder() |
| 88 | +        .query(customerMessage) |
| 89 | +        .topK(limit) |
| 90 | +        .similarityThreshold(0.5) |
| 91 | +        .build() |
| 92 | + |
| 93 | +    val similarDocuments = vectorStore.similaritySearch(searchRequest) |
| 94 | + |
| 95 | +    // ... |
| 96 | +} |
| 97 | +``` |
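Conceptually, `topK` and `similarityThreshold` work as sketched below: score every stored vector against the query, drop results under the threshold, and keep the best `topK`. This is a plain-Kotlin illustration with hypothetical names (`Scored`, `cosineSimilarity`, `search`) and a brute-force scan. It is not how PgVectorStore is implemented, since the real store delegates this work to pgvector.

```kotlin
import kotlin.math.sqrt

// Hypothetical stand-in for a search hit; not a Spring AI type.
data class Scored(val id: String, val score: Double)

fun cosineSimilarity(a: DoubleArray, b: DoubleArray): Double {
    require(a.size == b.size) { "Vectors must have the same dimension" }
    val dot = a.indices.sumOf { a[it] * b[it] }
    val normA = sqrt(a.sumOf { it * it })
    val normB = sqrt(b.sumOf { it * it })
    return if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
}

// Brute-force search: score everything, filter by threshold, keep the topK best.
fun search(
    query: DoubleArray,
    stored: Map<String, DoubleArray>,
    topK: Int,
    similarityThreshold: Double
): List<Scored> =
    stored.map { (id, vector) -> Scored(id, cosineSimilarity(query, vector)) }
        .filter { it.score >= similarityThreshold }
        .sortedByDescending { it.score }
        .take(topK)

fun main() {
    val stored = mapOf(
        "refund" to doubleArrayOf(1.0, 0.0),
        "billing" to doubleArrayOf(1.0, 1.0),
        "login" to doubleArrayOf(0.0, 1.0)
    )
    // "login" is orthogonal to the query (score 0.0) and falls under the threshold.
    println(search(doubleArrayOf(1.0, 0.0), stored, topK = 5, similarityThreshold = 0.5).map { it.id })
}
```

Raising the threshold trades recall for precision, while `topK` caps how much context is later handed to the language model.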
| 98 | + |
| 99 | +## The Document Abstraction |
| 100 | + |
| 101 | +At the heart of VectorStore is the `Document` class, which represents a piece of content with associated metadata. This |
| 102 | +abstraction allows us to work with rich, structured data rather than just raw text and vectors. |
| 103 | + |
| 104 | +When creating a ticket, we now convert it to a Document: |
| 105 | + |
| 106 | +```kotlin |
| 107 | +val ticketDocument = Document.builder() |
| 108 | +    .id(UUID.randomUUID().toString()) |
| 109 | +    .text("${savedTicket.title} ${savedTicket.customerMessage} ${savedTicket.agentResponse}") |
| 110 | +    .metadata( |
| 111 | +        mapOf( |
| 112 | +            "type" to "ticket", |
| 113 | +            "ticketId" to (savedTicket.id ?: 0), |
| 114 | +            "title" to savedTicket.title, |
| 115 | +            "customerMessage" to savedTicket.customerMessage, |
| 116 | +            "agentResponse" to savedTicket.agentResponse, |
| 117 | +            "category" to savedTicket.category, |
| 118 | +            "status" to savedTicket.status.name |
| 119 | +        ) |
| 120 | +    ) |
| 121 | +    .build() |
| 122 | + |
| 123 | +// Add the ticket document to the vector store |
| 124 | +vectorStore.add(listOf(ticketDocument)) |
| 125 | +``` |
| 126 | + |
| 127 | +Note that only the document's text is used to compute the embedding; the metadata is stored alongside it. |
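Because the embedding is computed from the text alone, it pays to be deliberate about what goes into it. The ticket snippet above interpolates `agentResponse` directly; if that field is nullable (an assumption, since the entity is not shown here), the literal string `null` would end up in the embedded text. Below is a minimal sketch of one way to avoid that, using a hypothetical `TicketFields` stand-in rather than the real entity:

```kotlin
// Hypothetical stand-in for the ticket entity; field nullability is assumed.
data class TicketFields(
    val title: String,
    val customerMessage: String,
    val agentResponse: String?
)

// Join only the parts that are present and non-blank, one per line.
fun embeddingText(ticket: TicketFields): String =
    listOfNotNull(ticket.title, ticket.customerMessage, ticket.agentResponse)
        .filter { it.isNotBlank() }
        .joinToString("\n")

fun main() {
    val open = TicketFields("Cannot log in", "Password reset email never arrives", null)
    println(embeddingText(open))
}
```

The result of `embeddingText` could then be passed to `Document.builder().text(...)` in place of the interpolated string.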
| 128 | + |
| 129 | +## Enhanced Functionality: Ticket Attachments |
| 130 | + |
| 131 | +One of the major benefits of using VectorStore is how easily we can extend our application with new features. We've |
| 132 | +added support for ticket attachments, which are now included in semantic search. |
| 133 | + |
| 134 | +### Converting Attachments to Documents |
| 135 | + |
| 136 | +When a ticket has attachments, we convert each attachment to a Document: |
| 137 | + |
| 138 | +```kotlin |
| 139 | +fun attachmentToDocument(attachment: TicketAttachment): Document { |
| 140 | +    val documentBuilder = Document.builder() |
| 141 | +        .id(UUID.randomUUID().toString()) |
| 142 | +        .metadata( |
| 143 | +            mapOf( |
| 144 | +                "type" to "attachment", |
| 145 | +                "ticketId" to (attachment.ticket.id!!), |
| 146 | +                "fileName" to attachment.fileName, |
| 147 | +                "contentType" to attachment.contentType |
| 148 | +            ) |
| 149 | +        ) |
| 150 | +    return when { |
| 151 | +        attachment.content != null -> { |
| 152 | +            // Text attachment: embed its content directly |
| 153 | +            documentBuilder |
| 154 | +                .text(attachment.content) |
| 155 | +                .build() |
| 156 | +        } |
| 157 | + |
| 158 | +        attachment.contentType == "application/pdf" && attachment.binaryContent != null -> { |
| 159 | +            // PDF attachment - in a real application, you would use a PDF parser here |
| 160 | +            // For simplicity, we're just creating a document with placeholder text |
| 161 | +            documentBuilder |
| 162 | +                .text("PDF attachment: ${attachment.fileName}") |
| 163 | +                .build() |
| 164 | +        } |
| 165 | + |
| 166 | +        else -> { |
| 167 | +            // Other binary attachment - in a real application, you might use different parsers |
| 168 | +            // For simplicity, we're just creating a document with placeholder text |
| 169 | +            documentBuilder |
| 170 | +                .text("Binary attachment: ${attachment.fileName}") |
| 171 | +                .build() |
| 172 | +        } |
| 173 | +    } |
| 174 | +} |
| 175 | +``` |
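The `when` branches above use placeholder text for PDFs and other binary files. One possible way to plug in real parsers later is a small registry keyed by content type, sketched below in plain Kotlin. `TextExtractor`, `extractors`, and `extractText` are hypothetical names, and the only extractor shown simply decodes UTF-8 bytes; a real PDF extractor would wrap a PDF library.

```kotlin
// Hypothetical abstraction: turns raw attachment bytes into searchable text.
fun interface TextExtractor {
    fun extract(bytes: ByteArray): String
}

// Registry keyed by content type; PDF or Office parsers could be registered here.
val extractors: Map<String, TextExtractor> = mapOf(
    "text/plain" to TextExtractor { bytes -> bytes.toString(Charsets.UTF_8) }
)

// Fall back to a filename placeholder when no extractor is registered.
fun extractText(fileName: String, contentType: String, bytes: ByteArray): String =
    extractors[contentType]?.extract(bytes) ?: "Binary attachment: $fileName"

fun main() {
    println(extractText("notes.txt", "text/plain", "Reset link expired".toByteArray()))
    println(extractText("scan.png", "image/png", byteArrayOf(1, 2, 3)))
}
```

With a registry like this, supporting a new file type means adding one map entry while `attachmentToDocument` stays unchanged.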
| 176 | + |
| 177 | +### Including Attachments in Search Results |
| 178 | + |
| 179 | +When generating responses, we now include content from both tickets and their attachments: |
| 180 | + |
| 181 | +```kotlin |
| 182 | +val context = similarDocuments.joinToString("\n\n") { document -> |
| 183 | +    val metadata = document.metadata |
| 184 | +    if (metadata["type"] == "ticket") { |
| 185 | +        """ |
| 186 | +        Customer: ${metadata["customerMessage"]} |
| 187 | +        Agent: ${metadata["agentResponse"]} |
| 188 | +        """ |
| 189 | +    } else { |
| 190 | +        """ |
| 191 | +        Attachment: ${metadata["fileName"]} |
| 192 | +        Content: ${document.text ?: "No content available"} |
| 193 | +        """ |
| 194 | +    } |
| 195 | +} |
| 196 | +``` |
| 197 | + |
| 198 | +## Benefits of Using VectorStore |
| 199 | + |
| 200 | +Let's summarize the key benefits we've gained by switching to VectorStore: |
| 201 | + |
| 202 | +### 1. Simplified Code |
| 203 | + |
| 204 | +Our code is now more focused on business logic rather than vector operations. We no longer need custom SQL queries or |
| 205 | +direct vector manipulation. |
| 206 | + |
| 207 | +### 2. Enhanced Maintainability |
| 208 | + |
| 209 | +The VectorStore abstraction isolates our application from the details of the vector database. If we decide to switch |
| 210 | +from PostgreSQL to another vector database like Pinecone or Redis, we only need to change the VectorStore configuration, |
| 211 | +not our application code. |
| 212 | + |
| 213 | +### 3. Improved Extensibility |
| 214 | + |
| 215 | +Adding new features like ticket attachments is straightforward. We simply convert the new content to Documents and add |
| 216 | +them to the VectorStore. |
| 217 | + |
| 218 | +### 4. Better Metadata Management |
| 219 | + |
| 220 | +The Document abstraction allows us to associate rich metadata with our content, which we can use for filtering and |
| 221 | +organizing search results. |
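As a sketch of such filtering, `SearchRequest` accepts a filter expression over metadata keys. The example below restricts the search to documents whose `type` metadata equals `ticket`, matching the keys used when the documents were built. The function name `ticketOnlyRequest` is hypothetical, and the snippet assumes a Spring AI version that provides the same `SearchRequest.builder()` API used elsewhere in this article.

```kotlin
import org.springframework.ai.vectorstore.SearchRequest

// Hypothetical helper: builds a request that only matches ticket documents,
// so attachment documents are excluded from the results.
fun ticketOnlyRequest(customerMessage: String, limit: Int = 5): SearchRequest =
    SearchRequest.builder()
        .query(customerMessage)
        .topK(limit)
        .similarityThreshold(0.5)
        // Text filter expression evaluated against document metadata.
        .filterExpression("type == 'ticket'")
        .build()
```

The resulting request can be passed to `vectorStore.similaritySearch(...)` in the same way as the request built in `suggestResponse`.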
| 222 | + |
| 223 | +### 5. Optimized Performance |
| 224 | + |
| 225 | +VectorStore implementations rely on the database's native vector indexes (HNSW in our configuration), so similarity search uses an index without any hand-written SQL. |
| 226 | + |
| 227 | +## Conclusion |
| 228 | + |
| 229 | +Spring AI's VectorStore abstraction provides a powerful, flexible way to implement semantic search in your applications. |
| 230 | +By abstracting away the details of vector storage and retrieval, it allows you to focus on your application's business |
| 231 | +logic while still leveraging the power of vector embeddings. |
| 232 | + |
| 233 | +In our Support Assistant application, switching to VectorStore has simplified our code, improved maintainability, and |
| 234 | +enabled new features like ticket attachments. If you're building applications with semantic search capabilities, |
| 235 | +VectorStore is definitely worth considering. |
| 236 | + |
| 237 | +The full source code for this enhanced version is available |
| 238 | +on [GitHub](https://github.com/miensol/spring-ai-gentle-intro). |