@hackerdave hackerdave commented Apr 11, 2025

Description

This PR integrates Oracle Document Loader and Oracle Vector Search with LangChain.js, making Oracle's advanced vector search capabilities available within the LangChain.js framework.

Key Features

  • Adds support for Oracle Document Loader in LangChain.js.
    • loading documents either from the file system or a table
    • generating embeddings either via an ONNX model loaded into the database or via a third-party REST call
    • summarizing documents either via Oracle Text or via a third-party REST call
  • Adds support for Oracle Vector Search in LangChain.js.
  • Includes example usage and tests for the integration.

Fixes

Testing

Verified integration with unit tests.
Added relevant test cases to ensure proper functionality.

Request

Please let @skmishraoracle and @rohanaggarwal7997 know if there are any issues or areas that need improvement.

@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Apr 11, 2025
vercel bot commented Apr 11, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchainjs-docs ❌ Failed (Inspect) Jun 8, 2025 1:33am
1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchainjs-api-refs ⬜️ Ignored (Inspect) Jun 8, 2025 1:33am

@vercel vercel bot temporarily deployed to Preview – langchainjs-docs April 11, 2025 00:47 Inactive
@vercel vercel bot temporarily deployed to Preview – langchainjs-docs April 11, 2025 01:06 Inactive
@vercel vercel bot temporarily deployed to Preview – langchainjs-docs May 17, 2025 15:58 Inactive
@rohanaggarwal7997

@cjbj, could you please let us know if you have any more comments? We would be really grateful if we could get this merged. Several of our customers with a large JS presence would really like to use it, as LangChain is the go-to platform for everyone nowadays and Oracle DB is where all of their mission-critical data lives.

@vercel vercel bot temporarily deployed to Preview – langchainjs-docs May 21, 2025 22:54 Inactive
@vercel vercel bot temporarily deployed to Preview – langchainjs-docs May 22, 2025 01:59 Inactive
@vercel vercel bot temporarily deployed to Preview – langchainjs-docs May 22, 2025 02:35 Inactive
}
const result = await this.conn.execute(
<string>(
`select dbms_vector_chain.utl_to_text(t.${this.pref.colname}, :pref) text, dbms_vector_chain.utl_to_text(t.${this.pref.colname}, json('{"plaintext": "false"}')) metadata from ${this.pref.owner}.${this.pref.tablename} t`

Have the substituted values been filtered to avoid any SQL Injection issues?

Author

Have added a check


  • You need to bind those values!
  • Would it be nice to throw a custom error message if the check fails?
  const cn = 'empno';
  const ow = 'scott';
  const tn = 'emp';
  const qn = `${ow}.${tn}`;
  const sql = `select sys.dbms_assert.simple_sql_name(:sn), sys.dbms_assert.qualified_sql_name(:qn) from dual`;
  const binds = [cn, qn];
  await connection.execute(sql, binds);

Author

@hackerdave hackerdave May 27, 2025

Changed to bind values and added a custom error message, as suggested.
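For readers following this thread, here is a minimal sketch of the kind of check being discussed. It is not the code from this PR. The names `assertSimpleSqlName`, `InvalidIdentifierError`, `TablePref`, and `buildLoaderQuery` are hypothetical. It assumes a client-side pre-check that accepts only ASCII nonquoted identifiers, which is stricter than `dbms_assert.simple_sql_name` (the server-side function also accepts quoted names and non-ASCII letters). Table and column names cannot be passed as bind variables in the final query, so they are validated before interpolation, while data values such as `:pref` stay as binds.

```typescript
// Hypothetical helpers for illustration only; not taken from the PR.
// Assumption: only ASCII nonquoted identifiers are allowed: a letter first,
// then letters, digits, _, $ or #, up to 128 characters in total.
const SIMPLE_SQL_NAME = /^[A-Za-z][A-Za-z0-9_$#]{0,127}$/;

class InvalidIdentifierError extends Error {
  constructor(kind: string, value: string) {
    // Custom message so callers can tell which preference was rejected.
    super(`Invalid ${kind} name: ${JSON.stringify(value)}`);
    this.name = "InvalidIdentifierError";
  }
}

// Returns the name unchanged if it is a safe identifier, otherwise throws.
function assertSimpleSqlName(kind: string, value: string): string {
  if (!SIMPLE_SQL_NAME.test(value)) {
    throw new InvalidIdentifierError(kind, value);
  }
  return value;
}

// Shape modeled on the this.pref fields visible in the reviewed snippet.
interface TablePref {
  owner: string;
  tablename: string;
  colname: string;
}

// Builds a simplified version of the loader query from validated identifiers.
// The :pref placeholder is left for the driver to bind at execute time.
function buildLoaderQuery(pref: TablePref): string {
  const owner = assertSimpleSqlName("owner", pref.owner);
  const table = assertSimpleSqlName("table", pref.tablename);
  const column = assertSimpleSqlName("column", pref.colname);
  return `select dbms_vector_chain.utl_to_text(t.${column}, :pref) text from ${owner}.${table} t`;
}
```

A check like this could run before the `dbms_assert` round trip suggested above, so that obviously malformed names fail fast with a readable error; the server-side check remains the authoritative one.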

const embeddings = [];

if (oracledb.thin) {
// thin mode, can't use batching

Is something missing in node-oracledb Thin mode?

Author

The batching version of utl_to_embedding takes a collection called vector_array_t, which is an array of CLOBs, and I think Thin mode didn't support this. Hence, in Thin mode we embed one input at a time.


Noted. We (node-oracledb) will revisit this.
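
One way to picture the fallback described in this thread is as a choice of batch size: one input per call in Thin mode, several per call otherwise. The sketch below is an illustration under that assumption, not the PR's implementation. `planEmbeddingBatches` and the default `batchSize` of 32 are hypothetical, and the `thin` flag stands in for the driver's `oracledb.thin` property used in the reviewed snippet.

```typescript
// Hypothetical helper for illustration only; not taken from the PR.
// Splits inputs into the groups that would each be sent in one embedding call.
// thin: true when the driver runs in Thin mode (e.g. the value of oracledb.thin).
function planEmbeddingBatches<T>(inputs: T[], thin: boolean, batchSize = 32): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  // In Thin mode every input becomes its own single-element batch.
  const size = thin ? 1 : batchSize;
  const batches: T[][] = [];
  for (let i = 0; i < inputs.length; i += size) {
    batches.push(inputs.slice(i, i + size));
  }
  return batches;
}
```

Keeping the decision in one place like this would make it easy to drop the Thin mode special case later if the driver gains support for the collection type.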

@vercel vercel bot temporarily deployed to Preview – langchainjs-docs May 22, 2025 17:47 Inactive
@vercel vercel bot temporarily deployed to Preview – langchainjs-docs May 27, 2025 23:35 Inactive
@hackerdave hackerdave changed the title from "Oraclevs integration" to "feat: Add Oracle Document Loader and Vector Store integration" Jun 3, 2025
@vercel vercel bot temporarily deployed to Preview – langchainjs-docs June 3, 2025 18:08 Inactive

@cjbj cjbj left a comment

I have no other comments

@vercel vercel bot temporarily deployed to Preview – langchainjs-docs June 8, 2025 01:33 Inactive
@hackerdave
Author

Replaced by PR #8659

@hackerdave hackerdave closed this Sep 9, 2025
