1 change: 1 addition & 0 deletions docs/authoring.md
@@ -101,6 +101,7 @@ Rule IDs have the shape `RULE-<PREFIX>-<NN>`. Prefixes are **not pre-allocated**
|--------|-------|---------------|
| JV | Java SDK / JDBC / Hibernate / Spring Data anti-patterns | skills/ydb-table/rules/embed/java.md |
| GO | Go SDK (`ydb-go-sdk/v3`) — driver, sessions, query/table services, retry, transactions | skills/ydb-table/rules/embed/go.md |
| CPP | C++ SDK (`ydb-cpp-sdk`) — query/table clients, retry, transactions, parameterization | skills/ydb-table/rules/embed/cpp.md |

### Severity labels

8 changes: 8 additions & 0 deletions promptfooconfig.yaml
@@ -86,6 +86,12 @@ _messages: &messages

--- ydb-table / rules / embed / go.md ---
{{ ydb_table_rules_go }}

--- ydb-table / references / embed / cpp.md ---
{{ ydb_table_refs_cpp }}

--- ydb-table / rules / embed / cpp.md ---
{{ ydb_table_rules_cpp }}
- role: user
content: "{{ user_prompt }}"

@@ -223,6 +229,8 @@ defaultTest:
ydb_table_rules_java: file://skills/ydb-table/rules/embed/java.md
ydb_table_refs_go: file://skills/ydb-table/references/embed/go.md
ydb_table_rules_go: file://skills/ydb-table/rules/embed/go.md
ydb_table_refs_cpp: file://skills/ydb-table/references/embed/cpp.md
ydb_table_rules_cpp: file://skills/ydb-table/rules/embed/cpp.md
# Hard pre-filter: any output that came back as the defensive
# "[provider error] ..." stub is a fail regardless of what the grader says.
# Without this the grader will happily pass a 400/401 error as a "valid
1 change: 1 addition & 0 deletions skills/ydb-core/SKILL.md
@@ -55,6 +55,7 @@ SDKs, all official under https://github.com/ydb-platform/:
| Python | ydb-python-sdk | PyPI `ydb` | ✅ | ✅ | ✅ |
| Java | ydb-java-sdk | Maven `tech.ydb:ydb-sdk-bom` + `ydb-sdk-query` / `ydb-sdk-topic` / `ydb-sdk-coordination` | ✅ | ✅ | ✅ |
| JS/TS | ydb-js-sdk | npm `@ydbjs/core`, `@ydbjs/query`, `@ydbjs/topic`, `@ydbjs/coordination` | ✅ | ✅ | ✅ |
| C++ | ydb-cpp-sdk | CMake `find_package(ydb-cpp-sdk)` / Debian `libydb-cpp-dev`; link `YDB-CPP-SDK::Driver` + `Query` / `Table` / `Topic` / `Coordination` | ✅ | ✅ | ✅ |

Q = queries, T = topics, C = coordination.

8 changes: 5 additions & 3 deletions skills/ydb-table/SKILL.md
@@ -1,6 +1,6 @@
---
name: ydb-table
description: Writing and auditing code that runs YQL against YDB tables. Use when the user writes a query, designs a table or primary key, reads an `EXPLAIN`, or asks to review Java (ydb-java-sdk, ydb-jdbc-driver, Hibernate, Spring Data JPA) or Go (`ydb-go-sdk/v3`) application code that talks to YDB. Triggers on YQL keywords (`UPSERT`, `SELECT`, `DECLARE`, `AS_TABLE`, `VIEW <index>`, `CREATE TABLE`, `ALTER TABLE`, `EXPLAIN`), on the `BulkUpsert` SDK API, on JDBC / Hibernate / Spring symbols (`JpaRepository`, `findAllById`, `saveAll`, `deleteAllByIdInBatch`, `hibernate.jdbc.batch_size`, `@Version`, `@Retryable`, `SQLRecoverableException`, `SQLTransientException`), on `ydb-go-sdk/v3` symbols (`ydb.Open`, `db.Query().Do`, `db.Query().DoTx`, `db.Table().Do`, `query.WithIdempotent`, `query.WithCommit`, `query.WithStatsMode`, `query.Stats`, `query.StatsModeBasic`, `result.Close`, `ydb.WithLazyTx`, `ydb.ParamsBuilder`, `s.BeginTransaction`, `table.TxControl`, `table.BeginTx`, `BulkUpsertDataRows`, `balancers.PreferLocalDC`, `balancers.PreferNearestDC`), on flaky empty/zero query stats right after `Query` returns, on YDB transaction-mode names (`SerializableRW`, `SnapshotRO`), and on PostgreSQL / MySQL → YDB conversion prompts. For other SDKs (Python, C++, C#) this skill covers only the YQL / schema / transaction-mode side; SDK-specific guidance for those languages is not in this skill yet — say so and point at upstream docs.
description: Writing and auditing code that runs YQL against YDB tables. Use when the user writes a query, designs a table or primary key, reads an `EXPLAIN`, or asks to review Java (ydb-java-sdk, ydb-jdbc-driver, Hibernate, Spring Data JPA), Go (`ydb-go-sdk/v3`), or C++ (`ydb-cpp-sdk`) application code that talks to YDB. Triggers on YQL keywords (`UPSERT`, `SELECT`, `DECLARE`, `AS_TABLE`, `VIEW <index>`, `CREATE TABLE`, `ALTER TABLE`, `EXPLAIN`), on the `BulkUpsert` SDK API, on JDBC / Hibernate / Spring symbols (`JpaRepository`, `findAllById`, `saveAll`, `deleteAllByIdInBatch`, `hibernate.jdbc.batch_size`, `@Version`, `@Retryable`, `SQLRecoverableException`, `SQLTransientException`), on `ydb-go-sdk/v3` symbols (`ydb.Open`, `db.Query().Do`, `db.Query().DoTx`, `db.Table().Do`, `query.WithIdempotent`, `query.WithCommit`, `query.WithStatsMode`, `query.Stats`, `query.StatsModeBasic`, `result.Close`, `ydb.WithLazyTx`, `ydb.ParamsBuilder`, `s.BeginTransaction`, `table.TxControl`, `table.BeginTx`, `BulkUpsertDataRows`, `balancers.PreferLocalDC`, `balancers.PreferNearestDC`), on `ydb-cpp-sdk` symbols (`#include <ydb-cpp-sdk/client/`, `NYdb::TDriver`, `TDriverConfig`, `NYdb::NQuery::TQueryClient`, `NYdb::NTable::TTableClient`, `RetryQuerySync`, `RetryOperationSync`, `TRetryOperationSettings`, `TParamsBuilder`, `TTxControl`, `TValueBuilder`, `TResultSetParser`, `StreamExecuteQuery`, `BulkUpsert`, `ExecuteSchemeQuery`, `CreateFromEnvironment`, `GetValueSync`, `find_package(ydb-cpp-sdk`, `YDB-CPP-SDK::`), on flaky empty/zero query stats right after `Query` returns, on YDB transaction-mode names (`SerializableRW`, `SnapshotRO`), and on PostgreSQL / MySQL → YDB conversion prompts. For other SDKs (Python, C#) this skill covers only the YQL / schema / transaction-mode side; SDK-specific guidance for those languages is not in this skill yet — say so and point at upstream docs.
---

# YDB Table
@@ -11,7 +11,7 @@ Writing YQL against YDB tables, designing schemas to back those queries, and aud

1. **Classify the task.** Write a new query or schema, audit existing code, convert from another SQL dialect, or read an `EXPLAIN`.
2. **Load sources** per the table below.
3. **Do the work.** When auditing, cite the rule ID for any anti-pattern flagged — `RULE-JV-NN` for Java, `RULE-GO-NN` for Go. When the topic isn't covered by the loaded sources, say so and link to upstream YDB docs rather than guessing.
3. **Do the work.** When auditing, cite the rule ID for any anti-pattern flagged — `RULE-JV-NN` for Java, `RULE-GO-NN` for Go, `RULE-CPP-NN` for C++. When the topic isn't covered by the loaded sources, say so and link to upstream YDB docs rather than guessing.

## Load sources

@@ -22,12 +22,14 @@ Writing YQL against YDB tables, designing schemas to back those queries, and aud
| Auditing Java application code against YDB | `rules/embed/java.md` |
| Writing Go application code against YDB | `references/embed/go.md` |
| Auditing Go application code against YDB | `rules/embed/go.md` |
| Writing C++ application code against YDB | `references/embed/cpp.md` |
| Auditing C++ application code against YDB | `rules/embed/cpp.md` |
| Schema design — primary key shape, partitioning | `../ydb-core/SKILL.md#schema-basics` |
| YQL syntax, built-in functions, pragmas | <https://ydb.tech/docs/en/yql/reference/> — do not reproduce the spec from memory |

## Content rules

- Always parameterize: bind values through the SDK's typed parameter API (e.g. `ydb.ParamsBuilder()` in Go, `PreparedStatement` in JDBC), do not concatenate them into the query text. Plan-cache reuse depends on it; concatenated literals miss the cache. A leading `DECLARE` block in the query body is optional in modern YDB — scalar parameter types are inferred from the bound values — and earns its place on compound shapes (`List<Struct<...>>`) or as an explicit caller contract.
- Always parameterize: bind values through the SDK's typed parameter API (e.g. `ydb.ParamsBuilder()` in Go, `TParamsBuilder` in C++, `PreparedStatement` in JDBC), do not concatenate them into the query text. Plan-cache reuse depends on it; concatenated literals miss the cache. A leading `DECLARE` block in the query body is optional in modern YDB — scalar parameter types are inferred from the bound values — and earns its place on compound shapes (`List<Struct<...>>`) or as an explicit caller contract.
- Prefer the Query Service over the deprecated Table Service for new code.
- When converting from another SQL dialect, surface where YDB diverges — primary keys are partition keys, no `SERIAL` / `AUTO_INCREMENT`, JOIN behavior and built-in function names differ — rather than producing code that happens to parse.
- Don't fabricate YQL syntax, built-in names, or SDK symbols. If the loaded sources don't cover the question, link the relevant page under <https://ydb.tech/docs/en/yql/reference/> and state the uncertainty.
165 changes: 165 additions & 0 deletions skills/ydb-table/references/embed/cpp.md
@@ -0,0 +1,165 @@
# Embedding YDB in C++ applications

## Stack

The official C++ SDK is **`ydb-cpp-sdk`** (<https://github.com/ydb-platform/ydb-cpp-sdk>). Public API lives in namespace `NYdb::inline V3` under `#include <ydb-cpp-sdk/client/...>`.

Two application surfaces for table work:

- **`NYdb::NQuery::TQueryClient`** — Query Service. Preferred for new YQL-centric code. Supports `TTxControl::NoTx()` for DDL, `StreamExecuteQuery` for large reads, `ReadCommittedRW` transaction mode.
- **`NYdb::NTable::TTableClient`** — Table / KQP Service. Use for `PrepareDataQuery`, `BulkUpsert`, `StreamExecuteScanQuery`, schema builders (`TTableBuilder`, `CreateTable`). `NTable::TTxControl` has no `NoTx()` — DDL via `CreateTable` / `ExecuteSchemeQuery`.

Open one **`NYdb::TDriver`** per process; stack surface clients on top. APIs return `NThreading::TFuture<T>`; examples block with `.GetValueSync()` / `.ExtractValueSync()`.

Build: C++20, static libraries. CMake consumer pattern from upstream README:

```cmake
find_package(ydb-cpp-sdk REQUIRED COMPONENTS Driver Query Table)
target_link_libraries(myapp PRIVATE YDB-CPP-SDK::Driver YDB-CPP-SDK::Query)
```

Debian packages: `libydb-cpp-dev` (core), optional `libydb-cpp-iam-dev`. After install, pass `-DCMAKE_PREFIX_PATH=/usr/share/yandex`.

Primary documentation: <https://ydb.tech/docs/en/reference/ydb-sdk/>. Runnable demos for orientation: <https://github.com/ydb-platform/ydb-cpp-sdk/tree/main/examples> — illustrative, not normative.

## Connection

See [`../../../ydb-core/SKILL.md#connecting`](../../../ydb-core/SKILL.md#connecting) for connection-string shape and auth env vars.

```cpp
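#include <cstdlib>

#include <ydb-cpp-sdk/client/driver/driver.h>

// Read the token once; fall back to an empty string when it is unset.
const char* token = std::getenv("YDB_TOKEN");
auto cfg = NYdb::TDriverConfig()
    .SetEndpoint("grpc://localhost:2136")
    .SetDatabase("/local")
    .SetAuthToken(token ? token : "");
NYdb::TDriver driver(cfg);
// ... work ...
driver.Stop(true);  // true: wait for the driver to finish shutting down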
```

`TDriverConfig` also accepts a connection string: `grpc://host:port/?database=/path` or `grpcs://...`. Credentials factories: `CreateOAuthCredentialsProviderFactory`, `CreateInsecureCredentialsProviderFactory` — `include/ydb-cpp-sdk/client/types/credentials/credentials.h`.

For production code, prefer letting the SDK pick credentials from the standard `YDB_*` environment variables via `NYdb::CreateFromEnvironment(connectionString)` in `include/ydb-cpp-sdk/client/helpers/helpers.h` — it returns a ready `TDriverConfig` honouring `YDB_SERVICE_ACCOUNT_KEY_FILE_CREDENTIALS`, `YDB_ACCESS_TOKEN_CREDENTIALS`, `YDB_METADATA_CREDENTIALS`, `YDB_OAUTH2_KEY_FILE`, and `YDB_ANONYMOUS_CREDENTIALS`. The full env-var list is in [`../../../ydb-core/SKILL.md#connecting`](../../../ydb-core/SKILL.md#connecting).

## Query execution

Canonical Query Service pattern — `RetryQuerySync` wraps a lambda; the lambda is the retry unit:

```cpp
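#include <ydb-cpp-sdk/client/query/client.h>
#include <ydb-cpp-sdk/client/params/params.h>

using namespace NYdb;          // TStatus, TParamsBuilder
using namespace NYdb::NQuery;  // TSession, TTxControl, TTxSettings

// The retry unit: RetryQuerySync may invoke this function more than once.
static TStatus SelectUserById(TSession session) {
    auto params = TParamsBuilder()
        .AddParam("$id").Uint64(42).Build()
        .Build();
    return session.ExecuteQuery(
        "SELECT name FROM users WHERE id = $id",
        TTxControl::BeginTx(TTxSettings::SerializableRW()).CommitTx(),
        params
    ).GetValueSync();
}

// `client` is a TQueryClient constructed on top of the driver; ThrowOnError is
// assumed to be a status-checking helper that is already in scope.
ThrowOnError(client.RetryQuerySync(
    SelectUserById,
    NYdb::NRetry::TRetryOperationSettings().Idempotent(true)));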
```

Three load-bearing pieces:

- **`TRetryOperationSettings().Idempotent(true)`** on `RetryQuerySync` / `RetryOperationSync` declares replay-safe work. Required for reads and for writes keyed on a client-generated id. Omit on non-idempotent writes (counter increment, unkeyed `INSERT`).
- **`TParamsBuilder`** binds values — do not concatenate them into the query text. A leading `DECLARE` block is optional for scalars; use it for `List<Struct<...>>` and other compound shapes.
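- **Build results inside the lambda.** Assign to outer variables only on the success path, after the call has returned a successful `TStatus`. Mutations made to outer state mid-lambda are not rolled back when the SDK retries, so partial results from a failed attempt would leak into the next one.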

Source: YDB docs — retry recipe at <https://ydb.tech/docs/en/recipes/ydb-sdk/retry> and parameterized queries at <https://ydb.tech/docs/en/reference/ydb-sdk/parameterized_queries>.
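
The third point can be illustrated without the SDK. The sketch below is a minimal simulation, not SDK code: `EStatus`, `RetrySync` and `ReadNames` are hypothetical stand-ins for `TStatus`, `RetryQuerySync` and a query whose first attempt aborts after reading one row.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the SDK's status and retrier; not real SDK names.
enum class EStatus { Success, Aborted };

EStatus RetrySync(const std::function<EStatus(int attempt)>& op, int maxRetries) {
    assert(maxRetries >= 0);
    EStatus status = EStatus::Aborted;
    for (int attempt = 0; attempt <= maxRetries; ++attempt) {
        status = op(attempt);
        if (status == EStatus::Success) {
            break;
        }
    }
    return status;
}

// Simulated query: attempt 0 yields one row and then aborts;
// later attempts yield both rows and succeed.
EStatus ReadNames(int attempt, std::vector<std::string>& out) {
    out.push_back("alice");
    if (attempt == 0) {
        return EStatus::Aborted;
    }
    out.push_back("bob");
    return EStatus::Success;
}

// Anti-pattern: rows are appended to the outer vector mid-attempt,
// so the row read by the failed attempt stays in the final result.
std::vector<std::string> ReadNamesLeaky() {
    std::vector<std::string> names;
    RetrySync([&](int attempt) { return ReadNames(attempt, names); }, 3);
    return names;
}

// Preferred shape: build into a local and publish only on success.
std::vector<std::string> ReadNamesSafe() {
    std::vector<std::string> names;
    RetrySync([&](int attempt) {
        std::vector<std::string> local;
        const EStatus status = ReadNames(attempt, local);
        if (status == EStatus::Success) {
            names = std::move(local);
        }
        return status;
    }, 3);
    return names;
}
```

In this simulation `ReadNamesLeaky()` ends up with a duplicated `"alice"` from the aborted attempt, while `ReadNamesSafe()` returns exactly the two rows of the successful attempt.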

## Transactions

**Single-statement** — fuse begin and commit in `TTxControl`:

```cpp
TTxControl::BeginTx(TTxSettings::SerializableRW()).CommitTx()
```

**Multi-step client logic** — first query opens the tx (no `CommitTx`), second commits:

```cpp
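auto result = session.ExecuteQuery(query1, TTxControl::BeginTx(TTxSettings::SerializableRW()), params1)
    .GetValueSync();
if (!result.IsSuccess()) {
    return result;  // inside the retry lambda: let the retrier classify the failure
}
auto tx = *result.GetTransaction();  // the first query left the transaction open
auto result2 = session.ExecuteQuery(query2, TTxControl::Tx(tx).CommitTx(), params2).GetValueSync();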
```

Per the YDB transactions guide: "if the transaction body is fully formed before accessing the database, it will be processed more efficiently" — fuse with `CommitTx()` whenever client logic doesn't sit between statements.

For transaction modes and optimistic-locking consequences, see [`../working-with-data.md`](../working-with-data.md).

Source: YDB docs — <https://ydb.tech/docs/en/concepts/transactions>.

## Retries

YDB uses optimistic concurrency — application code that talks to YDB must run inside SDK retriers, not as bare one-shot RPCs.

`RetryQuerySync` / `RetryOperationSync` classify errors internally per the YDB status-code table:

- **Always retried**: `ABORTED`, `OVERLOADED`, `CLIENT_RESOURCE_EXHAUSTED`, `UNAVAILABLE`, `BAD_SESSION`, `SESSION_BUSY` (session reset).
- **Retried only when `.Idempotent(true)`**: `UNDETERMINED`, `TRANSPORT_UNAVAILABLE`.
- **Non-retryable**: schema / semantic failures (`SCHEME_ERROR`, `BAD_REQUEST`, `PRECONDITION_FAILED`) — propagated to caller.

Do not wrap SDK calls in an outer `for` loop or a hand-rolled `Sleep` backoff; the retrier already handles both. Tune it via `TRetryOperationSettings` (`MaxRetries`, `FastBackoffSettings`, `SlowBackoffSettings`).

Source: YDB docs — status-code retry table at <https://ydb.tech/docs/en/reference/ydb-sdk/ydb-status-codes>, retry recipe at <https://ydb.tech/docs/en/recipes/ydb-sdk/retry>, error-handling guidance at <https://ydb.tech/docs/en/reference/ydb-sdk/error_handling>.
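
As a reading aid, the classification above can be written down as a pure function. This is a sketch only: the SDK makes this decision internally, and `EStatusSketch` and `ShouldRetry` are hypothetical names that merely mirror the status names listed above (the real enum is `NYdb::EStatus`).

```cpp
// Hypothetical mirror of the status names listed above; not the SDK enum.
enum class EStatusSketch {
    Success,
    Aborted,
    Overloaded,
    ClientResourceExhausted,
    Unavailable,
    BadSession,
    SessionBusy,
    Undetermined,
    TransportUnavailable,
    SchemeError,
    BadRequest,
    PreconditionFailed,
};

// Returns true when, per the list above, the operation would be retried.
constexpr bool ShouldRetry(EStatusSketch status, bool idempotent) {
    switch (status) {
        // Always retried.
        case EStatusSketch::Aborted:
        case EStatusSketch::Overloaded:
        case EStatusSketch::ClientResourceExhausted:
        case EStatusSketch::Unavailable:
        case EStatusSketch::BadSession:
        case EStatusSketch::SessionBusy:
            return true;
        // Outcome unknown: retried only when replay is declared safe.
        case EStatusSketch::Undetermined:
        case EStatusSketch::TransportUnavailable:
            return idempotent;
        // Success and schema / semantic failures go back to the caller.
        case EStatusSketch::Success:
        case EStatusSketch::SchemeError:
        case EStatusSketch::BadRequest:
        case EStatusSketch::PreconditionFailed:
            return false;
    }
    return false;
}
```

The middle group is where `.Idempotent(true)` matters: for `UNDETERMINED` the client cannot tell whether the write was applied, so replaying it is only safe when the caller has declared the operation replay-safe.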

## Result parsing

```cpp
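// Parse the first result set of the query result, row by row.
TResultSetParser parser(result.GetResultSet(0));
while (parser.TryNextRow()) {
    // std::optional: empty when the column value is NULL.
    auto id = parser.ColumnParser("id").GetOptionalUint64();
}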
```

## Bulk upsert

Non-transactional ingest via Table client:

```cpp
NYdb::TValueBuilder rows;
rows.BeginList().AddListItem().BeginStruct()
    .AddMember("id").Uint64(1)
    .AddMember("payload").Utf8("x")
    .EndStruct().EndList();

struct TBulkUpsertOp {
    std::string TablePath;
    NYdb::TValue Rows;
    NYdb::TStatus operator()(NYdb::NTable::TTableClient& tableClient) const {
        // Pass a copy so that every retry attempt sends the full row set.
        return tableClient.BulkUpsert(TablePath, NYdb::TValue(Rows)).GetValueSync();
    }
};

// `client` here is an NYdb::NTable::TTableClient, not the query client.
client.RetryOperationSync(
    TBulkUpsertOp{tablePath, rows.Build()},
    NYdb::NTable::TRetryOperationSettings().Idempotent(true).MaxRetries(20));
```

`BulkUpsert` is UPSERT-keyed (insert-or-overwrite by primary key), so replaying the same chunk converges to the same final state — that's why `.Idempotent(true)` is safe here and is the conventional setting. Each `BulkUpsert` call is its own non-transactional batch, not part of a surrounding `TTxControl`.

When bulk is appropriate vs `AS_TABLE` in a transaction — see [`../working-with-data.md`](../working-with-data.md).

Source: YDB docs — batch upload guide at <https://ydb.tech/docs/en/dev/batch-upload> (non-transactional ingest path, incompatible with synchronous secondary indexes).
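
The replay-safety argument can be checked with a small in-memory model. This is a sketch under stated assumptions, not SDK code: `TTableSketch`, `TRow`, `UpsertBatch` and `ReplayBatch` are hypothetical names, and a `std::map` keyed by primary key stands in for the table.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory model: primary key -> payload.
using TRow = std::pair<std::uint64_t, std::string>;
using TTableSketch = std::map<std::uint64_t, std::string>;

// Insert-or-overwrite by primary key, the semantics described above.
void UpsertBatch(TTableSketch& table, const std::vector<TRow>& batch) {
    for (const TRow& row : batch) {
        table[row.first] = row.second;
    }
}

// Applies the same batch `times` times, as a retrier replaying a chunk would,
// and returns the resulting table state.
TTableSketch ReplayBatch(const std::vector<TRow>& batch, int times) {
    assert(times >= 0);
    TTableSketch table;
    for (int i = 0; i < times; ++i) {
        UpsertBatch(table, batch);
    }
    return table;
}
```

Replaying a batch once or several times leaves the model in the same state, which is the property `.Idempotent(true)` relies on. An operation such as a counter increment would not pass the same check.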

## Large reads

Pick one structural path:

- **Query Service streaming** — `client.StreamExecuteQuery(...)` returns `TExecuteQueryIterator`; iterate with `ReadNext()`. The SDK may replay the lambda on retry, so an in-progress stream can re-emit already-seen rows — consumers must tolerate duplicates or dedupe.
- **Table Service scan** — `session.StreamExecuteScanQuery(...)` for unbounded scans without the Table `ExecuteDataQuery` result cap.
- **Keyset pagination** — outer loop with cursor predicate over the primary key; each page is its own `RetryQuerySync` call. See [`../working-with-data.md`](../working-with-data.md).

If using `ExecuteDataQuery`, check `TResultSet::Truncated()` — a `true` value means the result was cut off and the read must be continued (pagination or streaming).

Source: YDB docs — paging guide at <https://ydb.tech/docs/en/dev/paging> (keyset pagination over the primary key as the canonical strategy).
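
The keyset-pagination loop has the following shape. This is an SDK-free sketch: `FetchPage` and `ReadAll` are hypothetical names, a sorted vector of ids stands in for the table, and in real code each `FetchPage` call would be a parameterized query executed through its own `RetryQuerySync` call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one page query of the form
//   SELECT id FROM t WHERE id > $cursor ORDER BY id LIMIT $limit
// (table and column names are illustrative). `sortedIds` plays the role of
// the table, ordered by primary key; no cursor means "start from the beginning".
std::vector<std::uint64_t> FetchPage(const std::vector<std::uint64_t>& sortedIds,
                                     std::optional<std::uint64_t> cursor,
                                     std::size_t limit) {
    const auto first = cursor
        ? std::upper_bound(sortedIds.begin(), sortedIds.end(), *cursor)
        : sortedIds.begin();
    const auto remaining = static_cast<std::size_t>(sortedIds.end() - first);
    const auto last = first + static_cast<std::ptrdiff_t>(std::min(limit, remaining));
    return std::vector<std::uint64_t>(first, last);
}

// Outer loop: the cursor is the last primary key of the previous page.
std::vector<std::uint64_t> ReadAll(const std::vector<std::uint64_t>& sortedIds,
                                   std::size_t pageSize) {
    assert(pageSize > 0);
    std::vector<std::uint64_t> all;
    std::optional<std::uint64_t> cursor;
    for (;;) {
        const auto page = FetchPage(sortedIds, cursor, pageSize);
        all.insert(all.end(), page.begin(), page.end());
        if (page.size() < pageSize) {
            break;  // short page: nothing left to read
        }
        cursor = page.back();
    }
    return all;
}
```

Because the cursor is a key value rather than an offset, a retried page request returns the same rows, and the loop only advances the cursor after a page has been read successfully.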