ADBC for COBOL

Connect GnuCOBOL programs to any database with an ADBC driver: SQLite, DuckDB, Snowflake, PostgreSQL, and more.

A thin C wrapper bridges ADBC's C API to COBOL-friendly types (USAGE POINTER, PIC S9(9) COMP-5, PIC X) while a library of reusable COBOL subprograms provides a high-level interface for common database operations.

Quick Start

Prerequisites

  • Pixi package manager
  • dbc CLI for installing ADBC drivers
  • GnuCOBOL 3.2+ (installed automatically by Pixi on macOS and Linux; on Windows, see below)

Install GnuCOBOL on Windows

  1. Install MSYS2 to C:\msys64
  2. Open an MSYS2 MINGW64 terminal and run:
    pacman -S mingw-w64-x86_64-gnucobol
  3. Add C:\msys64\mingw64\bin to your system Path environment variable
  4. Set these environment variables:
    Variable          Value
    COB_CONFIG_DIR    C:\msys64\mingw64\share\gnucobol\config
    COB_COPY_DIR      C:\msys64\mingw64\share\gnucobol\copy
    COB_LIBRARY_PATH  C:\msys64\mingw64\lib

Install drivers

dbc install sqlite
dbc install duckdb
dbc install snowflake  # if you have a Snowflake account

Run demos

Demo       Command                  Description
SQLite     pixi run demo-sqlite     Bulk ingest, query, parameterized update, and transactions with rollback, on an in-memory SQLite database
DuckDB     pixi run demo-duckdb     Query a DuckDB file with 20+ Arrow data types (integers, floats, strings, dates, decimals, lists, structs)
Snowflake  pixi run demo-snowflake  Connect to Snowflake, list tables, fetch TPC-H CUSTOMER rows, write to a COBOL sequential file

The Snowflake demo requires a SNOWFLAKE_URI environment variable:

export SNOWFLAKE_URI='snowflake://user:pass@account/DATABASE/SCHEMA?warehouse=WAREHOUSE'
pixi run demo-snowflake

Run tests

pixi run test

The suite includes a standalone C portability check (test-wrapper-portability) that validates the wrapper's little-endian Arrow encoding/decoding logic. GitHub CI also runs that test under emulated s390x, so the endian-critical conversion layer is exercised on a big-endian architecture even when the full database stack is not available there.

Linting

Run the repo lint checks with:

pixi run lint

The lint task runs:

  • COBOL syntax checks with cobc -Wall -free -fsyntax-only
  • A 100-character maximum for non-comment COBOL lines in .cbl and .cpy files; comment lines starting with *> are exempt

To enable the same lint checks in Git pre-commit for your local clone:

pixi run install-pre-commit

COBOL API

All subprograms use a shared context (ADBC-CONTEXT) and execution variables (ADBC-EXEC-VARS) defined in the adbc-context.cpy copybook. After each call, check ADBC-OK for success.

Connection

CALL "adbc-connect"
    USING ADBC-CONTEXT ADBC-EXEC-VARS
          WS-DRIVER WS-OPTIONS
IF NOT ADBC-OK PERFORM ERR-EXIT END-IF

WS-DRIVER is a null-terminated driver name (e.g., Z"sqlite"). WS-OPTIONS is a null-terminated comma-separated string of key=value pairs (e.g., Z"uri=:memory:" or Z"host=localhost,port=5432,username=admin").
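
To make the WS-OPTIONS format concrete, here is a minimal C sketch of splitting a comma-separated key=value string. The names option_pair and parse_options are hypothetical and are not taken from adbc_wrapper.c; the real wrapper may parse options differently (for example, this sketch cannot represent a value that itself contains a comma).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One key=value pair; the pointers reference the caller's buffer. */
typedef struct {
    const char *key;
    size_t key_len;
    const char *value;
    size_t value_len;
} option_pair;

/*
 * Split "k1=v1,k2=v2" into pairs without allocating or copying.
 * Returns the number of pairs found, or -1 if a segment has no '='
 * or more than max_pairs segments are present.
 * Hypothetical helper for illustration only.
 */
int parse_options(const char *options, option_pair *pairs, int max_pairs)
{
    assert(options != NULL && pairs != NULL);
    int count = 0;
    const char *p = options;
    while (*p != '\0') {
        const char *comma = strchr(p, ',');
        size_t seg_len = comma ? (size_t)(comma - p) : strlen(p);
        const char *eq = memchr(p, '=', seg_len);
        if (eq == NULL || count >= max_pairs) {
            return -1;
        }
        pairs[count].key = p;
        pairs[count].key_len = (size_t)(eq - p);
        pairs[count].value = eq + 1;
        pairs[count].value_len = seg_len - pairs[count].key_len - 1;
        count++;
        p += seg_len;
        if (*p == ',') {
            p++; /* skip the separator and continue with the next pair */
        }
    }
    return count;
}
```

Called with "host=localhost,port=5432,username=admin", the sketch yields three pairs whose key and value spans point into the original string, so nothing has to be freed afterwards.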

Queries

*> Query and print results to stdout
CALL "adbc-query-and-print"
    USING ADBC-CONTEXT ADBC-EXEC-VARS
          SQL-SELECT WS-PRINT-MODE

*> Query and fetch into a C-allocated buffer for COBOL overlay
CALL "adbc-query-and-fetch"
    USING ADBC-CONTEXT ADBC-EXEC-VARS
          SQL-SELECT COL-SPEC

Execute (DDL/DML)

*> Execute without parameters
CALL "adbc-execute"
    USING ADBC-CONTEXT ADBC-EXEC-VARS
          SQL-STATEMENT

*> Execute with bound parameters
CALL "adbc-execute-params"
    USING ADBC-CONTEXT ADBC-EXEC-VARS
          SQL-UPDATE PARAM-SPEC
          PARAM-DATA

Bulk Ingest

CALL "adbc-ingest"
    USING ADBC-CONTEXT ADBC-EXEC-VARS
          TABLE-NAME COL-SPEC
          COBOL-TABLE

Type Codes

Specs such as COL-SPEC and PARAM-SPEC use a small subset of Arrow C Data Interface format strings plus a few COBOL-oriented extensions. Not every Arrow format string is supported.

Code  Meaning                 Notes
i     Arrow i / int32         4-byte binary integer
l     Arrow l / int64         8-byte binary integer
f     Arrow f / float32       4-byte binary float
g     Arrow g / float64       8-byte binary float
uN    Arrow u / UTF-8 string  N is the fixed COBOL field width
?     Repo extension          prefixes a supported fetch or write type; "1" = NULL, "0" = non-NULL
nN    Repo fetch extension    exact display numeric with N digits
sIvD  Repo fetch extension    exact signed decimal with I integer digits and D fractional digits
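
As an illustration of the code grammar in the table, the following C sketch parses a single type code such as i, ?u20, or s7v2. The type_code struct and parse_type_code function are hypothetical names, not part of the wrapper. The sketch also assumes the ? prefix may precede any code, and it does not attempt to model how several codes are combined inside one COL-SPEC or PARAM-SPEC, since that is not described here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef struct {
    int nullable; /* 1 if the code had a leading '?' */
    char kind;    /* 'i', 'l', 'f', 'g', 'u', 'n', or 's' */
    int width;    /* N for uN and nN, I for sIvD, 0 otherwise */
    int scale;    /* D for sIvD, 0 otherwise */
} type_code;

/* Read a run of decimal digits; returns -1 if none are present. */
static int read_number(const char **p)
{
    if (!isdigit((unsigned char)**p)) {
        return -1;
    }
    int n = 0;
    while (isdigit((unsigned char)**p)) {
        n = n * 10 + (**p - '0');
        (*p)++;
    }
    return n;
}

/* Parse one code; returns 0 on success, -1 on malformed input. */
int parse_type_code(const char *code, type_code *out)
{
    assert(code != NULL && out != NULL);
    const char *p = code;
    out->nullable = 0;
    out->width = 0;
    out->scale = 0;
    if (*p == '?') {
        out->nullable = 1;
        p++;
    }
    out->kind = *p;
    if (*p == '\0') {
        return -1;
    }
    p++;
    switch (out->kind) {
    case 'i': case 'l': case 'f': case 'g':
        break; /* fixed-size binary types carry no width */
    case 'u': case 'n':
        if ((out->width = read_number(&p)) < 0) return -1;
        break;
    case 's':
        if ((out->width = read_number(&p)) < 0) return -1;
        if (*p++ != 'v') return -1;
        if ((out->scale = read_number(&p)) < 0) return -1;
        break;
    default:
        return -1;
    }
    return *p == '\0' ? 0 : -1; /* reject trailing characters */
}
```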

Transactions

CALL "adbc-begin"    USING ADBC-CONTEXT ADBC-EXEC-VARS
CALL "adbc-commit"   USING ADBC-CONTEXT ADBC-EXEC-VARS
CALL "adbc-rollback" USING ADBC-CONTEXT ADBC-EXEC-VARS

Catalog Metadata

*> List table types (TABLE, VIEW, etc.)
CALL "adbc-get-table-types"
    USING ADBC-CONTEXT ADBC-EXEC-VARS
          WS-PRINT-MODE

*> List tables (flattened view of catalogs/schemas/tables)
CALL "adbc-list-tables"
    USING ADBC-CONTEXT ADBC-EXEC-VARS
          WS-DEPTH
          WS-FILTER-CATALOG WS-FILTER-SCHEMA
          WS-FILTER-TABLE WS-FILTER-COLUMN

Cleanup

CALL "adbc-cleanup" USING ADBC-CONTEXT

Releases all ADBC and Arrow resources in the correct order.

Project Structure

src/
  c/
    abi_probe.h          ABI probe helper declarations
    abi_probe.c          ABI probe helper implementation
    adbc_endian.h        Arrow little-endian conversion helpers
    adbc_wrapper.h       C wrapper declarations
    adbc_wrapper.c       C wrapper implementation
  copybooks/
    adbc-context.cpy     Opaque handles (USAGE POINTER) + exec variables
    adbc-status.cpy      ADBC status code constants
    arrow-buffers.cpy    LINKAGE overlays for zero-copy Arrow buffer access
    customer-*.cpy       Snowflake demo copybooks
  programs/
    abi-probe.cbl        ABI boundary probe for new runtimes
    adbc-connect.cbl     Connect to any ADBC-supported database
    adbc-execute.cbl     Execute DDL/DML (no result set)
    adbc-execute-params.cbl  Execute with bound parameters
    adbc-query-and-print.cbl  Query and print results to stdout
    adbc-query-and-fetch.cbl  Query and fetch into C buffer
    adbc-ingest.cbl      Bulk ingest COBOL table to database
    adbc-begin.cbl       Begin transaction
    adbc-commit.cbl      Commit transaction
    adbc-rollback.cbl    Rollback transaction
    adbc-get-table-types.cbl  List table types
    adbc-get-objects.cbl      Raw GetObjects stream
    adbc-list-tables.cbl      Flattened table listing
    adbc-print-stream.cbl     Arrow stream printer
    adbc-write-file.cbl       Write fetch buffer to file
    adbc-row-ptr.cbl          Row pointer arithmetic
    adbc-cleanup.cbl          Release all resources
    adbc-get-error.cbl        Extract error message
    demo-sqlite.cbl      SQLite demo
    demo-duckdb.cbl      DuckDB demo
    demo-snowflake.cbl   Snowflake demo
tests/
  test-wrapper-portability.c  Standalone C endian/encoding checks
  test-*.cbl            COBOL integration tests
Makefile
pixi.toml

How It Works

COBOL cannot directly use ADBC because the API relies on function pointers, nested structs, and pointer-to-pointer patterns. The C wrapper (adbc_wrapper.c) solves this by:

  1. Heap-allocating all ADBC and Arrow structs, exposing them to COBOL as opaque USAGE POINTER values
  2. Providing accessor functions that use only COBOL-compatible types: void* for pointers, int32_t/int64_t for integers, char* for strings, and double* out-parameters for floating-point values
  3. Handling Arrow's columnar format in C (offset/data buffer navigation, nested type traversal, null bitmap checking) so COBOL programs work with simple row-level values
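
The first two points can be sketched in C as follows. This is a minimal illustration of the opaque-handle pattern, not code from adbc_wrapper.c: demo_result and its functions are hypothetical stand-ins for the structs the wrapper actually manages.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a struct COBOL cannot describe directly. */
typedef struct {
    int64_t row_count;
    int32_t status;
} demo_result;

/* Heap-allocate the struct and return it as an opaque pointer.
 * On the COBOL side this would be held in a USAGE POINTER item. */
void *demo_result_new(int64_t row_count)
{
    assert(row_count >= 0);
    demo_result *r = calloc(1, sizeof *r);
    if (r != NULL) {
        r->row_count = row_count;
    }
    return r;
}

/* Accessor with a COBOL-friendly signature: the value comes back through
 * an out-parameter and the int32_t return value is a status code. */
int32_t demo_result_row_count(void *handle, int64_t *out)
{
    if (handle == NULL || out == NULL) {
        return -1;
    }
    *out = ((demo_result *)handle)->row_count;
    return 0;
}

/* Release the struct; COBOL only ever passes the pointer back. */
void demo_result_free(void *handle)
{
    free(handle);
}
```

Because COBOL never looks inside the pointer, the C side stays free to change the struct layout without touching any copybook.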

Data flows through Apache Arrow format in memory (the same columnar format used by ADBC drivers).
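
To show what "offset/data buffer navigation" and "null bitmap checking" involve, here is a hedged C sketch that reads one row of an Arrow UTF-8 column into a fixed-width field. The bitmap and offset layout follow the Arrow columnar format; the function names, the space padding, and the truncation behavior are assumptions for illustration, and the sketch ignores a non-zero array offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Arrow validity bitmaps are LSB-first: row r is bit (r % 8) of byte (r / 8),
 * and a set bit means the value is valid. A NULL bitmap means no nulls. */
int arrow_is_valid(const uint8_t *validity, int64_t row)
{
    if (validity == NULL) {
        return 1;
    }
    return (validity[row / 8] >> (row % 8)) & 1;
}

/* Read a little-endian int32 offset regardless of host byte order. */
static int32_t read_le_i32(const uint8_t *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    int32_t r;
    memcpy(&r, &v, sizeof r);
    return r;
}

/* Copy the string at `row` into a space-padded field of `width` bytes,
 * similar in shape to a PIC X(width) item. Values longer than the field
 * are truncated. Returns 1 if the row is NULL (field left as spaces),
 * 0 otherwise. */
int utf8_to_fixed_field(const uint8_t *validity, const uint8_t *offsets,
                        const uint8_t *data, int64_t row,
                        char *field, size_t width)
{
    assert(offsets != NULL && data != NULL && field != NULL);
    memset(field, ' ', width);
    if (!arrow_is_valid(validity, row)) {
        return 1;
    }
    /* A UTF-8 column stores n + 1 offsets; row r spans [off[r], off[r+1]). */
    int32_t start = read_le_i32(offsets + 4 * row);
    int32_t end = read_le_i32(offsets + 4 * (row + 1));
    size_t len = (size_t)(end - start);
    if (len > width) {
        len = width;
    }
    memcpy(field, data + start, len);
    return 0;
}
```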

Portability Notes

  • Arrow numeric and offset buffers are little-endian. The wrapper converts those values explicitly instead of assuming host byte order, so ingest, fetch, parameter binding, and row-level getters are correct on big-endian hosts.
  • The raw buffer overlay pattern shown in arrow-buffers.cpy is a limited API for numeric data. It exposes Arrow's native little-endian buffer bytes directly, so zero-copy overlays of COMP-5, COMP-1, and COMP-2 fields are only safe when the COBOL runtime uses matching in-memory numeric layouts.
  • The C conversion layer assumes IEEE 754 binary32/binary64 float and double. Builds fail fast on platforms where the C compiler uses a different floating-point representation.
  • If you target a mainframe COBOL runtime with different floating-point or character-set semantics, validate those ABI boundaries separately. Binary integers and IEEE floats are not the only portability concern on z/Architecture, and SQL / option strings still need a consistent execution character set across COBOL, C, and the ADBC driver.
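
The explicit conversion described in the first and third notes can be sketched as below. This is an illustrative version with hypothetical function names, not the contents of adbc_endian.h. The compile-time check on double is a partial stand-in for the fail-fast behavior mentioned above, and the sketch assumes the host stores doubles in the same byte order as its integers.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Partial compile-time check that double looks like IEEE 754 binary64. */
static_assert(sizeof(double) == 8 && FLT_RADIX == 2 &&
              DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024,
              "this sketch assumes IEEE 754 binary64 doubles");

/* Assemble a 64-bit value from little-endian bytes using shifts, so the
 * result does not depend on the host's byte order. */
uint64_t load_le_u64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Write a 64-bit value as little-endian bytes. */
void store_le_u64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v & 0xFF);
        v >>= 8;
    }
}

/* Reinterpret the assembled bits as a signed integer. */
int64_t load_le_i64(const uint8_t *p)
{
    uint64_t bits = load_le_u64(p);
    int64_t r;
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* Reinterpret the assembled bits as a double. */
double load_le_f64(const uint8_t *p)
{
    uint64_t bits = load_le_u64(p);
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Going through shifts rather than casting the buffer pointer also avoids unaligned reads, which matters when a value sits at an arbitrary byte position inside a buffer.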

Run ABI probe

Before a first run on a new COBOL runtime, especially a mainframe target, run:

pixi run abi-probe

That standalone probe prints raw bytes for:

  • null-terminated text samples used by the repo (driver, options, SQL, col spec)
  • null marker bytes ("0" and "1")
  • COMP-5 integer samples that expose byte order directly
  • COMP-1 and COMP-2 values so you can compare the output to expected IEEE big-endian or little-endian encodings

It is meant to answer the first-pass ABI questions quickly before you try the full ADBC path on a new platform.
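
For readers who want a feel for this kind of check on the C side, here is a small sketch that formats a value's raw bytes as hex and reports the host's integer byte order. It is not the repo's probe (which also exercises the COBOL side); hex_bytes and host_byte_order are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format `len` bytes as space-separated uppercase hex.
 * `out` must have room for 3 * len bytes (at least 1 when len is 0). */
void hex_bytes(const void *value, size_t len, char *out)
{
    assert(value != NULL && out != NULL);
    const unsigned char *p = value;
    out[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        /* Each byte takes two hex digits plus a separator or the final NUL. */
        snprintf(out + 3 * i, 4, i + 1 < len ? "%02X " : "%02X", p[i]);
    }
}

/* Report how the host lays out the 32-bit value 0x01020304 in memory. */
const char *host_byte_order(void)
{
    uint32_t probe = 0x01020304u;
    unsigned char b[4];
    memcpy(b, &probe, sizeof b);
    if (b[0] == 0x04 && b[3] == 0x01) {
        return "little-endian";
    }
    if (b[0] == 0x01 && b[3] == 0x04) {
        return "big-endian";
    }
    return "unknown";
}
```

Dumping the same sample values from both the C side and the COBOL side and comparing the hex strings is one way to spot a layout mismatch before running the full ADBC path.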

License

Apache License 2.0. See LICENSE for details.