Skip to content

Provide a functional API for importers? #108

@njlr

Description

@njlr

A little-used (but very useful!) feature in Npgsql is the NpgsqlBinaryImporter class, which allows more efficient bulk data loading with the Postgres COPY command.

Here is a demo using Npgsql.FSharp:

#r "nuget: Npgsql"
#r "nuget: Npgsql.FSharp"

open System
open Npgsql
open NpgsqlTypes
open Npgsql.FSharp

// Type extension so we can use the Npgsql.FSharp DU
type NpgsqlBinaryImporter with
  member this.WriteAsync(value : SqlValue) =
    match value with
    | SqlValue.Null ->
      this.WriteNullAsync()
    | SqlValue.Uuid x ->
      this.WriteAsync(x, NpgsqlDbType.Uuid)
    | SqlValue.String x ->
      this.WriteAsync(x, NpgsqlDbType.Varchar)
    | SqlValue.Int x ->
      this.WriteAsync(x, NpgsqlDbType.Integer)
    | _ ->
      failwith $"Unsupported value {value}" // TODO

// Domain object
type Book =
  {
    ID : Guid
    Title : string
    Year : int
  }

// Sample data (would typically come from a file)
let books =
  [
    { ID = Guid.Parse "74e99bf4-0b97-45c7-a078-698b96bc7421"; Title = "Consider Phlebas"; Year = 1987 }
    { ID = Guid.Parse "81023fb5-a54d-4a3f-8a8b-f36e022e6a11"; Title = "Hyperion"; Year = 1989 }
    { ID = Guid.Parse "89712977-3b2c-4fef-b902-8d20d1532084"; Title = "The Three-Body Problem"; Year = 2008 }
  ]

task {
  // Setup
  let connectionString = Environment.GetEnvironmentVariable "PG_CONNECTION_STRING"

  let db = Sql.connect connectionString

  let connection =
    db
    |> Sql.createConnection

  do! connection.OpenAsync()

  let db = Sql.existingConnection connection

  let! _ =
    db
    |> Sql.query
      """
      CREATE TABLE IF NOT EXISTS books (
        id UUID NOT NULL PRIMARY KEY,
        title TEXT NOT NULL,
        year INT NOT NULL
      )
      """
    |> Sql.executeNonQueryAsync

  // Binary import
  let writer = connection.BeginBinaryImport("COPY books (id, title, year) FROM STDIN BINARY")

  for book in books do
    let values =
      [
        Sql.uuid book.ID
        Sql.string book.Title
        Sql.int book.Year
      ]

    do! writer.StartRowAsync()

    for v in values do
      do! writer.WriteAsync(v)

  let! numberOfRowsWritten = writer.CompleteAsync()

  do! writer.CloseAsync()

  printfn $"Wrote {numberOfRowsWritten} row(s)"

  // Query back the data to check
  let! books =
    db
    |> Sql.query "SELECT * FROM books"
    |> Sql.executeAsync
      (fun read ->
        {
          ID = read.uuid "id"
          Title = read.string "title"
          Year = read.int "year"
        })

  for book in books do
    printfn $"{book}"

}
|> fun t -> t.Wait()

Perhaps this library could include a wrapper for this functionality?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions