Skip to content

Add new regex_captures_template virtual table? #9

Open
@asg017

Description

@asg017

The new regex_capture() and regex_captures() functions are great, but often you have to call regex_capture() multiple times to extract different capture group from the same capture.

For example:

select
  regex_capture("('ID(?P<id>\d+)Y(?P<year>\d+)ABC')", "ID123Y2023ABC", "id") as id,
  regex_capture("('ID(?P<id>\d+)Y(?P<year>\d+)ABC')", "ID123Y2023ABC", "year") as year;

The regex_captures make this a bit better, but it's still awkward and you repeat yourself a lot.

Possible solution: A "template" virtual table

A new virtual table that can take in a regex pattern as input, and infer column names based on the pattern's capture group names.

For example:

create virtual table parse_id using regex_captures_template(
  'ID(?P<id>\d+)Y(?P<year>\d+)ABC'
);

Would create a parse_id table function with id and year as columns. Then it can be used like so:

select id, year from parse_id('ID123Y2023ABC');

select 
  items.rowid, 
  id, 
  year 
from items join parse_id(items.code);

This is the first "template" virtual table that I have seen, besides maybe the define function (but isn't arbitrary code execution).

Also not sure what to do with capture groups without names. Maybe c0, c1 as column names for those? What about if a capture group has c0 as a name? Probably not too terrible tbh...

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions