Quads not being emitted as early as possible in construct query #140

@jeswr

Issue type:

  • 🐌 Performance issue

Description:

If I run the following query with the standard @comunica/query-sparql config, 1104 results are emitted and only the link https://w3id.org/dpv#Purpose is fetched.

import { QueryEngine } from "@comunica/query-sparql";

// Keep a reference to the original fetch so the wrapper can delegate to it.
const fetchFn = globalThis.fetch;

// Log every URL that is requested while the query runs.
// @ts-ignore
globalThis.fetch = (...args) => {
    console.log(args[0]);
    return fetchFn(...args);
}

async function run() {
    const engine = new QueryEngine();
    const res = await engine.queryQuads(
        'CONSTRUCT { ?s ?p ?o } WHERE { ?s a <https://w3id.org/dpv#Purpose> ; ?p ?o . }',
        { sources: ['https://w3id.org/dpv#Purpose'], lenient: true },
    );

    // Print a running count so it is visible when each quad is emitted.
    let i = 0;
    res.on('data', (quad) => {
        console.log(i += 1);
    });
}

run();

On the other hand, when @comunica/query-sparql-link-traversal is used, links are fetched indefinitely while no results are emitted. Note that this is not an issue if the BGP contains only a single triple pattern, such as CONSTRUCT WHERE { ?s a <https://w3id.org/dpv#Purpose> . }

I would hazard a guess that the problem is associated with the inner join performing some kind of nested loop, which requires the inner stream to terminate before it can iterate over the outer stream. In a link traversal scenario the inner stream may never terminate, so no joined results are ever produced. As a result, I would suggest ensuring that some kind of symmetric hash join is always used in link traversal scenarios.
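To illustrate the suggestion, here is a minimal sketch of a symmetric hash join over two inputs that share one join variable. It is not Comunica's join actor or bindings API; the names `Binding`, `Side`, and `createSymmetricHashJoin` are hypothetical and exist only for this example. The point it shows is that each incoming binding is stored and probed against the other side right away, so a result can be emitted without either input having ended.

```typescript
// Hypothetical types for illustration only; not Comunica's bindings or join API.
type Binding = Record<string, string>;
type Side = "left" | "right";

// Returns a push function. Each call feeds one binding from one input and
// returns the joined bindings that can be emitted immediately.
function createSymmetricHashJoin(joinVar: string) {
  // One hash table per input, keyed by the value of the join variable.
  const tables: Record<Side, Map<string, Binding[]>> = {
    left: new Map<string, Binding[]>(),
    right: new Map<string, Binding[]>(),
  };

  return function push(side: Side, binding: Binding): Binding[] {
    const key = binding[joinVar];
    if (key === undefined) {
      return []; // A binding without the join variable cannot match in this sketch.
    }
    const own = tables[side];
    const other = tables[side === "left" ? "right" : "left"];

    // 1. Store the binding in this side's table for future probes from the other side.
    const bucket = own.get(key) ?? [];
    bucket.push(binding);
    own.set(key, bucket);

    // 2. Probe the other side's table and merge with every match seen so far.
    return (other.get(key) ?? []).map((match) => ({ ...match, ...binding }));
  };
}

// Usage: neither input has "ended", yet a result is available on the second push.
const push = createSymmetricHashJoin("s");
const afterLeft = push("left", { s: "ex:a" });
const afterRight = push("right", { s: "ex:a", p: "ex:p", o: "ex:o" });
console.log(afterLeft.length, afterRight.length);
```

The trade-off is memory: both hash tables grow for as long as the inputs keep producing bindings, which matters if traversal runs for a long time.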
