Skip to content

Querying dbonds for the first time is very slow #5

@holgerbrandl

Description

@holgerbrandl

It's likely to be a caching issue, since it is slow just once after a system reboot:

movieDir="/Users/brandl/Desktop/WT_25deg_111102_ForTissueMiner"
db_name=basename(movieDir)

Sys.setenv(TM_HOME="/Volumes/projects/project-raphael/scripts/tissue_miner")
scriptsDir=Sys.getenv("TM_HOME")

#' profile tm query

source(file.path(scriptsDir, "commons/TMCommons.R"))

movieDb <- openMovieDb(movieDir)


system.time(
    dbond <- dbGetQuery(movieDb, "select frame, cell_id, dbond_id, conj_dbond_id, bond_id, vertex_id from directed_bonds")
)

To resolve the problem

  1. figure out where this cache is actually located and how it could be configured
  2. check if the first query is slow per db or per system
  3. check if we could substitute the first query with a faster one (like with using LIMIT 10;) for boostrapping the cache
  4. try to setup dockerized test to allow for reproducing the problem without a reboot
  5. check if native sqlite performance (without R in-between) is better or suffers from the same problem

Additional references:
SELECT query first run is VERY slow
http://sqlite.1065341.n5.nabble.com/SELECT-query-first-run-is-VERY-slow-td33100.html

A Deep Dive Into Unexpectedly Slow SQLite Queries
http://blog.isaachodes.io/p/deep_dive_into_slow_sqlite3_queries/

SQLite performance for 10 Million Records
http://sqlite.1065341.n5.nabble.com/SQLite-performance-for-10-Million-Records-td11019.html

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions