Running multiple tasks asynchronously  #38

Open
@najdanovicivan

I took a look at what has been going on here, and I wonder whether one thing is possible with this.

I work on a CI4 project that relies heavily on cron to fetch data from APIs. We fetch data from about 30 endpoints every minute, and each request takes a long time to process. To achieve that, I need to spawn many processes that work at the same time. So I have a single command that launches multiple instances of another command, using something like this:

/**
 * Spark Exec
 *
 * @param string   $sparkCommand Command
 * @param int|null $timeout      Timeout
 */
function spark_exec(string $sparkCommand, ?int $timeout = null): void
{
    $command = '';

    if ($timeout) {
        $command .= 'timeout ' . $timeout . ' ';
    }

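    // Quote every path so that spaces in the PHP binary, php.ini, or project path do not break the command
    $command .= escapeshellarg(PHP_BINARY) . ' -c ' . escapeshellarg(get_cfg_var('cfg_file_path') ?: PHP_CONFIG_FILE_PATH) . ' ' . escapeshellarg(ROOTPATH . 'spark') . ' ' . $sparkCommand . ' > /dev/null &';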

    exec($command);
}
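
To illustrate the "single command that launches many instances" idea, here is a minimal sketch of how the per-endpoint command strings could be built before they are handed to spark_exec(). The helper name `build_spark_commands`, the command name `api:fetch`, the endpoint names, and the 55-second timeout are all hypothetical placeholders, not part of the project described above.

```php
<?php

/**
 * Builds one spark command string per endpoint.
 * Hypothetical helper: the task and endpoint names are placeholders.
 *
 * @param string   $task      Spark command to run, e.g. 'api:fetch' (assumed name)
 * @param string[] $endpoints Endpoint identifiers, passed as the first argument
 *
 * @return string[] Command strings suitable for spark_exec()
 */
function build_spark_commands(string $task, array $endpoints): array
{
    $commands = [];

    foreach ($endpoints as $endpoint) {
        // escapeshellarg() keeps an unusual endpoint name from being interpreted by the shell
        $commands[] = $task . ' ' . escapeshellarg($endpoint);
    }

    return $commands;
}

$commands = build_spark_commands('api:fetch', ['orders', 'stock']);

// In the dispatcher command, each string would then be handed to spark_exec():
// foreach ($commands as $command) {
//     spark_exec($command, 55);
// }
```

Because every instance is started in the background, the dispatcher returns almost immediately and the fetches run in parallel.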

The problem is that I cannot have more than one process working with the same endpoint and writing the same data to the database, so I have a mechanism that puts locks in place. I use files to track the locking:

    /**
     * Creates the lock file and takes an exclusive lock on it
     *
     * @param string|null $identifier String to append to the filename
     *
     * @return resource Locked file handle; the process exits if the lock cannot be acquired
     */
    public static function lock(?string $identifier = null)
    {
        //Add leading dash to identifier if it is set
        if (isset($identifier)) {
            $identifier = '-' . $identifier;
        }

        //Get the locks file directory
        $lockDir = WRITEPATH . 'cron/locks/';

        //Create the locks file directory if it does not exist
        if (! file_exists($lockDir)) {
            mkdir($lockDir, 0755, true);
        }

        //Get the filename
        $filename = $lockDir . $identifier . '.lock';

        //Open the file for writing to lock it
        $lock = fopen($filename, 'wb');

        if ($lock === false) {
            // Unable to open file, check that you have permissions to open/write
            CLI::error('Unable to write the lock file');

            exit;
        }

        if (flock($lock, LOCK_EX | LOCK_NB) === false) {
            // Lock is already in use by another process
            CLI::error('Another instance is already running');

            exit;
        }

        //Return the locked file
        return $lock;
    }

    /**
     * Closes the file removing the lock
     *
     * @param resource $lock Lock
     */
    public static function unlock(&$lock): void
    {
        //Check if lock exist
        if ($lock && get_resource_type($lock) !== 'Unknown') {
            //Get the filename
            $metaData = stream_get_meta_data($lock);
            $filename = $metaData['uri'];

            //Close the file to remove the lock
            fclose($lock);

            //Remove the lock file
            unlink($filename);
        }
    }

    /**
     * Check if lock files with set prefix exist
     *
     * @param string $prefix filename prefix
     *
     * @return bool True if there are lock files otherwise false
     */
    public static function isWorking(string $prefix): bool
    {
        $result = false;

        //Read the locks directory; a lock file with the prefix means we're still working
        if ($openDir = opendir(WRITEPATH . 'cron/locks/')) {
            while (($file = readdir($openDir)) !== false) {
                if (substr($file, 0, strlen($prefix)) === $prefix) {
                    $result = true;
                    break;
                }
            }

            //Release the directory handle
            closedir($openDir);
        }

        return $result;
    }
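
The locking pattern above can also be sketched in a self-contained form. This is a simplified variant under stated assumptions, not the code used in the project: the helper names `try_lock` and `run_exclusive` are hypothetical, the lock files go to the system temp directory instead of WRITEPATH, and a busy lock returns `false` rather than calling `exit`.

```php
<?php

/**
 * Tries to take an exclusive, non-blocking lock on a file.
 * Hypothetical helper for illustration.
 *
 * @return resource|null Locked handle, or null if the lock could not be taken
 */
function try_lock(string $filename)
{
    // 'c' opens for writing without truncating and creates the file if it is missing
    $handle = fopen($filename, 'cb');

    if ($handle === false) {
        return null;
    }

    // LOCK_NB makes flock() return immediately instead of waiting for the lock
    if (! flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);

        return null;
    }

    return $handle;
}

/**
 * Runs $job only if the lock for $name is free.
 *
 * @return bool True if the job ran, false if the lock was unavailable
 */
function run_exclusive(string $name, callable $job, ?string $lockDir = null): bool
{
    $lockDir = $lockDir ?? sys_get_temp_dir();
    $lock    = try_lock($lockDir . DIRECTORY_SEPARATOR . $name . '.lock');

    if ($lock === null) {
        return false;
    }

    try {
        $job();
    } finally {
        // Release the lock even if the job throws
        flock($lock, LOCK_UN);
        fclose($lock);
    }

    return true;
}
```

A worker for one endpoint could then wrap its body as `run_exclusive('fetch-orders', $job)` (the name is a placeholder) and simply skip the run when it gets `false` back. The `try`/`finally` is a design choice of this sketch so that a failing job does not leave the lock held.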

I wonder if there are plans to make something similar possible with the scheduler here. From looking at the code, I believe all the scheduled actions run sequentially on a single thread. For example, say I have two tasks scheduled: the first runs every 5 minutes and the second runs every minute. If the first task takes more than 1 minute to complete, the next cron run will start the second task, but once the first process finishes task 1 it will also continue with the second task in that first process, so the order of execution is completely broken.

There are many real-world situations where long-running tasks are needed. One example is generating a sitemap from the database for a huge site.
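
The ordering problem in this scenario can be made concrete with a small simulation. It is only a sketch under the assumption stated above, namely that each cron run executes its due tasks one after another in a single process. It does not use the scheduler's actual code, and the function name, task names, and durations are made up for illustration.

```php
<?php

/**
 * Simulates cron runs that execute their due tasks one after another.
 * Hypothetical model; intervals are in minutes, durations in seconds.
 *
 * @param array $tasks   Map of task name => ['every' => int, 'takes' => int]
 * @param int   $minutes Number of cron runs to simulate
 *
 * @return array Entries with 'task', 'scheduled', 'started', sorted by actual start time
 */
function simulate_sequential_runs(array $tasks, int $minutes): array
{
    $log = [];

    for ($minute = 0; $minute < $minutes; $minute++) {
        // Each cron run is its own process with its own clock, starting on the minute
        $clock = $minute * 60;

        foreach ($tasks as $name => $task) {
            if ($minute % $task['every'] !== 0) {
                continue;
            }

            $log[] = ['task' => $name, 'scheduled' => $minute * 60, 'started' => $clock];

            // The next task in this process has to wait for this one to finish
            $clock += $task['takes'];
        }
    }

    usort($log, static fn (array $a, array $b): int => $a['started'] <=> $b['started']);

    return $log;
}

$log = simulate_sequential_runs([
    'slow' => ['every' => 5, 'takes' => 90],
    'fast' => ['every' => 1, 'takes' => 5],
], 2);
```

In this model the `fast` run that was scheduled for second 0 only starts at second 90, after the `fast` run scheduled for second 60 has already started, which is the out-of-order execution described above.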
