PHP Multiline strings causing line number to be reported incorrectly #623

@plotbox-io

Describe the bug
When scanning two PHP files that contain a duplicate block, where one file also has a multi-line string before the block, the line numbers reported for the clone in that file are off by the number of newlines within the PHP string. It looks as if the code that calculates line numbers always treats a string as a single line.
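
The suspected mechanism can be sketched as follows, in TypeScript because jscpd runs on Node.js. This is only an illustration under assumptions: the `Token` shape and both functions are hypothetical and are not taken from jscpd's or reprism's source. It shows how a line counter that advances only on standalone newline tokens loses the newlines embedded in a multi-line string token.

```typescript
// Hypothetical token shape for illustration; not jscpd's real data structure.
interface Token {
  type: "string" | "newline" | "other";
  value: string;
}

// Suspected faulty approach (an assumption): only standalone newline tokens
// advance the counter, so newlines inside a string token are never counted.
function startLinesIgnoringEmbedded(tokens: Token[]): number[] {
  let line = 1;
  const lines: number[] = [];
  for (const token of tokens) {
    lines.push(line);
    if (token.type === "newline") {
      line += 1;
    }
  }
  return lines;
}

// Expected approach: advance by every newline found in the token's text.
function startLinesCountingEmbedded(tokens: Token[]): number[] {
  let line = 1;
  const lines: number[] = [];
  for (const token of tokens) {
    lines.push(line);
    line += token.value.split("\n").length - 1;
  }
  return lines;
}

// A shortened version of the multi-line SQL string from the first file.
const tokens: Token[] = [
  { type: "other", value: "$sql = " },
  { type: "string", value: '"SELECT\n  LINE1,\n  LINE2,\n  LINE3\nFROM mysql.table"' },
  { type: "other", value: ";" },
  { type: "newline", value: "\n" },
  { type: "other", value: "return" },
];

// Start line of each token under both approaches.
const faulty = startLinesIgnoringEmbedded(tokens);
const expected = startLinesCountingEmbedded(tokens);
```

In this sketch the final `return` token really starts on line 6, while the first function places it on line 2. The gap of 4 equals the number of newlines inside the string, which is the kind of offset described above.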

To Reproduce
Steps to reproduce the behavior:
Create one file with the following contents:

<?php

final class FirstClass
{
    /** @inheritDoc * */
    public function someFunction(): void
    {
        $sql = "SELECT 
                    LINE1,
                    LINE2,
                    LINE3
                FROM mysql.table";
    }

    public function imageUri(mixed $result, string $subdomain): string
    {
        $portPart = '';
        if ($this->environment->isDeveloperEnv()) {
            $port = (int) $this->environment->getHttpPort();
            if (!in_array($port, [80, 443])) {
                $portPart = ":$port";
            }
        }

        return "ABC123";
    }
}

Create a second file with:

<?php

final class SecondClass
{
    public function getImageUriBasePath(): string
    {
        $portPart = '';
        if ($this->environment->isDeveloperEnv()) {
            $port = (int) $this->environment->getHttpPort();
            if (!in_array($port, [80, 443])) {
                $portPart = ":$port";
            }
        }

        $subdomain = $this->environment->getSubDomain();

        return "ABC123";
    }
}

Put these files in the same directory and run a scan. Observe the incorrect line numbers reported for the 'FirstClass' file:

Clone found (php):

  • var/cpdtest/GenealogyImageData.php [11:11 - 21:7] (10 lines, 85 tokens)
    var/cpdtest/PlotImageService.php [5:2 - 15:11]

Then reduce the multi-line string starting on line 8 to a single-line string and rerun the scan:

Clone found (php):

  • var/cpdtest/GenealogyImageData.php [11:11 - 21:7] (10 lines, 85 tokens)
    var/cpdtest/PlotImageService.php [5:2 - 15:11]

Now the line numbers are correct. Note that the output is identical in both runs, but the second run is correct because the file has changed so that its real line numbers now match the reported ones.

Expected behavior
Line numbers should be correct even when a PHP file contains multi-line strings.

Desktop (please complete the following information):

  • OS: Linux Manjaro
  • OS Version: Rolling
  • NodeJS Version: v20.9.0
  • jscpd version: 3.5.10

Note: I have already tried updating the Prism PHP language config by copying https://github.com/PrismJS/prism/blob/master/components/prism-php.js into node_modules/reprism/languages/php.js, but the same defect occurs.

Labels: bug
