Question marks in Cyrillic file names #31

@KonstantinKorepin

Hi all!

I have a problem with the Archive7z class. I wrote this method:

public function unzip(string $pathToSource): DirectoryIterator
{
    ...

    $obj = new Archive7z($pathToSource);
    $obj->setOutputDirectory($pathToDestination);
    $obj->extract();

    return new DirectoryIterator($pathToDestination);
}

It returns a DirectoryIterator over the directory the files were extracted to.

Next, I collect information about the unpacked files:

$iterator = $this->zipper->unzip($zipFile->getRealPath());
foreach ($iterator as $unzippedFile) {
    if (!in_array($unzippedFile->getFilename(), ['.', '..'])) {
        $encoding = mb_detect_encoding($unzippedFile->getFilename()); // ASCII
        $fileName = $unzippedFile->getFilename(); // 12_1_?????.txt
    }
}

On my server, the encoding is detected as ASCII, and the file names contain question marks
instead of Cyrillic letters.

In my local environment (Docker), everything is displayed correctly: mb_detect_encoding($unzippedFile->getFilename())
returns UTF-8 and the file names are correct.

I also managed to reproduce the error in Docker by following https://zalinux.ru/?p=5740.
I commented out the en_US.UTF-8 UTF-8 line in /etc/locale.gen in the PHP container and ran
locale-gen. After that, only the ru_RU.UTF-8 UTF-8 locale was left. From then on, the encoding of the unpacked file names
was also detected as ASCII instead of UTF-8, and question marks appeared in place of the Cyrillic characters.
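
To compare the two environments, a minimal sketch like the following could be run in each of them. It only reports the locale settings visible to PHP; it assumes the question marks are related to the process locale, which the reproduction above suggests but does not prove. The helper name isUtf8Locale is hypothetical and is not part of Archive7z.

```php
<?php
// Hypothetical helper (not part of Archive7z): true if a locale name
// such as "en_US.UTF-8" or "ru_RU.utf8" declares a UTF-8 charset.
function isUtf8Locale(string $locale): bool
{
    return (bool) preg_match('/\.utf-?8(@.*)?$/i', $locale);
}

// Passing "0" queries the current LC_CTYPE setting without changing it.
$current = setlocale(LC_CTYPE, '0');

// Environment variables that a child process would typically inherit.
$lang  = getenv('LANG');
$lcAll = getenv('LC_ALL');

printf("LC_CTYPE (PHP): %s\n", $current === false ? '(unknown)' : $current);
printf("LANG:           %s\n", $lang === false ? '(not set)' : $lang);
printf("LC_ALL:         %s\n", $lcAll === false ? '(not set)' : $lcAll);
printf(
    "UTF-8 in LANG:  %s\n",
    ($lang !== false && isUtf8Locale($lang)) ? 'yes' : 'no'
);
```

If LANG or LC_ALL names a locale that is not generated on the machine (for example en_US.UTF-8 after it was commented out in /etc/locale.gen), that mismatch would be the first thing to check.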

If I restore the en_US.UTF-8 UTF-8 line in the container and run locale-gen again, everything works fine.
Could you please tell me what I can do so that Cyrillic characters, not question marks, appear in the file names
after unpacking? How can I make the files unpack with UTF-8 file names rather than ASCII?
