Skip to content

V4.0.0 beta release #13

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 45 commits into
base: master
Choose a base branch
from
Open

Conversation

Xentraxx
Copy link

@Xentraxx Xentraxx commented Apr 25, 2025

4.0.0-wacheee (by Xentraxx)

Fork/Alternate version

This change is a big overhaul of the project, so only the major improvements or potential breaking changes are mentioned

Tl;dr

  • Added support for reading EXIF data from JXL (JPEG XL), ARW, RAW, DNG, CRW, CR3, NRW, NEF and RAF files
  • Added a "--write-exif" flag which will write missing EXIF information (coordinates and DateTime) from json to EXIF for jpg and jpeg files
  • Added support to get DateTime from .MOV, .MP4 and probably many other video formats through ffprobe. You need to download it yourself (e.g. from here: https://ffbinaries.com/downloads) and make sure the folder you keep it in is in your $PATH variable.
  • Added verbose mode (--verbose or -v)

General improvements

  • upgraded dependencies and fixed breaking changes
  • updated dart to a minimum version of 3.7.0 of the dart SDK
  • included image, intl and coordinate_converter packages
  • applied a list of coding best practices through lint rules to code
  • added/edited a bunch of comments and changed unnecessary print() to log() for debugging and a better user experience
  • Divided code in steps through comments and included steps in output for readability, debuggability and to make it easier to follow the code
  • checked TODOs in README.md
  • Added TODOs to look into in code through //TODO comments
  • moved json_extractor file into date_extractor folder
  • added unit tests for new write-exif functionality
  • made CLI --help output more readable through line breaks
  • renamed some variables/functions to better reflect their purpose
  • moved step 8 (update creation time) before final output
  • added output how often DateTime and Coordinates have been written in EXIF at the final output
  • changed that test data will be created in test subfolder instead of project root directory
  • Added consistent log levels to log output to quickly differenciate between informational and error logs
  • Create symlinks with powershell on windows now which fixed heap corruption on newer win32/ffi
  • Added logging of elapsed time for each step.

Bug fixes

  • fixed existing unit tests which would fail on windows

Added functionality

  • Support for writing coordinates and DateTime to EXIF

    • Added new CLI option "--write-exif".
    • When enabled, the script will check if the associated json of any given file contains coordinates and if the file does not yet have them in its EXIF data, the script will add them.
    • When enabled, the script will check if a DateTime has been extracted from any of the given extraction methods and if the file has no EXIF DateTime set, it will add the DateTime to the EXIF data 'DateTime', 'DateTimeOriginal'and 'DateTimeDigitized'.
    • Currently supported file types are in theory JPG, PNG, Animated APNG, GIF, Animated GIF, BMP, TIFF, TGA, PVR and ICO (based on pub package Image 4.5.4). Howver only jpg and jpeg are confirmed to work. Others might work or will silently fail without problems.
    • Added verbose mode (--verbose or -v) with log levels info, warning and error.
  • Moved from the stale "exif" package to the exif_reader package which is a maintained fork of the exif package

    • This adds support for extracting DateTime from JXL (JPEG XL), ARW, RAW, DNG, CRW, CR3, NRW, NEF and RAF files
  • Added the ffmpeg_cli package and logic to attempt to extract exif data from videos using ffprobe.

    • ffprobe needs to be in $PATH variable. If not, that's okay. But if you have ffprobe locally, Google Photos Takeout Helper now supports reading CreatedDateTime EXIF data from a variety of video file formats.
Previous fixes and improvement (from 3.4.3-wacheee to 4.0.0-wacheee)
Limitations:
  • if album mode is set to duplicate-copy, it will move the album photos to the album folder (as usual), but ALL_PHOTOS will not contain them if the media is not in a year album.
  • it does not fix issues related to reading JSON files (if necessary) for Motion Photo files; however, if the dates are included in the file name (as with Pixel Motion Photos), the correct dates will be established.
  • writing exif to png files does not work. Other file types may or may not work (only jpg and jpeg are confirmed working). If it doesn't work, it will just fail silently and continue. So it's okay.
  • No interactive mode for setting write-exif flag
  • No interactive unzipping
  • 'The hardcoded maximum file size' limits functionality to read DateTime from exif data to image and video! files smaller than 64 MB

Xentraxx added 28 commits April 23, 2025 10:18
bumped version and fixed breaking changes
Pull request in deprecated repo is pending. It makes sense to include it locally to be able to bump version of ffi and win32.
Working on the TODOs next.
Can now write coordinates and DateTime in exif data for:
JPG, PNG / Animated APNG, GIF / Animated GIF, BMP, TIFF, TGA and PVR.
See changelog.
code is very slow. working on it.
did hide new experimental slow functrionality behind "write-exif" flag
Added steps, so it's easier to debug.
Added another step and the steps to more output
added tests and better error handling.
Also moved the date_extractor type definition in the date_extractor folder and added a FillingBar for the removeDuplicates operation for better user experience.
Next commit will be fixes for lint
Applied best practices to code
…t end

Want to have overview at end what happened to better how much a flag/a new version fixed. Also changed a print() to log()
Updated Changelog, project version, bumped dart min SDK version and minor changes to output.
This is a release candidate.
Please test and provide feedback.
the lookupMimeType() did give back "tif" for tiff and null for tga. Fixed by just looking at the extension of path instead file type.
The Exif_reader package is a fork of the exif package which is maintained.
… .mp4 through ffprobe

ffprobe has to be manually downloaded and needs to be present in $PATH variable. Updated changelog.
Release candidate 2.
Made some significant changes to support more file formats.
group.dart was just changed because of lint rule. Otherwise just changed and added some logs and comments to catch more edge cases during debuging and get better log messages on what is happening.
Those stats can be compared with a run of the last stable release with the same TakeOut to quickly see the improvements this version brings to the table.
@Xentraxx
Copy link
Author

Xentraxx commented Apr 26, 2025

I think we should leave "--write-exif" as an optional flag which isn't active by default for the time being. Once it is thoroughly tested we can add it to the interactive mode.

I also leave the maxFileSize of 64MB in place for videos and photos for now. But if you agree that we shouldn't limit it by default and only give the option to limit it (e.g. on a NAS), I will implement that before the release/merge.

Currently the behaviour is, that ffprobe will be attempted for every video file. If it's not available in the environment, it will simply fail. I think we don't need a flag for that. In a stable release, we could also add some interactive output which simplifies things for the not so tech-savy users.

Xentraxx added 11 commits April 26, 2025 23:36
Now again wayyyy faster and I handled more cases. Because of growing complexity I handle everything in a switch case.
Also added a bit more logging.
This is a big one! Didn't refactor the code correctly before. Now the heap corruption is fixed and the folders are also created correctly! Needed changing because of those major upgrades of ffi and win32.
I'm tired of trying to make the com objects work. They cause the heap corruption in the createshortcutWin() function. Powershell handles this now. Works more reliably.
Also remembered to add the powershell symlink change to changelog
Because of the complexity of the removingduplicate operation we don't have a reliable iterable to go over. The creation of the iterable is what takes so long. So rather not have any progress bar but a bit more logging.
Helps identify if changes caused performance improvements or decay by testing them against same dataset with same options.
… now deactivated by default to support larger files like videos
Before a process was created for every video file, even if it was destined to fail because ffprobe was not installed. Now it is being checked intially and if ffprobe is not installed, it is being skipped, which should be way more performant.
@Xentraxx
Copy link
Author

Xentraxx commented Apr 27, 2025

I want to share some statistics for running my 140GB diverse dataset through different modis:

3.6.2-wachee:

  • Successfully renamed JSON files (suffix removed): 50038
  • Skipped 20 duplicates
  • Couldn't find date for 808 photos/videos :/
  • In total the script took 21 minutes.

4.0.0 without any flags and ffprobe unavailable:

  • 19 duplicates were found and skipped
  • For 807 photos/videos we were unable to find any DateTime :/
  • In total the script took 38 minutes to complete

4.0.0 with write-exif flag and ffprobe available:

  • NEED TO RUN AND WILL UPDATE

4.0.0 with write-exif flag and ffprobe unavailable:

  • NEED TO RUN AND WILL UPDATE

4.0.0 without any flags and ffprobe available:

  • 19 duplicates were found and skipped
  • For 44 photos/videos we were unable to find any DateTime :/
  • In total the script took 29 minutes to complete

4.0.0 without any flags and ffprobe unavailable and enforce-max-filesize:

  • 20 duplicates were found and skipped
  • For 808 photos/videos we were unable to find any DateTime :/
  • In total the script took 41 minutes to complete

Interestingly with ffprobe available, the script is faster for me than without...
Also comparing 3.6.2-wachee and 4.0.0 without any flags and ffprobe unavailable and enforce-max-filesize, which is the mode most comparable to the last version, the script took double the time.
Need to do further testing and optimization.
At this point everything should be fairly stable though in regards to functionality.

@Wacheee
Copy link
Owner

Wacheee commented Apr 29, 2025

Hey, thanks! I have reviewed some changes and noticed that you switched shortcut creation on Windows from ffi/win32 to PowerShell. This causes problems when creating a large number of shortcuts, as shown in this PR: TheLastGimbus#390

Also, PowerShell has issues encoding certain characters
TheLastGimbus#389
https://stackoverflow.com/a/74734553/14323533

@Xentraxx
Copy link
Author

Hey, thanks! I have reviewed some changes and noticed that you switched shortcut creation on Windows from ffi/win32 to PowerShell. This causes problems when creating a large number of shortcuts, as shown in this PR: TheLastGimbus#390

Also, PowerShell has issues encoding certain characters TheLastGimbus#389 https://stackoverflow.com/a/74734553/14323533

Hey thanks for the info! But now we're in a bit of a pickle as ffi/win32 in the newest version definitely cause the heap exception with the exact same code in the createShortcutWin() function. So we need to fix that somehow.

Currently causes heap exception!
Just deleted redundant memory freeing
@Wacheee
Copy link
Owner

Wacheee commented Apr 30, 2025

The heap exception likely occurs because the Release() function from previous versions is missing.

image

For the new ffi/win32 version, you must call release() before the finally block, and call it directly on the variable rather than on the interface

//...
if (FAILED(hrSave)) {
      throw 'Error trying to save shortcut: $hrSave';
    } 
    persistFilePtr.release();
    shellLinkPtr.release();
  } finally {
// ....

There may be other issues with freeing memory that should be reviewed.

@Xentraxx
Copy link
Author

Xentraxx commented May 2, 2025

@Wacheee The missing release() is definitely not the problem. I removed them very consciously as I suspected them to cause the heap exception. Removing them didn't help unfortunately. But I am quite sure that the memory is freed twice because of the free() function, so the release() are redundant. But I am not too familiar with ffi/win32.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants