Contributor

lemmi commented Feb 3, 2025

darktable, and potentially any program that uses exiv2, sets the type of the Exif UserComment tag to undefined for UTF-8 encoded strings.

exiftool will happily decode such a string.

This patch adds a fallback that tries to decode such strings as UTF-8 and otherwise returns the empty string.
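A minimal sketch of the fallback idea, not the code in this patch: the helper name `decodeUndefinedUserComment` is hypothetical, and the sketch assumes the raw value still carries the 8-byte character-code prefix that Exif puts in front of UserComment text (all zero bytes for "undefined"). Valid UTF-8 is returned as a string; anything else yields the empty string.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// undefinedHeader is the 8-byte all-zero prefix that marks a UserComment
// whose character code is "undefined".
var undefinedHeader = make([]byte, 8)

// decodeUndefinedUserComment is a hypothetical helper for illustration only;
// it is not the function changed in helpers.go. raw is assumed to be the full
// tag value, including the 8-byte prefix.
func decodeUndefinedUserComment(raw []byte) string {
	if len(raw) < 8 || !bytes.Equal(raw[:8], undefinedHeader) {
		// Not an "undefined" comment; other encodings are out of scope here.
		return ""
	}
	payload := raw[8:]
	if !utf8.Valid(payload) {
		// Fallback failed: the bytes are not UTF-8, so return the empty string.
		return ""
	}
	return string(payload)
}

func main() {
	// 8 zero bytes followed by the UTF-8 encoding of the test comment.
	raw := append(make([]byte, 8), []byte("äöüß")...)
	fmt.Printf("%d bytes -> %q\n", len(raw), decodeUndefinedUserComment(raw))
}
```

Checking `utf8.Valid` before converting keeps arbitrary binary payloads from surfacing as garbage strings, at the cost of dropping comments written in some other undeclared encoding.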

Contributor Author

lemmi commented Feb 3, 2025

Here's a test image with UserComment set to äöüß:
[image: undefined_usercomment]

Here's a test program that fails to return the string without the patch:

package main

import (
	"fmt"
	"log"
	"os"

	"github.com/bep/imagemeta"
)

func run() error {
	f, err := os.Open(os.Args[1])
	if err != nil {
		return err
	}

	defer f.Close()

	opts := imagemeta.Options{
		R:               f,
		ImageFormat:     imagemeta.JPEG,
		ShouldHandleTag: func(imagemeta.TagInfo) bool { return true },
		HandleTag: func(info imagemeta.TagInfo) error {
			fmt.Printf("%10s: %v\n", info.Tag, info.Value)
			return nil
		},
		Warnf: func(s string, a ...any) {
			fmt.Printf("Warning: "+s+"\n", a...)
		},
	}

	return imagemeta.Decode(opts)
}

func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}

go run . undefined_usercomment.jpg
XResolution: 72
YResolution: 72
ResolutionUnit: 2
YCbCrPositioning: 1
ExifVersion: 0232
ComponentsConfiguration: 1 2 3 0
UserComment:
FlashpixVersion: 0100
ColorSpace: 65535

exiftool:

exiftool undefined_usercomment.jpg
ExifTool Version Number         : 13.03
File Name                       : undefined_usercomment.jpg
Directory                       : .
File Size                       : 683 bytes
File Modification Date/Time     : 2025:02:03 12:03:04+01:00
File Access Date/Time           : 2025:02:03 12:03:06+01:00
File Inode Change Date/Time     : 2025:02:03 12:03:04+01:00
File Permissions                : -rw-r--r--
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg
Exif Byte Order                 : Big-endian (Motorola, MM)
X Resolution                    : 72
Y Resolution                    : 72
Resolution Unit                 : inches
Y Cb Cr Positioning             : Centered
Exif Version                    : 0232
Components Configuration        : Y, Cb, Cr, -
User Comment                    : äöüß
Flashpix Version                : 0100
Color Space                     : Uncalibrated
Image Width                     : 16
Image Height                    : 16
Encoding Process                : Progressive DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
Image Size                      : 16x16
Megapixels                      : 0.000256

exiv2:

exiv2 -pe undefined_usercomment.jpg
Exif.Image.XResolution                       Rational    1  72/1
Exif.Image.YResolution                       Rational    1  72/1
Exif.Image.ResolutionUnit                    Short       1  2
Exif.Image.YCbCrPositioning                  Short       1  1
Exif.Image.ExifTag                           Long        1  90
Exif.Photo.ExifVersion                       Undefined   4  48 50 51 50
Exif.Photo.ComponentsConfiguration           Undefined   4  1 2 3 0
Exif.Photo.UserComment                       Undefined  16  äöüß
Exif.Photo.FlashpixVersion                   Undefined   4  48 49 48 48
Exif.Photo.ColorSpace                        Short       1  65535

bep added a commit to lemmi/imagemeta that referenced this pull request Feb 3, 2025
Owner

bep commented Feb 3, 2025

Thanks for this. It seems that the new switch produces invalid strings in some cases, and I must admit I don't understand how that could happen.

I have added your failing image to the test setup.

lemmi and others added 2 commits February 3, 2025 20:19
darktable and potentially any program that uses exiv2 will set the type
for the exif UserComment tag to undefined for utf-8 encoded strings.

exiftool will happily decode such a string.

This patch adds a fallback that tries to decode this type of strings as
utf-8, otherwise return the empty string.
@lemmi lemmi force-pushed the fix_usercomment_undefined branch from 7c8845e to c2fbacd Compare February 3, 2025 19:19
Contributor Author

lemmi commented Feb 3, 2025

It actually makes sense if you look at the dump of the failed test image:

[image: dump of the failed test image]

The UserComment starts with 00 00 00 00 00 00 00 00 and is therefore caught in the new case, so it's not immediately returned as an empty string. I suspected that exiftool might just trim whitespace. Although I know next to nothing about Perl, I think this is the line:
https://github.com/exiftool/exiftool/blob/5a772cefb038d709fd20b61cd7291517a2fde08f/lib/Image/ExifTool/Exif.pm#L5490
I updated the code to do the same and remove all trailing whitespace.
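As a rough illustration of that trimming step, here is a sketch under assumptions rather than the patch itself: `trimTrailingSpace` is a hypothetical name, and the blank-only payload is assumed for the example, not taken from the test image. A comment that holds only whitespace after the 8-byte header collapses to the empty string, while real text keeps everything except its trailing whitespace.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimTrailingSpace is a hypothetical helper for illustration; it removes
// every trailing Unicode whitespace character and leaves the rest untouched.
func trimTrailingSpace(s string) string {
	return strings.TrimRightFunc(s, unicode.IsSpace)
}

func main() {
	// Assumed payload: 8 zero header bytes followed by nothing but blanks.
	raw := append(make([]byte, 8), []byte("        ")...)
	fmt.Printf("%q\n", trimTrailingSpace(string(raw[8:])))

	// A real comment only loses its trailing whitespace.
	fmt.Printf("%q\n", trimTrailingSpace("äöüß  \n"))
}
```

Trimming only on the right preserves any leading indentation a comment may contain on purpose.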


codecov-commenter commented Feb 4, 2025

Codecov Report

Attention: Patch coverage is 50.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 76.95%. Comparing base (0deaa44) to head (c2fbacd).
Report is 2 commits behind head on main.

Files with missing lines | Patch % | Lines
helpers.go | 50.00% | 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #37      +/-   ##
==========================================
- Coverage   77.05%   76.95%   -0.10%     
==========================================
  Files          14       14              
  Lines        1691     1697       +6     
==========================================
+ Hits         1303     1306       +3     
- Misses        303      305       +2     
- Partials       85       86       +1     

@bep bep merged commit fe44406 into bep:main Feb 4, 2025
6 checks passed