Issue when reading file with tensorflow #27

@aurelio-amerio

Hello, I'm having some problems reading a file with TensorFlow.
TFRecord files generated with TensorFlow itself read back correctly, but TFRecord files generated with this library do not.

In particular, TensorFlow does not pick up the feature fields correctly: it either parses all the data as if there were only one feature (when the features share a type) or picks up only one field, seemingly at random.
To reproduce the error, create a TFRecord file in Julia with the example:

using TFRecord

n = 10
f1 = rand(Bool, n)
f2 = rand(1:5, n)
f3 = rand(("cat", "dog", "chicken", "horse", "goat"), n)
f4 = rand(Float32, n)

TFRecord.write(
    "example.tfrecord",
    (
        Dict(
            "feature1" => f1[i],
            "feature2" => f2[i],
            "feature3" => f3[i],
            "feature4" => f4[i],
        )
        for i in 1:n
    )
)

Then try to read it back using TensorFlow in Python:

import tensorflow as tf
import numpy as np

raw_dataset = tf.data.TFRecordDataset("example.tfrecord")

# take the first example and print the content
for raw_record in raw_dataset.take(1):
    example = tf.train.Example()
    example.ParseFromString(raw_record.numpy())
    print(example)
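
To narrow down whether the problem is in how the records are framed or in how the `Example` message itself is encoded, the file can also be inspected without TensorFlow. The following is only a minimal sketch using the standard library: it assumes an uncompressed file, skips the CRC fields without verifying them, and relies on the usual layout of `tf.train.Example` (a `features` map whose values hold a `bytes_list`, `float_list`, or `int64_list`). All function names here are hypothetical helpers, not part of TFRecord.jl or TensorFlow.

```python
import struct

# Field numbers of the oneof inside tf.train.Feature.
KIND = {1: "bytes_list", 2: "float_list", 3: "int64_list"}


def read_tfrecord_payloads(data):
    """Yield each record payload from the raw bytes of a TFRecord file.

    Per-record framing: uint64 length, uint32 CRC of the length,
    payload, uint32 CRC of the payload. CRCs are skipped, not checked.
    """
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from("<Q", data, pos)
        start = pos + 12  # 8-byte length + 4-byte length CRC
        yield data[start:start + length]
        pos = start + length + 4  # skip the 4-byte payload CRC


def read_varint(buf, pos):
    """Decode one protobuf varint; return (value, next_position)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_fields(buf):
    """Yield (field_number, wire_type, value) for one message level."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field, wire = tag >> 3, tag & 7
        if wire == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire}")
        yield field, wire, value


def feature_summary(example_bytes):
    """Return [(key, kind)] for every map entry in a serialized Example."""
    summary = []
    for field, wire, features in iter_fields(example_bytes):
        if field != 1 or wire != 2:  # Example.features
            continue
        for f2, w2, entry in iter_fields(features):
            if f2 != 1 or w2 != 2:  # one entry of the Features.feature map
                continue
            key, kind = None, None
            for f3, w3, value in iter_fields(entry):
                if f3 == 1 and w3 == 2:  # map key
                    key = value.decode("utf-8")
                elif f3 == 2 and w3 == 2:  # map value (a Feature)
                    for f4, _, _ in iter_fields(value):
                        kind = KIND.get(f4, f"unknown field {f4}")
            summary.append((key, kind))
    return summary


def summarize_file(path):
    """Summarize the feature keys and kinds of every record in a file."""
    with open(path, "rb") as fh:
        return [feature_summary(p) for p in read_tfrecord_payloads(fh.read())]
```

Calling `summarize_file("example.tfrecord")` returns one list per record. If the file were encoded the way TensorFlow expects, each list should presumably contain one `(key, kind)` pair for each of the four features; fewer entries, or entries with a missing key or kind, would point at the protobuf encoding rather than the record framing.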

Could you please take a look at this? Unfortunately, I don't know enough about protobuf and TFRecord files to solve this issue myself.
