71 changes: 69 additions & 2 deletions docs/croissant-spec-draft.md
@@ -1201,14 +1201,58 @@ Commonly used atomic data types:
<td><a href="https://schema.org/Float">sc:Float</a></td>
<td>Describes a float.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/Float16">cr:Float16</a></td>
<td>Describes a float in half-precision floating-point format.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/Float32">cr:Float32</a></td>
<td>Describes a float in single-precision floating-point format.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/Float64">cr:Float64</a></td>
<td>Describes a float in double-precision floating-point format.</td>
</tr>
<tr>
<td><a href="https://schema.org/Integer">sc:Integer</a></td>
<td>Describes an integer.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/Int8">cr:Int8</a></td>
<td>Describes an 8-bit integer.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/Int16">cr:Int16</a></td>
<td>Describes a 16-bit integer.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/Int32">cr:Int32</a></td>
<td>Describes a 32-bit integer.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/Int64">cr:Int64</a></td>
<td>Describes a 64-bit integer.</td>
</tr>
<tr>
<td><a href="https://schema.org/Text">sc:Text</a></td>
<td>Describes a string.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/UInt8">cr:UInt8</a></td>
<td>Describes an 8-bit unsigned integer.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/UInt16">cr:UInt16</a></td>
<td>Describes a 16-bit unsigned integer.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/UInt32">cr:UInt32</a></td>
<td>Describes a 32-bit unsigned integer.</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/UInt64">cr:UInt64</a></td>
<td>Describes a 64-bit unsigned integer.</td>
</tr>
</table>
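
As an illustration of how one of the sized types above might be used, the sketch below declares a field whose values are 8-bit unsigned integers. The `@id` values, the file name, and the column name are hypothetical and chosen only for this example; only `cr:UInt8` is taken from the table.

```json
{
  "@type": "cr:Field",
  "@id": "recordset/pixel_value",
  "dataType": "cr:UInt8",
  "source": {
    "fileObject": { "@id": "data.csv" },
    "extract": { "column": "pixel_value" }
  }
}
```

A sized type such as `cr:UInt8` is presumably useful where the generic `sc:Integer` would leave the storage width unspecified.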

Other data types commonly used in ML datasets:
@@ -1219,13 +1263,17 @@ Other data types commonly used in ML datasets:
<th>Usage</th>
</thead>
<tr>
<td><a href="https://schema.org/AudioObject">sc:AudioObject</a></td>
<td>Represents a segment of audio as a digital sound recording. Refer to the section "ML-specific features > AudioObject".</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/BoundingBox">cr:BoundingBox</a></td>
<td>Describes the coordinates of a bounding box (4-number array). Refer to the section "ML-specific features > Bounding boxes".</td>
</tr>
<tr>
<td><a href="https://schema.org/ImageObject">sc:ImageObject</a></td>
<td>Describes a field containing the content of an image (pixels).</td>
</tr>
<tr>
<td><a href="http://mlcommons.org/schema/Split">cr:Split</a></td>
<td>Describes a RecordSet used to divide data into multiple sets according to intended usage with regards to models. Refer to the section "ML-specific features > Splits".</td>
@@ -1789,6 +1837,25 @@ Bounding boxes are common annotations in computer vision. They describe imaginar
}
```


### AudioObject

Croissant uses the Schema.org [AudioObject](https://schema.org/AudioObject) type to represent audio features. An `sc:AudioObject` field holds a segment of audio stored as a digital sound recording. Croissant adds the audio-specific `cr:samplingRate` attribute, which is set on the audio field's `source`:

```json
{
  "@type": "cr:Field",
  "@id": "recordset/audio",
  "dataType": "sc:AudioObject",
  "source": {
    "fileSet": { "@id": "files" },
    "extract": { "fileProperty": "content" },
    "samplingRate": "16000"
  }
}
```
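
The field above reads its content from a `FileSet` with the `@id` `files`, which this section does not define. A minimal sketch of what such a `FileSet` could look like follows; the `archive` container, the `audio/wav` encoding format, and the `*.wav` pattern are assumptions made for illustration and are not taken from the specification text.

```json
{
  "@type": "cr:FileSet",
  "@id": "files",
  "containedIn": { "@id": "archive" },
  "encodingFormat": "audio/wav",
  "includes": "*.wav"
}
```

Under these assumptions, each file matching the pattern would yield one record whose `recordset/audio` field holds that file's content.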


### SegmentationMask

Segmentation masks are common annotations in computer vision. They describe pixel-perfect zones that outline objects or groups of objects in images or videos. Croissant defines `cr:SegmentationMask`, which supports two ways of describing them: