Commit 0e00fa6

improve definition of Literal
1 parent 1d32c2d commit 0e00fa6

1 file changed

+111
-97
lines changed

spec/index.html

@@ -733,125 +733,139 @@ <h3>Literals</h3>
 <p>Literals are used for values such as strings, numbers, and dates.</p>
 
 <p>A <dfn data-local-lt="RDF literal">literal</dfn> in an <a>RDF graph</a> consists of
-two, three, or four elements, as follow:</p>
+two, three, or four elements, as follow.</p>
 
 <ol>
-<li>a <dfn>lexical form</dfn> consisting of a sequence of
-<a data-cite="I18N-GLOSSARY#dfn-code-point" class="lint-ignore">Unicode code points</a> [[!UNICODE]]
-which are <a data-cite="I18N-GLOSSARY#dfn-scalar-value">Unicode scalar values</a>,
-and therefore do not contain
-<a data-cite="I18N-GLOSSARY#dfn-surrogate" class="lint-ignore">Unicode surrogate code points</a></li>
-<li>a <dfn>datatype IRI</dfn>, being an <a>IRI</a>
+<li>A <dfn>lexical form</dfn>, being a [=string=].
+<li>A <dfn>datatype IRI</dfn>, being an <a>IRI</a>
 identifying a datatype that determines how the lexical form maps
-to a <a>literal value</a></li>
-<li>if and only if the <a>datatype IRI</a> is
-<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#langString</code>, a
+to a <a>literal value</a>.</li>
+<li>If and only if the <a>datatype IRI</a> is
+<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#langString</code> or
+<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#dirLangString</code>, a
 non-empty <dfn>language tag</dfn> as defined by [[!BCP47]]. The
 language tag MUST be well-formed according to
 <a data-cite="bcp47#section-2.2.9">section 2.2.9</a>
 of [[!BCP47]],
 and MUST be treated consistently, that is, in a case insensitive manner.
-Two language tags are the same if they only differ by case.</li>
-<li>if and only if the <a>datatype IRI</a> is
+Two [[!BCP47]]-complying strings that differ only by case represent the same [=language tag=].</li>
+<li>If and only if the <a>datatype IRI</a> is
 <code>http://www.w3.org/1999/02/22-rdf-syntax-ns#dirLangString</code>,
-a non-empty <a>language tag</a>
-that MUST be well-formed according to <a data-cite="bcp47#section-2.2.9">section 2.2.9</a>
-of [[!BCP47]],
-and MUST be treated consistently, that is, in a case insensitive manner,
-and a <dfn>base direction</dfn> that MUST be either `ltr` or `rtl`.</li>
+a <dfn>base direction</dfn> that MUST be either<ul>
+<li>`ltr`, indicating that the initial text direction is set to left-to-right, or</li>
+<li>`rtl`, indicating that the initial text direction is set to right-to-left.</li>
+</ul></li>
 </ol>
 
 <p>A literal is a <dfn>language-tagged string</dfn> if the third element
 is present and the fourth element is not present.
-Lexical representations of language tags MAY be case normalized,
-(for example, by canonicalizing as defined by
-<a data-cite="bcp47#section-4.5">BCP 47 section 4.5</a>).
-</p>
-
-<p>A literal is a <dfn id="dfn-dir-lang-string">directional language-tagged string</dfn>
+A literal is a <dfn id="dfn-dir-lang-string">directional language-tagged string</dfn>
 if both the third element and fourth elements are present.
-The third element, the language tag, is treated identically as in a <a>language-tagged string</a>,
-and the fourth element, <a>base direction</a>, MUST be either `ltr` or `rtl`, which MUST be in lower case.</p>
-
-<p>The meanings of the <a>base direction</a> values are:</p>
-<ul>
-<li>`ltr`: indicates that the initial text direction is set to left-to-right.</li>
-<li>`rtl`: indicates that the initial text direction is set to right-to-left.</li>
-</ul>
-
-<p>Please note that concrete syntaxes MAY support
-<dfn data-lt="simple literal" class="export">simple literals</dfn> consisting of only a
-<a>lexical form</a> without any <a>datatype IRI</a>, <a>language tag</a>, or <a>base direction</a>.
-Simple literals are syntactic sugar for abstract syntax
-<a>literals</a>
-with the <a>datatype IRI</a>
-<code>http://www.w3.org/2001/XMLSchema#string</code>
-(which is commonly abbreviated as <code>xsd:string</code>).
-Similarly, most concrete syntaxes represent
-<a>language-tagged strings</a> and <a>directional language-tagged strings</a> without
-the <a>datatype IRI</a> because it always equals either
-<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#langString</code> (<code>rdf:langString</code>)
-or <code>http://www.w3.org/1999/02/22-rdf-syntax-ns#dirLangString</code> (<code>rdf:dirLangString</code>), respectively.</p>
-
-<p>The <dfn>literal value</dfn> associated with a <a>literal</a> is:</p>
-
-<ul>
-<li>If the literal is a <a>language-tagged string</a>,
-then the literal value is a pair consisting of its <a>lexical form</a>
-and its <a>language tag</a>, in that order.</li>
-<li>If the literal is a <a>directional language-tagged string</a>, then the literal value is
-a tuple of its <a>lexical form</a>, its <a>language tag</a>, and its <a>base direction</a>,
-likewise in that order.</li>
-<li>If the literal's <a>datatype</a> is handled by an RDF implmentation,
-<ul>
-<li>if the literal's <a>lexical form</a> is in the <a>lexical space</a>
-of the <a>datatype</a>, then the literal value is the result of applying
-the <a>lexical-to-value mapping</a> of the datatype to the
-<a>lexical form</a>.</li>
-<li>otherwise, the literal is <dfn data-lt-no-plural>ill-typed</dfn> and no literal value can be
-associated with the literal. Such a case produces a semantic
-inconsistency but is not <em>syntactically</em> ill-formed.
-Implementations SHOULD accept [=ill-typed=] literals and produce RDF
-graphs from them. Implementations MAY produce warnings when
-encountering [=ill-typed=] literals.</li>
-</ul>
-</li>
-<li>If the literal's <a>datatype IRI</a> is <em>not</em>
-handled by an RDF implementation, then the literal value is
-not defined by this specification. Implementations SHOULD accept
-literals with unknown datatype IRIs and produce RDF graphs from them.
-</li>
-</ul>
+</p>
 
 <p><dfn data-local-lt="term-equal">Literal term equality</dfn>:
-Two literals are term-equal (the same <a>RDF literal</a>)
+two literals are term-equal (the same <a>RDF term</a>)
 if and only if:</p>
 
 <ul>
-<li>the two <a>lexical forms</a> compare equal</li>
-<li>the two <a>datatype IRIs</a> compare equal</li>
-<li>the two <a>language tags</a> (if any) compare equal</li>
-<li>the two <a>base directions</a> (if any) compare equal</li>
+<li>the two <a>lexical forms</a> compare equal,</li>
+<li>the two <a>datatype IRIs</a> compare equal,</li>
+<li>the two <a>language tags</a> are either both absent, or both present and compare equal,</li>
+<li>the two <a>base directions</a> are either both absent, both `ltr`, or both `rtl`.</li>
 </ul>
-<p>Comparison is performed using
+<p>Comparison of the [=lexical forms=] and of the [=datatype IRIs=] is performed using
 <a data-cite="I18N-GLOSSARY#dfn-case-sensitive">case sensitive matching</a>
-(see description of string comparison in
-<a href="#rdf-strings" class="sectionRef"></a>)
-except for language tags, where the comparison is performed using
+(see description of string comparison in
+<a href="#rdf-strings" class="sectionRef"></a>).
+Comparison of the [=language tags=] is performed using
 <a data-cite="I18N-GLOSSARY#dfn-case-sensitive">ASCII case-insensitive matching</a>.
-Thus, two literals can have the same value
-without being the same <a>RDF term</a>.
-For example:</p>
-
-<pre>
-"1"^^xs:integer
-"01"^^xs:integer
-</pre>
-
-<p>denote the same <a data-lt="literal value">value</a>, but are not the
-same literal <a>RDF terms</a> and are not
-<a>term-equal</a> because their
-<a>lexical forms</a> differ.</p>
+</p>
+
+<section>
+<h2>Representation of literals</h2>
+
+<p>Some concrete syntaxes MAY support
+<dfn data-lt="simple literal" class="export">simple literals</dfn> consisting of only a
+<a>lexical form</a> without any <a>datatype IRI</a>, <a>language tag</a>, or <a>base direction</a>.
+Simple literals are syntactic sugar for abstract syntax
+<a>literals</a>
+with the <a>datatype IRI</a>
+<code>http://www.w3.org/2001/XMLSchema#string</code>
+(which is commonly abbreviated as <code>xsd:string</code>).
+</p>
+
+<p>
+Similarly, most concrete syntaxes represent
+<a>language-tagged strings</a> and <a>directional language-tagged strings</a> without
+the <a>datatype IRI</a> because it always equals either
+<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#langString</code> (<code>rdf:langString</code>)
+or <code>http://www.w3.org/1999/02/22-rdf-syntax-ns#dirLangString</code> (<code>rdf:dirLangString</code>), respectively.
+</p>
+
+<p>
+Any [=string=] complying with [[!BCP47]] MAY be used to represent a [=language tag=] in concrete syntaxes or implementation.
+Such strings MAY be case normalized,
+(for example, by canonicalizing as defined by
+<a data-cite="bcp47#section-4.5">BCP 47 section 4.5</a>).
+On the contrary, an implementation MAY preserve the case from the original representation,
+provided that it processes it in a case-insensitive manner.
+</p>
+
+<aside class=note>
+The treatment of language tags has changed between RDF 1.1 and RDF 1.2.
+In RDF 1.1, `"chat"@fr` and `"chat"@FR` were representing two distinct terms, but implementations had license to replace one with the other (which most did).
+In RDF 1.2, they are now representing the exact same literal, i.e., the case difference in the concrete syntax does not propagate into the abstract syntax.
+</aside>
+
+</section>
+
+<section>
+<h2>Literal value</h2>
+
+<p>The <dfn>literal value</dfn> associated with a <a>literal</a> is defined as follow.</p>
+
+<ul>
+<li>If the literal is a <a>language-tagged string</a>,
+then the literal value is a pair consisting of its <a>lexical form</a>
+and its <a>language tag</a>, in that order.</li>
+<li>If the literal is a <a>directional language-tagged string</a>, then the literal value is
+a tuple of its <a>lexical form</a>, its <a>language tag</a>, and its <a>base direction</a>,
+likewise in that order.</li>
+<li>If the literal's <a>datatype</a> is handled by an RDF implementation,
+<ul>
+<li>if the literal's <a>lexical form</a> is in the <a>lexical space</a>
+of the <a>datatype</a>, then the literal value is the result of applying
+the <a>lexical-to-value mapping</a> of the datatype to the
+<a>lexical form</a>.</li>
+<li>otherwise, the literal is <dfn data-lt-no-plural>ill-typed</dfn> and no literal value can be
+associated with the literal. Such a case produces a semantic
+inconsistency but is not <em>syntactically</em> ill-formed.
+Implementations SHOULD accept [=ill-typed=] literals and produce RDF
+graphs from them. Implementations MAY produce warnings when
+encountering [=ill-typed=] literals.</li>
+</ul>
+</li>
+<li>If the literal's <a>datatype IRI</a> is <em>not</em>
+handled by an RDF implementation, then the literal value is
+not defined by this specification. Implementations SHOULD accept
+literals with unknown datatype IRIs and produce RDF graphs from them.
+</li>
+</ul>
+
+<p>
+Thus, two literals can have the same value
+without being the same <a>RDF term</a>.
+For example:</p>
+
+<pre>
+"1"^^xsd:integer
+"01"^^xsd:integer
+</pre>
+
+<p>denote the same <a data-lt="literal value">value</a>, but are not the
+same literal <a>RDF terms</a> because their
+<a>lexical forms</a> differ.</p>
+</section>
 </section>
 
 <section id="section-blank-nodes">
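
As a reading aid for the revised definition, here is a minimal Python sketch of the four-element literal and the term-equality rule as reworded in this commit. It is not part of the commit and is not taken from any RDF library: the `Literal` class, its field names, and the `term_equal` helper are hypothetical names chosen for illustration, and BCP 47 well-formedness is deliberately not checked (only non-emptiness is).

```python
import string
from dataclasses import dataclass
from typing import Optional

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Translation table for ASCII-only lowercasing (hypothetical helper).
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


@dataclass(frozen=True, eq=False)
class Literal:
    """Hypothetical model of an RDF literal with two to four elements."""
    lexical_form: str
    datatype_iri: str
    language_tag: Optional[str] = None
    base_direction: Optional[str] = None

    def __post_init__(self):
        has_lang = self.language_tag is not None
        has_dir = self.base_direction is not None
        # A language tag is present iff the datatype is langString or dirLangString.
        if has_lang != (self.datatype_iri in (RDF + "langString", RDF + "dirLangString")):
            raise ValueError("language tag does not match datatype IRI")
        if has_lang and not self.language_tag:
            raise ValueError("language tag must be non-empty")
        # A base direction is present iff the datatype is dirLangString.
        if has_dir != (self.datatype_iri == RDF + "dirLangString"):
            raise ValueError("base direction does not match datatype IRI")
        if has_dir and self.base_direction not in ("ltr", "rtl"):
            raise ValueError("base direction must be 'ltr' or 'rtl'")


def _fold(tag: Optional[str]) -> Optional[str]:
    """ASCII case-fold a language tag, keeping None for an absent tag."""
    return None if tag is None else tag.translate(_ASCII_LOWER)


def term_equal(a: Literal, b: Literal) -> bool:
    """Literal term equality: case-sensitive except for language tags."""
    return (
        a.lexical_form == b.lexical_form
        and a.datatype_iri == b.datatype_iri
        and _fold(a.language_tag) == _fold(b.language_tag)
        and a.base_direction == b.base_direction
    )
```

Under these assumptions, `Literal("chat", RDF + "langString", "fr")` and the same literal with tag `"FR"` are term-equal, while `"1"` and `"01"` typed as `xsd:integer` are not, even though an implementation that handles `xsd:integer` would map both to the same value.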
