Here we get constructions like:
# sent_id = CESS-CAST-AA-20000203-2543-s1
# text = El partido aplazado de la vigésima tercera jornada de la Liga de División de Honor de fútbol sala Playas Castellón-Caja San Fernando se disputará el martes 29 de febrero a las 20.45 horas.
30 a a ADP _ _ 32 case 32:case _
31 las el DET _ Definite=Def|Gender=Fem|Number=Plur|PronType=Art 32 det 32:det _
32 20.45 20.45 NUM _ NumForm=Digit|NumType=Card 26 compound 26:compound _
33 horas horas NOUN _ _ 26 compound 26:compound Entity=CESSCASTAA200002032543c2)|SpaceAfter=No
In GSD, instead, we get:
# sent_id = es-train-005-s288
# text = En concreto, la alerta se ha activado para la Cordillera Cantábrica leonesa y la comarca zamorana de Sanabria, en las que permanecerá en vigor entre las 00.00 y las 15.00 horas del martes.
31 las el DET _ Definite=Def|Gender=Fem|Number=Plur|PronType=Art 33 det _ _
32 15.00 15.00 NUM _ NumForm=Digit|NumType=Card 33 nummod _ _
33 horas hora NOUN _ Gender=Fem|Number=Plur 29 conj _ _
34-35 del _ _ _ _ _ _ _ _
34 de de ADP _ _ 36 case _ _
35 el el DET _ Definite=Def|Gender=Masc|Number=Sing|PronType=Art 36 det _ _
36 martes martes NOUN _ _ 29 nmod _ SpaceAfter=No
I don't care too much which standard we follow, but it would be nicer for the models if "horas" had the same lemma in all contexts, so if it were up to me I'd follow the GSD model and lemmatize everything as "hora".
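If we go the GSD way, the change could probably be scripted. Below is a rough sketch, not a tested migration: `normalize_hora_lemma` is a hypothetical helper name, and it assumes standard tab-separated CoNLL-U with 10 columns per token line (the excerpts above show the columns space-separated only because of how they were pasted). It rewrites only the LEMMA column of NOUN tokens whose form is "horas" and whose lemma is currently "horas", and leaves FEATS and everything else alone.

```python
def normalize_hora_lemma(conllu_text: str) -> str:
    """Rewrite LEMMA 'horas' -> 'hora' on NOUN tokens with FORM 'horas'.

    Hypothetical helper; assumes tab-separated CoNLL-U (10 columns per token line).
    """
    out = []
    for line in conllu_text.split("\n"):
        cols = line.split("\t")
        # Comment lines and blank lines never have 10 tab-separated columns
        # in practice, but the '#' check makes the intent explicit.
        if len(cols) == 10 and not line.startswith("#"):
            form, lemma, upos = cols[1], cols[2], cols[3]
            if upos == "NOUN" and form.lower() == "horas" and lemma == "horas":
                cols[2] = "hora"  # only the LEMMA column changes
                line = "\t".join(cols)
        out.append(line)
    return "\n".join(out)
```

Something like `normalize_hora_lemma(open(path, encoding="utf-8").read())` would then give the rewritten file contents, where `path` is whichever treebank file is being fixed. Adding `Gender=Fem|Number=Plur` to match GSD would need a separate pass.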