Timing and value discrepancies found in LSS files (>=1.6.0) #890

@jaspersiebring

Some necessary context first.

I occasionally dabble in speedrunning, albeit poorly and under a different alias, and wanted more insight into my progress. Long story short, this got out of hand and I ended up mapping out the entire LSS file format as part of a Python library called saltysplits.

One of its features is that it makes it very easy to access and validate each element and attribute of a given LSS file (individually and in relation to each other). And although I'm finding it difficult not to make this sound like self-promotion (it's really not), I do believe that I stumbled upon three possible discrepancies that are worth sharing with the LiveSplit team:

  1. It is currently possible to have AttemptHistory.Attempt entries without SegmentHistory.Time entries and vice versa
  2. attempt_count does not always match the number of actual attempts
  3. Time information from a run's AttemptHistory.Attempt and SegmentHistory.Time entries doesn't add up to the same timedelta

I will be providing some code examples below to illustrate these findings. They assume you've installed saltysplits and cloned the livesplit-core repository (we'll be using some of the LSS files included in it). If you want, you can run the same code examples with your own LSS files (>=1.6.0). If they use an older format, just drag them into LiveSplit and export them again with Save Splits As....

import saltysplits as ss
import pandas as pd
from saltysplits import TimeType
from IPython.display import display  # only needed when running outside a notebook

# change these paths to wherever you cloned the livesplit-core repo 
CELESTE_PATH = "YOUR_REPOS/livesplit-core/tests/run_files/Celeste - Any% (1.2.1.5).lss"
SM64_PATH = "YOUR_REPOS/livesplit-core/tests/run_files/clean_sum_of_best.lss"

1. It is currently possible to have AttemptHistory.Attempt entries without SegmentHistory.Time entries and vice versa

# after passing validation, all LiveSplit elements and attributes are accessible with dot notation
# (see https://github.com/jaspersiebring/saltysplits/blob/main/src/saltysplits/models.py)
splits = ss.read_lss(lss_path=SM64_PATH)

# gather run IDs across all `AttemptHistory.Attempt` and `SegmentHistory.Time` entries
run_ids_from_attempts = {attempt.id for attempt in splits.attempt_history}
run_ids_from_segments = {time_entry.id for segment in splits.segments for time_entry in segment.segment_history}

# find all `AttemptHistory.Attempt` entries without `SegmentHistory.Time` entries *and* vice versa
attempts_without_times = run_ids_from_attempts - run_ids_from_segments
times_without_attempts = run_ids_from_segments - run_ids_from_attempts

if attempts_without_times:
    print(f"{len(attempts_without_times)} AttemptHistory.Attempt entries without SegmentHistory.Time entries found with the following run IDs:\n{sorted(attempts_without_times, key=int)}")
if times_without_attempts:
    print(f"{len(times_without_attempts)} SegmentHistory.Time entries without AttemptHistory.Attempt entries found with the following run IDs:\n{sorted(times_without_attempts, key=int)}")
109 AttemptHistory.Attempt entries without SegmentHistory.Time entries found with the following run IDs:
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92', '93', '94', '95', '96', '97', '98', '99', '100', '106', '108', '110', '112', '113', '114', '117', '118', '123']
2 SegmentHistory.Time entries without AttemptHistory.Attempt entries found with the following run IDs:
['-1', '0']

This shows that clean_sum_of_best.lss has run attempts without splits and splits without run attempts. The former makes intuitive sense: it simply means that a run attempt was started and stopped before making it through the first segment (i.e. an early reset). The latter, however, does not. How can you have a split that isn't part of an attempt?

Looking at the printed IDs, one can imagine that the SegmentHistory.Time entries associated with run -1 just belong to run 1 and that the sign has somehow been written erroneously (this would already constitute a bug of sorts but not a major one). This however does not explain SegmentHistory.Time entries associated with run 0. Where do these come from?

You can verify these findings by inspecting the LSS file itself in any text editor (formatting it as XML first makes it much easier to read). Notice how, in the excerpt below, the AttemptHistory element has no Attempt elements with run IDs -1 and 0.

<?xml version="1.0" encoding="UTF-8"?>
<Run version="1.8.0">
    <GameIcon/>
    <GameName>SM64: 120 Star</GameName>
    <CategoryName>100%</CategoryName>
    <Metadata>
        <Run id=""/>
        <Platform usesEmulator="False"/>
        <Region/>
        <SpeedrunComVariables/>
        <CustomVariables/>
    </Metadata>
    <LayoutPath/>
    <Offset>00:00:00.000000000</Offset>
    <AttemptCount>102</AttemptCount>
    <AttemptHistory>
        <Attempt id="1"/>
        <Attempt id="2">
            <RealTime>02:31:30.375976600</RealTime>
        </Attempt>

And here is one example of a Segments.Segment element with SegmentHistory.Time elements for runs -1 and 0, even though no AttemptHistory.Attempt elements exist for these runs.

<Segment>
    <Name>HMC (52)</Name>
    <Icon/>
    <SplitTimes>
        <SplitTime name="Personal Best">
            <RealTime>01:12:40.402200400</RealTime>
            <GameTime>01:12:40.402200400</GameTime>
        </SplitTime>
    </SplitTimes>
    <BestSegmentTime>
        <RealTime>00:11:01.725424300</RealTime>
        <GameTime>00:09:11.204527300</GameTime>
    </BestSegmentTime>
    <SegmentHistory>
        <Time id="-1">
            <GameTime>00:20:35.827574200</GameTime>
        </Time>
        <Time id="0">
            <GameTime>00:09:11.204527300</GameTime>
        </Time>
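
The same cross-check can be reproduced without saltysplits. What follows is a minimal sketch using only Python's standard library. The embedded XML is a trimmed, hand-closed stand-in modelled on the two excerpts above (not the full file), and `orphaned_ids` is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in modelled on the excerpts above; the real file is much longer.
SAMPLE_LSS = """
<Run version="1.8.0">
    <AttemptHistory>
        <Attempt id="1"/>
        <Attempt id="2">
            <RealTime>02:31:30.375976600</RealTime>
        </Attempt>
    </AttemptHistory>
    <Segments>
        <Segment>
            <Name>HMC (52)</Name>
            <SegmentHistory>
                <Time id="-1">
                    <GameTime>00:20:35.827574200</GameTime>
                </Time>
                <Time id="0">
                    <GameTime>00:09:11.204527300</GameTime>
                </Time>
            </SegmentHistory>
        </Segment>
    </Segments>
</Run>
"""


def orphaned_ids(root):
    """Return (attempt IDs without any time entry, time IDs without an attempt)."""
    attempt_ids = {a.get("id") for a in root.iterfind("AttemptHistory/Attempt")}
    time_ids = {t.get("id") for t in root.iterfind("Segments/Segment/SegmentHistory/Time")}
    return attempt_ids - time_ids, time_ids - attempt_ids


root = ET.fromstring(SAMPLE_LSS.strip())
attempts_without_times, times_without_attempts = orphaned_ids(root)
print(sorted(attempts_without_times, key=int))  # ['1', '2']
print(sorted(times_without_attempts, key=int))  # ['-1', '0']
```

To run it against a real file, `ET.parse(path).getroot()` can replace the `ET.fromstring` call.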

2. attempt_count does not always match the number of actual attempts

Accessing data through dot notation (as shown above) is nice if you're already familiar with the LSS structure, but a more spreadsheet-like representation is also available through saltysplits' to_df method.

splits = ss.read_lss(lss_path = CELESTE_PATH)

# example of dot notation access:
splits.attempt_count  # this just returns the value stored in the attempt_count attribute; it is *not* (re)computed

# example of dataframe access:
splits_dataframe = splits.to_df()
display(splits_dataframe.iloc[:5, :5]) # display the first 5 "segments" for the first 5 "runs"

# the index of the dataframe are the segment names and the columns are the run IDs
print(f"The first five segment names: {splits_dataframe.index.tolist()[:5]}")
print(f"The first five run IDs: {splits_dataframe.columns.tolist()[:5]}")
                                    1                       3                       6                       7                      10
Prologue       0 days 00:00:50.685000  0 days 00:00:51.323000  0 days 00:00:48.631000  0 days 00:00:49.326000  0 days 00:00:48.903000
-Crossing      0 days 00:00:48.657000  0 days 00:00:39.141000  0 days 00:00:41.118000  0 days 00:00:37.362000  0 days 00:00:35.691000
-Chasm         0 days 00:00:50.744000  0 days 00:00:36.429000  0 days 00:00:37.781000  0 days 00:00:33.885000  0 days 00:00:34.434000
Forsaken City  0 days 00:00:40.815000  0 days 00:00:40.650000  0 days 00:00:30.692000  0 days 00:00:48.892000  0 days 00:00:37.156000
-Intervention  0 days 00:01:17.270000  0 days 00:01:10.910000  0 days 00:01:06.674000  0 days 00:01:08.429000  0 days 00:01:05.673000
The first five segment names: ['Prologue', '-Crossing', '-Chasm', 'Forsaken City', '-Intervention']
The first five run IDs: ['1', '3', '6', '7', '10']

As shown before, "runs" don't necessarily have both time and attempt entries associated with them (run IDs can come from either one). Because of this, what constitutes an "attempt" is somewhat ambiguous. That's ultimately why we have the allow_empty and allow_partial flags in to_df, so people can control what they consider a "run" (and an "attempt" at one).
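
To make these definitions concrete, here is a toy sketch in plain Python. It is not saltysplits' implementation: the `runs` data and the `count_runs` helper are made up for illustration, and the flag semantics are assumed from how the flags are described in this issue.

```python
from datetime import timedelta

# Hypothetical data: run ID -> one entry per segment (None = no SegmentHistory.Time entry).
runs = {
    "1": [timedelta(seconds=50), timedelta(seconds=48)],  # completed run
    "2": [timedelta(seconds=51), None],                   # reset after the first segment
    "3": [None, None],                                    # reset before the first split
}


def count_runs(runs, allow_empty, allow_partial):
    """Count runs under the assumed meaning of the two flags."""
    count = 0
    for times in runs.values():
        recorded = [t for t in times if t is not None]
        if not recorded:
            keep = allow_empty      # no time entries at all
        elif len(recorded) < len(times):
            keep = allow_partial    # some, but not all, segments have a time entry
        else:
            keep = True             # every segment has a time entry
        count += keep
    return count


print(count_runs(runs, allow_empty=True, allow_partial=True))    # 3
print(count_runs(runs, allow_empty=False, allow_partial=True))   # 2
print(count_runs(runs, allow_empty=False, allow_partial=False))  # 1
```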

# here's how you'd compute the attempt_count for runs with 0 or more time entries
attempt_count_with_0_or_more_time_entries = splits.to_df(allow_empty=True, allow_partial=True).columns.size

# here's how you'd compute the attempt_count for runs with 1 or more time entries
attempt_count_with_1_or_more_time_entries = splits.to_df(allow_empty=False, allow_partial=True).columns.size

# here's how you'd compute the total number of completed runs (which would then no longer be attempts)
#completed_run_count = splits.to_df(allow_empty=False, allow_partial=False).columns.size

print(f"The attempt_count attribute value in the LSS file: {splits.attempt_count}")
print(f"The attempt_count for \"attempts\" with 0 or more time entries: {attempt_count_with_0_or_more_time_entries}")
print(f"The attempt_count for \"attempts\" with 1 or more time entries: {attempt_count_with_1_or_more_time_entries}")
The attempt_count attribute value in the LSS file: 32
The attempt_count for "attempts" with 0 or more time entries: 31
The attempt_count for "attempts" with 1 or more time entries: 25

As shown above for the Celeste - Any% (1.2.1.5).lss file, neither of these "attempt" definitions yields the attempt_count value that LiveSplit wrote to the LSS file (32), so that value doesn't appear to reflect the actual attempts made.
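
For anyone who wants to check their own files without saltysplits, a rough standard-library sketch of the same comparison could look like this. The miniature XML and the `attempt_count_report` helper are hypothetical; only the AttemptCount, AttemptHistory and Attempt element names are taken from the LSS excerpt earlier in this issue.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature file: the stored counter says 3, but only 2 attempts are recorded.
TOY_LSS = """
<Run version="1.8.0">
    <AttemptCount>3</AttemptCount>
    <AttemptHistory>
        <Attempt id="1"/>
        <Attempt id="3"/>
    </AttemptHistory>
</Run>
"""


def attempt_count_report(root):
    """Compare the stored AttemptCount with what AttemptHistory actually contains."""
    stored = int(root.findtext("AttemptCount"))
    ids = [int(a.get("id")) for a in root.iterfind("AttemptHistory/Attempt")]
    return {"stored": stored, "recorded": len(ids), "highest_id": max(ids, default=0)}


report = attempt_count_report(ET.fromstring(TOY_LSS.strip()))
print(report)  # {'stored': 3, 'recorded': 2, 'highest_id': 3}
```

Reporting the highest ID next to the number of recorded entries makes gaps in the ID sequence visible; whether such gaps explain the mismatch in real files is not something this sketch establishes.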

3. Time information from a run's AttemptHistory.Attempt and SegmentHistory.Time entries doesn't add up to the same timedelta

Given all LSS elements and attributes for a completed run, here are three ways to find its completion time:

  • You can take the run's AttemptHistory.Attempt entry and read the cumulative time from either its RealTime or GameTime element.
  • You can go through all Segments.Segment entries, collect all SegmentHistory.Time entries that share this run's id, and sum up their RealTime or GameTime elements.
  • You can take the run's AttemptHistory.Attempt entry and subtract its started attribute from its ended attribute (this does not include nanoseconds and won't be used here).
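
The first two methods can be sketched with the standard library alone. This is an illustration under assumptions, not how saltysplits or LiveSplit do it: the XML is a made-up two-segment run, `parse_ns`, `run_time_from_attempt` and `run_time_from_segments` are hypothetical helpers, and times are assumed to be in `HH:MM:SS.fffffffff` form without a day component.

```python
import xml.etree.ElementTree as ET

# Made-up two-segment run; the attempt total is deliberately 1 microsecond off.
TOY_LSS = """
<Run version="1.8.0">
    <AttemptHistory>
        <Attempt id="1">
            <RealTime>00:01:39.342001000</RealTime>
        </Attempt>
    </AttemptHistory>
    <Segments>
        <Segment>
            <SegmentHistory>
                <Time id="1"><RealTime>00:00:50.685000000</RealTime></Time>
            </SegmentHistory>
        </Segment>
        <Segment>
            <SegmentHistory>
                <Time id="1"><RealTime>00:00:48.657000000</RealTime></Time>
            </SegmentHistory>
        </Segment>
    </Segments>
</Run>
"""


def parse_ns(text):
    """Convert 'HH:MM:SS.fffffffff' to integer nanoseconds (assumes no day component)."""
    hms, _, frac = text.partition(".")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    nanos = int(frac.ljust(9, "0")[:9]) if frac else 0
    return ((hours * 60 + minutes) * 60 + seconds) * 1_000_000_000 + nanos


def run_time_from_attempt(root, run_id, tag="RealTime"):
    """Method 1: read the cumulative time stored on the Attempt element."""
    for attempt in root.iterfind("AttemptHistory/Attempt"):
        if attempt.get("id") == run_id:
            return parse_ns(attempt.findtext(tag))
    raise KeyError(run_id)


def run_time_from_segments(root, run_id, tag="RealTime"):
    """Method 2: sum the per-segment times that carry the same run ID."""
    total = 0
    for time_entry in root.iterfind("Segments/Segment/SegmentHistory/Time"):
        text = time_entry.findtext(tag)
        if time_entry.get("id") == run_id and text is not None:
            total += parse_ns(text)
    return total


root = ET.fromstring(TOY_LSS.strip())
difference_ns = run_time_from_attempt(root, "1") - run_time_from_segments(root, "1")
print(difference_ns)  # 1000
```

Working in integer nanoseconds avoids any floating-point rounding in the comparison itself.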

Ideally, at least the first two methods would produce the same answer. It turns out that's rarely the case (at least for the handful of >=1.6.0 LSS files that I sampled from splits.io).

Maybe there's time between creating that run's last SegmentHistory.Time element and updating the relevant attributes/elements in its associated AttemptHistory.Attempt element?

splits = ss.read_lss(lss_path = CELESTE_PATH)
splits_dataframe = splits.to_df(allow_partial=False, time_type=TimeType.REAL_TIME)
run_ids = splits_dataframe.columns.to_list() 

# `to_df` already collects all `SegmentHistory.Time` entries per run ID, we only have to sum them up
times_from_segments = splits_dataframe.sum(axis=0)

# retrieving run times from each `AttemptHistory.Attempt` through dot notation and list comprehension
times_from_attempts = pd.Series({attempt.id: attempt.real_time for attempt in splits.attempt_history if attempt.id in run_ids}, index=run_ids)

# computing their absolute differences and combining everything into a single dataframe
time_differences = times_from_attempts - times_from_segments
time_differences = time_differences.abs()
times_dataframe = pd.DataFrame([times_from_attempts, times_from_segments, time_differences], index=["Time from Attempts", "Time from Segments", "Differences"]).T

display(times_dataframe)

print(f"The difference between a run's total time from AttemptHistory.Attempt and from its SegmentHistory.Time entries has a standard deviation of {time_differences.std().total_seconds()} seconds")
        Time from Attempts      Time from Segments             Differences
1   0 days 01:04:31.170000  0 days 01:04:14.118999  0 days 00:00:17.051001
3   0 days 00:59:28.965000  0 days 00:59:12.280998  0 days 00:00:16.684002
6   0 days 00:47:58.618000  0 days 00:47:58.617995  0 days 00:00:00.000005
7   0 days 00:50:12.448000  0 days 00:50:12.447998  0 days 00:00:00.000002
10  0 days 00:47:10.743000  0 days 00:47:10.742998  0 days 00:00:00.000002
11  0 days 00:44:14.143000  0 days 00:44:14.142997  0 days 00:00:00.000003
14  0 days 00:42:55.145000  0 days 00:42:55.144998  0 days 00:00:00.000002
19  0 days 00:41:28.546000  0 days 00:41:28.545998  0 days 00:00:00.000002
28  0 days 00:40:44.782000  0 days 00:40:44.782000         0 days 00:00:00
31  0 days 00:39:12.517000  0 days 00:39:12.516999  0 days 00:00:00.000001
The difference between a run's total time from AttemptHistory.Attempt and from its SegmentHistory.Time entries has a standard deviation of 7.112488 seconds
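
A note on reading the 7.112488 figure: it comes from `time_differences.std()`, so it is the sample standard deviation of the differences rather than their mean. A quick check with the standard library, using the values from the "Differences" column above typed in by hand as seconds, shows both numbers side by side:

```python
from statistics import mean, stdev

# Absolute differences in seconds, copied from the "Differences" column above.
differences = [
    17.051001, 16.684002, 0.000005, 0.000002, 0.000002,
    0.000003, 0.000002, 0.000002, 0.0, 0.000001,
]

print(f"mean:  {mean(differences):.6f} s")   # mean:  3.373502 s
print(f"stdev: {stdev(differences):.6f} s")  # stdev: 7.112488 s (sample standard deviation, as in pandas' default)
```

Both figures are dominated by runs 1 and 3, which differ by roughly 17 seconds each; the remaining runs differ by at most a few microseconds.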

Lastly, I want to emphasize that we don't compute any attributes or elements; we just mapped them out with appropriate types and annotations. In fact, we test to ensure that the encoding and decoding of all (standardized) elements and attributes is lossless (i.e. if you were to dump any saltysplits model back to XML, it would be identical to its original XML representation).

All of this to say, the values and claims here accurately reflect what's currently possible content-wise in LSS files (>=1.6.0).
