Modifying ranking components #40

MischiefCS · 2025-10-23T18:34:41Z

Aim

The VRS has now been through a full VRS-regulated period, with VRS-mandated invites and rules, as well as a full major cycle and qualification period. Teams have wised up and now operate with the VRS in higher regard, which has meant both more correct decision making and more attempts to game the system. With higher-quality output data and a clearer picture of the future VRS ecosystem, this PR attempts to provide an improved model that matches the current context through slight adjustments to the component balance. It's important to note that the current VRS model has been shown to work effectively and has done an exceptional job of allocating invites for the major. This is more a philosophical reinterpretation of the components and the balance that makes up the distribution than a major change.

Current VRS situation

Figure 1 - Current winrate fit using matchdata up to date as of 06/10/25 and current model

The current fit, using matchdata_sample_20250510.json, is shown in Figure 1 above. It is a worse fit than the original graph in the README and a reduction in performance compared to previously evaluated months, but this isn't necessarily indicative of an issue. When calculating Brier scores at the event level, there is a clear pattern: the events with high Brier scores are open LANs, notably Birch Cup (Brier = 0.4083). Birch Cup likely has the highest Brier score because of the large number of BO1s throughout the format, as well as new teams with limited ranking information.
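
For readers who want to reproduce the event-level comparison, here is a minimal sketch of a per-event Brier score (the mean squared difference between the predicted win probability and the 0/1 outcome). The record shape (`event`, `expectedWinRate`, `won`) and the sample values are assumptions made for illustration; they are not the format of the real matchdata or the evaluation script used for the figures.

```javascript
// Sketch only: the prediction record shape { event, expectedWinRate, won }
// is hypothetical and not taken from the VRS matchdata format.
function brierByEvent(predictions) {
  const totals = new Map();
  for (const { event, expectedWinRate, won } of predictions) {
    const outcome = won ? 1 : 0;
    const entry = totals.get(event) || { sum: 0, count: 0 };
    entry.sum += (expectedWinRate - outcome) ** 2; // squared error for this match
    entry.count += 1;
    totals.set(event, entry);
  }
  const scores = {};
  for (const [event, { sum, count }] of totals) {
    scores[event] = sum / count; // mean squared error per event
  }
  return scores;
}

// Made-up sample data.
const brierSample = [
  { event: "Open LAN", expectedWinRate: 0.8, won: false },
  { event: "Open LAN", expectedWinRate: 0.6, won: true },
  { event: "Closed event", expectedWinRate: 0.7, won: true },
];
const brierScores = brierByEvent(brierSample);
console.log(brierScores);
```

In this made-up sample the open LAN scores roughly 0.40 and the closed event roughly 0.09. Lower is better, and always predicting 0.5 scores 0.25.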

Additionally, at the same time as the recent LAN rush there was a void of the typical teams in online events due to clashing schedules, which led to a further trickle-down in online invites and a greater variety of teams. During this period there were also noticeably higher Brier scores for online events, likely for the same reason.

Due to the recent major rush, the recent match period is far more varied than usual, with opponents who don't typically meet, and this has led to unpredictable results from both the VRS and other prediction metrics. This fluidity has enabled more extreme matchups than usual, both where teams of differing ranks can interact and where brand new teams get opportunities to meet established teams. A somewhat worse fit with volatile data isn't necessarily indicative of a model issue.

While the difference in fit isn't particularly indicative, the balance of the model has shifted considerably, particularly in the LAN Wins component and its balance compared to other components.

Figure 2 - LAN Wins vs Global Rank (2023 Data using current model)

Figure 3 - LAN Wins vs Global Rank (2025 April Rankings)

Figure 4 - LAN Wins vs Global Rank (October Data with current model)

As shown in Figures 2, 3 and 4 above, there has been a considerable shift in the saturation of LAN wins. LAN wins have gone from being a relatively low-impact, low-saturation component in 2023 to being the most saturated component and the dominant contribution to a team's average, both at the Tier 1 level and below.

When comparing the ranking makeup from the Austin Invite period to the Budapest Invite period it is possible to visualise how significant this change has been in just a few months as shown in Figure 5 below.

Figure 5 - Ranking modifier factor makeup comparing Austin to Budapest

LAN wins have had a very significant increase in saturation and there's nothing to suggest that this saturation will ever return to its pre-Budapest cycle levels. Teams have realised the strength of LAN Wins and look to use local LANs to gain in the rankings. This PR looks at rebalancing the components so that maxing out LAN wins is not a necessity, and at ensuring balance between dependent and independent factors.

Minor Changes

Language

Across the repo there is some incorrect language relating to the current application of the model. This is mostly an artefact of how the model previously worked. As teams look to perform more optimally through the VRS model, I think there are certain areas which could do with updated language to accurately represent the current implementation.

Hidden seed factors

While this PR primarily looks at the addition of a new seed modifier factor, it also introduces the presentation of zero value factors: ownNetwork (existing) and ownPerformance (new). While these do not affect a team's final rank, the information is useful for teams using the /details/ of a ranking to ascertain the worth of another team or potentially a tournament. Even ignoring the rest of the PR, showing ownNetwork would assist teams.

Match data sample

The appended matchdata_sample_20250510.json is not an exact mirror, as it does not reflect the full format used in the true VRS matchdata json. The data was collected via the LiquipediaDB in line with the Liquipedia API usage guidelines and then tweaked to mirror the stage and event prizing present in VRS. The json body has remnants of the Liquipedia API embedded in it. It's not 100% accurate, with some missing matches as well as differing start times. However, I think it's important that there be usable matchdata that enables tweaking and testing of the VRS model. The currently supplied matchdata sample will not run anymore and also does not reflect the matchdata seen in the VRS-regulated world with VRS-mandated invites. As focus grows on VRS, with much more discussion and critique online, I think it would be beneficial to improve the accessibility of testing the model.

Performance as an additional seed modifier

As there has not been a public breakdown of the methodology or application of the VRS, this relies on my own interpretation and opinion. As it appears to me, the VRS model's components are incredibly balanced and can be split in a multitude of ways: two dependent factors (BCOL & OPPN) vs two independent factors (BOFF & LANW), bounty-related vs non-bounty, and historically two smaller-impact factors (OPPN & LANW) compared to the larger and quantitatively tangible components.

Figure 6 - Current Opponent Network values against Global Rank

As shown in Figure 6, Opponent Network values are relatively low, with a majority of the concentration being of a lower value compared to Bounty Collected in Figure 7 below.

Figure 7 - Current Bounty Collected values against Global Rank

Traditionally, LANW (LAN wins) has been similar to Opponent Network, being a final swing factor compared to the two dominating Bounty components. In Figure 2, the concentration of high-value LAN teams is far lower than it is currently. While the component does stretch to a greater value than the opponent network, its mass concentration is pretty similar. As LANW has now become far more prominent, the balance has shifted away from dependent points that rely on quality of opponent and from components which have stakes applied.

LANW is an important component. While testing the model I tried an implementation in which a LAN win at a Tier 1 event was worth the current 1.0 and a LAN win at a Tier 2 event was worth only 0.5. This saw essentially no fit improvement while also reducing the ability to catch and overtake teams in the bubble. LAN wins are crucial for ensuring there isn't significant stagnation; however, the component may be too dominant and has caused an over-pivot away from the quality of the opponent in a match.

Instead, this PR looks at introducing another component via the bucketed approach that is dependent on the opponent's achievements and has event stakes applied, as shown below.

Changes

In Phase 1, counters are introduced for a team's total rounds played and rounds won across all its matches, scaled by the recency of the match the rounds were in.

team.teamMatches.forEach( teamMatch => {
    // Recency weight for this match; applied to every map in it.
    let matchTime = teamMatch.match.matchStartTime;
    let timestampModifier = context.getTimestampModifier( matchTime );

    teamMatch.maps.forEach( map => {
        // Determine which side this team was on.
        const teamIs1 = teamMatch.teamNumber === 1;
        const teamScore = Number( teamIs1 ? map.team1Score : map.team2Score ) || 0;
        const oppScore  = Number( teamIs1 ? map.team2Score : map.team1Score ) || 0;

        // Skip unplayed maps (0-0) and invalid negative scores.
        if ( ( teamScore === 0 && oppScore === 0 ) || teamScore < 0 || oppScore < 0 ) return;

        team.roundsWon += Number( teamScore * timestampModifier ) || 0;
        team.roundsPlayed += Number( ( teamScore + oppScore ) * timestampModifier ) || 0;
    } );
} );
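
To make the Phase 1 accumulation concrete, here is a small self-contained sketch that can be run outside the repo. The function name `accumulateRounds`, the sample matches and the weighting function are all assumptions for illustration; in particular the recency weight is passed in as a plain function standing in for `context.getTimestampModifier`, whose real definition is not reproduced here.

```javascript
// Sketch only: a standalone version of the round counters.
// getTimestampModifier is a stand-in for context.getTimestampModifier.
function accumulateRounds(teamMatches, getTimestampModifier) {
  const totals = { roundsWon: 0, roundsPlayed: 0 };
  for (const teamMatch of teamMatches) {
    const weight = getTimestampModifier(teamMatch.match.matchStartTime);
    for (const map of teamMatch.maps) {
      const teamIs1 = teamMatch.teamNumber === 1;
      const teamScore = Number(teamIs1 ? map.team1Score : map.team2Score) || 0;
      const oppScore = Number(teamIs1 ? map.team2Score : map.team1Score) || 0;
      // Skip unplayed maps (0-0) and invalid negative scores.
      if ((teamScore === 0 && oppScore === 0) || teamScore < 0 || oppScore < 0) continue;
      totals.roundsWon += teamScore * weight;
      totals.roundsPlayed += (teamScore + oppScore) * weight;
    }
  }
  return totals;
}

// Hypothetical data: the start times 2 and 1 are labels, not real timestamps.
const exampleMatches = [
  { teamNumber: 1, match: { matchStartTime: 2 }, maps: [
    { team1Score: 13, team2Score: 7 },
    { team1Score: 10, team2Score: 13 },
    { team1Score: 0, team2Score: 0 }, // unplayed decider, skipped
  ] },
  { teamNumber: 2, match: { matchStartTime: 1 }, maps: [
    { team1Score: 13, team2Score: 11 },
  ] },
];
// Made-up weighting: the newer match counts fully, the older one at half weight.
const exampleModifier = (startTime) => (startTime === 2 ? 1.0 : 0.5);
const roundTotals = accumulateRounds(exampleMatches, exampleModifier);
console.log(roundTotals); // { roundsWon: 28.5, roundsPlayed: 55 }
```

The newer match contributes 23 of 43 rounds at full weight, and the older match contributes 11 of 24 rounds at half weight (5.5 of 12).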

In Phase 2 this is then converted into a scaled own performance relative to the reference rounds won and rounds played, with a curve function applied to ensure recently formed teams with strong recent performances aren't adversely affected by their recency.

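// Reference values: the nth-highest totals, where n is the outlier count.
let referenceRoundWins = nthHighest( teams.map( t => t.roundsWon ), context.getOutlierCount() );
let referenceRoundsPlayed = nthHighest( teams.map( t => t.roundsPlayed ), context.getOutlierCount() );

teams.forEach( team => {
    // Share of the reference rounds won, capped at 1.
    team.roundSuccess = Math.min( team.roundsWon / referenceRoundWins, 1 );
    // Curved share of the reference rounds played; the floor of 1 round avoids a zero input.
    team.roundsParticipation = curveFunction( Math.min( Math.max( team.roundsPlayed, 1 ) / referenceRoundsPlayed, 1 ) );
    // Success relative to participation, capped at 1.
    team.ownMatchPerformance = Math.min( team.roundSuccess / team.roundsParticipation, 1 );
} );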
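
The Phase 2 scaling can be tried in isolation with the sketch below. The helpers `nthHighest` and `curveFunction` are stand-ins written for this example and are assumed to behave like the repo's helpers of the same names; the `1 / (1 + |log10(x)|)` curve in particular is an assumption. The team data and the outlier count of 1 are made up.

```javascript
// Stand-in helpers, assumed for this sketch only.
function nthHighest(values, n) {
  const sorted = [...values].sort((a, b) => b - a);
  return sorted[Math.min(n, sorted.length) - 1];
}
function curveFunction(x) {
  return 1 / (1 + Math.abs(Math.log10(x)));
}

function scoreOwnPerformance(teams, outlierCount) {
  const refWins = nthHighest(teams.map((t) => t.roundsWon), outlierCount);
  const refPlayed = nthHighest(teams.map((t) => t.roundsPlayed), outlierCount);
  return teams.map((t) => {
    const roundSuccess = Math.min(t.roundsWon / refWins, 1);
    const roundsParticipation = curveFunction(
      Math.min(Math.max(t.roundsPlayed, 1) / refPlayed, 1)
    );
    const ownMatchPerformance = Math.min(roundSuccess / roundsParticipation, 1);
    return { name: t.name, roundSuccess, roundsParticipation, ownMatchPerformance };
  });
}

// Made-up recency-weighted totals for three teams.
const exampleTeams = [
  { name: "A", roundsWon: 600, roundsPlayed: 1000 },
  { name: "B", roundsWon: 300, roundsPlayed: 600 },
  { name: "C", roundsWon: 60, roundsPlayed: 100 },
];
const ownPerformance = scoreOwnPerformance(exampleTeams, 1);
console.log(ownPerformance);
```

With these inputs, team A sets both references and scores 1. Team C has a tenth of the reference rounds played, so under the assumed curve its participation is about 0.5 rather than 0.1, which gives an ownMatchPerformance of about 0.2. Team B lands at roughly 0.61.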

In Phase 3 this is then handled with a bucket approach similar to that of opponent network and bounty collected, under the same decay and event stakes modification.
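
As a rough illustration of what a bucketed factor looks like, the sketch below scores each win as the opponent's performance multiplied by a recency weight and an event stakes weight, keeps the best few wins, and averages over the bucket size. This is my reading of the description above and not a copy of the repo code: the field names (`opponentPerformance`, `timestampModifier`, `stakesModifier`), the bucket size of 3 and the choice to divide by the bucket size even when a team has fewer wins are all assumptions.

```javascript
// Sketch only: field names and bucket size are hypothetical.
function bucketedScore(wins, bucketSize) {
  const contributions = wins
    .map((w) => w.opponentPerformance * w.timestampModifier * w.stakesModifier)
    .sort((a, b) => b - a) // best wins first
    .slice(0, bucketSize); // keep only the top bucket
  const total = contributions.reduce((sum, c) => sum + c, 0);
  // Assumption: divide by the bucket size, so teams with few wins score lower.
  return total / bucketSize;
}

// Made-up wins for one team.
const exampleWins = [
  { opponentPerformance: 0.9, timestampModifier: 1.0, stakesModifier: 1.0 },
  { opponentPerformance: 0.8, timestampModifier: 0.5, stakesModifier: 1.0 },
  { opponentPerformance: 0.5, timestampModifier: 1.0, stakesModifier: 0.5 },
  { opponentPerformance: 0.2, timestampModifier: 1.0, stakesModifier: 1.0 },
];
const bucketValue = bucketedScore(exampleWins, 3);
console.log(bucketValue);
```

Here the four wins are worth 0.9, 0.4, 0.25 and 0.2; the top three sum to 1.55, so the bucket value is about 0.517.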

Results

This produces another component that is relative to opponents' recent performance, matching the aim outlined in team.js, in which it "rates each team highly if it can regularly win against other prestigious teams.", and helping provide further evaluation of the prestige of an opponent.

Figure 8 - Expected Winrate against Observed Winrate using the proposed PR model with current data

Figure 8 shows the model fit using the PR's proposed changes, with improvements in both the fitted line and Spearman's rho.
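
For reference, Spearman's rho is the Pearson correlation of the ranks of two series, so it measures how monotonic the relationship between expected and observed win rate is. The sketch below is a generic implementation with average ranks for ties; it is not the repo's fit script, and the sample values are made up.

```javascript
// Assign 1-based ranks, giving tied values the average of their positions.
function averageRanks(values) {
  const order = values.map((v, i) => ({ v, i })).sort((a, b) => a.v - b.v);
  const ranks = new Array(values.length);
  let start = 0;
  while (start < order.length) {
    let end = start;
    while (end + 1 < order.length && order[end + 1].v === order[start].v) end++;
    const rank = (start + end) / 2 + 1;
    for (let k = start; k <= end; k++) ranks[order[k].i] = rank;
    start = end + 1;
  }
  return ranks;
}

// Pearson correlation of two equal-length arrays.
function pearson(x, y) {
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    cov += (x[i] - meanX) * (y[i] - meanY);
    varX += (x[i] - meanX) ** 2;
    varY += (y[i] - meanY) ** 2;
  }
  return cov / Math.sqrt(varX * varY);
}

function spearmanRho(x, y) {
  return pearson(averageRanks(x), averageRanks(y));
}

// Made-up binned win rates: expected vs observed.
const expectedRates = [0.1, 0.3, 0.5, 0.7];
const observedRates = [0.15, 0.25, 0.6, 0.65];
console.log(spearmanRho(expectedRates, observedRates));
```

Because the made-up observed values rise in the same order as the expected values, rho is 1 here; a perfectly reversed order would give -1.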

The opponent performance factor has the expected relation to global rank, as shown in Figure 9 below.

Figure 9 - Opponent Performance against Global Rank using the proposed PR model with current data

Additionally, assuming that a factor should have a good overall fit to the final_rank_value, such that it matches both the average of other components and H2H, the relationship is pretty good, as shown in Figure 10 below:

Figure 10 - Opponent Performance against Final Rank Value using the proposed PR model with current data

This is similar to the behaviour observed with bounty offered, as shown in Figure 11 below:

Figure 11 - Bounty Offered against Final Rank Value using the proposed PR model with current data

However, there is a noticeably higher weighting for Opponent Performance, with greater concentration in the upper bound of the factor.

Power vs Curve

Again, as there has not been a public breakdown of the methodology, this was my own interpretation of whether the value should have the powerFunction or the curveFunction applied. As ownPerformance is an attained component, compared to ownNetwork which is more an after-effect of playing, it could potentially make more sense to align it with the bounty effects. This is because it's potentially similar to the bounty application, with cash being a quantitative attained value. However, ownPerformance is still somewhat an after-effect of playing further matches and is seemingly top-heavy.

From that, it would make more sense to use the powerFunction, with the results as shown below.

Figure 12 - Expected Win Rate against Observed Win Rate for the PR, using powerFunction for opponentPerformance

As shown in Figure 12, using the powerFunction gives a worse line than the curveFunction; however, it's still an improvement on the current model and potentially makes more logical sense when applied in the global context.

When using the powerFunction, the weighting and saturation of the component are a lot more in line with the rest of the model, as shown in Figures 13 and 14 below.

Figure 13 - Opponent Performance against Global Rank using the proposed PR model (using powerFunction) with current data

Figure 14 - Opponent Performance against Final Rank Value using the proposed PR model (using powerFunction) with current data

Effect

Whether the function uses power or curve, there is still a rebalancing of the components. With the increase in components, the overall impact of LANW drops to 20%. When running the model on the Major matchdata there is also incredibly little movement at the top level, and the change has minimal effect on the standings.
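
The drop in LANW's share follows directly from equal weighting: with four equally weighted factors each carries 1/4 = 25%, and with five each carries 1/5 = 20%. A minimal sketch, assuming the factors are combined by a plain average (the factor names and values below are made up for illustration):

```javascript
// Sketch only: assumes the seed modifier factors are combined by a simple average.
function factorWeight(factors) {
  return 1 / Object.keys(factors).length;
}
function combineFactors(factors) {
  const values = Object.values(factors);
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Made-up factor values for one team that has maxed out LAN wins.
const currentFactors = { bountyCollected: 0.4, bountyOffered: 0.6, opponentNetwork: 0.2, lanWins: 1.0 };
const proposedFactors = { ...currentFactors, opponentPerformance: 0.5 };

console.log(factorWeight(currentFactors), combineFactors(currentFactors));
console.log(factorWeight(proposedFactors), combineFactors(proposedFactors));
```

For this made-up team the maxed LAN wins factor contributes 0.25 of a combined 0.55 under four factors, and 0.20 of a combined 0.54 under five.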

As shown in Figures 15 and 16 below, the EU Major rankings are pretty similar, with differences only in the qualified stage for border teams.

Figure 15 - EU Major Rankings using the proposed model with the powerFunction applied to opponent performance

Figure 16 - EU Major Rankings using the proposed model with the curveFunction applied to opponent performance

The improvements in fit are also not due to a concentration of probability to equal odds.

Figure 17 - Probability distribution for expected winrate for current model

Figure 18 - Probability distribution for expected winrate for the PR model (using curve Function)

As shown in Figures 17 and 18, the PR has a similar distribution to the existing model, so the improvement isn't a result of concentrating probabilities in the middle of the range.
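
A check of this kind can be sketched as a simple histogram of expected win rates plus the share of predictions that fall near even odds. The bin count, the 0.4 to 0.6 band and the sample probabilities are assumptions chosen for illustration, not the settings used for Figures 17 and 18.

```javascript
// Fraction of predictions in each of binCount equal-width bins over [0, 1].
function winRateHistogram(probabilities, binCount = 10) {
  const bins = new Array(binCount).fill(0);
  for (const p of probabilities) {
    // p === 1 is placed in the last bin.
    const index = Math.min(Math.floor(p * binCount), binCount - 1);
    bins[index] += 1;
  }
  return bins.map((count) => count / probabilities.length);
}

// Share of predictions close to even odds (band limits are hypothetical).
function middleShare(probabilities, low = 0.4, high = 0.6) {
  const inside = probabilities.filter((p) => p >= low && p <= high).length;
  return inside / probabilities.length;
}

// Made-up expected win rates.
const exampleProbabilities = [0.05, 0.45, 0.55, 0.95, 1.0];
const exampleHistogram = winRateHistogram(exampleProbabilities);
console.log(exampleHistogram, middleShare(exampleProbabilities));
```

If a model change improved the fit only by pushing predictions towards 0.5, the middle share would rise noticeably between the two models; a similar share suggests the spread of predictions has been preserved.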

Overall, this PR looks at increasing the point reward from playing against teams of prestige, to better balance the ratio of points available via independent and dependent methods. This still ensures that LAN wins have significant benefit and are an attractive achievement for teams, but the reduction of LANW's weight from 25% to 20% ensures that it's not a necessity to compete. roundsParticipation already has a curveFunction applied to ensure new teams have some worth, akin to ownNetwork. Because of this, it may be better to have this component work via the powerFunction, which also fits better with other applications of the model.

Further considerations

With the LAN component being high, there are potentially other ways to reduce its impact without adjusting the whole model's balance. Personally, I don't think scaling it to Tier or directly to opponent quality is the correct approach due to how it leads to ranking stagnation. When testing stake modification of LAN Wins the predominant outcome is that it becomes very hard to penetrate the top ranks and enables top level teams to fall off slower. Tier 1 teams that have been on a bad streak this cycle, which ended up missing out on the major, would have made it if there were stakes for LANW's due to the protective buffer this would create.

However, there could be a softer stakes application. Currently, Open LANs are only required to be announced with up to two weeks of notice. It's incredibly hard to plan around this margin, but almost necessary due to how lucrative these events are. Rather than potentially harming local TOs by requiring earlier announcements, in cases where the VRS ranking is more of a benefit to the domestic scene than a factor in major qualification, the flat 1.0 LAN win gain could be scaled to the announcement period. A lot of local LANs in the past have managed to announce their dates well in advance, so LAN wins could use step-level scaling based on the notice period, e.g. 0.5 for 2 weeks, 0.75 for 2 months, 1.0 for 6 months, or any variation. As long as the requirement covers only the announcement of dates and the visa location for the event, it's pretty reasonable for smaller-scale TOs, and it benefits teams by giving more planning time for these key events.
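
The step scaling suggested above could look something like the following sketch. The function name, the day thresholds (14, 60 and 180 days as approximations of 2 weeks, 2 months and 6 months) and the choice to return 0 below two weeks of notice are all assumptions for illustration, not part of this PR.

```javascript
// Sketch only: thresholds in days are approximations and the values are the
// example steps from the text (0.5 / 0.75 / 1.0).
function lanWinValue(noticeDays) {
  if (noticeDays >= 180) return 1.0; // about 6 months of notice
  if (noticeDays >= 60) return 0.75; // about 2 months of notice
  if (noticeDays >= 14) return 0.5; // about 2 weeks of notice
  return 0; // assumption: under two weeks of notice earns no LAN win credit
}

// Notice period between announcement and event start, in whole days.
function noticeDaysBetween(announcedAt, startsAt) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((startsAt.getTime() - announcedAt.getTime()) / msPerDay);
}

// Made-up dates: announced 1 March, played 1 June of the same year.
const exampleNotice = noticeDaysBetween(
  new Date(Date.UTC(2026, 2, 1)),
  new Date(Date.UTC(2026, 5, 1))
);
console.log(exampleNotice, lanWinValue(exampleNotice)); // 92 0.75
```

A smoother ramp between the steps would also be possible; the stepped form is used here only because it mirrors the example values in the paragraph above.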

Additionally, for the benefit of clarity for both TOs and teams, it would potentially be beneficial to clarify the wildcard invite situation for event champions. Assuming it is rosters that hold the invites and not orgs, a solution in code could be completed in which, at the same time as the rankings are updated, a list of wildcard-eligible teams is updated using in-built logic for the required conditions. Currently it's a bit of a grey area as to who is eligible for wildcard invites, with none of the data being available publicly in one place. Liquipedia is the only location for ranking tiers, but Liquipedia holds no mandate and its VRS ranking categorization does not fall in line with what is used officially, as shown in Figure 19 below. HLTV is the official source, but it does not include event Tier and there's no indication as to whether it's the first, final or most played roster that would hold the invite.

Figure 19 - The only publicly accessible list of Ranked Tier 2 events on Liquipedia, but highlighted in red are events listed as ranked due to meeting the Liquipedia classification but weren't actually VRS ranked events

Conclusion

Overall, the current ranking model and its methodology are incredibly good. As raised in the Minor Changes section, there are some small issues with outdated explanations as well as a lack of accessibility for using, testing and tweaking the model, but these are niche cases which only apply to a small number of people. It's my personal opinion that the current LAN component is too strong compared to the other components and somewhat diminishes the benefit of beating strong teams at or above a team's own level, since a team could generate similar benefit against low-level opponents. This is dependent on coverage, but occurrences have been observed through this major cycle.

However, it's important to note that the current LANW application is somewhat necessary. LAN wins are essentially movers / chance creators in this ranking system. When their worth is reduced or scaled, breaking into the top bubble becomes incredibly difficult. We see strings of events using the same ranking invite, making penetration very difficult. If the impact of LAN wins were significantly reduced, it would be very hard for upcoming teams to break in and grind to the threshold, and it would provide too great a cushion for the top-level teams. Roughly 35% of future events between now and the end of 2026 incorporate non-wildcard, non-LAN open qualifiers, and breaking through one of these open qualifiers and then going on to earn points is a tall order, particularly with periods of Tier 1 events all using the same ranking invite, so the benefit of an open qualifier run could be rather diminished.

While I do think that the LANW component is too strong, it is an important equalizer and provides opportunity. I do recognise that if over-punished it could make the rankings rather rigid.

Thanks for your time. The current ranking has shown it does an incredible job of getting the best teams to the major.

asyowo · 2025-10-24T04:14:22Z

good read, well done.

Mohawmmad · 2025-12-28T17:21:19Z

Dear Valve Regional Operations Team,

We are writing on behalf of Iranian Counter-Strike teams and organizations regarding the recent updates to the Valve Regional Standings (VRS) model, including the introduction of the MENA sub-region under Europe and the removal of multi-region representation.

We appreciate the intent behind these changes, particularly the effort to align teams with regions that best reflect competitive conditions. However, we would like to formally raise a technical and competitive concern regarding the current classification of Iran under the Asia region.

From an infrastructure and connectivity standpoint, Iran’s competitive environment aligns far more closely with Europe (and by extension the MENA/EU sub-region) than with Asia:

• Average ping from Iran to European servers is typically ~60–80 ms with stable routing and negligible packet loss
• Routing to Asian servers (most notably Singapore) results in ~120–150 ms latency with frequent packet loss (often 5–10%+)
• In practice, nearly all Iranian teams train, scrim, and compete on European servers due to reliability and network stability
• Tournament qualifiers, online leagues, and practice ecosystems used by Iranian teams are overwhelmingly EU-based

As a result, classifying Iranian teams as “Asia” places them at a structural disadvantage that is unrelated to competitive skill. The assigned region does not reflect actual server access, practice conditions, or fair online competition.

With the recent creation of the MENA sub-region within Europe, and given that several geographically adjacent countries (e.g. Iraq, Armenia, Azerbaijan, UAE, Saudi Arabia, etc.) are now included, we believe Iran logically and competitively belongs in the same grouping based on real-world network topology rather than continental labels.

We respectfully request that Valve consider:
• Reassigning Iran to the MENA sub-region under Europe in the VRS model, or
• Evaluating regional assignment using objective connectivity metrics (latency, packet loss, routing stability) rather than geography alone

This adjustment would significantly improve competitive integrity for Iranian teams and better align VRS regional representation with actual playing conditions.

Thank you for your time and for your continued efforts to improve fairness and clarity within the VRS system. We would be happy to provide latency data, traceroutes, or further technical details if needed.

Kind regards,

Modifying ranking components

bf36110
