[SoFIFA] read_player_ratings returns only 1 record

**Describe the bug**
The method `read_player_ratings` returns only the last player. The cause is incorrect indentation: the XPath extraction and the `ratings.append()` call are outside the player loop, so only the last player's scores are processed and appended.
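To illustrate the suspected pattern, here is a minimal sketch that does not use soccerdata at all. The function names `collect_buggy` and `collect_fixed` and the sample data are hypothetical; they only mimic the shape of the loop described above.

```python
def collect_buggy(players):
    """Mimics the reported bug: the append is dedented out of the loop."""
    ratings = []
    name = None
    for player in players:
        name = player["name"]
    # These two lines run once, after the loop has finished,
    # so only the last player ends up in the result.
    scores = {"player": name}
    ratings.append(scores)
    return ratings


def collect_fixed(players):
    """Same logic with the extraction and append inside the loop."""
    ratings = []
    for player in players:
        scores = {"player": player["name"]}
        ratings.append(scores)
    return ratings


# Hypothetical sample data standing in for the player pages
players = [{"name": "A"}, {"name": "B"}, {"name": "C"}]
buggy = collect_buggy(players)
fixed = collect_fixed(players)
print(len(buggy), len(fixed))
```

With three input players, the buggy variant yields a single record (the last one), which matches the one-row output reported in this issue.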

**Python Version**
Python 3.11.4

**Affected scrapers**
This affects the following scrapers:
- [x] SoFIFA


**Code example**

```python
import soccerdata as sd
sofifa = sd.SoFIFA(leagues="ENG-Premier League", versions="latest")
print(sofifa.read_player_ratings(team="Arsenal"))
```

**Error message**

```
no error message 
```

**Error output**

```
                  fifa_edition        update overallrating  ... gk_kicking gk_positioning gk_reflexes
player                                                      ...
Takehiro Tomiyasu        FC 25  Jul 17, 2025            78  ...          6              5          11

[1 rows x 38 columns]
```


**Additional context**
I fixed the problem with the help of GPT-5 mini, but I'm not sure whether it is the correct approach (or whether this is an actual issue), because I only downloaded the collection.

**Code fix sofifa.py**

```python
    def read_player_ratings(
        self,
        team: Optional[Union[str, list[str]]] = None,
        player: Optional[Union[int, list[int]]] = None,
    ) -> pd.DataFrame:
        """Retrieve ratings for players.

        Parameters
        ----------
        team: str or list of str, optional
            Team(s) to retrieve. If None, will retrieve all teams.
        player: int or list of int, optional
            Player(s) to retrieve. If None, will retrieve all players.

        Returns
        -------
        pd.DataFrame
        """
        # build url
        urlmask = SO_FIFA_API + "/player/{}/?r={}&set=true"
        filemask = "player_{}_{}.html"

        # get player IDs
        if player is None:
            players = self.read_players(team=team).index.unique()
        elif isinstance(player, int):
            players = [player]
        else:
            players = player

        # prepare empty data frame
        ratings = []

        # define labels to use for score extraction from player profile pages
        score_labels = [
            "Overall rating",
            "Potential",
            "Crossing",
            "Finishing",
            "Heading accuracy",
            "Short passing",
            "Volleys",
            "Dribbling",
            "Curve",
            "FK Accuracy",
            "Long passing",
            "Ball control",
            "Acceleration",
            "Sprint speed",
            "Agility",
            "Reactions",
            "Balance",
            "Shot power",
            "Jumping",
            "Stamina",
            "Strength",
            "Long shots",
            "Aggression",
            "Interceptions",
            "Positioning",
            "Vision",
            "Penalties",
            "Composure",
            "Defensive awareness",
            "Standing tackle",
            "Sliding tackle",
            "GK Diving",
            "GK Handling",
            "GK Kicking",
            "GK Positioning",
            "GK Reflexes",
        ]

        iterator = list(product(self.versions.iterrows(), players))
        for i, ((version_id, version), player) in enumerate(iterator):
            logger.info(
                "[%s/%s] Retrieving ratings for player with ID %s in %s edition",
                i + 1,
                len(iterator),
                player,
                version["update"],
            )

            # read html page (player overview)
            filepath = self.data_dir / filemask.format(player, version_id)
            url = urlmask.format(player, version_id)
            reader = self.get(url, filepath)

            # extract scores one-by-one
            tree = html.parse(reader, parser=html.HTMLParser(encoding="utf8"))

            # get player name safely
            node_player_name_nodes = tree.xpath("//div[contains(@class, 'profile')]/h1")
            if node_player_name_nodes:
                node_player_name = node_player_name_nodes[0]
                # Extract what is before <br>
                before_br = node_player_name.xpath("string(./text()[1])").strip()
                # Extract what is after <br>
                after_br = node_player_name.xpath(
                    "string(./br/following-sibling::text()[1])"
                ).strip()
                player_name = before_br if before_br else after_br
            else:
                player_name = None

            scores = {"player": player_name, **version.to_dict()}

            # Try each XPath until one returns a result
            for s in score_labels:
                value = None
                xpaths = [
                    f"//p[.//text()[contains(.,'{s}')]]/span/em",
                    f"//div[contains(.,'{s}')]/em",
                    f"//li[not(self::script)][.//text()[contains(.,'{s}')]]/em",
                ]
                for xpath in xpaths:
                    nodes = tree.xpath(xpath)
                    if nodes:  # If at least one match is found
                        text = nodes[0].text
                        value = text.strip() if text is not None else None
                        break  # Stop checking other XPaths once we find a valid value

                scores[s] = value  # will be None if not found

            ratings.append(scores)
        # return data frame
        return pd.DataFrame(ratings).pipe(standardize_colnames).set_index(["player"]).sort_index()
```
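The score lookup in the fix above tries several XPath expressions and keeps the first one that matches. Below is a small self-contained sketch of that fallback pattern. It is an assumption-laden illustration: it uses the standard library's `xml.etree.ElementTree` (with its limited XPath subset) instead of `lxml`, and the helper name `first_match` and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET


def first_match(root, paths):
    """Return the stripped text of the first path that matches, else None."""
    for path in paths:
        node = root.find(path)
        if node is not None:
            text = node.text
            return text.strip() if text is not None else None
    return None


# Hypothetical markup; the real SoFIFA page structure may differ.
doc = ET.fromstring("<div><li><em> 78 </em></li></div>")

# The first path does not match, so the second one is used.
found = first_match(doc, [".//p/span/em", ".//li/em"])
# No path matches, so the result is None instead of raising an error.
missing = first_match(doc, [".//p/em"])
print(found, missing)
```

Returning `None` for a missing label keeps one absent attribute from aborting the whole scrape, at the cost of possibly hiding a page-layout change.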

**Contributor Action Plan**

- [x] I’m unsure how to fix this, but I'm willing to work on it with guidance.

