
PSS/E errors with parsing PSS/E Files (Delimiters, Missing Q Record, Comments) #3386


Open
wants to merge 11 commits into base: main


pjanecek0
Contributor

Please check if the PR fulfills these requirements

  • The commit message follows our guidelines
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • A PR or issue has been opened in all impacted repositories (if any)

Does this PR already have an issue describing the problem?

#3385

What kind of change does this PR introduce?

What is the current behavior?

#3385

What is the new behavior (if this is a feature change)?

Does this PR introduce a breaking change or deprecate an API?

  • Yes
  • No

If yes, please check if the following requirements are fulfilled

  • The Breaking Change or Deprecated label has been added
  • The migration steps are described in the following section

What changes might users need to make in their application due to this PR? (migration steps)

Other information:

@pjanecek0 changed the title from "Esys raw fix" to "PSS/E errors with parsing PSS/E Files (Delimiters, Missing Q Record, Comments)" on Mar 26, 2025
@pjanecek0 marked this pull request as draft on March 27, 2025 09:03
@pjanecek0 marked this pull request as ready for review on March 27, 2025 09:08
Contributor

@marqueslanauja left a comment


I propose the following changes, which preserve as much of the existing code as possible. These changes will fix all the issues.

Replace the following main branch code

public String readRecordLine() throws IOException {
    String line = reader.readLine();
    if (line == null) {
        throw new PsseException("PSSE. Unexpected end of file");
    }
    if (isRecordLineDefiningTheAttributeFields(line)) {
        return ""; // an empty line must be returned
    }
    StringBuilder newLine = new StringBuilder();
    Matcher m = FileFormat.LEGACY_TEXT_QUOTED_OR_WHITESPACE.matcher(removeComment(line));
    while (m.find()) {
        // If current group is quoted, keep it as it is
        if (m.group().indexOf(LEGACY_TEXT.getQuote()) >= 0) {
            m.appendReplacement(newLine, replaceSpecialCharacters(m.group()));
        } else {
            // current group is whitespace, keep a single whitespace
            m.appendReplacement(newLine, " ");
        }
    }
    m.appendTail(newLine);
    return newLine.toString().trim();
}

private static String removeComment(String line) {
    // Only outside quotes
    return line.replaceAll("('[^']*')|(^/[^/]*)|(/[^/]*)", "$1$2");
}

by

public String readRecordLine() throws IOException {
    String line = reader.readLine();
    if (line == null) {
        if (qRecordFound) {
            throw new PsseException("PSSE. Unexpected end of file");
        }
        return "Q";
    }
    if (isRecordLineDefiningTheAttributeFields(line)) {
        return ""; // an empty line must be returned
    }
    StringBuilder newLine = new StringBuilder();
    Matcher m = FileFormat.LEGACY_TEXT_QUOTED_OR_WHITESPACE.matcher(processText(removeComment(line)));
    while (m.find()) {
        // If current group is quoted, keep it as it is
        if (m.group().indexOf(LEGACY_TEXT.getQuote()) >= 0) {
            m.appendReplacement(newLine, replaceSpecialCharacters(m.group()));
        } else {
            // current group is whitespace, keep a single whitespace
            m.appendReplacement(newLine, " ");
        }
    }
    m.appendTail(newLine);
    return newLine.toString().trim();
}

private static String removeComment(String line) {
    return line.replaceAll("('[^']*')|([^/']+)|(/.*)", "$1$2");
}

// Compact spaces, remove spaces before the comma, and replace space with comma outside quoted text
private static String processText(String line) {
    if (line == null || line.isEmpty()) {
        return line;
    }

    StringBuilder result = new StringBuilder();
    Pattern pattern = Pattern.compile("'[^']*'|[^']+");
    Matcher matcher = pattern.matcher(line.trim());

    while (matcher.find()) {
        String part = matcher.group();
        if (part.startsWith("'") && part.endsWith("'")) {
            result.append(part);
        } else {
            // Outside quotes: process the text
            result.append(part.replaceAll("\\s+,", ",")
                    .replaceAll(",\\s+", ",")
                    .replaceAll("\\s+", ","));
        }
    }

    return result.toString();
}

Summary of the changes:

  • If we reach the end of the file before the Q record has been found, we return Q so that the reading process finishes properly.
  • The removeComment method has been fixed: outside quoted text, everything from the first / to the end of the line is removed.
  • After the comments are removed, the text is processed to eliminate unnecessary whitespace and to establish a single delimiter: the comma.
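
To make the second and third points concrete, here is a minimal standalone sketch that copies the two regex-based helpers from the proposal above and runs them on a few made-up record lines. The class name RecordTextSketch, the normalize helper, and the sample lines are hypothetical and only serve as illustration; they are not part of the PR.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone class; only the two regex-based methods mirror the proposal.
class RecordTextSketch {

    // Keeps quoted text and unquoted text, drops everything from the
    // first '/' outside quotes to the end of the line.
    static String removeComment(String line) {
        return line.replaceAll("('[^']*')|([^/']+)|(/.*)", "$1$2");
    }

    // Outside quoted text: removes whitespace around commas and turns
    // remaining whitespace runs into a single comma.
    static String processText(String line) {
        if (line == null || line.isEmpty()) {
            return line;
        }
        StringBuilder result = new StringBuilder();
        Matcher matcher = Pattern.compile("'[^']*'|[^']+").matcher(line.trim());
        while (matcher.find()) {
            String part = matcher.group();
            if (part.startsWith("'") && part.endsWith("'")) {
                result.append(part); // quoted text is copied unchanged
            } else {
                result.append(part.replaceAll("\\s+,", ",")
                        .replaceAll(",\\s+", ",")
                        .replaceAll("\\s+", ","));
            }
        }
        return result.toString();
    }

    // Hypothetical convenience wrapper: comment removal followed by delimiter normalization.
    static String normalize(String line) {
        return processText(removeComment(line));
    }

    public static void main(String[] args) {
        String[] samples = {
            "  1   'BUS / A'   138.0  1 / comment here", // whitespace-delimited, slash inside quotes
            "1 , 'B1' ,138.0",                           // mixed spaces and commas
            "1, 2 / it's a comment"                      // apostrophe inside the comment
        };
        for (String sample : samples) {
            System.out.println(normalize(sample));
        }
    }
}
```

In this sketch the slash inside the quoted name survives, while the trailing comment is dropped and every field ends up separated by a single comma.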

These changes cause some of the unit tests to fail:

Some cases are now exported with the comma as the delimiter, whereas they were previously exported with whitespace. When exporting with the update option, the delimiter found in the input file is reused for the export, and with the proposed modifications that delimiter is always a comma.
PSSE also uses the comma when exporting.
My proposal is to update the resource output files used for verification.

The Texas2000_June2016.RAW file is no longer imported properly because some bus names contain a quotation mark inside a quoted attribute. It was 'magically' imported on the main branch. I've confirmed that PSSE does not support these quoted names either.
My proposal is to import the original case as invalid, replace the inner quotation marks with whitespace, and then import the fixed case as valid.

1485,'O'Donnell ~1', 115.0000,1, 6, 1, 1,1.03539, 16.4735

@Test
void testSyntheticTexas200June2016() {
    testInvalid("/illinois/synthetic", "Texas2000_June2016.RAW", "Parsing error");
}

@Test
void testSyntheticTexas200June2016Fixed() {
    testValid("/illinois/synthetic", "Texas2000_June2016_fixed_quoted_buses.RAW");
}
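
How the fixed file is produced is not shown here. As one possible approach, the sketch below flags a record with an odd number of single quotes and replaces a quote that has a non-delimiter character on both sides with a space. The class QuoteCheckSketch and both methods are hypothetical, and the heuristic assumes comma- or whitespace-delimited fields; it would miss an inner quote that is followed by a space.

```java
// Hypothetical one-off cleanup helper; not part of the PR.
class QuoteCheckSketch {

    // An odd number of single quotes means at least one quote is unbalanced.
    static boolean hasOddQuoteCount(String line) {
        return line.chars().filter(c -> c == '\'').count() % 2 != 0;
    }

    // Replaces a quote that is neither preceded nor followed by a comma,
    // whitespace, or the line boundary with a single space.
    static String blankInnerQuotes(String line) {
        return line.replaceAll("(?<=[^,\\s])'(?=[^,\\s])", " ");
    }

    public static void main(String[] args) {
        String record = "1485,'O'Donnell ~1', 115.0000,1, 6, 1, 1,1.03539, 16.4735";
        System.out.println(hasOddQuoteCount(record));
        System.out.println(blankInnerQuotes(record));
    }
}
```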

@pjanecek0
Contributor Author

@marqueslanauja Yes, that seems to solve the bug. Converting whitespace to commas is probably a better solution.

We could try replacing

1485,'O'Donnell ~1', 115.0000,1, 6, 1, 1,1.03539, 16.4735

with

1485, "O'Donnell ~1", 115.0000,1, 6, 1, 1,1.03539, 16.4735

which I would expect to fix this test case, since double quotes are also valid. However, we really don't have to support this...
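
Whether the double-quoted form works depends on how the quote handling treats the apostrophe inside it. The sketch below is a quick check that applies only the tokenizing pattern used by the proposed processText method to that line and lists the pieces it finds. The class name DoubleQuoteCheckSketch and the tokenize method are hypothetical; any double-quote handling elsewhere in the reader is not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical check; reuses only the pattern from the proposed processText.
class DoubleQuoteCheckSketch {

    // Splits a line into single-quoted parts and runs of non-quote characters.
    static List<String> tokenize(String line) {
        List<String> parts = new ArrayList<>();
        Matcher matcher = Pattern.compile("'[^']*'|[^']+").matcher(line.trim());
        while (matcher.find()) {
            parts.add(matcher.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        String record = "1485, \"O'Donnell ~1\", 115.0000,1, 6, 1, 1,1.03539, 16.4735";
        for (String part : tokenize(record)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

In this sketch the pattern finds no single-quoted part in the line, so the name is treated as unquoted text and the lone apostrophe is not matched by either alternative. For the replacement to work, double quotes would probably need to be recognised before or within this step.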

For the tests, I need to add the input file IEEE_14_bus_delimiter.raw, which represents the data I will be using.

Using the comma as the export delimiter is the better choice because it allows a field to be left empty so that its default value applies, and PSS/E does the same.
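
A small sketch of why the comma helps here, assuming that an empty field between two commas stands for "use the default value". The class DelimiterSketch, its methods, and the sample records are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration of the two delimiter styles.
class DelimiterSketch {

    // A limit of -1 keeps empty fields, including trailing ones.
    static String[] splitComma(String record) {
        return record.split(",", -1);
    }

    // Runs of whitespace collapse, so an omitted field cannot be represented.
    static String[] splitWhitespace(String record) {
        return record.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // The second field is left empty in the comma-delimited record.
        System.out.println(Arrays.toString(splitComma("1,,138.0")));
        // With whitespace the gap disappears and only two fields remain.
        System.out.println(Arrays.toString(splitWhitespace("1   138.0")));
    }
}
```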
