Feat: Adding a pure Swift Configuration File Reader#2089
Feat: Adding a pure Swift Configuration File Reader#2089AntAmazonian wants to merge 29 commits intomainfrom
Conversation
dayaffe
left a comment
There was a problem hiding this comment.
Glad to see this work is coming along!
| for line in source.lines{ | ||
| var currentLineNumber: Int = 0 |
There was a problem hiding this comment.
This looks like every iteration it would set currentLineNumber to 0 since you aren't keeping track of the line number outside of this scope. Check out enumerated() to fix https://developer.apple.com/documentation/swift/array/enumerated()
There was a problem hiding this comment.
This works for each parsing iteration, every time a new file is loaded the iteration restarts. This is meant to help with error logging, there's a very specific request in the SEP for this so I wanted to keep it simple. I'll look into the enumerated.
There was a problem hiding this comment.
Discussed offline -- this did not work and will switch to using enumerate()
There was a problem hiding this comment.
Enumerated has been implemented.
| for line in source.lines{ | ||
| var currentLineNumber: Int = 0 | ||
| currentLineNumber += 1 | ||
| let blankLine = try! NSRegularExpression(pattern: "^\\s*$", options: []) |
There was a problem hiding this comment.
Suggestion: Make all hardcoded regex patterns static and declare it at top level scope. Its better for readability and marginally for performance since regular expressions involve a compilation step. When they are declared static it only happens once whereas right now its being compiled during every iteration.
There was a problem hiding this comment.
I will looked into it a little, I was forced to use the legacy Regex, I making it static but will put it in an higher scope now that I have fewer than what I started with.
There was a problem hiding this comment.
Regex pattern has been made static
| if line.contains("["){ | ||
| newSection = true | ||
| } | ||
| switch line{ |
There was a problem hiding this comment.
nit: where applicable add a space before { for readability
| targetDictionary[fullSectionKey] = section | ||
| currentSection = section | ||
| targetDictionary[sectionName] = currentSection |
There was a problem hiding this comment.
Q: Don't both targetDictionary[fullSectionKey] and targetDictionary[sectionName] store references to the same object? currentSection = section
There was a problem hiding this comment.
Yes, the intention was to keep the Full Section (e.g. Profile Name) name along with section name (e.g. Name) so that the Full Section is used as a key within the code to distinguish between different sections that had the same name. I tried to just keep the section type instead of the full name but it couldn't distinguish the different sections properly which I suspect might be how the tests are tailored so I just went with the full name.
There was a problem hiding this comment.
Ah I see! In that case I see why what you're doing works but I wonder if we can still have it work with only using the fullSectionkey. Maybe it didn't work when you tried because you do return sections[sectionName] in ConfigFile.section()
Give this a try:
Only use targetDictionary[fullSectionKey] = section
Then modify ConfigFile.section():
func section(for name: String, type: FileBasedConfigurationSectionType) -> (any FileBasedConfigurationSection)? {
let typePrefix = switch type {
case .profile: "profile"
case .ssoSession: "sso-sessions"
case .services: "services"
}
let fullKey = "\(typePrefix) \(name)"
return sections[fullKey]
}(also should it bee sso-sessions plural or sso-session singular?)
There was a problem hiding this comment.
It seems that this approach does not merge properties between the config and cred files. I will try modifying the ConfigFile struct to handle more of the section configuration and then move the merging into a separate function.
I do recall just using the "targetDictionary[fullSectionKey] = section" before I added the the "targetDictionary[sectionName] = currentSection" in the beginning. It passed the majority of tests and only failed the last 5 that had to do with merging properties with duplicate profiles in both config and cred while using the cred properties and then dropping configs.
The main issue was that the temp dictionary did not recognize the most recent section being parsed was in another file just that another section with the same name was there. My approach was pretty much storing the full section twice as a unique key for merging and the primary key for testing. I will see if I can simplify this better.
| mergedSections[fullSectionKey] = credSection | ||
| } | ||
| } | ||
| print("\n---Results after the merge of sections--- \n\(mergedSections)\n") |
There was a problem hiding this comment.
this looks like a debug statement that might have been left in
There was a problem hiding this comment.
The Print statement was used as a debug statement, I'll remove all of them except for the error related ones as a final change.
| var currentSection: ConfigFileSection? | ||
| var isCurrentSectionValid = false | ||
| var newSection = false | ||
| var sections: [String: ConfigFileSection] = [:] | ||
| var currentSubSection: String? | ||
| var currentProperty: String? | ||
| var currentPropertyValue: String? | ||
| var currentSubProperty: String? |
There was a problem hiding this comment.
I dont think we need to keep track of all this state. Doing so makes the code hard to read / modify later on. Consider keeping track of only lines, sections, currentSection, and currentProperty. I think we can also move all of those to be variables limited to the scope of a parseFile function rather than keep them top level. I believe we only need those values when constructing the type [String: ConfigFileSection] from a String (file contents)
There was a problem hiding this comment.
Each one was necessary for error catches and specific parsing requirements per the SEP.
There was a problem hiding this comment.
I am suggesting we try to refactor the code logic so that we can keep track of less state while still accomplishing everything we need to do. I understand this was all necessary to get it working with your current code. The following example may help!
This is a simplified example from your code
var currentSection: ConfigFileSection?
var isCurrentSectionValid = false
var newSection = false
for line in lines {
if line.contains("[") {
newSection = true // Set flag
}
switch line {
case _ where line.contains("["):
if let sectionHeader = parseSectionHeader(line) {
currentSection = ConfigFileSection(name: sectionHeader.name)
isCurrentSectionValid = true // Mark as valid
} else {
isCurrentSectionValid = false // Mark as invalid
}
case _ where line.contains("="):
if !isCurrentSectionValid && newSection {
print("Skipping line because previous section was invalid")
break
}
// Process property...
}
}and this is how we could accomplish the same thing with 1 variable
var currentSection: ConfigFileSection?
for line in lines {
if line.contains("[") {
if let sectionHeader = parseSectionHeader(line) {
currentSection = ConfigFileSection(name: sectionHeader.name)
} else {
currentSection = nil // Invalid section = no current section
}
} else if line.contains("=") {
guard let section = currentSection else {
print("Skipping line - no valid section")
continue
}
// Process property...
}
}Rather than represent isCurrentSectionValid with a bool we directly check if we have data to work with by checking if currentSection is nil (using a guard statement). In the first version we also had to check for newSection but really we just need to know "do we have a valid section to process properties for?" which is also answered by section = currentSection not being nil since we would have previously set currentSection if we encountered line.contains("[") indicating a new section!
There was a problem hiding this comment.
I reviewed my commit history, I used extra states because I couldn't use currentSection for multiple tests. One required to skip all properties under an invalid section but continue parsing and the other tests needed a parse error thrown and to stop parsing if there was a property that was parsed first instead of a section. I know there are a couple more that are very specific but I will continue reviewing to see how I can organize them better.
| import Foundation | ||
| @_spi(FileBasedConfig) import AWSSDKCommon | ||
|
|
||
| public class ConfigFileReader { |
There was a problem hiding this comment.
Suggestion: Organize & Simplify the functions. I think there's a bit too much complexity in how the methods are organized. Taking a step back if I look at this ConfigFileReader I think we should organize this code in the following way (tried making it close to what you already have):
class ConfigFileReader
- public init
- func config()
- private func readFile(_ path: String) -> String?
- private func parseFile(_ content: String) -> [String: ConfigFileSection]
- (helper for parseFile) private func isCommentOrEmpty(_ line: String) -> Bool
- (helper for parseFile) private func handleNewSection(_ current: ConfigFileSection?, _ info: (name: String, type: FileBasedConfigurationSectionType), _ sections: inout [String: ConfigFileSection]) -> ConfigFileSection
- (helper for parseFile) private func handleProperty(_ line: String, _ section: ConfigFileSection?, _ currentProperty: String?, _ lineNum: Int) throws -> ConfigFileSection
- (helper for parseFile) private func parseKeyValue(_ line: String) -> (String, String)
- private func parseSectionHeader(_ line: String) -> (name: String, type: FileBasedConfigurationSectionType)?
- private func cleanValue(_ value: String) -> String
- private func mergeSections(_ config: inout [String: ConfigFileSection], with credentials: [String: ConfigFileSection])
struct ConfigFileSection: FileBasedConfigurationSection
- mutating func addSubProperty(parentKey: String, key: String, value: String)
- func property(for name: FileBasedConfigurationKey) -> FileBasedConfigurationProperty?
struct ConfigFile: FileBasedConfiguration
- same as you have
struct Subsection
- Same
struct ConfigError: Error
- rename MyError
There was a problem hiding this comment.
Are you suggesting I move the helper functions and structs after the func config() ?
There was a problem hiding this comment.
Take a closer look, my suggested functions/signatures arent all the same :)
There was a problem hiding this comment.
Sorry but I cannot visualize what you're suggesting, I think I'm still lacking in experience for translating text to code structure.
There was a problem hiding this comment.
After some review I will be refactoring the func config() to make it more focused and create helper.
There was a problem hiding this comment.
That's good, also see my other suggested functions and helpers. Ex. I am suggesting to create a parseFile() function along with several helpers
There was a problem hiding this comment.
I am a little confused for the purpose of a parseFile func, is it meant to validate and store the file contents first before the other functions do their work?
| } | ||
| } | ||
|
|
||
| struct MyError: Error { |
There was a problem hiding this comment.
Change to something like ConfigError rather than MyError
There was a problem hiding this comment.
I will change it to ParsingError, thanks for reminding me about it.
| switch type { | ||
| case .profile: | ||
| sectionName = name | ||
| case .ssoSession: | ||
| sectionName = name | ||
| case .services: | ||
| sectionName = name | ||
| } |
There was a problem hiding this comment.
I don't think this switch statement is needed since in every case we do the same thing! sectionName = name
There was a problem hiding this comment.
This is another protocol requirement copy and pasted from another reader, I'll look into to it again to see if I can make any changes.
There was a problem hiding this comment.
I originally intended to use the switch to add "type" as one of the parameters in ConfigFile so that can be used to distinguish duplicate sections in the same or different files but the protocols didn't align so I went with the internal full section unique key instead. I'll remove the switch for now.
| self.configFilePath = configFilePath ?? "~/.aws/config" | ||
| self.credentialsFilePath = credentialsFilePath ?? "~/.aws/credentials" |
There was a problem hiding this comment.
SEP requires you to check for environment variables as well (e.g., AWS_CONFIG_FILE, and AWS_SHARED_CREDENTIALS_FILE)
There was a problem hiding this comment.
I implemented the check for environment variables and changed the logic for the extractFileContents function in accordance.
Sources/Core/AWSSDKConfigFileReader/Sources/AWSSDKConfigFileReader/ConfigFileReader.swift
Outdated
Show resolved
Hide resolved
| let type: String | ||
| let lines: [Substring] | ||
| } |
| } | ||
|
|
||
|
|
||
| func config() throws -> FileBasedConfigurationSectionProviding? { |
There was a problem hiding this comment.
Suggestion: Break this function down further, it's too long at 180 lines
There was a problem hiding this comment.
What would be an optimal length? Also I will end up cutting out the print statements used for debugging as a final change.
There was a problem hiding this comment.
Optimal length is subjective opinion so I can't prescribe you to fit a certain length.
There was a problem hiding this comment.
Looking into refactoring this further, will try to condense into smaller helper functions to keep the func config() more focused.
| guard let sectionName = currentSection?.name else { throw MyError("Expected a section definition") } | ||
|
|
||
| currentProperty = key | ||
| currentPropertyValue = value | ||
| currentSection?.properties[key] = value | ||
| targetDictionary[sectionName] = currentSection | ||
|
|
||
| print(" Added new property '\(key)' = '\(value)' to section '\(sectionName)'") | ||
| print(" The current section contains '\(currentSection!)'") | ||
| } | ||
|
|
||
| private func handleSubProperty(key: String, value: String, lineNumber: Int, targetDictionary: inout [String: ConfigFileSection]) throws { | ||
|
|
||
| guard let sectionName = currentSection?.name, | ||
| let currentSubSectionName = currentSubSection, | ||
| let currentProp = currentProperty else { | ||
| throw MyError("Property did not have a name in sub-property") | ||
| } | ||
|
|
||
| if key.rangeOfCharacter(from: .whitespaces) == nil { | ||
| // The logic for subproperties isolated in this function | ||
| currentSection?.properties[currentProp] = nil // Clear standard property if it has subprops | ||
| var subproperties = currentSection?.subproperties ?? [String: [String: String]]() | ||
| var subpropertyKeysAndValues = subproperties[currentSubSectionName] ?? [String: String]() | ||
| subpropertyKeysAndValues[key] = value | ||
| subproperties[currentSubSectionName] = subpropertyKeysAndValues | ||
| currentSection?.subproperties = subproperties | ||
| targetDictionary[sectionName] = currentSection | ||
| } else { | ||
| currentSection?.properties[currentProp] = nil // Clear standard property if it has subprops | ||
| var subproperties = currentSection?.subproperties ?? [String: [String: String]]() | ||
| let subpropertyKeysAndValues = subproperties[currentSubSectionName] ?? [:] | ||
| subproperties[currentSubSectionName] = subpropertyKeysAndValues | ||
| currentSection?.subproperties = subproperties | ||
| targetDictionary[sectionName] = currentSection | ||
| } | ||
| print(" Added sub-property key and value '\(key)' = '\(value)' under subsection '\(currentSubSectionName)'") | ||
| print(" The current section contains '\(currentSection!)'") | ||
| } | ||
|
|
||
| private func handleStandardKeyWithoutValue(key: String, lineNumber: Int, targetDictionary: inout [String: ConfigFileSection]) throws { | ||
| guard let sectionName = currentSection?.name else { throw MyError("Expected a section definition") } | ||
|
|
||
| currentSubSection = key | ||
| currentProperty = key | ||
| currentPropertyValue = "" | ||
| currentSection?.properties[key] = "" | ||
| targetDictionary[sectionName] = currentSection | ||
|
|
||
| print(" Added new property '\(key)' with no value to section '\(sectionName)'") | ||
| print(" Added property '\(key)' as a subsection to current section") | ||
| print(" The current section contains '\(currentSection!)'") | ||
| } | ||
|
|
||
| private func handleSubProptertyKeyWithoutValue(key: String, lineNumber: Int, targetDictionary: inout [String: ConfigFileSection]) throws { | ||
| guard key.rangeOfCharacter(from: .whitespaces) == nil else { | ||
| print("Warning: Silently ignoring subproperty key '\(key)' on line \(lineNumber) because it contains whitespace.") | ||
| return // Exit the function silently | ||
| } | ||
|
|
||
| guard let sectionName = currentSection?.name, | ||
| let currentSubSectionName = currentSubSection, | ||
| let currentProp = currentProperty else { | ||
| throw MyError("Property did not have a name in sub-property") | ||
| } | ||
|
|
||
| currentSection?.properties[currentProp] = nil // Clear standard property if it has subprops | ||
| var subproperties = currentSection?.subproperties ?? [String: [String: String]]() | ||
| var subpropertyKeysAndValues = subproperties[currentSubSectionName] ?? [String: String]() | ||
| subpropertyKeysAndValues[key] = "" | ||
| subproperties[currentSubSectionName] = subpropertyKeysAndValues | ||
| currentSection?.subproperties = subproperties | ||
| targetDictionary[sectionName] = currentSection | ||
|
|
||
| print(" Added sub-property key: '\(key)' with no value under subsection '\(currentSubSectionName)'") | ||
| print(" The current section contains '\(currentSection!)'") | ||
| } |
| self.credentialsFilePath = credentialsFilePath ?? "~/.aws/credentials" | ||
| } | ||
|
|
||
| func readAndDecodeFile(atPath path: String, fileDescription: String) -> String? { |
There was a problem hiding this comment.
nit: Better function name might be something like getFileContents, readAndDecode sounds repetitive
There was a problem hiding this comment.
I changed it to extractFileContents
| let type = String(content[..<spaceIndex]) | ||
| let name = String(content[content.index(after: spaceIndex)...].trimmingCharacters(in: .whitespaces)) |
There was a problem hiding this comment.
nit: You can just do let parts = s.split(separator: " ") instead of doing this index stuff
There was a problem hiding this comment.
The index is required, there are SEP requirements for handling comments and whitespaces, it was the only way I could think of to meet requirements and pass all tests.
There was a problem hiding this comment.
How does using a more Swift native way to cut up a string interfere with SEP requirements? I see that the spaceIndex variable isn't used anywhere else either. Do you have specific text from the SEP that makes you think this is the only way to cut up a string that satisfies SEP requirements? Feel free to send me direct quote in DM.
There was a problem hiding this comment.
I looked into it a little more, the split method does not work well with multiple white spaces between the type and name or if their is no whitespace in between them, I plan on using a better regex to handle the parsing better.
There was a problem hiding this comment.
The split method works totally fine for multiple whitespaces. If you're having trouble using it you likely need to revisit the logic of how your code parses tokens.
let text = " hello world swift "
let words = text.split(whereSeparator: \.isWhitespace)
print(words) // ["hello", "world", "swift"]Consider text to be a line in a config, this gives you all tokens with no whitespaces. Then you can proceed to process the tokens.
There was a problem hiding this comment.
The split method works totally fine for multiple whitespaces. If you're having trouble using it you likely need to revisit the logic of how your code parses tokens.
let text = " hello world swift " let words = text.split(whereSeparator: \.isWhitespace) print(words) // ["hello", "world", "swift"]Consider
textto be a line in a config, this gives you all tokens with no whitespaces. Then you can proceed to process the tokens.
But what about zero whitespaces? I tried using capture groups instead of index with a refined regex that is now static as shown below, this also works with the different parsing for Credential file sections:
`
private enum SectionHeaderParser {
static let sectionRegex: NSRegularExpression? = {
// pattern matches: [ (optional type) (name) (optional comment) ]
let pattern = #"\[\s*(?:(?<type>\S+)\s+)?(?<name>\S+)\s*(?:[#;].*)?\]"#
return try? NSRegularExpression(pattern: pattern)
}()
}
private func parseSectionHeader(from line: String) -> (name: String, type: FileBasedConfigurationSectionType)? {
guard let sectionRegex = SectionHeaderParser.sectionRegex else { return nil }
let typeAndNameRange = NSRange(line.startIndex..., in: line)
guard let sectionMatch = sectionRegex.firstMatch(in: line, range: typeAndNameRange) else {
return nil
}
// Helper to safely extract capture groups
func capture(at index: Int) -> String? {
let nsRange = sectionMatch.range(at: index)
guard nsRange.location != NSNotFound,
let swiftRange = Range(nsRange, in: line) else { return nil }
return String(line[swiftRange])
}
// Group 1: type (optional), Group 2: name
let typeStr = capture(at: 1)
let name = capture(at: 2) ?? ""
if name == "default" && typeStr == nil {
return (name: "default", type: .profile)
}
let sectionType: FileBasedConfigurationSectionType = switch typeStr {
case "profile": .profile
case "sso-session": .ssoSession
case "services": .services
default: .profile
}
return (name: name, type: sectionType)
}`
| return (name: content.trimmingCharacters(in: .whitespaces), type: .profile) | ||
| } | ||
|
|
||
| private func handleNewSectionFound(name sectionName: String, type sectionType: FileBasedConfigurationSectionType, lineNumber: Int, targetDictionary: inout [String: ConfigFileSection]) -> Bool { |
There was a problem hiding this comment.
nit: For this function and others, add newline between each argument so it's not a long single line
| // Extension to strip comments and trim whitespace from raw strings | ||
| private func cleanedValue(from rawString: String) -> String { |
There was a problem hiding this comment.
Suggestion: Instead of having a vague function name with comment that explains what it does, just use a better function name to speak for itself, something like: func stripCommentAndWhitespace(from rawString: string)
…ting current ones.
…files, reopen dictionary to locate section and property dulpicates, added new helper functions.
…locations to enable testing without reading a real file.
sichanyoo
left a comment
There was a problem hiding this comment.
General comment:
- I see that we keep track of state in 13 state variables. I know David already commented on it but I think maintaining all those state variables is hard to maintain going forward and is too confusing.
- Consider using consistent terminology, instead of mixing
processandhandleinterchangeably, just usingparsethroughout the entire config file reader should make code easier to read. - Adding some comments to functions on what line types they handle with examples should make maintenance easier.
| print("\n[DEBUG] RAW CREDS FILE CONTENTS:") | ||
| print(credsData?.isEmpty == false ? credsData! : "(Empty or Missing)") | ||
|
|
||
| print("\n[TEST DEBUG] Number of sources to process: \(sources.count)") |
There was a problem hiding this comment.
Are these print statements supposed to be left in? If so, use actual logger instance instead of just print statements.
There was a problem hiding this comment.
The print statements for for debugging purposes only, they will be removed as a final change and a logger will take the place for error tracking purposes.
There was a problem hiding this comment.
Print Statements have been removed and replaced with a logger.
| let isIndented = currentSubSection != nil && rawString.hasPrefix(" ") | ||
|
|
||
| if !isValue && !isIndented { | ||
| let safeCommentPattern = #"\s+[#;].*"# |
There was a problem hiding this comment.
Extract these in-line regex patterns into named constants.
There was a problem hiding this comment.
Added a struct "FormatContants" to hold the regex
| return ConfigFile(sections: finalResult) | ||
| } | ||
|
|
||
| private func process(line: String, into dict: inout [String: ConfigFileSection]) throws { |
There was a problem hiding this comment.
You could break this down into helper functions easily. Also, use a more descriptive function name than process.
| print("The current section contains '\(currentSection!)'") | ||
| } | ||
|
|
||
| private func processContinuation(_ rawString: String, into targetDictionary: inout [String: ConfigFileSection]) throws { |
There was a problem hiding this comment.
Avoid using the word continuation because it's part of concurrency terminology in Swift and makes the function name ambiguous
There was a problem hiding this comment.
I'll rename it "processExtension", to be more in line with it's purpose.
| for source in sources { | ||
| print("Source Type: \(source.type), Line Count: \(source.lines.count)") | ||
| var targetDict = (source.type == "config") ? configSections : credentialSections | ||
| currentSource = source.type | ||
|
|
||
| for (index, line) in source.lines.enumerated() { | ||
| currentLineNumber = index + 1 | ||
| let lineString = String(line) | ||
|
|
||
| do { | ||
| try process(line: lineString, into: &targetDict) | ||
| } catch { | ||
| handleParseError(error, line: lineString) | ||
| throw error | ||
| } | ||
| } | ||
| if source.type == "config" { configSections = targetDict } | ||
| else { credentialSections = targetDict } | ||
| } |
There was a problem hiding this comment.
Instead of reassigning, something like this might be cleaner:
func processSource(source: (type: String, lines: [Substring]), into sections: inout [String: ConfigFileSection]) {
for (index, line) in source.lines.enumerated() {
currentLineNumber = index + 1
let lineString = String(line)
try process(line: lineString, into: §ions)
}
}
// Usage
for source in sources {
if source.type == "config" {
processSource(source: source, into: &configSections)
} else {
processSource(source: source, into: &credentialSections)
}
}
There was a problem hiding this comment.
I agree, this was cleaner, I had it fully implemented and even add a do/catch to both the function and for loop in order to ensure that parsing is stopped for specific thrown errors.
…ponsibilty and created other functions to do the same, added comments with examples for function readability and maintainability, reduced state variables from 13 to 7 and replaced print statements with logger.
…profiles when needed
…r version if necessary.
| }() | ||
| } | ||
|
|
||
| // MARK: - Initialization |
There was a problem hiding this comment.
nit: we dont need all these marks, they are introduced by AI in order to compensate for code composition as files get too big
There was a problem hiding this comment.
We as in Swift? I sometimes find them in other parts of our and others SDK.
| /// 5. Apply precedence rules (explicit [profile default] overrides [default]) | ||
| /// | ||
| /// Returns: FileBasedConfiguration with all parsed sections and properties | ||
| func config() throws -> FileBasedConfigurationSectionProviding? { |
There was a problem hiding this comment.
It is not a good idea to define nested functions as they hide functionality inside of a method definition and prevents proper testing
|
Overall feedback: Everything seems to be working and we have a lot of tests but the config file reader itself needs a code design document and review. We are keeping track of too much state and modifying them in-place in combination with some less maintainable practices like nested functions. I think we can also re-write this functionality with much less code. Almost there though -- good progress getting all the tests passing. |
…ne of file configuration.
…ys, added a JSON file in advance to be used for more comprehensive location testing.
Issue #
N/A
Description of changes
The purpose is to create more consistency and management with a Swift written Config File Reader.
New/existing dependencies impact assessment, if applicable
No new dependencies were added to this change.
Conventional Commits
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.