-
Notifications
You must be signed in to change notification settings - Fork 0
Feature/File Type Conversion/XML to JSON #47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
import json | ||
import xmltodict | ||
import os | ||
import logging | ||
|
||
def xml_to_json( | ||
xml_path: str, | ||
output_path: str = None | ||
): | ||
""" | ||
Transform an XML file into a JSON format. | ||
""" | ||
|
||
# Open the input XML file and read data in form of python dictionary using the "xmltodict" module. | ||
try: | ||
with open(xml_path) as xml_file: | ||
data_dict = xmltodict.parse(xml_file.read()) | ||
except FileNotFoundError as error: | ||
logging.error(f"Error: {str(error)} (XML file not found)") | ||
except Exception as error: | ||
logging.error(f"Error: {str(error)}") | ||
|
||
# Generate the json_data object using json.dumps(). | ||
json_data = json.dumps(data_dict) | ||
|
||
# Check if output_path is provided, otherwise set it to a default value (original XML folder path). | ||
if output_path is None: | ||
output_path = os.path.dirname(xml_path) | ||
output_directory = f'{output_path}/json' | ||
else: | ||
output_directory = f'{output_path}/json' | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If an output path is explicitly provided, will we still need to add the JSON subfolder? |
||
|
||
# Set the name of the json file to the name of the XML file provided. | ||
file_name = os.path.splitext(os.path.basename(xml_path))[0] | ||
|
||
# Create the output directory if it doesn't exist. | ||
if not os.path.exists(output_directory): | ||
os.makedirs(output_directory) | ||
|
||
# Write the contents of the JSON file into the output folder path. | ||
file_path = os.path.join(output_directory, f'{file_name}.json') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same question here. Should a non-empty |
||
try: | ||
with open(file_path, "w") as json_file: | ||
json_file.write(json_data) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not very familiar with XML. Will this write out rows that are bounded by brackets ( |
||
except Exception as error: | ||
logging.error(f"Error: {str(error)}") | ||
|
||
return json_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question about the naming conventions. Because we already have a
translate_csv_file_to_jsonl()
method in thejsonl
callable submodule, I'd argue we should put it there.