Description
This is a continuation of an issue in endpoint. Right now the default behavior of yaml.dump() in scripts/generators/ecs_helpers.py is to escape Unicode characters, but the escaping appears to be platform dependent, meaning that the registered trademark character (®) will get escaped in one of two ways:
example: "Microsoft\xAE Windows\xAE Operating System" # this is the unicode codepoint
or:
example: "Microsoft\xC2\xAE Windows\xC2\xAE Operating System" # this is the UTF-8 value
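For reference, the two escaped forms above correspond to the code point of ® (U+00AE) and to its UTF-8 encoding (the bytes C2 AE). A small sketch using only the standard library shows where each form comes from; `hex_escape` is a hypothetical helper written for this illustration, not something in ecs_helpers.py:

```python
# The registered trademark sign used in the examples above.
REGISTERED = "\u00ae"

# Its Unicode code point is U+00AE, which fits in a single \xAE escape.
code_point = ord(REGISTERED)

# Its UTF-8 encoding is the two bytes 0xC2 0xAE.
utf8_bytes = REGISTERED.encode("utf-8")


def hex_escape(values):
    """Render an iterable of integers as upper-case \\xNN escapes."""
    return "".join("\\x{:02X}".format(v) for v in values)


code_point_form = hex_escape([code_point])
utf8_form = hex_escape(utf8_bytes)

print(code_point_form)
print(utf8_form)
```

The first form describes one character; the second describes two bytes that only become ® once they are decoded as UTF-8.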
This results in massive, needless diffs after running the update scripts, depending on who is running them at any given time.
I'm not sure if there's a reason why we're escaping Unicode here; I assume it's just because it's the default behavior.
If we change it to this:
yaml.dump(data, outfile, default_flow_style=False, allow_unicode=True)
We'll get this:
example: Microsoft® Windows® Operating System
Which solves the problem.
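A minimal sketch of the difference, assuming PyYAML on Python 3 and dumping to a string rather than to `outfile` (the `data` dict here is made up for illustration):

```python
import yaml  # PyYAML, assumed to be installed

# Hypothetical field data containing the registered trademark sign.
data = {"example": "Microsoft\u00ae Windows\u00ae Operating System"}

# Default behavior: non-ASCII characters are written as escapes
# inside a double-quoted scalar.
escaped = yaml.dump(data, default_flow_style=False)

# Proposed behavior: non-ASCII characters are written as-is.
readable = yaml.dump(data, default_flow_style=False, allow_unicode=True)

print(escaped, end="")
print(readable, end="")

# Both forms load back to the same data, so only the file text changes.
assert yaml.safe_load(escaped) == data
assert yaml.safe_load(readable) == data
```

One thing to check if we go this way: when dumping to a file, the file would presumably need to be opened with an explicit UTF-8 encoding (for example `open(path, "w", encoding="utf-8")`), otherwise the output could again depend on the platform's default encoding.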