-
Notifications
You must be signed in to change notification settings - Fork 245
Minor patches for the C code in Merlin #1998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Oh wow! I think it deserve a Changelog entry ! |
Ah yes, I think this patch will fix the warning encountered by the user; but the build failure is unrelated. This patch series doesn't apply cleanly to v4.7-413, unfortunately. The opam files for these old merlin version are missing constraints on menhir. # syntax=docker/dockerfile:1
FROM ocaml/opam:ubuntu-25.04-ocaml-4.13
RUN sudo ln -sf /usr/bin/opam-2.4 /usr/bin/opam && opam init --reinit -ni
ADD --keep-git-dir --link --chown=1000:1000 https://github.com/ocaml/merlin.git#v4.7-413 /home/opam/merlin
RUN git config --global --add safe.directory ~/merlin
WORKDIR /home/opam/merlin
RUN --mount=type=cache,target=/home/opam/.opam/download-cache,sharing=locked,uid=1000,gid=1000 \
opam install -y --deps-only --with-dev --formula='"menhir" {<= "20220210"}' merlin.4.7-413
RUN patch -p1 <<'EOF'
diff --git a/src/frontend/ocamlmerlin/dune b/src/frontend/ocamlmerlin/dune
index faa01b93..cc549032 100644
--- a/src/frontend/ocamlmerlin/dune
+++ b/src/frontend/ocamlmerlin/dune
@@ -14,7 +14,7 @@
(modules (:standard \ gen_ccflags))
(libraries config yojson merlin_analysis merlin_kernel
merlin_utils os_ipc ocaml_parsing query_protocol query_commands
- ocaml_typing ocaml_utils seq))
+ ocaml_typing ocaml_utils))
(executable
(name gen_ccflags)
EOF
ADD --chown=1000 --link https://github.com/ocaml/merlin/pull/1998.diff 1998.diff
RUN patch -p1 < 1998.diff
RUN opam exec -- make -j1 |
Could you do that for me? 🙏 |
a45515a to
600f43e
Compare
|
Marking as draft because the CI hasn't run, and there are errors with my patches on Windows. I'll circle back to this. |
18a16e8 to
94dba5b
Compare
ocamlmerlin.c|
Ready again! |
94dba5b to
4a3dd2f
Compare
4a3dd2f to
5838f52
Compare
|
Hum, something went wrong with your rebase :-) |
5838f52 to
4e51ecb
Compare
Fixed.
There were two issues on Windows, but I confirm that the test suite passes with both MSVC and mingw-w64. |
4e51ecb to
79d92bc
Compare
|
I've added more explanations to some of the commit messages. |
On Windows, use [GetTempPath2][] if available or [GetTempPath][] to retrieve the temporary directory path. On other systems, follow the [XDG Base Directory Specification][XDG] and put the socket file in the XDG runtime directory. It is indicated by the `XDG_RUNTIME_DIR` environment variable, that is always set on platforms implementing the spec. Otherwise, fallback to `TMPDIR` or `/tmp`. [GetTempPath2]: https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-gettemppath2a [GetTempPath]: https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-gettemppatha [XDG]: https://specifications.freedesktop.org/basedir/latest/
In C, arrays passed as arguments to functions decay into pointers, and
the size information is lost. This means all of these are equivalent:
void f(char *s);
void f(char s[]);
void f(char s[3615]);
One way to partially retain this information is to use `static n` with
`n` being the array size: the compiler now knows that the first `n`
elements after the pointed address can be accessed. This C99 feature
is unfortunately not supported by MSVC (but supported by clang-cl).
The compiler uses this info to check that the caller supplies a long
enough buffer to the callee. The size information is not used inside
the callee.
See [Detect if the static keyword is supported in array type
derivations][1] for a rationale behind the `ATLEAST` naming.
[1]: https://lists.gnu.org/archive/html/autoconf/2025-10/msg00004.html
- make the function static; - don't use K&R-style declaration, compilers now reject it before C23. After C23 `f(void)` is similar to `f()`.
Quoting POSIX 2024: > The size of sun_path is required to be constant, but intentionally > does not have a specified name for that constant. Historically, > different implementations used different sizes. For example, 4.3 BSD > used a size of 108, and 4.4 BSD used a size of 104. Since most > implementations originate from BSD versions, the size is typically > in the range 92 to 108. An application can deduce the size by using > `sizeof(((struct sockaddr_un *)0)->sun_path)`.
See also [`sys/un.h`][1] and [unix(7)][2]. [1]: https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/sys_un.h.html [2]: https://man7.org/linux/man-pages/man7/unix.7.html
to count the number of elements in an array. See also: - https://en.cppreference.com/w/c/preprocessor/include.html - https://manpages.debian.org/testing/manpages-dev/countof.3.en.html
`perror` can be used with functions from the CRT that set `errno`, otherwise `GetLastError` and `FormatMessage` have to be used instead.
_snprintf doesn't null-terminate strings. MSVCRT doens't provide snprintf, but if __USE_MINGW_ANSI_STDIO is defined, then mingw-w64 will provide this function. There are environments where mingw-w64 is configured with MSVCRT, and others where it is configured with UCRT. It seems safer to use the compatibility shims. UCRT provides snprintf, and MSVC defaults to UCRT.
Avoid possible truncation when preparing the invocation by using a bigger buffer. > In the Windows API the maximum length for a path is MAX_PATH, which > is defined as 260 characters. https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation > `lpCommandLine` > > The command line to be executed. > The maximum length of this string is 32,767 characters, including > the Unicode terminating null character. If `lpApplicationName` is > `NULL`, the module name portion of `lpCommandLine` is limited to > `MAX_PATH` characters. https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessw https://devblogs.microsoft.com/oldnewthing/20031210-00/?p=41553
79d92bc to
5b18dcc
Compare
It works with mingw-w64 too.
warning: '__p__environ' redeclared without dllimport attribute: previous dllimport ignored [-Wattributes]
https://gcc.gnu.org/onlinedocs/gcc/Microsoft-Windows-Variable-Attributes.html#index-dllimport-variable-attribute
To ensure the ancillary data buffer is suitably aligned, the historical technique is to use a union with a backing char buffer and a member having the desired alignment, as the alignment of the union is the strictest alignment of its members. Another possibility is to use the C11 alignas specifier. Keep the old technique for compatibility. Proper alignment should be ensured when creating the structure and when reading it back from the kernel. > `CMSG_DATA()` > returns a pointer to the data portion of a cmsghdr. The pointer > returned cannot be assumed to be suitably aligned for accessing > arbitrary payload data types. Applications should not cast it to > a pointer type matching the payload, but should instead use > `memcpy(3)` to copy data to or from a suitably declared object. See also: - https://man7.org/linux/man-pages/man7/unix.7.html - https://man7.org/linux/man-pages/man3/cmsg.3.html - https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/sys_socket.h.html - https://docs.kernel.org/core-api/unaligned-memory-access.html#why-unaligned-access-is-bad
5b18dcc to
fe0a3f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We reviewed this with Antonin and I am confident that this is a very nice improvement to our C frontend. The main features are the correction of all warnings, and improved Windows compatibility.
Thanks a lot @MisterDA !
I did a review of
ocamlmerlin.c. I found minor and relatively major issues:Erroneous size computation of the
struct sockaddr_un. On some BSD systems thestruct sockaddrfamily of types have a leadingsxx_lenfield. It was not taken into account. Useoffsetof, as described by POSIX, to compute the socklen.https://man.freebsd.org/cgi/man.cgi?unix(4)
https://man7.org/linux/man-pages/man7/unix.7.html
https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/sys_un.h.html
Functions from WinAPI don't set
errno, and their failure code can be retrieved withGetLastErrorand the error string withFormatMessage. Functions from the CRT do seterrno, andperrorcan be used to display the error string.Structures for passing file descriptor using CMSG and Unix domain sockets were not properly aligned, potentially leading to inefficient or buggy behavior.
There's also a few patches that improve the code quality, in my opinion. The code now assumes a C99 compiler, and makes use of C23 and C2y features or extensions, if available. For reference, the OCaml runtime requires at least C11.