Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Event stream POC testing with EngFlow UI #1

Closed
wants to merge 7,688 commits into from

Conversation

nlopezgi
Copy link

@nlopezgi nlopezgi commented Dec 4, 2024

  • Builds up on [WIP] Bazel Build Event Stream support facebook/buck2#811
  • Add hack for tls cert for opal
  • Hardcode project id in requests for default tenant
  • Add lifecycle events for BuildStarted, InvocationAttempt started and InvocationAttemptFInished, BuildFinished
  • Mangle target names from root//foo:bar to //foo:bar
  • Up next: figure out missed key mismatches on UI, probably a side effect of the target name mangling.

stepancheg and others added 30 commits October 19, 2024 20:18
Summary: Port of `shed/provider` or `StarlarkValue::provide` to DICE. Allows generic downcast in DICE. Used in D64078672.

Reviewed By: cjhopman

Differential Revision: D64078670

fbshipit-source-id: 550f5b480eac21e50a3946617e17956bd0ab395f
Summary:
Remove dependency on `buck2_action_impl` from `buck2_build_signals_impl`.

This only works for simple keys, which are not looked into in `buck2_build_signals_impl` like

https://www.internalfb.com/code/fbsource/[346e902275fddbc6f3338da0533be33b90789324]/fbcode/buck2/app/buck2_build_signals_impl/src/lib.rs?lines=588-593

The goal is, as usual, faster incremental compilation (e.g. `arc rust-check`).

Reviewed By: JakobDegen

Differential Revision: D64078672

fbshipit-source-id: 5779f698ef65963f2088c93c559fe7977e9bb80f
Summary: Make sure no one uses this

Reviewed By: stepancheg

Differential Revision: D64347934

fbshipit-source-id: 1fca4163e0ea27877a3fdc0964e63b3c95161998
Summary: So that a future diff can output additional things

Reviewed By: dtolnay

Differential Revision: D64365598

fbshipit-source-id: 0c095c473f2c808dd184b0596fe751b5c223ae39
Reviewed By: JakobDegen

Differential Revision: D64078930

fbshipit-source-id: 32b7ad0ec29613be7f17af9d941c507a0db0b5d1
Summary: For code navigation: `.into()` are quick to write, but hard to see where `BuildSignal` variants are constructed.

Reviewed By: IanChilds

Differential Revision: D64080199

fbshipit-source-id: decad6bc752140c3e282b1e3e3b2bb87197046da
Summary: Because digest "kind" is already a term that exists elsewhere with different meaning

Reviewed By: Will-MingLun-Li

Differential Revision: D64662609

fbshipit-source-id: efc17958170b7b8263d572d578fd982ecbc963a2
Summary: By requiring sha1. I'm fairly sure there's a bug here and this isn't actually the right fix, but I'll need more time to get that figured out so let's unbreak the tests in the meantime

Reviewed By: Will-MingLun-Li

Differential Revision: D64662608

fbshipit-source-id: 4d225d61eae0b11ca9888b13f933c3332cb6f116
Summary:
And stop passing the flag.

To be commited only after the previous diff is released

Reviewed By: IanChilds

Differential Revision: D64365601

fbshipit-source-id: c68b7c1cfdc8582a732c5207581cbb89b02dffac
Summary: To be committed after the previous diff is released

Reviewed By: IanChilds

Differential Revision: D64365600

fbshipit-source-id: a5b7b815f02d6f462933cde300b5fa9689447aa1
Summary: So that we don't have to shell out for it. This saves one buck invocation which, as much as I would like it to be, is not free.

Reviewed By: IanChilds

Differential Revision: D64365599

fbshipit-source-id: 08baaa1fc913cf8ea15ec54e87edc895a79e88e8
Summary: Add a new config to specify when we should warn the user. This will currently warn based on a percentage completion which will handle small and large builds. Follow up diffs will add a 'min action count' that should allow us to warn earlier for large builds.

Reviewed By: ezgicicek

Differential Revision: D64461675

fbshipit-source-id: bfd46c25bf60fdb198dcc976125d74431cf5b90e
Summary: Move the dice state to event_observer. This will allow us to estimate the progress in simple console. For now, the estimation will help us threshold when the cache warning will be shown.

Reviewed By: ezgicicek

Differential Revision: D64462965

fbshipit-source-id: 2c0cab247da46bd86c0f2f7f25349800ea7ae06e
Summary: Refactor to avoid code duplication + these BuckEvents will be used in new tests.

Reviewed By: ezgicicek

Differential Revision: D64596709

fbshipit-source-id: eca73e2124bac5056840944568ec4f9998d7e16d
Summary: Estimate the completion percentage for BuildKey to show cache warning. The BuildKey doesn't have a 1:1 correspondence with the action count. However, using the DICE context as an indicator of completion progress should serve the purpose.

Reviewed By: ezgicicek

Differential Revision: D64462964

fbshipit-source-id: dbe06e727289defcbd3e08b32841fa1d866943e6
Summary: Warn about cache misses after a progress threshold is crossed.

Reviewed By: ezgicicek

Differential Revision: D64462963

fbshipit-source-id: 6b77d8f6770f9fb21da1cf29e16e848da09b98f5
Summary: Use the action count as an additional hueristic to display cache miss warning. This will help us warn earlier for larger builds where 10% completion is already quite late.

Reviewed By: ezgicicek

Differential Revision: D64478261

fbshipit-source-id: 3793ede5975d24162ce3e7741fd24c79568e4fac
Summary: For large builds, 10% completion might be too late. Setting an action count based threshold.

Reviewed By: ezgicicek

Differential Revision: D64478260

fbshipit-source-id: 61c623f13814a0cc334284f5367d81c0ded66689
Summary:
We have a validation pass that looks for things like `lambda a,a:a` and turns it into an error. Previously this looked for the top-level expression directly inside a statement. That means it missed things like `lambda: lambda a,a:a`. Easily fixed by just recursing in the visit_expr call. Our validation is assumed to have happened by the compiler, so that code would lead to:

```
assertion failed: old_local.is_none()
```

Caught by the OSS Fuzzer that we have set up.

Reviewed By: IanChilds

Differential Revision: D64663606

fbshipit-source-id: 13c71abb4609be75f8879dfdf9eb2706c53d6dd1
Summary: This is a clean-room reimplementation of this module by Stiopa. The previous implementation was derived from some Bazel thing which did not have the right licenses to be included in the prelude

Reviewed By: ndmitchell

Differential Revision: D64676495

fbshipit-source-id: 4ef1bb9b3cd6cfaba72bf2b82986c730412fd999
Summary:
Store RE use case in event log. This allows to check use case in tests.

Given these are short strings and there should be very few unique ones in single build, compressed event log size shouldn't be affected much.

Reviewed By: IanChilds

Differential Revision: D64601541

fbshipit-source-id: 8e8bceb5a1e3c974aaeed3305ec54b5af80c06c7
Summary: This seems to be executed locally by CI (https://fburl.com/daiquery/pn50b32e). Let's enable local uploads so that we can get 100% cache hit rate.

Reviewed By: IanChilds

Differential Revision: D64606089

fbshipit-source-id: c7723be31400a32ed12749cefa385bdc2dda7ab5
Summary: Change the meaning of estimated_completion_percent. A None value would imply we don't want thresholding based on completion.

Reviewed By: ezgicicek

Differential Revision: D64688946

fbshipit-source-id: 63df5cf84c2aa19604f076eb88402452aba32afd
Summary: Tag builds with low cache misses at completion. This will ensure that the UI will use the values consistent with what the CLI shows.

Reviewed By: ezgicicek

Differential Revision: D64689265

fbshipit-source-id: b2fb47c150545596572d1ce0277d045c08cf7b5b
Summary:
Got an error when testing build of facebookincubator/phabtest_rust. Adding this
target to the shims seems to fix it.

Reviewed By: bigfootjon

Differential Revision: D64618954

fbshipit-source-id: 4ea6b868647ea450a330b82b51eb94707ffa40cb
Summary: Add test for `buck2_re_client.override_use_case`.

Reviewed By: IanChilds, samkevich

Differential Revision: D64602146

fbshipit-source-id: d15012b5e0fcf5d686246255ec01c0ca931a2da2
Summary:
Previously, we assumed that any test target that occurred in `test_deps` was a generated unittest that had a name of the form `foo-unittest`. This isn't always true, and led to us discarding test dependencies that had other names.

Instead, be stricter about target names, so other test dependencies end up in `standalone_tests` as expected.

Reviewed By: darichey

Differential Revision: D64634915

fbshipit-source-id: 4cdbd2a766baa9515b13c68c415a141d4cf0efe6
Summary:
The fat-jar process does merge all dependencies by opening each entry and deflating it again. Thats specially expensive when using sidecar, given the content can be up to 12GB of shared libs.

The idea here is to use the sidecar, as a base jar, and increment a copy of the jar instead of repacking it from scrach at each stage of the jar generation.

 {F1920027649}

Once sidecar_lib_cosco-* is cached, all subsequent builds will only append to a copy instead of zip at each build

Reviewed By: IanChilds

Differential Revision: D64146985

fbshipit-source-id: 22898c599117bc6337ac3a19806f84ec0405bd63
Summary:
"The evaluation of this key was cancelled" errors seem to indicate a bug in DICE and shouldn't normally be surfaced to users.

Adding enum values just to give some indication of roughly where in DICE the cancellation originated.

Reviewed By: cjhopman

Differential Revision: D64707988

fbshipit-source-id: e171157cf48dc16ad4439fe9a00207fc96395589
Summary: .

Reviewed By: stepancheg

Differential Revision: D64571901

fbshipit-source-id: 9c38e9ece2d0c9b13b3174d5eed658e31f3b1f3f
Ian Childs and others added 26 commits October 30, 2024 15:08
Summary:
We do have `allow_cache_upload` on `cxx_library` but it's used for the compile step.

I think it's reasonable to have a separate attribute for the `archive` action, and that's what this introduces.

Reviewed By: lty1308

Differential Revision: D65137779

fbshipit-source-id: b58524b8a66a32ef84cdcd395eac6cd4f5942e37
Summary:
When focused debugging is enabled we "scrub" the binary and filter the `ArtifactTSet` to only include
a subset of target's debug artifacts. When we are done filtering we don't rebuild our tset, we make a single node that has all the children of all
the targets in the apple_binary.

This causes downstream failures to `debug_artifacts_validators` because the TSet not only looks different, but, the corresponding `.label`
to the artifacts has changed, which requires users to maintain multiple "allowlists" based on the build config.

Instead, preserve the entire `ArtifactInfo` for debug info we want to keep. This makes the `label` corresponding to debug artifacts stay the same
whether we are in focused debugging or not.

Reviewed By: chatura-atapattu

Differential Revision: D65238543

fbshipit-source-id: 60d9a6bb450e200369eab5bbcc017e3e90bd5b2d
…facebook#801)

Summary:
Pull Request resolved: facebook#801

I'm running into a situation where I want to resolve the config for
the root cells while avoiding data fetching for external cells.

This refactors the audit command so we don't iterate all the cells by default,
and instead only when --all-cells is set.

Reviewed By: JakobDegen

Differential Revision: D65218557

fbshipit-source-id: c0c32857199ffd3857f047ee6bc994efbb46f222
Summary: Similarly to `constraint_overrides`, this config value will store a list of all supported target platforms.

Reviewed By: mojsarn

Differential Revision: D65060011

fbshipit-source-id: 40680012eb069ef1713fd3e51805099384346b38
Summary: Fix `apple.incremental_bundling_enabled` buckconfig not working correctly.

Reviewed By: manicaesar

Differential Revision: D65267646

fbshipit-source-id: 4c6e09994bed79d4054ce386148c4bbc69413fc2
Summary: Ignore changes to `buck2_re_client.override_use_case` for legacy config state in DICE.

Reviewed By: JakobDegen

Differential Revision: D64477014

fbshipit-source-id: ad909caee5211992150e0b23e0304b85bcf2bfc2
Summary: Some dependencies (such as stubs) should only ever be used at compile-time and should never be packaged into the final binary. This adds a mechanism to label targets as such and prevent them being added incorrectly.

Reviewed By: navidqar

Differential Revision: D65268897

fbshipit-source-id: 1a19135bd6fc23ed0f798d46b1ad533a662e4855
Summary: This is not necessary as of D62954867

Reviewed By: IanChilds

Differential Revision: D65273613

fbshipit-source-id: 874b3d92eb9408cd4ef63771bb9d70c5d84931c8
Summary: Not needed, they are already added because we have `with_inputs = True` on the `.proto` file that we create.

Reviewed By: navidqar

Differential Revision: D65275230

fbshipit-source-id: 4032f0066ad75a1f44d65e968c943923c0ed4a83
Summary:
In order to determine category from the most interesting error while retaining the current behavior that Tier 0 errors always take precedence over input errors.

See last diff for motivation for using error ranking to determine error category.

There shouldn't be cases where an input error should be considered 'more interesting' than a tier0 tag, if there are the tier0 tag should be changed to the 'unspecified' tier.

This also changes all environment errors to be ranked higher than tier0, since these are all determined by extra checking done on top of tier0 errors, the environment categorization should take precedence.

Reviewed By: Will-MingLun-Li

Differential Revision: D65092671

fbshipit-source-id: 5c9a4f6ce2ab50349e7759daf4f014d3a16b0c17
Summary:
This changes the vendored json module to be a proper target.
This is necessary, since in some places where toolchain applications using json were included in release or otherwise packaged, the json module would be missing. This solves the issue and simplifies internals removing the need to prepend the utility modules path all over the place.

This is first part - that we need to sync back before progressing and using the new application for testing

Reviewed By: TheGeorge

Differential Revision: D65274428

fbshipit-source-id: a92a1dc38cddbaa708093e9a5f8c72c5294b71cb
Summary: Creating a simulator shouldn't use the same code as listing available simulators- calling the simulator manager with "list" returns a list, while calling it with "create" returns a single object, so the output needs to be handled accordingly.

Differential Revision: D65167288

fbshipit-source-id: 91c531b84a54eddf583b2b23e4c1fd07f8374861
Summary:
For errors with no tags we currently ignore the tier when determining the best error. This is fine when using the best error to determine the best tag, but it's a problem if we are trying to determine the tier of the best error.

This should function as if the tier on an error was a tag ranked at the bottom of the list of tags for that tier, like this:

```
env tags
tier0 tags
'environment' tier
'tier0' tier
input tags
'input' tier
unspecified
no tier
```

Although the tier0 tier is ranked below the environment tier, tier0 and environment errors aren't offset so it's still possible for tags of each type to take precedence over the other.

This would be easier to reason about if we exclusively derived tiers from tags and removed the ability to set tier independently. This seems potentially worth doing later.

Reviewed By: Will-MingLun-Li

Differential Revision: D65092673

fbshipit-source-id: 45d3d30535fb8de55727b85b7fb7294b9c4de663
Summary:
The fact that the error_category reported to scuba may not match the category of the 'best error' (basis for best_error_tag and best_error_category_key) is confusing, especially with the introduction of environment errors. Taking category from the best_error is much simpler to reason about.

Also, the existing `error_category()` prevents us from excluding errors with environment tags from UBE-0 if they also have any tier0 tags. We could make the tier for those tier0 tags 'unspecified', but it seems better to leave them tier0 if an environment tag isn't present.

Reviewed By: Will-MingLun-Li

Differential Revision: D65092672

fbshipit-source-id: 221157dc2608a5856e32d32042d030228558cf2a
Summary: Started seeing a large volume of these errors this week, unfortunately this error message is part of a sapling crash and stack trace dump, but in any case we should not consider these Tier 0 errors.

Reviewed By: Will-MingLun-Li

Differential Revision: D65285526

fbshipit-source-id: 7fdf187d80f89b0919379baf13a7253dddf8e4c0
Summary:
# Background
For internal static doc, our doc is prefix with `https://www.internalfb.com/intern/staticdocs/buck2/`. Since we use `<a href = /path/to/type`, it will show 404 for it
 {F1950799844}

# This diff
Based on the Docusaurus's doc, it recommend using `<Link>` tag.
https://docusaurus.io/docs/api/misc/docusaurus/eslint-plugin/no-html-links
It needs us `import Link from 'docusaurus/Link`, so we need use `.mdx` instead of `.md` here.

According to the doc: https://docusaurus.io/docs/docusaurus-core#link, `Link` will apply base url for it. It will solve our problem
 {F1950829794}

Reviewed By: JakobDegen

Differential Revision: D65288103

fbshipit-source-id: ffb9b8df3f23b56fd9ff01ea9b3a3777bb6b2098
Summary:
In other repos, this file is not guaranteed to be at `fbcode/buck2/cfg/experimental/set_cfg_modifiers.bzl`.

For example, in WAIPhone, it is at `tools/buck2/fbcode/buck2/cfg/experimental/set_cfg_modifiers.bzl` and this fails the check.

```
    5: Traceback (most recent call last):
         * tools/buck2/prelude/PACKAGE:15, in <module>
             set_cfg_modifiers(cfg_modifiers = [
         * tools/buck2/fbcode/buck2/cfg/experimental/set_cfg_modifiers.bzl:34, in set_cfg_modifiers
             _set_cfg_modifiers(cfg_modifiers, extra_cfg_modifiers_per_rule)
         * tools/buck2/prelude/cfg/modifier/set_cfg_modifiers.bzl:42, in set_cfg_modifiers
             fail("set_cfg_modifiers is only allowed to be used from a PACKAGE or BUCK_TRE...
       error: fail: set_cfg_modifiers is only allowed to be used from a PACKAGE or BUCK_TREE file, not a bzl file.
         --> tools/buck2/prelude/cfg/modifier/set_cfg_modifiers.bzl:42:13
          |
       42 |             fail("set_cfg_modifiers is only allowed to be used from a PACKAGE or BUCK_TREE file, not a bzl file.")
          |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          |
```

Since this file must be in the `fbcode` cell (due to loads elsewhere), assume that the file path ends with the path.

Reviewed By: JakobDegen

Differential Revision: D65290874

fbshipit-source-id: 3dd64f27989a9789b30ae56a6492d1571c4ce9c0
Summary:
Underlying issue is fixed by D65280978
re-attemp landing

Original attempt: D65074404

Reviewed By: jiawei-lyu

Differential Revision: D65283258

fbshipit-source-id: 3f78ca49361770fe0424ff3f8790e6d0029afd94
Summary: Will be used in the next diff so there is no dependency of log commands to repo

Reviewed By: iguridi

Differential Revision: D64153503

fbshipit-source-id: 12d8d4ed2061c92995b7614cf5212efcf8a6673d
Summary: Log commands don't have any InvocationRecord generated.  I would like to track details about who/how invokes log commands. Let's ensure they are written to buck2_builds table with no event log option (so scrapped from scribe?).

Reviewed By: iguridi

Differential Revision: D64044944

fbshipit-source-id: d4ec4a9fe3cc12224f8344dea218abe96a49223c
Provide a BuckEvent Publisher service which emits BuckEvents to an
external server implementation.

Closes facebook#226
Repurposes the RemoteEventSink used to connect to Scribe within Meta internal Buck2 and connects
it to a Bazel Build Event Service over gRPC instead. Adds a translation layer between Buck2
build events and Bazel build events.
BuildBuddy uses pattern expand messages to collect the set of targets to report on.
Buck2 does not have a direct correspondance to the target completed message, for now we collect all
targets for which actions are completed and then emit a series of target completed events before we
close the stream.
Stdout and stderr are only included inline in Buck2 and not as CAS items, so we forward them as
inline data.
Failure details are used by BuildBuddy to display build errors in the UI, so we emit these.
@nlopezgi nlopezgi changed the title Nlopezgi/event stream Event stream POC testing with EngFlow UI Dec 4, 2024
@nlopezgi
Copy link
Author

nlopezgi commented Dec 4, 2024

Oops, thought this would be sent to my fork. Closing.

@nlopezgi nlopezgi closed this Dec 4, 2024
aherrmann pushed a commit that referenced this pull request Dec 19, 2024
…cross all host platforms

Summary:
### Motivation

My team has a concrete need for buck to generate 100% matching zip files for the same sets of inputs on all host platforms (macOS, Linux, Windows). Current limitations:
1. File order can be different on file system with different case sensitivity.
2. Windows can't write correct posix mode (i.e. permissions) for any entries.

Although the entries themselves might fully match, those discrepancies result in different metadata, which results in a different zip file.

See D67149264 for an in-depth explanation of the use case that requires this level of determinism.

### Tentative solution #1

In D66386385, I made it so the asset generation rule was only executable from Linux. Paired with buck cross builds, it made so that outputs from macOS and Linux matched, but did not work on Windows [due to some lower level buck problem](https://fb.workplace.com/groups/930797200910874/posts/1548299102494011) (still unresolved).

### Tentative solution facebook#2

In D66404381, I wrote my own Python script to create zip files. I got all the files and metadata to match everywhere, but I could not get around differences in the compression results. Decided not to pursue because compression is important for file size.

###  Tentative solution facebook#3

In D67149264, I duplicated and tweaked buck's zip binary. It did work, but IanChilds rightfully pointed out that I'd be making maintenance on those libraries more difficult and that the team is even planning on deleting those, at some point.

### Tentative solution facebook#4 (this diff!)

IanChilds advised me to try to fix buck itself to produce consistent results, so this is me giving it a try.

Because the root problem could not have been done in a backwards compatible way (the file permissions, specifically; see inlined comment), I decided to use an argument to control whether the zip tool should strive to produce a deterministic output or not, at the expense of some loss of metadata. The changes are simple and backwards compatible, but any feedback on the root problem, idea and execution are welcome.

Reviewed By: christolliday

Differential Revision: D67301945

fbshipit-source-id: c42ef7a52efd235b43509337913d905bcbaf3782
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.