NAS-135620 / 25.10 / Simplify middleware audit startup #16371

Open: wants to merge 3 commits into master
1 change: 1 addition & 0 deletions docs/source/middleware/index.rst
@@ -6,6 +6,7 @@ Middleware Daemon
    :caption: Contents:

    development.rst
+   state.rst
    process_pool.rst
    jobs.rst
    roles.rst
63 changes: 63 additions & 0 deletions docs/source/middleware/state.rst
@@ -0,0 +1,63 @@
Middleware State Directories
############################

.. contents:: Table of Contents
:depth: 4

The middlewared process stores state in several directories on the local filesystem. There
are many situations in which a developer may opt to store state information outside of the
configuration database. In these cases the onus is on the developer to choose an appropriate
location for the state based on its persistence requirements. The following is a brief introduction
to how volatile and persistent state are stored. The most up-to-date definitions for storage
locations are in the truenas/middleware repository and, for datasets created during installation
and upgrade, in truenas/scale-build.


Volatile state
**************

Volatile middleware state is stored in the middleware run directory `/var/run/middleware`, which
is defined by `MIDDLEWARE_RUN_DIR` in `middlewared/utils`. The expected permissions on this
directory are 0o755. This is typically where sentinel files should be placed.


Persistent state
****************

There are several directories that are used to store persistent state related to the middlewared
process and TrueNAS servers.

`/conf` -- this ZFS dataset is read-only and contains configuration that is not expected to change at runtime.
Examples include our audit rulesets and metadata about the boot pool recorded when it is installed.
This dataset is not cloned during the upgrade process and its contents are not preserved as part of configuration
backups.

`/data` -- this ZFS dataset contains the TrueNAS configuration file `freenas-v1.db` and various install-specific
configuration files that must persist across TrueNAS upgrades. Items that need to be included in the configuration
tarball should generally be placed here. Permissions on this directory must be 0o755, but many files here should
be set to 0o700. All files and directories should be owned by root:root.

`/data/subsystems` -- this directory contains application-specific configuration that must persist between
installs but is not suitable for datastore insertion. The convention is to create a new directory with the name of
the middleware plugin that needs persistent state. Configuration information stored in these directories must be
included in the TrueNAS configuration backup and restored on configuration upload.

`/var/lib/truenas-middleware` -- this directory contains persistent middleware state that is applicable to the
current boot environment only. It is a safe place to store data that we want to persist across reboots, but not
across upgrades. This path is defined by `MIDDLEWARE_BOOT_ENV_STATE_DIR` in `middlewared/utils`. The permissions
on this directory should be 0o755 and it should be owned by root:root.

`/root` -- this dataset contains the middleware directory services cache. The permissions on this directory
should be 0o700 and it should be owned by root:root.

`/var/db/system` -- this is the mountpoint for the system dataset. The storage pool for the system dataset is
runtime configurable and on single-node systems may be located on the boot pool. If the server is an HA appliance,
the system dataset will always be on a data pool. State belongs in the system dataset if it must remain
consistent for the active storage controller in an HA pair (for example, the NFSv4 state file)
or if we want the state's storage pool to be user-configurable. The system dataset mountpoint has expected
permissions of 0o755 and ownership of root:root.

`/audit` -- this is the dataset containing our auditing databases. It is cloned during the upgrade process and
so persists across upgrades. The auditing databases are not expected to be preserved during backup and restore
operations and are unique to the individual TrueNAS install. Expected permissions are 0o700 with ownership of root:root.
The procedure for adding new audit databases is documented separately.
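
Review note: the permission and ownership expectations listed in state.rst lend themselves to a quick self-check. The following is a minimal sketch only. The `EXPECTED_MODES` table and the `check_state_dir` helper are hypothetical names that do not exist in middlewared; the paths and modes are copied from the text above.

```python
import os
import stat

# Modes copied from the state.rst text in this PR. The table itself is hypothetical.
EXPECTED_MODES = {
    '/var/run/middleware': 0o755,
    '/data': 0o755,
    '/var/lib/truenas-middleware': 0o755,
    '/root': 0o700,
    '/var/db/system': 0o755,
    '/audit': 0o700,
}


def check_state_dir(path, expected_mode, expected_uid=0, expected_gid=0):
    """Return a list of human-readable problems found for `path` (empty if none)."""
    st = os.stat(path)
    problems = []
    if not stat.S_ISDIR(st.st_mode):
        problems.append(f'{path}: not a directory')
    # S_IMODE strips the file-type bits and leaves only the permission bits
    mode = stat.S_IMODE(st.st_mode)
    if mode != expected_mode:
        problems.append(f'{path}: mode is {oct(mode)}, expected {oct(expected_mode)}')
    if (st.st_uid, st.st_gid) != (expected_uid, expected_gid):
        problems.append(
            f'{path}: owner is {st.st_uid}:{st.st_gid}, expected {expected_uid}:{expected_gid}'
        )
    return problems
```

On a live system one might loop over `EXPECTED_MODES.items()` and log whatever `check_state_dir` returns; the defaults of uid 0 and gid 0 correspond to the root:root ownership the document asks for.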
7 changes: 0 additions & 7 deletions src/middlewared/middlewared/alert/source/audit.py
@@ -14,13 +14,6 @@ class AuditBackendSetupAlertClass(AlertClass, SimpleOneShotAlertClass):
     text = "Audit service failed backend setup: %(service)s. See /var/log/middlewared.log"


-class AuditSetupAlertClass(AlertClass, SimpleOneShotAlertClass):
-    category = AlertCategory.AUDIT
-    level = AlertLevel.ERROR
-    title = "Audit Service Setup Failed"
-    text = "Audit service failed to complete setup. See /var/log/middlewared.log"
-
-
 # --------------- Monitored Alerts ----------------
 class AuditServiceHealthAlertClass(AlertClass):
     category = AlertCategory.AUDIT
17 changes: 10 additions & 7 deletions src/middlewared/middlewared/plugins/audit/__init__.py
@@ -1,13 +1,16 @@
+async def init_audit(middleware, event_type, args):
+    if await middleware.call('system.boot_env_first_boot'):
+        try:
+            await middleware.call("audit.setup")
+        except Exception:
+            middleware.logger.error("Failed to perform setup tasks for auditing.", exc_info=True)
+
+
 async def setup(middleware):
+    middleware.event_subscribe('system.ready', init_audit)
+
     try:
         # Set up connections to the auditing databases
         await middleware.call("auditbackend.setup")
     except Exception:
         middleware.logger.error("Failed to set up auditing backend.", exc_info=True)
-    if await middleware.call("keyvalue.get", "run_migration", False):
-        # If this is an upgrade then free up space used by refreservation on
-        # deactivated boot environments
-        try:
-            await middleware.call("audit.setup")
-        except Exception:
-            middleware.logger.error("Failed to perform setup tasks for auditing.", exc_info=True)
7 changes: 2 additions & 5 deletions src/middlewared/middlewared/plugins/audit/audit.py
@@ -550,9 +550,9 @@ async def setup(self):
         configuration to the current boot environment.
         """
         try:
-            os.mkdir(AUDIT_REPORTS_DIR, 0o700)
+            await self.middleware.run_in_thread(os.mkdir, AUDIT_REPORTS_DIR, 0o700)
         except FileExistsError:
-            os.chmod(AUDIT_REPORTS_DIR, 0o700)
+            await self.middleware.run_in_thread(os.chmod, AUDIT_REPORTS_DIR, 0o700)

         cur = await self.middleware.call('audit.get_audit_dataset')
         parent = os.path.dirname(cur['id'])
@@ -589,13 +589,10 @@ async def setup(self):
                     'cleanup may be required', ds['id'], exc_info=True
                 )

-        # Dismiss any existing AuditSetup one-shot alerts
-        await self.middleware.call('alert.oneshot_delete', 'AuditSetup', None)
         audit_config = await self.middleware.call('audit.config')
         try:
             await self.middleware.call('audit.update_audit_dataset', audit_config)
         except Exception:
-            await self.middleware.call('alert.oneshot_create', 'AuditSetup', None)
             self.logger.error('Failed to apply auditing dataset configuration.', exc_info=True)

         # Generate the initial truenas_verify file
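
Review note: the first hunk of this file moves the blocking `os.mkdir` and `os.chmod` calls off the event loop. A minimal sketch of the same create-or-fix-permissions pattern follows, using the standard library's `asyncio.to_thread` (Python 3.9+) as a stand-in. That `middleware.run_in_thread` behaves analogously is an assumption, and `ensure_private_dir` is a hypothetical helper name.

```python
import asyncio
import os
import stat


async def ensure_private_dir(path, mode=0o700):
    """Create `path` with `mode`, or reset the mode if the directory already exists.

    Returns the resulting permission bits.
    """
    try:
        # Blocking filesystem call runs in a worker thread, not on the event loop
        await asyncio.to_thread(os.mkdir, path, mode)
    except FileExistsError:
        await asyncio.to_thread(os.chmod, path, mode)
    st = await asyncio.to_thread(os.stat, path)
    return stat.S_IMODE(st.st_mode)
```

Calling the helper twice on the same path exercises both branches: the first call creates the directory and the second resets its mode.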
12 changes: 11 additions & 1 deletion src/middlewared/middlewared/plugins/system/__init__.py
@@ -3,7 +3,7 @@

 from middlewared.utils import BOOTREADY

-from .utils import FIRST_INSTALL_SENTINEL, lifecycle_conf
+from .utils import FIRST_INSTALL_SENTINEL, BOOTENV_FIRSTBOOT_SENTINEL, lifecycle_conf


 def firstboot(middleware):
@@ -20,6 +20,15 @@ def firstboot(middleware):
             middleware.call_sync('datastore.update', 'system.advanced', config['id'], {'adv_autotune': True})


+def firstboot_after_upgrade(middleware):
+    if not os.path.exists(BOOTENV_FIRSTBOOT_SENTINEL):
+        os.makedirs(os.path.dirname(BOOTENV_FIRSTBOOT_SENTINEL), mode=0o700)
+        with open(BOOTENV_FIRSTBOOT_SENTINEL, 'w'):
+            pass
+
+        lifecycle_conf.SYSTEM_BOOT_ENV_FIRST_BOOT = True
+
+
 def read_system_boot_id(middleware):
     try:
         with open('/proc/sys/kernel/random/boot_id', 'r') as f:
@@ -36,6 +45,7 @@ async def setup(middleware):
     middleware.event_register('system.shutdown', 'Started shutdown process', roles=['SYSTEM_GENERAL_READ'])

     await middleware.run_in_thread(firstboot, middleware)
+    await middleware.run_in_thread(firstboot_after_upgrade, middleware)

     settings = await middleware.call('system.general.config')
     middleware.logger.debug('Setting timezone to %r', settings['timezone'])
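
Review note: `firstboot_after_upgrade` detects the first boot of a boot environment by the absence of a sentinel file in the per-boot-environment state directory. The logic can be sketched against a temporary directory. `detect_first_boot` is a hypothetical helper, and passing `exist_ok=True` to `os.makedirs` is a variation on the diff, not what the PR does; without it `os.makedirs` raises `FileExistsError` when the state directory already exists.

```python
import os


def detect_first_boot(sentinel_path):
    """Return True only the first time this is called for `sentinel_path`."""
    if os.path.exists(sentinel_path):
        return False

    # exist_ok=True is a variation on the diff: it tolerates a pre-existing
    # state directory that does not yet contain the sentinel
    os.makedirs(os.path.dirname(sentinel_path), mode=0o700, exist_ok=True)
    with open(sentinel_path, 'w'):
        pass

    return True
```

Because the sentinel lives under the boot environment's own state directory, a freshly created boot environment starts without it, which is what makes the check fire once per upgrade rather than once per install.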
5 changes: 5 additions & 0 deletions src/middlewared/middlewared/plugins/system/lifecycle.py
@@ -23,6 +23,11 @@ class SystemService(Service):
     async def first_boot(self):
         return lifecycle_conf.SYSTEM_FIRST_BOOT

+    @private
+    async def boot_env_first_boot(self):
+        # First boot after upgrading server
+        return lifecycle_conf.SYSTEM_BOOT_ENV_FIRST_BOOT
+
     @api_method(
         SystemBootIdArgs,
         SystemBootIdResult,
5 changes: 4 additions & 1 deletion src/middlewared/middlewared/plugins/system/utils.py
@@ -1,11 +1,12 @@
 import os
 import re

-from middlewared.utils import MIDDLEWARE_RUN_DIR
+from middlewared.utils import MIDDLEWARE_RUN_DIR, MIDDLEWARE_BOOT_ENV_STATE_DIR


 DEBUG_MAX_SIZE = 30
 FIRST_INSTALL_SENTINEL = '/data/first-boot'
+BOOTENV_FIRSTBOOT_SENTINEL = os.path.join(MIDDLEWARE_BOOT_ENV_STATE_DIR, '.first-boot')
 RE_KDUMP_CONFIGURED = re.compile(r'current state\s*:\s*(ready to kdump)', flags=re.M)


@@ -17,6 +18,8 @@ def __init__(self):
         self.SYSTEM_READY = False
         # Flag telling whether the system is shutting down
         self.SYSTEM_SHUTTING_DOWN = False
+        self.SYSTEM_BOOT_ENV_FIRST_BOOT = False
+        # Flag telling whether this is the first boot for the boot environment


 def get_debug_execution_dir(system_dataset_path: str, iteration: int = 0) -> str:
1 change: 1 addition & 0 deletions src/middlewared/middlewared/utils/__init__.py
@@ -34,6 +34,7 @@ class ProductNames:

 MID_PID = None
 MIDDLEWARE_RUN_DIR = '/var/run/middleware'
+MIDDLEWARE_BOOT_ENV_STATE_DIR = '/var/lib/truenas-middleware'
 MIDDLEWARE_STARTED_SENTINEL_PATH = f'{MIDDLEWARE_RUN_DIR}/middlewared-started'
 BOOTREADY = f'{MIDDLEWARE_RUN_DIR}/.bootready'
 BOOT_POOL_NAME_VALID = ['freenas-boot', 'boot-pool']
23 changes: 0 additions & 23 deletions tests/api2/test_audit_alerts.py
@@ -81,29 +81,6 @@ def test_audit_backend_alert(setup_state):
         assert class_alerts[0]['formatted'].startswith("Audit service failed backend setup"), class_alerts


-@pytest.mark.parametrize(
-    'setup_state', [
-        [None, 'AuditSetup', 'audit.setup']
-    ],
-    indirect=True
-)
-def test_audit_setup_alert(setup_state):
-    with mock("audit.update_audit_dataset", """
-        from middlewared.service import private
-        @private
-        async def mock(self, new):
-            raise Exception()
-    """):
-        unused, alert_class, audit_method = setup_state
-        call(audit_method)
-        sleep(1)
-        alerts = call("alert.list")
-        class_alerts = [alert for alert in alerts if alert['klass'] == alert_class]
-        assert len(class_alerts) > 0, class_alerts
-        assert class_alerts[0]['klass'] == 'AuditSetup', class_alerts
-        assert class_alerts[0]['formatted'].startswith("Audit service failed to complete setup"), class_alerts
-
-
 def test_audit_health_monitor_alert():
     with mock("auditbackend.query", """
         from middlewared.service import private