fix: configuration parsing error mmap_log_buffer_size #2363

Conversation

MorganaFuture
Contributor

@MorganaFuture MorganaFuture commented Mar 16, 2025

What
This PR introduces a JSON-based configuration system for the Tempesta FW logger daemon with:

  • A dedicated TfwLoggerConfig class for logger settings
  • JSON configuration file support via Boost's property_tree
  • Two operation modes: daemon (background) and handle (foreground for debugging)
  • Support for a logger_config attribute in the access_log directive
  • Updates to Tempesta scripts for compatibility
  • Backward compatibility with existing configuration

Why
The Tempesta logger configuration is currently scattered across multiple parameters, making it hard to manage. A dedicated JSON file provides a cleaner approach that separates logger settings from the main configuration. This structured format improves error handling and simplifies troubleshooting with the new foreground mode.

Links
2313

@MorganaFuture
Contributor Author

I have started working on 2313.

@MorganaFuture MorganaFuture force-pushed the morganaFuture/fix/configuration_parsing_error_mmap_log_buffer_size branch from c93a0ef to c6560d8 Compare March 16, 2025 20:20
@MorganaFuture MorganaFuture force-pushed the morganaFuture/fix/configuration_parsing_error_mmap_log_buffer_size branch from c6560d8 to 117cd6d Compare March 21, 2025 12:56
@MorganaFuture
Contributor Author

MorganaFuture commented Mar 21, 2025

@krizhanovsky I slightly modified tfw_logger to make it work with JSON files. I also did some refactoring on it. Could you please assign someone to review these changes?

@MorganaFuture MorganaFuture marked this pull request as ready for review March 21, 2025 13:08
TFW_CFG_CHECK_NO_ATTRS(cs, ce);

// Check for logger_config attribute
const char *logger_config = tfw_cfg_get_attr(ce, "logger_config", NULL);
Contributor

Please delete this code: tfw_logger_config_path is not used now, so there is no sense in keeping it. If it becomes necessary in the future, it's not a big deal to add it.

@@ -596,6 +611,19 @@ cfg_access_log_set(TfwCfgSpec *cs, TfwCfgEntry *ce)
return -EINVAL;
}

// Parse other attributes if using mmap
if (access_log_type & ACCESS_LOG_MMAP) {
Contributor

Please remove this part. Until now, validation for tfw_logger was done only in the tempesta.sh file, but now it must be done in tfw_logger itself.

uint16_t port{9000}; // ClickHouse server port
std::optional<std::string> user; // Optional username
std::optional<std::string> password; // Optional password
std::string table_name{"access_log"}; // Table name
Contributor

There is no sense in adding the table name to the config, since we create the table using a hardcoded name anyway. Even if we change this behavior, I think we don't need to have this in the configuration file, at least for now. The same goes for the database name.

Contributor

By the way, quite a good question: we assumed that ClickHouse is installed solely for us, but a user might want to use an existing database instance for the logs. For example, web frameworks typically provide not only database credentials and an IP with a port, but also a database name and probably a table name.

}
else
{
cpu_cnt = std::thread::hardware_concurrency();
Contributor

I would prefer not to use std::thread::hardware_concurrency and to always use sysconf as the reliable source.

Contributor

@const-t What's wrong with std::thread::hardware_concurrency?

Contributor

@const-t const-t Mar 28, 2025

@Kutumov I'm not sure that it works well with NUMA and on machines that have 100+ CPUs. It may return an invalid CPU count that is not zero, in which case we can't even fall back to sysconf. Even cppreference describes it as: The value should be considered only a hint.

Contributor

This must be commented in the source code. I also thought that std::thread::hardware_concurrency was good.
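A minimal sketch of the sysconf-only approach discussed in this thread, with the rationale recorded in a source comment as requested. The helper name detect_cpu_count is hypothetical and not taken from the PR. Note that _SC_NPROCESSORS_ONLN is a widely available extension (glibc provides it on Linux) rather than a strict POSIX requirement.

```cpp
#include <unistd.h>   // sysconf, _SC_NPROCESSORS_ONLN

#include <cstddef>
#include <stdexcept>

// Hypothetical helper, not taken from the PR: returns the number of CPUs
// currently online as reported by the OS.
//
// std::thread::hardware_concurrency() is deliberately not used: its result
// is documented as a hint only, and it may return 0 when the value is not
// computable, so sysconf() is treated as the single source of truth here.
std::size_t
detect_cpu_count()
{
	const long n = sysconf(_SC_NPROCESSORS_ONLN);

	if (n < 1)
		throw std::runtime_error("cannot determine online CPU count");
	return static_cast<std::size_t>(n);
}
```

Throwing on failure is only one possible policy; the caller could equally treat it as a fatal startup error and exit.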

fs::path config_path = "/etc/tempesta/tfw_logger.json";

try
{
Contributor

Suggested change
{
{

std::shared_ptr<spdlog::logger> logger = nullptr;
fs::path config_path = "/etc/tempesta/tfw_logger.json";

try
Contributor

Suggested change
try
try {

Please have a look at our coding style. It does not cover C++ completely, but we follow it where possible, for instance for indentation.

{
int fd = -1, res = 0;
std::shared_ptr<spdlog::logger> logger = nullptr;
fs::path config_path = "/etc/tempesta/tfw_logger.json";
Contributor

Hardcoding the config path doesn't look good. You've done a great job adding a configuration file, and I believe we should use its benefits. My suggestion:

  1. Pass config_path as a CLI argument to tfw_logger, but construct config_path in tempesta.sh using the TFW_ROOT constant, just as we do for tfw_cfg_path and tfw_cfg_temp.
  2. Remove the parsing of mmap_* directives from tempesta.sh: you moved them to tfw_logger, so let it validate them there. Therefore, do not pass mmap_* parameters from Tempesta's config to tfw_logger; they are already there.
  3. Give CLI arguments higher precedence than the config file, so that parameters defined in the config can be overridden from the CLI.

@MorganaFuture If you need any clarification, feel free to ask.
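The precedence rule in point 3 could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: LoggerSettings, CliOverrides and apply_cli_overrides are hypothetical names, and the fields only loosely mirror those visible in the reviewed TfwLoggerConfig.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Illustrative settings only; the field names are assumptions, not the
// PR's actual TfwLoggerConfig layout.
struct LoggerSettings {
	std::string	host{"localhost"};
	std::size_t	buffer_size{4 * 1024 * 1024};
	std::size_t	cpu_count{0};	// 0 means auto-detect
};

// Values given on the command line; an empty optional means "not given".
struct CliOverrides {
	std::optional<std::string>	host;
	std::optional<std::size_t>	buffer_size;
	std::optional<std::size_t>	cpu_count;
};

// Start from what the config file provided and let every option that was
// explicitly passed on the command line replace the file value.
LoggerSettings
apply_cli_overrides(LoggerSettings from_file, const CliOverrides &cli)
{
	if (cli.host)
		from_file.host = *cli.host;
	if (cli.buffer_size)
		from_file.buffer_size = *cli.buffer_size;
	if (cli.cpu_count)
		from_file.cpu_count = *cli.cpu_count;
	return from_file;
}
```

Keeping the CLI values as optionals makes "not specified" distinguishable from "specified with the default value", which is what allows the file value to survive when an option is omitted.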

size_t buffer_size{4 * 1024 * 1024}; // 4MB default buffer size
size_t cpu_count{0}; // 0 means auto-detect
ClickHouseConfig clickhouse; // ClickHouse configuration
bool debug{false}; // Debug mode flag
Contributor

Looks like an unused flag; please delete it.

ch_cfg.password ? *ch_cfg.password : "",
make_block());


Contributor

Unnecessary blank line; please delete such lines in other places as well.

@@ -266,6 +288,7 @@ run_thread(const int ncpu, const int fd, const std::string &host,
break;
}
catch (const Exception &e) {
std::cerr << "Exception in run_thread: " << e.what() << std::endl;
Contributor

Why do we need the second error message?

std::shared_ptr<spdlog::logger>
setup_logger(const TfwLoggerConfig &config)
{
try
Contributor

Please reduce the try area to the lines that may throw spdlog::spdlog_ex.

else
{
// If config file not found or invalid, use command line args
std::cout << "Could not load configuration from file, using command line arguments" << std::endl;
Contributor

In my opinion, if loading the expected configuration (or part of it) fails, it is better to exit so that the service does not run with undesirable settings. @const-t FYI

return std::nullopt;
}

try
Contributor

@Kutumov Kutumov Mar 28, 2025

Please reduce the try area to the lines that may throw


TfwLoggerConfig config;
config.parse_from_ptree(tree);
return config;
Contributor

Why not std::move(config)?
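One possible answer to the std::move question, as a sketch rather than the PR's code: in C++17, returning a local object from a function whose return type is std::optional of that object's type already treats the object as an rvalue, so it is moved, not copied. Probe is a hypothetical stand-in for TfwLoggerConfig that counts copies and moves, and load_plain is a hypothetical stand-in for the reviewed function.

```cpp
#include <optional>

// Stand-in for TfwLoggerConfig that counts how it is copied and moved.
struct Probe {
	static inline int copies = 0;
	static inline int moves = 0;

	Probe() = default;
	Probe(const Probe &) { ++copies; }
	Probe(Probe &&) noexcept { ++moves; }
};

// Same shape as the reviewed code: a local object returned from a function
// whose return type is std::optional of that object's type.
std::optional<Probe>
load_plain()
{
	Probe config;

	// The return statement first tries to treat `config` as an rvalue,
	// so the optional is constructed through Probe's move constructor.
	return config;
}
```

Under that assumption an explicit std::move(config) would not avoid a copy here, because none takes place; it would only spell out what the return statement already does.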

std::cerr << "Error parsing command line: " << e.what() << std::endl;
std::cerr << "Use --help for usage information" << std::endl;
result = CommandResult::ERROR;
parse_error = true;
Contributor

Maybe just return here, so that there is no need for parse_error?


for (size_t i = 0; i < cpu_count; ++i)
{
std::packaged_task<void(int, int, TfwLoggerConfig)> task(run_thread);
Contributor

packaged_task is redundant here. We can just wait for the threads to finish using std::thread::join.
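A rough sketch of the plain std::thread variant suggested here. run_worker is a hypothetical stand-in for the PR's run_thread and does no real work; run_workers is likewise an illustrative name.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>	// std::ref
#include <thread>
#include <vector>

// Hypothetical stand-in for the PR's run_thread(): it only records that
// the worker for the given CPU has run.
void
run_worker(std::size_t cpu, std::atomic<std::size_t> &done)
{
	(void)cpu;
	done.fetch_add(1, std::memory_order_relaxed);
}

// Start one worker per CPU and block until all of them have finished.
// Returns how many workers ran to completion.
std::size_t
run_workers(std::size_t cpu_count)
{
	std::atomic<std::size_t> done{0};
	std::vector<std::thread> workers;

	workers.reserve(cpu_count);
	for (std::size_t i = 0; i < cpu_count; ++i)
		workers.emplace_back(run_worker, i, std::ref(done));

	for (auto &w : workers)
		w.join();

	return done.load();
}
```

One trade-off to keep in mind: a future obtained from std::packaged_task carries an exception back to the waiting thread, whereas an exception escaping a plain std::thread function calls std::terminate, so with this variant the worker has to handle its own exceptions.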

@EvgeniiMekhanik EvgeniiMekhanik self-requested a review March 29, 2025 02:42
@@ -576,7 +577,21 @@ cfg_access_log_set(TfwCfgSpec *cs, TfwCfgEntry *ce)
bool off = false;

TFW_CFG_CHECK_VAL_N(>, 0, cs, ce);
TFW_CFG_CHECK_NO_ATTRS(cs, ce);

// Check for logger_config attribute
Contributor

Comment on lines +252 to +253
while (!stop_flag)
{
Contributor

Suggested change
while (!stop_flag)
{
while (!stop_flag) {

Comment on lines +254 to +255
try
{
Contributor

Suggested change
try
{
try {

TfwClickhouse clickhouse(ch_cfg.host, ch_cfg.table_name,
ch_cfg.user ? *ch_cfg.user : "",
ch_cfg.password ? *ch_cfg.password : "",
make_block());
Contributor

The line is too wide and something is broken with the indentation.

Comment on lines +369 to +372
if (fs::exists(config_path))
{
return;
}
Contributor

Suggested change
if (fs::exists(config_path))
{
return;
}
if (fs::exists(config_path))
return;

{
spdlog::info("Starting in handle (foreground) mode");
}
}
Contributor

I think we don't need daemonization. Modern Linux environments provide this out of the box, and in our more recent projects we don't daemonize servers.

throw Except("Failed to open PID file");
pid_file << getpid();
pid_file.close();
}
Contributor

What's the purpose of the PID file? Typically on Linux we should also lock it and set the right permissions. @ttaym you recently worked with a nice small library for PID files; could you please reference it so that we can borrow its code here as well?
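For illustration, locking the PID file and setting its permissions could look roughly like the sketch below. This is not the library mentioned above; acquire_pid_file is a hypothetical helper built only on open, flock, ftruncate and write, and the 0644 mode is an assumed choice.

```cpp
#include <fcntl.h>	// open
#include <sys/file.h>	// flock
#include <unistd.h>	// ftruncate, write, close, getpid

#include <string>

// Hypothetical helper: creates (or reuses) the PID file, takes an
// exclusive non-blocking lock and writes the current PID. Returns the open
// descriptor, which must stay open for the lock to be held, or -1 on
// failure (including the case where another holder has the lock).
int
acquire_pid_file(const std::string &path)
{
	// 0644 (subject to umask): owner may write, everyone may read.
	int fd = open(path.c_str(), O_RDWR | O_CREAT | O_CLOEXEC, 0644);
	if (fd < 0)
		return -1;

	// Lock before touching the contents, so a running instance's PID
	// is never overwritten by a second one.
	if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
		close(fd);
		return -1;
	}

	const std::string pid = std::to_string(getpid()) + "\n";
	const bool ok = ftruncate(fd, 0) == 0
		&& write(fd, pid.data(), pid.size())
		   == static_cast<ssize_t>(pid.size());
	if (!ok) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The lock is released automatically when the process exits, so a stale file left behind by a crash does not block the next start.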


spdlog::info("Daemon stopped");
// Open the device
fd = open_device();
Contributor

The comment is probably redundant: it just repeats the function name.

fs::path pid_file{"/var/run/tfw_logger.pid"}; // PID file path
size_t buffer_size{4 * 1024 * 1024}; // 4MB default buffer size
size_t cpu_count{0}; // 0 means auto-detect
ClickHouseConfig clickhouse; // ClickHouse configuration
Contributor

Issues with indentation. Also, we usually name members with a trailing underscore, e.g. clickhouse_.

@MorganaFuture
Contributor Author

The pull request is being moved here #2428
