Skip to content

Conversation

@febo
Copy link
Collaborator

@febo febo commented Jul 4, 2025

Problem

There are a few places where "raw" values (magic numbers) are being used, which do not clearly describe their use and increase the difficulty of refactoring the code.

Solution

Use named constants instead of raw values.

cc: @d0nutptr

@febo febo requested a review from joncinque July 4, 2025 10:12
Copy link
Collaborator

@joncinque joncinque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some tiny things, feel free to take them or leave them

Comment on lines +82 to +92
/// Offset for the `key` field in the `Account` struct.
const ACCOUNT_KEY_OFFSET: usize = 8;

/// Offset for the `owner` field in the `Account` struct.
const ACCOUNT_OWNER_OFFSET: usize = 40;

/// Offset for the `lamports` field in the `Account` struct.
const ACCOUNT_LAMPORTS_OFFSET: usize = 72;

/// Offset for the `data` field in the `Account` struct.
const ACCOUNT_DATA_OFFSET: usize = 88;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I find these kinds of consts less useful:

  • they're only used once
  • they're located away from the code
  • the code was already clear, so this arguably makes things slightly less clear

But better to check all the boxes I guess

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about creating ofsets based on the prev offset + value, like it was done in solana-record-service/ ?

e.g.

// ...
const ACCOUNT_OWNER_OFFSET: usize = ACCOUNT_KEY_OFFSET + size_of::<Pubkey>();
// ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Jon makes a good point.
As it is used only in one place, constants do not add any more clarity here.
Previous version:

key: offset(account.raw, 8),

New version:

key: offset(account.raw, ACCOUNT_KEY_OFFSET),

New version ends up repeating the same words twice.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all a matter of taste, so it really doesn't matter that much, but I guess that's a little better. However, if you do something like that, you should also have some compile-time tests to make sure they're correct, ie https://github.com/anza-xyz/solana-sdk/blob/be956210900c076f8f4f448b2fe4b1f5923fe09f/clock/src/lib.rs#L68

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am inclined to remove the constants and have a comment where the "raw" value is being used. That is probably more informative.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a matter of how frequently the constant is used.
If it is used only once, then moving the definition and the explanation to a different spot only makes the code harder to track.
But if it is used in multiple places, then the common constant makes the connection between all those locations explicit.
And it outweighs the downside of the constant explanation being further from the usage site.

Comment on lines +598 to +620
/// Mask to "consume" the mutable borrow bit for lamports.
const UNSET_MUTABLE_LAMPORTS_BIT: u8 = 0b_0111_1111;

/// Mask representing the immutable borrow bits for lamports.
const IMMUTABLE_LAMPORTS_BITS: u8 = 0b_0111_0000;

/// Mask representing the borrow bits for lamports.
const LAMPORTS_BITS: u8 = 0b_1111_0000;

/// Mask to assert the mutable borrow lamports bit.
const MUTABLE_LAMPORTS_BORROW_BIT: u8 = 0b_1000_0000;

/// Mask to "consume" the mutable borrow bit for data.
const UNSET_MUTABLE_DATA_BIT: u8 = 0b_1111_0111;

/// Mask representing the immutable borrow bits for data.
const IMMUTABLE_DATA_BITS: u8 = 0b_0000_0111;

/// Mask representing the borrow bits for data.
const DATA_BITS: u8 = 0b_0000_1111;

/// Mask to assert the mutable borrow data bit.
const MUTABLE_DATA_BORRROW_BIT: u8 = 0b_0000_1000;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you, but I would find it a little clearer for these to be called BITMASK at the end, ie. UNSET_MUTABLE_LAMPORTS_BITMASK or LAMPORTS_BITMASK

Comment on lines +622 to +639
/// Represents masks for borrow state of an account.
#[repr(u8)]
#[derive(Clone, Copy)]
pub enum BorrowState {
/// Mask to check whether an account is already borrowed.
///
/// This will test both data and lamports borrow state. Any position
/// in the borrow byte that is not set means that the account
/// is borrowed in that state.
Borrowed = 0b_1111_1111,

/// Mask to check whether an account is already mutably borrowed.
///
/// This will test both data and lamports mutable borrow state. If
/// one of the mutably borrowed bits is not set, then the account
/// is mutably borrowed in that state.
MutablyBorrowed = 0b_1000_1000,
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Do you still need this with all of the consts? It might be more consistent to go with all one approach or the other, ie. all consts, or all enum variations

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enum values are used in the public is_borrowed method. The tricky thing is that not all constants represent a borrow "state". At the same time, changing the parameter of is_borrow to an u8 could be error prone, since you might pass an "invalid" borrow state.

Also, the constants are private and they don't need to be publilc.

Comment on lines +82 to +92
/// Offset for the `key` field in the `Account` struct.
const ACCOUNT_KEY_OFFSET: usize = 8;

/// Offset for the `owner` field in the `Account` struct.
const ACCOUNT_OWNER_OFFSET: usize = 40;

/// Offset for the `lamports` field in the `Account` struct.
const ACCOUNT_LAMPORTS_OFFSET: usize = 72;

/// Offset for the `data` field in the `Account` struct.
const ACCOUNT_DATA_OFFSET: usize = 88;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Jon makes a good point.
As it is used only in one place, constants do not add any more clarity here.
Previous version:

key: offset(account.raw, 8),

New version:

key: offset(account.raw, ACCOUNT_KEY_OFFSET),

New version ends up repeating the same words twice.

const MUTABLE_LAMPORTS_BORROW_BIT: u8 = 0b_1000_0000;

/// Mask to "consume" the mutable borrow bit for data.
const UNSET_MUTABLE_DATA_BIT: u8 = 0b_1111_0111;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a common convention for bitmasks, is that you have 3 operations:
If you say v &= ~mask you are clearing the bits.
If you say v |= mask you are setting the bits.
If you say v & mask you are reading the bits.

It seems that the UNSET_<field> values currently hold the matching ~<field> value.
But it only applies to 1 of 3 possible operations.
Plus, UNSET_MUTABLE_DATA_BIT is missing the BORROW part from the MUTABLE_DATA_BORRROW_BIT.
It is unclear if the field we are talking about is called MUTABLE_DATA or MUTABLE_DATA_BORROW.
And one can even think those are two different fields.

The higher level APIs I've seen for bit manipulation seem to expose operations directly.
So, as we are here working at a level where we are still dealing with bits, hiding one invert operation, I think, is only more confusing than helpful.

bitmask seems to be providing a list of suitable operations:

  • none - Create a new mask with all flags unset.
  • all - Create a new mask with all flags set.
  • set - Set a single flag if enum flag variant is passed or multiple if mask is passed.
  • unset - Unset a single flag if enum flag variant is passed or multiple if mask is passed.
  • toggle - Same as set/unset but always negates the flags (1 -> 0 and 0 -> 1).
  • contains - Check if the mask contains a flag or a whole mask.
  • intersects - Check if the mask intersects with a flag or a whole mask.
  • is_all - Check if all flag variants are set.
  • is_none - Check if all flag variants are unset.

https://docs.rs/bitmask/latest/bitmask/macro.bitmask.html#methods

My suggestion would be to remove the UNSET_<field> constants completely.

And if there is a desire to hide the bit manipulation, better use a library that provide a higher level API.

Copy link
Collaborator Author

@febo febo Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, the names are not entirely clear since the field hold values for 2 different things and, additionally, a couple of bits represent mutability and others not.

The only 2 values that are used more than once are:

  • MUTABLE_LAMPORTS_BORROW_BIT
  • MUTABLE_DATA_BORRROW_BIT

These were the 2 existing constants (with a different name). So perhaps we revert to that and provide better comments on the use of the other "raw" values? These constants seems to confuse more than help.

@febo
Copy link
Collaborator Author

febo commented Jul 16, 2025

Closed in favour of #200

@febo febo closed this Jul 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants