Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Optimize package initialization #197

Merged
merged 1 commit into from
Mar 20, 2025

Conversation

kuadrosx
Copy link
Contributor

@kuadrosx kuadrosx commented Mar 19, 2025

This commit improves package startup performance by:

  • Embedding static data: Replaced go file data loading with go:embed, reducing I/O overhead.
  • Optimized metadata build: Changed cmd/buildmetadata to download metadata files with the new format.
  • Logarithmic digit counting: Replaced string conversion and length calculation with a faster logarithmic approach for digit counting.
  • Improve initialization: Utilized sync.Map instead of map[string]sync.Once for prevent gunzip the same metadata more than once.
  • Clarify data directory: Removed the gen directory, moving data into the data directory because the data is not longer loaded as generated generated code.

These changes result in a substantial reduction in package startup time.

old:
init github.com/nyaruka/phonenumbers @4.2 ms, 7.2 ms clock, 2994832 bytes, 34852 allocs

new:
init github.com/nyaruka/phonenumbers @2.1 ms, 3.7 ms clock, 2917304 bytes, 34444 allocs

the binary size also went down 1M:

4.0K ./main.go
7.8M ./phoneparser-new
8.9M ./phoneparser-main
17M .

@rowanseymour
Copy link
Member

Have left some comments and generally looks really good... do wish tho the changes were done separately so it'd be possible to see what effect each change is having on package load time.

phonenumbers.go Outdated
}

// load it into our map
onceMap[language].Do(func() {
prefixMap, err := loadPrefixMap(binMap[language])
if _, loaded := visited.LoadOrStore(language, struct{}{}); !loaded {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this lead to a race condition if two threads get here at the same time looking for a language that hasn't yet been loaded.. first does the load, second doesn't.. but second could get to line 3434 before the first thread completes the loading?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the sync.Map should prevent this situation because it blocks https://pkg.go.dev/sync#Map

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It blocks writes to the map within the call to LoadOrStore so thread 1 and 2 arrive at that call and let's say thread 1 is first so it will write struct{}{} into the map and thread 2 will read struct{}{} from the map.

Then both threads can proceed.. loaded is true for thread 1 and false for thread 2. Thread 1 tries to write into langMap[language] but thread 2 might read from it first 💥 .

Unless I'm forgetting how sync.Map works... I think you have a race condition. And I'm not sure the existing solution of using sync.Once is so bad we need to replace it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not have the list to prefill the maps of onces in the init() func 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you put the sync.Once stuff back in this PR and then in a separate PR you can try replacing it with a map of mutexes or something else.. and at least then we can measure the difference it makes. In this PR it's impossible to know what is contributing to the fast package load times because we also have the embed changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I did the sync.Once changes because of the embed changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think of adding a mutex ? 🙏 something like:

func fillLangMap(visited *sync.Map, langMap map[string]*intStringMap, file fs.File, language string) error {
	m.Lock()
	defer m.Unlock()

	// load it into our map
	if _, loaded := visited.LoadOrStore(language, struct{}{}); !loaded {
		data, err := io.ReadAll(file)
		if err != nil {
			return err
		}

		prefixMap, err := loadPrefixMap(data)
		if err == nil {
			langMap[language] = prefixMap
		}
	}

	return nil
}

I added this tests to reproduce the error

func TestGetCarrierWithPrefixForNumberWithConcurrency(t *testing.T) {
	number, _ := Parse("+8613702032331", "ZZ")

	wg := sync.WaitGroup{}

	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _, err := GetCarrierWithPrefixForNumber(number, "en")
			if err != nil {
				t.Errorf("Failed to getCarrierWithPrefix for the number %s: %s", "+8613702032331", err)
			}
		}()
	}

	wg.Wait()
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mutexes seem like a good solution...

  1. You don't need visited *sync.Map.. you have synchronized access to langMap now
  2. Using one mutex means you have now synchronized across all languages.. which is more synchronized than this code was previously.. but I think I'm ok with that for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the more I think about it a mutex for each map seems totally fine here and it's a lot simpler

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, i removed the map and added the mutex 😄

thank you for your time, next I will try using a pool of gzips to see if it helps with the performance

phonenumbers.go Outdated
@@ -607,10 +608,12 @@ var (

// These are our onces and maps for our prefix to carrier maps
carrierOnces = make(map[string]*sync.Once)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these still needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove it, thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@kuadrosx kuadrosx force-pushed the improve-init-performance branch from 67e4ed9 to 1b8fc0a Compare March 19, 2025 19:37

// Create our sync.Onces for each of our languages for carriers
for lang := range gen.CarrierData {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rowanseymour look, here i need the gen data to fill the onces map

@kuadrosx kuadrosx force-pushed the improve-init-performance branch 2 times, most recently from 77c1206 to a4797d6 Compare March 19, 2025 21:06
phonenumbers.go Outdated
return "", 0, err
}

defer file.Close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the file existence check always repeated? Remember package initialization isn't the only concern here - lookup times are more important.

I think getValueForNumber should work like...

  1. inside mutex...
  2. check if langMap has value for lang
  3. try to load from file if not, storing result in langMap
  4. use value from langMap

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i was wondering the same the old code checked whether the language key exists in the metadata/binMap first and then it parses the file...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rowanseymour sorry i am making the changes but, do you know why we do not return an error when the field do not exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I applied the changes I am not sure it is according to your idea, let me know if you see how it can be improved

I added a benchmark these are the results:

prev change:
BenchmarkGetCarrierWithPrefixForNumber 30 34821182 ns/op 2948078 B/op 28948 allocs/op

main
BenchmarkGetCarrierWithPrefixForNumber 109204 11091 ns/op 401 B/op 7 allocs/op

new
BenchmarkGetCarrierWithPrefixForNumber 107184 11363 ns/op 448 B/op 8 allocs/op

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That benchmark tho is just checking the same language over and over again which means only one data load.. but I think it's ok

This commit improves package startup performance by:

* **Embedding static data:** Replaced go file data loading with `go:embed`, reducing I/O overhead.
* **Optimized metadata build:** Changed `cmd/buildmetadata` to download `metadata` files with the new format.
* **Logarithmic digit counting:** Replaced string conversion and length calculation with a faster logarithmic approach for digit counting.
* **Improve initialization:** Utilized `sync.Map` instead of `map[string]sync.Once` for prevent gunzip the same metadata more than once.
* **Clarify data directory:** Removed the `gen` directory, moving data into the `data` directory because the data is not longer loaded as generated
generated code.

These changes result in a substantial reduction in package startup time.

old:
init github.com/nyaruka/phonenumbers @4.2 ms, 7.2 ms clock, 2994832 bytes, 34852 allocs

new:
init github.com/nyaruka/phonenumbers @2.1 ms, 3.7 ms clock, 2917304 bytes, 34444 allocs
@kuadrosx kuadrosx force-pushed the improve-init-performance branch from a4797d6 to 76f8699 Compare March 20, 2025 00:42
@rowanseymour rowanseymour merged commit da74588 into nyaruka:main Mar 20, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants