iesave: code breaks if too many unique values in a string variable #358

@luizaandrade

Description

If a string/categorical variable has too many unique values, levelsof fails with the error message "cannot compute". I just ran into this with a variable that had 700k+ unique values.

It now runs with the workaround of replacing the following lines

* Number of levels and complete observations
qui levelsof `var'
local varlevels = r(r)
local varcomplete = r(N)

with

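* Number of levels
preserve
	keep `var'
	* levelsof ignores missing values by default, so drop them here too
	qui drop if missing(`var')
	qui duplicates drop
	qui count
	* count stores its result in r(N), not r(r)
	local varlevels = r(N)
restore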

* Number of complete observations
qui count if !missing(`var')
local varcomplete = r(N)

There may be a more elegant approach, though. If no one can think of one, I can open a PR with this one.
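
One possible alternative that avoids preserve/restore, offered only as a sketch: tag one observation per distinct value with egen's tag() function and count the tags. The sysuse auto dataset and the choice of make as `var' are stand-ins so the snippet runs on its own; inside iesave, `var' would come from the surrounding code. By default tag() does not tag observations with missing values, which matches the default behavior of levelsof. I have not tested this on a variable with 700k+ unique values, and egen may change the sort order of the data.

```stata
* Hypothetical stand-alone setup; in iesave, `var' is set by the surrounding code
sysuse auto, clear
local var make

* Number of levels: tag one observation per distinct non-missing value
tempvar first
qui egen byte `first' = tag(`var')
qui count if `first' == 1
local varlevels = r(N)

* Number of complete observations
qui count if !missing(`var')
local varcomplete = r(N)

display "levels: `varlevels', complete: `varcomplete'"
```

Because `first' is a tempvar, it is dropped automatically when the program ends, so nothing is left behind in the user's data.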

Metadata

    Labels

    minor bug: Bug unlikely to lead to incorrect analysis
