Unconditional summary stats results on numerical data: margin case #156

@alextanski

Is there any way to get the margin result for a 2D cube (the first dimension being a numerical variable, crossed by a categorical one), i.e. the mean over all cases with non-empty data on both dimensions? I am unable to find anything that works like the cube.measures.scale_means.ScaleMeans.margin() method for numerical data.

Example setup:

Setting up the measure for the mean:

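# Alias of the numeric variable to average (ds is the already-loaded dataset)
numeric = 'open_realrange'

# Measure definition: cube_mean over the variable cast to numeric
mean = {
    "function": "cube_mean",
    "args": [
        {
            "function": "cast",
            "args": [
                {
                    "variable": "datasets/{}/variables/{}".format(
                        ds.id,
                        ds[numeric].id
                    )
                },
                {"class": "numeric"}
            ]
        }
    ]
}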

Then using pycrunch.cubes.fetch_cube and the CrunchCube api to query the results from Crunch:

from pycrunch.cubes import fetch_cube
from cr.cube.crunch_cube import CrunchCube

# Categorical variable to cross the mean by
crossed_by = 'dropdown'

# Fetch the raw cube response, then wrap it in CrunchCube
cube = fetch_cube(ds.resource, [crossed_by], mean=mean)
cube = CrunchCube(cube)

Which gives me:

CrunchCube(name='dropdown', dim_types='CAT')
slices[0]: CubeSlice(name='dropdown', dim_types='CAT', dims='dropdown')
               N
-------  -------
vl:2013  44
vl:2014  33.4167
vl:2015  33

I guess the 1D structure of that cube would cause a margin() call to fail anyway, which leads to the related question of how to get unconditional / 1D statistics on numerical data in general: #157
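
For reference, the value I am after could presumably be reconstructed by hand as a count-weighted average of the per-category means, assuming the number of valid cases per category is available from somewhere. A minimal sketch: `weighted_margin_mean` is a hypothetical helper (not part of cr.cube or pycrunch), and the counts below are made up purely for illustration; only the three means come from the output above.

```python
def weighted_margin_mean(means, counts):
    """Combine per-category means into one overall mean.

    Each category mean is weighted by its number of valid cases, so the
    result equals the mean over all valid cases pooled together.
    """
    if len(means) != len(counts):
        raise ValueError("means and counts must have the same length")
    total = sum(counts)
    if total == 0:
        raise ValueError("no valid cases to average over")
    return sum(m * n for m, n in zip(means, counts)) / total


# Per-category means as shown in the cube output above
category_means = [44.0, 33.4167, 33.0]

# ASSUMPTION: hypothetical valid-case counts per category, not real data
valid_counts = [10, 12, 8]

overall_mean = weighted_margin_mean(category_means, valid_counts)
print(overall_mean)
```

This only works if the counts refer to the same set of cases the means were computed on (non-missing on both variables); a plain average of the three means would be wrong unless all categories had the same size.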
