BuildConsensusReference Error when no clusters contain all datasets

There seems to be an error when running `cluster_cnmf_results()` on an instance of `BuildConsensusReference` when no single cluster contains a gene program from every source dataset (max cluster size < number of datasets). For example, if you have run cNMF on 4 datasets and every cluster contains 3 or fewer programs, you get the error:
`ValueError: 4 columns passed, passed data had 3 columns`

This appears to arise from line 300 in build_consensus_reference.py, where the number of assigned column names is set by the number of datasets (`self.num_results` below), while the number of columns in the generated data frame is set by the maximum cluster size.

Lines 300-301:
`clus_df = pd.DataFrame.from_dict(clus_dict_all, orient='index',
                                 columns=['GEP%d' % x for x in range(1, self.num_results+1)])`
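A minimal sketch that reproduces the mismatch outside the package and shows one possible workaround: build the data frame without column names, then pad it to the number of datasets before naming the columns. The helper name `build_cluster_frame`, the example dictionary, and the assumption that `clus_dict_all` maps cluster IDs to lists of program names are all hypothetical; this is not the maintainers' fix.

```python
import pandas as pd


def build_cluster_frame(clus_dict_all, num_results):
    """Hypothetical helper: build the cluster table padded to num_results columns."""
    # Let pandas infer the width from the largest cluster first.
    clus_df = pd.DataFrame.from_dict(clus_dict_all, orient='index')
    # Pad with empty columns up to num_results, then apply the GEP names.
    # Assumes no cluster holds more than num_results programs.
    clus_df = clus_df.reindex(columns=range(num_results))
    clus_df.columns = ['GEP%d' % x for x in range(1, num_results + 1)]
    return clus_df


# Made-up example: 4 datasets, but the largest cluster holds only 3 programs.
clus_dict_all = {0: ['d1_gep1', 'd2_gep3', 'd4_gep2'], 1: ['d2_gep1', 'd3_gep2']}
num_results = 4

# The original call pattern fails because 4 names are given for 3 columns.
try:
    pd.DataFrame.from_dict(clus_dict_all, orient='index',
                           columns=['GEP%d' % x for x in range(1, num_results + 1)])
    reproduced = False
except ValueError:
    reproduced = True

# The padded version always has num_results columns; missing slots are empty.
padded = build_cluster_frame(clus_dict_all, num_results)
```

With this approach the unused trailing columns (here `GEP4`) are simply left empty, so any downstream code that expects `self.num_results` columns would still find them.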

BuildConsensusReference Error when no clusters contain all datasets #20