Some questions about normalization #104

@809825706

WOT is a great tool for time-course single-cell analysis! Thank you for developing it. I want to use it to analyze my reprogramming data too, but I am confused about the normalization steps in your paper. It seems that you normalized the data twice (before and after finding HVGs).

Screenshot 2022-11-22 102721

Screenshot 2022-11-22 102741

I followed your approach (as I understand it) using the code below:

import numpy as np
import pandas as pd
import scanpy as sc

adata = sc.read_h5ad("adata_filtered.h5ad")

adata.var_names_make_unique() 
sc.pp.filter_cells(adata, min_genes=200)
sc.pp.filter_genes(adata, min_cells=3)

sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
adata

adata.write("ExprMatrix.h5ad")

sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5)
adata = adata[:, adata.var.highly_variable].copy()  # copy so the in-place steps below do not act on a view

sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
adata

adata.write("ExprMatrix.var.genes.h5ad")

I wonder whether the following choices are reasonable:

  1. I did not downsample, since it does not seem to be required.
  2. I used scanpy's HVG selection function instead of Seurat's.
  3. Last but not least, which data should the normalization after HVG selection be applied to: the raw counts, or the already normalized data as in my code above?
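
To make question 3 concrete, here is a minimal NumPy sketch of the two readings. It is only an illustration, not the WOT authors' pipeline: the toy counts, the `hvg_mask` selection, and the `normalize_log1p` helper are all hypothetical, and the helper only approximates what `sc.pp.normalize_total` followed by `sc.pp.log1p` does on a dense matrix.

```python
import numpy as np

def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell (row) to sum to target_sum, then apply log(1 + x)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    return np.log1p(counts / totals * target_sum)

# Toy raw counts: 3 cells x 4 genes (made-up numbers, for illustration only).
raw = np.array([[10, 0, 5, 85],
                [2, 8, 0, 90],
                [0, 1, 1, 98]])

# Hypothetical HVG selection: pretend the first three genes are highly variable.
hvg_mask = np.array([True, True, True, False])

# Option A (as in my code above): normalize, subset to HVGs, normalize again.
# The second pass rescales values that are already log-transformed.
option_a = normalize_log1p(normalize_log1p(raw)[:, hvg_mask])

# Option B: subset the raw counts to the HVGs, then normalize once.
option_b = normalize_log1p(raw[:, hvg_mask])

# The two options generally give different matrices.
print("Options agree:", np.allclose(option_a, option_b))
```

With scanpy, option B would require keeping the raw counts available (for example in a layer or in a separate object) before the first normalization, which my code above does not do.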
