Data for all combinatorially-encoded STARmap experiments can be found here:

Per-cell read information:

Use the STARmapAnalysis object to load these files.

cell_barcode_count.csv: cell by gene count matrix

  • ncell x ngene 


  • same data as in cell_barcode_count.csv but in npy matrix format

cell_barcode_names.csv: names and colorspace sequence of each gene (corresponding to columns of cell_barcode_count)

  • each row is: GeneIdx, ColorSpaceSeq, GeneName

  • where ColorSpaceSeq is a sequence of Nround colors, each one of 1, 2, 3, or 4

genes.csv: genes used in the sequencing experiment, with their DNA sequences

  • each row is: GeneName,BaseSequence
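
As a minimal sketch of reading these files without the STARmapAnalysis object, the snippet below loads the count matrix and its column annotations with NumPy and the standard csv module. It assumes both files are comma-delimited with no header row and that cell_barcode_names.csv has exactly the three fields listed above; the function name load_counts is hypothetical and not part of the STARmap code.

```python
import csv

import numpy as np


def load_counts(count_csv, names_csv):
    """Return (counts, genes) read from the two per-cell CSV files.

    Assumptions (not confirmed by the data description): comma-delimited
    files, no header row, three fields per row in the names file.
    """
    # ncell x ngene matrix; ndmin=2 keeps a single-row file two-dimensional.
    counts = np.loadtxt(count_csv, delimiter=",", ndmin=2)

    genes = []
    with open(names_csv, newline="") as fh:
        for gene_idx, color_seq, gene_name in csv.reader(fh):
            # Keep ColorSpaceSeq as text so its digits are preserved as written.
            genes.append((int(gene_idx), color_seq.strip(), gene_name.strip()))

    # Each column of the count matrix should correspond to one annotated gene.
    if counts.shape[1] != len(genes):
        raise ValueError(
            f"{counts.shape[1]} count columns but {len(genes)} gene rows"
        )
    return counts, genes
```

With the result in hand, counts.sum(axis=0) gives the total reads per gene, in the same order as the rows of genes.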


Cell position/morphology data: 

Use code in to load.

labels.npz: cell locations and morphology

  • 2D image encoding the cell segmentation, where each cell is represented as a block of pixels with the same numeric ID. 

  • to find cell locations in Python:

    • import numpy as np

    • labels = np.load("labels.npz")["labels"]

    • qhulls,coords = GetQHulls(labels)

    • all_centroids  = np.vstack([c.mean(0) for c in coords])
      # get centroids of cells

  • NOTE: using regionprops will not work, because cells are filtered by size and indexed differently.

  • Plot expression using: plot_poly_cells_expression(labels, qhulls, counts.iloc[:,n], cmap), where counts.iloc[:,n] selects the counts for the gene in column n

Raw read data:


  • bases: 1xNspots -- colorspace sequence of bases

  • qualScores: Nspots x Nrounds -- quality scores per spot, per round

  • allPoints: Nspots x 3 -- 3D spatial location of each spot
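
Once the three fields above are available as NumPy arrays, a quick consistency check and a per-sequence read tally can be sketched as follows. How the arrays are read from disk is left out because the file is not named here; the function name summarize_reads is hypothetical, and the sketch assumes each entry of bases is one hashable value (for example a string or an integer) encoding a spot's colorspace sequence.

```python
import numpy as np


def summarize_reads(bases, qual_scores, all_points):
    """Check array shapes and count spots per colorspace sequence.

    Assumption: `bases` holds one value per spot (shape 1 x Nspots or Nspots).
    """
    bases = np.asarray(bases).ravel()        # 1 x Nspots -> (Nspots,)
    qual_scores = np.asarray(qual_scores)    # Nspots x Nrounds
    all_points = np.asarray(all_points)      # Nspots x 3

    n_spots = bases.shape[0]
    if qual_scores.shape[0] != n_spots or all_points.shape != (n_spots, 3):
        raise ValueError("bases, qualScores and allPoints disagree on Nspots")

    # Tally how many spots carry each distinct colorspace sequence.
    seqs, n_reads = np.unique(bases, return_counts=True)
    return dict(zip(seqs.tolist(), n_reads.tolist()))
```

The returned dictionary maps each colorspace sequence to its number of spots, which can be compared against the ColorSpaceSeq column of cell_barcode_names.csv.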



Data for the sequentially-encoded STARmap experiment can be found here:

File format: MATLAB struct containing the fields:

  • goodLocs: Ncell x 3 -- 3D location of each cell
  • expr: Ncell x 28 -- expression of each gene
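
For readers working in Python rather than MATLAB, one possible way to read such a struct is scipy.io.loadmat. The sketch below assumes the file was saved in a pre-v7.3 MAT format (loadmat does not read v7.3/HDF5 files) and that the struct is stored under a top-level variable whose name you supply; both mat_path and struct_name are placeholders, since the file and variable names are not given here.

```python
import numpy as np
from scipy.io import loadmat


def load_sequential(mat_path, struct_name):
    """Return (locations, expression) from a MATLAB struct.

    `struct_name` is the top-level variable holding the struct; it is a
    placeholder that must be replaced with the real variable name.
    """
    # struct_as_record=False exposes struct fields as attributes;
    # squeeze_me=True drops the 1x1 wrapper MATLAB puts around a scalar struct.
    mat = loadmat(mat_path, squeeze_me=True, struct_as_record=False)
    data = mat[struct_name]

    locs = np.asarray(data.goodLocs, dtype=float)  # Ncell x 3
    expr = np.asarray(data.expr, dtype=float)      # Ncell x ngene

    # Both fields are indexed by cell, so their first dimensions must agree.
    if locs.shape[0] != expr.shape[0]:
        raise ValueError("goodLocs and expr disagree on the number of cells")
    return locs, expr
```

After loading, row i of the expression array holds the values for the cell located at row i of the location array.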