Folders and file naming

Materials

  1. Presentation about folders and file naming for bioimage analysis:

  2. Lab lesson about using the shell:

How to organize your folders

Here, we suggest how the folders should be organized when doing an image analysis project.

Having a fixed structure helps make the data findable, accessible, and reusable.

Our example was adapted from the cellpainting-gallery to fit our needs.

The parent structure is:

<project>
├── <project_specific>
│   ├── images
│   └── illum
├── workspace
└── workspace_dl
  • <project> is the top-level folder. Example: LiveCellPainting.

  • <project_specific> has the name of a subproject. Example: AgNPViability.

  • images will have all the images organized by plate.

  • illum will have illumination correction functions (if calculated).

  • workspace will contain all the CellProfiler results, pipelines, cellpose models, etc.

  • workspace_dl will contain features and results from deep learning analysis.
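
As a minimal sketch, the parent structure described above could be created with Python's pathlib. The helper name create_project_skeleton is hypothetical, and we assume that workspace and workspace_dl sit directly under <project>, next to <project_specific>.

```python
from pathlib import Path


def create_project_skeleton(root, project, project_specific):
    """Create the parent folder structure and return the <project> path.

    Hypothetical helper: the name and signature are illustrative only.
    Assumes workspace and workspace_dl live directly under <project>.
    """
    project_dir = Path(root) / project
    folders = [
        project_dir / project_specific / "images",
        project_dir / project_specific / "illum",
        project_dir / "workspace",
        project_dir / "workspace_dl",
    ]
    for folder in folders:
        # parents=True creates intermediate folders; exist_ok=True makes reruns safe
        folder.mkdir(parents=True, exist_ok=True)
    return project_dir
```

For example, create_project_skeleton(".", "LiveCellPainting", "AgNPViability") would create the four folders under ./LiveCellPainting and leave any existing content untouched.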

Folder structure in detail

images and illum
<project>
└───<project_specific>
    ├───illum
    │   ├───<plate>
    │   │   ├───<plate>_Illum<Channel>.npy
    │   │   └───<plate>_Illum<Channel>.npy
    │   ├───<plate>
    │   └───<plate>
    └───images
        ├───<plate>
        │   ├───B2_02_1_1_GFP_001.tif
        │   └───B2_02_1_2_GFP_001.tif
        ├───<plate>
        └───<plate>

Each <plate> folder contains either one illumination function per channel (in illum) or all the images acquired in that plate (in images).
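
Because illum mirrors the plate folders in images, a short script can report which plates still lack an illumination function. The sketch below assumes the <plate>_Illum<Channel>.npy naming shown in the tree; the function names and the channel list passed in are illustrative, not part of any existing tool.

```python
from pathlib import Path


def list_plates(project_specific_dir):
    """Return the sorted plate folder names found under images."""
    images_dir = Path(project_specific_dir) / "images"
    return sorted(p.name for p in images_dir.iterdir() if p.is_dir())


def illum_path(project_specific_dir, plate, channel):
    """Build the expected path of one illumination function file."""
    return Path(project_specific_dir) / "illum" / plate / f"{plate}_Illum{channel}.npy"


def missing_illum(project_specific_dir, channels):
    """Map each plate to the channels whose illumination file is absent."""
    missing = {}
    for plate in list_plates(project_specific_dir):
        absent = [
            channel
            for channel in channels
            if not illum_path(project_specific_dir, plate, channel).is_file()
        ]
        if absent:
            missing[plate] = absent
    return missing
```

Since illumination functions are only present if they were calculated, an entry in the returned dictionary is not necessarily an error.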

workspace/workspace_dl

Everything related to the project that is not an image goes in workspace (for CellProfiler features) or in workspace_dl (for deep learning features).

It can contain more or fewer folders depending on the project.

Inside each folder of workspace, we’ll have a <project_specific> folder to maintain the organization by subproject.

<project>
├───<project_specific>
│   ├───illum
│   └───images
└───workspace
    ├───assaydev
    ├───backend
    ├───cellpose
    ├───load_data_csv
    ├───metadata
    ├───models
    ├───pipelines
    └───profiles
  • assaydev: contains example images from your experiments, segmented and outlined, used to check whether object segmentation is correct;

  • backend: contains the single-cell SQLite files (one per plate) or single-cell CSV files;

  • cellpose: contains the images used to generate a custom model in the cellpose GUI;

  • load_data_csv: contains the load_data.csv files used as input for CellProfiler pipelines;

  • metadata: contains the metadata files used to annotate the profiles (barcode_platemap.csv and platemaps.csv for each plate);

  • models: contains each model generated by cellpose that will be used in an analysis pipeline;

  • pipelines: contains the pipelines used in CellProfiler (assaydev.cppipe, analysis.cppipe, illum.cppipe);

  • profiles: contains a set of profiles (well-level aggregated), normalized and/or feature selected.
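
The workspace folders listed above, each with its <project_specific> subfolder, could be created in one pass. This is a sketch under the assumption that metadata is split into platemaps and layouts before the <project_specific> level, as in the metadata tree on this page; create_workspace and WORKSPACE_FOLDERS are hypothetical names.

```python
from pathlib import Path

# Folder names taken from the list above; metadata is split into
# platemaps and layouts before the <project_specific> level.
WORKSPACE_FOLDERS = (
    "assaydev",
    "backend",
    "cellpose",
    "load_data_csv",
    "metadata/platemaps",
    "metadata/layouts",
    "models",
    "pipelines",
    "profiles",
)


def create_workspace(project_dir, project_specific, folders=WORKSPACE_FOLDERS):
    """Create workspace/<folder>/<project_specific> for every folder.

    Hypothetical helper; pass a shorter or longer tuple of folders
    when a project needs fewer or more of them.
    """
    created = []
    for name in folders:
        target = Path(project_dir) / "workspace" / name / project_specific
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Passing a custom folders tuple keeps the helper usable for projects that need only part of the structure.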

load_data_csv
└──load_data_csv
     └──<project_specific>
         ├──<plate>
         │   ├──load_data.csv
         │   └──load_data_with_illum.csv
         └──<plate>

Inside each plate folder, you’ll have load_data.csv for pipelines that do not use an illumination correction function and a load_data_with_illum.csv for pipelines that do use an illumination correction function. Those can be generated using the notebook in this repository.
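
To give an idea of what a load_data.csv contains, here is a minimal sketch that writes one row per image set for a single plate. It is not the notebook mentioned above. The FileName_, PathName_ and Metadata_ column prefixes follow the convention used by CellProfiler's LoadData module; the way the file name is split (first token is the well, second-to-last token is the channel) is our assumption and should be adapted to your acquisition software.

```python
import csv
from collections import defaultdict
from pathlib import Path


def write_load_data(images_plate_dir, output_csv):
    """Write one row per image set for a single plate; return the row count.

    Assumption (not stated in the text): file names follow
    <well>_..._<channel>_<index>.tif, so the first token is the well and
    the second-to-last token is the channel.
    """
    plate_dir = Path(images_plate_dir)
    image_sets = defaultdict(dict)
    for image in sorted(plate_dir.glob("*.tif")):
        tokens = image.stem.split("_")
        if len(tokens) < 3:
            continue  # name does not match the assumed pattern
        channel = tokens[-2]
        # All tokens except the channel identify one image set
        key = tuple(tokens[:-2] + tokens[-1:])
        image_sets[key][channel] = image.name

    channels = sorted({c for files in image_sets.values() for c in files})
    fieldnames = ["Metadata_Plate", "Metadata_Well"]
    for channel in channels:
        fieldnames += [f"FileName_{channel}", f"PathName_{channel}"]

    with open(output_csv, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for key in sorted(image_sets):
            row = {"Metadata_Plate": plate_dir.name, "Metadata_Well": key[0]}
            for channel, file_name in image_sets[key].items():
                row[f"FileName_{channel}"] = file_name
                row[f"PathName_{channel}"] = str(plate_dir)
            writer.writerow(row)
    return len(image_sets)
```

A load_data_with_illum.csv would add columns pointing to the illumination function of each channel; those are left out here because their exact names depend on the pipeline.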

metadata
└───metadata
    ├───platemaps
    │   └───<project_specific>
    │       ├───barcode_platemap.csv
    │       └───platemap
    │           ├───platemap1.txt
    │           └───platemap2.txt
    └───layouts
        └───<project_specific>
            └───<plate>_layout.csv
  • platemaps: always contains a barcode_platemap.csv file that specifies, for each plate, which platemap is associated with it;

  • platemap: contains the platemap .txt files with information about well, plate, compounds, concentration, etc.;

  • layouts: contains the layout of each plate in the project in plate format (96-well) for visualization.
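
The link between a plate and its platemap can be resolved by reading barcode_platemap.csv. In the sketch below, the column names Assay_Plate_Barcode and Plate_Map_Name are an assumption borrowed from common Cell Painting profiling workflows; check the header of your own file and pass other names if they differ. The function name is hypothetical.

```python
import csv
from pathlib import Path


def platemap_for_plate(
    platemaps_dir,
    plate,
    barcode_column="Assay_Plate_Barcode",  # assumed column name
    platemap_column="Plate_Map_Name",      # assumed column name
):
    """Return the path of the platemap .txt file associated with a plate.

    platemaps_dir is metadata/platemaps/<project_specific>.
    """
    platemaps_dir = Path(platemaps_dir)
    with open(platemaps_dir / "barcode_platemap.csv", newline="") as handle:
        for row in csv.DictReader(handle):
            if row[barcode_column] == plate:
                return platemaps_dir / "platemap" / f"{row[platemap_column]}.txt"
    raise KeyError(f"Plate {plate!r} not found in barcode_platemap.csv")
```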

profiles
└──profiles
     └──<project_specific>
         ├──2022_05_25_LiveCellPainting_agg_median.csv
         ├──2022_05_25_LiveCellPainting_agg_median_normalized.csv
         ├──2022_05_25_LiveCellPainting_agg_median_normalized_feature_select.csv
         ├──2022_05_25_LiveCellPainting_agg_median_normalized_negcon_feature_select.csv
         └──2022_05_25_LiveCellPainting_agg_median_normalized_negcon_feature_select_pycombat.csv

This folder contains all the well-aggregated profiles for that project analyzed in different ways (normalized and feature selected, normalized to negcon, etc).
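
The profile file names above encode the processing history as a chain of suffixes. As an illustration, the sketch below builds such a name and recovers the steps from an existing one. The step names are the ones visible in the listing; the function names are hypothetical.

```python
# Step suffixes that appear in the example file names above
KNOWN_STEPS = ("agg_median", "normalized", "negcon", "feature_select", "pycombat")


def profile_filename(batch, steps, extension="csv"):
    """Join the batch name and the processing steps into a file name."""
    return "_".join([batch, *steps]) + f".{extension}"


def parse_profile_steps(filename, batch):
    """Return the ordered list of processing steps encoded in a file name."""
    stem = filename.rsplit(".", 1)[0]
    prefix = batch + "_"
    if not stem.startswith(prefix):
        raise ValueError(f"{filename!r} does not start with {prefix!r}")
    remainder = stem[len(prefix):]
    steps = []
    while remainder:
        for step in KNOWN_STEPS:
            if remainder == step or remainder.startswith(step + "_"):
                steps.append(step)
                # Drop the matched step and the underscore that follows it
                remainder = remainder[len(step) + 1:]
                break
        else:
            raise ValueError(f"Unknown step in {filename!r}: {remainder!r}")
    return steps
```

For example, profile_filename("2022_05_25_LiveCellPainting", ["agg_median", "normalized"]) gives 2022_05_25_LiveCellPainting_agg_median_normalized.csv.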

Complete folder structure

See full structure
CellRecovery
├───2022_05_25_LiveCellPainting
│   ├───illum
│   │   ├───220508_09856_Plate_1
│   │   │   ├───<plate>_Illum<Channel>.npy
│   │   │   └───<plate>_Illum<Channel>.npy
│   │   ├───220510_89856_Plate_1
│   │   └───220515_09853_Plate_1
│   └───images
│       ├───220508_09856_Plate_1
│       │   ├───B2_02_1_1_GFP_001.tif
│       │   └───B2_02_1_2_GFP_001.tif
│       ├───220510_89856_Plate_1
│       └───220515_09853_Plate_1
└───workspace
    ├───assaydev
    │   └───2022_05_25_LiveCellPainting
    │       └───montage_segmentation.png
    ├───backend
    │   └───2022_05_25_LiveCellPainting
    │       └───220508_09856_Plate_1
    │           ├───2022_05_25_LiveCellPainting_single_cells.csv
    │           └───2022_05_25_LiveCellPainting_single_cells.sqlite
    ├───cellpose
    │   └───2022_05_25_LiveCellPainting
    │       ├───train
    │       │   └───models
    │       └───test
    ├───load_data_csv
    │   └───2022_05_25_LiveCellPainting
    │       ├───220508_09856_Plate_1
    │       │   ├───load_data.csv
    │       │   └───load_data_with_illum.csv
    │       └───220510_89856_Plate_1
    ├───metadata
    │   ├───platemaps
    │   │   └───2022_05_25_LiveCellPainting
    │   │       ├───barcode_platemap.csv
    │   │       └───platemap
    │   │           ├───platemap1.txt
    │   │           └───platemap2.txt
    │   └───layouts
    │       └───2022_05_25_LiveCellPainting
    │           ├───220508_09856_Plate_1_layout.csv
    │           └───220510_89856_Plate_1_layout.csv
    ├───models
    │   └───2022_05_25_LiveCellPainting
    │       └───cellpose_model_hoechst
    ├───pipelines
    │   └───2022_05_25_LiveCellPainting
    │       ├───assaydev.cppipe
    │       ├───analysis.cppipe
    │       └───illum.cppipe
    └───profiles
        └───2022_05_25_LiveCellPainting
            ├───2022_05_25_LiveCellPainting_agg_median.csv
            ├───2022_05_25_LiveCellPainting_agg_median_normalized.csv
            ├───2022_05_25_LiveCellPainting_agg_median_normalized_feature_select.csv
            ├───2022_05_25_LiveCellPainting_agg_median_normalized_negcon_feature_select.csv
            └───2022_05_25_LiveCellPainting_agg_median_normalized_negcon_feature_select_pycombat.csv