Once you have segmented an image, you usually want to gather information on the objects that you "discovered". Instead of painstakingly doing this manually, skimage offers a simplified way to do this with its ```regionprops_table``` tool.
%% Cell type:code id: tags:
``` python
import numpy as np
import pandas as pd
import math
import matplotlib.pyplot as plt
import skimage
import skimage.io
import skimage.morphology
import scipy.ndimage as ndi
import stackview
```
%% Cell type:markdown id: tags:
Let's first create a mask of the nuclei and clean it up with morphological operations:
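A minimal sketch of this step (the actual nuclei image would have been loaded earlier; here we synthesize a small image with two bright "nuclei" so the cell runs standalone, and the Otsu threshold, opening radius and minimum size are assumptions):
%% Cell type:code id: tags:
``` python
import numpy as np
import skimage.filters
import skimage.morphology

# In the notebook, `image_nuclei` would be the nuclei channel loaded earlier;
# here we synthesize a toy image with two bright blobs on a noisy background.
rng = np.random.default_rng(0)
image_nuclei = rng.poisson(10, (64, 64)).astype(float)
image_nuclei[10:20, 10:20] += 100
image_nuclei[40:55, 35:50] += 100

# threshold (Otsu is one common choice) and clean up with morphological operations
threshold = skimage.filters.threshold_otsu(image_nuclei)
mask_nuclei = image_nuclei > threshold
mask_nuclei = skimage.morphology.binary_opening(mask_nuclei, skimage.morphology.disk(2))
mask_nuclei = skimage.morphology.remove_small_objects(mask_nuclei, min_size=20)
```
%% Cell type:markdown id: tags: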
In order to measure objects in the image separately, we first need to label them individually. For that we can just use the ```skimage.morphology.label()``` function which looks for independent groups of white pixels and assigns them integer numbers:
%% Cell type:code id: tags:
``` python
my_labels = skimage.morphology.label(mask_nuclei)
```
%% Cell type:markdown id: tags:
The label map shows that numbers are assigned from top to bottom in the image:
%% Cell type:code id: tags:
``` python
plt.subplots(figsize=(10,10))
plt.imshow(my_labels);
```
%% Cell type:markdown id: tags:
## Region properties
%% Cell type:markdown id: tags:
Now that we have each region labeled with a different number we can use the ```skimage.measure.regionprops_table()``` function, which takes such a label map and analyzes some properties of each region. We have to specify which ```properties``` we want to use.
The list of available ```properties``` can be found in the documentation of the function [here](https://scikit-image.org/docs/stable/api/skimage.measure.html#skimage.measure.regionprops).
Let's start by adding some morphological properties to our list of properties and provide some explanations.
%% Cell type:markdown id: tags:
- ```label```
The label of the region. It allows us to identify each segmented object correctly.
%% Cell type:markdown id: tags:
- ```area``` and ```perimeter```
The area and perimeter of the region.
Note that the pixel ```spacing``` along each axis of the image can be passed as an argument to the ```skimage.measure.regionprops_table()``` function. If provided, these properties are returned in calibrated units; otherwise, in numbers of pixels. Even if you did not pass the pixel spacing to ```skimage.measure.regionprops_table()```, you can still convert the results to calibrated units later (as long as you have the spacing information from your image metadata) by a simple multiplication. We will see that later.
%% Cell type:markdown id: tags:
Now let's try to call the function with these three properties.
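A sketch of the call (with a toy label map so the cell runs standalone; in the notebook you would pass the ```my_labels``` computed above):
%% Cell type:code id: tags:
``` python
import numpy as np
import skimage.measure

# toy label map standing in for `my_labels` from the labeling step above
my_labels = np.zeros((20, 20), dtype=int)
my_labels[2:8, 2:8] = 1     # a 6x6 object
my_labels[12:18, 10:19] = 2  # a 6x9 object

my_regions = skimage.measure.regionprops_table(
    my_labels, properties=('label', 'area', 'perimeter'))
```
%% Cell type:markdown id: tags: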
The output is a dictionary of all the properties we asked for:
%% Cell type:code id: tags:
``` python
my_regions
```
%% Cell type:markdown id: tags:
### Dictionaries
Until now, in terms of data structures, we have briefly seen lists ```mylist = [5, 4, 2]``` and Numpy arrays via the images. However, Python offers additional types of data structures, and dictionaries are one of them. As you can see in the output above, they are defined with curly braces ```{}``` and contain pairs of elements: keys like ```label``` and ```area```, and a *value* for each key, here two Numpy arrays. To better understand, let's just create a simple one:
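For instance (the keys and values here are purely illustrative):
%% Cell type:code id: tags:
``` python
# a small dictionary mixing a string, a number and a list
my_dict = {'name': 'nucleus 1', 'weight': 12.5, 'measurements': [1, 2, 3]}
my_dict
```
%% Cell type:markdown id: tags: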
As you can see, dictionaries can contain all types of variables: strings, numbers, lists, etc. They are for this reason ideal for holding information of various types, and the keys make them useful for describing entities. Each entry in the dictionary can then be recovered via its key:
%% Cell type:code id: tags:
``` python
my_dict['weight']
```
%% Cell type:markdown id: tags:
## Recovering image intensity information
%% Cell type:markdown id: tags:
In what we did above, we only recovered information about our mask. However, we often want to obtain information on the pixel values of the **original** image. For example: "what is the average intensity of each nucleus?"
Luckily, ```regionprops_table``` allows us to pass the image we want to use to quantify intensity as the additional argument ```intensity_image```. We can then, for example, add a property to extract the ```mean_intensity```:
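A sketch of such a call (again with toy inputs so the cell runs standalone; in the notebook, ```my_labels``` and the intensity image would come from the previous steps):
%% Cell type:code id: tags:
``` python
import numpy as np
import skimage.measure

# toy stand-ins for the label map and the original intensity image
my_labels = np.zeros((10, 10), dtype=int)
my_labels[2:5, 2:5] = 1
intensity = np.full((10, 10), 3.0)
intensity[2:5, 2:5] = 7.0  # the labeled region is brighter

my_regions = skimage.measure.regionprops_table(
    my_labels,
    intensity_image=intensity,
    properties=('label', 'area', 'mean_intensity'))
my_regions['mean_intensity']
```
%% Cell type:markdown id: tags: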
Additional properties, such as ```intensity_max```, ```intensity_min``` and ```intensity_std``` for the max, min and standard deviation of intensity values in the region, respectively, can also be computed. In some contexts, adding these features rather than only looking at the mean intensity may be relevant.
%% Cell type:markdown id: tags:
Now that we have this information we can, of course, plot it. For example, we can produce a histogram of the mean nuclei intensities:
%% Cell type:code id: tags:
``` python
plt.hist(my_regions['mean_intensity']);
```
%% Cell type:markdown id: tags:
## Filtering information
Obviously, we had some "bad segmentations", i.e. some fragments remaining from the processing that are not actual nuclei. We can easily filter those out, for example based on size, using Numpy logical indexing:
%% Cell type:code id: tags:
``` python
my_regions['area']
```
%% Cell type:markdown id: tags:
We create a logical array by setting a condition on one dictionary entry:
%% Cell type:code id: tags:
``` python
selected = my_regions['area'] > 100
selected
```
%% Cell type:markdown id: tags:
And then use it for logical indexing:
%% Cell type:code id: tags:
``` python
my_regions['mean_intensity'][selected]
```
%% Cell type:markdown id: tags:
## One step further: Pandas
In the above example, if we wanted to use one measurement to filter all other measurements, we would have to repeat the selection multiple times. Ideally, we would put all the measured properties into a table, one column per property, and then do typical database operations to sub-select parts of the data. This can be done using yet another data structure called a DataFrame. These structures are provided by the Pandas library, the main data science library in Python. We here give a very brief insight into that library. First we import it:
%% Cell type:code id: tags:
``` python
import pandas as pd
```
%% Cell type:markdown id: tags:
To understand what a DataFrame is, let's transform a plain Numpy array into a DataFrame:
%% Cell type:code id: tags:
``` python
np.random.seed(42)
my_array = np.random.randint(0, 100, (3, 5))
my_array
```
%% Cell type:markdown id: tags:
We can simply turn this array into a DataFrame by using:
%% Cell type:code id: tags:
``` python
pd.DataFrame(my_array)
```
%% Cell type:markdown id: tags:
We see that the array content is still there, but in addition we now have column and row names, currently just indices. We could, however, give specific column names:
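For example (the column names ```'a'``` to ```'e'``` are an arbitrary choice for this illustration):
%% Cell type:code id: tags:
``` python
import numpy as np
import pandas as pd

np.random.seed(42)
my_array = np.random.randint(0, 100, (3, 5))
# give each of the five columns a name
my_df = pd.DataFrame(my_array, columns=['a', 'b', 'c', 'd', 'e'])
my_df
```
%% Cell type:markdown id: tags: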
The difference with Numpy arrays is that DataFrames can contain different types of information (text, numbers, etc.) and that they should really be seen as "organized" data. So, for example, we can recover a column of the table without resorting to the type of indexing we used before:
%% Cell type:code id: tags:
``` python
my_df['c']
```
%% Cell type:markdown id: tags:
Now how can such a structure help us do the sort of data filtering mentioned before? Just like with arrays, we can apply conditions. For example, we can ask: are there data points in column ```c``` that are smaller than 50?
%% Cell type:code id: tags:
``` python
my_df['c'] < 50
```
%% Cell type:markdown id: tags:
Similarly to what happened with arrays, we get a new column that is boolean. And again, similarly to what we did with arrays, we can use it for logical indexing with square brackets:
%% Cell type:code id: tags:
``` python
my_df[my_df['c'] < 50]
```
%% Cell type:markdown id: tags:
What happened here is that we kept only those entries in the table where the values in the ```c``` column were smaller than 50: we filtered all the properties (columns) in our table in one go!
%% Cell type:markdown id: tags:
### Back to our problem
In our analysis we ended up with a dictionary:
%% Cell type:code id: tags:
``` python
my_regions
```
%% Cell type:markdown id: tags:
We can also easily turn this dictionary into a DataFrame:
%% Cell type:code id: tags:
``` python
my_regions_df = pd.DataFrame(my_regions)
my_regions_df
```
%% Cell type:markdown id: tags:
And now we can use what we have just learned: let's remove tiny regions with an area smaller than 100:
%% Cell type:code id: tags:
``` python
my_regions_df[my_regions_df['area'] > 100]
```
%% Cell type:markdown id: tags:
We see that we indeed removed two elements in that table, indices 4 and 16.
%% Cell type:markdown id: tags:
Imagine you forgot to provide the pixel spacing information to the ```regionprops_table()``` method, but you know it from your image metadata. You can still convert the results to physical units by multiplying the relevant columns by the pixel spacing value, as follows:
%% Cell type:code id: tags:
``` python
pixel_spacing = 0.06  # let's say that 1 pixel corresponds to 0.06 um in our case

# perimeter scales linearly with the pixel spacing, area quadratically
my_regions_df['perimeter'] = my_regions_df['perimeter'] * pixel_spacing
my_regions_df['area'] = my_regions_df['area'] * pixel_spacing**2
my_regions_df
```
%% Cell type:markdown id: tags:
Pandas is a very powerful library, and in this course we can't offer more than this brief insight into how it can be useful for data post-processing. To learn more, you can also visit this other course: https://guiwitz.github.io/DAVPy/Readme.html
%% Cell type:markdown id: tags:
## Exercise 1
%% Cell type:markdown id: tags:
1. Load the blobs image from the images folder and visualize it
%% Cell type:code id: tags:
``` python
# Write your code here
```
%% Cell type:code id: tags:
``` python
# Solution
image_ex_1 = skimage.io.imread('images/blobs.tif')
plt.subplots(figsize=(10, 10))
plt.imshow(image_ex_1, cmap='gray');
```
%% Cell type:markdown id: tags:
2. Segment the blobs and find a way to filter out all the blobs that have elongated shapes (do not split them for now)
In some projects, you might be interested in computing distances between different objects, for example to filter out objects that are far away from another set of objects. An efficient approach is based on distance maps, and we will see an example of their usage along with the ```skimage.measure.regionprops_table()``` method in exercise 2 below. First, here is a little theory reminder about distance maps.
%% Cell type:markdown id: tags:
Distance transforms have many applications: we can, for example, use them to quantify how far a structure of interest is from object boundaries or from other structures, as just mentioned. They are also used to characterize the morphology of an object in 2D and 3D, and to find its center, dimensions, etc. Distance transforms can also be used as a pre-processing step to improve segmentation results and split touching objects. Distance maps may use different distance metrics, such as the Euclidean or the Manhattan distance.
4. Compute the distance map and the inverse distance map of the nuclei. Hint [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.distance_transform_edt.html)
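As a minimal illustration of ```distance_transform_edt``` (not the exercise solution), applied to a toy binary array; here the "inverse" map is computed on the inverted mask, which is one common interpretation:
%% Cell type:code id: tags:
``` python
import numpy as np
import scipy.ndimage as ndi

# toy binary image: a 3x3 foreground block inside a 5x5 array
toy = np.zeros((5, 5), dtype=int)
toy[1:4, 1:4] = 1

# Euclidean distance from each foreground pixel to the nearest background pixel
dist = ndi.distance_transform_edt(toy)
# distance map of the inverted mask: distance from background to the nearest object
inv_dist = ndi.distance_transform_edt(1 - toy)
```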