Commit a4e52106 authored by Bannier Delphine

update evaluation

parent 61dd2e62
%% Cell type:markdown id: tags:
 
# CNN with 4 images producing 1 image - 1 depth
 
%% Cell type:code id: tags:
 
``` python
import numpy as np
import pandas as pd
import os
import logging
logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.INFO)
```
 
%% Cell type:markdown id: tags:
 
## A - Explanation of the model used
 
%% Cell type:markdown id: tags:
 
**The dataset provides a baseline chest CT scan and associated clinical information for each patient. Each patient has an image acquired at Week = 0 and numerous follow-up visits over approximately 1-2 years, at each of which their forced vital capacity (FVC) is measured.**
 
%% Cell type:markdown id: tags:
 
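The approach follows the PyImageSearch tutorial on regression with Keras and CNNs (https://www.pyimagesearch.com/2019/01/28/keras-regression-and-cnns/), which predicts house prices from four photos per house. The passages below are adapted from it; in this notebook the target would be FVC rather than price.
 
According to the tutorial, performing regression with CNNs and Keras is as simple as: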
 
- Removing the fully-connected softmax classifier layer typically used for classification
- Replacing it with a fully-connected layer with a single node along with a linear activation function.
- Training the model with a continuous value prediction loss function such as mean squared error, mean absolute error, mean absolute percentage error, etc.
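The regression head and the losses listed above can be sketched without any deep-learning framework. The snippet below is a minimal NumPy illustration, not the notebook's model: `linear_head` stands in for a single-node fully-connected layer with linear activation, and the feature values, weights and FVC-like targets are made up for the example.

```python
import numpy as np

def linear_head(features, weights, bias):
    """Single-node fully-connected layer with linear (identity) activation."""
    return features @ weights + bias  # one continuous value per sample

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    """Mean absolute percentage error (assumes no zero targets)."""
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))

# Hypothetical flattened CNN features: 3 samples, 4 features each
rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4))
weights = rng.normal(size=4)
y_pred = linear_head(features, weights, bias=0.5)  # shape (3,)

# Made-up targets and predictions, only to exercise the loss functions
y_true = np.array([2000.0, 2500.0, 3000.0])
y_hat = np.array([2100.0, 2400.0, 3000.0])
print(mse(y_true, y_hat), mae(y_true, y_hat), mape(y_true, y_hat))
```

Unlike a softmax layer, the linear head outputs one unbounded real number per sample, which is why it pairs with a continuous loss rather than cross-entropy.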
 
We essentially have **three options**:
 
1. Pass the images one at a time through the CNN and use the price of the house as the target value for each image
2. Utilize multiple inputs with Keras and have four independent CNN-like branches that eventually merge into a single output
3. Create a montage that combines/tiles all four images into a single image and then pass the montage through the CNN
 
The first option is a **poor choice**: we would have multiple images with the same target price.
 
If anything, this would just “confuse” the CNN, making it hard for the network to learn how to correlate prices with the input images.
 
The second option is also **not a good idea**: with four independent tensors as inputs, the network will be computationally wasteful and harder to train. Each branch will have its own set of CONV layers, and the branches will eventually need to be merged into a single output.
 
Instead, we should choose the **third option**, where we combine all four images into a single image and then pass that image through the CNN (as depicted in Figure 2 of the linked tutorial).
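A montage of this kind can be built with plain NumPy. The sketch below assumes four equally sized images tiled in a 2x2 grid; the helper name `make_montage` and the 64x64 constant-valued arrays are illustrative and do not come from the notebook's preprocessing code.

```python
import numpy as np

def make_montage(images):
    """Tile four equally sized images into one 2x2 montage."""
    if len(images) != 4:
        raise ValueError("expected exactly four images")
    top = np.hstack(images[:2])       # first two images side by side
    bottom = np.hstack(images[2:])    # last two images side by side
    return np.vstack([top, bottom])   # stack the two rows vertically

# Hypothetical 64x64 single-channel slices, each filled with its own index
slices = [np.full((64, 64), i, dtype=np.float32) for i in range(4)]
montage = make_montage(slices)
print(montage.shape)  # (128, 128)
```

The montage keeps a single input tensor for the CNN, at the cost of doubling the image height and width.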
 
%% Cell type:markdown id: tags:
 
## B - Preprocessing: Reading Data
 
%% Cell type:code id: tags:
 
``` python
os.chdir('../')
```
 
%% Cell type:code id: tags:
 
``` python
from preprocessing.read_load_data import read_data
 
input_directory='../osic-pulmonary-fibrosis-progression'
train_df, test_df, sample_df = read_data(input_directory)
train_df.head()
```
 
%% Output
 
                     Patient  Weeks   FVC    Percent  Age   Sex  SmokingStatus
0  ID00007637202177411956430     -4  2315  58.253649   79  Male      Ex-smoker
1  ID00007637202177411956430      5  2214  55.712129   79  Male      Ex-smoker
2  ID00007637202177411956430      7  2061  51.862104   79  Male      Ex-smoker
3  ID00007637202177411956430      9  2144  53.950679   79  Male      Ex-smoker
4  ID00007637202177411956430     11  2069  52.063412   79  Male      Ex-smoker
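
To make the longitudinal structure concrete, the five rows above can be rebuilt by hand and summarised per patient. This is only a sketch: the column names come from the output above, while the summary names (`n_visits`, `first_week`, `last_week`, `first_fvc`, `last_fvc`, `fvc_change`) are illustrative and not part of the notebook.

```python
import pandas as pd

# Rows copied from the train_df.head() output above (one patient, five visits)
rows = pd.DataFrame({
    "Patient": ["ID00007637202177411956430"] * 5,
    "Weeks": [-4, 5, 7, 9, 11],
    "FVC": [2315, 2214, 2061, 2144, 2069],
})

# Sort so that "first" and "last" follow the visit order
rows = rows.sort_values(["Patient", "Weeks"])

# Per-patient summary; the output column names are illustrative
summary = rows.groupby("Patient").agg(
    n_visits=("Weeks", "size"),
    first_week=("Weeks", "min"),
    last_week=("Weeks", "max"),
    first_fvc=("FVC", "first"),
    last_fvc=("FVC", "last"),
)
summary["fvc_change"] = summary["last_fvc"] - summary["first_fvc"]
print(summary)
```

Note that `Weeks` can be negative: the first FVC measurement here was taken four weeks before the baseline CT.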
 
%% Cell type:markdown id: tags:
 
## C - Preprocessing: Reviewing Data
 
%% Cell type:code id: tags:
 
``` python
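# Unique patient IDs in the training and test sets
patients_train_ids = train_df.Patient.unique()
patient_test_list = test_df.Patient.unique()
# Boolean mask over train_df rows: True where the patient also appears in the test set
list_p = train_df.Patient.isin(patient_test_list)
# Filtering out test patients is currently disabled; all training patients are kept.
# (`pat in list_p` would test the Series index, so the filter compares against the ID list.)
#patients_train_ids = [pat for pat in patients_train_ids if pat not in patient_test_list]
patients_train_ids = list(patients_train_ids)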
```
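 
The train/test overlap check in the cell above can be tried on toy data. The sketch below uses made-up patient IDs and FVC values; it shows that `Series.isin` yields a row-wise mask, whereas the `in` operator on a Series tests index labels rather than values, so excluding test patients is safer against a set of IDs.

```python
import pandas as pd

# Toy data with made-up patient IDs and FVC values
train = pd.DataFrame({"Patient": ["A", "A", "B", "C"],
                      "FVC": [2315, 2214, 3000, 2800]})
test = pd.DataFrame({"Patient": ["B"]})

test_ids = set(test.Patient.unique())

# Row-wise boolean mask: True where the training row belongs to a test patient
mask = train.Patient.isin(test_ids)

# Patient-level filter: keep only IDs that never appear in the test set
train_only_ids = [p for p in train.Patient.unique() if p not in test_ids]

# Pitfall: `in` on a Series checks the index labels (0..3 here), not the values
pitfall = "B" in train.Patient

print(mask.tolist(), train_only_ids, pitfall)
```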
 
%% Cell type:code id: tags:
 
``` python
from preprocessing.read_load_data import print_images_patient
 
patient = 'ID00007637202177411956430'
print_images_patient(input_directory, patient)
```
 
%% Output