Commit cfba3568 authored by ahmetkerem's avatar ahmetkerem

cleanup&readme update

parent 6b0b4d92
MIT License
Copyright (c) 2021 Ahmet Kerem Aksoy
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
......
# Multi-Label Noise Robust Collaborative Learning Model for Remote Sensing Image Classification
This repository contains the code of the paper `Multi-Label Noise Robust Collaborative Learning Model for Remote Sensing Image Classification`. This work has been developed at the [Remote Sensing Image Analysis group](https://www.rsim.tu-berlin.de/menue/remote_sensing_image_analysis_group/) by Ahmet Kerem Aksoy, Mahdyar Ravanbakhsh and Begüm Demir.
If you use this code, please cite our paper given below:
> Ahmet Kerem Aksoy, Mahdyar Ravanbakhsh, Begüm Demir, "[Multi-Label Noise Robust Collaborative Learning Model for Remote Sensing Image Classification](https://arxiv.org/abs/2012.10715)", arXiv preprint arXiv: 2012.10715, 2021.
![](images/swap_v2-1.png)
_RCML intuition: For a batch of input images, the models `f` and `g` need to agree on a subset, which contains only the correctly annotated images._
## Description
We propose a multi-label learning method based on the idea of co-training for scene classification of remote sensing (RS) images with noisy labels. It identifies noisy samples and excludes them from back-propagation, aiming to train the classifier solely with clean samples. It also predicts the most noisy label in each sample.
Two Deep Neural Networks are trained with the same architecture simultaneously. The model is enhanced with a discrepancy module to make the two networks learn complementary features of the same data, while ensuring consistent predictions. This is achieved by creating a statistical difference between the logits of the two networks through the maximum mean discrepancy distance, and then converging the outputs of the networks using the same distance. Learning complementary features allows the two networks to correct each other by selecting clean instances with the loss information provided by the opposite network, and only using the clean instances to update their weights. The proposed method can also identify noisy labels in samples using the group lasso method. It can weigh the potential noisy samples down or up according to the noise type that they include.
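To make this concrete, below is a minimal sketch of an RBF-kernel MMD estimate between two batches. It is illustrative only: the function names and the way the two terms are combined are assumptions, while `sigma`, `lambda2` and `lambda3` correspond to the command line arguments described below; the actual implementation is in the repository's loss modules.

```
import tensorflow as tf

def rbf_kernel(x, y, sigma=1.0):
    # Pairwise RBF kernel values between the rows of x (n, d) and y (m, d).
    sq_dists = tf.reduce_sum(
        tf.square(tf.expand_dims(x, 1) - tf.expand_dims(y, 0)), axis=-1)
    return tf.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd(x, y, sigma=1.0):
    # Biased estimate of the squared maximum mean discrepancy between two batches.
    return (tf.reduce_mean(rbf_kernel(x, x, sigma))
            + tf.reduce_mean(rbf_kernel(y, y, sigma))
            - 2.0 * tf.reduce_mean(rbf_kernel(x, y, sigma)))

# Hypothetical use inside a training step, with logits_1/logits_2 and
# probs_1/probs_2 coming from the two co-trained networks:
#   discrepancy_loss = -lambda2 * mmd(logits_1, logits_2)  # push the logits apart
#   consistency_loss =  lambda3 * mmd(probs_1, probs_2)    # pull the predictions together
```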
## Dependencies
The code in this repository has been tested with `Python 3.6.9`. To run it, the following packages must be installed:
- `tensorflow==2.3.0`
- `tensorflow-gpu==2.3.0`
- `tensorflow-addons==0.11.2`
- `noisifier==0.4.4`
- `scipy==1.5.4`
- `matplotlib==3.3.3`
- `PyYAML==5.3.1`
......@@ -37,52 +36,19 @@ To use BigEarthNet, three [TFRecord](https://www.tensorflow.org/tutorials/load_d
The UC Merced Land Use dataset is loaded through [pickle](https://docs.python.org/3/library/pickle.html) files. To use it, 6 pickle files must be created, namely `x_train.pickle`, `x_validation.pickle`, `x_test.pickle`, `y_train.pickle`, `y_validation.pickle`, `y_test.pickle`. The directory with these 6 files must be provided as `dataset_path` in the command line arguments.
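As an illustration only (this helper is not part of the repository), the six pickle files could be written from already prepared NumPy arrays roughly as follows; the array shapes are placeholders.

```
import os
import pickle
import numpy as np

def dump_split(directory, name, array):
    # Writes one split, e.g. x_train, to <directory>/<name>.pickle.
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, f'{name}.pickle'), 'wb') as f:
        pickle.dump(array, f)

# Dummy arrays standing in for the real images and multi-hot label vectors.
x_train = np.zeros((10, 256, 256, 3), dtype=np.float32)
y_train = np.zeros((10, 17), dtype=np.float32)
# ... repeat for x_validation/y_validation and x_test/y_test ...
dump_split('path/to/ucmerced', 'x_train', x_train)
dump_split('path/to/ucmerced', 'y_train', y_train)
```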
## Synthetic Label Noise
For testing purposes, we introduced synthetic label noise to the datasets using the python package [noisifier](https://git.tu-berlin.de/rsim/noisifier). The `Random Noise per Sample` label noise approach from the package is applied to the dataset labels.
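For intuition only, the `Random Noise per Sample` scheme can be approximated in plain NumPy as sketched below. The experiments use the noisifier package itself, whose API may differ from this illustrative re-implementation; `sample_rate` and `class_rate` mirror the command line arguments described in the next section.

```
import numpy as np

def random_noise_per_sample(y, sample_rate, class_rate, seed=42):
    # y: binary label matrix of shape (num_samples, num_classes).
    # Flips class_rate of the labels in sample_rate of the samples.
    rng = np.random.default_rng(seed)
    y_noisy = y.copy()
    num_samples, num_classes = y.shape
    noisy_rows = rng.choice(num_samples, size=int(sample_rate * num_samples), replace=False)
    num_flips = max(1, int(class_rate * num_classes))
    for row in noisy_rows:
        flip_cols = rng.choice(num_classes, size=num_flips, replace=False)
        y_noisy[row, flip_cols] = 1 - y_noisy[row, flip_cols]
    return y_noisy
```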
## Arguments
- `batch_size`: The size of the mini-batches that are sampled in each epoch from the dataset.
- `epoch`: The number of epochs for the training.
- `arch` : Possible architectures are `resnet`, `denseNet`, `SCNN`, `paper_model`, `keras_model`, `modified_SCNN`, `batched_SCNN`.
- `dataset_path`: The path to the data as described above.
- `channel` : Possible values: `RGB`, `ALL`. Use only `RGB` for UC Merced. Use `ALL` to use 10 bands of BigEarthNet.
- `label` : Possible values are `BEN-12`, `BEN-19`, `BEN-43`, `ucmerced`.
- `sigma` : Sigma for the RBF kernel in MMD.
- `swap` : If 1, swap between the models; if 0, do not swap.
- `swap_rate` : The percentage of swap between models. If there is no swap, set to 1.
- `lambda2` : Strength of the discrepancy component, a value between 0 and 1.
- `lambda3` : Strength of the consistency component, a value between 0 and 1.
- `flip_bound` : The percentage of unflipped training epochs.
- `flip_per` : Class flipping percentage determining how many of the classes agreed on by both networks are flipped.
- `miss_alpha` : Rate of missing label noise to be included in the error loss. More missing labels are detected when this value is high.
- `extra_beta` : Rate of extra label assignment noise to be included in the error loss. More extra label assignments are detected when this value is high.
- `add_noise` : Add noise to the dataset. Enter 0 for no noise; enter 1 for adding noise.
- `noise_type` : Choose the noise type to be added to the dataset. Possible values are 1 for Random Noise per Sample and 2 for Mix Label Noise.
- `sample_rate` : Percentage of samples in a mini-batch to be noisified, a value between 0 and 1.
- `class_rate` : Percentage of labels in a sample to be noisified, a value between 0 and 1. Mix Label Noise does not use this.
- `metric` : Possible metrics: `mmd`, `shannon`, `wasserstein`, `nothing`.
- `alpha` : Influence of the group lasso ranking loss compared to binary cross entropy for detection of noisy samples.
- `test` : 1 to test the model using only a small portion of the datasets. Default is 0.
An example configuration can be found in the `run.sh` file under the `scripts` directory.
## How to run
`pip install -r requirements.txt && cd script/ucmerced && ./run.sh`
## Authors
Ahmet Kerem Aksoy
Tristan Kreuziger
## License
The code in this repository is licensed under the **MIT License**:
```
MIT License
Copyright (c) 2021 Ahmet Kerem Aksoy
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
......
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
File: data.py
Author: Tristan Kreuziger (tristan.kreuziger@tu-berlin.de)
Modified by: Ahmet Kerem Aksoy (a.aksoy@campus.tu-berlin.de)
Created: 2020-07-29 15:18
Copyright (c) 2020 Tristan Kreuziger under MIT license
"""
from functools import partial
# Third party imports
import tensorflow as tf
import numpy as np
import pickle
BAND_STATS = {
......@@ -220,14 +213,6 @@ def load_archive(filenames, num_classes, batch_size=0, shuffle_size=0, test=0, n
unshuffled_dataset_labels = dataset.map(lambda x, y, indices: (y))
######################################
'''
# Exclude the samples that are noisified
if noise_percentage != 0.:
dataset = dataset.filter(lambda x, y, index: tf.math.reduce_all(tf.math.not_equal(index, tf.convert_to_tensor(noisy_sample_indices))))
'''
######################################
# Shuffle the data as the very first step if desired.
if shuffle_size > 0:
dataset = dataset.shuffle(shuffle_size, reshuffle_each_iteration=True)
......
......@@ -3,6 +3,8 @@ File: evaluate.py
Author: Ahmet Kerem Aksoy (a.aksoy@campus.tu-berlin.de)
'''
import os
# Third party imports
import tensorflow as tf
from tensorflow import keras
......@@ -13,9 +15,6 @@ from sklearn.metrics import average_precision_score as aps
# Local imports
from data_prep.stackBands import prepare_input
from utils.logging.printers import test_printer
from utils.logging.sample_saver import SampleSaver
from utils.logging.loggers import TestMetrics
from utils.arguments.add_arguments import add_arguments
from data_prep.prep_tf_records import load_archive, load_ucmerced_set
......@@ -75,7 +74,6 @@ def main(args):
def evaluate(args, metrics, best_model=None):
sampleSaver = SampleSaver()
model1_probabilities = []
model2_probabilities = []
y_true = []
......@@ -114,19 +112,6 @@ def evaluate(args, metrics, best_model=None):
y_true.append(Y)
'''
# If multispectral data, take RGB channels to save the image
if args.label_type != 'ucmerced':
X = sampleSaver.unnormalize_extract_BEN(batch)
# Save samples from network 1
sampleSaver.save_samples(2,
batch_idx,
X,
Y,
predictions1)
'''
print(f'------- TEST RESULTS -------')
test_printer(metrics['testMetrics_1'])
if not args.base:
......@@ -157,43 +142,7 @@ def evaluate(args, metrics, best_model=None):
MAP_macro_2 = aps(y_true, model2_probabilities, average='macro')
print(f"MAP_macro_2: {MAP_macro_2}")
'''
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score
pred_thresholds = np.arange(0.1,1.0,0.05)
print(f"model1_sigmoids: {model1_sigmoids}")
for threshold in pred_thresholds:
print(f"threshold: {threshold}")
print('----------------------------------------------------------')
thresholded_logits_1 = tf.cast(model1_sigmoids >= threshold, tf.float32)
print(f"thresholded_logits_1: {thresholded_logits_1}")
thresholded_logits_2 = tf.cast(model2_sigmoids >= threshold, tf.float32)
p1 = precision_score(y_true, thresholded_logits_1, average='micro', zero_division=0)
p2 = precision_score(y_true, thresholded_logits_2, average='micro', zero_division=0)
r1 = recall_score(y_true, thresholded_logits_1, average='micro', zero_division=0)
r2 = recall_score(y_true, thresholded_logits_2, average='micro', zero_division=0)
f1 = f1_score(y_true, thresholded_logits_1, average='micro', zero_division=0)
f2 = f1_score(y_true, thresholded_logits_2, average='micro', zero_division=0)
c_p1 = precision_score(y_true, thresholded_logits_1, average=None, zero_division=0)
c_p2 = precision_score(y_true, thresholded_logits_2, average=None, zero_division=0)
c_r1 = recall_score(y_true, thresholded_logits_1, average=None, zero_division=0)
c_r2 = recall_score(y_true, thresholded_logits_2, average=None, zero_division=0)
c_f1 = f1_score(y_true, thresholded_logits_1, average=None, zero_division=0)
c_f2 = f1_score(y_true, thresholded_logits_2, average=None, zero_division=0)
for metric in [p1,p2,r1,r2,f1,f2,c_p1,c_p2,c_r1,c_r2,c_f1,c_f2]:
print(metric)
print('----------------------------------------------------------')
# testMetrics.write_summary()
'''
# Save the results for the best model in a numpy array
if best_model:
np.save(os.path.join(args.logname, 'y_res.npy'),
np.array([tf.make_ndarray(tf.make_tensor_proto(y_true)), tf.make_ndarray(tf.make_tensor_proto(model1_probabilities))]),
......
......@@ -26,12 +26,6 @@ def loss_fun(y_batch_train,
weight_rate,
args):
'''
Group Lasso
The error loss array is the average of the group lassos of the extra and missing class labels.
noisy_sample is the sample with the highest loss within the mini-batch.
noisy_class is the class of the noisy_sample that is going to be flipped.
'''
probabilities_1 = tf.math.sigmoid(logits['logits_1'])
error_loss_array_1, classes_1 = groupLasso(y_batch_train, probabilities_1, args.miss_alpha, args.extra_beta)
error_loss_array_1 = tf.stop_gradient(error_loss_array_1)
......@@ -41,26 +35,6 @@ def loss_fun(y_batch_train,
error_loss_array_2, classes_2 = groupLasso(y_batch_train, probabilities_2, args.miss_alpha, args.extra_beta)
error_loss_array_2 = tf.stop_gradient(error_loss_array_2)
'''
if not args.base and start_flipping > args.flip_bound:
"""
Flip the labels
To flip the labels both of the networks must be in consensus
Get the noisy samples with corresponding classes from the mini batch.
"""
noisy_samples, noisy_classes = choose_noisy_classes(error_loss_array_1, error_loss_array_2, classes_1, classes_2, args.flip_per)
# Flip classes
unflipped_y_batch_train = y_batch_train
y_batch_train = flip_classes(y_batch_train, noisy_samples, noisy_classes)
# Calculate the loss arrays again.
error_loss_array_1, classes_1 = groupLasso(y_batch_train, probabilities_1, args.miss_alpha, args.extra_beta)
error_loss_array_1 = tf.stop_gradient(error_loss_array_1)
error_loss_array_2, classes_2 = groupLasso(y_batch_train, probabilities_2, args.miss_alpha, args.extra_beta)
error_loss_array_2 = tf.stop_gradient(error_loss_array_2)
'''
loss_array_1 = calculate_loss(args, y_batch_train, logits['logits_1'], batch_indices, epoch)
loss_1 = tf.reduce_mean(loss_array_1)
......
......@@ -12,24 +12,16 @@ def calculate_loss(args, y, logits, batch_indices, epoch):
loss_array = args.SAT.loss_fn(y, logits, batch_indices, epoch)
elif args.loss_fn == 'ELR':
loss_array = args.ELR.loss_fn(y, logits, batch_indices)
'''
elif args.loss_fn == 'JOCOR':
loss_array = args.JoCor.loss_fn(y_batch_train, logits, logits['logits_2'], epoch)
'''
return loss_array
def bce_loss(y_batch_train, logits):
loss_array = tf.reduce_sum(tf.nn.weighted_cross_entropy_with_logits(y_batch_train, logits, 1.0), axis=1)
return loss_array
def focal_loss(y_batch_train, logits):
# FOCAL LOSS TFA IMPLEMENTATION
loss_array = tfa.losses.sigmoid_focal_crossentropy(y_batch_train, logits, from_logits=True)
return loss_array
......@@ -72,12 +64,6 @@ class ELR:
self.bce_with_logits = tf.nn.sigmoid_cross_entropy_with_logits
def loss_fn(self, labels, logits, indices):
"""Early Learning Regularization.
Args
* `index` Training sample index, due to training set shuffling, index is used to track training examples in different iterations.
* `output` Model's logits, same as PyTorch provided loss functions.
* `label` Labels, same as PyTorch provided loss functions.
"""
y_pred = tf.math.sigmoid(logits)
y_pred = tf.clip_by_value(y_pred, 1e-4, 1.0 - 1e-4)
y_pred_ = tf.stop_gradient(y_pred)
......@@ -90,139 +76,3 @@ class ELR:
final_loss = ce_loss + self.lam * elr_reg
return final_loss
class LossJoCoR:
def __init__(self, forget_rate=0.5, num_gradual=10, co_lambda=0.1, exponent=1, n_epoch=100):
self.co_lambda = co_lambda
self.rate_schedule = np.ones(n_epoch) * forget_rate
self.rate_schedule[:num_gradual] = np.linspace(0, forget_rate ** exponent, num_gradual)
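# The forget rate is ramped up linearly over the first num_gradual epochs,
# so more (potentially noisy) samples are dropped as training progresses.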
self.bce_with_logits = tf.nn.sigmoid_cross_entropy_with_logits
self.kld = tf.keras.losses.KLDivergence(reduction=tf.keras.losses.Reduction.NONE)
def kl_loss_compute(self, logits, soft_labels, reduction='none'):
kl = self.kld(tf.math.log_sigmoid(logits), tf.math.sigmoid(soft_labels))
if reduction != 'none':
return tf.math.reduce_mean(kl)
else:
return kl
def loss_fn(self, labels, logits1, logits2, epoch):
loss_pick_1 = tf.math.reduce_sum(self.bce_with_logits(labels, logits1) * (1 - self.co_lambda), 1)
loss_pick_2 = tf.math.reduce_sum(self.bce_with_logits(labels, logits2) * (1 - self.co_lambda), 1)
loss_kl1 = self.kl_loss_compute(logits1, logits2, reduction='none')
loss_kl2 = self.kl_loss_compute(logits2, logits1, reduction='none')
loss_pick = (loss_pick_1 + loss_pick_2 + self.co_lambda * loss_kl1 + self.co_lambda * loss_kl2)
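# Small-loss selection: sort the joint loss and keep only the fraction of samples
# allowed by the current remember rate, treating them as likely clean.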
ind_sorted = np.argsort(tf.stop_gradient(loss_pick))
loss_sorted = tf.gather(loss_pick, ind_sorted)
remember_rate = 1 - self.rate_schedule[epoch - 1]
num_remember = int(remember_rate * len(loss_sorted))
ind_update = ind_sorted[:num_remember]
# exchange
loss = tf.gather(loss_pick, ind_update)
return loss
'''
class AsymmetricLoss:
def __init__(self, gamma_neg=4, gamma_pos=1, clip=0.05, eps=1e-8, disable_torch_grad_focal_loss=True):
self.gamma_neg = gamma_neg
self.gamma_pos = gamma_pos
self.clip = clip
self.disable_torch_grad_focal_loss = disable_torch_grad_focal_loss
self.eps = eps
def loss_fn(self, x, y):
""""
Parameters
----------
x: input logits
y: targets (multi-label binarized vector)
"""
# Calculating Probabilities
x_sigmoid = tf.math.sigmoid(x)
xs_pos = x_sigmoid
xs_neg = 1 - x_sigmoid
# Asymmetric Clipping
if self.clip is not None and self.clip > 0:
xs_neg = (xs_neg + self.clip).clamp(max=1)
# Basic CE calculation
los_pos = y * torch.log(xs_pos.clamp(min=self.eps))
los_neg = (1 - y) * torch.log(xs_neg.clamp(min=self.eps))
loss = los_pos + los_neg
# Asymmetric Focusing
if self.gamma_neg > 0 or self.gamma_pos > 0:
if self.disable_torch_grad_focal_loss:
torch._C.set_grad_enabled(False)
pt0 = xs_pos * y
pt1 = xs_neg * (1 - y) # pt = p if t > 0 else 1-p
pt = pt0 + pt1
one_sided_gamma = self.gamma_pos * y + self.gamma_neg * (1 - y)
one_sided_w = torch.pow(1 - pt, one_sided_gamma)
if self.disable_torch_grad_focal_loss:
torch._C.set_grad_enabled(True)
loss *= one_sided_w
return -loss.sum()
class AsymmetricLossOptimized:
"""
Notice - optimized version, minimizes memory allocation and gpu uploading,
favors inplace operations.
"""
def __init__(self, gamma_neg=4, gamma_pos=1, clip=0.05, eps=1e-8, disable_torch_grad_focal_loss=False):
self.gamma_neg = gamma_neg
self.gamma_pos = gamma_pos
self.clip = clip
self.disable_torch_grad_focal_loss = disable_torch_grad_focal_loss
self.eps = eps
# prevent memory allocation and gpu uploading every iteration, and encourages inplace operations
self.targets = self.anti_targets = self.xs_pos = self.xs_neg = self.asymmetric_w = self.loss = None
def loss_fn(self, x, y):
""""
Parameters
----------
x: input logits
y: targets (multi-label binarized vector)
"""
self.targets = y
self.anti_targets = 1 - y
# Calculating Probabilities
self.xs_pos = tf.math.sigmoid(x)
self.xs_neg = 1.0 - self.xs_pos
# Asymmetric Clipping
if self.clip is not None and self.clip > 0:
self.xs_neg.add_(self.clip).clamp_(max=1)
# Basic CE calculation
self.loss = self.targets * torch.log(self.xs_pos.clamp(min=self.eps))
self.loss.add_(self.anti_targets * torch.log(self.xs_neg.clamp(min=self.eps)))
# Asymmetric Focusing
if self.gamma_neg > 0 or self.gamma_pos > 0:
if self.disable_torch_grad_focal_loss:
torch._C.set_grad_enabled(False)
self.xs_pos = self.xs_pos * self.targets
self.xs_neg = self.xs_neg * self.anti_targets
self.asymmetric_w = torch.pow(1 - self.xs_pos - self.xs_neg,
self.gamma_pos * self.targets + self.gamma_neg * self.anti_targets)
if self.disable_torch_grad_focal_loss:
torch._C.set_grad_enabled(True)
self.loss *= self.asymmetric_w
return -self.loss.sum()
'''
......@@ -20,7 +20,6 @@ from data_prep.prep_tf_records import load_archive, load_ucmerced_set
from utils.logging.dataset_statistics import count_samples_per_num_classes, count_samples_per_class
from utils.logging.setup_logger import setup_logger
from losses.losses import SelfAdaptiveTrainingCE, ELR, LossJoCoR
from data_prep.get_noisy_sample_indices import get_noisy_sample_indices, get_noisy_labels_per_noisy_sample
def main(args):
......@@ -33,6 +32,7 @@ def main(args):
# Setup loggers
args.noise_comparison_logger = setup_logger('noise_comparison', args.logname, 'noise_comparison.log')
# Set seeds
tf.random.set_seed(args.seed)
np.random.seed(args.seed)
......@@ -62,19 +62,6 @@ def main(args):
NOISY_SAMPLE_INDICES = None
NOISY_LABELS_PER_SAMPLE = None
'''
print(NOISY_SAMPLE_INDICES)
print(NOISY_LABELS_PER_SAMPLE)
try:
print(NOISY_SAMPLE_INDICES.shape)
except:
print(len(NOISY_SAMPLE_INDICES))
try:
print(NOISY_LABELS_PER_SAMPLE.shape)
except:
print(len(NOISY_LABELS_PER_SAMPLE))
'''
# Load the dataset
if args.label_type == 'ucmerced':
train_dataset, val_dataset, test_dataset, unshuffled_dataset_labels = load_ucmerced_set(args.dataset_path, args.batch_size,
......@@ -86,20 +73,9 @@ def main(args):
args.batch_size, 1000, args.test)
val_dataset, _ = load_archive(args.dataset_path + '/val.tfrecord', NUM_OF_CLASSES,
args.batch_size, 1000, args.test)
# Get some statistics
# count_samples_per_num_classes(train_dataset, val_dataset, test_dataset)
# count_samples_per_class(train_dataset, val_dataset, test_dataset)
else:
raise ValueError('Argument Error: Give the path to the folder where tf.record files are located.')
'''
a = 0
for batch in train_dataset:
a += batch[1].shape[0]
print(f'Number of samples in the training set is {a}')
'''
modelFactory = ModelFactory(NUM_OF_CLASSES, args.batch_size, args.epochs, f'{args.logname}/models')
model1 = modelFactory.create_model('model1', args.channels, args.architecture, args.label_type)
model1.build()
......
......@@ -40,13 +40,6 @@ def run(args):
if not args.base:
optimizer_2 = keras.optimizers.SGD(learning_rate=lr_schedule_2)
optimizers['optimizer_2'] = optimizer_2
'''
optimizer_1 = keras.optimizers.Adam(learning_rate=0.001)
optimizers = {'optimizer_1': optimizer_1}
if not args.base:
optimizer_2 = keras.optimizers.Adam(learning_rate=0.001)
optimizers['optimizer_2'] = optimizer_2
'''
trainMetrics_1 = TrainMetrics(logdir, '1', args.num_classes)
valMetrics_1 = ValMetrics(logdir, '1', args.num_classes)
......@@ -99,21 +92,10 @@ def run(args):
# Evaluate the last model
evaluate(args, metrics)
'''
if not args.test:
draw_histogram_plots(args.logname, samplewise_focal_losses_1_for_n_epochs,
samplewise_focal_losses_2_for_n_epochs, samplewise_error_losses_1_for_n_epochs,
samplewise_error_losses_2_for_n_epochs)
'''
if args.logname[-2:] != '/0':
# Load the best model
model_path = os.path.join(args.logname, 'models/best.h5')
best_model = tf.keras.models.load_model(model_path)
# Validation results for the best model
print('----------VALIDATION RESULTS USING THE BEST MODEL----------')
validate(args, metrics, epoch, best_model)
# Evaluate the best model
print('----------TEST RESULTS USING THE BEST MODEL----------')
......
import numpy as np
import tensorflow as tf
import math
@tf.function
def groupLasso(y,p):
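# Group-lasso ranking loss for multi-label noise detection.
# For each sample, every (present, absent) label pair contributes a squared hinge
# penalty when the absent class is not ranked sufficiently below the present class.
# Summing these penalties per absent class (groups_miss) flags potentially missing
# labels; summing per present class (groups_extra) flags potentially extra labels.
# The function returns the averaged per-sample loss and, per sample, the class with
# the highest group score, i.e. the most likely noisy label.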
inverted_y = -y + 1
select_examples = tf.cast(tf.expand_dims(y, 1) * tf.expand_dims(inverted_y, 2), tf.float32)
k = tf.subtract(tf.expand_dims(p, 2), tf.expand_dims(p, 1))
k = tf.math.multiply(k, 2)
k = tf.math.add(k, 1)
k = tf.math.maximum(tf.constant(0, dtype=tf.float32), k)
errors = tf.math.square(k)
# Select instances
errors = errors * select_examples
# Calculate the loss_miss
k = tf.reduce_sum(errors, 2)
groups_miss = tf.math.sqrt(k)
loss_miss = tf.reduce_sum(groups_miss, axis=1)
# Choose the potential missing
potential_missing = tf.argmax(groups_miss, axis=1)
missing_high = tf.reduce_max(groups_miss, axis=1)
# Calculate the loss_extra
k = tf.reduce_sum(errors, 1)
groups_extra = tf.math.sqrt(k)
loss_extra = tf.reduce_sum(groups_extra, 1)
# Choose potential extra
potential_extra = tf.argmax(groups_extra, 1)
extra_high = tf.reduce_max(groups_extra, 1)
# Choose the missing or the extra
CLASSES = tf.where(missing_high > extra_high, potential_missing, potential_extra)
# Enter the loss for the sample
loss = (loss_miss + loss_extra) / 2.0
return loss, CLASSES
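# Illustrative usage with assumed toy shapes (not part of the original file):
#   y = tf.constant([[1., 0., 1.], [0., 1., 0.]])        # multi-hot labels
#   p = tf.constant([[0.9, 0.8, 0.2], [0.1, 0.7, 0.3]])  # predicted probabilities
#   loss, noisy_classes = groupLasso(y, p)
# loss holds one ranking loss per sample; noisy_classes holds, per sample, the index
# of the label that is most likely missing or wrongly assigned.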
#@tf.function
def noForLasso(y,p):
# Get the ones and zeros as ragged tensors
bool_y = tf.cast(y, tf.bool)
ones = tf.ragged.boolean_mask(p, bool_y)
inverted_labels_bool = tf.math.logical_not(bool_y)
zeros = tf.ragged.boolean_mask(p, inverted_labels_bool)
# Stack corresponding ones and zeros on top of each other
one_zero_pairs = tf.ragged.stack([ones,zeros], axis=1)
# print(one_zero_pairs.shape)
# print(one_zero_pairs)
LOSSES = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True)
CLASSES = []
return LOSSES, CLASSES
def calc_errors(one_zero_pairs):
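# Mirrors the error computation in groupLasso for a single (present, absent)
# probability pair grouping: the squared hinge errors are built once oriented for
# missing labels (errors_miss) and once for extra labels (errors_extra).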
# Calculate missing class errors
onez = tf.expand_dims(one_zero_pairs[0], axis=0)
zeroz = tf.expand_dims(one_zero_pairs[1], axis=1)
k = tf.math.subtract(zeroz, onez)
k = tf.math.multiply(k, 2)
k = tf.math.add(k, 1)
k = tf.math.maximum(tf.constant(0, dtype=tf.float32), k)
errors_miss = tf.math.square(k)
# Calculate extra class errors
onez = tf.expand_dims(one_zero_pairs[0], axis=1)
zeroz = tf.expand_dims(one_zero_pairs[1], axis=0)
k = tf.math.subtract(zeroz, onez)
k = tf.math.multiply(k, 2)
k = tf.math.add(k, 1)
k = tf.math.maximum(tf.constant(0, dtype=tf.float32), k)
errors_extra = tf.math.square(k)