
HARL - Heuristic Attention Representation Learning for Self-Supervised Pretraining

(HARL Paper here)

This is the official implementation of the HARL framework.

Colab notebooks for the HARL framework have been added; see here.

HARL Framework Illustration
End-to-End HARL Framework (from our blog here).


Installation

Install the following dependencies on your local machine with pip or conda.

Requirements

  • torch
  • torchvision
  • tqdm
  • einops
  • wandb
  • pytorch-lightning
  • lightning-bolts
  • torchmetrics
  • scipy
  • timm
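
A minimal sketch of installing these requirements with pip (package names as listed above; pinned versions are not specified in this README, so adjust as needed):

```bash
pip install torch torchvision tqdm einops wandb \
    pytorch-lightning lightning-bolts torchmetrics scipy timm
```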

Dataset -- Heuristic Mask Retrieval Techniques

If you are using the ImageNet-1K dataset for self-supervised pretraining, we provide two sets of heuristic masks, generated for the whole ImageNet train set, available for download.

Heuristic Mask Dataset

  • DRFI Mask
  • Unsupervised Deep Learning Mask

Using a Custom Dataset

1. Generating a Heuristic Binary Mask Using the Deep Learning Method

We created a Python module that works directly with the input directory of your dataset to generate the masks. Run it by providing the filename:

```
/heuristic_mask_techniques/Deeplearning_methods/DeepMask.py
```
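
A hypothetical invocation, assuming the module is run as a script and takes the dataset and output directories on the command line (the flag names below are illustrative, not documented in this README, so check the module's argument definitions):

```bash
# Illustrative only: --data_dir / --output_dir are assumed flag names.
python heuristic_mask_techniques/Deeplearning_methods/DeepMask.py \
    --data_dir /path/to/your/dataset \
    --output_dir /path/to/output/binary_masks
```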

Self-supervised Pretraining

Preparing Dataset:

NOTE: Currently, this repository supports self-supervised pretraining on the ImageNet dataset.

    1. Download the ImageNet-1K dataset (https://www.image-net.org/download.php), then unzip it following the standard ImageNet folder structure (see the layout sketch after this list).
    2. Download the mask dataset, either the DRFI or the Deep Learning masks.
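
The pretraining flags below assume the standard ImageNet folder layout, roughly as follows (the synset folder name is just an example):

    ILSVRC2012/
        train/
            n01440764/
                *.JPEG
            ...
        val/
            n01440764/
                *.JPEG
            ...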

Pretraining flags:

Navigate to `bash_files/pretrain/imagenet/HARL.sh`

1. Change the dataset directories according to your paths:

    --train_dir ILSVRC2012/train \
    --val_dir ILSVRC2012/val \
    --mask_dir train_binary_mask_by_USS \

2. Other hyperparameter settings

  • Use a large initial learning rate {0.3, 0.4} for short training epochs. This achieves better performance, which could otherwise be hidden by the initialization if the learning rate is too small. For longer training epochs, use a smaller initial learning rate, around 0.2.

    --max_epochs 100 \
    --batch_size 512 \
    --lr 0.5 \

3. Distributed training on 1 node

Control the number of GPUs on your machine by changing the --gpus flag:

    --gpus 0,1,2,3,4,5,6,7 \
    --accelerator gpu \
    --strategy ddp \
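
These flags live in the HARL.sh script referenced above; a minimal sketch for launching single-node pretraining after editing the paths and hyperparameters inside the script (assuming it is run from the repository root):

```bash
# Edit --train_dir / --val_dir / --mask_dir and the hyperparameters inside the
# script to match your setup, then launch pretraining on one node.
bash bash_files/pretrain/imagenet/HARL.sh
```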

HARL Pre-trained models

We open-sourced a total of 8 pretrained models here, corresponding to those in Table 1 of the HARL paper:

| Depth | Width | SK | Param (M) | Pretrained epochs | SSL pretrained learning rate | Projection head MLP dimension | Heuristic mask | Linear eval |
|---|---|---|---|---|---|---|---|---|
| ResNet50 (1x) | 1X | False | 24 | 1000 | 0.5 | 256 | Deep Learning mask | 73.6 |
| ResNet50 (1x) | 1X | False | 24 | 1000 | 0.3 | 256 | Deep Learning mask | 73.8 |
| ResNet50 (1x) | 1X | False | 24 | 1000 | 0.2 | 512 | Deep Learning mask | 74.0 |
| ResNet50 (1x) | 1X | False | 24 | 300 | 0.3 | 256 | Deep Learning mask | 69.4 |
| ResNet50 (1x) | 1X | False | 24 | 300 | 0.4 | 512 | Deep Learning mask | 70.7 |
| ResNet50 (1x) | 1X | False | 24 | 300 | 0.5 | 512 | Deep Learning mask | 71.4 |
| ResNet50 (1x) | 1X | False | 24 | 100 | 0. | 512 | DRFI mask | 61.2 |
| ResNet50 (1x) | 1X | False | 24 | 100 | 0.2 | 512 | Deep Learning mask | 62.0 |

These checkpoints are stored in Google Drive Storage:

Finetuning the linear head (linear eval)

To fine-tune a linear head (with a single GPU), try the following command:

To fine-tune a linear head on ImageNet using GPUs, first set CHKPT_DIR to the pretrained model directory and set a new MODEL_DIR, then use the following command. Stay tuned! The instructions will be updated soon.
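
As a placeholder until the full instructions are published, a sketch of the two variables described above (paths are illustrative):

```bash
# CHKPT_DIR points at the downloaded pretrained checkpoint; MODEL_DIR is a new
# directory for the linear-eval outputs. Both variable names come from this README.
CHKPT_DIR=/path/to/pretrained/checkpoint
MODEL_DIR=/path/to/linear_eval_output
```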

Semi-supervised learning and fine-tuning the whole network

You can access the 1% and 10% ImageNet subsets used for semi-supervised learning via TensorFlow Datasets: simply set dataset=imagenet2012_subset/1pct or dataset=imagenet2012_subset/10pct on the command line to fine-tune on these subsets.
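
For example, a hypothetical fine-tuning launch on the 1% subset (the actual fine-tuning entry point is not named in this README, so the script name below is a placeholder):

```bash
# Swap 1pct for 10pct to use the 10% labelled subset.
python finetune.py dataset=imagenet2012_subset/1pct
```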

You can also find image IDs of these subsets in imagenet_subsets/.

To fine-tune the whole network on ImageNet (1% of labels), refer to the following command:

Stay tuned! The instructions will be updated soon.

Other resources

Our official implementations in different frameworks

(Feel free to share your implementation by creating an issue)

Implementations in TensorFlow 2:

Known issues

  • Pretrained models / checkpoints: HARL models are pretrained with different weight decays, so the pretrained models from the two versions have very different weight-norm scales. For fine-tuning the pretrained models from both versions, a LARS optimizer works fine, but the momentum optimizer requires very different hyperparameters (e.g. learning rate, weight decay). So for the latter case, you may want to either search for very different hyperparameters according to which version is used, or re-scale the weights (i.e. the conv kernel parameters of base_model in the checkpoints) to make sure they are roughly on the same scale.

Citation

@article{Tran2022HeuristicAR,
  title={Heuristic Attention Representation Learning for Self-Supervised Pretraining},
  author={Van-Nhiem Tran and Shenxiu Liu and Yung-hui Li and Jia-Ching Wang},
  journal={Sensors (Basel, Switzerland)},
  year={2022},
  volume={22}
}
