How I got my first paper: Part 1
It was 2019 and I was in the second semester of my MS when I decided I should start looking for a potential topic for my Thesis. My plan was to work with a researcher in the hope of finding something along the way that I could turn into a Thesis. I emailed different professors, initially at the university where I was studying, got responses, and started working with one of them. After a week or so I realized that I was not enjoying it, so I stopped.

I then started emailing professors at other universities, got a response from one of them, and went to meet him. My initial intention was a summer internship, after which I would decide whether or not to pursue my Thesis with the same professor. When I met him, however, he offered me a Thesis on one of the projects in his lab. I was not expecting such an offer, so I asked him for a week to think about it.

I went back after a week and decided to work on Image Matching in collaboration with a researcher already working in the lab. A PhD student was also working on the same topic, and the researcher I started working with had done his MS Thesis on it as well. My thinking was that it would be easier to publish a paper with so many people around to help, so I started working with them.
The problem statement was as follows:
Given an image, find similar images from a database (an image search engine).
Two ways possible:
Street-Satellite Matching (Multi-View)
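To make the problem statement concrete, here is a minimal sketch of an image search over precomputed feature vectors. It assumes every image has already been turned into a feature vector, and it ranks the database by cosine similarity. The feature values, image ids, and function names are made up for illustration, and cosine similarity is only a stand-in here, not the method from the Thesis.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def search(query_vec, database, top_k=3):
    """Return the top_k (image_id, score) pairs, most similar first."""
    scored = [(image_id, cosine_similarity(query_vec, vec))
              for image_id, vec in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical 3-dimensional features; real features would come from a network.
database = {
    "street_001": [0.9, 0.1, 0.0],
    "street_002": [0.0, 1.0, 0.0],
    "satellite_001": [0.8, 0.2, 0.1],
}
results = search([1.0, 0.0, 0.0], database, top_k=2)
print(results)
```

In the multi-view case the query would be a street image and the database would hold satellite images, so the hard part is learning features that are comparable across the two views.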
After the problem statement, I started off by reading the Thesis of the researcher with whom I was working. In short, it said:
Learn features with an autoencoder network, then use a classifier to predict whether or not two images are similar.
One problem I was told about:
Angle differences between images, or any other transformation that is not known at test time.
The available methods were performing at around:
Street-Street — 99%
Street-Satellite — 90–91%, SOTA-93% (2017)
So, we decided to work on Street-Satellite Image Matching, with the initial assumption that coordinates would be available for the dataset. Before that, though, I had to read the complete Thesis, and this is what I learned from it.
There are 4 major steps in Content Based Image Retrieval:
The Thesis utilized residual short skip connections, and a ResNet with 18 layers was used. Performance was evaluated on two benchmark remote sensing datasets:
Land Use/Land Cover Dataset
21 classes, 100 images each, 256 x 256
19 classes, 1005 images in total, 600 x 600
Some other datasets reviewed were as follows:
GTCross View (used for training)
8 cities aligned, used for training
3 cities unaligned
11 US cities in total
1 million pairs
Street View & Overhead Images
30,000 unique landmarks, 2 million images annotated by humans
The performance metrics used were as follows:
Average Normalized Modified Retrieval Rank (AN-MRR), with values in [0, 1]
A low AN-MRR means more accurate retrieval
Mean Average Precision
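Both metrics can be sketched in a few lines of plain Python. The AN-MRR part follows the usual MPEG-7 style definition as I understand it (a rank cutoff K = min(4·NG, 2·max NG) and a penalty rank of 1.25·K for relevant items that fall outside it), which may differ in detail from what the Thesis used. The function names and toy rankings are hypothetical, and the sketch assumes unique ids in each ranking and at least one relevant image per query.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)

def mean_average_precision(queries):
    """queries is a list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, g) for r, g in queries) / len(queries)

def nmrr(ranked_ids, relevant_ids, max_ground_truth):
    """Normalized modified retrieval rank for one query, in [0, 1]."""
    relevant = set(relevant_ids)
    ng = len(relevant)                      # number of ground-truth items
    k = min(4 * ng, 2 * max_ground_truth)   # rank cutoff
    rank_of = {item: rank for rank, item in enumerate(ranked_ids, start=1)}
    penalised = []
    for item in relevant:
        rank = rank_of.get(item)
        # Items missing or ranked beyond the cutoff get the penalty rank.
        penalised.append(rank if rank is not None and rank <= k else 1.25 * k)
    avr = sum(penalised) / ng
    mrr = avr - 0.5 * (1 + ng)
    return mrr / (1.25 * k - 0.5 * (1 + ng))

def anmrr(queries):
    """Average NMRR over all queries; lower is better."""
    max_gt = max(len(set(g)) for _, g in queries)
    return sum(nmrr(r, g, max_gt) for r, g in queries) / len(queries)

# Toy example: one perfect ranking and one mediocre ranking.
queries = [
    (["a", "b", "c", "d"], ["a", "b"]),
    (["x", "a", "y", "b"], ["a", "b"]),
]
map_score = mean_average_precision(queries)
anmrr_score = anmrr(queries)
print(map_score, anmrr_score)
```

A perfect ranking gives an average precision of 1 and an NMRR of 0, while a query whose relevant images are never retrieved gives an NMRR of 1, so the two metrics point in opposite directions.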
A deep autoencoder learns powerful features of all images in the dataset.
A discriminative classification network takes a pair of features, query and target, and classifies whether the two are similar.
The generalization capability of Batch Normalization is better than that of dropout.
The model was mainly trained on GTCross View and then fine-tuned on the other datasets. GTCross View contains 1 million pairs with both street and satellite views, which is why it was used as the base for training.
Cross entropy was used as the loss for the discriminative network.
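As a rough illustration of the pair classification and its loss, here is a sketch in plain Python. It assumes the query and target features are simply concatenated and passed through a single linear layer with a two-class softmax. The notes above do not say how the Thesis combines the pair or how deep its classifier is, so the layer, the weights, and the function names are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])

def classify_pair(query_feat, target_feat, weights, biases):
    """Return [P(different), P(similar)] for one (query, target) feature pair."""
    pair = query_feat + target_feat  # list concatenation (an assumption)
    logits = [sum(w * x for w, x in zip(row, pair)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)

# Hypothetical 2-dimensional features for a matching pair (label 1 = similar).
query_feat, target_feat = [1.0, 0.0], [1.0, 0.0]

# Untrained classifier: all-zero weights give a 50/50 prediction.
untrained = classify_pair(query_feat, target_feat,
                          [[0.0] * 4, [0.0] * 4], [0.0, 0.0])
loss_untrained = cross_entropy(untrained, 1)

# Hand-picked weights that favour "similar" for this particular pair.
tuned = classify_pair(query_feat, target_feat,
                      [[0.0] * 4, [1.0, 0.0, 1.0, 0.0]], [0.0, 0.0])
loss_tuned = cross_entropy(tuned, 1)
print(loss_untrained, loss_tuned)
```

Training would adjust the weights to push this loss down over many labelled pairs; the sketch only shows the forward pass and the loss for a single pair.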
After reading the Thesis, the next step was to perform a literature review, which I will hopefully write about soon.