Q&A with Salesforce Research Intern Akhilesh Gotmare on how "Optimization and Machine Learning" led him to ICLR.

In the research paper “A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation,” Futureforce PhD Intern Akhilesh Gotmare worked with Research Scientist Nitish Shirish Keskar, Director of Research Caiming Xiong, and Salesforce Chief Scientist Richard Socher to leverage recent tools built specifically for analyzing deep networks, namely visualization, mode connectivity, and singular value canonical correlation analysis (SVCCA). Using these tools, they investigated three strategies in detail (a code sketch follows the list):

  1. Cosine learning rate decay
  2. Learning rate warmup
  3. Knowledge distillation
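
To make these heuristics concrete, here is a minimal PyTorch sketch of cosine learning rate decay with linear warmup, plus a standard knowledge distillation loss. The hyperparameters (warmup_steps, total_steps, the temperature T) and the placeholder model are illustrative assumptions, not the exact setup from the paper.

```python
import math
import torch

# Illustrative hyperparameters; the paper's actual values may differ.
warmup_steps = 500
total_steps = 10_000
base_lr = 0.1

model = torch.nn.Linear(10, 2)  # placeholder model for the sketch
optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)

def lr_lambda(step):
    # Heuristic 2: linear warmup from 0 up to base_lr over warmup_steps.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    # Heuristic 1: cosine decay from base_lr toward 0 afterwards.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
# In a training loop, call optimizer.step() then scheduler.step() each step.

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Heuristic 3: train the student to match the teacher's
    # temperature-softened output distribution.
    soft_targets = torch.softmax(teacher_logits / T, dim=-1)
    log_probs = torch.log_softmax(student_logits / T, dim=-1)
    return -(soft_targets * log_probs).sum(dim=-1).mean() * (T * T)
```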

This research was one of six papers from the Salesforce Research Team recently accepted to the Seventh International Conference on Learning Representations (ICLR).

In the Q&A below, Gotmare explains why he chose to focus on this research and shares his reaction to the ICLR acceptance.

What is your primary focus area?
My focus area is Optimization and Machine Learning. I have found optimization algorithms to be at the heart of most machine learning problems, which makes them a sweet spot for research that spans a wide array of niche areas within ML.

Tell us about your ICLR research.
My work at Salesforce revolved around experimentally verifying some of the conjectures made about the heuristics involved in training deep neural nets. Previous works have proposed novel ideas that benefit the DNN training procedure but offer oversimplified intuitions to explain their findings. We revisit these propositions and verify them using the novel tools of mode connectivity and Canonical Correlation Analysis (CCA) on layer activations (sketched below).
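
As a rough illustration of these two tools, the sketch below shows a linear stand-in for a mode connectivity path and a CCA-based similarity score between layer activations. The array shapes, random data, and use of scikit-learn's CCA are assumptions made for the example; the paper's actual method learns a curve between solutions and uses refined CCA variants.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# --- Mode connectivity (simplified) ---
# The paper examines low-loss curves between two trained solutions; a
# straight line is the simplest stand-in. Weights are flat vectors here.
def linear_path(weights_a, weights_b, alpha):
    """Point at fraction alpha along the segment between two solutions."""
    return (1.0 - alpha) * weights_a + alpha * weights_b

# Evaluating the training loss at several alpha values in [0, 1] reveals
# whether the two solutions are connected by a low-loss path.

# --- CCA on layer activations ---
# Hypothetical activations: same layer, same 1000 inputs, two checkpoints.
rng = np.random.default_rng(0)
acts_a = rng.standard_normal((1000, 64))  # (num_examples, num_neurons)
acts_b = rng.standard_normal((1000, 64))

n_components = 10  # illustrative choice
cca = CCA(n_components=n_components, max_iter=2000)
proj_a, proj_b = cca.fit_transform(acts_a, acts_b)

# Mean correlation across canonical directions: a crude similarity score
# between what the layer represents at the two checkpoints.
corrs = [np.corrcoef(proj_a[:, i], proj_b[:, i])[0, 1]
         for i in range(n_components)]
print("mean CCA correlation:", float(np.mean(corrs)))
```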

What was your reaction upon finding out your research had been selected for ICLR?
I was very happy, as this will be my first paper published at an ML conference. It was 4 a.m. where I live when I heard from my co-author about the acceptance, and I couldn’t sleep for the rest of the night. The enormous amount of effort my co-authors and I had put into this paper was all worth it when I received the acceptance decision.

How might your research benefit the average human?
Understanding deep learning phenomena promotes principled approaches to DL research and has the potential to unlock novel, efficient training schemes. A major motivation for my research on novel parallelization methods, too, is improving the efficiency of deep learning training procedures, which typically require tremendous computing resources. Lower computation costs would bring down the barrier to benefiting from these advances in deep learning, making the field more open and accessible.

What would you tell someone interested in doing a PhD internship with the research team at Salesforce?
As a new member of the Salesforce Ohana, things might seem too good to be true at first, but with time you’d realize this is a genuinely great place to work. The research team works on a very diverse set of problems, offering you the freedom to work on practically any area of DL research.

Akhilesh will be joining our team as a full-time employee this month; catch him at ICLR to learn more! To learn how you can join him on our Salesforce Research Team, check out available roles at: https://einstein.ai/careers.