Interview Questions — Azure Data Engineer

Suraj Jeswara
Aug 22, 2023

Hey guys! For the past month I’ve been appearing for interviews as a data engineer, and I’ve come across some questions that came up in most of them. So if you are an aspiring data engineer, or even an experienced one, I’m sure this blog will help you in one way or another.

Usually the interviewer starts with your introduction and project experience, and the questions follow from that. As a data engineer, and more specifically an Azure Data Engineer, you are expected to know the following technologies in most cases:

  • Azure Data Factory
  • Azure Databricks
  • Azure Synapse
  • Azure Logic Apps
  • SQL
  • PySpark

Since most of my experience was in Azure Data Factory, I was asked more questions about Azure Data Factory, SQL and Python, and some about Spark.

Questions

Let’s look at some of the questions I encountered in my interviews:

  • How to do an incremental load in ADF?
  • What is data profiling?
  • Difference between ETL and ELT?
  • Difference between data lake and delta lake?
  • Azure Blob Storage vs. Azure Data Lake Storage (ADLS) Gen2?
  • Can we call a pipeline iteratively in ADF?
  • How can you ingest and store on-premises data in Azure Blob Storage?
  • What are Indexes?
  • What is Azure Key Vault used for?
  • What is list comprehension?
  • What is map function?
  • What are transformations and what are actions in Spark?
  • What is Lazy Evaluation?
  • What is SparkContext?
  • Difference between a pandas DataFrame and a PySpark DataFrame?
  • Have you worked with streams? How can streams be processed?
  • How to connect ADF with Data Governance tools?
  • Moving sum partition by group?
  • Why is Parquet used by a lot of systems?
  • Difference between repartition and coalesce?
  • What is CTE?
  • Difference between delete and truncate?
  • What are Delta tables, and what advantages do they have over DataFrames?
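
To give a flavour of the Python-side questions in this list (list comprehension, map, and a moving sum partitioned by group), here is a minimal sketch in plain Python. The sample data and the function name `running_sum_by_group` are made up for illustration and are not taken from any of the interviews. In SQL the last one would usually be a window function along the lines of SUM(...) OVER (PARTITION BY ... ORDER BY ...); this is only one possible way to show the same idea.

```python
from collections import defaultdict

# List comprehension: build a new list from an iterable in one expression.
numbers = [1, 2, 3, 4, 5]
even_squares = [n * n for n in numbers if n % 2 == 0]

# map: apply a function to every element. In Python 3 it returns a lazy
# iterator, so wrap it in list() to materialise the values.
doubled = list(map(lambda n: n * 2, numbers))


def running_sum_by_group(rows):
    """Running total of the amount within each group, in input order.

    Assumes rows are (group, amount) pairs already sorted the way you
    want the running total to accumulate.
    """
    totals = defaultdict(int)  # one running total per group
    result = []
    for grp, amount in rows:
        totals[grp] += amount
        result.append((grp, amount, totals[grp]))
    return result


# Hypothetical sample data: (group, amount)
sales = [("A", 10), ("A", 20), ("B", 5), ("A", 30), ("B", 15)]
for row in running_sum_by_group(sales):
    print(row)
```

Each group keeps its own total, so the rows for "A" and "B" accumulate independently even though they are interleaved in the input.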

Below are some of the coding questions I was asked; I had to solve them in either SQL or Python.

[Images of the coding questions, captioned "Solve this" and "Input" / "Output", are not reproduced here.]

Preparation

During my initial interviews I was really bad. I could hardly answer the questions properly, and I was a bit shaky on a few concepts. As I appeared for more interviews, I kept learning from my mistakes and prepared for the areas where I had fallen short. People often think they are not ready for interviews and so keep putting them off, but I would suggest that the best way to prepare for interviews is to take them.

In my free time I started solving SQL and Python questions on coding platforms. 1–2 hours a week was enough, as I was already proficient in coding and just had to revisit the concepts.

Stay calm while appearing for an interview. I often lost my calm when a tough question was asked, and I messed up in those moments. After appearing for a couple of interviews, and even clearing 3–4 rounds at a few companies, I figured out that the idea is not to watch you write code but to check your approach to solving a problem. So I started sharing my screen, and instead of focusing on writing a query, which gives many people panic attacks, I focused on laying out a step-by-step approach to the problem at hand and then writing pseudocode for it. This eventually worked, and many times the interviewer was happy with my approach and skipped the coding part 😅

I kept giving interviews, and with each one I kept improving. For 6–8 interviews I either received a rejection or didn’t hear back from the company. Towards the end I had 3 offers 😊

I hope this blog will help you prepare better. You can connect with me on LinkedIn.

I’m ex-KPMG India and ex-Cognizant. The questions covered here come from my interviews at Deloitte India, Capgemini Invent, Prismforce and Tredence Inc.
