After I posted my first blog, “How to Start a Career in Analytics for Free”, I was flooded with questions on how to prepare for analytics interviews. Having given a good number of interviews in the last 6 months, I could see a clear pattern in how they are conducted.
Although analytics as a field has been around for a long time, people have only recently started adopting it as a career. Hence, there is a lot of scattered information on the net about interview preparation, but no material that is complete in itself.
It is also impossible to prepare for every potential question, as analytics is a vast field. However, the questions can be bucketed into categories, and one can concentrate on specific categories depending on the work profile. Each company tests these categories through different interview rounds, which are covered later in this post.
Before jumping into categories of interview questions and different interview rounds, it is necessary to understand the flow of data analysis and associated designations.
FLOW OF DATA ANALYSIS
- Define your Questions: One must begin with the right question(s). Questions should be measurable, clear and concise. Design your questions to either qualify or disqualify potential solutions to your specific problem or opportunity.
- Decide on the Objectives: It is impossible to do sound analysis without knowing what you wish to achieve. Too often an analysis is started without a clear idea of where it is going. The result is usually a lot of wasted time and an inadequate analysis. Avoid this by deciding on the objectives of the analysis before starting it.
- Data Collection: The way you collect data should relate to how you’re planning to analyze and use it. Essentially, collecting data means putting a design for collecting information into operation. You’ve decided how you’re going to get information – whether by direct observation, interviews, surveys, experiments and testing, or other methods – and now you want to implement your plan. Recording and organizing data may take different forms, depending on the kind of information you’re collecting.
- Data Cleaning: Improving data quality is an essential step in data analysis. It includes converting data into a structured format, handling missing data, feature engineering and weeding out irrelevant information. Analyzing poor-quality data will result in erroneous conclusions unless steps are taken to validate and clean it.
- Analysis: Analysis involves examining information in ways that reveal the relationships, patterns and trends in it. That may mean subjecting it to statistical operations that can tell you not only what kinds of relationships seem to exist among variables, but also to what level you can trust the answers you’re getting. The point, in terms of your evaluation, is to get an accurate assessment in order to better understand your work and its effects on those you’re concerned with, or in order to better understand the overall situation.
- Data Modeling: This involves building models that correlate the data with the business outcomes and then make suitable recommendations. This is where the unique expertise of an analyst becomes critical to business success—correlating the data and building models that predict business outcomes. Such an analyst must have a strong background in statistics and machine learning to build scientifically accurate models and avoid the traps of meaningless correlations and models that are so reliant on existing data that their future predictions are useless. But statistical background is not enough; analysts need to understand the business well enough that they will be able to recognize whether the results of the mathematical models are meaningful and relevant.
- Optimize and Repeat: The flow of data analysis is a continuous and repeatable process. Each stage should be monitored and optimized accordingly for better results.
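The Data Cleaning step described above can be sketched in a few lines of Python. This is only a minimal illustration, not a recommended pipeline: the records, the field names (`age`, `city`) and the helper functions are all hypothetical, and it assumes that filling missing ages with the median is acceptable for the analysis at hand.

```python
from statistics import median

# Hypothetical raw records, as they might arrive from a survey export.
raw_records = [
    {"age": "34", "city": " Delhi "},
    {"age": "", "city": "mumbai"},
    {"age": "29", "city": "Delhi"},
    {"age": "n/a", "city": "MUMBAI"},
    {"age": "41", "city": "pune"},
]


def parse_age(value):
    """Return the age as an int, or None when the value is missing or invalid."""
    value = value.strip()
    return int(value) if value.isdigit() else None


def clean(records):
    """Convert raw strings into a structured, consistent format."""
    ages = [parse_age(r["age"]) for r in records]
    known_ages = [a for a in ages if a is not None]
    # Assumption: median imputation is good enough for this illustration.
    fill_value = median(known_ages)
    cleaned = []
    for record, age in zip(records, ages):
        cleaned.append({
            "age": age if age is not None else fill_value,
            "city": record["city"].strip().title(),  # normalize spacing and case
            "age_was_missing": age is None,          # keep track of imputed rows
        })
    return cleaned


cleaned_records = clean(raw_records)
for row in cleaned_records:
    print(row)
```

Keeping a flag such as `age_was_missing` is one way to make sure that imputed values can still be told apart from observed ones during the analysis.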
ASSOCIATED DESIGNATIONS
- Data Architect:-
Large enterprises generate huge amounts of data from various sources. The Data Architect is someone who can understand all the sources of data and work out a plan for integrating, centralizing and maintaining all the data. He must be able to understand how the data relates to current operations and the effects that any future process changes will have on the use of data in the organization. He needs to have an end-to-end vision, and to see how a logical design will translate into one or more physical databases and how the data will flow through the successive stages involved.
This may include designing relational databases, developing strategies for data acquisition, archive recovery and database implementation, and cleaning and maintaining the database by removing old data.
- Data Engineer:-
Data engineers are hard-core engineers who know the internals of database software. A data engineer compiles and installs database systems, writes complex queries, scales them across multiple machines, ensures backups and puts disaster recovery systems in place. He usually has deep expertise in one or more database systems (SQL / NoSQL).
- Data Analyst/Business Analyst:-
The primary task of a data analyst/business analyst is the compilation and analysis of numerical information. They usually have a computer science and business degree. They draw insights that make sense for the organization out of all the data it holds (in database software or just Excel sheets) and compile them into clear reports so that non-technical colleagues can understand them and decide their course of action.
An analyst usually works to get analytical insights out of data; this job profile does not (usually) include working with statistics and has nothing to do with “Big Data” in particular. A decent mid-sized organization can have many analysts. For example, a sales analyst may look at total sales in the past quarter and figure out a proper sales strategy (where to sell and whom to sell to in order to maximize profits). He will then communicate the report to the leadership.
- Data Scientist:-
“Data Scientist” is a very recent phenomenon. The overall mission of a scientist is the same as that of an analyst, but once the volume and velocity of data cross a certain level, it requires really sophisticated skills to get those insights out.
A “Data Scientist” usually has many overlapping skills – Database Engineering, handling Big Data systems, knowledge of statistical programming languages, business knowledge and knowledge of statistics / data mining.
Whereas a traditional data analyst may look only at data from a single source, a data scientist will most likely explore and examine data from multiple disparate sources. The data scientist will sift through all incoming data with the goal of discovering a previously hidden insight, which in turn can solve a business problem. Good data scientists will not just address business problems; they will pick the right problems that have the most value to the organization.
- Business Intelligence Engineer:
These are the people who consume historical data from transactional databases, denormalize it, write huge SQL/Hive queries to flatten, reshape and aggregate the data, do some basic statistical analysis and then build fancy data visualizations using tools or programming to communicate the data effectively. The dashboards they build are often used by senior management and VPs. These people are not the strongest technical people, but they are jacks of all trades and can fill anyone’s shoes whenever required. This is also a very cross-functional role, as you work with data engineers to get the data, with data scientists to get statistical analysis done and with business analysts/managers to present the insights.
- The above mentioned designations are loosely defined in the analytics industry. Hence it is best to look at work profiles rather than designations as they can often be misleading.
- Now that we have covered the flow of data analysis and associated designations, we can move towards the categories of interview questions and different interview rounds.
CATEGORIES OF INTERVIEW QUESTIONS
- Basic Overview of Analytics Field: One should have a broad picture of what the analytics industry is all about. A knowledgeable candidate will always have the upper hand.
- Basic Domain Knowledge: Analytics can help solve problems for various industries like e-commerce, retail, banking, telecom, pharmaceutical, BFSI etc. If the role offered is industry specific, then it’s always a plus to have basic domain knowledge.
- Communication and Presentation Skills: This is one of the most important skills. No matter how strong your other skills are, a company will not compromise on effective communication and presentation skills (both verbal and written); a candidate lacking such skills will rarely make it through.
- Energy and Passion: The first question of a one-to-one interview is almost always “tell me about yourself and how you landed here?” Skills can be taught but passion cannot, and a passionate candidate leaves a wonderful first impression. Matching this passion with energy reflects the candidate’s confidence and becomes an instant hit with the interviewer.
- Logical Thinking: Logical thinking is the process in which one uses reasoning consistently to come to a conclusion. Problems or situations that involve logical thinking call for structure, relationships between facts and chains of reasoning that “make sense.”
Companies evaluate logical thinking on the “-” points with the help of the “->” points:-
- Structural Approach
- Problem Solving Skills
- Attention to Detail
- Ability to Handle Pressure
-> Aptitude Test
-> Guess Estimates
- Basic Statistics: Statistics is the study of the collection, analysis, interpretation, presentation, and organization of data.
Generally, our knowledge of statistics is limited to what we learnt in high school. Hence, it is important to brush up on the basics and learn more advanced concepts, which can be done using the many free resources available online.
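As a quick refresher on the kind of descriptive statistics that tend to come up, here is a minimal sketch using Python’s built-in `statistics` module. The `orders` sample is made up purely for illustration.

```python
import statistics

# Hypothetical sample: number of orders received on each day of one week.
orders = [12, 15, 11, 15, 30, 14, 15]

mean = statistics.mean(orders)      # arithmetic average
median = statistics.median(orders)  # middle value of the sorted sample
mode = statistics.mode(orders)      # most frequent value
stdev = statistics.stdev(orders)    # sample standard deviation (n - 1 in the denominator)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("sample standard deviation:", round(stdev, 2))

# The single unusually busy day (30 orders) pulls the mean above the median,
# which is why the median is often preferred for skewed data.
print("mean > median:", mean > median)
```

Being able to explain why the mean and the median differ on a sample like this one is often more useful in an interview than reciting the formulas.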
- Basic Mathematics: Mathematics is the science of numbers and their operations, interrelations, combinations, generalizations and abstractions, and of space configurations and their structure, measurement, transformations and generalizations.
If your mathematical skills are rusty, it is advisable to revisit the high school mathematics curriculum, which in my opinion is quite comprehensive.
- Machine Learning: In general, machine learning is about learning to do better in future based on what was experienced in the past. The emphasis of machine learning is on automatic methods. In other words, the goal is to devise learning algorithms that do the learning automatically without human intervention or assistance.
For example, Facebook’s News Feed changes according to the user’s personal interactions with other users. If a user frequently tags a friend in photos, writes on his wall or “likes” his links, the News Feed will show more of that friend’s activity due to presumed closeness.
Other examples of machine learning problems include face detection, spam filtering, medical diagnosis, customer segmentation, fraud detection and weather prediction.
Machine Learning is considered an advanced skill and is definitely a must in highly technical roles.
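To make “learning from past experience” concrete, here is a minimal sketch of one of the simplest learning methods, a least-squares line fit, written in plain Python. The data (advertising spend versus units sold) is hypothetical and deliberately lies on a perfect straight line so the result is easy to check by hand; real data would be noisy, and in practice one would normally use a library rather than hand-written code.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical "past experience": advertising spend vs. units sold.
spend = [1, 2, 3, 4, 5]
sold = [12, 14, 16, 18, 20]

# "Learning": the parameters are estimated from the data, not written by hand.
slope, intercept = fit_line(spend, sold)


def predict(x):
    """Use the learned parameters to predict an unseen case."""
    return slope * x + intercept


print("learned slope:", slope)
print("learned intercept:", intercept)
print("predicted units sold for a spend of 6:", predict(6))
```

The point of the sketch is that no human wrote the rule “units sold = 2 × spend + 10”; the program recovered it from examples and can then apply it to a case it has not seen.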
- Databases & Big Data Concepts: A database is a structured set of data held in a computer, especially one that is accessible in various ways. Big data, by contrast, is a buzzword used to describe a massive volume of both structured and unstructured data that is so large that it is difficult to process using traditional database and software techniques.
It is important to understand how data was traditionally stored and retrieved from databases and update oneself on how to handle such data, now that it has increased in volume exponentially.
- Basic Programming: In the most basic sense, programming means creating a set of instructions for completing some specific task.
All programming languages are built on (more or less) the same basis. You can be asked to implement simple operations or algorithms in any programming language (of your choice) to check your basic programming skills.
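As an illustration of the level such questions are usually pitched at, here is a sketch of one classic warm-up exercise: checking whether a piece of text is a palindrome. The choice of exercise is an assumption made for this example, not a question taken from any particular interview.

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards,
    ignoring case, spaces and punctuation."""
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


print(is_palindrome("A man, a plan, a canal: Panama"))
print(is_palindrome("analytics"))
```

Interviewers are typically less interested in the language used than in whether you state your assumptions (here, that case and punctuation are ignored) and handle edge cases such as an empty string.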
- Statistical Tools: There are tools that help you in statistical analysis.
This is a must-have skill, and the most popular statistical tools in the market today are R, SAS and Python.
- Data Visualization: Data visualization is a general term that describes any effort to help people understand the significance of data by placing it in a visual context. Patterns, trends and correlations that might go undetected in text-based data can be exposed and recognized more easily with data visualization.
In my opinion, this is not a necessary skill as it can be easily acquired but it definitely gives you an edge over other candidates.
- Related Projects/Competitions: The best way to check skills (both technical and non-technical) of a candidate is by asking him to explain his related projects/competitions.
Hence it is very important to do a lot of projects and also participate in competitions (Kaggle, Data-Hackathons etc.) which display your ability to practically apply your skills. A candidate will never be selected just on the basis of his theoretical knowledge.
- Resume: I cannot stress enough that this is probably the most vital and most often neglected (from the candidate’s side) part of the interview process. This is a powerful document that summarizes your entire professional life in at most a couple of pages.
I have been in interviews where all the technical questions were asked only from my resume, and if we think about it, this should not sound strange. You are called for an interview on the basis of your resume (as the company finds your mentioned work profile, skills and projects appealing). Hence, interviewers often limit their questions to the points mentioned in your resume.
- Preparation for freshers vs experienced candidates:-
- Freshers are generally tested more on their non-technical skills and taught technical skills on the job, though knowledge of technical skills does give them an edge. (Some companies also test freshers on knowledge of their respective undergraduate discipline.)
- Experienced candidates (having no technical skills) wanting to shift their career into analytics should also concentrate on non-technical skills and compete against freshers for entry level positions.
- Experienced candidates (having technical skills but no related work experience) should concentrate on both non-technical and technical skills when applying for desired work profiles.
- Experienced candidates (having technical skills) with related work experience should concentrate more on technical skills when applying for desired work profiles.
- It is important to understand the work profile when applying for a certain designation. Preparation should then be work profile specific and not designation specific.
INTERVIEW ROUNDS
- Aptitude: This round checks for quantitative, verbal and problem-solving skills. It is not used to identify suitable candidates but rather to remove the unsuitable ones.
- Group Discussion: In short, a group discussion tests whether you know the topic well, are able to present your point of view in a logical manner, are interested in understanding what others feel about the same subject and are able to conduct yourself with grace in a group situation.
This (like the aptitude round) also acts as an elimination round.
- Personality Fit: Despite having apt skills, selection of a candidate might boil down to whether he will fit the company’s corporate culture or not.
It is best to research the company, its core values and its processes beforehand.
- Logical: This round involves testing on puzzles, case-studies and guess estimates as discussed in “Categories of Interview Questions (Non-Technical)”
- Technical: This round involves testing on technical skills as discussed in “Categories of Interview Questions (Technical)”
- Coding: A coding round is used to assess one’s basic programming skills.
- Resume Based: As discussed earlier, the resume is a vital part of the interview process. A candidate should be thorough with each and every line written in their resume.
- Different companies have different “categories of interview questions” and different “interview rounds”. This blog, to my knowledge, covers all of them.
- People wishing to start a career in analytics can read my first blog, “How to Start a Career in Analytics for Free”.
- Both blogs are aimed towards freshers or professionals in early stages of their careers.
- Both blogs are applicable for all types of organizations (small, mid-sized or large corporations)
As some of you would already know, I started from scratch a little more than a year ago. There are a lot of free (high quality) resources available online to kick start your career in analytics.
Currently, there is a huge demand for people with the right skills in the analytics industry and companies, for a change, are running after such candidates. You are not only in a position to choose between companies but also dictate your own salary. If one concentrates on acquiring the right skills then I am sure everything will fall into place.
As for me, after weighing my options, I have decided to join “cardekho.com/gaadi.com” next month as a Junior Data Scientist.
If you like this post, do share your questions, comments or suggestions (you will only need to enter your email address and name).
You also have an option of sharing this blog via Twitter, Facebook or Google+