Zuoyebang Big Data R&D: Notes from a Failed First-Round Interview

I. Background

Around March 10 I submitted my resume to Zuoyebang. There was no response, so I assumed the resume had been screened out. To my surprise, HR called on March 24 to schedule an interview for the afternoon of March 25. Since I had applied for a big data position, I figured Hadoop, Hive, and Spark were certain to come up, so I spent the whole day reviewing my big data courses. Here is how today's interview went:

II. The interview process

The interviewer was very nice. He did not ask me to introduce myself, and just told me the interview would have three parts: basic data structures, the thinking behind some algorithm and data-structure problems, and the projects I had worked on.

3.1 Basics

1. Hash table

q: Do you know HashMap? Do you understand hash tables?

a: I said I knew a little (actually, I only knew what HashMap and a hash table are).

q: Based on what I said I understood, the interviewer asked how I would design an industrial-grade hash table.

a: I was completely stumped, and just mentioned a key, a hashCode, and an array.

q: An array? Would an array be suitable for storing, say, 10 million entries?

a: I said to use a linked list, because there might not be that much contiguous memory available.

q: A linked list, and then what?

a: I had nothing more to add.
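For reference, the pieces that came up in this exchange (key, hash code, array, linked list) fit together roughly as in the sketch below. It is a minimal illustration, not a production design: the class name `ChainedHashMap`, the initial capacity of 8, and the 0.75 load factor are all assumptions chosen for the example. Collisions are handled by chaining, and the bucket array doubles once the load factor is exceeded.

```python
class ChainedHashMap:
    """Hash table with separate chaining and load-factor-based resizing."""

    def __init__(self, capacity=8, max_load=0.75):
        # Each bucket is a chain (here a Python list) of (key, value) pairs.
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0
        self._max_load = max_load  # assumed threshold for this sketch

    @staticmethod
    def _index(key, capacity):
        # Map the key's hash code to a bucket index.
        return hash(key) % capacity

    def put(self, key, value):
        bucket = self._buckets[self._index(key, len(self._buckets))]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self._size += 1
        if self._size > self._max_load * len(self._buckets):
            self._resize(2 * len(self._buckets))

    def get(self, key, default=None):
        bucket = self._buckets[self._index(key, len(self._buckets))]
        for k, v in bucket:
            if k == key:
                return v
        return default

    def _resize(self, new_capacity):
        # Rehash every entry into a larger bucket array.
        new_buckets = [[] for _ in range(new_capacity)]
        for bucket in self._buckets:
            for k, v in bucket:
                new_buckets[self._index(k, new_capacity)].append((k, v))
        self._buckets = new_buckets

    def __len__(self):
        return self._size
```

A fuller answer would probably also cover the choice of hash function, incremental rehashing, and thread safety, none of which this sketch attempts.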

2. Quicksort

q: Do you know quicksort?

a: I should know it.

q: What is the time complexity of quicksort, and how do you derive it?

a: n * log n. Derivation: ...

q: What is the worst-case time complexity, what causes it, and why?

a: I had forgotten, so I just said n^2; I feel everything I said after that was nonsense.
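For reference, both complexities follow from the standard recurrences, assuming that partitioning n elements costs cn for some constant c (the symbols T and c are introduced here only for illustration):

```latex
\begin{aligned}
\text{balanced split:}\quad T(n) &= 2\,T(n/2) + cn
  \;\Rightarrow\; T(n) = O(n \log n) \\
\text{worst split:}\quad T(n) &= T(n-1) + cn
  \;\Rightarrow\; T(n) = T(0) + c\sum_{k=1}^{n} k = O(n^2)
\end{aligned}
```

In the balanced case there are about log n levels of recursion, each doing cn work in total. The worst case arises when the pivot is always the smallest or largest element, for example when the first element is used as the pivot on already sorted input, so each partition removes only one element.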

q: Let me ask it another way: what is the core idea of quicksort?

a: I mentioned divide and conquer.
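As a reminder of what that idea looks like in code, here is a minimal in-place quicksort sketch. The random pivot and the Lomuto partition scheme are choices made for this example, not something discussed in the interview.

```python
import random


def quicksort(items):
    """Sort a list in place with quicksort and return it."""

    def partition(lo, hi):
        # Pick a random pivot to make the O(n^2) case unlikely on sorted input.
        r = random.randint(lo, hi)
        items[r], items[hi] = items[hi], items[r]
        pivot = items[hi]
        i = lo  # items[lo:i] holds the elements smaller than the pivot
        for j in range(lo, hi):
            if items[j] < pivot:
                items[i], items[j] = items[j], items[i]
                i += 1
        items[i], items[hi] = items[hi], items[i]  # pivot lands in its final slot
        return i

    def sort(lo, hi):
        if lo >= hi:
            return
        p = partition(lo, hi)
        sort(lo, p - 1)   # divide: left part
        sort(p + 1, hi)   # divide: right part

    sort(0, len(items) - 1)
    return items
```

The partition step does all the work: once the pivot is in place, the two sides are independent subproblems, so no explicit merge step is needed.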

q: Building on the idea I had mentioned, he posed a related problem: sort 10 million records.

a: I just said to sort the data in batches and then merge.
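The "sort in batches, then merge" answer can be sketched as follows. This is a simplified in-memory illustration: the function name `sort_in_batches` and its `batch_size` parameter are made up for the example, and a real external sort would write each sorted run to disk instead of keeping it in a list.

```python
import heapq


def sort_in_batches(values, batch_size):
    """Sort an iterable in fixed-size batches, then k-way merge the sorted runs."""
    runs = []
    batch = []
    for v in values:
        batch.append(v)
        if len(batch) == batch_size:
            runs.append(sorted(batch))  # one sorted run per full batch
            batch = []
    if batch:
        runs.append(sorted(batch))  # the last, possibly smaller, batch
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    return list(heapq.merge(*runs))
```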

q: What if there are duplicates across the batches?

a: I said I didn't know, and asked him for a hint.

q: Partition the data by hash value. Hashing spreads the data evenly across partitions, and in the end there is no need to deduplicate again (I still don't fully understand this).
Seeing that I couldn't answer much of anything, the interviewer gave up on this line of questioning.
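One possible reading of the interviewer's hint is sketched below; it is my interpretation, not something he spelled out. Equal values always have the same hash, so they always land in the same partition. Each partition can therefore be deduplicated on its own, and no duplicate can survive across partitions. The function name and the default of 4 partitions are assumptions for the example.

```python
import heapq


def sorted_unique_by_hash_partition(values, num_partitions=4):
    """Deduplicate via hash partitioning, then merge into one sorted list."""
    partitions = [[] for _ in range(num_partitions)]
    for v in values:
        # Equal values share a hash, hence a partition.
        partitions[hash(v) % num_partitions].append(v)
    # Deduplicate and sort each partition independently.
    runs = [sorted(set(p)) for p in partitions]
    # The runs are disjoint, so a plain merge yields a sorted, duplicate-free result.
    return list(heapq.merge(*runs))
```

Note that hash partitioning alone does not order the data across partitions, which is why the final merge is still needed if a globally sorted result is required.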

3.2 Code

q: I see you know Spark. Do you know about Spark's RDD?

a: (The question threw me; why was he asking about RDDs here?) Uh... um... um... I stammered for a long while without getting a sentence out, and finally asked which aspect he meant.

q: What operators does an RDD have?

a: (After my brain had been blank for so long, here at last was one I knew, so I got excited and started rambling.) RDD operators are divided into transformation operators and action operators. Transformations include map, mapPartitions ... (and then I couldn't think of any more, not even the simple flatMap); actions include reduce, reduceByKey (and then I couldn't think of any more; strictly speaking, reduceByKey is a transformation, not an action).

q: Then I won't have you write Spark code; let's write some basic code instead.

1. Compute the depth of a binary tree recursively

2. The frog-jumping-stairs problem with 100 steps (he asked me to use recursion, but my recursive solution timed out. I said 100 was too large and asked whether I could use a smaller number, such as 30 or 40. The interviewer agreed, and it ran through with 30.)

3. Breadth-first output of a binary tree

4. Find a contained substring

For example: axdcfadcfcfa

cfa

I was stumped at first and could not write it; then I thought of using two pointers and solved it directly.
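For review, here are minimal sketches of the four problems. Several details are assumptions, since the exact statements are not recorded above: the `TreeNode` class is hypothetical, the frog is assumed to jump 1 or 2 steps at a time, and problem 4 is assumed to ask for every position where the pattern occurs in the text. Memoizing the recursion is what makes 100 steps feasible.

```python
from collections import deque
from functools import lru_cache


class TreeNode:
    """Hypothetical binary tree node used by the examples below."""

    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def depth(node):
    """Problem 1: depth of a binary tree, recursively (empty tree has depth 0)."""
    if node is None:
        return 0
    return 1 + max(depth(node.left), depth(node.right))


@lru_cache(maxsize=None)
def count_ways(n):
    """Problem 2: ways to climb n steps, assuming jumps of 1 or 2 steps."""
    if n <= 1:
        return 1
    # Memoization turns the exponential recursion into O(n) calls.
    return count_ways(n - 1) + count_ways(n - 2)


def level_order(root):
    """Problem 3: node values in breadth-first order."""
    out = []
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()
        out.append(node.val)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return out


def find_all(text, pattern):
    """Problem 4 (assumed statement): start index of each occurrence of pattern."""
    hits = []
    if not pattern:
        return hits
    for start in range(len(text) - len(pattern) + 1):
        i, j = start, 0  # i walks the text, j walks the pattern
        while j < len(pattern) and text[i] == pattern[j]:
            i += 1
            j += 1
        if j == len(pattern):
            hits.append(start)
    return hits
```

With these definitions, `count_ways(30)` returns 1346269, and `find_all("axdcfadcfcfa", "cfa")` returns `[3, 9]`.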

3.3 Project

He asked me to introduce my Spark project, and I described it briefly. Because this is a big data position, the interviewer said that big companies simply would not be impressed by a project like mine, and that I should still focus on fundamentals.

III. My questions for the interviewer

q: For students like us who are learning big data, which matters more: big data frameworks or fundamentals?

a: It depends on what kind of company you want to join. Large companies value fundamentals most, such as data structures, networking, operating systems, and databases, and they set a high bar for these. Knowing big data frameworks is a bonus.

q: May I ask how I did today? This is my first interview (laughs).

a: Your coding is okay, but your theory is not good enough. Keep at it!

IV. Summary

1. The whole interview experience was very good; interviews are actually not as scary as I had imagined beforehand;

2. Do not ramble during the interview; answer what is asked;

3. Fundamentals are very important! Be sure to review the basics thoroughly; big data frameworks matter less.

Origin blog.csdn.net/stable_zl/article/details/105101186