"How I swept the top-tier tech companies": Baidu back-end (with answers)


The author guarantees that everything in this series is solid, practical content and a true record, definitely not an interview made up by some marketing account.

1. Company profile

Baidu is the world's largest Chinese search engine, China's largest comprehensive Internet service company with information and knowledge at its core, and a world-leading artificial intelligence platform company. It was founded in Zhongguancun on January 1, 2000. The company's founder, Robin Li, holds a patent for "hyperlink analysis" technology, making China one of only four countries in the world, alongside the United States, Russia, and South Korea, with core search engine technology.

Baidu is the world's largest Chinese search engine. Baidu responds to billions of search requests from more than 100 countries and regions every day, and is the main entrance for netizens to obtain Chinese information. With the mission of "using technology to make the complex world simpler", Baidu continues to insist on technological innovation, and is committed to "becoming the world's top high-tech company that understands users best and can help people grow."

Baidu is China's largest comprehensive Internet service company with information and knowledge at its core. Driven by AI, Baidu's mobile ecosystem is China's largest mobile ecosystem centered on information and knowledge, with Baijiahao accounts, Smart Mini Programs, and managed pages as its main pillars. In 2019, the number of Baidu users exceeded 1 billion. Baidu App has 222 million daily active users, and its information feed ranks first in China. The number of Baijiahao creators reached 3 million. Baidu Smart Mini Program is the only fully open-source Mini Program platform in China, with more than 354 million monthly active users. The six major knowledge products, including Baidu Zhidao (Baidu Knows), Baidu Baike, and Baidu Wenku (Baidu Library), have produced over 1 billion pieces of high-quality content, building the largest knowledge content system in China.

Baidu is a world-leading artificial intelligence platform company. Baidu Brain is China's only "software-hardware integrated AI production platform". It brings together Baidu's AI work and has opened up more than 250 AI capabilities. PaddlePaddle (Flying Paddle) is China's first fully open-source, fully functional industrial-grade deep learning platform, an "operating system for the intelligent age" developed independently in China. Baidu Smart Cloud is the main carrier and exporter of Baidu's AI to-B business and a leader in industrial intelligence. Xiaodu Assistant is the largest conversational AI operating system in China, with the largest and most prosperous conversational AI ecosystem in the Chinese market. In March 2020, Xiaodu Assistant handled more than 6.5 billion voice interactions. As the world's largest open platform for autonomous driving, Apollo represents China's strongest autonomous driving capability and was listed as one of the world's top four leaders in autonomous driving by the research company Navigant Research. Three open platforms have now been formed: autonomous driving, vehicle-road collaboration, and intelligent vehicle connectivity. In autonomous driving, Baidu holds more than ten firsts in China, and its technical strength leads the industry. In intelligent transportation, Baidu's "ACE Traffic Engine" is the world's first full-stack intelligent transportation solution integrating vehicle, road, and travel.

 

Enough small talk. Every company in BATTMD is a giant, so let's go straight to the salaries.

Campus recruitment

SP stands for special offer.

An SSP offer is one level above SP.

Normally, 18*15 in the salary column means 18k yuan (1.8w) a month for a total of 15 months a year, i.e. 270k a year.

The 15 is generally 13 months of salary plus 2 months of year-end bonus.

With that quick primer out of the way, let's look at the data.

These figures are for master's graduates. Subtract 1-2k for a bachelor's degree, and test positions pay 1-2k less than R&D positions.

Experienced hires (social recruitment)

In terms of salary, Baidu is not very competitive within BATTMD, but pay has risen a lot this year, which shows real sincerity. It suits people who like so-called "technology" and those who have no better option.

2. Evaluation of the company

Yu Jun, the product guru, left on the eve of the mobile Internet era, and Baidu fell behind at 10-20. That is an undeniable fact. But over the past two years Baidu has been changing and betting on driverless cars. The stock did perform well last year, so the bet can hold up for now.

Market value: US$114.5 billion (more than double last year's; I used to think the company was finished)

3. The interview process

1) How do you determine whether a linked list has a cycle (ring)?

As expected, he asked me to write the code on the spot and explain my thinking.

To represent the cycle in a given linked list, we use an integer pos for the position (index starting from 0) that the tail of the list connects back to. If pos is -1, the linked list has no cycle.

Example 1:

Input: head = [3,2,0,-4], pos = 1
Output: true
Explanation: There is a ring in the linked list, and its tail is connected to the second node.

 

Example 2:

Input: head = [1], pos = -1
Output: false
Explanation: There is no ring in the linked list.

Advanced:

Can you solve this problem with O(1) (i.e., constant) memory?

 

Idea: First of all, note that a cycle anywhere other than the end of the list (for example, a cycle in the middle with more nodes after it) is impossible for a singly linked list.

Because each node has only one next pointer, the linked list can only look like Example 1, with the cycle at the end.

The slow pointer moves one step at a time and the fast pointer moves two. If they meet, there is a cycle; if the fast pointer reaches the end of the list, there is none. It is like running laps on a track: the faster runner will eventually catch up with the slower one.

/**
 * Definition for singly-linked list.
 * class ListNode {
 *     int val;
 *     ListNode next;
 *     ListNode(int x) {
 *         val = x;
 *         next = null;
 *     }
 * }
 */
public class Solution {
    public boolean hasCycle(ListNode head) {
        if (head == null || head.next == null) {
            return false;
        }
        ListNode slow = head;
        ListNode fast = head.next;
        while (slow != fast) {
            if (fast == null || fast.next == null) {
                return false;
            }
            slow = slow.next;
            fast = fast.next.next;
        }
        return true;
    }
}

For anyone who knows algorithms this may be a cliché problem, and some may not even bother to read it. So here is the follow-up question: if the fast pointer takes three steps at a time, is the algorithm still correct? What about four steps at a time? Five?

If enough people are reading, I will publish the answer in the next issue, haha. Don't assume that memorizing standard answers is enough to pass.
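To experiment with the follow-up question yourself, here is a minimal sketch. It is not the author's answer. The class name KStepCycleCheck, the Node type, and the build helper are hypothetical, and the sketch assumes both pointers start at head (unlike the solution above, where fast starts at head.next).

```java
class KStepCycleCheck {
    // Hypothetical node type, equivalent to ListNode in the solution above.
    static class Node {
        int val;
        Node next;
        Node(int x) { val = x; }
    }

    // Builds a list from vals; if pos >= 0, the tail is linked back to index pos.
    static Node build(int[] vals, int pos) {
        if (vals.length == 0) return null;
        Node[] nodes = new Node[vals.length];
        for (int i = 0; i < vals.length; i++) {
            nodes[i] = new Node(vals[i]);
            if (i > 0) nodes[i - 1].next = nodes[i];
        }
        if (pos >= 0) nodes[vals.length - 1].next = nodes[pos];
        return nodes[0];
    }

    // Each round: fast moves k steps (k >= 2), then slow moves 1 step.
    // Assumption: both pointers start at head.
    static boolean hasCycle(Node head, int k) {
        Node slow = head;
        Node fast = head;
        while (true) {
            for (int i = 0; i < k; i++) {
                if (fast == null) return false; // ran off the end: no cycle
                fast = fast.next;
            }
            if (fast == null) return false;
            slow = slow.next; // safe: slow is always behind fast
            if (slow == fast) return true;
        }
    }

    public static void main(String[] args) {
        for (int k = 2; k <= 5; k++) {
            boolean cyclic = hasCycle(build(new int[]{3, 2, 0, -4}, 1), k);
            boolean acyclic = hasCycle(build(new int[]{1, 2, 3, 4, 5}, -1), k);
            System.out.println("k=" + k + " cyclic=" + cyclic + " acyclic=" + acyclic);
        }
    }
}
```

Under the same-start assumption, after t rounds the slow pointer has taken t steps and the fast pointer k*t steps, so they coincide as soon as t is a multiple of the cycle length and slow is inside the cycle. With a different starting offset that argument no longer holds automatically, which is worth checking by hand.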

2) Introduce the heap data structure

Max-heap (big-root heap) requirements

① The key of the root node is greater than or equal to the keys in both its left subtree and its right subtree.

② It is a complete binary tree.

Note that this definition is recursive: it applies to every subtree as well.

For both max-heaps and min-heaps you need to know the recursive definition, the implementation, the space complexity, the time complexity of each operation, and both the pointer-based binary tree version and the array-based version.

Some people ask what these algorithms are good for. In fact, Java's priority queue is a heap, and heap sort, one of the eight classic sorting algorithms, is a heap built on an array. The interviewer asked me to implement one by hand. The following is the implementation.

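/*
================================================
Function: heap sort
Input: array name (i.e. the address of the first element), number of elements in the array
Note: draw it out
================================================
*/
/*
Function: sift down (used to build the heap)
Input: array name (i.e. the address of the first element), number of elements taking part in the heap, index of the element to start from
*/
void sift(int *x, int n, int s)
{
    int t, k, j;
    t = *(x+s); /* save the starting element */
    k = s;   /* index of the starting element */
    j = 2*k + 1; /* index of the left child */
    while (j<n)
    {
        if (j<n-1 && *(x+j) < *(x+j+1)) /* if a right child exists and is larger than the left child, switch j to the right child */
        {
            j++;
        }
        if (t<*(x+j)) /* adjust */
        {
            *(x+k) = *(x+j);
            k = j; /* after the adjustment, the starting position moves down too */
            j = 2*k + 1;
        }
        else /* nothing left to adjust: it is already a heap, so leave the loop */
        {
            break;
        }
    }
    *(x+k) = t; /* put the starting element in its correct position */
}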
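/*
Function: heap sort
Input: array name (i.e. the address of the first element), number of elements in the array
Note:
            *
         *     *
       *   -  *   *
      * * *
When building the heap, start adjusting from the first non-leaf node counting from the back, i.e. the position marked "-"
*/
void heap_sort(int *x, int n)
{
    int i, k, t;
    for (i=n/2-1; i>=0; i--)
    {
        sift(x,n,i); /* build the initial heap */
    }
    for (k=n-1; k>=1; k--)
    {
        t = *(x+0); /* move the top of the heap to the end */
        *(x+0) = *(x+k);
        *(x+k) = t;
        sift(x,k,0); /* rebuild the heap from the remaining elements */
    }
}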
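The array simulation mentioned earlier also covers the priority-queue operations, not only sorting. Below is a minimal Java sketch of an array-backed max-heap; the class name ArrayMaxHeap and its methods are hypothetical and are not taken from java.util.PriorityQueue. It uses the same index arithmetic as the C code: the children of index k sit at 2k+1 and 2k+2, and the parent of index i sits at (i-1)/2.

```java
import java.util.Arrays;

class ArrayMaxHeap {
    private int[] data = new int[16];
    private int size = 0;

    // O(log n): place the value at the end, then sift it up past smaller parents.
    void push(int v) {
        if (size == data.length) data = Arrays.copyOf(data, size * 2);
        int i = size++;
        while (i > 0 && data[(i - 1) / 2] < v) {
            data[i] = data[(i - 1) / 2]; // pull the smaller parent down
            i = (i - 1) / 2;
        }
        data[i] = v;
    }

    // O(1): the maximum is always at the root.
    int peek() {
        if (size == 0) throw new IllegalStateException("heap is empty");
        return data[0];
    }

    // O(log n): remove the root, then sift the last element down from the top.
    int pop() {
        int top = peek();
        int t = data[--size]; // last element, to be re-inserted from the root
        int k = 0;
        int j = 1;            // left child of k
        while (j < size) {
            if (j + 1 < size && data[j] < data[j + 1]) j++; // pick the larger child
            if (t >= data[j]) break;
            data[k] = data[j];
            k = j;
            j = 2 * k + 1;
        }
        if (size > 0) data[k] = t;
        return top;
    }

    int size() { return size; }

    public static void main(String[] args) {
        ArrayMaxHeap heap = new ArrayMaxHeap();
        for (int v : new int[]{3, 1, 4, 1, 5, 9, 2, 6}) heap.push(v);
        while (heap.size() > 0) System.out.print(heap.pop() + " ");
        System.out.println();
    }
}
```

The extra space is O(n) for the backing array, and a min-heap only needs the two comparisons flipped.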

3) Which sorting algorithms do you know? Walk me through them.

Answer: I know all of them and can write all of them. I had only gone through all the ideas and optimizations for bubble sort, and had just reached quicksort and BFPRT, when he stopped me. I will share introductions and implementations of every sort with you.
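As an illustration of what "optimizations for bubble sort" can mean, here is a minimal Java sketch. The post does not say which optimizations were discussed, so the choice is an assumption: stop as soon as a pass makes no swap, and shrink the scanned range to the position of the last swap. The class name BubbleSortSketch is hypothetical.

```java
class BubbleSortSketch {
    // Sorts a in ascending order, in place.
    static void bubbleSort(int[] a) {
        int end = a.length - 1; // elements after index end are already in final position
        while (end > 0) {
            int lastSwap = 0;   // stays 0 if the pass makes no swap, which ends the loop
            for (int i = 0; i < end; i++) {
                if (a[i] > a[i + 1]) {
                    int t = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = t;
                    lastSwap = i;
                }
            }
            // Everything after the last swap position is sorted, so skip it next time.
            end = lastSwap;
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 4, 2, 8};
        bubbleSort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

On already sorted input the first pass makes no swap and the loop ends after n-1 comparisons, while the worst case stays O(n^2).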


At the end, the interviewer said he was very satisfied with me and that he would send in the next interviewer right away.

Second interview:

The second interviewer said: your algorithms in the first round were excellent, so let's skip algorithms this time and talk about your project.

4) Seeing that my project used Redis, he asked which data structures Redis has.

I said: string, list, hash, set, and zset.

Follow-up: Java and other languages have basically all of these. You say you understand Redis, so are these data structures fast, and how are they implemented?

I gave an example:

  • String

Redis does not use the traditional C string representation. It builds its own abstract type, the simple dynamic string.

Whenever a modifiable string is needed, Redis uses its own SDS (simple dynamic string) implementation. For example, in the Redis database, key-value pairs that contain strings are implemented with SDS under the hood. SDS is also used as a buffer, for example the AOF buffer in the AOF module and the input buffer in the client state.

Let's look at the SDS implementation in detail:

struct sdshdr
{
    int len;    // number of bytes used in buf (the length of the stored string)
    int free;   // number of unused bytes in buf
    char buf[]; // byte array that holds the string
};

SDS follows the C convention of ending strings with '\0', and this one byte is not counted in len.

The advantage is that some C functions, such as printf, can be reused directly.

    Improvements of SDS over C strings

    Getting the length: a C string does not record its own length, so getting the length means traversing the whole string (O(n)); Redis just reads len (O(1)).

    Buffer safety: C strings are prone to buffer overflow, for example when a programmer concatenates without allocating enough space. Redis first checks whether the SDS has enough space and expands it automatically if not.

    Memory allocation: since C does not record the string length, a string of n characters is always backed by an array of length n+1, and the array must be reallocated every time the length changes. Memory allocation involves complex algorithms and may require system calls, so it is usually expensive.

    Redis memory allocation:

1. Space pre-allocation: if the length after modification is less than 1MB, the program allocates unused space equal to len; if it is 1MB or more, the program allocates 1MB of unused space. On each modification Redis checks the unused space first and, if it is enough, uses it without reallocating.

2. Lazy space release: when a string is shortened, the space is not released; free records it so it can be reused later.

    Binary safety

A C string cannot contain null characters except at its end; otherwise the program treats the first null character it reads as the end of the string. This limits C strings to text and means binary data cannot be stored.

Redis strings are binary safe because len records the length.
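The points above (O(1) length, pre-allocation, lazy release, binary safety) can be illustrated with a toy model. This is only a Java sketch of the policy as described in this post, not Redis's actual C code; the class SimpleDynamicString and its methods are hypothetical, and free is computed from the buffer size here instead of being stored as a field.

```java
import java.util.Arrays;

class SimpleDynamicString {
    static final int MAX_PREALLOC = 1024 * 1024; // the 1MB threshold described above

    private byte[] buf = new byte[1]; // always keeps one byte for the trailing '\0'
    private int len = 0;              // bytes in use, not counting the '\0'

    // O(1): read the recorded length instead of scanning for '\0'.
    int length() { return len; }

    // Unused bytes available before a reallocation is needed.
    int free() { return buf.length - len - 1; }

    void append(byte[] src) {
        if (free() < src.length) {
            int newLen = len + src.length;
            // Pre-allocation: below 1MB reserve as much free space as the new
            // length (capacity doubles); otherwise reserve 1MB of free space.
            int capacity = newLen < MAX_PREALLOC ? newLen * 2 : newLen + MAX_PREALLOC;
            buf = Arrays.copyOf(buf, capacity + 1);
        }
        System.arraycopy(src, 0, buf, len, src.length);
        len += src.length;
        buf[len] = 0; // keep the C-style terminator
    }

    // Lazy release: shorten the logical length but keep the buffer for reuse.
    void truncate(int newLen) {
        if (newLen < 0 || newLen > len) throw new IllegalArgumentException("bad length");
        len = newLen;
        buf[len] = 0;
    }

    // Binary safe: returns exactly len bytes, including any embedded zero bytes.
    byte[] toBytes() { return Arrays.copyOf(buf, len); }
}
```

For example, appending the 5 bytes of "hello" to an empty string allocates 5 bytes of free space, so a later 3-byte append needs no reallocation, even if those bytes contain a zero.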

That is the implementation of strings in Redis and its main points, and I told him roughly all of it. Then he said I didn't need to go on: this was turning into algorithms and data structures again, so let's talk about something else.

5) Let's talk about this: you have been singing Redis's praises. Do you know what problems using Redis can cause?

I was a little confused, so I told him about the differences between NoSQL and traditional databases. He said he was not asking for that kind of comparison: for example, had I heard of cache avalanche? Then I understood what he wanted and told him the following.

Cache penetration

A typical cache system looks up queries by key. If the corresponding value is not in the cache, the query goes to the back-end system (such as the DB).

Some malicious requests deliberately query keys that do not exist. When the request volume is large, this puts heavy pressure on the back-end system. This is called cache penetration.

 

How to avoid it?

1. Cache the result even when it is empty, so that the next query for the same key gets the null value straight from the cache layer. Give such entries a short expiration time, or clear the cached entry when data for that key is inserted.

2. Filter out keys that definitely do not exist. See Bloom filters for details.
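The first approach can be sketched as follows. This is a minimal, single-threaded Java illustration in which an in-memory map stands in for Redis and a function stands in for the database; the class NullCachingLookup, its fields, and the TTL values are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class NullCachingLookup {
    private static final class Entry {
        final String value;   // null means "known to be absent from the DB"
        final long expiresAt; // absolute time in milliseconds
        Entry(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>(); // stand-in for Redis
    private final Function<String, String> db;                // returns null if the key is absent
    private final long ttlMillis;     // TTL for real values
    private final long nullTtlMillis; // shorter TTL for cached nulls
    int dbCalls = 0;                  // counts how often the DB is actually hit

    NullCachingLookup(Function<String, String> db, long ttlMillis, long nullTtlMillis) {
        this.db = db;
        this.ttlMillis = ttlMillis;
        this.nullTtlMillis = nullTtlMillis;
    }

    String get(String key) {
        return get(key, System.currentTimeMillis());
    }

    // The clock is passed in so the behavior is easy to follow and to test.
    String get(String key, long now) {
        Entry e = cache.get(key);
        if (e != null && now < e.expiresAt) {
            return e.value; // may be a cached null: the DB is not touched
        }
        dbCalls++;
        String v = db.apply(key);
        long ttl = (v == null) ? nullTtlMillis : ttlMillis;
        cache.put(key, new Entry(v, now + ttl));
        return v;
    }

    // Call this after inserting data for key so a cached null does not hide it.
    void invalidate(String key) {
        cache.remove(key);
    }
}
```

Repeated lookups of a missing key within the short null TTL cost one database call instead of one per request. The trade-off is extra cache memory for keys that do not exist, which is why the null TTL is kept short.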

Cache breakdown

This concerns data that is not in the cache but does exist in the database.

The scenario: when a hot key expires and a large number of requests for that same key suddenly flood in, they all miss Redis and go to the DB, putting the database under so much pressure that it may go down.

Solution

1. Detect hot keys automatically and extend their expiration time, or set them to never expire, or to never expire logically.

2. Add a mutex. When a request misses Redis and goes to query the database, lock the cache-update operation. While one thread holds the lock, the other threads wait. When that thread finishes, the data in the cache has been rebuilt, so the other threads can get the value straight from the cache.
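The mutex approach can be sketched in a single JVM as follows. This is a minimal illustration, not production code: a ConcurrentHashMap stands in for Redis, a function stands in for the database, and the class MutexCacheLoader and its demo helper are hypothetical. One global lock is used for simplicity; a real system would more likely lock per key, and across several application servers the lock itself would have to be shared (an assumption not shown here).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

class MutexCacheLoader {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final ReentrantLock rebuildLock = new ReentrantLock();
    private final Function<String, String> db;
    final AtomicInteger dbCalls = new AtomicInteger(); // how often the DB was hit

    MutexCacheLoader(Function<String, String> db) {
        this.db = db;
    }

    String get(String key) {
        String v = cache.get(key);
        if (v != null) return v;    // cache hit: no lock needed
        rebuildLock.lock();         // cache miss: only one thread rebuilds
        try {
            v = cache.get(key);     // re-check: another thread may have rebuilt it
            if (v == null) {
                dbCalls.incrementAndGet();
                v = db.apply(key);
                if (v != null) cache.put(key, v);
            }
            return v;
        } finally {
            rebuildLock.unlock();
        }
    }

    // Simulates the hot key expiring.
    void expire(String key) {
        cache.remove(key);
    }

    // Starts the given number of threads that all request the same missing key
    // and returns how many of them actually reached the database.
    static int demo(int threads) {
        MutexCacheLoader loader = new MutexCacheLoader(k -> "value-of-" + k);
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> loader.get("hot"));
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return loader.dbCalls.get();
    }
}
```

The second cache.get inside the lock is what keeps the waiting threads from querying the database again once the first thread has rebuilt the entry.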

Cache avalanche

A large number of keys expire at the same time, so the requests for those keys all hit the DB, which again puts excessive pressure on the database or even brings it down.

Solution

1. Spread out key expiration times: add a random value to the uniform expiration time, or use a more advanced algorithm to disperse expirations.

2. Run multiple Redis instances, so that if individual nodes go down there are others to use.

3. Multi-level cache: for example, add a local cache to reduce the pressure on Redis.

4. Add rate limiting in front of the storage layer, and degrade service when requests exceed the limit (usually by just returning an error).
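The first measure, adding a random value to a uniform expiration time, fits in a few lines. This is a minimal Java sketch; the class JitteredTtl, the method name, and the numbers in the usage example are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

class JitteredTtl {
    // Returns baseSeconds plus a uniformly random extra in [0, maxJitterSeconds],
    // so keys written at the same moment do not all expire at the same moment.
    static long ttlSeconds(long baseSeconds, long maxJitterSeconds) {
        if (baseSeconds < 0 || maxJitterSeconds < 0) {
            throw new IllegalArgumentException("TTL values must be non-negative");
        }
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        // For example: a 10-minute base TTL with up to 5 minutes of jitter.
        for (int i = 0; i < 5; i++) {
            System.out.println(ttlSeconds(600, 300));
        }
    }
}
```

The value returned would then be used as the expiration time when the key is written to the cache.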

He said that for a student this was enough to know, that he was quite satisfied with me, and that there would be a third interview.

The third interviewer seemed to be some kind of team lead and asked more casual questions.

6) What happens between entering a URL and seeing the web page?

Answer: (be as detailed as you can; reciting a memorized answer will probably sink you, you have to understand it) domain name resolution --> TCP three-way handshake --> send the HTTP request --> the server responds to the HTTP request and the browser receives the HTML --> the browser parses the HTML and requests the resources it references (JS, CSS, images, etc.) --> the browser renders the page for the user

7) He asked me to hand-write quicksort (I suspect he had nothing else to ask)
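The post does not record the code written in the interview, so the following is only a sketch of one common way to hand-write quicksort in Java: a random pivot plus a three-way partition, which copes with many duplicate keys. The class name QuickSortSketch is hypothetical.

```java
class QuickSortSketch {
    // Sorts a in ascending order, in place.
    static void quickSort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        // A random pivot avoids the O(n^2) worst case on already sorted input.
        int pivot = a[lo + (int) (Math.random() * (hi - lo + 1))];
        // Invariant: a[lo..lt-1] < pivot, a[lt..i-1] == pivot, a[gt+1..hi] > pivot.
        int lt = lo;
        int i = lo;
        int gt = hi;
        while (i <= gt) {
            if (a[i] < pivot) {
                swap(a, lt++, i++);
            } else if (a[i] > pivot) {
                swap(a, i, gt--); // the swapped-in element is still unexamined
            } else {
                i++;
            }
        }
        sort(a, lo, lt - 1);
        sort(a, gt + 1, hi);
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {3, 6, 1, 8, 3, 2, 9, 3};
        quickSort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

The expected running time is O(n log n), and the recursion uses O(log n) stack space on average.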

Then it's over.

 

4. Impressions

Overall the difficulty felt average. I did not expect the linked-list question to be extended to three or four steps, which showed the interviewer knew his stuff. The second interviewer gave me the impression that he had read the Redis source code too, and my notes above are probably incomplete. The two of us hit it off. The overall experience was quite good; probably because I had been referred by a senior schoolmate, all three interviews passed like a chat. I basically never got stuck on the handwritten algorithms, and because I usually pay a lot of attention to code style, they seemed quite satisfied with my code. In the end I passed every round and received an offer, but I did not take it.


Origin blog.csdn.net/hebtu666/article/details/114999315