1. Basic concepts
-
1. The main problems of artificial intelligence research
Knowledge acquisition, knowledge representation, and knowledge application.
-
2. Please briefly describe the connections and differences among knowledge, data and information
Data, information, and knowledge are three consecutive stages in a knowledge worker's perception and understanding of objective things. Data are facts; information is the carrier of facts; knowledge is the result of a person's processing, absorption, extraction, and evaluation of information.
-
3. Similarities and differences between blind search and heuristic search
Blind search (uninformed search): the search proceeds according to a predetermined control strategy, and intermediate information obtained during the search does not change that strategy. Heuristic search (informed search): heuristic information about the problem is incorporated into the search and used to guide it in the most promising direction. -
4. Briefly explain regression, classification and clustering concepts
First, according to whether the training data has labeled information, the learning tasks are divided into "supervised learning" and "unsupervised learning". Among them, classification and regression are representatives of the former, and clustering is a representative of the latter.
Second, all three are aimed at prediction problems: if the predicted value is discrete, the task is called "classification"; if it is continuous, it is called "regression"; and if there is no label information in advance (i.e., the training set has no standard answers), it is called "clustering". The regression method is a supervised learning algorithm for predicting and modeling numerical continuous random variables. Use cases typically include continuously changing quantities such as house-price forecasts, stock movements, or test scores. Regression tasks are characterized by labeled datasets with numerical target variables; that is, each observation sample has a numerical ground-truth value to supervise the algorithm.
A classification method is a supervised learning algorithm that models or predicts discrete random variables. Use cases include tasks such as email filtering, financial fraud, and predicting employee turnover as output categories. Many regression algorithms have classification counterparts, and classification algorithms are generally suited to predicting a class (or the probability of a class) rather than continuous values.
Clustering is an unsupervised learning task in which an algorithm finds natural groups (i.e., clusters) of observations based on the internal structure of the data. Use cases include segmenting customers, clustering news, recommending articles, and more. Because clustering is a type of unsupervised learning (i.e., the data are not labeled), the results are often evaluated using data visualization. If there are "correct answers" (i.e., pre-labeled clusters in the training set), then a classification algorithm may be more appropriate.
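The distinction above can be sketched with toy data. This is a minimal illustration (the data, helper names, and 1-nearest-neighbour / nearest-centroid choices are all assumptions for illustration, not methods from the text): classification predicts a discrete label, regression predicts a number, and clustering groups unlabeled points.

```python
# Classification: labeled data with DISCRETE targets -> predict a class.
train_X = [1.0, 2.0, 8.0, 9.0]
train_y = ["small", "small", "large", "large"]

def classify(x):
    # 1-nearest-neighbour classifier: copy the label of the closest sample
    i = min(range(len(train_X)), key=lambda i: abs(train_X[i] - x))
    return train_y[i]

# Regression: labeled data with CONTINUOUS targets -> predict a number.
def regress(x):
    # predict by averaging the targets of the 2 nearest training points
    targets = [10.0, 20.0, 80.0, 90.0]
    nearest = sorted(range(len(train_X)), key=lambda i: abs(train_X[i] - x))[:2]
    return sum(targets[i] for i in nearest) / 2

# Clustering: NO labels -> group points by internal structure
# (one assignment step of k-means with two given centroids).
def cluster(points, c1, c2):
    return [0 if abs(p - c1) < abs(p - c2) else 1 for p in points]
```

Note how only `cluster` works without any target values: the grouping comes purely from distances between the points themselves.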
-
5. Briefly answer which biological genetic concepts such as individual, population, chromosome, gene, and fitness correspond to the applications of genetic algorithms?
Biological genetic concept → genetic algorithm counterpart:
- Individual: a solution
- Population: a set of solutions selected according to fitness values (the number of solutions is the population size)
- Chromosome: the encoding of a solution (string, vector, etc.)
- Mating: crossover selects two chromosomes and recombines them to produce new chromosomes
- Gene: each component in the encoding of a solution
- Mutation: the process of changing some component of the encoding
- Fitness: the fitness function value
- Survival of the fittest: solutions with larger objective values are more likely to be selected -
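The correspondences above can be made concrete with a small sketch. This is an illustrative example only (the objective f(x) = x² over 5-bit integers, the population size, and the mutation rate are all assumptions, not from the text); each comment names the biological concept the code plays the role of.

```python
import random

def fitness(chromosome):
    # "fitness": quality of the decoded solution; the gene string is decoded
    # from binary into an integer x in [0, 31], objective f(x) = x^2
    x = int("".join(map(str, chromosome)), 2)
    return x * x

def select(population):
    # "survival of the fittest": roulette-wheel selection, fitter solutions
    # are proportionally more likely to be chosen
    total = sum(fitness(c) for c in population)
    r = random.uniform(0, total)
    acc = 0.0
    for c in population:
        acc += fitness(c)
        if acc >= r:
            return c
    return population[-1]

def crossover(a, b):
    # "mating": exchange gene segments between two chromosomes
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def mutate(c, rate=0.05):
    # "mutation": flip a single gene (bit) with a small probability
    return [(1 - g) if random.random() < rate else g for g in c]

def evolve(pop_size=8, length=5, generations=30):
    # "population": a set of individuals (candidate solutions)
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        best = max(population, key=fitness)   # elitism keeps the current best
        children = [best]
        while len(children) < pop_size:
            children.append(mutate(crossover(select(population), select(population))))
        population = children
    return max(population, key=fitness)
```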
6. Predicate Representation of Knowledge
-
1 Some people like playing basketball, some people like running, some people like both playing basketball and running
P(x): x is a person
L(x,y): x likes y; where, the individual domain of y is {playing basketball, running}.
Express knowledge with predicates as
(∃x)(P(x) ∧ (L(x, playing basketball) ∨ L(x, running) ∨ (L(x, playing basketball) ∧ L(x, running)))) -
2 The new type of computer is fast and has a large storage capacity;
solution: define the predicate
NC(x): x is a new type of computer
F(x): x is fast
B(x): x has a large capacity
Express the knowledge as: (∀x)(NC(x) → F(x) ∧ B(x)) -
Someone goes to play basketball every afternoon;
solution: define the predicate
P(x): x is a person
B(x,y): x plays basketball during y
A(y): y is an afternoon
Express the knowledge as a predicate:
(∃x)(P(x) ∧ (∀y)(A(y) → B(x,y))) -
Anyone who likes programming likes computers;
Solution: Define the predicate
P(x): x is a person
L(x,y): x likes y.
Express knowledge as a predicate:
(∀x)(P(x)∧L (x, programming) → L(x, computer))
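On a finite domain, a first-order formula like the last one can be checked mechanically by quantifying over the universe. Below is a quick sketch (the two-person universe and the `likes` facts are assumed toy data): the universal implication (∀x)(P(x) ∧ L(x, programming) → L(x, computer)) becomes an `all(...)` over the individuals, with `A → B` encoded as `not A or B`.

```python
# Assumed toy universe: two individuals and a relation of "likes" facts.
people = {"alice", "bob"}
likes = {("alice", "programming"), ("alice", "computer"), ("bob", "running")}

def P(x):
    return x in people          # P(x): x is a person

def L(x, y):
    return (x, y) in likes      # L(x, y): x likes y

# (∀x)(P(x) ∧ L(x, programming) → L(x, computer)),
# with the implication rewritten as (not antecedent) or consequent.
holds = all((not (P(x) and L(x, "programming"))) or L(x, "computer")
            for x in people)
```

Here `holds` is true: alice likes programming and also likes the computer, and bob does not like programming, so the implication is vacuously satisfied for him.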
-
-
- Semantic Network Representation
AKO (A-Kind-Of): one thing is a kind of another
AMO (A-Member-Of): one thing is a member of another
ISA (Is-A): one thing is an instance of another
The relationships are not limited to these three.
-
1 Both trees and grass are plants
-
2 Trees and grasses have leaves and roots
-
3 The Chinese table tennis team defeated the Japanese team 4:0
-
4 Teacher Gao lectured on "Computer Network" to computer students from July to August
-
- frame representation
- 1 Basic information about a person, such as: including name, age, height, job, etc. For example:
Frame name: <Teacher-1>
Name: Xia Bing
Age: 36
Gender: Female
Department: Computer Software Teaching and Research Office
Title: Associate Professor
Address: <adr-1>
Salary: <sal-1>
Start date: September 1988
End date: July 1996
- 2 Suppose there is a weather forecast as follows: "The Beijing area is sunny today during the daytime, with a northerly wind of force 3, a high of 12°C, a low of -2°C, a feels-like temperature of 5°C, and a 15% probability of precipitation."
Solution:
Frame name: <Weather Forecast>
Region: Beijing
Time period: Today's daytime
Weather: Sunny
Wind direction: North
Wind force: Level 3
Temperature: Highest: 12°C
Lowest: -2°C
Feeling: 5°C
Precipitation probability: 15%
-
9. Production-system reasoning: infer the animal represented by the given facts according to the rules in the rule base, and write out the reasoning process. For example:
Reasoning process:
Step 1: Check the rule base for IF has milk THEN mammal; its antecedent matches fact r16 (has milk) in the fact base. Fire this production, generating the new fact "mammal", and add r21 to the fact base: mammal.
Step 2: Check rule r7: IF mammal AND has hooves THEN ungulate; its antecedents match r21 and r17. Add r22 to the fact base: ungulate.
Step 3: Check rule r11: IF ungulate AND long neck AND long legs AND dark spots THEN giraffe; add r23 to the fact base: giraffe.
At this point, a clear classification conclusion is obtained, and the reasoning ends;
Fact base:
r16: has milk
r17: has hooves
r18: long neck
r19: long legs
r20: dark spots
r21: mammal
r22: ungulate
r23: giraffe
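The reasoning above is plain forward chaining: repeatedly scan the rule base and fire any rule whose antecedents are all in the fact base, until no new fact appears. A minimal sketch (only the three rules used in the steps are included; the rest of the rule base is omitted):

```python
# Rules as (antecedent set, conclusion) pairs, taken from the steps above.
rules = [
    ({"has milk"}, "mammal"),                                        # ~r16 rule
    ({"mammal", "has hooves"}, "ungulate"),                          # r7
    ({"ungulate", "long neck", "long legs", "dark spots"}, "giraffe"),  # r11
]

# Initial fact base (r16-r20).
facts = {"has milk", "has hooves", "long neck", "long legs", "dark spots"}

changed = True
while changed:                       # keep firing until no new fact is produced
    changed = False
    for antecedents, conclusion in rules:
        if antecedents <= facts and conclusion not in facts:
            facts.add(conclusion)    # add the derived fact to the fact base
            changed = True
```

After the loop, `facts` contains the derived conclusions "mammal", "ungulate", and "giraffe", matching steps 1 through 3.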
2. Calculation
-
1 Compute cosine similarity, Euclidean distance, Manhattan distance for vectors X = (1, 1, 0, 1), Y = (1, 1, 1, 0) ?
Cosine similarity: cos(X, Y) = (1+1+0+0)/(√3 · √3) = 2/3
Euclidean distance: √((1−1)² + (1−1)² + (0−1)² + (1−0)²) = √2
Manhattan distance: |1−1| + |1−1| + |0−1| + |1−0| = 2 -
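The three measures are easy to verify numerically:

```python
import math

X = [1, 1, 0, 1]
Y = [1, 1, 1, 0]

dot = sum(a * b for a, b in zip(X, Y))                 # X . Y = 2
norm = lambda v: math.sqrt(sum(a * a for a in v))      # |X| = |Y| = sqrt(3)

cosine = dot / (norm(X) * norm(Y))                            # 2/3
euclidean = math.sqrt(sum((a - b) ** 2 for a, b in zip(X, Y)))  # sqrt(2)
manhattan = sum(abs(a - b) for a, b in zip(X, Y))             # 2
```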
2 search
- Please use the depth-first search method and the breadth-first search method to find a solution path from node S to node R (note that this is not the same as the search path; for example, S->O->P->R is a solution path). Assuming that initially OPEN = {1} and CLOSE = {}, give the changes in the OPEN table and CLOSE table when visiting each node.
Solution: Depth-first search:
Find a solution path: S => D => L => R
This path is not the optimal solution path, the optimal solution path is S=>O=>R
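Since the figure is not reproduced here, the sketch below uses an assumed adjacency list reconstructed from the paths mentioned in the text (S→D→L→R for depth-first, S→O→R as the optimal path); the edge set is an assumption, not the original graph. Both searches share one loop and differ only in whether OPEN is used as a stack (DFS) or a queue (BFS).

```python
# Assumed graph, reconstructed from the solution paths quoted in the text.
graph = {"S": ["D", "O"], "D": ["L"], "L": ["R"],
         "O": ["P", "R"], "P": ["R"], "R": []}

def search(start, goal, depth_first):
    open_list = [(start, [start])]   # OPEN table: (node, path so far)
    close_list = []                  # CLOSE table: already-expanded nodes
    while open_list:
        # stack (LIFO) for depth-first, queue (FIFO) for breadth-first
        node, path = open_list.pop() if depth_first else open_list.pop(0)
        if node == goal:
            return path
        if node in close_list:
            continue
        close_list.append(node)
        children = graph[node]
        if depth_first:
            # reverse so the leftmost child is expanded first off the stack
            children = list(reversed(children))
        for nxt in children:
            if nxt not in close_list:
                open_list.append((nxt, path + [nxt]))
    return None
```

On this assumed graph, depth-first search returns S→D→L→R, while breadth-first search finds the shorter S→O→R, matching the discussion above.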
- 2
solution:
No expansion yet, OPEN = {8, 5, 3}; node 8 reaches the depth limit and can no longer be expanded, so it is closed; node 5 only has a path to 8, so it is not expanded further and is closed; node 3 continues to be expanded.
-
3 Use a heuristic search algorithm to solve the eight-puzzle ("eight-digit") problem: find the solution path from the initial state to the goal state, and give the evaluation function value of each state. The initial state is S0 and the goal state is Sg.
Solution:
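A common heuristic search for the eight-puzzle is A* with evaluation function f(n) = g(n) + h(n), where g(n) is the depth and h(n) counts misplaced tiles. The states S0 and Sg below are assumed examples (the document's own S0/Sg are not reproduced here), so this is a sketch of the technique, not the document's specific answer.

```python
import heapq

# Assumed example states; 0 marks the blank square.
start = (2, 8, 3, 1, 6, 4, 7, 0, 5)   # S0 (assumed)
goal  = (1, 2, 3, 8, 0, 4, 7, 6, 5)   # Sg (assumed)

def h(state):
    # heuristic: number of misplaced non-blank tiles (admissible)
    return sum(1 for i, v in enumerate(state) if v != 0 and v != goal[i])

def neighbours(state):
    # generate states reachable by sliding one tile into the blank
    i = state.index(0)
    r, c = divmod(i, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            s = list(state)
            j = nr * 3 + nc
            s[i], s[j] = s[j], s[i]
            yield tuple(s)

def astar(start, goal):
    # frontier entries: (f = g + h, g, state, path); expand lowest f first
    frontier = [(h(start), 0, start, [start])]
    seen = {start: 0}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for nxt in neighbours(state):
            if nxt not in seen or g + 1 < seen[nxt]:
                seen[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None
```

Because the misplaced-tiles heuristic never overestimates, A* returns an optimal solution; for these assumed states it is five moves long. Printing `g + h(s)` for each state on the returned path gives the evaluation function values the question asks for.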
3. Design questions
-
Given a data set with 3 attributes {A, B, C} and two categories {C1, C2}, use the Naive Bayesian classification method to predict the category label of the record (A=1, B=0, C=1); give the specific calculation process.
Understand Naive Bayes
Solution: According to the information given in the table, the probabilities of belonging to categories C1 and C2 can be obtained:
P(Y=C1)=6/10, P(Y=C2)=4/10,
At the same time, the conditional probability can be obtained:
P(A=0|Y=C1)=3/6, P(A=1|Y=C1)=3/6, P(A=0|Y=C2)=0, P(A=1|Y=C2)=1
P(B=0|Y=C1)=4/6, P(B=1|Y=C1)=2/6, P(B=0|Y=C2)=2/4, P(B=1|Y=C2)=2/4
P(C=0|Y=C1)=2/6, P(C=1|Y=C1)=4/6, P(C=0|Y=C2)=3/4, P(C=1|Y=C2)=1/4
Then the score of record (A=1, B=0, C=1) for category C1 is:
P(Y=C1)·P(A=1|Y=C1)·P(B=0|Y=C1)·P(C=1|Y=C1) = 6/10 × 3/6 × 4/6 × 4/6 = 2/15
The score for category C2 is:
P(Y=C2)·P(A=1|Y=C2)·P(B=0|Y=C2)·P(C=1|Y=C2) = 4/10 × 1 × 2/4 × 1/4 = 1/20
Since 2/15 > 1/20, the record belongs to category C1.
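The calculation can be replayed exactly with rational arithmetic. The priors and conditional probabilities below are copied from the worked solution above; the class with the larger product of prior and conditionals wins.

```python
from fractions import Fraction as F

# Priors and conditionals P(attr = value | class), from the solution above.
prior = {"C1": F(6, 10), "C2": F(4, 10)}
cond = {
    "C1": {("A", 1): F(3, 6), ("B", 0): F(4, 6), ("C", 1): F(4, 6)},
    "C2": {("A", 1): F(1, 1), ("B", 0): F(2, 4), ("C", 1): F(1, 4)},
}

record = {"A": 1, "B": 0, "C": 1}

def score(label):
    # Naive Bayes score: prior times product of per-attribute conditionals
    s = prior[label]
    for attr, val in record.items():
        s *= cond[label][(attr, val)]
    return s

scores = {c: score(c) for c in prior}
prediction = max(scores, key=scores.get)
```

Using `Fraction` avoids any floating-point rounding: the scores come out as exactly 2/15 and 1/20, so the prediction is C1.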