
1 Topic

For the problem statement, see the Question 1 post: [The 11th Teddy Cup Data Mining Challenge in 2023] Question C: building a two-way recommendation system for recruitment and job hunting on the Teddy internal promotion platform, modeling and detailed Python code, Question 1

2 Question 3: Constructing a Model of Job Matching Degree and Job Seeker Satisfaction

During recruitment and job hunting, an enterprise facing many high-quality job seekers considers how well each one meets the position's ability requirements, skill mastery, and other criteria. Job seekers, in turn, face a wide variety of recruitment postings and choose the positions that suit them according to their own conditions and requirements, so the job seeker satisfaction index can objectively reflect how satisfied a job seeker is with a company's position. For job seekers who do not meet a position's minimum requirements, the company can define the job matching degree as 0. Similarly, for positions that do not meet a job seeker's minimum requirements, the job seeker satisfaction can be defined as 0.

Using the recruitment information and job seeker information from Question 2, construct a model of job matching degree and job seeker satisfaction. Based on this model, list for each recruitment posting the job seekers whose job matching degree is non-zero, sort the results in descending order, and store them in "result3-1.csv". Likewise, list for each job seeker the recruitment postings whose job seeker satisfaction is non-zero, sort the results in descending order, and store them in "result3-2.csv". (See the CSV files in Attachment 1 for the templates.)

3 Thought Analysis

Modeling scheme

To build the model of job matching degree and job seeker satisfaction, first clean and preprocess result1-1.csv and result1-2.csv. Then calculate the job matching degree of each job seeker for each recruitment posting, and the satisfaction of each job seeker with each posting.

Clean and process result1-1.csv:

Remove duplicate recruitment information ids;
Delete or replace null or meaningless data;
Encode categorical fields such as number of employees, education background, and work experience as numbers to simplify later calculations.
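
The three cleaning steps above can be sketched with pandas roughly as follows. This is only an illustration on a toy table: the column `学历要求`, the derived column `学历等级`, and the `EDU_RANK` ordering are assumptions, and the real result1-1.csv may use different column names and category labels.

```python
import pandas as pd

# Toy stand-in for result1-1.csv; every column except 招聘信息id is a hypothetical name.
job_info = pd.DataFrame({
    "招聘信息id": [1, 1, 2, 3],
    "学历要求": ["本科", "本科", None, "硕士"],
})

# Assumed ordinal ranking of education levels (higher means more demanding).
EDU_RANK = {"不限": 0, "大专": 1, "本科": 2, "硕士": 3, "博士": 4}

def clean_job_info(df: pd.DataFrame) -> pd.DataFrame:
    # 1. Keep only the first row for each posting id.
    out = df.drop_duplicates(subset="招聘信息id", keep="first").copy()
    # 2. Treat a missing education requirement as "no requirement".
    out["学历要求"] = out["学历要求"].fillna("不限")
    # 3. Encode the category as an integer; unknown labels fall back to 0.
    out["学历等级"] = out["学历要求"].map(EDU_RANK).fillna(0).astype(int)
    return out.reset_index(drop=True)

cleaned = clean_job_info(job_info)
print(cleaned)
```

Encoding education as an ordered integer makes the later "meets the minimum requirement" check a simple numeric comparison.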

Clean and process result1-2.csv:

Remove duplicate job applicant ids;
Delete or replace null or meaningless data;
Extract keywords from the expected positions and skills and segment them into words to simplify later calculations.
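
As a minimal sketch of the keyword step, the snippet below splits a free-text field on common delimiters using only the standard library. The delimiter set and the helper name `extract_keywords` are assumptions. Fields written as running Chinese sentences would need a proper word segmenter instead, which is not shown here.

```python
import re

# Delimiters assumed to separate items in the skills / expected-position fields.
_SEPARATORS = re.compile(r"[,，、;；/|\s]+")

def extract_keywords(text):
    """Split a delimited free-text field into a set of lower-cased keywords."""
    if not isinstance(text, str):
        return set()  # missing values (None, NaN) yield no keywords
    tokens = _SEPARATORS.split(text.strip())
    return {token.lower() for token in tokens if token}

skills = extract_keywords("Python、SQL, 数据分析 / Excel")
print(sorted(skills))
```

Returning a set makes the later overlap calculations (intersection, union) straightforward.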

Calculate job matching degree:

For each recruitment posting, filter the job applicants by expected position and skills, keeping only the qualified ones;
For each qualified job seeker, calculate the job matching degree with the posting, using indicators such as skill match and position match;
For each recruitment posting, sum the job matching degrees of all qualified job seekers to obtain the posting's total job matching degree.
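
One possible way to score a single posting and job seeker pair is sketched below. The original matching function is elided in this write-up, so this is an assumed scoring rule rather than the author's: education acts as a hard filter (degree 0 when it is below the minimum), and the remaining score is the share of required skills the job seeker covers. The function name, the parameters, and the integer education ranks are hypothetical.

```python
def match_degree(required_edu, required_skills, seeker_edu, seeker_skills):
    """Return a matching degree in [0, 1]; 0 means a minimum requirement is unmet."""
    # Hard filter: education below the posting's minimum gives degree 0.
    if seeker_edu < required_edu:
        return 0.0
    required = set(required_skills)
    if not required:
        return 1.0  # nothing to check: assumed to be a full match
    # Share of the required skills that the job seeker covers.
    covered = len(required & set(seeker_skills))
    return covered / len(required)

# Education ranks are integers (e.g. 2 = bachelor, 3 = master), an assumed encoding.
partial = match_degree(2, {"python", "sql"}, 3, {"python", "excel"})
rejected = match_degree(3, {"python", "sql"}, 2, {"python", "sql"})
print(partial, rejected)
```

A job seeker who passes the education filter but covers none of the required skills also scores 0 under this rule, so they would be excluded from the non-zero results.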

Calculate job applicant satisfaction:

For each job seeker, filter the recruitment postings by qualifications, work experience, and other conditions, keeping only the qualified ones;
For each qualified posting, calculate the job seeker's satisfaction with it, using indicators such as salary, company type, and work location;
For each job seeker, sum the satisfaction scores of all qualified postings to obtain the job seeker's total satisfaction.

Finally, sort the results by job matching degree in descending order to produce result3-1.csv, and by job seeker satisfaction in descending order to produce result3-2.csv.

4 Code Implementation

4.1 Job matching degree

import pandas as pd
import numpy as np

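# Read the recruitment postings and the job seeker records
job_info = pd.read_csv('data/result1-1.csv')
job_seekers = pd.read_csv('data/result1-2.csv')

# Combine postings and job seekers with a cross join (every posting paired with every job seeker)
job_matching = pd.merge(job_info.assign(key=1), job_seekers.assign(key=1), on='key').drop('key', axis=1)

# Define the matching-degree function
def calculate_match(row):
    # Check the education match; if the requirement is not met, the matching degree is 0
    # ... omitted; please download the full code
    # Check whether the position matches; 1 if any expected position matches, otherwise 0
    # ... omitted; please download the full code
    return 0

# Compute the job matching degree
job_matching['匹配度'] = ...  # omitted; please download the full code
# Sort by matching degree in descending order
job_matching = job_matching.sort_values(by='匹配度', ascending=False)
# Save the result
job_matching[['招聘信息id', '求职者id', '匹配度']].to_csv('data/result3-1.csv', index=False)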
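
The task asks only for pairs with a non-zero matching degree, which the snippet above does not filter out. A hedged sketch of that step on toy data follows. Grouping the rows by posting before sorting is an assumption about the template layout in Attachment 1, which is not reproduced here.

```python
import pandas as pd

# Toy scores standing in for the job_matching table built above.
job_matching = pd.DataFrame({
    "招聘信息id": [1, 1, 2, 2, 2],
    "求职者id": [10, 11, 10, 11, 12],
    "匹配度": [0.5, 0.0, 0.2, 0.9, 0.0],
})

result = (
    job_matching[job_matching["匹配度"] > 0]  # drop zero-degree pairs
    .sort_values(["招聘信息id", "匹配度"], ascending=[True, False])
    .reset_index(drop=True)
)
print(result)
```

The same pattern applies to result3-2.csv by grouping on `求职者id` and sorting on `满意度` instead.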

4.2 Job Seeker Satisfaction


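import pandas as pd
from ast import literal_eval

# Read the recruitment postings and the job seeker records
job_info = pd.read_csv('data/result1-1.csv')
job_seekers = pd.read_csv('data/result1-2.csv')
# Convert the string form of each list into a real list with literal_eval()
job_seekers['预期岗位'] = job_seekers['预期岗位'].apply(literal_eval)
# Expand each list element into its own row with explode()
job_seekers_all = job_seekers.explode('预期岗位')

# Combine postings and job seekers with a left join
# ... omitted; please download the full code
# Maximum over the four salary columns
max_value = max(job_satisfaction[['最低薪资','最高薪资','预期最低薪资','预期最高薪资']].max())
# Minimum over the four salary columns
min_value = min(job_satisfaction[['最低薪资','最高薪资','预期最低薪资','预期最高薪资']].min())
# Min-max normalize the salaries
def min_max_normalize(x):
    if min_value == max_value:
        return x
    return (x - min_value) / (max_value - min_value)
job_satisfaction['最低薪资'] = job_satisfaction['最低薪资'].apply(min_max_normalize)
job_satisfaction['最高薪资'] = job_satisfaction['最高薪资'].apply(min_max_normalize)
job_satisfaction['预期最低薪资'] = job_satisfaction['预期最低薪资'].apply(min_max_normalize)
job_satisfaction['预期最高薪资'] = job_satisfaction['预期最高薪资'].apply(min_max_normalize)


# Job seeker satisfaction can be computed in a similar way: after merging the
# postings with the job seeker records, filter by the job seeker's requirements
# and conditions and compute the score. A pandas-based scheme follows.
# Define the satisfaction function
def calculate_satisfaction(row):
    # Check whether the salary is satisfactory; if not, the satisfaction is 0
    # ... omitted; please download the full code

    # Compute the professional skill match
    # ... omitted; please download the full code
    # Compute the work experience match
    # ... omitted; please download the full code
    # Compute the education match
    # ... omitted; please download the full code

    # Also the satisfaction with region / distance
    # ... omitted; please download the full code
    # Choose the weights yourself, for example:
    # total_satisfaction = 0.4 * salary_satisfaction + 0.2 * skill_satisfaction + 0.2 * exp_satisfaction + 0.2 * edu_satisfaction
    return total_satisfaction

# Compute the satisfaction
job_satisfaction['满意度'] = job_satisfaction.apply(calculate_satisfaction, axis=1)
# Sort by satisfaction in descending order
job_satisfaction = job_satisfaction.sort_values(by='满意度', ascending=False)
# Save the result
job_satisfaction[['招聘信息id', '求职者id', '满意度']].to_csv('data/result3-2.csv', index=False)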
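
The salary normalization in the snippet above rescales all four salary columns with one shared minimum and maximum, so offered and expected salaries stay on the same scale. A vectorized sketch of the same idea is shown below on invented toy figures. The helper name `normalize_salaries` is hypothetical, and the column names are the ones used above.

```python
import pandas as pd

SALARY_COLS = ["最低薪资", "最高薪资", "预期最低薪资", "预期最高薪资"]

def normalize_salaries(df: pd.DataFrame) -> pd.DataFrame:
    """Min-max scale the four salary columns with one shared minimum and maximum."""
    out = df.copy()
    lo = out[SALARY_COLS].min().min()
    hi = out[SALARY_COLS].max().max()
    if hi > lo:  # leave the data unchanged when every salary is identical
        for col in SALARY_COLS:
            out[col] = (out[col] - lo) / (hi - lo)
    return out

# Invented figures for illustration only.
demo = pd.DataFrame({
    "最低薪资": [5000, 8000],
    "最高薪资": [10000, 15000],
    "预期最低薪资": [6000, 9000],
    "预期最高薪资": [12000, 25000],
})
normalized = normalize_salaries(demo)
print(normalized)
```

Scaling each column separately would instead distort comparisons such as "offered maximum versus expected minimum", which is why a shared range is used.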

[The 11th Teddy Cup Data Mining Challenge in 2023] Question C: building a two-way recommendation system for recruitment and job hunting on the Teddy internal promotion platform, modeling and detailed Python code, Question 3
