Stock forecast analysis in Python

First, I used the pandas library to read the training set and the test set and dropped the rows containing null values. Next, I merged the two sets, converted the date column to the datetime type, and sorted the rows by date. Then, I used matplotlib.pyplot to plot each column against the date as a line chart: opening price, highest price, lowest price, closing price, and trading volume. After that, I extracted the features (open, high, low, volume) and the target variable (close) from the training and test sets and created a linear regression model.
I trained the model on the training set and made predictions on the test set. Then, I calculated the mean squared error (MSE), mean absolute error (MAE), and coefficient of determination (R2) of the predictions. Next, I generated the prediction results file and saved it in CSV format. Finally, I used matplotlib.pyplot to draw line charts comparing the prediction results with the test set data for the opening price, highest price, lowest price, closing price, and trading volume.

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
 
# Read the training and test sets
train_data = pd.read_csv('zgpa_train.csv')
test_data = pd.read_csv('zgpa_test.csv')

# Drop rows that contain null values
train_data = train_data.dropna()
test_data = test_data.dropna()

# Merge the training and test sets
all_data = pd.concat([train_data, test_data])

# Convert the date column to the datetime type
all_data['date'] = pd.to_datetime(all_data['date'])

# Sort the data by date
all_data = all_data.sort_values(by='date')

# Names of the columns to plot
columns = ['open', 'high', 'low', 'close', 'volume']
 
# Draw one line chart per column
for column in columns:
    plt.figure()  # start a new figure
    plt.plot(all_data['date'], all_data[column])  # plot the column against the date

    # Set the legend, title, and axis labels
    plt.legend([column.capitalize()])
    plt.title('Stock Data: {}'.format(column.capitalize()))
    plt.xlabel('Date')
    plt.ylabel('Value')

    # Show the figure
    plt.show()
 
# Extract the features and the target variable
train_features = train_data[['open', 'high', 'low', 'volume']]
train_target = train_data['close']

test_features = test_data[['open', 'high', 'low', 'volume']]
test_target = test_data['close']

# Create the linear regression model
model = LinearRegression()

# Train the model on the training set
model.fit(train_features, train_target)

# Predict the closing price on the test set
predictions = model.predict(test_features)

# Mean squared error
mse = mean_squared_error(test_target, predictions)
print('Mean squared error (MSE): {:.2f}'.format(mse))

# Mean absolute error
mae = mean_absolute_error(test_target, predictions)
print('Mean absolute error (MAE): {:.2f}'.format(mae))

# Coefficient of determination
r2 = r2_score(test_target, predictions)
print('Coefficient of determination (R2): {:.2f}'.format(r2))
 
# Generate the prediction results file.
# Note: the model predicts only the closing price, so the same predicted
# value is written to all four price columns; volume is copied from the test set.
result_df = pd.DataFrame({'date': test_data['date'],
                          'open': ['{:.2f}'.format(x) for x in predictions],
                          'high': ['{:.2f}'.format(x) for x in predictions],
                          'low': ['{:.2f}'.format(x) for x in predictions],
                          'close': ['{:.2f}'.format(x) for x in predictions],
                          'volume': ['{:.0f}'.format(x) for x in test_data['volume']]})

result_df.to_csv('项目一submit董昊晨.csv', index=False)
 
# Read the test set again
test_data = pd.read_csv('zgpa_test.csv')

# Convert the date column to the datetime type
test_data['date'] = pd.to_datetime(test_data['date'])

# Sort the test set by date
test_data = test_data.sort_values(by='date')

# Read the prediction results (must be the same file name written by to_csv above)
prediction_data = pd.read_csv('项目一submit董昊晨.csv')

# Convert the date column to the datetime type
prediction_data['date'] = pd.to_datetime(prediction_data['date'])

# Sort the prediction results by date
prediction_data = prediction_data.sort_values(by='date')

# Names of the columns to compare
columns = ['open', 'high', 'low', 'close', 'volume']
 
# Draw the comparison charts
for column in columns:
    plt.figure()  # start a new figure

    # Line for the test set data
    plt.plot(test_data['date'], test_data[column], label='Test Data')

    # Line for the prediction results
    plt.plot(prediction_data['date'], prediction_data[column], label='Prediction')

    # Set the legend, title, and axis labels
    plt.legend()
    plt.title('Comparison: {} - Test Data vs Prediction'.format(column.capitalize()))
    plt.xlabel('Date')
    plt.ylabel('Value')

    # Show the figure
    plt.show()
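
To make the three metrics less of a black box, here is a minimal sketch that computes them by hand in plain Python. The helper names and the four closing prices are made up for illustration; they are not taken from the zgpa data, and the script above still relies on the scikit-learn functions.

```python
# Hypothetical helpers that spell out what MSE, MAE and R2 measure.

def mse(y_true, y_pred):
    # Average of the squared errors
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    # Average of the absolute errors
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    # 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up closing prices, for illustration only
actual = [10.0, 11.0, 12.0, 13.0]
predicted = [10.5, 10.5, 12.5, 12.5]

print('MSE: {:.2f}'.format(mse(actual, predicted)))
print('MAE: {:.2f}'.format(mae(actual, predicted)))
print('R2: {:.2f}'.format(r2(actual, predicted)))
```

MSE penalizes large errors more heavily than MAE, and R2 compares the model against always predicting the mean of the actual values.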

The most important tool in dynamic analysis is the debugger. Debuggers fall into two types: user mode and kernel mode.
User-mode debugger: A debugger for user-mode applications, working at the Ring3 level, such as OllyDbg and x64dbg. The debuggers that ship with compilers such as Visual C++ also belong to this type.
Kernel-mode debugger: A debugger that can debug the operating system kernel, such as WinDbg.

OllyDbg
OllyDbg, referred to as OD, is a user-mode debugger. OD is a very powerful 32-bit debugger. Although the author has stopped updating it, it can still be used as a learning tool. In practice it is recommended to use the 32-bit version of x64dbg.

64-bit platforms can use x64dbg, IDA Pro, etc.

Operation window

Configure
ollydbg.ini: All settings from the Options menu in OD are saved in ollydbg.ini.
UDD file: The OD project file. It saves the state of the current debugging session, such as breakpoints and comments, so that the next session can pick up where this one left off.
Plug-ins: OD supports plug-ins and provides an API for them, which makes it very extensible.

Debug Settings
Click "Options" - "Debugging options" to open the debug settings dialog box; the defaults are generally fine. The "Exceptions" tab controls which exceptions OllyDbg ignores, and it is recommended to select all of them. Exceptions are explained in Chapter 8.

Load symbol file
Click "Debug" - "Select import libraries" to open the import library window and load a library. A symbol file lets the debugger show functions by name during debugging. For example, a function called "abc" would otherwise most likely appear in the debugger only as its address, "004012f7"; with the symbol file loaded, it is displayed as "abc".
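
The effect of a symbol file can be pictured as a lookup from address to name. The sketch below is only an illustration of that idea: the `symbols` table and the `format_call_target` helper are hypothetical and say nothing about how OllyDbg stores or parses symbols.

```python
# Hypothetical symbol table: address -> function name
symbols = {0x004012F7: "abc"}

def format_call_target(address, symbols):
    """Return the symbol name if one is known, otherwise the raw address."""
    name = symbols.get(address)
    if name is not None:
        return name
    return "{:08x}".format(address)

print(format_call_target(0x004012F7, symbols))  # known address: shown by name
print(format_call_target(0x00401000, symbols))  # unknown address: shown as hex
```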

Basic operation
Common shortcut keys:

Breakpoint
Commonly used breakpoints are: INT3 breakpoint, hardware breakpoint, message breakpoint.

INT3 breakpoint
A breakpoint set with the bp command or the F2 key is an INT3 breakpoint. It works by replacing the first byte of the instruction at the breakpoint with CC, the opcode of the INT3 instruction, which raises an exception when executed. The debugger catches the exception, pauses the program, and restores the original byte. For example:

004011F3 68 D0404000
         ↓
004011F3 CC D0404000
Advantage: There is no fixed limit on the number of INT3 breakpoints; you can set as many as the computer can handle.
Disadvantage: Easy to detect, because the program's code is modified.
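
The byte patching described above can be simulated in a few lines of Python. This is a toy model under stated assumptions: the `SoftwareBreakpoints` class is hypothetical, and a `bytearray` stands in for the debuggee's memory, whereas a real debugger writes into another process through operating system APIs.

```python
INT3 = 0xCC  # opcode of the one-byte INT3 instruction

class SoftwareBreakpoints:
    """Toy model of INT3 breakpoints over a bytearray of code bytes."""

    def __init__(self, memory, base):
        self.memory = memory  # stand-in for the debuggee's code
        self.base = base      # address of memory[0]
        self.saved = {}       # address -> original first byte

    def set(self, address):
        offset = address - self.base
        if address not in self.saved:
            self.saved[address] = self.memory[offset]  # remember the original byte
            self.memory[offset] = INT3                 # patch in INT3

    def remove(self, address):
        # Put the original byte back and forget the breakpoint
        self.memory[address - self.base] = self.saved.pop(address)

# Bytes of the instruction from the example above (68 D0404000)
code = bytearray.fromhex("68D0404000")
bps = SoftwareBreakpoints(code, base=0x004011F3)

bps.set(0x004011F3)
patched = code.hex().upper()

bps.remove(0x004011F3)
restored = code.hex().upper()

print(patched, restored)
```

Because the patched byte is visible to the program itself, a checksum over its own code is enough to notice the breakpoint, which is the disadvantage mentioned above.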

Hardware breakpoint
To set a hardware breakpoint, right-click the specified line of code and choose "Breakpoint" - "Hardware, on execution" from the shortcut menu (you can also enter "HE address" on the command line).
Up to four hardware breakpoints can be set.

Hardware breakpoints rely on the CPU's debug registers: DR0, DR1, DR2, and DR3 hold the breakpoint addresses, and DR7 controls them.
Principle: Once the breakpoint is set, the CPU raises an exception when execution reaches the address (or, for access breakpoints, when the address is read or written). The debugger catches the exception and pauses the program.
Advantages: Fast, and hard to detect because the program's code is not modified.
Disadvantage: Only four can be set.
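
How DR7 controls the four slots can be sketched as bit arithmetic. The layout used here follows the x86 architecture (local enable bits at positions 0, 2, 4, 6; a 2-bit R/W field and a 2-bit LEN field per slot starting at bit 16), but the `dr7_enable` helper is hypothetical, and actually writing the register requires the operating system's thread context APIs, which this sketch does not touch.

```python
# R/W field values: what kind of access triggers the breakpoint
RW_EXECUTE, RW_WRITE, RW_READWRITE = 0b00, 0b01, 0b11
# LEN field values: size of the watched location
LEN_1, LEN_2, LEN_4 = 0b00, 0b01, 0b11

def dr7_enable(dr7, slot, rw=RW_EXECUTE, length=LEN_1):
    """Return a DR7 value with breakpoint `slot` (0-3, i.e. DR0-DR3) enabled."""
    if not 0 <= slot <= 3:
        raise ValueError("only DR0-DR3 are available")
    if rw == RW_EXECUTE and length != LEN_1:
        raise ValueError("execute breakpoints must use the 1-byte length")
    dr7 |= 1 << (slot * 2)                # L<slot>: local enable bit
    shift = 16 + slot * 4                 # start of this slot's R/W and LEN fields
    dr7 &= ~(0b1111 << shift)             # clear the old R/W and LEN bits
    dr7 |= (rw | (length << 2)) << shift  # R/W in the low 2 bits, LEN in the high 2
    return dr7

# Execute breakpoint in slot 0, then a 4-byte write breakpoint in slot 1
print(hex(dr7_enable(0, 0)))
print(hex(dr7_enable(0, 1, RW_WRITE, LEN_4)))
```

There are only four address registers and four groups of control bits, which is where the limit of four hardware breakpoints comes from.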
