Preface: While using R for a numerical simulation in a regression course project, I got stuck generating random numbers that follow the Laplace distribution. So I had an idea and spent a long time writing a function in Python; after generating the data, I exported it and imported it into R to carry on. I also figured that if I need Laplace random numbers again in the future, I can reuse them! Then, in the middle of the night, I received a magical message from Mr. Xiaocui's junior fellow student: it turns out an R package already provides this. I was so annoyed that I got up and decided to write down what I had put together in that hour, to prove it was not entirely in vain. Takeaways (an attempt to console myself):
Table of contents
1. Generate Laplace random numbers in R (requires a reasonably recent version of R)
2. Read Excel file data in Python
3. Export Python data to xlsx (a simple, brute-force method)
4. Import data from xlsx (csv) files in R
5. Worked example
Key points:
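1. Generate Laplace random numbers in R (requires a reasonably recent version of R)
install.packages("VGAM")  # one-time installation
library(splines)  # VGAM depends on splines and stats4
library(stats4)
library(VGAM)
rlaplace(100, 0, 0.8)  # 100 draws with location 0 and scale 0.8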
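For comparison, one standard way to draw Laplace variates by hand is inverse transform sampling: if U is uniform on (-1/2, 1/2), then mu - b*sign(U)*ln(1 - 2|U|) follows a Laplace(mu, b) distribution. The sketch below is only an illustration in Python with NumPy; the function name `laplace_inverse_cdf` and the seed are made up here, and it does not claim to describe how VGAM's `rlaplace` works internally.

```python
import numpy as np

def laplace_inverse_cdf(n, mu=0.0, b=1.0, rng=None):
    """Draw n Laplace(mu, b) variates by inverse transform sampling (illustrative helper)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(n) - 0.5                      # uniform on [-0.5, 0.5)
    return mu - b * np.sign(u) * np.log1p(-2.0 * np.abs(u))

rng = np.random.default_rng(0)                   # arbitrary seed, for reproducibility only
x = laplace_inverse_cdf(200_000, mu=0.0, b=0.8, rng=rng)

# A Laplace(mu, b) distribution has mean mu and variance 2 * b**2
print(x.mean(), x.var(), 2 * 0.8**2)
```

With a large sample, the printed mean and variance should land close to 0 and 2 * 0.8^2 = 1.28, which is a quick sanity check for any Laplace generator.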
2. Read Excel file data in Python
Use pandas read_excel. It returns a DataFrame; call df.to_numpy() if an array is needed.
import pandas as pd
path=r'path/to/file.xlsx'
df=pd.read_excel(path,header=0)  # header=0 (the default) treats the first row as the header
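To make the effect of the `header` argument concrete, here is a small sketch. It uses an in-memory CSV string with `read_csv` so that it runs without an Excel file; the `header` argument has the same meaning in `read_excel`. The sample table is made up for illustration.

```python
import io
import pandas as pd

# Hypothetical two-column table; the first line holds the column names
text = "x,y\n1.0,2.0\n3.0,4.0\n"

with_header = pd.read_csv(io.StringIO(text), header=0)    # first row -> column names
no_header = pd.read_csv(io.StringIO(text), header=None)   # first row -> data

arr = with_header.to_numpy()   # DataFrame -> NumPy array
print(with_header.shape, no_header.shape)
print(arr)
```

With `header=None` the frame has one more row, because the line of names is kept as data. That is the setting to use for a file that contains only numbers.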
3. Export Python data to xlsx (a simple, brute-force method)
import xlsxwriter
# m is assumed to be a 100 x 1000 matrix (e.g. a NumPy array)
workbook=xlsxwriter.Workbook('random_laplace.xlsx')  # create an xlsx file
ws=workbook.add_worksheet('sheet1')  # create a worksheet; the name is optional
# Write every element of the matrix into the worksheet
for i in range(100):
    for j in range(1000):
        ws.write(i,j,m[i,j])
workbook.close()  # close the workbook, which writes the file to disk
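If the Excel format itself is not required, a plain CSV file is a lighter way to hand a matrix over to R. The following is a minimal sketch using NumPy's `savetxt`; the small matrix and the temporary output path are stand-ins chosen for illustration, not the data from this post.

```python
import os
import tempfile
import numpy as np

m = np.arange(12, dtype=float).reshape(3, 4)    # small stand-in for the real matrix

# Hypothetical output location; a temporary directory keeps the example self-contained
out_path = os.path.join(tempfile.mkdtemp(), "random_laplace.csv")

np.savetxt(out_path, m, delimiter=",")          # one matrix row per line, no header row
m_back = np.loadtxt(out_path, delimiter=",")    # read it back to check the round trip
print(np.allclose(m, m_back))
```

On the R side, such a file could be read with read.csv(path, header = FALSE), since no header row is written.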
4. Import data from xlsx (csv) files in R
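library(openxlsx)
# Read a worksheet into a data.frame (main arguments of openxlsx's read.xlsx)
read.xlsx(xlsxFile, sheet = 1, startRow = 1, colNames = TRUE,
          rowNames = FALSE, rows = NULL, cols = NULL)
# Note: the xlsx package has its own read.xlsx, whose arguments include
# sheetIndex, rowIndex, colIndex, as.data.frame = TRUE and header = TRUE
# Read data from a csv file
# read.csv works similarly (header = TRUE by default) and needs no extra package
Step 1: Read the data into a data frame.
By default colNames = TRUE (header = TRUE for read.csv and for the xlsx package), so the first row is treated as the header row; if the first row should be data instead, set colNames = FALSE (or header = FALSE).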
Step 2: After reading, the data is a data frame (a kind of list), so a row or column extracted from it cannot be used directly as a numeric vector.
Therefore, convert it from list to numeric:
y_num<-unlist(y_list)
Creating an empty matrix in Python (this is awkward to do directly, so create a zero matrix of the required dimensions and then overwrite it step by step)
import numpy as np
matrix=np.zeros(shape=(n,m))  # n rows, m columns
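The "allocate zeros, then overwrite" pattern can be sketched as follows. The dimensions and the values written into each row are purely illustrative.

```python
import numpy as np

n, m = 3, 4                          # illustrative dimensions
matrix = np.zeros(shape=(n, m))      # placeholder matrix of zeros

for i in range(n):
    matrix[i, :] = i + 1             # overwrite row i with real values

print(matrix)
```

`np.empty((n, m))` would also work as a placeholder when every entry is overwritten afterwards, since it allocates the array without initializing it.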
5. Worked example
1. Python: a function that generates Laplace random numbers
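import numpy as np
import xlsxwriter

def random_laplace(nsample,repeat_time,miu,b):
    """
    Generate random numbers following the Laplace (double exponential) distribution.
    nsample: number of random numbers per repetition
    repeat_time: number of repetitions
    miu,b: location and scale parameters of the Laplace distribution
    return: matrix of random numbers, shape (repeat_time, nsample)
    """
    n=0
    # Create a zero matrix to be filled row by row
    random_matrix=np.zeros(shape=(repeat_time,nsample))
    while n<repeat_time:
        # Generate one row of Laplace random numbers
        random_matrix[n,:]=np.random.laplace(miu,b,nsample)
        n+=1
    return random_matrix

# nsample, repeat_time, miu and b must be assigned before this call
m=random_laplace(nsample,repeat_time,miu,b)
# Save m to an xlsx file
workbook=xlsxwriter.Workbook('random_laplace.xlsx')
ws=workbook.add_worksheet()
for i in range(m.shape[0]):      # rows: repeat_time (1000 in the original run)
    for j in range(m.shape[1]):  # columns: nsample (100 in the original run)
        ws.write(i,j,m[i,j])
workbook.close()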
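The row-by-row loop in `random_laplace` can also be replaced by a single call, because NumPy's Laplace sampler accepts a `size` tuple. A minimal sketch follows; the parameter values and the seed are illustrative only, since the post does not state the ones actually used.

```python
import numpy as np

# Illustrative values only; the post does not state the ones actually used
nsample, repeat_time, miu, b = 100, 1000, 0.0, 0.8

rng = np.random.default_rng(42)   # arbitrary seed
random_matrix = rng.laplace(loc=miu, scale=b, size=(repeat_time, nsample))

print(random_matrix.shape)   # (1000, 100)
```

Each row is one replication of `nsample` draws, which matches the layout produced by the loop version.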
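2. R: read the data and fit the model
library(openxlsx)
# The path must point to the file exported from Python. That file has no header row,
# so colNames = FALSE keeps its first row as data
r_laplace<-read.xlsx("F:/个人嘿嘿嘿/北师大BNU/研一上-课业资料/应用多元线性回归/hw01大作业/laplace_random.xlsx", colNames = FALSE)
xdata<-c(……)
ydata_l<-y_real+r_laplace[i,]  # take row i of the random numbers as the error term added to y
# Convert the list to numeric
ydata_laplace<-unlist(ydata_l)
fit_ols<-lm(ydata_laplace~xdata)  # ordinary least squares fit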
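For readers who would rather stay in Python for the whole experiment, the same idea (add one row of Laplace errors to the true response, then fit by ordinary least squares) can be sketched with NumPy alone. Everything concrete here is an assumption made for illustration: the design points, the true coefficients `beta0` and `beta1`, the scale 0.8 and the seed are hypothetical stand-ins for `xdata` and `y_real`, which the post does not spell out.

```python
import numpy as np

rng = np.random.default_rng(1)                       # arbitrary seed

# Hypothetical design and true coefficients, standing in for xdata and y_real
xdata = np.linspace(0.0, 10.0, 100)
beta0, beta1 = 1.0, 2.0
y_real = beta0 + beta1 * xdata

# One row of Laplace errors per replication
errors = rng.laplace(loc=0.0, scale=0.8, size=(1000, xdata.size))

X = np.column_stack([np.ones_like(xdata), xdata])    # intercept and slope columns
Y = (y_real + errors).T                              # one column per replication
coefs, *_ = np.linalg.lstsq(X, Y, rcond=None)        # shape (2, 1000)

print(coefs.mean(axis=1))   # average intercept and slope over replications
```

Passing all replications to `lstsq` as columns of `Y` fits them in one call. Averaged over replications, the estimates should sit close to the true coefficients, since OLS remains unbiased under zero-mean Laplace errors.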