[AI] Machine learning in an e-commerce business context | Bay Area Artificial Intelligence

 Model specification


Consider the following business situation: a monopolist sells a good with network effects to a group of consumers. The consumers form a social network (represented by the adjacency matrix [formula]), and each consumer's utility from the good depends on the consumption of the consumers directly connected to him in the network. Specifically, the utility of consumer [formula] is

[equation: consumer utility function]

where [formula] is the consumption of consumer [formula]; [formula] and [formula] are factors that are, respectively, observable and unobservable to the econometrician; [formula] measures the strength of the network effect; and [formula] indicates whether consumers [formula] and [formula] are connected. If the consumer faces price [formula], the chosen consumption must satisfy the best-response condition

[equation: best-response condition]

Stacking this condition over all consumers gives [formula], so the final consumption vector can be written as [formula]. This requires the matrix [formula] to be invertible. According to the literature, a sufficient condition is [formula], where [formula] is the largest eigenvalue of the adjacency matrix [formula]. By the properties of adjacency matrices, [formula], so we always assume that [formula].
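As a reading aid, one linear-quadratic specification consistent with the description above can be written as follows. The symbols ($x_i$, $a_i$, $\varepsilon_i$, $\delta$, $g_{ij}$, $p_i$, $G$) and the exact functional form, in particular the $\tfrac{1}{2}x_i^2$ term, are assumptions and not necessarily the author's notation.

```latex
\begin{aligned}
% Assumed utility of consumer i: own taste, quadratic satiation, network term, payment
u_i &= (a_i + \varepsilon_i)\,x_i - \tfrac{1}{2}x_i^2
       + \delta \sum_{j} g_{ij}\,x_i x_j - p_i x_i \\
% Best response: first-order condition in x_i
x_i &= a_i + \varepsilon_i - p_i + \delta \sum_{j} g_{ij}\,x_j \\
% Stacked over all n consumers
x &= a + \varepsilon - p + \delta G x
   \;\Longrightarrow\; x = (I - \delta G)^{-1}(a + \varepsilon - p) \\
% Sufficient condition for invertibility (symmetric 0/1 G, \delta \ge 0)
\delta\,\lambda_{\max}(G) &< 1,
\qquad \lambda_{\max}(G) \le n - 1
\;\Longrightarrow\; \delta < \tfrac{1}{n-1} \text{ suffices}
\end{aligned}
```

Under this assumed form, the bound $\lambda_{\max}(G) \le n-1$ holds because each row of an $n$-node 0/1 adjacency matrix with zero diagonal sums to at most $n-1$.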




 Problem Description


The monopolist's problem is how to design a pricing strategy that maximizes its profit (we assume zero cost, so maximizing profit is equivalent to maximizing revenue). We allow perfect price discrimination (discriminatory pricing), that is, a different price for each consumer. Ignoring the constraint that consumption must be positive, the literature shows that the optimal price is [formula]; however, the monopolist cannot directly observe [formula]. Furthermore, the monopolist usually does not know the size of the network effect [formula]. For various reasons it may not even observe the true social network [formula] correctly, but only part of it, denoted [formula]. The only data available to the monopolist come from a single experiment: when it adopts uniform pricing, that is, charges all consumers the same price [formula], it observes the consumption [formula] that consumers choose based on the true information.

If the monopolist knew [formula] and [formula], it could back out [formula] from the experimental data:

[equation: unobserved factors recovered from the experimental data]

A natural idea follows: if the monopolist can accurately estimate [formula] and [formula] from the available information, it can also accurately estimate [formula] and the corresponding optimal prices.
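The back-out step can be sketched numerically. The sketch below assumes linear best responses of the form x = a + eps - p + delta * G x with a symmetric network G, under which the profit-maximizing price works out to (a + eps) / 2. The function names, the symbols and all numerical values are illustrative assumptions, not taken from the original.

```python
import numpy as np

def equilibrium_consumption(G, delta, theta, p):
    """Solve x = theta - p + delta * G @ x for x (assumed linear best responses)."""
    n = G.shape[0]
    return np.linalg.solve(np.eye(n) - delta * G, theta - p)

def back_out_eps(G, delta, a, p, x):
    """Invert the best response: eps = (I - delta * G) x + p - a."""
    n = G.shape[0]
    return (np.eye(n) - delta * G) @ x + p - a

rng = np.random.default_rng(0)
n = 6
upper = np.triu(rng.random((n, n)) < 0.4, k=1)
G = (upper | upper.T).astype(float)   # symmetric 0/1 network, zero diagonal
delta = 0.5 / (n - 1)                 # guarantees delta * lambda_max(G) < 1
a = rng.uniform(2.0, 3.0, n)          # observable factors (hypothetical values)
eps = rng.normal(0.0, 0.1, n)         # unobservable factors (hypothetical values)
p_uniform = np.full(n, 1.0)           # the single uniform-price experiment

# Consumption the monopolist observes in the experiment.
x = equilibrium_consumption(G, delta, a + eps, p_uniform)

# Knowing delta and G, the unobserved factors are recovered exactly,
# and with them the discriminatory prices (valid for symmetric G in this model).
eps_hat = back_out_eps(G, delta, a, p_uniform, x)
p_star = (a + eps_hat) / 2.0
```

The recovery is exact only because the true `delta` and `G` are plugged in; the rest of the article concerns the case where they have to be estimated.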

The existing literature offers only limited help with this problem:

The econometric literature is mostly concerned with how to estimate [formula] correctly, and usually takes the observed network [formula] to be the true network, even when links are obviously missing. An example is the friendship data in Add Health: the survey asked each respondent to name at most five friends.

The literature that adjusts or estimates the true network often relies on multiple observations of behavior on the same network, that is, it usually requires panel data. In practice, repeated experiments may be infeasible because of the time or money they require, or the data available to the econometrician may only be cross-sectional. If we estimate from a single experiment or from cross-sectional data, we face an underdetermined problem: only [formula] individuals are observed, while [formula] parameters have to be estimated.

Computer scientists have studied network estimation extensively under the name "link prediction". Unfortunately, many of these studies also rely on repeated observations, and they usually do not exploit additional information (such as the experimental outcome in our setting).

Moreover, from the monopolist's point of view, the firm does not actually care whether the parameters [formula] and [formula] are estimated correctly. If the parameter estimates are inaccurate but [formula] can still be estimated accurately, then the pricing strategy is correct and profit is maximized, so such an estimate is good enough for the firm. By contrast, the usual criteria for judging whether estimates of [formula] and [formula] are "accurate", such as a norm of the difference from the true value or a p-value, lack practical meaning here.




 Estimation


We first consider a very simple estimator as a benchmark, which we call the Naive Estimator.

First, suppose the firm can observe the true adjacency matrix [formula]. How should it then estimate [formula]?

Rewrite the best-response equation above as [formula], and let [formula]; then we have [formula], where [formula] and [formula] are known. This is very close in form to a linear regression (although it does not in fact satisfy some of the requirements of linear regression), so we consider estimating [formula] in the same way as a linear regression:

[equation: regression-style estimator of the network effect]
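A regression-style estimator of this kind can be sketched as follows, assuming best responses of the form x = a + eps - p + delta * G x, so that y = x - a + p is regressed on z = G x without an intercept. The name `naive_delta` and the demo values are hypothetical. Passing the observed network instead of the true one gives the naive variant.

```python
import numpy as np

def naive_delta(G_obs, a, p, x):
    """Least-squares slope of y = x - a + p on z = G_obs @ x (no intercept)."""
    y = x - a + p            # equals delta * G x + eps under the assumed model
    z = G_obs @ x            # neighbours' total consumption in the supplied network
    zz = float(z @ z)
    if zz == 0.0:
        # With no links in the supplied network the regressor is identically zero.
        raise ValueError("network has no links; the estimator is undefined")
    return float(z @ y) / zz

# Noise-free demo: with eps = 0 and the true network, the slope is exact.
n = 5
G = np.ones((n, n)) - np.eye(n)          # complete network, lambda_max = n - 1
delta, p = 0.1, 1.0
a = np.linspace(2.0, 3.0, n)
x = np.linalg.solve(np.eye(n) - delta * G, a - p)
delta_hat = naive_delta(G, a, p, x)
```

The regressor `z` is built from the equilibrium consumption itself, so it is correlated with the unobserved term. This is one reason the setup does not meet the usual regression requirements.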

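Further, if the firm mistakenly believes that the observed network [formula] is the true network [formula], we obtain the Naive Estimator: [formula] [formula]

Next, we look at how machine learning can be applied to this problem. In a question I asked earlier, "What are the possible applications of machine learning in theoretical economics research?" (https://www.zhihu.com/question/320514976), I mentioned that I had recently learned that Professor Pakes at Harvard University uses reinforcement learning to solve for the optimal bidding function in auction problems.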

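My rough understanding of reinforcement learning is this: as long as we know the rules of a game, we can learn to play it correctly by playing it over and over and adjusting how we play according to the positive and negative feedback the game gives. In our model, the "game" is to infer the correct network and parameters from the observed network and parameters, and the feedback is determined by the profit obtained from the inferred/estimated network and parameters. What remains is how to "keep playing the game". If we believe that our model describes the problem correctly (in other words, that consumers really do choose their consumption according to the best response above combined with the true parameters), then we can simulate a large number of true networks and true parameters, compute the corresponding single-experiment outcomes, and then manually "damage" the networks. Taking the damaged network and the experimental outcome as input, and the true network and true parameters as output, we train our model by supervised learning (machine learning is not strictly required; some nonparametric econometric methods would also work).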
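The simulate-then-damage procedure can be sketched as a small data generator. It assumes linear best responses x = a + eps - p + delta * G x, random symmetric networks, and damage that deletes each true link independently. The function names, distributions and default values are illustrative assumptions, not the author's settings.

```python
import numpy as np

def random_network(n, density, rng):
    """Symmetric 0/1 adjacency matrix with zero diagonal."""
    upper = np.triu(rng.random((n, n)) < density, k=1)
    return (upper | upper.T).astype(float)

def damage_network(G, drop_prob, rng):
    """Delete each existing link independently with probability drop_prob."""
    keep = np.triu(rng.random(G.shape) >= drop_prob, k=1)
    return G * (keep | keep.T)

def simulate_sample(n, rng, p_uniform=1.0, density=0.3, drop_prob=0.3):
    """One training pair: (damaged network, a, experiment outcome) -> (true network, delta)."""
    G = random_network(n, density, rng)
    delta = rng.uniform(0.0, 0.9 / (n - 1))      # keeps I - delta * G invertible
    a = rng.uniform(2.0, 3.0, n)                 # observable factors
    eps = rng.normal(0.0, 0.1, n)                # unobservable factors
    x = np.linalg.solve(np.eye(n) - delta * G, a + eps - p_uniform)
    G_obs = damage_network(G, drop_prob, rng)
    iu = np.triu_indices(n, k=1)                 # upper triangle is enough for symmetric G
    features = np.concatenate([G_obs[iu], a, x])
    targets = np.concatenate([G[iu], [delta]])
    return features, targets

def make_dataset(n_samples, n, seed=0):
    """Stack many simulated pairs into feature and target matrices."""
    rng = np.random.default_rng(seed)
    pairs = [simulate_sample(n, rng) for _ in range(n_samples)]
    X = np.stack([f for f, _ in pairs])
    Y = np.stack([t for _, t in pairs])
    return X, Y
```

Because every target lies between 0 and 1 (links are 0/1 and `delta` is below 1/(n-1)), the pairs fit a model whose output layer is a Sigmoid.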

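How does this differ from previous econometric practice? My understanding is as follows:

Estimating with panel data is like playing a game (say, Werewolf) against the same opponent: as long as you play enough times, you will keep getting better. But what if you can play against this opponent only once and want to play that one game well? An obvious approach is to play the game against different opponents. Once you have played enough times to understand the game thoroughly, you will play reasonably well whoever the opponent is. This resembles reinforcement learning.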

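Specifically, we consider a plain neural network consisting of an input layer, two fully connected hidden layers (8 neurons each, with ReLU activation), and a final fully connected output layer (with Sigmoid activation). The loss function is the custom loss

[equation: custom loss function]

The optimizer is Adadelta with default parameters. In addition, we consider two variants:

  1. If the true network is sparse and the observed network differs little from the true one, then the difference between the two is mostly 0. Following the idea of ResNet, we add a shortcut path directly between the input layer and the output layer.

  2. For comparison with the benchmark Naive Estimator, we let the neural network leave [formula] un-updated and estimate only the network effect, while still using the custom loss function above.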
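The architecture described above can be sketched as a plain NumPy forward pass. Training with Adadelta and the custom loss are omitted. Realizing the shortcut as a linear map from input to output logits is an assumption made here because the input and output sizes differ. The names `init_params` and `forward` are hypothetical.

```python
import numpy as np

def init_params(d_in, d_out, hidden=8, shortcut=False, seed=0):
    """Weights for input -> 8 ReLU -> 8 ReLU -> Sigmoid output, optional linear shortcut."""
    rng = np.random.default_rng(seed)

    def dense(m, k):
        # He-style initialisation (an arbitrary choice for this sketch).
        return rng.normal(0.0, np.sqrt(2.0 / m), (m, k)), np.zeros(k)

    params = {
        "h1": dense(d_in, hidden),
        "h2": dense(hidden, hidden),
        "out": dense(hidden, d_out),
    }
    if shortcut:
        # Shortcut path from the input layer straight to the output logits.
        params["skip"] = np.zeros((d_in, d_out))
    return params

def forward(params, X):
    """Map a batch of inputs X (rows) to outputs in (0, 1)."""
    W1, b1 = params["h1"]
    W2, b2 = params["h2"]
    W3, b3 = params["out"]
    h = np.maximum(X @ W1 + b1, 0.0)      # first hidden layer, ReLU
    h = np.maximum(h @ W2 + b2, 0.0)      # second hidden layer, ReLU
    logits = h @ W3 + b3
    if "skip" in params:
        logits = logits + X @ params["skip"]
    return 1.0 / (1.0 + np.exp(-logits))  # Sigmoid output layer

# Example shapes: 10 link entries + 5 + 5 other inputs, 10 links + 1 parameter out.
params = init_params(d_in=20, d_out=11, shortcut=True)
outputs = forward(params, np.zeros((3, 20)))
```

In a deep-learning framework the same structure is two dense layers of width 8 followed by a dense Sigmoid layer, with the shortcut added to the pre-activation of the last layer.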




 Results and Discussion


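(Our parameter design seems to be somewhat problematic...) Even with the Naive Estimator, the monopolist can achieve more than 95% of the profit, but the neural network performs better still: with or without the shortcut path, it achieves more than 99.5% of the profit (whereas the neural network that estimates only [formula] performs roughly on par with the Naive Estimator). In the most extreme case, where we have no network information at all, the Naive Estimator cannot be used, yet the neural network still achieves comparable profit (in fact, the neural network's performance hardly varies with the number of links missing from the observed network).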
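Profit percentages of this kind can be read as realized profit divided by the best achievable profit. A minimal sketch of that ratio follows, assuming linear best responses x = a + eps - p + delta * G x, zero cost and a symmetric network, so that the optimal price is (a + eps) / 2. The names and demo values are hypothetical, and the sketch does not reproduce the figures reported above.

```python
import numpy as np

def profit(G, delta, theta, p):
    """Revenue p . x when consumers best-respond in the true model."""
    x = np.linalg.solve(np.eye(len(p)) - delta * G, theta - p)
    return float(p @ x)

def profit_share(G, delta, a, eps, G_hat, delta_hat, p_uniform=1.0):
    """Realized / optimal profit when prices are set from estimates (G_hat, delta_hat)."""
    n = len(a)
    theta = a + eps
    # Outcome of the single uniform-price experiment, generated by the true model.
    x_exp = np.linalg.solve(np.eye(n) - delta * G, theta - p_uniform)
    # Unobserved factors implied by the estimates, then the resulting prices.
    eps_hat = (np.eye(n) - delta_hat * G_hat) @ x_exp + p_uniform - a
    p_hat = (a + eps_hat) / 2.0
    return profit(G, delta, theta, p_hat) / profit(G, delta, theta, theta / 2.0)

rng = np.random.default_rng(0)
n = 6
upper = np.triu(rng.random((n, n)) < 0.4, k=1)
G = (upper | upper.T).astype(float)
delta = 0.5 / (n - 1)
a = rng.uniform(2.0, 3.0, n)
eps = rng.normal(0.0, 0.1, n)

share_exact = profit_share(G, delta, a, eps, G, delta)                     # correct estimates
share_no_network = profit_share(G, delta, a, eps, np.zeros((n, n)), 0.0)   # network ignored
```

With correct estimates the share is 1 by construction. Any other pair of estimates gives a share of at most 1 in this model, which makes the ratio usable as a training signal.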


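However, we noticed some strange phenomena:

1. In fact, the very low profit loss above is almost always accompanied by very inaccurate network and parameter estimates: the estimated network can differ from the true network in more than half of the nodes (measured by mean absolute error). Conversely, any attempt to improve the network and parameter estimates, such as adding a penalty for misestimated network and parameters to the loss function, improves the estimates only slightly (they are even worse than simply using the observed network) and comes with a higher profit loss. In addition, if we estimate only the parameter [formula] and use the mean squared error of that estimate as the loss function, we obtain a very good parameter estimate, but the profit loss remains large.

We believe the root of this phenomenon is that we are estimating a non-identified problem. Recall from above: [formula]. Because we have too many unknown parameters, for any [formula] and [formula] there exists an [formula] such that this equation holds.

What the Naive Estimator does is choose the one with the smallest [formula]. But this choice is problematic, because our main goal is precisely to estimate [formula], and there is no reason to think that an [formula] closer to the origin is more correct.

Conversely, the neural-network approaches lack a "true rule" for judging which estimate is more correct, so what they end up learning is a "neither fish nor fowl" blend of all the possibilities.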
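The identification failure can be checked numerically. The sketch below assumes linear best responses x = a + eps - p + delta * G x and shows that very different candidate pairs (delta, G) each come with an implied vector of unobserved factors that reproduces the same experimental outcome exactly. The names and values are hypothetical.

```python
import numpy as np

def implied_eps(G, delta, a, p, x):
    """Unobserved factors that make (delta, G) consistent with the observed x."""
    return (np.eye(len(x)) - delta * G) @ x + p - a

def predicted_x(G, delta, a, eps, p):
    """Equilibrium consumption implied by (delta, G, eps)."""
    return np.linalg.solve(np.eye(len(a)) - delta * G, a + eps - p)

rng = np.random.default_rng(0)
n = 6
upper = np.triu(rng.random((n, n)) < 0.4, k=1)
G_true = (upper | upper.T).astype(float)
delta_true = 0.5 / (n - 1)
a = rng.uniform(2.0, 3.0, n)
eps_true = rng.normal(0.0, 0.1, n)
p = 1.0
x_obs = predicted_x(G_true, delta_true, a, eps_true, p)   # the single experiment

candidates = {
    "true": (G_true, delta_true),
    "empty network": (np.zeros((n, n)), 0.0),
    "complete network": (np.ones((n, n)) - np.eye(n), 0.25 / (n - 1)),
}
results = {}
for name, (G_c, delta_c) in candidates.items():
    eps_c = implied_eps(G_c, delta_c, a, p, x_obs)
    x_c = predicted_x(G_c, delta_c, a, eps_c, p)
    # Each candidate fits the data perfectly; only the norm of eps_c differs.
    results[name] = (np.allclose(x_c, x_obs), float(np.linalg.norm(eps_c)))
```

Every candidate fits the single cross-section equally well, so fit alone cannot rank them. A least-squares rule simply prefers the candidate whose implied unobserved factors have the smallest norm.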

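2. At first we tried 128 neurons per layer, but the training results were much worse, and the loss actually rose as the number of epochs increased. This may be mainly due to the identification problem described above, but I also wonder whether, as the number of neurons and hence of trainable parameters grows, more saddle points or local extrema appear on the loss surface.

3. If we regard the adjacency matrix as a binary image, then our problem amounts to restoring a damaged/noisy binary image using additional information. We tried adding convolutional and pooling layers, but, as in the previous item, the training results got worse. Moreover, an adjacency matrix is not the same as an ordinary binary image: for example, some topological information about a node is unrelated to the local appearance of the adjacency matrix (such as whether a few particular nodes are connected) and instead depends on that node's connections to all other nodes (such as its degree). I noticed that BrainNetCNN may be better suited to this problem.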




 Summary


Although we started from a very specific case, the point of this article is not how to help a monopolist set prices, but to use this example to illustrate possible uses and problems of machine learning in business (or other socio-economic) contexts.

Scenarios like our example are common in practice. They contain the following elements:

  1. Some information is missing, either because it cannot be observed or because of measurement error;

  2. The information can be probed by some indirect means, but because there are too many degrees of freedom it cannot be recovered perfectly;

  3. There are limits on how often these means can be used, so they cannot be repeated;

  4. The missing information will be put to further use, and that use comes with a clear criterion;

  5. We believe that our model correctly describes both the process of probing the information and the process of further using it.

Many pilot experiments or trial runs of business policies fit these characteristics. Given these elements, and on the basis of element 5) in particular, we can follow an approach similar to the one in this article: use the model to simulate data, manually "lose" part of the information, and use supervised learning to train a model to "rebuild" the lost information in a way that serves our purpose (rather than merely rebuilding the "correct" information; sometimes the "wrong" information works better).


-----END-----



@Richard Xu

Copyright Notice

This article belongs to "Richard Xu". For reprints, please contact the author.




Origin blog.csdn.net/BTUJACK/article/details/91473372