The Multilinear Structure of ReLU Networks

Category: Other | Posted 2018-07-19 22:29:45

We consider two very common nonlinearities: rectified linear units (ReLUs) and leaky ReLUs.
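As a quick reference, the two nonlinearities can be sketched in NumPy as follows. The slope `alpha` of the leaky ReLU is an assumed illustrative value; the notes do not fix one.

```python
import numpy as np

def relu(x):
    """ReLU: max(x, 0), applied elementwise."""
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: x where x >= 0, alpha * x elsewhere.

    alpha is an assumed slope for the negative part (0 < alpha < 1).
    """
    return np.where(x >= 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
relu_out = relu(x)                      # negative entries are clipped to 0
leaky_out = leaky_relu(x, alpha=0.1)    # negative entries are scaled by 0.1
```

Both functions are piecewise linear with a single kink at 0, which is what the rest of the argument relies on.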

For binary classification we use the binary hinge loss.

For multiclass classification we can define a multiclass hinge loss.
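The formulas for the two losses did not survive in these notes, so the sketch below uses the standard textbook forms as an assumption: max(0, 1 - y·s) for labels y in {-1, +1}, and the sum over wrong classes r of max(0, 1 + s_r - s_y) in the multiclass case. The paper's exact multiclass definition may differ.

```python
import numpy as np

def binary_hinge(score, label):
    """Binary hinge loss max(0, 1 - y * score), with label y in {-1, +1}."""
    return max(0.0, 1.0 - label * score)

def multiclass_hinge(scores, label):
    """Assumed multiclass hinge: sum over r != y of max(0, 1 + s_r - s_y)."""
    margins = np.maximum(0.0, 1.0 + scores - scores[label])
    margins[label] = 0.0  # the true class contributes nothing
    return float(margins.sum())

b_correct = binary_hinge(2.0, +1)   # margin satisfied, loss is 0
b_wrong = binary_hinge(0.5, -1)     # 1 - (-1)(0.5) = 1.5
m = multiclass_hinge(np.array([2.0, 0.5, 1.5]), 0)  # only class 2 violates the margin
```

Like the ReLU, each hinge term is piecewise linear in the network output.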

Let Ω denote the parameter space of the network and L(ω) the loss.

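Because both the ReLU nonlinearity and the hinge loss are piecewise linear, L(ω) is piecewise multilinear in the parameters. The parameter space can be partitioned into finitely many open cells Ω_u together with a boundary set N. The loss L(ω) is smooth in the interior of each cell and non-differentiable on the boundary.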
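One way to picture the cells: a parameter vector's cell can be identified by which ReLUs and which hinge terms are active on each data point. The sketch below illustrates this on a toy one-hidden-layer network; the data, weights, and the helper names `signature` and `same_cell` are all hypothetical, chosen only for illustration.

```python
import numpy as np

# Toy data and parameters (hypothetical values).
X = np.array([[1.0, 2.0], [-1.0, 0.5], [0.5, -1.5]])
y = np.array([1.0, -1.0, 1.0])
W = np.array([[1.0, -0.25], [0.3, 0.8]])  # hidden-layer weights
v = np.array([1.0, -1.0])                 # output weights

def signature(W, v):
    """Return which hidden units and which hinge terms are active per sample."""
    pre = X @ W.T                      # pre-activations, shape (samples, hidden)
    hidden_on = pre > 0
    out = np.maximum(pre, 0.0) @ v     # network output per sample
    hinge_on = (1.0 - y * out) > 0
    return hidden_on, hinge_on

def same_cell(sig_a, sig_b):
    """Two parameter vectors share a cell if their signatures coincide."""
    return all(np.array_equal(a, b) for a, b in zip(sig_a, sig_b))

sig0 = signature(W, v)
sig_near = signature(W + 1e-3, v + 1e-3)  # tiny perturbation of every parameter
sig_far = signature(-W, v)                # flips every hidden pre-activation

stays = same_cell(sig0, sig_near)
leaves = not same_cell(sig0, sig_far)
```

A small perturbation leaves the signature unchanged, so the parameters stay inside the same open cell; the signature changes only when some pre-activation or margin crosses zero, which is where the boundary N lies.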

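Now restrict the loss to a single cell Ω_u, on which it has a multilinear form. A multilinear form is harmonic, so by the strong maximum principle it cannot attain a local extremum in the interior of the cell unless it is constant there; any other extremum must lie on the boundary N. In other words, for a ReLU network with hinge loss, L(ω) has no differentiable local extrema except where the loss is flat.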
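The harmonicity claim can be checked numerically. Inside a cell the activation pattern is frozen, so each parameter enters the loss with degree at most one; every pure second derivative vanishes and so does the Laplacian. The sketch below uses a toy one-hidden-layer network with an assumed fixed pattern `A` and assumes every hinge term is active; the data and names are hypothetical.

```python
import numpy as np

# Toy data (hypothetical) and a frozen activation pattern for one cell.
X = np.array([[1.0, 2.0], [-1.0, 0.5], [0.5, -1.5]])
y = np.array([1.0, -1.0, 1.0])
A = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])  # assumed ReLU on/off mask per sample

def cell_loss(theta):
    """Loss restricted to the cell: ReLU replaced by the fixed mask A."""
    W = theta[:4].reshape(2, 2)   # hidden-layer weights
    v = theta[4:]                 # output weights
    out = (A * (X @ W.T)) @ v     # bilinear in (W, v)
    return float(np.sum(1.0 - y * out))  # all hinge terms assumed active

def laplacian(f, theta, h=0.1):
    """Sum of central second differences along each coordinate."""
    total = 0.0
    for k in range(theta.size):
        e = np.zeros_like(theta)
        e[k] = h
        total += (f(theta + e) - 2.0 * f(theta) + f(theta - e)) / h**2
    return total

theta0 = np.array([1.0, -0.25, 0.3, 0.8, 1.0, -1.0])
lap = laplacian(cell_loss, theta0)  # zero up to floating-point rounding
```

Because the restricted loss is affine in each coordinate separately, the second differences vanish for any step size `h`, not just in the limit.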

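So far, we know that local extrema fall into two cases:

Type I (Flat). The local extremum lies inside a cell, where the loss is constant.

Type II (Sharp). The local extremum lies on the boundary N.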

Main Result 1. At a Type II local extremum, L(ω) > 0.

That is, if the loss attains the value 0, then every Type II extremum is sub-optimal.

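If we consider a more general setting, fully connected networks with leaky ReLU nonlinearities, we have the following result.

Main Result 2. At a Type I local extremum, L(ω) = 0. At a Type II local extremum, L(ω) > 0.

When the loss attains the value 0, all flat local minima are optimal and all sharp local minima are sub-optimal. If the value 0 is not attained, all local extrema are sharp.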

To be continued...

Reposted from www.cnblogs.com/skykill/p/9338233.html
