Bilibili creator builds the world's first pure-redstone neural network; NSFW detection based on deep-learning action recognition; progress on Tianqi Chen's "Machine Learning Compilation (MLC)" course; frontier AI papers | ShowMeAI Daily #07.05

The ShowMeAI Daily series has been fully upgraded! It covers AI topics across Tools & Frameworks | Projects & Code | Posts & Shares | Data & Resources | Research & Papers. Click the article archive to browse past issues, and subscribe to the topic #ShowMeAI资讯日报 in our official account to receive the latest daily digest. Click Collections & Monthly Digest to quickly browse complete topic collections.

1. Tools & Frameworks

Tool: c2go - translate C code into Go

This is a subproject of the Go+ project. It translates any C project into Go without human intervention, while keeping performance close to that of C.

GitHub: github.com/goplus/c2go

Library: OpenDP - a library for differential privacy computations

The OpenDP library is a modular collection of statistical algorithms that follow the definition of differential privacy; it can be used to build privacy-preserving applications under many different privacy models. OpenDP is implemented in Rust, with an easy-to-use Python API.

GitHub: github.com/opendp/open…
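The core primitive behind such libraries is a calibrated noise mechanism. As a conceptual illustration only (a hand-rolled sketch of the classic Laplace mechanism, not the OpenDP API; real deployments should use a vetted library, since naive floating-point sampling has known vulnerabilities):

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Release true_value with Laplace noise of scale sensitivity/epsilon.

    Satisfies epsilon-differential privacy for a query whose output changes
    by at most `sensitivity` when one individual's record changes.
    """
    scale = sensitivity / epsilon
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    # Inverse-CDF sampling of Laplace(0, scale)
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_value + noise

# Example: privately release a count of 100 (counts have sensitivity 1)
random.seed(0)
noisy_count = laplace_mechanism(100.0, sensitivity=1, epsilon=0.5)
```

Smaller epsilon means stronger privacy and larger noise; the released value is unbiased, so repeated releases average out to the true count (which is itself why privacy budgets must be tracked).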

Library: mlprodict - production tooling for machine-learning predictors

mlprodict was originally created to help implement converters to ONNX. Its main features are a Python runtime for ONNX, visualization tools, and a NumPy-style API for ONNX. The package also provides tools for comparing predictions and for benchmarking models converted with sklearn-onnx.

GitHub: github.com/sdpython/ml…

Framework: easyFL - a lightweight federated learning framework

easyFL is the PyTorch implementation of the IJCAI-21 paper Federated Learning with Fair Averaging. It is a powerful, reusable experimental platform for federated learning (FL) research, providing easy-to-use modules for anyone who wants to run a variety of FL experiments. In short, it lets FL researchers quickly implement and compare popular centralized federated learning algorithms.

GitHub: github.com/WwZzz/easyF…
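For reference, the centralized baseline such platforms implement and compare against is FedAvg, where the server aggregates client models by a data-size-weighted average. A minimal sketch of that aggregation step only (illustrative, not easyFL's actual API, and not the fair-averaging method of the paper itself):

```python
def fed_avg(client_weights, client_sizes):
    """Aggregate client parameter vectors by a data-size-weighted average.

    client_weights: list of flat parameter lists, one per client
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Two clients; the second holds three times as much data, so it
# contributes three times the weight to the global model.
global_model = fed_avg([[1.0, 2.0], [5.0, 6.0]], client_sizes=[1, 3])
# global_model == [4.0, 5.0]
```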

2. Projects & Code

Implementation: P-HAR - NSFW detection based on deep-learning action recognition

This is just a fun side project exploring how well state-of-the-art (SOTA) human action recognition (HAR) models perform in the adult-content domain. HAR is a relatively new, active research area in deep learning; its goal is to recognize human actions from various input streams, such as video or sensors.

From a technical standpoint, lighting changes, occlusion, and large variations in camera angle and filming technique (POV, professional shoots) make position (action) recognition difficult. Two identical positions (actions) captured from different camera angles can completely confuse a model's predictions.

The repository uses three different input streams for best results: RGB frames, human skeletons, and audio. Three different models are trained on these streams, and their results are merged via late fusion. The multi-modal model currently reaches a best accuracy of 75.64%; given the small training set, this should keep improving.

GitHub: github.com/rlleshi/pha…
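Late fusion means each modality produces its own class probabilities, which are combined only at the end. A minimal sketch of that fusion step (the uniform weights and two-class setup are illustrative, not the repository's actual configuration):

```python
def late_fusion(modality_probs, weights=None):
    """Fuse per-modality class-probability lists by weighted averaging,
    then return the index of the winning class."""
    k = len(modality_probs)
    weights = weights or [1.0 / k] * k
    n_classes = len(modality_probs[0])
    fused = [
        sum(w * probs[c] for w, probs in zip(weights, modality_probs))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=fused.__getitem__)

# RGB, skeleton, and audio models each vote over two classes;
# skeleton and audio outvote the RGB stream, so class 1 wins.
rgb, skeleton, audio = [0.6, 0.4], [0.2, 0.8], [0.3, 0.7]
predicted = late_fusion([rgb, skeleton, audio])  # -> 1
```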

3. Posts & Shares

Share: the world's first pure-redstone neural network (Bilibili)

After half a year of work, Bilibili creator 辰占鳌头 has built the world's first pure-redstone neural network in Minecraft. It recognizes 15x15 handwritten digits with about 80% accuracy (simulated on the MNIST dataset). The video has passed 1.25 million views and peaked at #17 on Bilibili's trending chart. Professor Yann LeCun also shared and liked it on Twitter and Facebook!

The project code will be uploaded to github.com/leamoon/Sto…

Post: how AI creates photorealistic images from text (how Imagen & Parti work, and how they compare)

Have you ever seen a puppy hatching from a cracked egg? A photo of an airship overlooking a steampunk city? Two robots enjoying a romantic evening at the movies? These may sound impossible, but a new kind of machine learning technique, text-to-image generation, makes them real. These models can generate high-quality, photorealistic images from a simple text prompt.

地址: blog.google/technology/…

Share: 100 Lectures on Machine Learning

This is a collection of materials from the machine learning courses Mark Schmidt teaches at UBC. It consists of six parts and covers a wide range of machine learning and related topics.

  • Part 1: Machine Learning and Data Mining (Fall 2019)
  • Part 2: Data Science
  • Part 3: Advanced Machine Learning (January-April, 2022)
  • Part 4: Computer Science 540
  • Part 5: Large-Scale Machine Learning
  • Part 6: Machine Learning Reading Group

地址: www.cs.ubc.ca/~schmidtm/C…

4. Data & Resources

Resource list: awesome dataset distillation papers

This is a great collection of papers on dataset distillation / condensation. The task of dataset distillation is to synthesize a small dataset such that a model trained on it matches the test accuracy of a model trained on the full dataset.

GitHub: github.com/Guang000/Aw…

Course: Tianqi Chen's "Machine Learning Compilation (MLC)" course

The course is Python-centric: wherever possible, it presents the core ideas of machine learning compilation through interactive Python, and deepens understanding through hands-on practice and connections to concepts you already know (NumPy). Chinese-language videos are updated weekly on Bilibili in sync with the course, and notes are posted on the course homepage.

5. Research & Papers

Reply with the keyword 日报 in our official account to get the curated paper collection for free.

Paper: Discrete Morse Sandwich: Fast Computation of Persistence Diagrams for Scalar Data -- An Algorithm and A Benchmark

Title: Discrete Morse Sandwich: Fast Computation of Persistence Diagrams for Scalar Data -- An Algorithm and A Benchmark

Date: 27 Jun 2022

Field: Machine Learning

Link: arxiv.org/abs/2206.13…

Code: github.com/topology-to…

Authors: Pierre Guillou, Jules Vidal, Julien Tierny

Summary: This fast processing of the dimensions 0 and (d-1) further reduces, and drastically, the number of critical simplices to consider for the computation of D1(f), the intermediate layer of the sandwich.

Abstract: This paper introduces an efficient algorithm for persistence diagram computation, given an input piecewise linear scalar field f defined on a d-dimensional simplicial complex K, with d≤3. Our method extends the seminal "PairCells" algorithm by introducing three main accelerations. First, we express this algorithm within the setting of discrete Morse theory, which considerably reduces the number of input simplices to consider. Second, we introduce a stratification approach to the problem, that we call "sandwiching". Specifically, minima-saddle persistence pairs (D0(f)) and saddle-maximum persistence pairs (Dd−1(f)) are efficiently computed by respectively processing with a Union-Find the unstable sets of 1-saddles and the stable sets of (d-1)-saddles. This fast processing of the dimensions 0 and (d-1) further reduces, and drastically, the number of critical simplices to consider for the computation of D1(f), the intermediate layer of the sandwich. Third, we document several performance improvements via shared-memory parallelism. We provide an open-source implementation of our algorithm for reproducibility purposes. We also contribute a reproducible benchmark package, which exploits three-dimensional data from a public repository and compares our algorithm to a variety of publicly available implementations. Extensive experiments indicate that our algorithm improves by two orders of magnitude the time performance of the seminal "PairCells" algorithm it extends. Moreover, it also improves memory footprint and time performance over a selection of 14 competing approaches, with a substantial gain over the fastest available approaches, while producing a strictly identical output. We illustrate the utility of our contributions with an application to the fast and robust extraction of persistent 1-dimensional generators on surfaces, volume data and high-dimensional point clouds.
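The extremum-saddle layers of the "sandwich" are computed with a union-find sweep over sub/superlevel sets. As a toy illustration of that idea only (0-dimensional persistence of a 1-D scalar field under the elder rule, a drastic simplification of the paper's discrete-Morse algorithm):

```python
def persistence_0d(values):
    """0-dim persistence pairs of the sublevel sets of a 1-D scalar field.

    Sweep vertices by increasing value; when two connected components merge
    at a vertex, the one born at the higher minimum dies (elder rule).
    """
    n = len(values)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    order = sorted(range(n), key=values.__getitem__)
    active = [False] * n
    pairs = []
    for i in order:
        active[i] = True
        for j in (i - 1, i + 1):
            if 0 <= j < n and active[j]:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                if values[ri] < values[rj]:   # keep the elder minimum as root
                    ri, rj = rj, ri
                if values[ri] < values[i]:    # skip zero-persistence pairs
                    pairs.append((values[ri], values[i]))  # (birth, death)
                parent[ri] = rj
    pairs.append((values[order[0]], float("inf")))  # global min never dies
    return sorted(pairs)

# Minima at values 0, 1, 2; the two local maxima (3 and 4) each kill
# the younger of the two components they join.
diagram = persistence_0d([0, 3, 1, 4, 2, 5])
# diagram == [(0, inf), (1, 3), (2, 4)]
```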

Paper: Learning Gait Representation from Massive Unlabelled Walking Videos: A Benchmark

Title: Learning Gait Representation from Massive Unlabelled Walking Videos: A Benchmark

Date: 28 Jun 2022

Field: Computer Vision

Tasks: Contrastive Learning, Gait Recognition

Link: arxiv.org/abs/2206.13…

Code: github.com/shiqiyu/ope…

Authors: Chao Fan, Saihui Hou, Jilong Wang, Yongzhen Huang, Shiqi Yu

Summary: As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks.

Abstract: Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely annotated data that is costly and insatiable. This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning, aiming to learn the general gait representation from massive unlabelled walking videos for practical applications via offering informative walking priors and diverse real-world variations. Specifically, we collect a large-scale unlabelled gait dataset GaitLU-1M consisting of 1.02M walking sequences and propose a conceptually simple yet empirically powerful baseline model GaitSSB. Experimentally, we evaluate the pre-trained model on four widely-used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D with or without transfer learning. The unsupervised results are comparable to or even better than the early model-based and GEI-based methods. After transfer learning, our method outperforms existing methods by a large margin in most cases. Theoretically, we discuss the critical issues for gait-specific contrastive framework and present some insights for further study. As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks. The source code of GaitSSB will be integrated into OpenGait which is available at github.com/shiqiyu/ope…
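Contrastive pretraining of this kind generally optimizes an InfoNCE-style objective: two augmented views of the same walking sequence form a positive pair, and other sequences act as negatives. A pure-Python sketch of that loss (illustrative only; GaitSSB's exact formulation may differ):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """-log softmax score of the positive among {positive} + negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]

# Aligned views of the same sequence give a low loss...
low = info_nce([1.0, 0.0], [1.0, 0.0], negatives=[[0.0, 1.0]])
# ...while a mismatched positive gives a high loss.
high = info_nce([1.0, 0.0], [0.0, 1.0], negatives=[[1.0, 0.0]])
```

Minimizing this loss pulls embeddings of the same identity together and pushes different sequences apart, which is what yields a transferable gait representation without labels.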

Paper: A Comprehensive Survey on Deep Gait Recognition: Algorithms, Datasets and Challenges

Title: A Comprehensive Survey on Deep Gait Recognition: Algorithms, Datasets and Challenges

Date: 28 Jun 2022

Field: Computer Vision

Tasks: Gait Recognition

Link: arxiv.org/abs/2206.13…

Code: github.com/shiqiyu/ope…

Authors: Chuanfu Shen, Shiqi Yu, Jilong Wang, George Q. Huang, Liang Wang

Summary: Besides, we also present a comprehensive summary of all vision-based gait datasets and the performance analysis.

Abstract: Gait recognition aims at identifying a person at a distance through visual cameras. With the emergence of deep learning, significant advancements in gait recognition have achieved inspiring success in many scenarios by utilizing deep learning techniques. Nevertheless, the increasing need for video surveillance introduces more challenges, including robust recognition under various variances, modeling motion information in gait sequences, unfair performance comparison due to protocol variances, biometrics security, and privacy prevention. This paper provides a comprehensive survey of deep learning for gait recognition. We first present the odyssey of gait recognition from traditional algorithms to deep models, providing explicit knowledge of the whole workflow of a gait recognition system. Then deep learning for gait recognition is discussed from the perspective of deep representations and architecture with an in-depth summary. Specifically, deep gait representations are categorized into static and dynamic features, while deep architectures include single-stream and multi-stream architecture. Following our proposed taxonomy with novelty, it can be beneficial for providing inspiration and promoting the perception of deep gait recognition. Besides, we also present a comprehensive summary of all vision-based gait datasets and the performance analysis. Finally, the article discusses some open issues with significant potential prospects.

Paper: Identifying and Combating Bias in Segmentation Networks by leveraging multiple resolutions

Title: Identifying and Combating Bias in Segmentation Networks by leveraging multiple resolutions

Date: 29 Jun 2022

Field: Computer Vision

Link: arxiv.org/abs/2206.14…

Code: github.com/Deep-MI/Fas…

Authors: Leonie Henschel, David Kügler, Derek S Andrews, Christine W Nordahl, Martin Reuter

Summary: We analyse how this resolution-bias in the data distribution propagates to systematically biased predictions for group L at higher resolutions.

Abstract: Exploration of bias has significant impact on the transparency and applicability of deep learning pipelines in medical settings, yet is so far woefully understudied. In this paper, we consider two separate groups for which training data is only available at differing image resolutions. For group H, available images and labels are at the preferred high resolution while for group L only deprecated lower resolution data exist. We analyse how this resolution-bias in the data distribution propagates to systematically biased predictions for group L at higher resolutions. Our results demonstrate that single-resolution training settings result in significant loss of volumetric group differences that translate to erroneous segmentations as measured by DSC and subsequent classification failures on the low resolution group. We further explore how training data across resolutions can be used to combat this systematic bias. Specifically, we investigate the effect of image resampling, scale augmentation and resolution independence and demonstrate that biases can effectively be reduced with multi-resolution approaches.
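One of the remedies studied, resampling training data across resolutions, can be sketched with simple nearest-neighbour interpolation. This is a toy 2-D version for illustration only; the paper operates on 3-D MRI volumes, and real pipelines use proper interpolation libraries:

```python
def resample_nearest(grid, out_h, out_w):
    """Nearest-neighbour resampling of a 2-D map (e.g. an image or label map).

    Each output cell copies the input cell whose index scales proportionally,
    so label maps stay categorical (no interpolated in-between labels).
    """
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

# Upsample a 2x2 label map to 4x4, e.g. as part of scale augmentation
coarse = [[1, 2],
          [3, 4]]
fine = resample_nearest(coarse, 4, 4)
# fine == [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```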

Paper: Shifts 2.0: Extending The Dataset of Real Distributional Shifts

Title: Shifts 2.0: Extending The Dataset of Real Distributional Shifts

Date: 30 Jun 2022

Field: Computer Vision

Tasks: Autonomous Driving, Image Classification

Link: arxiv.org/abs/2206.15…

Code: github.com/shifts-proj…

Authors: Andrey Malinin, Andreas Athanasopoulos, Muhamed Barakovic, Meritxell Bach Cuadra, Mark J. F. Gales, Cristina Granziera, Mara Graziani, Nikolay Kartashev, Konstantinos Kyriakopoulos, Po-Jui Lu, Nataliia Molchanova, Antonis Nikitakis, Vatsal Raina, Francesco La Rosa, Eli Sivena, Vasileios Tsarsitalidis, Efi Tsompopoulou, Elena Volf

Summary: This creates a need to be able to assess how robustly ML models generalize as well as the quality of their uncertainty estimates.

Abstract: Distributional shift, or the mismatch between training and deployment data, is a significant obstacle to the usage of machine learning in high-stakes industrial applications, such as autonomous driving and medicine. This creates a need to be able to assess how robustly ML models generalize as well as the quality of their uncertainty estimates. Standard ML baseline datasets do not allow these properties to be assessed, as the training, validation and test data are often identically distributed. Recently, a range of dedicated benchmarks have appeared, featuring both distributionally matched and shifted data. Among these benchmarks, the Shifts dataset stands out in terms of the diversity of tasks as well as the data modalities it features. While most of the benchmarks are heavily dominated by 2D image classification tasks, Shifts contains tabular weather forecasting, machine translation, and vehicle motion prediction tasks. This enables the robustness properties of models to be assessed on a diverse set of industrial-scale tasks and either universal or directly applicable task-specific conclusions to be reached. In this paper, we extend the Shifts Dataset with two datasets sourced from industrial, high-risk applications of high societal importance. Specifically, we consider the tasks of segmentation of white matter Multiple Sclerosis lesions in 3D magnetic resonance brain images and the estimation of power consumption in marine cargo vessels. Both tasks feature ubiquitous distributional shifts and a strict safety requirement due to the high cost of errors. These new datasets will allow researchers to further explore robust generalization and uncertainty estimation in new situations. In this work, we provide a description of the dataset and baseline results for both tasks.

Paper: BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing

Title: BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing

Date: 30 Jun 2022

Field: Natural Language Processing

Tasks: Language Modelling, Multi-Task Learning

Link: arxiv.org/abs/2206.15…

Code: github.com/bigscience-…

Authors: Jason Alan Fries, Leon Weber, Natasha Seelam, Gabriel Altay, Debajyoti Datta, Samuele Garda, Myungsun Kang, Ruisi Su, Wojciech Kusa, Samuel Cahyawijaya, Fabio Barth, Simon Ott, Matthias Samwald, Stephen Bach, Stella Biderman, Mario Sänger, Bo wang, Alison Callahan, Daniel León Periñán, Théo Gigant, Patrick Haller, Jenny Chim, Jose David Posada, John Michael Giorgi, Karthik Rangasai Sivaraman, Marc Pàmies, Marianna Nezhurina, Robert Martin, Michael Cullan, Moritz Freidank, Nathan Dahlberg, Shubhanshu Mishra, Shamik Bose, Nicholas Michio Broad, Yanis Labrak, Shlok S Deshmukh, Sid Kiblawi, Ayush Singh, Minh Chien Vu, Trishala Neeraj, Jonas Golde, Albert Villanova del Moral, Benjamin Beilharz

Summary: Training and evaluating language models increasingly requires the construction of meta-datasets, diverse collections of curated data with clear provenance.

Abstract: Training and evaluating language models increasingly requires the construction of meta-datasets --diverse collections of curated data with clear provenance. Natural language prompting has recently lead to improved zero-shot generalization by transforming existing, supervised datasets into a diversity of novel pretraining tasks, highlighting the benefits of meta-dataset curation. While successful in general-domain text, translating these data-centric approaches to biomedical language modeling remains challenging, as labeled biomedical datasets are significantly underrepresented in popular data hubs. To address this challenge, we introduce BigBIO a community library of 126+ biomedical NLP datasets, currently covering 12 task categories and 10+ languages. BigBIO facilitates reproducible meta-dataset curation via programmatic access to datasets and their metadata, and is compatible with current platforms for prompt engineering and end-to-end few/zero shot language model evaluation. We discuss our process for task schema harmonization, data auditing, contribution guidelines, and outline two illustrative use cases: zero-shot evaluation of biomedical prompts and large-scale, multi-task learning. BigBIO is an ongoing community effort and is available at github.com/bigscience-…

We are ShowMeAI, dedicated to spreading high-quality AI content and sharing industry solutions, using knowledge to accelerate every step of technical growth!

Reposted from juejin.im/post/7116734757253169166