Label extraction and data classification in data preprocessing

Development, 2023-09-15 19:42:40

The first step in processing the .csv file is to read its table data with the read_csv function from pandas.

So the question is, what is the return value of reading the file in this way?

Printing the type of the returned value shows that it is a DataFrame. So what exactly is this data type?
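A minimal sketch of that check, assuming a small in-memory table in place of the real file (the column names f1, f2 and label are hypothetical, not taken from the original data set):

```python
import io

import pandas as pd

# Hypothetical stand-in for the .csv file: two feature columns and a label column.
csv_text = "f1,f2,label\n0.1,0.2,0\n0.3,0.4,1\n0.5,0.6,0\n"

# read_csv accepts a file path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

print(type(data))  # the class of the object returned by read_csv
print(data)
```

With a real file, the path string would be passed to read_csv instead of the StringIO object.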

According to the Novice Tutorial (runoob.com), a DataFrame is a tabular data structure that contains an ordered set of columns, and each column can hold a different value type (numeric, string, Boolean). A DataFrame has both a row index and a column index, and it can be viewed as a dictionary of Series that share a single index.

Pandas data structure – DataFrame | Novice Tutorial (runoob.com)
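The "dictionary of Series" view can be illustrated with a short sketch. The column names and values here are made up for the example:

```python
import pandas as pd

# Three columns of different value types that share one row index.
df = pd.DataFrame(
    {
        "value": pd.Series([1.5, 2.5, 3.5]),
        "name": pd.Series(["a", "b", "c"]),
        "flag": pd.Series([True, False, True]),
    }
)

print(df.dtypes)                          # one dtype per column
print(type(df["value"]))                  # selecting a column gives a Series
print(df.index.equals(df["name"].index))  # each column carries the same row index
```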

That raises another problem: I do not remember the Series and dictionary data structures very well, so I reviewed them too.

Data structure - Series:

Pandas Data Structure – Series | Novice Tutorial (runoob.com)

Data structure - Dictionary (written with curly braces):

Python Dictionary (Dictionary) | Newbie Tutorial (runoob.com)
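As a quick refresher on both structures, here is a small sketch with hypothetical keys and values:

```python
import pandas as pd

# A dictionary is written with curly braces and maps keys to values.
counts = {"cat": 3, "dog": 5}

# A Series is a one-dimensional labelled array; when it is built from a
# dictionary, the keys become the index.
s = pd.Series(counts)

print(counts["dog"])  # look up a value by key
print(s["dog"])       # look up a value by index label
print(list(s.index))
```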

Next, the data in the table needs to be separated by label, because some labels have too few samples and their data needs to be expanded.

Since the labels are in the last column of the data set, the shape attribute (which follows the same convention as in numpy) is used on the two-dimensional data: shape[1] gives the number of columns and shape[0] gives the number of rows. Because index numbers start from 0, shape[1]-1 is the index of the last column, which holds the labels.
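A minimal sketch of this indexing, using a small hypothetical table whose last column is the label:

```python
import pandas as pd

data = pd.DataFrame(
    {"f1": [0.1, 0.3, 0.5], "f2": [0.2, 0.4, 0.6], "label": [0, 1, 0]}
)

n_rows = data.shape[0]   # number of rows
n_cols = data.shape[1]   # number of columns
label_col = n_cols - 1   # index of the last column, since indexing starts at 0

print(n_rows, n_cols, label_col)
```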

The labels then need to be extracted into an unordered set of unique elements. First use iloc to select the whole label column, use values to drop the index and get the underlying array, use ravel to flatten the multi-dimensional array into a one-dimensional array, and finally use the set function to build an unordered set of unique elements.
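The chain of calls described above might look like the following sketch. The table is hypothetical, and selecting the column with a list (which keeps the result two-dimensional, so that ravel has something to flatten) is an assumption about how the original code was written:

```python
import pandas as pd

data = pd.DataFrame(
    {"f1": [0.1, 0.3, 0.5, 0.7], "f2": [0.2, 0.4, 0.6, 0.8], "label": [0, 1, 0, 2]}
)
label_col = data.shape[1] - 1

# Selecting with a list of column positions returns a one-column DataFrame.
label_frame = data.iloc[:, [label_col]]

# .values drops the index and returns a 2-D NumPy array of shape (n_rows, 1).
label_array = label_frame.values

# ravel flattens the array to one dimension.
flat = label_array.ravel()

# set keeps each distinct label once, in no particular order.
labels = set(flat)
print(labels)
```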

The data is then separated by label. For each label, find the index of the rows that belong to it and collect all of those rows in a new array. The new array is then converted back to a DataFrame and saved to a separate .csv file named after the label number.
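One way to sketch this separation step is shown below. The table, the temporary output directory and the file name pattern label_<n>.csv are all hypothetical choices for the example, not the original author's code:

```python
import os
import tempfile

import numpy as np
import pandas as pd

data = pd.DataFrame(
    {"f1": [0.1, 0.3, 0.5, 0.7], "f2": [0.2, 0.4, 0.6, 0.8], "label": [0, 1, 0, 1]}
)
label_col = data.shape[1] - 1
label_values = data.iloc[:, label_col].values
labels = set(label_values)

out_dir = tempfile.mkdtemp()  # hypothetical output location
saved = {}
for label in sorted(labels):
    # Positions of the rows that carry this label.
    index = np.where(label_values == label)[0]
    # Collect those rows in a new array, then wrap the array in a DataFrame again.
    rows = data.values[index]
    subset = pd.DataFrame(rows, columns=data.columns)
    # Save one .csv file per label, named after the label number.
    path = os.path.join(out_dir, f"label_{label}.csv")
    subset.to_csv(path, index=False)
    saved[label] = path

print(saved)
```

Going through data.values converts every column to a common dtype, so a shorter alternative is boolean indexing on the DataFrame itself, for example data[data["label"] == label], which keeps the per-column dtypes.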

Reposted from: blog.csdn.net/qq_46012097/article/details/129280474
