From .csv, read only or split into sections separated by "<string>" - Code World

From .csv, read only or split into sections separated by "<string>"

Others 2022-04-20 22:23:38 views: 0

JG89 :

I have a .csv file that is split into sections, each starting with < string > on a row of its own, as in the example below. Each marker is followed by a set of columns and their respective rows of values. Columns are not consistent between sections.

< section1 >
col1 col2 col3
val1 val2 val3

< section2 >
col3 col4 col5
val4 val5 val6
val7 val8 val9

...etc. Is there a way, whether the file is .txt or .csv, to import each section either 1) into separate dataframes, or 2) into the same dataframe, accessed with something like df[section][col]?

Many thanks!

sammywemmy :

Depending on the size of your csv, you could read the entire file into pandas and then split that dataframe into multiple dataframes via a list comprehension.

data = '''ï»¿<Network>;;;;;;;;;;;;;;;;;;;;;
            Property;Value;;;;;;;;;;;;;;;;;;;;
            Title;;;;;;;;;;;;;;;;;;;;;
            Version;6.4;;;;;;;;;;;;;;;;;;;;
            ;;;;;;;;;;;;;;;;;;;;;
            <Sites>;;;;;;;;;;;;;;;;;;;;;
            Name;LocationCode;Longitude;Latitude;;;;;;;;;;...'''

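from io import StringIO
import pandas as pd

# no sep is passed; the sample lines shown contain no commas, so each line lands whole in column 0
df = pd.read_csv(StringIO(data), header=None)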

# create a list of dataframe names (the headers of each df)

df_names = df[0].str.extract(r'(<[a-zA-Z]+>)')[0].str.strip('<>').dropna().tolist()

# find the indices for the headers

regions = df.loc[df[0].str.contains(r'<[a-zA-Z]+')].index.tolist()

last_row = df.index[-1]

regions.append(last_row)

from more_itertools import windowed

# create windows for each 'sub' dataframe

regions_window = list(windowed(regions,2))
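If installing more_itertools is not an option, the same consecutive pairs can be built with the built-in zip. This is only a sketch of an alternative, not part of the original answer, and the index values below are made up for illustration:

```python
# hypothetical header indices plus the last row index
regions = [0, 5, 9]

# pair each index with the one that follows it
regions_window = list(zip(regions, regions[1:]))

print(regions_window)  # [(0, 5), (5, 9)]
```

The two approaches agree whenever there are at least two indices; with a single index, zip yields no pairs at all.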

# the function helps with some cleanup during the dataframe extraction

def some_cleanup(df):
    # use the '<Name>' marker in the first row as the column name
    df.columns = df.iloc[0].str.extract(r'(<[a-zA-Z]+>)')[0].str.strip('<>')
    # drop the marker row itself
    df = df.iloc[1:]
    return df

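# extract the dataframes; .loc slices include the end label, so stop one row
# before the next header (the final window ends on the last row itself)

M = [df.loc[start:(end if end == last_row else end - 1)].pipe(some_cleanup) for start, end in regions_window]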
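One detail worth keeping in mind with df.loc[start:end]: label-based slices include both endpoints, unlike positional .iloc slices, so the end label matters when it points at the next section's header row. A minimal illustration on a made-up frame (the names demo, by_label and by_position are only for this example):

```python
import pandas as pd

# toy frame: two marker rows with one data row after each
demo = pd.DataFrame({"line": ["<a>", "x", "<b>", "y"]})

by_label = demo.loc[0:2]      # label-based: rows 0, 1 and 2
by_position = demo.iloc[0:2]  # position-based: rows 0 and 1 only

print(len(by_label), len(by_position))  # 3 2
```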

# create a dict with the keys as the dataframe names

dataframe_dict = dict(zip(df_names,M))
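For the layout in the question, a different route is to split the text on the marker lines first and hand each section to pd.read_csv separately, so that every section gets real columns. The sketch below is not the answer's method. It assumes each marker sits at the start of its own line, that every section has a header row, and that the separator is passed by the caller; read_sections and the sample text are hypothetical names for illustration.

```python
import re
from io import StringIO

import pandas as pd

# hypothetical sample mirroring the question (whitespace-separated columns)
raw = """< section1 >
col1 col2 col3
val1 val2 val3

< section2 >
col3 col4 col5
val4 val5 val6
val7 val8 val9
"""

MARKER = re.compile(r"\s*<\s*([^<>]+?)\s*>")


def read_sections(text, **read_csv_kwargs):
    """Return {section_name: DataFrame} for text split by '<name>' marker lines."""
    sections = {}
    name = None
    for line in text.splitlines():
        match = MARKER.match(line)
        if match:
            # a new section starts here
            name = match.group(1)
            sections[name] = []
        elif line.strip() and name is not None:
            # non-blank line belonging to the current section
            sections[name].append(line.strip())
    # a section with no lines at all would make read_csv raise EmptyDataError
    return {
        key: pd.read_csv(StringIO("\n".join(rows)), **read_csv_kwargs)
        for key, rows in sections.items()
    }


# option 1: separate dataframes, keyed by section name
frames = read_sections(raw, sep=r"\s+")

# option 2: one dataframe with (section, column) MultiIndex columns
combined = pd.concat(frames, axis=1)

print(combined["section2"]["col4"].tolist())  # ['val5', 'val8']
```

With the combined frame, shorter sections are padded with NaN, so keeping the dict of separate frames may be the more convenient form when row counts differ a lot between sections.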

Origin http://43.154.161.224:23101/article/api/json?id=27800&siteId=1
