Python teaches you to use Rows to quickly manipulate csv files

Rows is a third-party Python module dedicated to manipulating tables.

As long as the csv file is read through Rows, she can generate Python objects that can be calculated.

Compared with pandas' pd.read_csv, I think the advantage of Rows lies in its easy-to-understand calculation syntax and various convenient export and conversion syntax. It can easily extract text in pdf, convert csv to sqlite files, merge csv, etc., and can also execute sql syntax on csv files, which is quite powerful.

Of course, its influence is definitely not as big as Pandas, but let's get to know it, so you don't have too many skills.

1. Prepare

Before starting, you need to make sure that Python and pip have been successfully installed on your computer. If not, you can visit this article: Super detailed Python installation guide  for installation.

(Optional 1)  If you use Python for data analysis, you can install Anaconda directly: Anaconda, a good helper for Python data analysis and mining , has built-in Python and pip.

(Optional 2)  In addition, it is recommended that you use the VSCode editor, which has many advantages: The best partner for Python programming—VSCode detailed guide .

Please choose one of the following ways to enter the command to install dependencies :
1. Open Cmd (Start-Run-CMD) in the Windows environment.
2. Open Terminal in the MacOS environment (command+space to enter Terminal).
3. If you are using VSCode editor or Pycharm, you can directly use the Terminal at the bottom of the interface.

pip install rows

2. Basic use

Through the following small example, you can know the basic usage of Rows.

Suppose we have a csv table data like this:

state,city,inhabitants,area
AC,Acrelândia,12538,1807.92
AC,Assis Brasil,6072,4974.18
AC,Brasiléia,21398,3916.5
AC,Bujari,8471,3034.87
AC,Capixaba,8798,1702.58
[...]
RJ,Angra dos Reis,169511,825.09
RJ,Aperibé,10213,94.64
RJ,Araruama,112008,638.02
RJ,Areal,11423,110.92
RJ,Armação dos Búzios,27560,70.28
[...]

If we want to find the cities whose state is RJ and whose population is greater than 500,000, just do:

import rows

cities = rows.import_from_csv("data/brazilian-cities.csv")
rio_biggest_cities = [
    city for city in cities
    if city.state == "RJ" and city.inhabitants > 500000
]
for city in rio_biggest_cities:
    density = city.inhabitants / city.area
    print(f"{city.city} ({density:5.2f} ppl/km²)")

It is very similar to Pandas, but the syntax is simpler than Pandas, and the whole module is lighter than Pandas.

If you want to create a new "table" yourself, you can write:

from collections import OrderedDict
from rows import fields, Table


country_fields = OrderedDict([
    ("name", fields.TextField),
    ("population", fields.IntegerField),
])

countries = Table(fields=country_fields)
countries.append({"name": "Argentina", "population": "45101781"})
countries.append({"name": "Brazil", "population": "212392717"})
countries.append({"name": "Colombia", "population": "49849818"})
countries.append({"name": "Ecuador", "population": "17100444"})
countries.append({"name": "Peru", "population": "32933835"})

Then you can iterate over it:

for country in countries:
    print(country)
# Result:
# Row(name='Argentina', population=45101781)
# Row(name='Brazil', population=212392717)
# Row(name='Colombia', population=49849818)
# Row(name='Ecuador', population=17100444)
# Row(name='Peru', population=32933835)
# "Row" is a namedtuple created from `country_fields`

# We've added population as a string, the library automatically converted to
# integer so we can also sum:
countries_population = sum(country.population for country in countries)
print(countries_population) # prints 357378595

It is also possible to export this table to CSV or any other supported format:

# 公众号:Python实用宝典
import rows
rows.export_to_csv(countries, "some-LA-countries.csv")

# html
rows.export_to_html(legislators, "some-LA-countries.csv")

Import from dictionary to rows object:

import rows

data = [
    {"name": "Argentina", "population": "45101781"},
    {"name": "Brazil", "population": "212392717"},
    {"name": "Colombia", "population": "49849818"},
    {"name": "Ecuador", "population": "17100444"},
    {"name": "Peru", "population": "32933835"},
    {"name": "Guyana", }, # Missing "population", will fill with `None`
]
table = rows.import_from_dicts(data)
print(table[-1]) # Can use indexes
# Result:
# Row(name='Guyana', population=None)

3. Command line tools

In addition to writing Python code, you can also directly use Rows command-line tools. Here are a few tools that may be used frequently.

Read the text in the pdf file and save it as a file:

# 需要提前安装: pip install rows[pdf]
URL="http://www.imprensaoficial.rr.gov.br/app/_edicoes/2018/01/doe-20180131.pdf"
rows pdf-to-text $URL result.txt # 保存到文件 显示进度条
rows pdf-to-text --quiet $URL result.txt # 保存到文件 不显示进度条
rows pdf-to-text --pages=1,2,3 $URL # 输出三页到终端
rows pdf-to-text --pages=1-3 $URL # 输出三页到终端(使用 - 范围符)

Convert csv to sqlite:

rows csv2sqlite \
     --dialect=excel \
     --input-encoding=latin1 \
     file1.csv file2.csv \
     result.sqlite

Merge multiple csv files:

rows csv-merge \
     file1.csv file2.csv.bz2 file3.csv.xz \
     result.csv.gz

Perform a sql search on the csv:

# needs: pip install rows[html]
rows query \
    "SELECT * FROM table1 WHERE inhabitants > 1000000" \
    data/brazilian-cities.csv \
    --output=data/result.html

For more functions, please refer to the Rows official documentation:

http://turicas.info/rows

This is the end of our article. If you like today's Python practical tutorial, please continue to pay attention to Python Practical Collection.

If you have any questions, you can reply in the background of the official account: join the group , answer the corresponding red letter verification information , and enter the mutual assistance group to ask.

Originality is not easy, I hope you can give me a thumbs up below and watch to support me to continue creating, thank you!

Click below to read the original text for a better reading experience

Python Practical Collection (pythondict.com)
is not just a collection.
Welcome to pay attention to the official account: Python Practical Collection

b0f125c8a178e0518cfab689a6b860f7.jpeg

Guess you like

Origin blog.csdn.net/u010751000/article/details/126635146