Python is a fun and practical language. It is useful not only for web scraping, data analysis, and visualization, but also for data mining and artificial intelligence. There are many small projects built on Python, and I have shared some of them twice before. Those posts were liked by many readers on CSDN and on public accounts, which was beyond my expectations. It also encourages me to keep finding time to share practical, substantive material.
Without further ado, here are 5 high-quality Python projects I recommend today:
- missingno
- streamlit
- pyecharts
- tqdm
- pyxelate
missingno
missingno is a visualization tool for missing data in Python. Data is the most important part of our work, bar none, and checking for missing data is the most important task in data quality verification.
Many people who are new to the industry focus on algorithm research and development as soon as they join a project. After a lot of hard work, they find that the results fall short of expectations. When they trace the problem back, they discover that the data itself has serious issues, and a lot of time and energy has been wasted in the process.
If we verify the quality of the data before starting research and development, and visualize what is missing, we can avoid this unnecessary waste of effort.
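Before reaching for any plots, a quick numeric check already goes a long way. The following is a minimal sketch using plain pandas, not missingno; the helper name `missing_report` and the sample columns are made up for illustration.

```python
import numpy as np
import pandas as pd


def missing_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return per-column missing counts and ratios, worst column first."""
    report = pd.DataFrame({
        "missing": df.isna().sum(),   # number of missing cells per column
        "ratio": df.isna().mean(),    # share of missing cells per column
    })
    return report.sort_values("missing", ascending=False)


# Hypothetical sample data: column names and values are invented for this sketch.
df = pd.DataFrame({
    "age": [25, np.nan, 31, np.nan],
    "city": ["Beijing", "Shanghai", None, "Shenzhen"],
    "income": [5000, 6200, 7100, 4800],
})
report = missing_report(df)
print(report)
```

Columns near the top of the report are the ones worth inspecting first.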
Installation
pip install missingno
1. Matrix
The msno.matrix nullity matrix is a data-dense display that lets you quickly and visually pick out patterns in data completeness.
If you are working with time series data, you can use the freq keyword parameter to specify the periodicity:
>>> import numpy as np
>>> import pandas as pd
>>> import missingno as msno
>>> null_pattern = (np.random.random(1000).reshape((50, 20)) > 0.5).astype(bool)
>>> null_pattern = pd.DataFrame(null_pattern).replace({False: None})
>>> msno.matrix(null_pattern.set_index(pd.period_range('1/1/2011', '2/1/2
2. Bar chart
msno.bar is a simple visualization of nullity by column (here collisions is a pandas DataFrame that has already been loaded):
msno.bar(collisions.sample(1000))
3. Dendrogram
The dendrogram lets you correlate the completeness of variables more fully, revealing deeper trends in which columns tend to be missing together:
msno.dendrogram(collisions)
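The snippets above assume a DataFrame that is already loaded. As a self-contained sketch, the three plots can be tried on synthetic data instead; the helper names `make_null_pattern` and `plot_overview` are hypothetical, and the plotting libraries are imported inside the function so that the data helper still works where missingno is not installed.

```python
import numpy as np
import pandas as pd


def make_null_pattern(rows: int = 50, cols: int = 20, seed: int = 0) -> pd.DataFrame:
    """Build a synthetic frame in which roughly half the cells are missing."""
    rng = np.random.default_rng(seed)
    values = rng.random((rows, cols))
    mask = rng.random((rows, cols)) > 0.5    # True marks a cell to blank out
    return pd.DataFrame(values).mask(mask)   # masked cells become NaN


def plot_overview(df: pd.DataFrame) -> None:
    """Draw the matrix, bar chart, and dendrogram one after another."""
    # Imported here so the helper above does not need the plotting stack.
    import matplotlib.pyplot as plt
    import missingno as msno

    msno.matrix(df)       # nullity matrix: one row per record
    msno.bar(df)          # non-missing count per column
    msno.dendrogram(df)   # clusters columns by how their nullity co-occurs
    plt.show()
```

Calling `plot_overview(make_null_pattern())` should open the three figures in turn.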
streamlit
Streamlit lets you build applications for your machine learning projects with deceptively simple Python scripts. It supports hot reloading, so your application updates live as you edit and save your file. There is no need to handle HTTP requests, HTML, JavaScript, and so on. All you need is your favorite editor and a browser.
Installation
pip install streamlit
streamlit hello
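To give a feel for what such a script looks like, here is a minimal sketch of an app. The file name `app.py` and the `random_walk` helper are assumptions made for illustration; the try/except around the import is only there so the helper can be reused or tested on a machine without Streamlit.

```python
# app.py (hypothetical file name)
import random

try:
    import streamlit as st
except ImportError:  # lets the helper below be imported without Streamlit
    st = None


def random_walk(steps: int, seed: int = 0) -> list:
    """Return a simple random walk as a list of cumulative positions."""
    rng = random.Random(seed)
    position, path = 0, []
    for _ in range(steps):
        position += rng.choice((-1, 1))
        path.append(position)
    return path


if st is not None:
    st.title("Random walk demo")
    # Moving the slider reruns the whole script with the new value.
    steps = st.slider("Number of steps", min_value=10, max_value=500, value=100)
    st.line_chart(random_walk(steps))
```

Start it with `streamlit run app.py`, then edit and save the file to see the page refresh.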
pyecharts
In Python development, the word "plotting" most likely brings matplotlib to mind. It is a long-established and powerful plotting library. However, it has some drawbacks in practice: for example, it is not well suited to offline viewing, and the range of plotting interfaces it supports is relatively narrow.
ECharts is a data visualization library open-sourced by Baidu. With its good interactivity and polished chart design, it has been recognized by many developers. Python, in turn, is an expressive language that is very well suited to data processing. When data analysis met data visualization, pyecharts was born. It can save a chart as an HTML file, display the result dynamically, and let you open and view it at any time. In addition, it supports a very rich set of chart types.
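The division of labor described above, with Python preparing the data and pyecharts rendering it to HTML, might be sketched as follows. The helper names, the "Sales" series, and the output file name are hypothetical; the chart calls follow the chained style of pyecharts 1.x and later, and the import sits inside the function so the data helper runs even without pyecharts installed.

```python
def top_categories(sales: dict, n: int = 3):
    """Pick the n best-selling categories, highest value first."""
    ranked = sorted(sales.items(), key=lambda item: item[1], reverse=True)[:n]
    categories = [name for name, _ in ranked]
    values = [value for _, value in ranked]
    return categories, values


def render_bar(categories, values, path="bar.html"):
    """Render a bar chart to a standalone HTML file and return its path."""
    # Imported lazily: only this function needs pyecharts.
    from pyecharts import options as opts
    from pyecharts.charts import Bar

    chart = (
        Bar()
        .add_xaxis(list(categories))
        .add_yaxis("Sales", list(values))
        .set_global_opts(title_opts=opts.TitleOpts(title="Top categories"))
    )
    return chart.render(path)  # writes the HTML file
```

For example, `render_bar(*top_categories({"tea": 30, "coffee": 55, "juice": 12}))` should write `bar.html`, which can then be opened in any browser.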
tqdm
tqdm is a Python progress bar tool. When I first started learning Python, I would have dismissed it. I had not even understood the language itself yet, so why bother with fancy extras? Simply pointless!
However, after working on development projects for a while, I discovered that it has irreplaceable value. For example, during a long-running database job, the progress bar lets us see at a glance how far the run has got.
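A minimal sketch of that situation: rows are written in batches and the bar advances once per batch. The `load_rows` function is hypothetical and the `time.sleep` call merely stands in for a real database write.

```python
import time

from tqdm import tqdm


def load_rows(rows, batch_size=100):
    """Pretend to write rows in batches, advancing the bar once per batch."""
    written = 0
    with tqdm(total=len(rows), desc="Loading", unit="row") as bar:
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            time.sleep(0.01)         # stand-in for the real database write
            written += len(batch)
            bar.update(len(batch))   # move the bar by the rows just handled
    return written


total = load_rows(list(range(1000)))
print(total)
```

For a plain loop, wrapping the iterable is enough: `for item in tqdm(items): ...`.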
pyxelate
pyxelate is a tool for turning images into pixel art. It down-samples the image and then uses unsupervised learning to generate a limited color palette for the resulting pixel-art image.
Installation
pip install git+https://github.com/sedthh/pyxelate.git
Example
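# Note: this uses the Pyxelate class from the library's early releases;
# newer versions may expose a different API.
from pyxelate import Pyxelate
from skimage import io
import matplotlib.pyplot as plt

# Load the source image and read its dimensions
img = io.imread("blade_runner.jpg")
height, width, _ = img.shape

factor = 14    # down-sampling factor: the output is 1/14 of the original size
colors = 6     # number of colors in the generated palette
dither = True  # apply dithering to the result

p = Pyxelate(height // factor, width // factor, colors, dither)
img_small = p.convert(img)  # convert an image with these settings

# Show the original and the pixel-art version side by side
_, axes = plt.subplots(1, 2, figsize=(16, 16))
axes[0].imshow(img)
axes[1].imshow(img_small)
plt.show()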